[Num-utils] New feature: base changing in numformat

Suso Banderas suso at suso.org
Sat Jun 28 20:26:30 GMT 2008


I have an interesting one here.  I implemented the -b option in numformat.
It lets you convert the numbers it reads on input to some other base.
Right now, it only accepts bases from 1 to 61 by default.  The character set it
uses for this is
0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz.  So in base 35,
the number 12345 is A2P.  There is also an option, -C, that lets you
substitute your own character set, so you could use something
like 0123456789abcdefghijklmnopqrstuvwxyz instead of having the capital
letters first, or maybe just abcdefghijklmnopqrstuvwxyz in base 26 if you
wanted to tie letters to their values more easily.  Like this:

 numprocess /mod26,-1/ | numformat -b 26 -C abcdefghijklmnopqrstuvwxyz
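
As a quick sanity check on the A2P example, here is a minimal Python sketch
that reads a string back as a number, using each character's position in the
character set as its digit value.  It is not numformat code; the names
decode and DEFAULT_CHARSET are made up for illustration.

```python
# Hypothetical helper, not part of num-utils: reads a numeral back into an int.
DEFAULT_CHARSET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"


def decode(text, base, charset=DEFAULT_CHARSET):
    """Read `text` as a base-`base` numeral whose digits come from `charset`."""
    if not 2 <= base <= len(charset):
        raise ValueError("base must be between 2 and the charset length")
    value = 0
    for ch in text:
        digit = charset.index(ch)  # position in the charset is the digit value
        if digit >= base:
            raise ValueError(f"{ch!r} is not a valid digit in base {base}")
        value = value * base + digit  # shift left one place, add the new digit
    return value


print(decode("A2P", 35))  # 10*35**2 + 2*35 + 25 = 12345
print(decode("ba", 26, "abcdefghijklmnopqrstuvwxyz"))  # 1*26 + 0 = 26
```

With the lowercase-only character set, "a" is worth 0 and "z" is worth 25,
which is what ties each letter to its value.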

Anyways, I came up with a strange situation.  Base 1 is essentially
tallying: the output has as many characters as the value of
the number being converted.  The question is, which character makes sense
to use for this.  Right now I'm using 0, which may seem
wrong, but if you understand the logic of the program, it makes sense.


 The way I do the base conversion is this:

1. The $charset variable is set to this by default:
"0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";
 
  If the user specifies an alternate charset, it's set here.

2. Check to make sure that the base is within legal bounds (at least
1 and no more than 1 more than the length of the character set.)

3. Create an array to hold the output characters to represent the new
numbers.

4. If the base is not 1, then do the following operation:

  Repeat until result is 0:
   result = number / base (integer division)
   remainder = number mod base

   Take the remainder and use it as an index for determining what
   character to use in the character set.  Push that character onto the
   array.  Then set number to result for the next pass.

  Once the result has reached 0, reverse the
array and join it together to make the final answer.


5. If the base is 1, then you can't do the special base
conversion algorithm on it, so just take the first character in the
character set and print it a number of times that is equal to the value
of the number.
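
Steps 1 to 5 above can be sketched roughly as follows.  This is a Python
illustration only, not the actual numformat source.  The function name
to_base is made up, the upper bound used in the check (the length of the
character set) is an assumption and may differ from what numformat enforces,
and the sketch only handles non-negative integers.

```python
# Illustrative sketch of the steps described above; not the numformat source.
DEFAULT_CHARSET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"


def to_base(number, base, charset=DEFAULT_CHARSET):
    """Format a non-negative integer in `base` using digits from `charset`."""
    # Step 2: bounds check (assumed here to be 1..len(charset)).
    if not 1 <= base <= len(charset):
        raise ValueError("base out of range for this character set")
    if number < 0:
        raise ValueError("this sketch only handles non-negative integers")

    # Step 5: base 1 is a tally of the first character in the set.
    if base == 1:
        return charset[0] * number

    # Step 3: array that collects output characters, least significant first.
    digits = []

    # Step 4: divide repeatedly, using each remainder as a charset index.
    while True:
        number, remainder = divmod(number, base)
        digits.append(charset[remainder])
        if number == 0:
            break

    # Reverse and join to get the most significant digit first.
    return "".join(reversed(digits))


print(to_base(12345, 35))    # A2P
print(to_base(5, 1))         # 00000
print(to_base(5, 1, "1"))    # 11111
```

The base 1 branch shows why 0 falls out as the tally character: it is simply
index 0 of whatever character set is in use.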


  So using this algorithm, it makes sense that the character used for base
1 is 0 instead of 1.  If you want to use 1 instead, you could just use
the -C 1 option.

What do people think?








