Significant Digits and You

When discussing portions of a number, we (especially computer scientists) use the terms 'least significant' digits and 'most significant' digits. By this we mean nothing about the importance of those digits. We simply mean how much impact changing that digit would have on the entire number. For instance, if you had the number 482, changing the 2 into a 3 would have very little effect on the number itself. However, changing the 4 to a 5 would increase it by 100 -- a much more significant effect.

Use of Modulo

When you need to extract certain digits, you can use integer division and modulo by powers of 10 (because we do our numbers in base 10). To keep the least significant 4 digits of a 7-digit number, for instance, you'd do:

   least_sig_4 = big_num % 10000;

The reason this works is thus:

          _____123 R 4567
   10000 / 1234567
           10000
          ------
            23456
            20000
            -----
             34567
             30000
             -----
              4567

Or, you might see it like this:

   1234567 / 10000 = 123.4567

but with integer division this is:

   1234567 / 10000 = 123 R 4567 (over 10000)

From this same process, we see that to throw away those 4 least significant digits, we'd use plain integer division:

   most_sig_3 = big_num / 10000;

Recall that with integer division we keep the quotient instead of the remainder (as with modulo).
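
As a quick check of this keep/throw-away pairing, here is a minimal sketch. The function names (keep_low_4, drop_low_4, pieces_recombine) are made up for illustration and the sketch assumes a non-negative number:

```cpp
#include <cassert>

// Hypothetical helper: keep the 4 least significant digits.
long keep_low_4(long big_num)
{
    assert(big_num >= 0);     // assumption: only non-negative values
    return big_num % 10000;   // remainder = the low 4 digits
}

// Hypothetical helper: throw away the 4 least significant digits.
long drop_low_4(long big_num)
{
    assert(big_num >= 0);     // assumption: only non-negative values
    return big_num / 10000;   // quotient = everything above the low 4 digits
}

// The quotient and remainder always recombine into the original number.
bool pieces_recombine(long big_num)
{
    return drop_low_4(big_num) * 10000 + keep_low_4(big_num) == big_num;
}
```

For 1234567, keep_low_4 returns 4567 and drop_low_4 returns 123, matching the long division above.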

You can also generalize this to other digit groups. Let's say that a company's product code is an 8-digit number with 3 segments, like so: 12-345-678. We could store this value as a single long integer (which can store any 9-digit number, although only about 20% of the 10-digit numbers). This would require only 4 bytes (32 bits) of memory or disk space. Storing the 3 parts separately would require anywhere from 6 bytes (as 3 short integers) to 10 bytes (as 10 characters, including the dashes).

To break it apart for printing, we could do this:

    // get 8-digit long integer from memory or disk storage
    first_pair = whole_code / 1000000;
    last_triple = whole_code % 1000;
    middle_group = whole_code % 1000000 / 1000;

Note how the first and last groups were simple integer division or modulo. The middle group is a combination of the two. It first keeps the least significant 6 digits (1e6 as divisor) and then throws away 3 least significant digits (1e3 as divisor). That leaves it with just the 3 most significant of the 6 least significant -- or the middle 3 digits of the overall number!

    12345678
    12345678  % 1000000   keep only least sig 6 digits (1e6 divisor)
      345678  / 1000      throw away least sig 3 digits (1e3 divisor)
      345

This isn't the only possible way, though. When you are extracting more than two digit groups, there is generally more than one way to do it. This extraction could also have been done as:

   middle_group = whole_code / 1000 % 1000;

Here we first throw away the 3 least significant digits before keeping only the 3 least significant digits of what remains.

    12345678
    12345678  / 1000   throw away least sig 3 digits (1e3 divisor)
    12345     % 1000   keep only least sig 3 digits (1e3 divisor)
      345
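
Putting the three extractions together, one possible sketch looks like this. The ProductCode struct and the split_code function are hypothetical names invented for the example; the sketch assumes the code is a non-negative number of at most 8 digits:

```cpp
#include <cassert>

// Hypothetical container for the three groups of a code like 12-345-678.
struct ProductCode
{
    long first_pair;
    long middle_group;
    long last_triple;
};

// Break an 8-digit code into its 2-3-3 digit groups.
ProductCode split_code(long whole_code)
{
    assert(whole_code >= 0 && whole_code <= 99999999);  // at most 8 digits

    ProductCode parts;
    parts.first_pair   = whole_code / 1000000;          // throw away low 6
    parts.middle_group = whole_code % 1000000 / 1000;   // keep 6, throw away 3
    parts.last_triple  = whole_code % 1000;             // keep low 3

    // The other order (throw away 3, then keep 3) must give the same group.
    assert(parts.middle_group == whole_code / 1000 % 1000);
    return parts;
}
```

Note that a group such as 045 comes back as the number 45, so printing it as three digits would need zero padding.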

Digit Extraction Summary

So, in order to keep the n least significant digits of a base 10 number, use modulo with a divisor of 10^n (a 1 followed by n zeros, written as an integer!).

To throw away the n least significant digits of a base 10 number, use integer division with a divisor of 10^n (again, written as an integer!).

(If you are dealing with a different base, use the nth power of your base as the divisor. 'keep' is still modulo and 'throw away' is still integer division.)
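
The summary can be sketched as a pair of small functions. The names int_power, keep_low_digits, and drop_low_digits are hypothetical, and the sketch assumes non-negative values and a power that fits in a long:

```cpp
#include <cassert>

// Hypothetical helper: base raised to the n-th power, as an integer.
// Assumes the result fits in a long.
long int_power(long base, int n)
{
    assert(base >= 2 && n >= 0);
    long result = 1;
    for (int i = 0; i < n; ++i)
    {
        result *= base;
    }
    return result;
}

// 'Keep' the n least significant digits of value in the given base.
long keep_low_digits(long value, int n, long base = 10)
{
    return value % int_power(base, n);
}

// 'Throw away' the n least significant digits of value in the given base.
long drop_low_digits(long value, int n, long base = 10)
{
    return value / int_power(base, n);
}
```

With the default base, keep_low_digits(1234567, 4) is 4567. With base 16, keep_low_digits(0xABCD, 2, 16) is 0xCD and drop_low_digits(0xABCD, 2, 16) is 0xAB.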

Supplement: How Many Digits Can I Store?

Recalling that the integer types have these properties:

            Max (signed)    Max (unsigned)
   short    32767           65535
   long     2147483647      4294967295

We see that a short integer can store any 4-digit number. It can also store about 30% of the 5-digit numbers when signed, or a little over 60% of them when unsigned. A long integer can store any 9-digit number. When signed, it can hold about 20% of the 10-digit numbers, compared to about 40% when unsigned.
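
The standard library can report the "any number of this many digits" figure directly through std::numeric_limits. In the sketch below, full_digits is a hypothetical wrapper name, and the fixed-width types std::int16_t and std::int32_t stand in for the 16-bit short and 32-bit long assumed in the table (on your compiler, short and long may be wider):

```cpp
#include <cstdint>
#include <limits>

// Hypothetical wrapper: the largest digit count d such that EVERY
// d-digit decimal number fits in type T.
template <typename T>
constexpr int full_digits()
{
    return std::numeric_limits<T>::digits10;
}

// Stand-ins for the 16-bit short and 32-bit long in the table above.
constexpr int short_digits = full_digits<std::int16_t>();   // 4
constexpr int long_digits  = full_digits<std::int32_t>();   // 9
```

The unsigned versions report the same counts (4 and 9), since the extra range only covers part of the next digit length.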

Supplement: Storage -- It Takes a Lot of Space

Recalling that the discrete types have these properties:

            Bits    Bytes
   bool     32      4
   char     8       1
   short    16      2
   long     32      4

We see that storing a multi-group number takes quite a bit of space if done as individual character digits: one byte for each digit plus one for each group separator.

Storing as one short integer per group is slightly better: 2 bytes for each group. But storing as a single long integer is best: 4 bytes and that's it.

Let's look at a couple of examples:

   Digit Sequence    Stored as (in bytes)
                     chars    shorts    a long
   123-456           7        4         4
   12-345-678        10       6         4
   123-456-789       11       6         4
   123-456789        10       N/A       4
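
The byte counts for the 12-345-678 row can be checked by the compiler. The three struct names below are hypothetical, and fixed-width types again stand in for the 16-bit short and 32-bit long from the table:

```cpp
#include <cstdint>

// Three hypothetical layouts for the code 12-345-678.
struct AsChars  { char text[10]; };             // "12-345-678", no terminator
struct AsShorts { std::int16_t group[3]; };     // one 16-bit value per group
struct AsLong   { std::int32_t whole_code; };   // all 8 digits in one value

// Compile-time checks of the sizes listed in the table.
static_assert(sizeof(AsChars)  == 10, "one byte per digit or dash");
static_assert(sizeof(AsShorts) == 6,  "two bytes per group");
static_assert(sizeof(AsLong)   == 4,  "four bytes in total");
```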

All of this assumes that your number (or even each group) can fit in a single integer, of course. What if it can't? (Note how the last example can't be stored as a pair of short integers.) We can still minimize it a bit:

   1234-567890       short+long         2+4=6 bytes
   123-456-7890      3*short            3*2=6 bytes
             or      long+short         4+2=6 bytes

The first two are fairly self-explanatory, but what about that last one?! Two groups are combined into a single long integer. Hey...come to think of it, we were doing even three groups combined into a single long integer above. We've seen how to break a long integer into smaller groups, but how do we combine these groups for storage?

To combine the 123-456 into a single long integer for storage, simply multiply the first group by 1e3 (3 being the number of digits in the second group) and add the second group:

   123 * 1000 + 456  // gives 123456..?

Be careful, though. If the compiler sees exactly this, it may not give the correct result. Instead you could get -7616. (Well, it depends on the compiler, really, but traditionally it would be this.) Why? The literals 123 and 1000 are plain integers, which on such compilers are only 16 bits wide, the same as a short. When you multiply 123 by 1000, you'd like to get 123000. Instead you get -8072 (123000 crammed into 16 bits), and adding the 456 gives -7616. To avoid this possibility and make sure it will work, simply make your 1000 a long integer literal:

   123 * 1000L + 456  // gives 123456

or constant:

   const long dig_shift_3 = 1000;
   123 * dig_shift_3 + 456  // gives 123456

(Note: the L after the literal 1000 makes it a Long integer. You are allowed to use lower case l, but this is MUCH harder to read and is severely frowned upon!)
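
Here is a minimal sketch of both the combining step and the 16-bit surprise. The names combine_groups and as_16_bit_signed are hypothetical. The second function only simulates what a 16-bit signed result would hold, going through an unsigned type so that the wrap-around is well defined on any compiler:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: join two groups into one long. shift is 10 raised
// to the number of digits in the second group (1000 for 3 digits).
long combine_groups(long first, long second, long shift)
{
    assert(second >= 0 && second < shift);   // second group must fit
    return first * shift + second;           // all arithmetic done as long
}

// Hypothetical helper: what a 16-bit signed integer would end up holding
// if true_value were crammed into it (two's complement wrap-around).
int as_16_bit_signed(long true_value)
{
    std::uint16_t wrapped = static_cast<std::uint16_t>(true_value);
    return wrapped >= 32768 ? wrapped - 65536 : wrapped;
}
```

combine_groups(123, 456, 1000) gives 123456, while as_16_bit_signed(123000) gives -8072 and as_16_bit_signed(123456) gives -7616, the values described above.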