NOTE:
These are a little sketchy...I'll fill in details 'soon'.
1 byte == 8 bits: 00110100 would be 52 translated to base 10. Each bit position represents the presence or absence of a power of two in the number. The left-most bit is 2^7 (128) and the right-most bit is 2^0 (1).
00110100 = 0*2^7 + 0*2^6 + 1*2^5 + 1*2^4 + 0*2^3 + 1*2^2 + 0*2^1 + 0*2^0 = 32 + 16 + 4 = 52
(This is analogous to how base 10 works. 1234 is really 1*10^3 + 2*10^2 + 3*10^1 + 4*10^0. We just don't have to think through the steps because we've been using base 10 since kindergarten or even earlier...)
A flag is a signal that something is (or is not) happening. Historically, flags could have been placed atop a rampart to signal that a new prince had been born. (The absence of the flag would indicate that the birth had not yet occurred.) More recently, flags have been used to signal messages in semaphore code or to route planes on the runway.
In programming, we use a single bit to represent whether something has or has not happened (or even whether something should or should not happen). We continue to use real flag terminology like saying that a flag is set when it is flying. But then we bastardize it by saying that a flag is unset when it is not flying.
Let's say that you had 3 things that could either be or not be (fail, eof, and bad for instance). You could assign each of these three its own power of 2 (recall that each bit position represents a particular power of 2). Perhaps:
enum STATE_TYPE { fail_bit = 1, eof_bit = 2, bad_bit = 4 };
Notice the use of an enumeration to make the declaration of the constants easier and cleaner.
To store the state information, we'll need an integer variable (it is tricky to manipulate floating-point types bit-by-bit). A 'char' holds at least 8 bits — exactly 8 on virtually all modern systems (quoted due to the fact that we normally don't encode anything other than a keystroke in the character type; here we might make an exception). A short can always hold at least 16 bits. A long can always hold at least 32 bits. And an int is only guaranteed at least 16 bits — commonly 32 — depending on the hardware's word size. We'll ignore the 'char' and int possibilities since their sizes are so variable across platforms. Either short or long is good, but which one should we use? Most programmers will err on the side of caution and use a long. This allows them room to grow if later in the maintenance cycle more flags need to be added.
So we make a flag variable:
long flags;
The first thing we need to be able to do is to set flags. That is, we need to make the bit corresponding to a certain flag a 1. This could be done with addition, but it is traditionally done with the bit-wise or operator: |. This operator (half a logical or) performs an 'or' on the bits of two integers one-by-one. So, for instance:
  00110100
| 10010010
----------
  10110110
Only when there are two 0's aligned do we get a 0 bit in the result.
So, to set a particular flag, say eof_bit, in our flags variable, we would:
flags = flags | eof_bit;
Or, more likely, we'd use the short-hand form:
flags |= eof_bit;
The second chore we'd need to accomplish is to test if a flag is or is not set. This can be done by the bit-wise and operator: &. This operator (half a logical and) performs an 'and' on the bits of two integers one-by-one. So, for instance:
  00110100
& 10010010
----------
  00010000
Only when there are two 1's aligned do we get a 1 bit in the result.
So, to test a particular flag, say eof_bit, in our flags variable, we could do:
if ((flags & eof_bit) != 0)
{
    // eof_bit is set
}
else
{
    // eof_bit is NOT set
}
Note that the () around the and expression is necessary because == and != have precedence over the bit operators.
The final task that most people will ask of us is to unset a flag — make the bit a 0. This can be done by a combination of the bit-wise and with the bit-wise not operator: ~. This operator (a tilde) is used to flip the bits in an integer — ones become zeros and vice versa. So, for instance:
~ 00110100 = 11001011
~ 10010010 = 01101101
But how do these operations combine to unset a flag? First, we take the complement (bit-wise inverse/not) of the flag to unset: ~eof_bit. For an 8-bit example:
~ 00000100 = 11111011
Next we bit-and this with the flags: flags &= ~eof_bit. (Notice the short-hand bit-and-with-assignment.) Since only the eof_bit is a 0 in ~eof_bit, this bit-and will leave all other flags unchanged (it stays a 0 if it was a 0 and stays a 1 if it was a 1). But the eof_bit flag is bit-and'd with 0 forcing it off.
There is one other operation sometimes requested of bit-flags: toggle. This happens when you don't really care what the current state is, but you know you want it to flip to the other one (kind of like when you have multiple light switches hooked to a single lamp, they will eventually get out of sync so that the lamp is off but both switches are in the classically 'on' position). This operation requires the bit-wise exclusive or operator: ^. (Remember how we said it wasn't power in 121 — we meant it!) Exclusive or is defined to be true when one or the other — but not both nor neither — input is true:
  X       Y     |  X xor Y
----------------+-----------
 true    true   |  false
 true    false  |  true
 false   true   |  true
 false   false  |  false
Here 'xor' is the typical abbreviation of eXclusive OR. So, in C++ code, you might see:
flags ^= eof_bit;
To toggle the eof_bit. (Okay, so you'd never toggle the eof bit. But it's just an example!)
We can also extend the enumeration with a constant that combines all the flags at once:

enum STATE_FLAGS { fail_bit = 1, eof_bit = 2, bad_bit = 4, all_states = 7 };

Now you can extract all flags dealing with state by:
flags & all_states

thus making it easier to test for the 'good' condition:
if ((flags & all_states) == 0)
{
    // state is good
}
else
{
    // state is NOT good
}
Flags also work nicely as function arguments, letting the caller combine several options into a single parameter:

enum STRIP_FROM { beginning = 1, end = 2 };

...

f( ... STRIP_FROM where ... )
{
    if ((where & beginning) != 0)
    {
        // strip from beginning
    }

    // do something

    if ((where & end) != 0)
    {
        // strip from end
    }
}

// call...
f( ... beginning|end ... );   // strips from both ends