Humans and Their Symbols

Humans like to break up sequences of numbers with symbols:

         dates:     12/13/02
         times:     14:35
  phone number:     847-555-1234
     2D points:     (4.2,-8.3)
    map coords:     46o38'42"     (can't do the degree sign in text,
                                   so I had to use a lowercase O)

Note that the last two even have trailing symbols. The 2D points also have leading symbols. This can be tricky when your program wants to handle or calculate with the numbers in between, before, or after such symbols. You want to allow the user to enter their data in a natural fashion, but how do you get past all those weird symbols?!

Doing This In Code

We do this in a C++ program by mixing numeric and character input, because cin will simply stop when it finds a character it can't translate properly:

         C++ source code:                            User types:
     -------------------------------------------------------------
        short hour, minute;                           14:35
        char colon;
        cin >> hour >> colon >> minute;

What does cin do during this input sequence?

      Actions of cin                                    Buffer contents
      -------------------------------------------------------------------
      looking for a short integer (hour):
          see a '1', okay total is: 1, move on          [1]4:35\n
          see a '4', okay total is: 14, move on         1[4]:35\n
          see a ':', stop: hour = 14                    14[:]35\n
      looking for a char (colon):
          see a ':', okay: stop: colon = ':'            14[:]35\n
      looking for a short integer (minute):
          see a '3', okay total is: 3, move on          14:[3]5\n
          see a '5', okay total is: 35, move on         14:3[5]\n
          see a '\n', stop: minute = 35                 14:35[\n]
              (user hit Enter giving the character '\n')

      (The brackets mark the character cin is looking at during each step.)

Just remember that the computer doesn't really understand the variable name "colon". It simply knows that it is looking for a character. Any of the following would have worked as well:

       14,35      14$35      14#35      14*35      14'35"

(Of course, the last one wouldn't leave the buffer at a '\n' and so might cause problems for later inputs in the program.)
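
As a minimal sketch of the point above, the code below wraps the same extraction in a helper. The names readTime and demoReadTime are made up for this illustration, and the helper takes any input stream (not just cin) so that it can be tried on a string of "typed" text:

```cpp
#include <cassert>
#include <istream>
#include <sstream>

// Hypothetical helper: reads "hour <symbol> minute" from any input stream.
// Returns true when both numbers and the separating symbol were extracted.
bool readTime(std::istream& in, short& hour, short& minute) {
    char separator;  // receives ':' or whatever symbol was typed; never inspected
    in >> hour >> separator >> minute;
    return !in.fail();
}

// Tries the helper on in-memory text standing in for the keyboard.
void demoReadTime() {
    short hour = 0, minute = 0;

    std::istringstream colonInput("14:35\n");
    assert(readTime(colonInput, hour, minute));
    assert(hour == 14 && minute == 35);

    std::istringstream dollarInput("14$35\n");  // a different symbol works too
    assert(readTime(dollarInput, hour, minute));
    assert(hour == 14 && minute == 35);
}
```

In a real program the call would simply be readTime(cin, hour, minute).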

Handling A Trailing Symbol

Let's look a little further at that last example: 14'35". How would you read that? Well, we know to use a char to read the ' symbol (this is represented as '\'' in C++ code, by the way...). We'd probably need another char to read in the " symbol:

    char min, sec;
    short minutes, seconds;

    cin >> minutes >> min >> seconds >> sec;

(Note that the single quote or 'tick mark' is used as a unit symbol for both feet -- English length unit -- and minutes -- English unit for fractional part (1/60) of a degree. Similarly, the double quote is used as a unit symbol for both inches -- English length unit -- and seconds -- English unit for fractional part (1/60) of a fractional part (1/60) of a degree (or 1/3600 of a degree).)

That works fine, but it is a tiny bit wasteful. After all, we don't really plan on doing anything with the two character symbols after reading past them in the input. So, why waste two variables storing them when one memory position will do?

    char temp;
    short minutes, seconds;

    cin >> minutes >> temp >> seconds >> temp;

Ah...much better! Extending this idea slightly lets us do even more complicated things like:

    char temp;
    long dollars;
    short cents;

    cin >> temp >> dollars >> temp >> cents;

Here there is a leading symbol (most likely a '$' in this locale), a number of dollars, another symbol (a '.' in this locale), and a number of cents. (Okay, so we could have just done:

    char temp;
    double cost;
    long dollars;
    short cents;

    cin >> temp >> cost;
    //cents = ???;
    //dollars = ???;

But now how do we extract the dollars and cents from the whole cost? It will take more tools than we currently have...)
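
Going back to the version that reads dollars and cents separately, here is a small sketch of it in action. The names readMoney and demoReadMoney are invented for this example, and the input comes from a string rather than the keyboard so the result can be checked:

```cpp
#include <cassert>
#include <istream>
#include <sstream>

// Hypothetical helper: reads an amount typed as <symbol>dollars<symbol>cents,
// for example "$12.34". Both symbols land in the same throwaway char.
bool readMoney(std::istream& in, long& dollars, short& cents) {
    char temp;  // reused for the leading '$' and then for the '.'
    in >> temp >> dollars >> temp >> cents;
    return !in.fail();
}

// Tries the helper on in-memory text standing in for the keyboard.
void demoReadMoney() {
    long dollars = 0;
    short cents = 0;
    std::istringstream typed("$12.34\n");
    assert(readMoney(typed, dollars, cents));
    assert(dollars == 12 && cents == 34);
}
```

One caveat with this approach: an entry like $12.5 would give cents the value 5 rather than 50, because the digits after the second symbol are read as a plain integer.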

The Trailing '\n' and Further Input

Wait...what happens to the user's <Enter> key-stroke? Earlier we said it was a newline character in the input stream:

    14:35\n

Well, when translating this time entry, the hours value stops when the ':' is reached since ':' is an invalid symbol to be part of a short integer. Then the ':' is extracted into the temporary char variable. Then the 35 stops at the '\n' since whitespace can't appear inside a number, either. If that's all the program asked for, cin will stop there and leave the '\n' for later. If the program ends before cin is asked to translate anything else, the '\n' will either be thrown out or given to the operating system (usually the former).

If your program asks cin to translate something else (another extraction operation), it will skip any leading whitespace before the data begins -- including this left-over newline.

So, you don't have to worry about the '\n' characters in the user's input -- cin will automatically ignore them (when extracting; using >>).
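
To see that skipping for yourself, here is a short sketch (demoLeftoverNewline is a made-up name, and a string stream plays the role of the keyboard) that reads two times in a row and peeks at the buffer in between:

```cpp
#include <cassert>
#include <sstream>

// Reads two times back to back. The '\n' left over from the first entry
// counts as leading whitespace for the second extraction and is skipped.
void demoLeftoverNewline() {
    std::istringstream typed("14:35\n9:05\n");  // as if typed at two prompts
    short hour1 = 0, minute1 = 0, hour2 = 0, minute2 = 0;
    char temp;

    typed >> hour1 >> temp >> minute1;
    assert(typed.peek() == '\n');       // the Enter keystroke is still waiting

    typed >> hour2 >> temp >> minute2;  // skips that '\n' before reading the 9
    assert(hour1 == 14 && minute1 == 35);
    assert(hour2 == 9 && minute2 == 5);
}
```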

Unknown or "Don't Care" Trailing Stuff

Let's look at this example where we prompt a user (a park ranger, supposedly) for the projected growth rate of deer for their park:

    double growth;
    char symbol;

    cout << "\nAnd what was the expected percent growth rate?  ";
    cin >> growth >> symbol;

Here we clearly expect that the user will type in the percent symbol (%) after their growth rate value. But will they? If they don't, this program will hang waiting for that char information to be entered. (Again, it doesn't have to be %, but some non-space input is required.)

A hung program is never fun so let's find a way to fix this. My first question to you is: Do we really need that symbol? I mean, it is a convention that the user should put it, but most humans are lazy and won't actually type it in an understood context like this. So why are we making them enter it? Maybe this would suffice:

    double growth;

    cout << "\nAnd what was the expected percent growth rate?  ";
    cin >> growth;

Sure, there are odd-balls (myself included!) who will obsessively enter that % after the value, but why bow to them? They are the minority, right? True, but they are there and not going away and it isn't that hard to make the code work for both groups of users.

So how can we fix it? We'll need the help of one of cin's crew members. (Recall that cin is the captain of a boat on the input stream from the keyboard.) We have used the extraction crew for a long time now (each variable type we can >> into has a special handler in the cargo bay to oversee its translation). This new crew member is more like a bouncer for the cargo bay and kicks out any characters we don't want to deal with. In this case that silly %.

Function Vocabulary

When you use a function in programming, we don't say 'evaluate' the function or 'invoke' the function as some math teachers do. We say that we 'call' the function. The metaphor here is that the function is sitting at a desk somewhere and we phone it up and tell it any specific information it might need to perform its task. We then wait until the function comes back to the phone and tells us about its results before we both hang up and then we can continue our program from right after the call.

Further, we'll see that the inputs given to the function are said to be 'passed' to the function. Of course, now we're playing catch rather than making a phone call and the metaphor is fractured beyond sanity. But we're programmers — not writers.

To use him, though, we'll have to learn a new syntax. You see, the bouncer isn't an operator like extraction but a function. Worse yet, he's a pretty peculiar function in terms of what information he expects coming in and with whom he'll work. Let's actually look at the simplest call we can make to the bouncer for cin's cargo bay:

    cin.ignore();

As you can see, the bouncer's name is ignore. He is called by placing the () after his name in the code and the . beforehand preceded by the keyboard captain's name. To fully understand this, we have to read it from right to left:

    cin        .        ignore    ()    ;
   input      with     function   call
   stream    respect      ^       this
     ^       to this      |________|
     |___________|

So, when we call on ignore with no input (the parentheses are empty) and with respect to cin, we are telling it to throw out of the cargo bay the very next character that would have been processed.
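
A tiny sketch of that single-character call, assuming the user typed 5% and hit Enter (demoIgnoreOne is a name invented here, and a string stream stands in for cin):

```cpp
#include <cassert>
#include <sstream>

// Shows that ignore() with empty parentheses discards exactly one character.
void demoIgnoreOne() {
    std::istringstream typed("5%\n");
    double growth = 0.0;

    typed >> growth;               // translation stops in front of the '%'
    assert(typed.peek() == '%');

    typed.ignore();                // throws out one character: the '%'
    assert(growth == 5.0);
    assert(typed.peek() == '\n');  // only the Enter keystroke remains
}
```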

Is this going to work? At first test it would seem so. When my OCD brethren type the %, ignore throws it away so that any further inputs from cin happen cleanly. And when the rest of you type nothing...well, what's being ignored in that case? Luckily there is the \n waiting from when the user hit Enter. (Remember, every keystroke is a char and is being processed. Most of the time the blank ones like ' ', \t, and \n are just thrown out by a >> crew member as useless -- containing no information. (More on that in a minute...) But they are there in the cargo bay and can be processed in other ways. Here, they give a sense of balance to our ignore function call! What luck!)

So that's good. We're done, right? No. Sorry. It turns out that there are weirder OCDs in our customer base than we'd expected. Some of them entered ' %' -- note the extra leading space there -- and that was enough to break our program! After all, we only said to ignore a single character and they typed two (three including the \n).

Others typed the word percent instead of the symbol. Others -- remember we're dealing with park rangers here -- typed in long diatribes about how lonely they were and how there just aren't enough deer in the park and how they wish the growth rate were higher. It was really sad...

To fix these, we need to learn about a second way the bouncer can be called upon. When we tell him nothing — empty () after his name — he throws out a single character (aka keystroke). But, we can also call on him with two pieces of input. These represent a maximum number of characters to throw out and a special character value that, when seen, signals that it is the last one to throw out. We represent this documentationally like so:

   cin.ignore(n, c);

So n represents the maximum number of chars to be thrown out during this call and c represents the special termination character that, when thrown out, stops us from continuing — even if we haven't reached our maximum yet! Note that we use a comma to separate multiple arguments from one another — much like we separated multiple variables declared all of the same data type from one another by commas.

At first, when the  % folks appeared, we tried this:

    cin.ignore(2, '\n');

The % shouldn't be more than a single space out and it would take two chars therefore to reach it. And the \n is still our ultimate terminator since the user has to type it before we even get to start processing their input so it will always be present..!
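
Here is a rough sketch of how that two-character limit plays out. The helper secondReadWorks is hypothetical: it reads the growth rate, makes the ignore(2, '\n') call, and then tries to read one more number the way a later prompt would (the test data always uses 7 for that second number):

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper: reads a growth rate, discards at most two more
// characters (stopping early once a newline has been thrown out), then
// tries to read a second number as a later prompt in the program would.
// Returns true when that second number arrives intact as 7.
bool secondReadWorks(const std::string& typedText) {
    std::istringstream typed(typedText);
    double growth = 0.0, next = 0.0;
    typed >> growth;
    typed.ignore(2, '\n');
    typed >> next;
    return !typed.fail() && next == 7.0;
}

// Tries the helper on several styles of entry.
void demoIgnoreTwo() {
    assert(secondReadWorks("5\n7\n"));           // nothing extra typed
    assert(secondReadWorks("5%\n7\n"));          // '%' right after the value
    assert(secondReadWorks("5 %\n7\n"));         // space, then '%'
    assert(!secondReadWorks("5 percent\n7\n"));  // too long for a limit of 2
}
```

The last case shows why a small fixed limit isn't enough: the left-over letters are still sitting in the buffer when the next number is requested.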

But then those percent typists and diatribers showed up and we sought a larger value that seemed useful. Turns out those diatribers can go on for a long time... But, looking at the iostream documentation further, we noticed that there was a special circumstance! If we passed the largest possible integer for the system as the maximum number of characters to throw out, ignore would assume we meant to throw out as many as necessary to reach the special terminator character! (A poor-man's attempt at representing infinity inside the computer, essentially...)

So, what's the largest possible integer for the system? For that bit of information we turn to a new library: limits. In it we find all manner of constants for various limits on memory used to store integer quantities.

Particularly, we are interested in the numeric_limits<streamsize>::max() constant that represents the largest possible size of stream information on the system — our infinity marker. Passing this as the first input to ignore we get our desired result:

    cin.ignore(numeric_limits<streamsize>::max(), '\n');

Putting all this together (don't forget to #include <limits>!) we have:

    double growth;

    cout << "\nAnd what was the expected percent growth rate?  ";
    cin >> growth;
    cin.ignore(numeric_limits<streamsize>::max(), '\n');

Again, note that we've read in the growth rate variable's value before we ignored the rest of the line of input! If you don't do it in this exact order, your program will make them enter the data twice at best and run amok at worst..!
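
As a sketch of how this behaves with our three kinds of users, the hypothetical helper readGrowth below does the read-then-ignore pair on any input stream (the prompt is left out to keep it short), and demoReadGrowth feeds it three lines of made-up input:

```cpp
#include <cassert>
#include <istream>
#include <limits>
#include <sstream>

// Hypothetical helper: reads one growth rate and then discards the rest of
// that input line, whatever the user typed after the number.
double readGrowth(std::istream& in) {
    double growth = 0.0;
    in >> growth;  // read the value first...
    in.ignore(std::numeric_limits<std::streamsize>::max(), '\n');  // ...then clean up
    return growth;
}

// Feeds the helper a diatribe, a percent symbol, and a bare number.
void demoReadGrowth() {
    std::istringstream typed("5 percent, and I wish it were higher\n7%\n12\n");
    assert(readGrowth(typed) == 5.0);
    assert(readGrowth(typed) == 7.0);
    assert(readGrowth(typed) == 12.0);
}
```

In a real program the call would be readGrowth(cin), right after the prompt is printed.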

Other Uses

Also, since this is so effective at cleaning out cin's cargo bay, we might also want to use it after each input [group]. Whenever we are done reading the data from a particular prompt, we can use the ignore(numeric_limits<streamsize>::max(),'\n') pattern to make sure the cargo bay is nice and empty so that further prompt/input actions in our program will have a clean place to start.

But this new tool we've just added to our programming tool-belt is much more than just for throwing out potential symbols or garbage! We can use it also for pausing the program. Assuming we had a clean cargo bay to start with, we can use something like this:

    cout << "\nPress Enter to continue...\n";
    cin.ignore(numeric_limits<streamsize>::max(), '\n');

And the user will be forced to at least hit the Return key to move on with the program.
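
If you find yourself pausing often, the pair of statements could be bundled up as in this sketch. The name pauseForEnter is invented here, and the streams are passed in so the behavior can be checked without a keyboard:

```cpp
#include <cassert>
#include <iostream>
#include <limits>
#include <sstream>

// Hypothetical helper: prints the prompt, then discards input up to and
// including the next newline (the user's Enter keystroke).
void pauseForEnter(std::istream& in, std::ostream& out) {
    out << "\nPress Enter to continue...\n";
    in.ignore(std::numeric_limits<std::streamsize>::max(), '\n');
}

// Tries the helper with string streams in place of the keyboard and screen.
void demoPause() {
    std::istringstream typed("\nnext entry\n");  // a bare Enter, then later input
    std::ostringstream shown;
    pauseForEnter(typed, shown);
    assert(typed.peek() == 'n');  // only the Enter keystroke was consumed
}
```

With the real keyboard and screen the call would be pauseForEnter(cin, cout).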

We'll also see shortly that this form of ignore can be used to allow the user to enter whole words at prompts where we only need the first letter to distinguish their intent. So, for instance:

    char answer;
    // . . .
    cout << "Would you like to go on?  ";
    cin >> answer;
    cin.ignore(numeric_limits<streamsize>::max(), '\n');

This allows the user to enter not only y or n at the prompt, but yes or no as well. (In fact, they could enter whole phrases if they like -- as long as they start with a y or n so we can tell them apart in our program. We'll learn exactly how to tell them apart shortly...)
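
A brief sketch of that first-letter idea, with the prompt omitted and a string stream standing in for cin (readAnswer and demoReadAnswer are made-up names):

```cpp
#include <cassert>
#include <istream>
#include <limits>
#include <sstream>

// Hypothetical helper: keeps the first non-blank character of the user's
// reply and throws away the rest of that line.
char readAnswer(std::istream& in) {
    char answer = '\0';
    in >> answer;
    in.ignore(std::numeric_limits<std::streamsize>::max(), '\n');
    return answer;
}

// Two replies, one per line, as if typed at two separate prompts.
void demoReadAnswer() {
    std::istringstream typed("yes\nno thanks\n");
    assert(readAnswer(typed) == 'y');
    assert(readAnswer(typed) == 'n');
}
```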

Internal Spaces and Extraction

As with the excess newline and continued input, cin will likewise ignore superfluous spacing inside an entry. If cin's buffer held:

    14 : 35 \n

Our input would go like this:

      Actions of cin                                                  Buffer contents
      ----------------------------------------------------------------------------------
      looking for a short integer (hour):
          see a '1', okay total is: 1, move on                        [1]4 : 35 \n
          see a '4', okay total is: 14, move on                       1[4] : 35 \n
          see a ' ', stop: hour = 14                                  14[ ]: 35 \n
      looking for a char (colon):
          see a ' ', spacing is insignificant -- worthless, move on   14[ ]: 35 \n
          see a ':', okay: stop: colon = ':'                          14 [:] 35 \n
      looking for a short integer (minute):
          see a ' ', spacing is insignificant -- worthless, move on   14 :[ ]35 \n
          see a '3', okay total is: 3, move on                        14 : [3]5 \n
          see a '5', okay total is: 35, move on                       14 : 3[5] \n
          see a ' ', stop: minute = 35                                14 : 35[ ]\n

      (The brackets mark the character cin is looking at during each step.)

Here the hours (14) will stop at the space (' '). The temporary character extraction will skip the leading space (' ') and get the colon (':'). And the minutes will skip the leading space (' '), grab the 35, and stop at the trailing space (' '). That leaves a space (' ') and newline ('\n') waiting in the input stream (on the boat) for further input to skip -- just as the '\n' alone was skipped above.
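
The same walk-through can be sketched in code. Here demoInternalSpaces is an invented name and a string stream holds the spaced-out entry in place of cin's buffer:

```cpp
#include <cassert>
#include <sstream>

// Extracts a time from an entry with extra spaces around the colon and
// checks what is left waiting in the buffer afterward.
void demoInternalSpaces() {
    std::istringstream typed("14 : 35 \n");
    short hour = 0, minute = 0;
    char colon = '\0';

    typed >> hour >> colon >> minute;  // each >> skips its leading spaces
    assert(hour == 14 && colon == ':' && minute == 35);
    assert(typed.peek() == ' ');       // the trailing space (and '\n') still wait
}
```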

cin's Behavior When It Can Translate Nothing

But what if the user is a real moron!? What if they don't type a number at all? What if they type letters instead?!

If the user typed something like this:

    half past two

Instead of 14:35, well, they are just asking for trouble. Later we'll learn to take care of this situation, but for now note that cin will stop when it tries to read the 'h' for a short integer. It will be terribly upset that it found nothing to translate and will refuse to do anything else for the rest of your program. Not only will your hours variable not get a useful value (older compilers leave it untouched, while C++11 and later set it to zero), cin won't store anything in your temporary char or your minutes, either. And any later input statements will be essentially ignored as well.
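
Here is a small sketch of that sulking behavior, using a string stream in place of cin. The name demoFailedExtraction is made up, and the fail() function it calls is just one way to peek at the stream's mood; fixing the problem is a topic for later:

```cpp
#include <cassert>
#include <sstream>

// Shows that once a numeric extraction finds nothing to translate, the
// stream refuses every later extraction as well.
void demoFailedExtraction() {
    std::istringstream typed("half past two\n");
    short hour = -1, minute = -1;
    char temp = '?';

    typed >> hour >> temp >> minute;  // trips over the 'h' right away
    assert(typed.fail());             // the stream is now in a failed state
    assert(temp == '?' && minute == -1);  // these were never touched

    char later = '?';
    typed >> later;                   // a later input is ignored, too
    assert(later == '?');
}
```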

For now, do your best to keep cin happy and not type symbols or letters where it expects a number. *smile*