Dealing With Possibilities

Humans tend to be sloppy. Any notation they come up with is bound to have convenience mechanisms built in.

    full units:     12 meters
    abrv units:     12 m
     full time:     12:42:31
     abrv time:     12:42

Sometimes this is okay; sometimes it can be deadly:

    // read time  -- okay, but annoying
    char colon;
    short hour, minute, second;
    cin >> hour >> colon >> minute
        >> colon >> second;
    // if user enters abbreviated version (without seconds),
    // computer just sits and waits for further input


    // read units  -- potentially deadly
    char unit;
    double measure;
    cin >> measure >> unit;
    // leaves "eters" in keyboard buffer if user enters the full unit name;
    // the next cin extraction will attempt to use "eters" for its input

How can we rectify these situations? By having our program make some decisions about what is being input:

    // read time
    char colon;
    short hour, minute, second;
    cin >> hour >> colon >> minute;
    if (cin.peek() != '\n')    // read seconds if not done
    {
        cin >> colon >> second;
    }
    else                       // if no seconds entered, must be 0
    {
        second = 0;
    }

    // read units
    char unit;
    double measure;
    cin >> measure >> unit;
    if (cin.peek() != '\n')     // ignore until end-of-line if
    {                           // units elongated...
        cin.ignore(20, '\n');
    }

Wow! How does that work?!

New Input Functions

It turns out that cin can do a lot more than just store information in your variables. It has that whole buffer to work with. (Recall the buffer model of stream input.)

In order to work with this, we need to learn a few new functions and a new syntax for calling them. These functions are owned by cin. The . operator signifies 'owns the function'. So, when the above code says 'cin.peek()', it is saying 'cin owns the function peek'. (Of course, the () after the function name signify that we are calling it and that it takes no arguments.)

Here are the new functions and their use:

Function   Takes In      Returns   Buffer Action
peek       nothing       char      copies next character
get        nothing       char      copies next character, advances buffer position (effectively removing character from buffer)
get        char&         nothing   copies next character, advances buffer position (effectively removing character from buffer)
ignore     nothing       nothing   advances buffer position one character (again, 'removing' that character from buffer)
ignore     count, char   nothing   advances buffer position count characters or until the given char is seen (this char is 'removed', too)

Note that get and ignore have been overloaded (twice each) to make them useful in different situations. (We won't use get right now, but it will come in handy after we learn to repeat actions and store multiple char values together.)

Also note that all of these functions act on any character -- spaces, letters, digits, punctuation, etc. Normal cin extraction skips all spacing and deals only with the letters, digits, etc.

Putting It To Use

peek effectively allows you to peek ahead to see what character is about to be read in. You can then make decisions based on the value. Let's look back at the examples above:

    // read time
    char colon;
    short hour, minute, second;
    cin >> hour >> colon >> minute;
    if (cin.peek() != '\n')    // read seconds if not done
    {
        cin >> colon >> second;
    }
    else                       // if no seconds entered, must be 0
    {
        second = 0;
    }

Here we are peeking ahead to see if there is a new-line. Recall that newlines represent the Enter key and so indicate the end of the user's input. If the next character to be read is a new-line, then they were done after the minute. Otherwise, we need to keep reading to get the seconds value.

With the units example:

    // read units
    char unit;
    double measure;
    cin >> measure >> unit;
    if (cin.peek() != '\n')     // ignore until end-of-line if
    {                           // units elongated...
        cin.ignore(20, '\n');
    }

Here we again peek ahead to see if there is a new-line waiting. If there is, then we are done. Otherwise, we should throw out all the remaining characters until we reach the new-line. Here I've only thrown out 20 because most units aren't very long.

In general, however, this solution isn't satisfactory. How do we know how many errant characters the user has typed? We don't. If, though, we also #include the limits library (<limits>) in our program, we can use this version:

    // read units
    char unit;
    double measure;
    cin >> measure >> unit;
    if (cin.peek() != '\n')     // ignore until end-of-line if
    {                           // units elongated...
        cin.ignore(numeric_limits<streamsize>::max(), '\n');
    }

Since only an insane person would want to ignore exactly the maximum integer (numeric_limits<streamsize>::max()) worth of characters, the stream assumes that you just want it to not count and stop at the specified character. (i.e. it will read as many characters as needed to reach the stop character -- here a new-line.)

To tell the truth, we don't even really need to peek() here. Since there will be a newline no matter what, we can just perform the ignore():

    // read units
    char unit;
    double measure;
    cin >> measure >> unit;
    cin.ignore(numeric_limits<streamsize>::max(), '\n');    // ignore until end-of-line

But we could still peek() to see if they entered units at all:

    // read units
    char unit;
    double measure;
    cin >> measure;
    if (cin.peek() != '\n')   // see if units entered...
    {
        cin >> unit;
    }
    else                      // if not, assume some unit...
    {
        unit = 'm';
    }
    cin.ignore(numeric_limits<streamsize>::max(), '\n');    // ignore until end-of-line

(Some of you may now be wondering: "Why didn't Jason just use the data type string for unit instead of char? That way he could read in as many characters as there were in the user's units..." Well, while that may be true, it still wouldn't solve all our problems -- see below. Plus, there are many situations where it is desirable to have as few characters to deal with as possible -- formatting output, yes/no questions, etc.)

(When we get to looping, we'll see a more versatile way to throw out extra characters!)

Possible Problems

Of course, what if the user isn't as precise as we'd like? What if, instead of "12:42\n" they typed "12:42 \n"? That extra space would make us believe there were seconds to follow! One possibility would be to use the isspace function from the cctype library:

    // read time
    char colon;
    short hour, minute, second;
    cin >> hour >> colon >> minute;
    if (!isspace(cin.peek()))    // read seconds if not done
    {
        cin >> colon >> second;
    }
    else                         // if no seconds entered, must be 0
    {
        second = 0;
    }

This would only try to keep reading when the next character wasn't a space. But what if they typed "12 : 42 : 31\n"? Now there is a space, but there is a number of seconds coming!

The basic idea is that the user can be more moronic than you can be clever. Don't waste precious time with stray bits when you could be working on something more useful than that one silly person who does something off-the-wall.

And what about the units? How about if the user entered "km" for their units? We didn't even think about this! Or what if they aren't metric? They might enter "in"! This isn't a problem with the peek-ahead idea, it is more a problem with the basic design. Here you'd have to rethink your input a little to account for a reasonable amount of variation from the user. (Don't get carried away! Just do basic stuff: in, ft, yd, mi, km, mm, cm. Don't worry about decameters, decimeters, gigameters, angstroms, etc. Such things happen rarely. *smile*)

Case Study: One vs. Two char Units

What do you do in those screwy cases where the units may be a single char ('m', 'l', ...) or two chars ("km", "dl", ...)? Well, just make two unit variables, and if only one unit char is given, make the other one a space:

    // read units
    char unit1, unit2;
    double measure;
    cin >> measure;
    if (cin.peek() != '\n')   // see if units entered...
    {
        cin >> unit1;
        if (cin.peek() != '\n')  // a second unit char!
        {
            cin >> unit2;
        }
        else                     // only one unit char...
        {
            unit2 = ' ';
        }
    }
    else                      // if not, assume some unit...
    {
        unit1 = 'm';
        unit2 = ' ';
    }
    cin.ignore(numeric_limits<streamsize>::max(), '\n');    // ignore until end-of-line

But, again, the user can still screw us up: kilometers instead of km for instance. We'd need some much more powerful logic to handle these kinds of situations!

Detecting Translation Failure

When cin is reading a number (double, short, etc.) and finds a non-numeric character before it translates anything successfully, it gets upset and tells itself it is a failure. (We call it setting the fail state...) When this happens, it will do almost nothing until we give it some medication ...er... send it to therapy ...er... call the clear() function!

But how do we know this has happened? cin has another function called fail() that will report to you that it has (or hasn't) encountered a failure. So, we need to do something like:

    cin >> x;
    if (cin.fail())                // did we translate poorly?
    {
        cin.clear();               // calm cin down
        cin.ignore(numeric_limits<streamsize>::max(), '\n');   // throw out rest of line
        cin >> x;                   // try again
    }

The False Promise of Hope: ignore and the Single Retry

Of course, if they screwed it up the first time, there is no guarantee that they'll get it right with just one more attempt. To fix this, we'll need some looping...