The Library

There is a data type that can store "strings" of characters (like the literal "Hello World!" and others we've seen). This data type is called, oddly enough, string. It is not a built-in type (like short, char, or double), however. This data type is defined as a class in the string standard library. To use the string data type, you'll have to remember to #include this library:

    #include <string>

You may recall that we mentioned briefly that cout and cin are objects/variables of class types from over in the iostream library. (In fact, we'll learn to create our own classes later this semester.)

Declaring string Variables

Just Like Normal

To declare a variable of the string type, simply use it like you would any of the built-in types:

    string var_name;

You can even initialize it, declare multiple variables at once, or even make a constant. All of these things are done just like you do them with the built-in types:

    string var = "value", var2, var3;
    const string title = "String Program";

Better Than Normal

However, when you need to declare a string, it can behave so much better than the built-in types. Some of these declaration/construction options allow the string to behave better than a built-in type under the same circumstances. Other declaration/construction options allow the programmer creating the string more flexibility in its initialization than the built-in types allow. There are actually so many options when creating a string object/variable that we just can't cover all of them here in the first course (at least not right away; *grin*).

Better Behaved

The first advantage that the string class has over the built-in types here is that when you declare a string variable and don't initialize it, it will automatically contain an empty string ("") — every time...guaranteed! (Recall that if you don't initialize a double or other built-in type variable, it simply retains the garbage bits that were in that memory location before. Although this often ends up being a 0, it is not guaranteed. Here, we are guaranteed that the uninitialized string variable will be an empty string ("").)
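To see that guarantee in action, here is a minimal sketch. The function name length_of_fresh_string is made up for this illustration; all it does is declare a string with no initializer and report how many characters it holds.

```cpp
#include <string>

using std::string;

// Hypothetical helper: reports the length of a string that was
// declared without any initializer.
string::size_type length_of_fresh_string()
{
    string fresh;            // no initializer, so it starts out as ""
    return fresh.length();   // an empty string holds 0 characters
}
```

No matter how many times you call it, the answer should come back 0.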

More Flexible

The second advantage that the string class has over the built-in types in declaring/constructing objects/variables is that you can initialize the string object in more than just the normal three ways (not at all, with another variable of the same type, or with a literal compatible with the type). The one we'll find most useful here is the ability to create a string object that contains a certain number of duplicates of a certain character.

Hunh? Oh. Let's see an example to help clarify:

    const string sep_line(75,'-');

There. The constant sep_line is being initialized to hold 75 dashes ('-'). Once created, we can use this anywhere we feel the need to print a separator line (hence the identifier) in our program's output. We could also use it to create bars for a histogram chart (say, for student scores?):

    string histo_bar(static_cast<string::size_type>(floor(score+0.5)), '*');
    cout << setw(7) << score << " | "
         << histo_bar << endl;

Here we are creating histo_bar as a string object consisting of asterisk (star; '*') characters. How many of them? That depends on the value of the variable score. We take that value, round it to the nearest whole value with our standard rounding formula, and then convert it to an appropriate type with a typecast. (Recall that floor() returns a double. Also recall that positions within a string are numbered from 0 and are of data type 'string::size_type'. Again, here the :: means that size_type is 'inside of' the string class. ...just like it meant for namespace std, but for a class instead of a namespace. *shrug* Just trust me!)

For more on the setw manipulator, see the pretty printing notes elsewhere.

Now we would just repeat this cout for each student. How? Well, um... We'll discuss that a bit later...
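In the meantime, here is a rough sketch that tucks the bar-building step into a function so the same rounding-and-casting code doesn't have to be retyped for every score. The name make_bar is hypothetical, and the sketch assumes score is never negative (a negative value does not convert sensibly to string::size_type).

```cpp
#include <cmath>
#include <string>

using std::string;

// Hypothetical helper: builds a bar of '*' characters whose length is
// score rounded to the nearest whole number.
// Assumption: score is not negative.
string make_bar(double score)
{
    return string(static_cast<string::size_type>(std::floor(score + 0.5)), '*');
}
```

The result can be sent to cout just like histo_bar was: cout << setw(7) << score << " | " << make_bar(score) << endl;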

And Nary a Variable Could Be Found!

Of course, we could also have coded that cout more like this:

    cout << setw(7) << score << " | "
         << string(static_cast<string::size_type>(floor(score+0.5)), '*')
         << endl;

Here, we are avoiding the creation of a string variable by creating what is known as an 'anonymous object'. By using the construction syntax without a variable name, we still create the string, but it isn't stored into a named memory location. Instead, it is simply sent for display to cout and then tossed away. (*smacks hands together in an up-and-down motion as if wiping them 'of it'*) (Okay...so that works better in person... But I don't have a video camera or enough space for movie clips on my web service!)


*CONSTRUCTION ZONE*

*WATCH FOR FALLING*

*****CHARACTERS****


Using strings

But what does the string class data type allow us to do besides declare variables and automatically initialize them when we don't?

Input/Output

Well, string variables can be used in standard stream insertion and extraction operations:

    string var;
    cin >> var;
    cout << var;

Assignment

They can be used in a standard assignment operation:

    string var, var2;
    var = "value";
    var2 = var;

Concatenation

And strings even do something called concatenation:

    string var = "Hello", var2;
    var2 = var + ' ' + "World" + '!';
    // var2 is now "Hello World!"

The addition operator is used to attach two strings — end-to-beginning — to create a new string with the combined contents of both the originals in the indicated order. (Note that the two strings being combined are left unchanged but a new string is created. Just like when you add 3 and 4 to get 7, 3 and 4 are unchanged but 7 is created. Or, less esoterically, when you build a wall from bricks, the bricks remain but you've created a wall.)

You'll also note in the example that a string variable can be concatenated with not only string literals but also char literals. There is a rule, though!

One of the first two items to be concatenated in a sequence of + operations must be of the string data type.

And, oddly enough, not only is a char literal not of the string data type, but neither is a string literal!!! string literals are of a completely different type from string variables (or constants) you declare! (It is a hold-over from our C language heritage that makes these literals not match their variables. *shrug* Watcha gonna do?)

That means that:

    string name, greeting;
    const string greeting_start = "Hello, ",
                 greeting_end   = "!\n";
    cout << "What's your name?  ";
    cin >> name;

    greeting = greeting_start + name + greeting_end;

    cout << greeting;

is legal, but:

    string greeting;

    greeting = "Hello, " + "Jose" + "!\n";

    cout << greeting;

is not going to do what we want. In fact, it isn't even going to compile! (The why is tedious and icky, but suffice it to say that the compiler can't turn what's on the right side into a string type value to assign to our variable.)
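One way around this, sketched below, is to turn the first literal into a real string using the construction-without-a-name syntax from earlier. Once the leftmost item is a string, every + after it has a string on its left and the rule is satisfied. The function name greet_jose is invented for the example.

```cpp
#include <string>

using std::string;

// Hypothetical helper: builds a greeting out of nothing but literals.
// Wrapping the first literal in string(...) makes it a string object,
// so each + that follows has a string on its left-hand side.
string greet_jose()
{
    return string("Hello, ") + "Jose" + "!\n";
}
```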

A Posse of Helpers

But that's not all! The string class comes with a bevy of helpers that can make life so much easier for us at times. These helpers take the form of functions — inside the class. Therefore, similar to cin's ignore() function, we'll be using the member access (or 'dot': .) operator to access them from our string variables.

length()

For instance, a string variable can tell you the length of its contents (i.e. the number of characters it contains):

    string word;
    cout << "Please enter a word:  ";
    cin >> word;
    cout << '\'' << word << "' is " << word.length()
         << " characters long.\n";

Again, note the '.' (dot) syntax as we had with both versions of cin.ignore(). This is in general how you call a function that is inside a class variable: variable, dot, function (with () after the function's name and any necessary arguments, of course).

(You could also ask the string for its size() instead of its length(). It's really all a matter of taste, thought-process, and vocabulary.)
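Here is a small sketch of the same message-building idea in a form that is easy to check. The name describe_word is hypothetical, and std::ostringstream (from the sstream library, not covered in these notes) is used purely as a stand-in for cout so the text ends up in a string instead of on the screen.

```cpp
#include <sstream>
#include <string>

using std::string;

// Hypothetical helper: builds the same message the cout statement above
// prints, but collects it in a string so it can be inspected.
// Assumption: std::ostringstream stands in for cout here.
string describe_word(const string& word)
{
    std::ostringstream out;
    out << '\'' << word << "' is " << word.length()
        << " characters long.\n";
    return out.str();
}
```

Swapping word.length() for word.size() should give exactly the same text.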

replace()

You can also replace part of a string with another string:

    string first, middle, last, whole;

    cout << "\nWhat is your name (First Middle Last)?  ";
    cin >> first >> middle >> last;

    whole = first + ' ' + middle + ' ' + last;

    cout << "\nWell, '" << whole <<"' is a fine name,\nbut "
         << "wouldn't '";

    whole.replace(first.length()+1, middle.length(), "Allowicious");

    cout << whole << "' sound so much cooler?\n";

This fragment demonstrates declaration, input, concatenation, assignment, output, and length finding as well as replacement. Note the three arguments given to the replace() function:

  1. a position from which to replace
  2. a number of characters to replace
  3. the replacement string

In our example we want to replace() from the first character of the middle name (which is right after the entire first name and the space that separates the first and middle names) for the entirety of the middle name with "Allowicious" (yes, we are being facetious).
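To make the position arithmetic concrete, here is a sketch with the input step removed so the pieces can be handed in directly. The function name swap_middle and the sample names in the comment are made up for illustration.

```cpp
#include <string>

using std::string;

// Hypothetical helper: joins three names with spaces, then replaces the
// middle one.  The middle name starts right after the first name and
// the single space that follows it, hence first.length() + 1.
string swap_middle(const string& first, const string& middle,
                   const string& last, const string& new_middle)
{
    string whole = first + ' ' + middle + ' ' + last;
    whole.replace(first.length() + 1, middle.length(), new_middle);
    return whole;
}

// For example, swap_middle("Jason", "Alan", "James", "Allowicious")
// starts from "Jason Alan James" and replaces 4 characters from position 6.
```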

find()

You can also locate a sub-string of a string:

    string s = "...loan of the company car...";
    s.replace(s.find("the"), 3, "a");

Here, the find() function returns the location within s where the first sequence of 't' followed by 'h' and then by 'e' occurs. Then replace() swaps those 3 characters for a single 'a'.

Be careful, though, as find() searches without understanding and so the above might find "them" or "other" or "bathe" just as easily as it did "the".

Also, it would NOT find "The" since we asked for the sub-string to start with a lower-case 't' — not a capital!
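So what does find() hand back when there is nothing to find? It returns the special value string::npos, and passing that position along to replace() would make the program fail at run time. The sketch below checks for it first. The function name replace_first is hypothetical, and it uses an if statement to make the decision.

```cpp
#include <string>

using std::string;

// Hypothetical helper: replaces the first occurrence of target in text
// with replacement.  When target does not occur, find() returns
// string::npos and text is handed back unchanged.
string replace_first(string text, const string& target,
                     const string& replacement)
{
    const string::size_type pos = text.find(target);
    if (pos != string::npos)
    {
        text.replace(pos, target.length(), replacement);
    }
    return text;
}
```

Trying it on "another theory" with a target of "the" shows the searching-without-understanding problem nicely: the match lands inside "another", not on "theory".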

substr() With = vs. assign()

Finally, you can assign one string to be a sub-string of another string. This can be done in two ways. Explicitly:

    string s = "Happy Birthday George", t;

    t = s.substr(6, 8);

Here t would become equal to "Birthday". (Note that the 'B' in s is at position 6. You might think it should be position 7 with "Happy" taking slots 1-5 and the space (' ') position 6. But the computer likes to think of string positions as distances, or offsets, from the beginning of the string. Thus, "Happy" begins at position 0 and goes to position 4, the space is at position 5, and therefore the 'B' of "Birthday" is at position 6.)

Or we could do the assignment less explicitly (but not quite implicitly...):

    string s = "Happy Birthday George", t;

    t.assign(s, 6, 8);

Here we don't extract the sub-string and then assign (=) it as a second step. Instead, we give assign() the original string, a starting position, and a number of characters to take and then it overwrites the 'calling string' (the one to the left of the dot/.) with the requested sequence of characters out of the 'argument string' (the one inside the () of the function call; the one in the actual argument list, as we'll soon learn). This can also be more efficient than the first version, since substr() has to build a temporary string before the = can store it into t.
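A quick side-by-side sketch of the two approaches, so you can convince yourself they land on the same answer. The function names via_substr and via_assign are invented for this comparison.

```cpp
#include <string>

using std::string;

// Hypothetical helper: extract with substr(), then store with =.
string via_substr(const string& s, string::size_type pos,
                  string::size_type count)
{
    string t;
    t = s.substr(pos, count);
    return t;
}

// Hypothetical helper: let assign() do the extraction and the storing
// in one step.
string via_assign(const string& s, string::size_type pos,
                  string::size_type count)
{
    string t;
    t.assign(s, pos, count);
    return t;
}
```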

Flexibility is the Key!

By providing both facilities to accomplish the same basic task, however, the creators of the string class have given programmers using it more flexibility. They are now free to code in a fashion more natural to their way of thinking and not be forced to do the job a single particular way. Consider, for example, building the word "yellow" from the word "hello". We could use assign() and concatenation (+):

    string s = "hello", t;

    t.assign(s, 1, 4);
    t = 'y' + t + 'w';

Or we could use substr() and concatenation:

    string s = "hello", t;

    t = 'y' + s.substr(1, 4) + 'w';

Some might find the latter a more natural solution than the former. It could also be considered more clear and even more elegant.

Program Friendliness

One can use strings not only to accomplish simple text manipulation, but also to make the program more user friendly:

    string name;

    cout << "\nHello, what's your name (First)?  ";
    cin >> name;
    cout << "\nWelcome to the show, " << name << "...\n";

Further prompts can also be embellished with the user's name:

    char sign;
    double salary;

    // ...stuff happens...

    cout << "So, " << name << ", tell me about yourself...\n"
         << "What do you make each year?  ";
    cin >> sign >> salary;

(Note that the sign variable in this example is NOT a string type variable but a char type variable! When you don't need to save a whole word, don't waste the memory of storing a string! Instead, use a char, possibly followed by an ignore(numeric_limits<streamsize>::max(), '\n') when the char you've read is the last thing you wanted from that line of input, for instance...)

And so on...

But I Want To...

Yes, yes, yes. You want to be able to read in a string from the user that already contains spaces (and maybe tabs). Why should we read single words and then use concatenation (the + above, remember?) to put them together with spaces between them?!

Fine. You really want to read in a string that contains spacing straight from the user? Here you go:

    string s;
    getline(cin, s);

That innocuous yet confusing little line of code will allow you to read in an entire line of input entered by the user (from cin, you see) into the string s (the second 'input' given to the function).

(We'll see this kind of behavior later on, too, when we write our own functions. Sometimes the function's result isn't 'returned' for use in an assignment or a cout, say, but rather simply stored in the caller's variable. Of course, the caller must give the function permission to use its variable space like that...)
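To compare extraction (>>) with getline side by side, here is a sketch that reads from any input stream handed to it. The function names are hypothetical, and the functions take an istream parameter (an assumption made so that a std::istringstream full of canned text can play the part of cin).

```cpp
#include <istream>
#include <sstream>
#include <string>

using std::getline;
using std::istream;
using std::string;

// Hypothetical helper: extraction stops at the first space.
string read_first_word(istream& in)
{
    string word;
    in >> word;
    return word;
}

// Hypothetical helper: getline keeps going until the end of the line
// (the new-line itself is removed from the stream but not stored).
string read_whole_line(istream& in)
{
    string line;
    getline(in, line);
    return line;
}
```

Given the input "Mary Ann Smith" followed by a new-line, the first function should come back with just "Mary" while the second returns all three words, spaces included.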

With this amazing power, you can perform feats like reading into your program the user's entire name at one fell swoop:

    string whole_name;

    cout << "What's your (whole) name?  ";
    getline(cin, whole_name);

Reading the user's street address or town, state, and zip all together:

    string adrs_street, adrs_city_st_zip;

    cout << "What's your street address?  ";
    getline(cin, adrs_street);
    cout << "What's your city, state, and zip code?  ";
    getline(cin, adrs_city_st_zip);

Or even the stupendous reading of the user's favorite book's title in a single statement!

    string fav_book;

    cout << "What's the name of your favorite book?  ";
    getline(cin, fav_book);

But (there's always a 'but', isn't there? um, that's why those codes were 'not great...' instead of 'good to go', right? *smile*), our new friend getline is not all solemn and hard-working. Sometimes s/he's a bit of a prankster! Let's look at this scenario, for instance:

    string username, password, command;

    cout << "Login name:  ";
    cin >> username;
    cout << "Password:  ";
    cin >> password;

    // validate password for username...nothing in here
    // affects cin or its buffer...

    cout << "$ ";           // prompt for command
    getline(cin, command);

    // try to process command...
    // somehow get back up there to the 'prompt for command' code...

This alarmingly simple situation is just begging for trouble! The user has entered their username and password and been validated. But when they go to enter their first command, the system seems to ignore them!! On their screen, they see something like this:

    Login name:  bob
    Password:  rules

    Login confirmed.  Welcome 'bob'!

    $ $ _

Note that not only does this 'bob' fellow have a crappy password, but s/he gets two prompts before being allowed to enter a command! The computer snubbed him/her the first time around!!!

What's going on?!? Let's look at cin's view of events:

    Actions of cin                                                                    | Buffer contents
    ----------------------------------------------------------------------------------+-----------------
    [>>] looking for a string (username):                                             |
      see a 'b', okay so far: "b", move on                                            | bob\n
      see a 'o', okay so far: "bo", move on                                           | bob\n
      see a 'b', okay so far: "bob", move on                                          | bob\n
      see a '\n', stop: username = "bob"                                              | bob\n
    [>>] looking for a string (password):                                             |
      see a '\n', crap, move on                                                       | bob\n
      see nothing...empty our buffer...nudge cout to print its buffer...wait for user | bob\n
      see a 'r', okay so far: "r", move on                                            | rules\n
      see a 'u', okay so far: "ru", move on                                           | rules\n
      see a 'l', okay so far: "rul", move on                                          | rules\n
      see a 'e', okay so far: "rule", move on                                         | rules\n
      see a 's', okay so far: "rules", move on                                        | rules\n
      see a '\n', stop: password = "rules"                                            | rules\n
    [getline] looking for a line (command):                                           |
      see a '\n', okay, we're done, move on, stop: command = ""                       | rules\n
                                                                                      | rules\n

See how the new-line character from the extraction of the user's password stuck around and getline just sucked it up and said, "Yippee! I'm done!"? *shiver* That's what happens when you want to do things that are more advanced than you need just yet. *sigh* *shrug*
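If you'd like to reproduce the prank without typing at a prompt, here is a sketch that feeds canned input through the same two steps. The function name command_after_word is made up, and it takes an istream parameter (an assumption, so a std::istringstream can stand in for cin).

```cpp
#include <istream>
#include <sstream>
#include <string>

using std::getline;
using std::istream;
using std::string;

// Hypothetical helper: reads a word with >> and then a line with
// getline, with no clean-up in between, and returns whatever getline
// came up with.
string command_after_word(istream& in)
{
    string word, command;
    in >> word;             // stops AT the new-line, leaving it behind
    getline(in, command);   // finds that new-line right away
    return command;
}
```

Hand it the text "rules", a new-line, "ls", and another new-line, and the command that comes back is the empty string, not "ls".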

Well, let's see if any of cin's other helpers can get us out of this jam... We have a slim pick of helpers, actually. But I see that of the three (other than getline — which is really one of string's helpers, but s/he sort of has a little part-time thing with cin...anyway), extraction is the one that got us into trouble in the first place and it would just ignore the new-line this time, too. Hey! Wait a minute! The other two cin helpers are actually named ignore! That can't be a coincidence!? Let's try one:

    string username, password, command;

    cout << "Login name:  ";
    cin >> username;
    cout << "Password:  ";
    cin >> password;

    // validate password for username...nothing in here
    // affects cin or its buffer...

    cout << "$ ";           // prompt for command
    cin.ignore();
    getline(cin, command);

    // try to process command...
    // somehow get back up there to the 'prompt for command' code...

Now, the user sees a single prompt before getting to type their first command!

    Login name:  bob
    Password:  rules

    Login confirmed.  Welcome 'bob'!

    $ ls
  [snip -- lots of stuff in bob's directory -- snip]
    $ _

Why? Look back at the buffer actions:

    Actions of cin                                                                    | Buffer contents
    ----------------------------------------------------------------------------------+-----------------
    [ignore] tossing a single char:                                                   |
      tossed '\n', okay, stop                                                         | rules\n
                                                                                      | rules\n
    [getline] looking for a line (command):                                           |
      see nothing...empty our buffer...nudge cout to print its buffer...wait for user | rules\n
      see a 'l', okay so far: "l", move on                                            | ls\n
      see a 's', okay so far: "ls", move on                                           | ls\n
      see a '\n', okay, we're done, move on, stop: command = "ls"                     | ls\n
                                                                                      | ls\n

But if we want to be truly careful, maybe we should use the other ignore twin instead (just in case the programmer who used extraction left anything else behind!):

    string username, password, command;

    cout << "Login name:  ";
    cin >> username;
    cout << "Password:  ";
    cin >> password;

    // validate password for username...nothing in here
    // affects cin or its buffer...

    cout << "$ ";           // prompt for command
    cin.ignore(numeric_limits<streamsize>::max(), '\n');
    getline(cin, command);

    // try to process command...
    // somehow get back up there to the 'prompt for command' code...

This will work the same, but allow us the comfort of not having to worry that the username/password checking programmer left anything else waiting in the buffer for us!
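Here is a sketch of the difference between the two ignore twins when something extra really was left behind (a stray space after the password, say). The function names are hypothetical, and once again an istream parameter is assumed so that a std::istringstream can play the role of cin.

```cpp
#include <istream>
#include <limits>
#include <sstream>
#include <string>

using std::getline;
using std::istream;
using std::string;

// Hypothetical helper: tosses exactly one character before getline.
string command_after_single_ignore(istream& in)
{
    string word, command;
    in >> word;
    in.ignore();            // tosses one char, whatever it happens to be
    getline(in, command);
    return command;
}

// Hypothetical helper: tosses everything through the next new-line.
string command_after_cleanup(istream& in)
{
    string word, command;
    in >> word;
    in.ignore(std::numeric_limits<std::streamsize>::max(), '\n');
    getline(in, command);
    return command;
}
```

With the input "rules " (note the trailing space), a new-line, and then "ls", the single ignore only eats the space, so getline still trips over the new-line and returns an empty command. The second version clears the rest of the line and returns "ls".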


***END CONSTRUCTION***