Arrays VS. The vector class

NOTE:

These are a little sketchy...I'll fill in details 'soon'.

	C-style arrays	C++ vector class
Built-In vs. Library	Arrays are a built-in language feature. They operate essentially the same as arrays work in C -- our ancestral language. To use arrays, you need do nothing except code it and compile -- no special library needs to be #include'd.	There is a class data type that can store 'lists' of any component/base type. This data type is called, quite oddly, `vector`. It is not a built-in type (like `short`, `char`, or `double`), however. This data type is defined as a `class` in the *`vector`* standard library. (Recall that *`cout`* and *`cin`* are objects/variables of `class` types from over in the *`iostream`* library. We'll learn to create our own `class`es later this semester.
Declaration and Storage	In order to declare an array of information, you need to know how much information you'll be storing. This is indicated to the compiler with the following syntax: double arr[10]; Here I've declared an array of [up to] 10 double type quantities. The name of this collection is 'arr'. In memory, it might look something like this: arr +-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+ \| \| \| \| \| \| \| \| \| \| \| +-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+ 0 1 2 3 4 5 6 7 8 9 Note that there are 10 contiguous boxes of memory arranged under the name 'arr'. (The boxes are labeled with the numbers 0-9. These labels are called the offset, subscript, or index of that box. More on these later.) When we declare an array, though, we normally don't specify the maximum size with a literal value. This can complicate processing the array later in the program (as you'll see). Instead, we normally will declare a constant to go with the array and use that named value to indicate the size of the array: const short MAX_ARR = 10; double arr[MAX_ARR]; Now we have a name for the maximum number of elements that arr can hold -- MAX_ARR. Also, if we want to change the size, we'll just have to go to that constant declaration line and change the 10 to whatever else we want it to be: 20, 100, 5, ... The only restriction on the size of an array is that it must be a whole number [expression]. Initialization is allowed in a couple of styles: const short MAX_A = 20; short var[MAX_A], var2[MAX_A] = { 42, -42, 89, 904, -12 }, var3[MAX_A] = { 0 }; The first is just like with a normal variable of a built-in type -- the content of the MAX_A boxes specified by 'var' are left as whatever garbage bits were there before it was carved out. In the second, we are filling in the first 5 elements with specific values. Any positions not explicitly filled in are initialized with the 'zero' value for the base-type of the array (0 for short, 0.0 for double, '\0' for char, false for bool, etc.). This fact leads us to an easy way to clear out all the elements as in the third example. There we fill in just the first item explicitly with a 0 and the rest will then be automatically filled in with 0's. (Remember, the automatic value is always the 'zero' value -- not the last explicit value. If we'd filled the first position of 'var3' with 1, the remaining MAX_A-1 positions would still be 0's -- not more 1's!) But those aren't all: const bool flags[] = { false, false, true, false, true, true }; const short MAX_F = sizeof(flags)/sizeof(bool); Here we initialize a seemingly un-sized array and then afterward calculate the size by dividing the number of bytes the entire array takes up by the size of an individual element. This is recommended to only be done with constant arrays -- NOT array variables. Lastly, be careful for the following pitfall: const short MAX_A = 5; long a[MAX_A] = { 42045L, 99567L, -223410L, 90210L, 2112L, 993402L }; Here we've said the maximum is to be 5 long integers. Then we initialize with a list of 6 values! Most compilers will at least issue a warning for this situation, but never assume anything...	To declare a variable of the `vector` type, you must specify the type of items that are to be stored in the elements: vector<base_type> var_name; Just fill in base_type with the type of item you want to store: vector<char> menu_choices; vector<double> inventory_prices; vector<short> damage_dice_count, damage_dice_sides; Note that you can declare more than one vector of the same base type at the same time (like with the dice example). An advantage of vectors over the built-in types here is that when you declare a `vector` variable and don't initialize it, it will contain an empty list -- every time, guaranteed! (Recall that if you don't initialize a `double` or other built-in type variable, it simply retains the garbage bits that were in that memory location before. Although this often ends up being a 0, it is not guaranteed. Here, we are guaranteed that the uninitialized `vector` variable will be an empty list -- no elements, 0 size, etc.
Literals	There are no general array literals. (This may be coming in the next standards revision, but that isn't 'ratified' yet and won't be in compilers for a few years after that...)	See over there <---...
Use as Function Arguments	Passing array arguments to functions is a bit different from passing normal built-in typed arguments to functions. We use brackets (`[]`) on the array argument to indicate that this argument isn't a single value, but a series of contiguous boxes. Unlike the original declaration of the array, these brackets don't need to be filled in. If they are, the compiler just ignores the value, anyway. This is actually a great benefit to us because it will allow us to write a single function that can accept any length actual array argument through its formal array argument. If the base type is plain, then the argument is passed by 'reference to elements'. If the base type is made `const`, then it is passed by 'constant reference to elements'. In neither case does an `&` need to be placed to make this reference effect happen. (Also note that the reference is to the elements inside the array. You cannot use a function to change the number of elements an array can hold -- only the values of the elements in the array.) By design, array arguments specify no size in their brackets. To allow them to process the array effectively, however, the function needs to know how many elements are in use in this array. Toward this end, we'll usually pass the size of the array to the function as a separate argument. (See below about writing your own array processing functions.) Here's an example function head for a function taking in one array argument to be changed and two others to simply examine/use: void vect_add(double c[], const double a[], const double b[], short dim); Here we are adding vectors (like you'd use in calculus or physics) and so we have only one length argument since you cannot add vectors of different dimensionalities.	To pass a vector class object as a function argument, you'll typically want to use either a reference (to change the contents) or constant reference (to not change). Passing a vector class object by value would cause the compiler to make a copy of all that object's internal information -- the list of elements itself, length counter, etc. That can be a lot of data and can take a long time. Here is the 'vector' adding example from the array column done with C++ vector class arguments: void vect_add(vector<double> & c, const vector<double> & a, const vector<double> & b); Note how we don't need a separate argument for the dimensionality since the vector objects themselves know how long they are (see below for aggregate processing functions).
Provided Processing Facilities: Aggregate Processing	There are really no built-in operations you can do to arrays. If you want to do anything to an array, you'll have to write your own processing function using element-wise access techniques. This includes things as simple as input and output!	A few of the things you'd want to do to vectors are already built into the vector class. You can assign one vector to be a copy of another: vector<short> a, b; // ...fill in a somehow... b = a; // b becomes an exact copy of a A vector can place (push) new data at the end (back) of its list: double max_data, min_data, d; vector<double> data; // ...fill min_data and max_data somehow... for (short c = 0; c != 10; c++) // insert 10 randomly generated values { d = rand()%RAND_MAX/(RAND_MAX-1.0)(max_data-min_data)+min_data; data.push_back(d); } A vector object can tell you how many elements it contains, for instance: vector<bool> votes; // ...fill in votes vector somehow... cout << "\nThere are " << votes.size() << " votes in...\n"; A vector can tell the first or last items in its list: vector<short> bolt_inventory_by_bin; // ...fill bin-by-bin inventory counts somehow... cout << "\nThe first bin has " << bolt_inventory_by_bin.front() << " bolts in it.\n"; cout << "\nThe last bin has " << bolt_inventory_by_bin.back() << " bolts in it.\n"; These are some of the most common questions asked. Others involve slightly more complicated features and will be discussed after element-wise access techniques. (Note the '.' (dot) syntax as we had with both versions of cin.ignore(). This is in general how you call a function that is inside a `class`*: object/variable, dot, function -- with () and any arguments necessary.)
Provided Processing Facilities: Element-wise Processing	Given this array: const short MAX_ARR = 10; char arr[MAX_ARR]; If I wanted to access a particular one of the MAX_ARR items, I would use the subscript operator: cout << arr[4]; This would print the 5^th element in the list. 5^th?! But it clearly says '4' there! Well, C++ likes to imagine element position within the list as a distance from the beginning. The 1^st item in the list is, therefore, 0 distant from the beginning of the list -- itself. The 2^nd piece of data is 1 element distant from the beginning. Etc. Recall how we drew such an array above under storage. It had labels on each box that started at 0 and went up by 1 each to 1 less than the declared size of the array. Those labels represented the legal subscripts (or indices) for the array. It is also important to note that the square brackets (`[]`) used to declare the array variable were mere syntax whereas those used to pick a particular element of the array are an operator. Here's another example showing the use of the subscript operator to alter the contents of an array: const short MAX_A = 20; char c[MAX_A]; c[0] = 'H'; cin >> c[1]; Note: Access to an element past the end or before the beginning of an array is illegal. However, due to processing constraints, the compiler cannot check for these things at compile time and such errors won't be found until run time! Depending on how memory processing occurs on your system and how the program loaded into memory this execution, such a logic error may cause strange results or may crash the program. Be wary of such things. The crash is likely to be some sort of 'protection fault' or 'segmentation fault'.	Normal View size_type vs. iterator (size_type-->iterator translation) vector<base_type>::size_type p; // set p to something in [0..v.size() ) vector<base_type>::iterator i = v.begin()+p; // use i to access/change item at offset p in v Or more specifically: p = rand()%v.size(); i = v.begin()+p; // use i to access/change item at offset p in v Another Approach `[]` vs. at vector<long> n; n.push_back(5L); n.push_back(15L); n.push_back(25L); n.push_back(35L); n.push_back(45L); cout << n[5]; // may crash, should just return garbage vs. vector<long> n; n.push_back(5L); n.push_back(15L); n.push_back(25L); n.push_back(35L); n.push_back(45L); cout << n.at(5); // program dies with an exception Other vector class Functions You can use iterator-style arguments to overwrite, erase, or insert into a vector's current contents. Overwriting: vector<double> data1, data2; // ...fill data1 somehow... data2.assign(data1.begin(), data1.end()); // copies all of data1 into data2 data2.assign(data1.begin()+1, data1.end()-1); // copies all but ends of data1 into data2 // -- removing outliers? Erasing: vector<short> inventory; inventory.erase(inventory.begin()+3); // fourth inventory item removed // -- not carrying it anymore? // erase 3rd through 6th items from inventory inventory.erase(inventory.begin()+2, inventory.begin()+5); Inserting: vector<char> grades; grades.insert(grades.begin(), 'A'); // place an A on front of list grades.insert(grades.end(), 6, 'F'); // place 6 F's at end of list string s = "AABBBBCCCCDDDDDFFF"; // place characters from s into grades after the A we placed at front // earlier grades.insert(grades.begin()+1, s.begin(), s.end()); Note how we copied from a string into a vector of characters. We could copy between similarly compatible vectors as well: from a vector of short's to a vector of double's, etc. There are not size_type versions of these functions as there are in the string class. (Although the string class has iterator versions we don't normally use...shrug)
Programmer 'Hand' Processing	When you create an array, you declare a maximum size. That doesn't mean, however, that you should force your user to fill all of those elements! If you've made room for a user to have as many as 500 data, but someone wants to use your program with just 12 items, they should be able to without somehow faking the other 488 values! Toward this end, most arrays are declared with both the array and an integer variable to track the current number of used positions: const short MAX_DATA = 500; double data[MAX_DATA]; short cur_data; As data are entered, we count them -- making sure we don't exceed our maximum. Later, we use the counter to limit our processing loops so we don't wander into array elements with garbage in them: for (short c = 0; c != cur_data; c++) { // do something with data[c] } The `!=` test is sometimes (okay, often) coded as a `<` test due to limited knowledge of the 'cur_data' variable's value. (If some other programmer told us that there were -10 elements and we used a `!=` test, we'd count up to 32767, wrap around to -32768, and continue up until we finally reached -10! That's 65526 iterations! ...well, okay, we'd actually die long before that. Our code would die when c reached far enough past the maximum declared size to cause a nice 'segmentation fault' for us.) Let's look at a few example processing loops [reasonably] properly encapsulated into functions... Output Just fill in base_type below with the type of data in your array...(any formatting options you want for the elements should be set up before you call this function...precision, etc.). void print(const base_type arr[], long max, long per_line = 7, const char pre[] = "", const char betw[] = ", ", const char post[] = ""); // ...stuff (like calls)... void print(const base_type arr[], long max, long per_line, const char pre[], const char betw[], const char post[]) { cout << pre; if (max > 0) { for (long c = 0; c != max-1; c++) { cout << arr[c] << betw << ((c+1)%per_line==0 ? "\n" : ""); } cout << arr[max-1]; } cout << post; return; } Input Just fill in the base_type below with the type of information in your array... long read(base_type arr[], long max, bool clear_crap = true); long read(base_type arr[], long max, bool clear_crap) { base_type value; long count = 0; if (max > 0) { cin >> value; while (cin.good() && count != max) { arr[count] = value; count++; cin >> value; } if (clear_crap && !cin.good()) { cin.clear(); cin.ignore(numeric_limits<streamsize>::max(), '\n'); } } return count; } Many programmers would actually combine the array-centric parts of the [primed] entry loop as: arr[count++] = value; Using the rules for evaluation of post-increment to combine two statements into one. I've no preference one way or another, but you should be prepared to see such things from time to time... Other Other processing functions would be similar to the output but most likely without so many fancy optional arguments. Here's one to find the 'sum' of the elements of an array: base_type total(const base_type arr[], long max) { base_type sum = base_type(); for (long c = 0; c < max; c++) { sum += arr[c]; } return sum; } You could just fill in the base_type with your type of data (short, long, double, string...). And that includes the local variable initialization: `base_type()`. Doing this is an explicit call to the 'default constructor' of the type. We know that classes like string have default constructors, but what if you are adding short integers?! That's fine. You can call the short default constructor yourself...it's just that the compiler won't call it for you when you fail to initialize a short integer variable. (sniff sigh) Also beware that if you add up enough short integers -- no matter how small their values, the total may be larger than a short integer can hold. You might consider changing the return type/accumulator type to the next larger numeric type for such situations.	Since the vector class tracks how many items have been placed into its list, we can use iterators from begin() (inclusive) to end() (exclusive) to process those items: vector<string> v; vector<string>::iterator i; // iterator to alter, const_iterator to view for (i = v.begin(); i != v.end(); ++i) // v is [begin, end), so def loop { // do something with i } We could use size_type positions from 0 to size()-1, but iterators prove more natural for mass processing. We can always map from known size_type positions to their corresponding iterator-style positions by simply adding the size_type value to begin(). (If, for some reason, we need to go backward -- from iterators to size_types -- we can subtract begin() from our iterator to find its size_type relative position.) Following are a few examples of processing functions for vectors: Output As on the left, just replace base_type* with the type of data you are trying to process: void print(vector<base_type>::const_iterator beg, vector<base_type>::const_iterator end, vector<base_type>::size_type per_line = 7, const string & pre = "", const string & betw = ", ", const string & post = ""); inline void print(const vector<base_type> & vec, vector<base_type>::size_type per_line = 7, const string & pre = "", const string & betw = ", ", const string & post = "") { print(vec.begin(), vec.end(), per_line, pre, betw, post); return; } // ...stuff (like calls)... void print(vector<base_type>::const_iterator beg, vector<base_type>::const_iterator end, vector<base_type>::size_type per_line, const string & pre, const string & betw, const string & post) { cout << pre; if (end-beg > 0) { for (vector<base_type>::const_iterator i = beg; i != end-1; ++i) { cout << i << betw << ((i-beg+1)%per_line==0 ? "\n" : ""); } cout << (end-1); } cout << post; return; } Notice how there are two versions: a full version using iterator arguments and an inline version that takes a vector argument and then passes this through to the iterator version. This allows our caller more flexibility in how they use our function. They can use us to print their entire vector or just a part of it (by passing iterators to the sub-range desired). Also, we've saved ourselves much work in maintenance by using just one version for the hard coding and passing the short inline version through to the longer one. Input vector class objects will grow dynamically to hold as many items as we wish to place into them -- hence us not needing to specify a maximum size at declaration time. If, however, you want to be careful (or have a problem-specific need to enforce a maximum), we can do that, too. Here is a fairly general input function (it works for all numeric types -- change the stop condition for char or string contents): void read(vector<numeric_base_type> & vec) { numeric_base_type value; cin >> value; while (cin.good()) { vec.push_back(value); cin >> value; } return; } Other Situations Note how the string class supplies a replace function. We don't have such a beast for the vector class. Kinda odd, really...shrug Let's write one ourselves: void replace(vector<base_type> & vec, vector<base_type>::size_type from, vector<base_type>::size_type for, vector<base_type>::const_iterator with_beg, vector<base_type>::const_iterator with_end) { if (from+for >= vec.size()) { vec.erase(vec.begin()+from, vec.end()); } if (from >= vec.size()) { vec.insert(vec.end(), with_beg, with_end); } else { vec.insert(vec.begin()+from+for, with_beg, with_end); vec.erase(vec.begin()+from, vec.begin()+from+for); } return; } This could, of course, be overloaded with different but compatible base_types for the 'with' iterators for replacing some double's in one vector with long's from another vector or char's from one vector with char's from a string...
Use as 'Array' Base Types	An array of an array base type is by its nature 2D. Note that the base-type of the outer array is itself an array: const short MAX_LINES = 66, MAX_LINE_LEN = 80; char page[MAX_LINES][MAX_LINE_LEN]; Here we make a 2D array of characters. The first dimension is the number of lines. The second dimension is the number of characters in a line. Note that one can 'subscript' the page array either once or twice. Twice would give a single character since offsets into both dimensions would have been specified: page[4][3] // specifies the 5th line's 4th character Specifying only a single 'index' would pick a particular line, but not a particular character within that line. Therefore, the result would be an entire line -- which is itself a 1D array of characters: page[4] // specifies the entire 5th line This subscripting result could be passed to any function/operation that could handle an array of characters (one we wrote or nicer ones if we are treating the lines as (C)strings). The type of `page[4]` is going to be `char[MAX_LINE_LEN]` and so can be passed to any `char[]` or `const char[]` formal argument for a function. We can hide this 2D aspect by using a typedef (type definition). Such typedef's can be placed in a particular function, global to a program file, or in a library for anyone to #include and use. Please note that the typedef for an array also needs a constant to go along with it. This should be placed immediately before the typedef where-ever that may end up (see above). The GridRow typedef in this example is global since it is intended to be used on function arguments that need to process individual rows of the grid. Note how the initial grid array appears 1D, but is in reality 2D because its base type is a 1D array type: const short ROW_LEN = 3; typedef char GridRow[ROW_LEN]; // ... some function ... { const short COL_LEN = 3; GridRow grid[COL_LEN] = { { '1', '2', '3' }, { '4', '5', '6' }, { '7', '8', '9' } }; // ... stuff ... } Notice how the initializer helps enforce our idea of the 2D structure as being an array of arrays. (First element is { '1', '2', '3' } which is an array initializer. Similarly for second and third element initial values.)	An array with a vector class base type is not actually 2D, but can be treated in a 2D fashion (see element-wise access above). // each student has a vector of all scores they've earned... vector<double> scores[MAX_STUDENTS]; One can also make a vector with a vector class base type: vector<vector<double> > erosion_map; Notice the space between the greater-than's! This is important or else the compiler will get confused and think you are trying to extract something during a declaration -- not possible. (The next version of the standard should relax this, but current compilers most likely don't include this relaxation.) One can make a vector with an array base type, but it is quite tricky. We may look into this later...
Member Variables of a class	Access is per element. Mutation also with data validation dependent on your application. Need a way to report current number of elements and add new elements. const short MAX_A = xx; class ArrMemb { double arr[MAX_A]; short curr; public: double get_arr(short which) const { return (which >= 0 && which < curr) ? arr[which] : -1.0; } short get_arr_len(void) const { return curr; } bool set_arr(short which, double val) { bool okay = which >= 0 && which < curr; if (okay) { arr[which] = val; } else { okay = add_arr(val); } return okay; } bool add_arr(double val) { bool okay = curr < MAX_A; if (okay) { arr[++curr] = val; } return okay; } };	Works very much like the array member except it can be initialized in the constructors' member initialization lists. class VecClsMemb { vector<double> vec; public: typedef vector<double>::size_type vec_sz; double get_vec(vec_sz which) const { return (which < vec.size()) ? vec[which] : -1.0; } vec_sz get_vec_len(void) const { return vec.size(); } bool set_vec(vec_sz which, double val) { bool okay = which < vec.size(); if (okay) { vec[which] = val; } else { okay = add_vec(val); } return okay; } bool add_vec(double val) { bool okay = vec.size() < vec.max_size(); if (okay) { vec.push_back(val); } return okay; } };