Barbara Ericson, Allen B. Downey, Jason L. Wright (Editor)
Section 15.7 The Set data structure
A data structure is a container for grouping a collection of data into a single object. We have seen some examples already, including strings, which are collections of characters, and vectors, which are collections of elements of any type.
Both strings and vectors have an ordering; every element has an index we can use to identify it. None of the data structures we have seen so far have the properties of uniqueness or arbitrary size.
To achieve uniqueness, we have to write an add function that searches the set to see whether the new element already exists. To make the set expand as elements are added, we can take advantage of the resize function on vectors.
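The growth step can be sketched in isolation. This is a minimal illustration that assumes the standard library's std::vector, whose size() plays the role of the vector length discussed in the text. The helper name growIfFull is hypothetical and is not part of the Set class.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: make room for one more element when every slot is used.
// numElements is the number of slots currently in use.
void growIfFull(std::vector<std::string>& elements, int numElements) {
  assert(numElements >= 0 &&
         numElements <= static_cast<int>(elements.size()));
  if (numElements == static_cast<int>(elements.size())) {
    // Doubling a length of zero would leave it at zero, so grow to at least 1.
    std::size_t newSize = elements.empty() ? 1 : elements.size() * 2;
    // resize keeps the existing elements and appends empty strings.
    elements.resize(newSize);
  }
}
```

Doubling, rather than growing by one slot at a time, keeps the number of resize calls small as the set gets large.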
The instance variables are a vector of strings and an integer that keeps track of how many elements there are in the set. Keep in mind that the number of elements in the set, numElements, is not the same thing as the size of the vector. Usually it will be smaller.
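A class along these lines might be declared as in the sketch below. It assumes std::vector and std::string from the standard library. The constructor signature and the getCapacity function are assumptions made for illustration; getCapacity exists only to show that the vector size and numElements are different quantities.

```cpp
#include <cassert>
#include <string>
#include <vector>

class Set {
private:
  std::vector<std::string> elements;  // storage; may contain unused slots
  int numElements;                    // number of slots actually in use

public:
  // Assumed constructor: allocate n slots, none of them in use yet.
  explicit Set(int n) : elements(n > 0 ? n : 0), numElements(0) {
    // The count of used slots never exceeds the size of the vector.
    assert(numElements <= static_cast<int>(elements.size()));
  }

  // Read-only access to the element count.
  int getNumElements() const { return numElements; }

  // Hypothetical accessor: the size of the underlying vector.
  int getCapacity() const { return static_cast<int>(elements.size()); }
};
```

For a set created with Set s(4), getNumElements() is 0 while getCapacity() is 4.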
getNumElements and getElement are accessor functions for the instance variables, which are private. numElements is a read-only variable, so we provide a get function but not a set function.
Why do we have to prevent client programs from changing numElements? What are the invariants for this type, and how could a client program break an invariant? As we look at the rest of the Set member functions, see if you can convince yourself that they all maintain the invariants.
When we use the [] operator to access the vector, it checks to make sure the index is greater than or equal to zero and less than the length of the vector. To access the elements of a set, though, we need to check a stronger condition. The index has to be less than the number of elements, which might be smaller than the length of the vector.
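The stronger check might look like the sketch below. It is written as a free function named checkedGet, a hypothetical stand-in for the getElement member function, so that it can be read on its own. It assumes std::vector, whose [] operator does no range checking of its own, and it reports a bad index by throwing std::out_of_range, which is one possible choice among several.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for Set::getElement.
// elements is the storage vector; numElements is how many slots are in use.
std::string checkedGet(const std::vector<std::string>& elements,
                       int numElements, int i) {
  // Precondition: the used count never exceeds the vector size.
  assert(numElements <= static_cast<int>(elements.size()));
  // Compare against numElements, not elements.size(): the slots at and
  // beyond numElements exist in the vector but are not part of the set.
  if (i < 0 || i >= numElements) {
    throw std::out_of_range("Set index out of range.");
  }
  return elements[i];
}
```

With a vector of four slots of which two are in use, index 1 is valid, but index 2 is rejected even though the vector itself has a slot there.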
int Set::find(const string& s) const {
  for (int i = 0; i < numElements; i++) {
    if (elements[i] == s) return i;
  }
  return -1;
}
So that leaves us with add. Often the return type for something like add would be void, but in this case it might be useful to make it return the index of the element.
int Set::add(const string& s) {
  // if the element is already in the set, return its index
  int index = find(s);
  if (index != -1) return index;

  // if the vector is full, double its size
  if (numElements == elements.length()) {
    elements.resize(elements.length() * 2);
  }

  // add the new element and return its index
  index = numElements;
  elements[index] = s;
  numElements++;
  return index;
}
The tricky thing here is that numElements is used in two ways. It is the number of elements in the set, of course, but it is also the index of the next element to be added.
It may take a minute to convince yourself that this works, but consider the two extremes: when the number of elements is zero, the index of the next element is 0. When the number of elements is equal to the length of the vector, the vector is full, and we have to allocate more space (using resize) before we can add the new element.
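To see both roles of numElements at work, here is a compact, self-contained sketch of the pieces discussed in this section. It assumes std::vector and std::string, so size() is used where the text speaks of the vector length. The constructor and the guard against a zero-length vector are additions made for this illustration.

```cpp
#include <cassert>
#include <string>
#include <vector>

class Set {
private:
  std::vector<std::string> elements;
  int numElements;

public:
  // Assumed constructor: allocate n slots, none in use.
  explicit Set(int n) : elements(n > 0 ? n : 0), numElements(0) {}

  int getNumElements() const { return numElements; }

  // Return the index of s, or -1 if s is not in the set.
  int find(const std::string& s) const {
    for (int i = 0; i < numElements; i++) {
      if (elements[i] == s) return i;
    }
    return -1;
  }

  // Add s if it is not already present; return its index either way.
  int add(const std::string& s) {
    int index = find(s);
    if (index != -1) return index;

    // numElements equal to the vector size means every slot is in use.
    if (numElements == static_cast<int>(elements.size())) {
      // Guard added for this sketch: doubling zero would stay zero.
      elements.resize(elements.empty() ? 1 : elements.size() * 2);
    }

    // numElements is also the index of the first unused slot.
    index = numElements;
    elements[index] = s;
    numElements++;
    assert(numElements <= static_cast<int>(elements.size()));
    return index;
  }
};
```

With a set created as Set s(2), adding "apple", "pear", "apple", and "fig" in that order returns 0, 1, 0, and 2. The third call finds the duplicate and changes nothing, and the fourth finds the vector full and doubles it before storing the new element.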