Subsection 25.10.1 Time Complexity of Array-Based List Operations
Now that we have seen how to analyze the time complexity of various code constructs, we can apply that knowledge to figure out the time complexity of common operations on array-based lists. As we do so, remember that accessing a single element in an array takes constant time, \(O(1)\text{,}\) because we can compute the memory address of any element directly from its index.
Most operations on an array-based list take either constant time, \(O(1)\text{,}\) or linear time, \(O(n)\text{.}\) Any operation that involves shifting elements in the array takes linear time, because in the worst case we might have to move all \(n\) elements. Operations that only access or modify a single element, or add an element to the end of the list (when there is enough capacity), take constant time.
Any algorithm that does not involve a loop or function call takes constant time, \(O(1)\text{.}\) This includes algorithms like:
template &lt;typename T&gt;
T ArrayList&lt;T&gt;::get(int location) const {
    return m_arr[location]; // direct index computation, O(1)
}

template &lt;typename T&gt;
void ArrayList&lt;T&gt;::set(int location, const T& newValue) {
    m_arr[location] = newValue; // O(1)
}

template &lt;typename T&gt;
void ArrayList&lt;T&gt;::removeEnd() {
    m_size--; // no shifting required, O(1)
}
The algorithms that involve loops generally take linear time, \(O(n)\text{.}\) Even though the loops in something like insertAt and removeAt do not always run \(n\) times, in the worst case they might have to move all \(n\) elements (if we are inserting or removing at the start of the list). Therefore, we say that these operations take linear time, \(O(n)\text{.}\)
Even an average case for these operations involves moving some percentage of the elements. If we assume that on average items are inserted at the middle of the list, then each insertion only has to move half of the elements, or \(n/2\text{.}\) But half of \(n\) is still \(O(n)\text{.}\)
template &lt;typename T&gt;
void ArrayList&lt;T&gt;::insertAt(int location, const T& insertItem) {
    // Shift elements up to make room for the new item
    for (int i = m_size; i > location; --i) { // runs up to n times, O(n)
        m_arr[i] = m_arr[i - 1];              // O(1)
    }
    m_arr[location] = insertItem;             // O(1)
    ++m_size;                                 // O(1)
}
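removeAt follows the same pattern in reverse. A sketch of what it might look like, assuming the same m_arr and m_size members used above:

template &lt;typename T&gt;
void ArrayList&lt;T&gt;::removeAt(int location) {
    // Shift elements down to close the gap left by the removed item
    for (int i = location; i < m_size - 1; ++i) { // runs up to n times, O(n)
        m_arr[i] = m_arr[i + 1];                  // O(1)
    }
    --m_size;                                     // O(1)
}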
Subsection 25.10.2 Inserting at the End and Amortized Time
There is one tricky case for analyzing the efficiency of array-based lists: inserting at the end.
Most of the time, we have space already allocated in the array. So we can place the value at the next available position and then increment the size. Both of these operations take constant time, \(O(1)\text{.}\)
However, if the array is full, we have to resize it first by calling a function like grow. Resizing involves allocating a new, larger array and then copying all the existing elements to the new array. This copying step takes linear time, \(O(n)\text{,}\) because we have to copy all \(n\) existing elements.
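A sketch of what these two functions might look like. Here only grow is named above; the insertEnd and m_capacity names are assumptions for illustration, with m_capacity tracking how much space is currently allocated:

template &lt;typename T&gt;
void ArrayList&lt;T&gt;::grow() {
    // Double the capacity; assumes m_capacity is at least 1
    T* newArr = new T[m_capacity * 2];
    for (int i = 0; i < m_size; ++i) { // copy all n existing elements, O(n)
        newArr[i] = m_arr[i];
    }
    delete[] m_arr;
    m_arr = newArr;
    m_capacity *= 2;
}

template &lt;typename T&gt;
void ArrayList&lt;T&gt;::insertEnd(const T& insertItem) {
    if (m_size == m_capacity) {
        grow();                  // O(n), but only happens when full
    }
    m_arr[m_size] = insertItem;  // O(1)
    ++m_size;                    // O(1)
}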
So inserting at the end sometimes takes constant time, \(O(1)\text{,}\) and sometimes takes linear time, \(O(n)\text{.}\) To analyze this situation, we use a concept called amortized time. The idea is to look at the average time taken over a sequence of operations.
When we double the size of the array each time we resize, we can show that the average time per insertion is still constant time, \(O(1)\text{.}\) To see why, consider a sequence of insertions starting with an empty list and an initial capacity of 1. We will count each insertion as taking 1 unit of time. When we have to resize, we will count each copy operation as taking 1 unit of time as well.
- The first insertion takes 1 unit of time to insert.
- The second insertion requires a resize. First we copy the 1 existing element into an array of size 2, which takes 1 unit of time, and then we insert the new element, which takes another unit of time.
- The third insertion also requires a resize. We copy the 2 existing elements into an array of size 4, which takes 2 units of time, and then insert the new element, which takes 1 unit of time.
- The fourth insertion takes 1 unit of time to insert.
- The fifth insertion requires a resize. We copy the 4 existing elements into an array of size 8, which takes 4 units of time, and then insert the new element, which takes 1 unit of time.
- And so on...
In table form, it looks like:
| Insertion | Copy Work | Insert Work | Step Total | Total Work So Far |
|-----------|-----------|-------------|------------|-------------------|
| 1 | 0 | 1 | 1 | 1 |
| 2 | 1 | 1 | 2 | 3 |
| 3 | 2 | 1 | 3 | 6 |
| 4 | 0 | 1 | 1 | 7 |
| 5 | 4 | 1 | 5 | 12 |
| 6 | 0 | 1 | 1 | 13 |
| 7 | 0 | 1 | 1 | 14 |
| 8 | 0 | 1 | 1 | 15 |
| 9 | 8 | 1 | 9 | 24 |
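If you want to check these numbers, a short standalone program (a sketch, separate from the ArrayList class) can reproduce the table by charging 1 unit per copy and 1 unit per insertion:

#include &lt;iostream&gt;

int main() {
    int capacity = 1;  // initial capacity of 1, as above
    int size = 0;
    int totalWork = 0;
    for (int insertion = 1; insertion <= 9; ++insertion) {
        int copyWork = 0;
        if (size == capacity) { // full: double the capacity and copy
            copyWork = size;
            capacity *= 2;
        }
        int insertWork = 1;
        ++size;
        totalWork += copyWork + insertWork;
        std::cout << insertion << " | " << copyWork << " | " << insertWork
                  << " | " << (copyWork + insertWork) << " | " << totalWork << "\n";
    }
    return 0;
}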
To figure out the worst case, we can focus on the insertions where an array resize occurs. Since we grow by doubling from 1 to 2 to 4 to 8 to 16 and so on, the most expensive insertions happen one past each power of 2: insertions 2, 3, 5, 9, 17, ... are the points at which we have just spent the most time. The insertions that come after those are the ones that get to take advantage of the extra space that is then available.
Focusing on just those insertions, and adding a few more rows to look for a pattern, gives us:
| Insertion | Copy Work | Insert Work | Step Total | Total Work So Far |
|-----------|-----------|-------------|------------|-------------------|
| ... | ... | ... | ... | ... |
| 3 | 2 | 1 | 3 | 6 |
| ... | ... | ... | ... | ... |
| 5 | 4 | 1 | 5 | 12 |
| ... | ... | ... | ... | ... |
| 9 | 8 | 1 | 9 | 24 |
| ... | ... | ... | ... | ... |
| 17 | 16 | 1 | 17 | 48 |
| ... | ... | ... | ... | ... |
| 33 | 32 | 1 | 33 | 96 |
Carefully compare the insertion number to the total work so far at that step. Notice that the formula
\begin{equation*}
\text{totalWork} = 3 \cdot \text{insertionNumber} - 3
\end{equation*}
could be used to predict the total work value.
In Big-O terms, the work done is \(3 \cdot n - 3\) where \(n\) is the number of insertions done. The constant factors can be ignored, so the time complexity is \(O(n)\text{.}\)
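This matches what a direct count predicts. Assuming for simplicity that \(n\) is one past a power of 2, each of the \(n\) insertions costs 1 unit, and the resizes together copy \(1 + 2 + 4 + \dots + (n-1)\) elements, a geometric series that sums to \(2(n-1) - 1\text{:}\)
\begin{equation*}
\text{totalWork} = n + \bigl(1 + 2 + 4 + \dots + (n-1)\bigr) = n + \bigl(2(n-1) - 1\bigr) = 3n - 3
\end{equation*}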
If adding \(n\) elements to the list requires \(O(n)\) work, then the average work per insertion is \(O(1)\text{.}\)
This analysis depends on our array size doubling with each grow. As long as each grow multiplies the size by a constant factor greater than 1, the amortized time for insertions at the end remains \(O(1)\text{.}\) So we could triple the size each time and still have the same amortized time complexity.
However, if we only added a constant amount to the size each time we grow, say space for 10 more items, there would be too many grow operations. There would be no way to amortize the cost and claim that there is an average \(O(1)\) time per insertion.
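A rough count shows why. If we grow by a fixed amount \(c\) (here \(c = 10\)), inserting \(n\) items requires about \(n/c\) resizes, and the \(k\)th resize copies about \(kc\) elements:
\begin{equation*}
\sum_{k=1}^{n/c} kc = c \cdot \frac{(n/c)(n/c + 1)}{2} \approx \frac{n^2}{2c} = O(n^2)
\end{equation*}
Dividing \(O(n^2)\) total work among \(n\) insertions gives \(O(n)\) average work per insertion, not \(O(1)\text{.}\)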
In practice, it is good to avoid frequent resizing. If you know you will be adding 10,000 items to an array-based list, it is better to request that it grow once to that size, rather than letting it grow multiple times as items are added.
std::vector provides a reserve function for this purpose. It allows you to request that the vector allocate enough space for a certain number of items ahead of time.
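For example, a small sketch that reserves space before filling a vector, so that no push_back triggers a resize:

#include &lt;vector&gt;

int main() {
    std::vector&lt;int&gt; values;
    values.reserve(10000);            // one allocation up front
    for (int i = 0; i < 10000; ++i) {
        values.push_back(i);          // capacity is sufficient, no resizing
    }
    return 0;
}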