Subsection 25.10.1 Time Complexity of Array-Based List Operations
Now that we have seen how to analyze the time complexity of various code constructs, we can apply that knowledge to figure out the time complexity of common operations on array-based lists. As we do so, remember that accessing a single element in an array takes constant time, \(O(1)\text{,}\) because we can compute the memory address of any element directly from its index.
Most operations on an array-based list take either constant time, \(O(1)\text{,}\) or linear time, \(O(n)\text{.}\) Any operation that involves shifting elements in the array takes linear time, because in the worst case we might have to move all \(n\) elements. Operations that only access or modify a single element, or add an element to the end of the list (when there is enough capacity), take constant time.
Any algorithm that does not involve a loop or function call takes constant time, \(O(1)\text{.}\) This includes algorithms like:
template &lt;typename T&gt;
T ArrayList&lt;T&gt;::get(int location) const {
    return m_arr[location]; // direct index computation, O(1)
}

template &lt;typename T&gt;
void ArrayList&lt;T&gt;::set(int location, const T& newValue) {
    m_arr[location] = newValue; // O(1)
}

template &lt;typename T&gt;
void ArrayList&lt;T&gt;::removeEnd() {
    m_size--; // no shifting required, O(1)
}
The algorithms that involve loops generally take linear time, \(O(n)\text{.}\) Even though the loops in something like insertAt and removeAt do not always run \(n\) times, in the worst case they might have to move all \(n\) elements (if we are inserting or removing at the start of the list). Therefore, we say that these operations take linear time, \(O(n)\text{.}\)
Even an average case for these operations involves moving some percentage of the elements. If we assume that on average items are inserted at the middle of the list, then each insertion only has to move half of the elements, or \(n/2\text{.}\) But half of \(n\) is still \(O(n)\text{.}\)
template &lt;typename T&gt;
void ArrayList&lt;T&gt;::insertAt(int location, const T& insertItem) {
    // Shift elements up to make room for the new item
    for (int i = m_size; i > location; --i) { // runs up to n times, O(n)
        m_arr[i] = m_arr[i - 1];              // O(1)
    }
    m_arr[location] = insertItem;             // O(1)
    ++m_size;                                 // O(1)
}
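removeAt follows the same pattern in reverse. A sketch of what it might look like, assuming the same m_arr and m_size members used above:

template &lt;typename T&gt;
void ArrayList&lt;T&gt;::removeAt(int location) {
    // Shift elements down to close the gap left by the removed item
    for (int i = location; i < m_size - 1; ++i) { // runs up to n times, O(n)
        m_arr[i] = m_arr[i + 1];                  // O(1)
    }
    --m_size;                                     // O(1)
}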
Subsection 25.10.2 Inserting at the End and Amortized Time
There is one tricky case for analyzing the efficiency of array-based lists: inserting at the end.
Most of the time, we have space already allocated in the array. So we can place the value at the next available position and then increment the size. Both of these operations take constant time, \(O(1)\text{.}\)
However, if the array is full, we have to resize it first by calling a function like grow. Resizing involves allocating a new, larger array and then copying all the existing elements to the new array. This copying step takes linear time, \(O(n)\text{,}\) because we have to copy all \(n\) existing elements.
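A sketch of what these two functions might look like. Here only grow is named above; the insertEnd and m_capacity names are assumptions for illustration, with m_capacity tracking how much space is currently allocated:

template &lt;typename T&gt;
void ArrayList&lt;T&gt;::grow() {
    // Double the capacity; assumes m_capacity is at least 1
    T* newArr = new T[m_capacity * 2];
    for (int i = 0; i < m_size; ++i) { // copy all n existing elements, O(n)
        newArr[i] = m_arr[i];
    }
    delete[] m_arr;
    m_arr = newArr;
    m_capacity *= 2;
}

template &lt;typename T&gt;
void ArrayList&lt;T&gt;::insertEnd(const T& insertItem) {
    if (m_size == m_capacity) {
        grow();                  // O(n), but only happens when full
    }
    m_arr[m_size] = insertItem;  // O(1)
    ++m_size;                    // O(1)
}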
So inserting at the end sometimes takes constant time, \(O(1)\text{,}\) and sometimes takes linear time, \(O(n)\text{.}\) To analyze this situation, we use a concept called amortized time. The idea is to look at the average time taken over a sequence of operations.
When we double the size of the array each time we resize, we can show that the average time per insertion is still constant time, \(O(1)\text{.}\) To see why, consider a sequence of insertions starting with an empty list and an initial capacity of 1. We will count each insertion as taking 1 unit of time. When we have to resize, we will count each copy operation as taking 1 unit of time as well.
- The first insertion takes 1 unit of time to insert.
- The second insertion requires a resize. First we copy the 1 existing element into an array of size 2, which takes 1 unit of time, and then we insert the new element, which takes another unit of time.
- The third insertion also requires a resize. We copy the 2 existing elements into an array of size 4, which takes 2 units of time, and then insert the new element, which takes 1 unit of time.
- The fourth insertion takes 1 unit of time to insert.
- The fifth insertion requires a resize. We copy the 4 existing elements into an array of size 8, which takes 4 units of time, and then insert the new element, which takes 1 unit of time.
- And so on...
In table form, it looks like:
| Insertion | Copy Work | Insert Work | Step Total | Total Work So Far |
|-----------|-----------|-------------|------------|-------------------|
| 1 | 0 | 1 | 1 | 1 |
| 2 | 1 | 1 | 2 | 3 |
| 3 | 2 | 1 | 3 | 6 |
| 4 | 0 | 1 | 1 | 7 |
| 5 | 4 | 1 | 5 | 12 |
| 6 | 0 | 1 | 1 | 13 |
| 7 | 0 | 1 | 1 | 14 |
| 8 | 0 | 1 | 1 | 15 |
| 9 | 8 | 1 | 9 | 24 |
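If you want to check these numbers, a short standalone program (a sketch, separate from the ArrayList class) can reproduce the table by charging 1 unit per copy and 1 unit per insertion:

#include &lt;iostream&gt;

int main() {
    int capacity = 1;  // initial capacity of 1, as above
    int size = 0;
    int totalWork = 0;
    for (int insertion = 1; insertion <= 9; ++insertion) {
        int copyWork = 0;
        if (size == capacity) { // full: double the capacity and copy
            copyWork = size;
            capacity *= 2;
        }
        int insertWork = 1;
        ++size;
        totalWork += copyWork + insertWork;
        std::cout << insertion << " | " << copyWork << " | " << insertWork
                  << " | " << (copyWork + insertWork) << " | " << totalWork << "\n";
    }
    return 0;
}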
To figure out the worst case, we can focus on the insertions where an array resize occurs. Since we grow by doubling from 1 to 2 to 4 to 8 to 16 and so on, the most expensive insertions happen one past each power of 2: insertions 2, 3, 5, 9, 17, ... are the points at which we have just spent the most time. The insertions that come after those are the ones that get to take advantage of the extra space that is then available.
Focusing on just those insertions, and adding a few more rows to look for a pattern, gives us:
| Insertion | Copy Work | Insert Work | Step Total | Total Work So Far |
|-----------|-----------|-------------|------------|-------------------|
| ... | ... | ... | ... | ... |
| 3 | 2 | 1 | 3 | 6 |
| ... | ... | ... | ... | ... |
| 5 | 4 | 1 | 5 | 12 |
| ... | ... | ... | ... | ... |
| 9 | 8 | 1 | 9 | 24 |
| ... | ... | ... | ... | ... |
| 17 | 16 | 1 | 17 | 48 |
| ... | ... | ... | ... | ... |
| 33 | 32 | 1 | 33 | 96 |
Carefully compare the insertion number to the total work so far at that step. Notice that the formula
\begin{equation*}
\text{totalWork} = 3 \cdot \text{insertionNumber} - 3
\end{equation*}
could be used to predict the total work value.
In Big-O terms, the work done is \(3 \cdot n - 3\) where \(n\) is the number of insertions done. The constant factors can be ignored, so the time complexity is \(O(n)\text{.}\)
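This matches what a direct count predicts. Assuming for simplicity that \(n\) is one past a power of 2, each of the \(n\) insertions costs 1 unit, and the resizes together copy \(1 + 2 + 4 + \dots + (n-1)\) elements, a geometric series that sums to \(2(n-1) - 1\text{:}\)
\begin{equation*}
\text{totalWork} = n + \bigl(1 + 2 + 4 + \dots + (n-1)\bigr) = n + \bigl(2(n-1) - 1\bigr) = 3n - 3
\end{equation*}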
If adding \(n\) elements to the list requires \(O(n)\) work, then the average work per insertion is \(O(1)\text{.}\)
This analysis depends on our array size doubling with each grow. As long as each grow multiplies the size by a constant factor greater than 1, the amortized time for insertions at the end remains \(O(1)\text{.}\) So we could triple the size each time and still have the same amortized time complexity.
However, if we only added a constant amount to the size each time we grow, say space for 10 more items, there would be too many grow operations. There would be no way to amortize the cost and claim that there is an average \(O(1)\) time per insertion.
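A rough count shows why. If we grow by a fixed amount \(c\) (here \(c = 10\)), inserting \(n\) items requires about \(n/c\) resizes, and the \(k\)th resize copies about \(kc\) elements:
\begin{equation*}
\sum_{k=1}^{n/c} kc = c \cdot \frac{(n/c)(n/c + 1)}{2} \approx \frac{n^2}{2c} = O(n^2)
\end{equation*}
Dividing \(O(n^2)\) total work among \(n\) insertions gives \(O(n)\) average work per insertion, not \(O(1)\text{.}\)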
In practice, it is good to avoid frequent resizing. If you know you will be adding 10,000 items to an array-based list, it is better to request that it grow once to that size, rather than letting it grow multiple times as items are added.
std::vector provides a reserve function for this purpose. It allows you to request that the vector allocate enough space for a certain number of items ahead of time.
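For example, a small sketch that reserves space before filling a vector, so that no push_back triggers a resize:

#include &lt;vector&gt;

int main() {
    std::vector&lt;int&gt; values;
    values.reserve(10000);            // one allocation up front
    for (int i = 0; i < 10000; ++i) {
        values.push_back(i);          // capacity is sufficient, no resizing
    }
    return 0;
}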