Skip to main content

Section 32.12 Skip Lists

The final “self-balancing tree” structure we will look at is the Skip List. A skip list is not actually a tree at all. But it can be used to provide the same logarithmic time complexity for search, insertion, and deletion operations as the other self-balancing trees we have looked at.
A skip list is a linked list with multiple levels. The bottom level is a regular linked list that contains all the values in sorted order. Each higher level is a subset of the level below it, and contains pointers that allow us to skip over large sections of the list when searching for a value.
Five linked Lists stacked on top of each other, representing the levels of the skip list. Each level has a subset of the nodes in the level below it.🔗
Figure 32.12.1. A skip list with five levels. The bottom level has every value. Each higher level has fewer values.
Any search process, including finding a location to insert a new value, starts at the head of the highest level. We move across that list as long as the next value is less than the value we are looking for. Once we can’t move across any more, we drop down to the next level and repeat the process. We continue this process until we reach the bottom level. At that point, we are at the position where the value we are looking for should be, and we can check if it is there or not.

Insight 32.12.1.

Think of the higher levels of a skip list as “express lanes” We can move faster using them, but can only get on and off at certain points.

Activity 32.12.1. Skip List Find.

The skip list below is set to find the value 800. Press the > button to step through the process of finding that value. Notice how the skip list uses the higher levels to quickly skip over large sections of the list, and then drops down to lower levels to find the exact value.
The structure will be randomly generated each time you reset it, so try resetting it a few times to see how the structure changes.

Instructions.

When the > button is highlighted, an animation is prepared to run. Use the Step button to step through the animation one step at a time.
When no animation is in progress, you can use the data controls to start a new animation.
You can click and drag on the animation area to pan around. Use the Ctrl + mouse wheel (or touchpad scroll gesture) while over the animation area to zoom in and out.
  • Core Controls.
    • << : Skip to the beginning of the animation.
    • < : Step backwards one step in the animation.
    • > : Step forward one step in the animation.
    • >> : Skip to the end of the animation.
    • Auto Step Speed Set to anything other than off to automatically step through the animation at the selected speed.
    • Zoom Set the zoom level. You can also use Ctrl + Mouse Wheel to zoom in and out.
  • Data Controls.
    • Value Use this area to enter a value when inserting, finding, deleting, etc...
So how do we decide how many levels there are and what levels each node should be on? We flip a coin!.
Every time we insert a new value, we flip a coin to determine how many levels it should be on. If the coin comes up heads, we add the value to the next level up. We keep flipping the coin until we get tails. This means that on average, half of the values will be on the first level, a quarter of the values will be on the second level, an eighth of the values will be on the third level, and so on.

Activity 32.12.2. Skip List Insertion.

The skip list below is set to find the value 350. Press the Step button to step through the process.
Each time you run the animation a different sequence of “coin flips” will be generated, so use the Reset button to try it multiple times.
You may want to use the >> button to skip to the end and then < to step backwards to see the coin flips so you can avoid watching the find process.

Instructions.

When the > button is highlighted, an animation is prepared to run. Use the Step button to step through the animation one step at a time.
When no animation is in progress, you can use the data controls to start a new animation.
You can click and drag on the animation area to pan around. Use the Ctrl + mouse wheel (or touchpad scroll gesture) while over the animation area to zoom in and out.
  • Core Controls.
    • << : Skip to the beginning of the animation.
    • < : Step backwards one step in the animation.
    • > : Step forward one step in the animation.
    • >> : Skip to the end of the animation.
    • Auto Step Speed Set to anything other than off to automatically step through the animation at the selected speed.
    • Zoom Set the zoom level. You can also use Ctrl + Mouse Wheel to zoom in and out.
  • Data Controls.
    • Value Use this area to enter a value when inserting, finding, deleting, etc...
So how does this structure provide logarithmic time complexity for search, insertion, and deletion operations?
First, let’s focus on the expected height of the skip list. In a list of \(n\) values, we expect about \(n/2\) values to be on the first level, \(n/4\) values to be on the second level, \(n/8\) values to be on the third level, and so on. This means that the expected number of levels is about \(\log_2 n\text{.}\) So, the expected height of the skip list is logarithmic with respect to the number of values.
Now, working backwards from a node that was found at the end of the lowest level (the worst case), let’s consider how many moves it took to get to that location. Every time we move to the left (working back towards the head), there is a 50% chance that we can step up a level. (Because every node that reached this level got a coin flip to decide if it should be on the level above this one.) So, on average, every two times we move left, we can also move up a level. We expect to have to move up \(\log_2 n\) times. If we move left two times for each move up, that means we expect to move left about \(2 \log_2 n\) times. So, the expected number of moves either up or to the left is \(O(3 \log_2 n) = O(\log n)\text{.}\)
So, the expected time complexity for search, insertion, and deletion operations in a skip list is \(O(\log n)\text{.}\) However, like with a treap, this is a probabilistic guarantee, not a strict guarantee. It is possible to get unlucky with the random coin flips and end up with a very unbalanced skip list.

Insight 32.12.2.

A skip list has a probabilistic average time complexity of \(O(\log n)\) for insert/find/delete.
We also might wonder about the space required to store all of those extra levels. We expect that each level has about half as many nodes as the level below it. So if the first level has \(n\) nodes, the number of nodes in the other levels is \(n/2 + n/4 + n/8 + \cdots\text{.}\) That sequence converges to \(n\text{,}\) so all of the extra levels combine to \(O(n)\) extra space, which is the same order of magnitude as the space required to store the values themselves.
Aside from being an interesting alternate way of achieving logarithmic time complexity, skip lists have some practical advantages over self-balancing trees. One key benefit is that insertions and deletions have minimal impact on the rest of the structure. In self-balancing trees, those operations can cause a series of rotations that affect the structure of the tree in other places.
In data structures designed to support concurrent access by multiple threads, this property can be a big advantage. In a self-balancing tree, if one thread is performing an insertion or deletion that causes rotations, it is hard to determine when other threads can safely access the tree. In a skip list, insertions and deletions only affect the local structure around the inserted or deleted node, so other threads can continue to access the rest of the skip list without waiting.
Real world examples of skip list can be found in Java, which provides a ConcurrentSkipListMap class that is more performant than the standard TreeMap class in concurrent scenarios, and the popular in-memory database Redis, which uses skip lists to implement sorted sets.

Checkpoint 32.12.1.

You have attempted of activities on this page.