Big-O

Section 25.3 Big-O

Subsection 25.3.1 What is Big-O?

As we saw on the last page, the exact way in which we count units of work in an algorithm is not as important as the degree to which the algorithm depends on the size of its input. An algorithm that always involves the same amount of work is more efficient than one where the work grows as a function of the input size.

🔗

This is the essence of Big-O analysis. It is a technique to describe how the work required by an algorithm grows as the size of the input increases. The key thing to identify is not an exact function that describes the work like \(f(n) = 2n + 3\text{,}\) but rather the class of function that describes the growth rate.

🔗

A function’s class, or order (hence the O in Big-O), is determined by its dominant term - the part of the function that grows the fastest as the input size increases. For example, in the function \(f(n) = 2n^2 + 3n + 6\text{,}\) the dominant term is \(2n^2\) because as \(n\) gets very large, \(2n^2\) will grow much faster than either \(3n\) or \(6\text{.}\) Thus, we say that this function is in the class of quadratic functions.

🔗

An algorithm like DrawSquare is called a constant algorithm because the amount of work it requires does not change with the size of the input. The function to describe the work looks like \(f(n) = c\text{,}\) where \(c\) is a constant. What that constant is does not matter for Big-O analysis. An algorithm that’s work is always constant is said to be \(O(1)\) (usually pronounced “big-oh of one”).

🔗

An algorithm like DrawShape is called a linear algorithm because the amount of work it requires grows in direct proportion to the size of the input (a linear function). The function to describe the work looks like \(f(n) = kn + c\text{,}\) where \(k\) and \(c\) are constants. Again, what those constants are does not matter when doing Big-O analysis. \(f(n) = 3n + 6\) or \(f(n) = 5n + 2\) are both linear functions and would both be described as \(O(n)\) in Big-O notation. (Which would be pronounced “big-oh of n”.)

🔗

Saying an algorithm is \(O(n)\) is just a short way of saying that “if you analyze the work \(f(n)\) done by the algorithm for a problem size of \(n\text{,}\) you will find that the \(f(n)\)’s dominant term is \(kn\) where \(k\) is a constant.”

🔗

Insight 25.3.1.

The \(\_\) in \(O(\_)\) always represents the dominant, or fastest-growing, term in the function that describes the work required by an algorithm for an input of size n.

🔗

For instance, say that we discover that the function \(f(n)\) that estimates the work for a particular algorithm is \(2n + 3\text{.}\) In that expression,\(2n\) is the dominant term, so we say that the algorithm is linear or that it has a Big-O of \(O(n)\text{.}\)

🔗

The function that estimates work for another algorithm might be \(f(n) = n^2 - 2n + 1\text{.}\) In that expression,\(n^2\) is the dominant term. A function where \(n^2\) is the dominant term is called quadratic, so we say that the algorithm is quadratic or that it has a Big-O of \(O(n^2)\text{.}\)

🔗

For now, we will only see the following Big-O categories. They are listed in order of slowest growing to fastest growing.

🔗

Table 25.3.1. Some Big-O Categories

🔗

Big-O Category	Name	Description
\(O(1)\)	Constant	The algorithm does the same amount of work regardless of the input size. 🔗
\(O(\log n)\)	Logarithmic	\(log(n)\) is the dominant term in the function describing the work required by the algorithm. 🔗
\(O(n)\)	Linear	\(n\) is the dominant term in the function describing the work required by the algorithm. 🔗
\(O(n\log n)\)	Log Linear	\(n \log n\) (n times log n) is the dominant term in the function describing the work required by the algorithm. 🔗
\(O(n^{2})\)	Quadratic	\(n^{2}\) is the dominant term in the function describing the work required by the algorithm. 🔗

Most algorithms in computer science that involve logarithms use base 2. But, rather than write \(\log_2 n\) (log base 2 of \(n\)), it is common practice to just write \(\log n\) .

🔗

Why is it OK to leave off the base of the logarithm? \(\log_2 100\) on a calculator gives a different answer than \(\log_{10} 100\text{,}\) so why can we not specify which it is?

🔗

Logarithms with different bases differ only by a constant factor. For example, you can convert log base 10 of x to log base 2 of x by multiplying by \(\log_{2} 10\) which is approximately 3.32.

\begin{align*} \log_{2} x \amp = \log_{2} 10 \cdot \log_{10} x\\ \log_{2} \amp \approx {3.32} \cdot {\log_{10} x} \end{align*}

In Big O, we don’t care about constant factors, so the difference between \(\log_2\) and \(\log_{10}\) does not matter.

🔗

Note 25.3.2.

When you see \(\log n\) you should understand it as \(\log_2 n\) (log base 2) unless otherwise specified and use that base for any calculations or comparisons involving logarithms.

🔗

A graph showing common Big-O categories. The x-axis is labeled "Input Size (n)" and the y-axis is labeled "Number of Operations". The graph shows five curves: Constant O(1) is a horizontal line, Logarithmic O(log n) increases slowly, Linear O(n) is a straight diagonal line, Log Linear O(n log n) curves upward more steeply, and Quadratic O(n^2) curves upward steeply. — Figure 25.3.2. Visual Comparison of Big-O Categories. The Big-O category determines how the work, \(f(n)\) relates to the input size \(n\text{.}\)
🔗

Note 25.3.3. Technical Note.

Big-O notation technically describes an upper bound on the growth rate of a function. So when we say that an algorithm is \(O(n)\text{,}\) we are saying that its work grows no faster than a linear function. By that definition, an algorithm that is \(O(n)\) could also be \(O(n^2)\) or \(O(2^n)\text{,}\) since those are upper bounds as well.

🔗

In algorithmic analysis, Big Theta (\(\Theta\)) notation is used to describe a tight bound. That is to say a class of function that could describe both an upper and lower bound.

🔗

However, in practice, when people say that an algorithm is \(O(n)\text{,}\) they generally mean the tighter bound. Outside the context of deep mathematical analysis, you should assume Big O means the dominant term or a tightest upper bound.

🔗

Subsection 25.3.2 Big-O, Constants, and Limitations

A key thing to note in Figure 25.3.2: The Big-O category of a function does not tell us about how efficient at small values of \(n\text{.}\) Look at the graph close to \(n = 0\text{.}\) At small values of \(n\) (less than 1), \(f(n) = n\) are less than \(f(n) = n^2\) even though we consider \(O(n^2)\) to be less efficient than \(O(n)\) .

🔗

At that scale, it also would not be true that the constants and non-dominant terms do not matter. Consider this table of values for \(f(n) = 20n + 1000\) and \(f(n) = n^2 + 1\text{:}\)

🔗

n	1	10	20	30	40	50	100
\(f(n) = 20n + 1000\)	1020	1200	1400	1600	1800	2000	3000
\(f(n) = n^2 + 1\)	2	101	401	901	1601	2501	10001

For values less then ~50, the linear function \(20n + 1000\) actually produces larger values than the quadratic function \(n^2 + 1\text{.}\) It is only after that point that the quadratic function becomes larger.

🔗

For these reasons, we should only use Big-O to consider the efficiency of functions for large input sizes (values of \(n\)). And, if the constant factors are extraordinarily large, we may have to take the Big-O notation with a grain of salt. For example, when comparing \(1000n\) to \(2n \log n\text{,}\) Big-O tells us that the former is more efficient. \(n\) grows more slowly than \(n \log n\text{.}\) However, in practice, it will take an extraordinarily large value of \(n\) for \(2n \log n\) to become larger than \(1000n\text{.}\)

🔗

Insight 25.3.4.

Big-O focuses on the relative performance of functions as the input size grows large.

🔗

It does not tell us much about their performance for small input sizes. It also tells us nothing about the relative performance of algorithms in the same Big-O category.

🔗

Checkpoint 25.3.1.

If the amount of work for a job of size \(n\) is estimated by \(f(n)=2n+3n^{2}-1\) what is the Big O?

🔗

O(\(2n\))
No, \(3n^{2}\) dominates \(2n\text{.}\) Try again.
O(\(n\))
No, \(n^{2}\) dominates \(n\text{.}\) Try again.
O(\(3n^{2}\))
No, the 3 should be omitted because we don’t care about constants in Big-O notation.
O(\(n^{2}\))
Right!
More than one of the above
No, only one of them is correct. Try again.

🔗

Checkpoint 25.3.2.

If the amount of work for a job of size \(n\) is estimated by \(f(n)=2n+5\log n\) what is the Big O?

🔗

O(\(\log n\))
No, \(2n\) dominates \(\log n\text{.}\) Try again.
O(\(n \log n\))
No, \(n \log n\) means “n times log n”. Here the \(\log n\) term is separate from the \(n\) term.
O(\(2n\))
No, the 2 should be omitted because we don’t care about constants in Big-O notation.
O(\(n\))
Right!
More than one of the above
No, only one of them is correct. Try again.

🔗

Checkpoint 25.3.3.

If the amount of work for a job of size \(n\) is estimated by \(f(n)=\frac{n\log n}{2} + 100n + 5\) what is the Big O?

🔗

O(\(\log n\))
No, \(2n\) dominates \(\log n\text{.}\) Try again.
O(\(n \log n\))
No, \(n \log n\) means “n times log n”. Here the \(\log n\) term is separate from the \(n\) term.
O(\(2n\))
No, the 2 should be omitted because we don’t care about constants in Big-O notation.
O(\(n\))
Right!
More than one of the above
No, only one of them is correct. Try again.

🔗

Checkpoint 25.3.4.

Order the following from slowest growing to fastest growing.

🔗

constant O(1)
---
logarithmic O(logn)
---
linear O(n)
---
log linear O(nlogn)
---
quadratic O(n^2)

🔗

Checkpoint 25.3.5.

Which of the following statements is true about the two algorithms? Algorithm 1’s work: \(100n + 1\) Algorithm 2’s work: \(n^2 + n + 1\)

🔗

Algorithm 1 will require a greater number of steps to complete than Algorithm 2
This could be true depending on the input. But what if n is 1,000?
Algorithm 2 will require a greater number of steps to complete than Algorithm 1
This could be true depending on the input. But what if n is 1?
At large values of n, Algorithm 1 will require a greater number of steps to complete than Algorithm 2
Correct!
Algorithm 1 and 2 will always require the same number of steps to complete
No, the efficiency of both will depend on the input

🔗

You have attempted of activities on this page.

🔗

Prev Top Next