As we saw on the last page, the exact way in which we count units of work in an algorithm is not as important as the degree to which the algorithm depends on the size of its input. An algorithm that always involves the same amount of work is more efficient than one where the work grows as a function of the input size.
This is the essence of Big-O analysis. It is a technique to describe how the work required by an algorithm grows as the size of the input increases. The key thing to identify is not an exact function that describes the work like \(f(n) = 2n + 3\text{,}\) but rather the class of function that describes the growth rate.
A function's class, or order (hence the O in Big-O), is determined by its dominant term: the part of the function that grows the fastest as the input size increases. For example, in the function \(f(n) = 2n^2 + 3n + 6\text{,}\) the dominant term is \(2n^2\) because as \(n\) gets very large, \(2n^2\) will grow much faster than either \(3n\) or \(6\text{.}\) Thus, we say that this function is in the class of quadratic functions.
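To make that concrete, here is a small Python sketch (an illustration, not part of the original text) that compares the dominant term \(2n^2\) with the remaining terms \(3n + 6\) as \(n\) grows:

```python
# Compare the dominant term 2n^2 with the lower-order terms 3n + 6.
# As n grows, the lower-order terms become a vanishing fraction of the total.
for n in [1, 10, 100, 1000, 10000]:
    dominant = 2 * n ** 2
    rest = 3 * n + 6
    print(f"n={n:>6}  2n^2={dominant:>11}  3n+6={rest:>6}  rest/dominant={rest / dominant:.5f}")
```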
An algorithm like DrawSquare is called a constant algorithm because the amount of work it requires does not change with the size of the input. The function that describes the work looks like \(f(n) = c\text{,}\) where \(c\) is a constant. What that constant is does not matter for Big-O analysis. An algorithm whose work is always constant is said to be \(O(1)\) (usually pronounced “big-oh of one”).
An algorithm like DrawShape is called a linear algorithm because the amount of work it requires grows in direct proportion to the size of the input (a linear function). The function that describes the work looks like \(f(n) = kn + c\text{,}\) where \(k\) and \(c\) are constants. Again, what those constants are does not matter when doing Big-O analysis. Both \(f(n) = 3n + 6\) and \(f(n) = 5n + 2\) are linear functions, and both would be described as \(O(n)\) in Big-O notation (pronounced “big-oh of n”).
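DrawSquare and DrawShape were defined on an earlier page; as a stand-in, here is a hypothetical pair of Python functions (not from the original text) that show the same two growth classes:

```python
def first_item(items):
    # Constant, O(1): the same single step regardless of how long the list is.
    return items[0]

def total(items):
    # Linear, O(n): one addition per item, so the work grows in direct
    # proportion to the number of items.
    running_sum = 0
    for value in items:
        running_sum += value
    return running_sum
```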
Saying an algorithm is \(O(n)\) is just a short way of saying “if you analyze the work \(f(n)\) done by the algorithm for a problem size of \(n\text{,}\) you will find that \(f(n)\)'s dominant term is \(kn\text{,}\) where \(k\) is a constant.”
The \(\_\) in \(O(\_)\) always represents the dominant, or fastest-growing, term in the function that describes the work required by an algorithm for an input of size \(n\text{.}\)
For instance, say that we discover that the function \(f(n)\) that estimates the work for a particular algorithm is \(2n + 3\text{.}\) In that expression, \(2n\) is the dominant term, so we say that the algorithm is linear or that it has a Big-O of \(O(n)\text{.}\)
The function that estimates work for another algorithm might be \(f(n) = n^2 - 2n + 1\text{.}\) In that expression, \(n^2\) is the dominant term. A function where \(n^2\) is the dominant term is called quadratic, so we say that the algorithm is quadratic or that it has a Big-O of \(O(n^2)\text{.}\)
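A common source of quadratic work is a nested loop over the same input. The following hypothetical Python sketch (not from the original text) does roughly \(n \times n\) comparisons for an input of size \(n\):

```python
def count_pairs_with_sum(numbers, target):
    # Quadratic, O(n^2): for each of the n items we scan all n items again,
    # so the work grows like n * n.
    count = 0
    for a in numbers:
        for b in numbers:
            if a + b == target:
                count += 1
    return count
```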
Most algorithms in computer science that involve logarithms use base 2. But rather than write \(\log_2 n\) (log base 2 of \(n\)), it is common practice to just write \(\log n\text{.}\)
Why is it OK to leave off the base of the logarithm? \(\log_2 100\) on a calculator gives a different answer than \(\log_{10} 100\text{,}\) so why do we not need to specify which base we mean?
Logarithms with different bases differ only by a constant factor. For example, you can convert \(\log_{10} x\) to \(\log_2 x\) by multiplying by \(\log_{2} 10\text{,}\) which is approximately 3.32.
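A quick check of that conversion using Python's standard math module (an illustration, not from the original text):

```python
import math

x = 100
# log base 2 of x equals log base 10 of x multiplied by log base 2 of 10 (about 3.32).
print(math.log2(x))                   # 6.64385618977...
print(math.log10(x) * math.log2(10))  # the same value
```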
When you see \(\log n\) you should understand it as \(\log_2 n\) (log base 2) unless otherwise specified and use that base for any calculations or comparisons involving logarithms.
Big-O notation technically describes an upper bound on the growth rate of a function. So when we say that an algorithm is \(O(n)\text{,}\) we are saying that its work grows no faster than a linear function. By that definition, an algorithm that is \(O(n)\) could also be \(O(n^2)\) or \(O(2^n)\text{,}\) since those are upper bounds as well.
In algorithmic analysis, Big Theta (\(\Theta\)) notation is used to describe a tight bound: a class of functions that serves as both an upper and a lower bound on the growth rate.
However, in practice, when people say that an algorithm is \(O(n)\text{,}\) they generally mean the tighter bound. Outside the context of deep mathematical analysis, you should assume Big-O refers to the dominant term, that is, the tightest upper bound.
A key thing to note in Figure 25.3.2: the Big-O category of a function does not tell us how efficient an algorithm is at small values of \(n\text{.}\) Look at the graph close to \(n = 0\text{.}\) At small values of \(n\) (less than 1), the values of \(f(n) = n^2\) are actually smaller than those of \(f(n) = n\text{,}\) even though we consider \(O(n^2)\) to be less efficient than \(O(n)\text{.}\)
At that scale, the constants and non-dominant terms can no longer be ignored either. Consider this table of values for \(f(n) = 20n + 1000\) and \(f(n) = n^2 + 1\text{:}\)
For values of \(n\) below about 44, the linear function \(20n + 1000\) actually produces larger values than the quadratic function \(n^2 + 1\text{.}\) Only after that point does the quadratic function become larger.
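The following short Python sketch (illustrative, not from the original text) evaluates both functions at a few sample points and finds where the quadratic function overtakes the linear one:

```python
def linear(n):
    return 20 * n + 1000

def quadratic(n):
    return n ** 2 + 1

# Sample a few values of each function.
for n in [1, 10, 25, 50, 100]:
    print(f"n={n:>3}  20n+1000={linear(n):>5}  n^2+1={quadratic(n):>6}")

# Find the first n at which the quadratic function exceeds the linear one.
crossover = next(n for n in range(1, 1000) if quadratic(n) > linear(n))
print("quadratic overtakes linear at n =", crossover)  # prints 44
```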
For these reasons, we should only use Big-O to reason about the efficiency of algorithms for large input sizes (values of \(n\)). And, if the constant factors are extraordinarily large, we may have to take the Big-O comparison with a grain of salt. For example, when comparing \(1000n\) to \(2n \log n\text{,}\) Big-O tells us that the former is more efficient, since \(n\) grows more slowly than \(n \log n\text{.}\) However, in practice, it takes an extraordinarily large value of \(n\) for \(2n \log n\) to become larger than \(1000n\text{.}\)
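To see just how large: \(2n \log_2 n > 1000n\) only when \(\log_2 n > 500\text{,}\) that is, when \(n > 2^{500}\text{,}\) a number with more than 150 decimal digits.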
Big-O also does not tell us much about an algorithm's performance for small input sizes, and it tells us nothing about the relative performance of algorithms in the same Big-O category.