DSA Foundations
Before getting too deep, it's important to review some basics. This section reviews fundamental programming principles including recursion and iteration, along with some math: exponents, logarithms, and series. The code also makes use of generics, which is helpful but not strictly necessary.
Recursion & Iteration
Recursion in programming represents a design pattern wherein you define a procedure in terms of itself, which is done by calling a defined procedure from within itself. Each call creates a new stack frame (or "activation frame") in which the computer retains all of the state needed to evaluate that particular step before it attempts the next recursive call; the computer suspends each frame while the next recursive operation is evaluated. To avoid an infinite logical loop the procedure relies on some basic assertion called a "base case" which can be evaluated without recursion, and it must contain proper logic to make progress towards the base case with each successive call. In this way you divide a problem into a finite number of repeated steps that each move closer to the base case. Once the computer evaluates the base case, the process unwinds through each of the temporarily suspended stack frames, using each frame's state to carry out its part of the operation, and eventually returns the final result. Because of this, improperly designed recursive calls run the risk of overflowing the stack.
In summary recursive implementations must always:
- Have a base case which can be solved without recursion
- Make progress towards the base case with each step; this may involve testing for multiple different recursive cases
- Assume that all the recursive calls work
- Carry out unique operations and not duplicate work by solving the same instance of a problem with each recursive call
When designing a recursive algorithm it's useful to think of the basic operations that each step needs in order to make progress towards a base case. This tends to involve parameterizing the function in a way that is not immediately obvious in order to solve sub-problems. You can always make that function private and give it a cleaner public interface if exposing the additional parameters proves unsightly or unmanageable. Here is an example for those who enjoy living the simple life and avoiding nested functions:
```rust
// Private implementation
fn array_sum_function(data: Vec<i32>, low: usize, high: usize) -> i32 {
    ...
}

// Public interface
pub fn array_sum_interface(data: Vec<i32>) -> i32 {
    let h = data.len() - 1;
    array_sum_function(data, 0, h)
}
```
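For reference, one hypothetical way to fill in the elided private body is a recursive sum over the index range low..=high; this is only a sketch of the pattern, not the only reasonable implementation.

```rust
// Hypothetical sketch of the private recursive body elided above.
// Each call shrinks the low..=high range until one element remains.
fn array_sum_function(data: Vec<i32>, low: usize, high: usize) -> i32 {
    if low == high {
        // Base case: a single element needs no further recursion
        return data[low];
    }
    // Recursive case: the first element plus the sum of the rest of the range
    data[low] + array_sum_function(data, low + 1, high)
}
```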
Iteration typically involves using a language-specific looping mechanism. These loops rely on changes in internal state, such as a loop counter, logical condition, or a finite set of elements within an iterator or range to evaluate all necessary components of the calculation.
Consider the following trivial factorial evaluator functions:
```rust
// Recursive function
pub fn factorial_0(n: u32) -> u32 {
    if n <= 1 { // base case
        return n;
    }
    return n * factorial_0(n - 1); // recursive call to self
}
```
```rust
// Iterative function
pub fn factorial_1(n: u32) -> u32 {
    let mut fac = 1;
    for e in 2..=n { // loops through all iterations 2 to n
        fac *= e;
    }
    fac
}
```
A general rule is to use iteration (and specifically iterators) over recursion for simplicity and efficiency. That being said, there are situations in which a function is actually more complicated to implement iteratively than recursively, in which case the recursive implementation is the correct choice. One classic example of this is the Tower of Hanoi problem.
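As a rough illustration of why the recursive version is the natural choice, here is a minimal Tower of Hanoi sketch (the peg labels and the `hanoi` name are just choices made for this example).

```rust
// Minimal recursive Tower of Hanoi sketch: move n disks from `from`
// to `to`, using `via` as the spare peg.
fn hanoi(n: u32, from: char, to: char, via: char) {
    if n == 0 {
        return; // base case: nothing left to move
    }
    hanoi(n - 1, from, via, to); // move the n - 1 smaller disks out of the way
    println!("move disk {} from {} to {}", n, from, to); // move the largest disk
    hanoi(n - 1, via, to, from); // stack the n - 1 disks back on top of it
}

fn main() {
    hanoi(3, 'A', 'C', 'B');
}
```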
Tail recursion represents a function or algorithmic form where the return call is solely recursive. For example, `factorial_0` above does not represent tail recursion because the final return call involves multiplying the recursive call by n.
```rust
return n * factorial_0(n - 1); // NOT tail recursion
```
However, the following binary search algorithm does represent tail recursion because its final operation is purely a recursive call.
```rust
pub fn binary_search(a: &Vec<i32>, t: i32, left: i32, right: i32) -> i32 {
    if left > right {
        return -1;
    } else {
        let mid = (left + right) / 2;
        if t == a[mid as usize] {
            return mid;
        } else if t < a[mid as usize] {
            return binary_search(a, t, left, mid - 1);
        } else {
            // Purely recursive final operation signifying tail recursion
            return binary_search(a, t, mid + 1, right);
        }
    }
}
```
Tail recursion is special because it can be mechanically reimplemented non-recursively by enclosing the body in a loop. When possible you should eliminate tail recursion to increase performance by eliminating all those expensive function calls and potentially even eliminating stack overflows. This optimization is so common that some programming languages automatically convert these cases by default during compilation or interpretation (depending on the language, naturally).
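To make the conversion concrete, here is a sketch of the binary search above with the tail calls replaced by a loop that simply updates `left` and `right`; it is intended to mirror the recursive version, not to be a definitive implementation.

```rust
// Tail recursion eliminated: the recursive calls become updates to
// `left` and `right` inside a loop, so no new stack frames are created.
pub fn binary_search_iter(a: &Vec<i32>, t: i32, mut left: i32, mut right: i32) -> i32 {
    while left <= right {
        let mid = (left + right) / 2;
        if t == a[mid as usize] {
            return mid;          // found the target
        } else if t < a[mid as usize] {
            right = mid - 1;     // was: recurse on the left half
        } else {
            left = mid + 1;      // was: recurse on the right half
        }
    }
    -1 // corresponds to the `left > right` base case above
}
```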
Algorithmic Complexity
The best way to measure and compare algorithms (processes) is to measure the growth rate of a function's time and memory requirements as its input size grows. This section discusses two approaches to measuring an algorithm's asymptotic complexity.
Empirical testing
Empirical testing can be a useful tool to compare the behavior of a particular implementation. It also happens to be a great introduction to the idea of asymptotic behavior. Empirical testing involves implementing an algorithm and running it against a timer with varying input sizes. For example, in Java you could use `currentTimeMillis()` (or, for finer resolution, `nanoTime()`) to benchmark how long an execution takes. This example illustrates a basic test that runs an `insertionSort()` algorithm and prints the running times.
```java
public static void empiricalTest() {
    // Keep track of running times
    ArrayList<Long> runningTime = new ArrayList<>();

    // Input to measure the insertionSort() implementation
    int[] unsorted = {1, 5, 7, 3, 5, 6, 7, 8, 8, 12, 23, 23, 24, 25};

    for (int i = 1; i <= 10; i++) {
        long startTime = System.nanoTime();
        int[] sorted = insertionSort(unsorted);
        long endTime = System.nanoTime();

        if (i == 1) System.out.println(Arrays.toString(sorted));

        // Calculates and adds completionTime for average calculation
        long completionTime = endTime - startTime;
        runningTime.add(completionTime);
        System.out.println("Sorting took " + completionTime + " nanoseconds");
    }

    // Calculates and prints average running time
    double avg = runningTime.get(0);
    for (int i = 1; i < runningTime.size(); i++) {
        avg += runningTime.get(i);
    }
    avg = avg / runningTime.size();
    System.out.println("With an average of " + avg + " nanoseconds");
}
```
To get an idea of how the algorithm behaves as its input grows you need to run the test many times for varying input sizes and graph the time against the input size. This particular algorithm happens to grow quadratically with the input size, which is to say that the time it takes to execute is proportional to n^2. This is fine for testing limited/specific instances, but it requires an unnecessarily large amount of testing to be used as a general analysis tool for comparing algorithmic processes. Additionally, hardware conditions and anomalies in input size or value selection can introduce variance into the measurements.
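As a sketch of the "vary the input size" idea (written in Rust rather than Java, and using a hypothetical `insertion_sort` helper defined inline), a timing harness might look something like this:

```rust
use std::time::Instant;

// Hypothetical insertion sort used only to illustrate the benchmark;
// any quadratic sort would behave similarly here.
fn insertion_sort(data: &mut [i32]) {
    for k in 1..data.len() {
        let c = data[k];
        let mut j = k;
        while j > 0 && data[j - 1] > c {
            data[j] = data[j - 1];
            j -= 1;
        }
        data[j] = c;
    }
}

fn main() {
    // Double the input size each round and record the elapsed time;
    // plotting size against time should reveal roughly quadratic growth.
    for size in [1_000, 2_000, 4_000, 8_000, 16_000] {
        let mut data: Vec<i32> = (0..size).rev().collect(); // worst case: reverse sorted
        let start = Instant::now();
        insertion_sort(&mut data);
        println!("n = {:>6}: {:?}", size, start.elapsed());
    }
}
```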
Asymptotic analysis
Empirical testing can be costly and error-prone, especially if the test is not designed correctly. As a result it is generally preferable to apply a more formalized mathematical framework to analyzing algorithms before any empirical testing takes place. This formalized framework is commonly referred to as asymptotic analysis. The field of mathematics characterizes asymptotic behavior as the behavior of a function as its inputs approach infinity. In computer science you rarely deal with infinite numbers, so it may be better to characterize asymptotic behavior as the behavior of a function as its input becomes very large. Oftentimes any algorithm is fine to use when the data is small, but certain choices become prohibitively expensive in time, computing power, or storage space as inputs grow large. When you compare two functions in terms of their asymptotic behavior, you're interested in how they grow relative to each other as their input becomes very large.
Asymptotic analysis allows you to define and easily compare the growth rates from generalized, high-level descriptions of the algorithm. For example, if you have a problem that may be solved by two algorithm options with known growth rates it becomes easier to choose the more efficient design approach before any implementation takes place. Once you implement an algorithmic solution it still may be wise to check your assumptions with empirical testing and real world data!
To analyze an algorithm’s complexity you can start by algebraically organizing the algorithm’s basic operations such as arithmetic operations, method calls/returns, accessing array values, comparisons, assignments, etc. Operational specifics can vary by hardware and programming language so the operations must be abstracted in a way that provides a consistent, generalized measure. Due to the goals of the analysis and the underlying mathematics the eventual analysis only takes into account the most significant term of operations. The resultant expression of operations is presented as a function that represents the running time complexity T(n) of the algorithm, where n represents the size of the input.
The general approach is to analyze an operation from the inside out. Use the following rules in addition to counting primitive operations.
- The running time of a loop is at most the running time of the statements inside the loop (including tests) multiplied by the number of iterations.
- Analyze nested loops from the inside out. The total running time is the running time of the statements inside the inner-most loop multiplied by the number of iterations of the next outer loop, and so on up to the top-level loop.
- Consecutive statements are additive, which is to say that only the greatest term is counted. For example, if you have one loop with n operations followed by a nested loop with n^2 operations, the result is still just n^2 operations.
- if/else blocks are never more than the running time of the test plus the larger time complexity of the branched operation. This may be a drastic overestimate in some situations, but never an underestimate.
To illustrate a basic analysis consider the following implementation of an `insertionSort()` method.
```java
// Operation counting per statement
public static int[] insertionSort(int[] data) {
    int n = data.length;                      // 1
    for (int k = 1; k < n; k++) {             // 1 + (n - 1) + (n - 1) = 2n - 1, roughly n iterations
        int c = data[k];                      // 1
        int j = k;                            // 1
        while (j > 0 && data[j - 1] > c) {    // up to roughly 2n test operations
            data[j] = data[j - 1];            // 1
            j--;                              // 1  -> about 2 * 2n = 4n, roughly n, per outer pass
        }
        data[j] = c;                          // 1
    }                                         // outer n * inner n = n^2
    return data;                              // 1
}
```
Starting from the deepest portion of the algorithm puts you in the `while` loop. The `while` loop executes 2 operations for a variable number of iterations based on k, which could be as high as n - 1, so for simplicity you can call that n iterations. The outer `for` loop contains 3 initialization/assignment operations and is performed up to n times. Overall this means the nested loop structure requires n * n, or n^2, operations. The method also contains a length assignment and a return, which represent 2 additional constant operations that do not affect the more significant n^2 term.
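Tallying it loosely in the same spirit as the comments above, roughly n outer passes each perform about 3 constant operations plus up to 2n operations in the while loop, plus the 2 constant statements outside the loop, which gives approximately T(n) = n * (3 + 2n) + 2 = 2n^2 + 3n + 2 operations; the n^2 term dominates as n grows.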
The analysis establishes the operation count as a generalized running time complexity function T. That is to say, if an algorithm (or function) f performs a number of operations that depends on the input size n, its running time is written as T(n). In the case of the example analysis of the insertion sort algorithm you could say that its time complexity is T(n) = n^2.
Common growth functions
The growth rate (time complexity) of most algorithms falls into just a handful of classifications. Several of the most common growth rates involve logarithms. Because computers mostly deal with binary, the base (radix) of the logarithmic term is often assumed to be 2. For this reason it is common to omit bases entirely in algorithmic analysis and notation because logarithms of different bases differ only by a constant factor. Therefore, log_2 is commonly written as just log. The following table lists common time complexity classes from cheapest (grows the slowest) to most expensive (grows the fastest):
Function | Time Complexity |
---|---|
T(n) = c | Constant |
T(n) = log n | Logarithmic |
T(n) = log^2 n | Log Squared |
T(n) = n | Linear |
T(n) = n * log(n) | N-Log-N or Linearithmic |
T(n) = n^2 | Quadratic |
T(n) = n^3 | Cubic |
T(n) = b^n | Exponential |
T(n) = n! | Factorial |
When considering speed and efficiency in practical applications you generally want your data structures to operate in linear time or better, and your algorithms to operate in N-log-N time or better. By these criteria the insertion sort algorithm is not considered very efficient.
Asymptotic notation
Running time complexity can be categorized by a handful of common definitions and expressed in asymptotic notation. As previously stated, the categorization focuses on the most significant terms of the operational complexity because these terms represent the “dominant” growth indicator. Indeed it is preferred to ignore constants and less significant terms within the function’s true operational complexity when determining asymptotic behavior. Additionally, it is most accurate to consider the tightest bounds when describing an algorithm. For example, if your friend lives 3 miles away it is true that your friend lives no more than 3,000 miles away, but more accurate to say that your friend lives within 5 miles of you.
Big O
Definition: T(n) = O(f(n)) if there exist positive constants c and n0 such that T(n) <= cf(n) when n >= n0
Big O (derived from the word "order," as in "order of growth") characterizes the upper bound of a function's growth rate and represents perhaps the most important metric by which to measure an algorithm's growth. Big O is popular because in the real world it's often most pragmatic to consider the worst case scenario for a given computational input. This definition states that there is some threshold n0 beyond which T(n) <= c * f(n), meaning that past this point the function T(n) grows no faster than f(n), ignoring constant factors.
It is important to note that Big O should express the tightest upper bound you can establish for a function. For example, consider f(n) = 3n^2 + 100n + 6. Taking only the most significant term and disregarding the constants, this function is said to be O(n^2). The function is also, logically, O(n^3), O(n!), and so on, because suitable constants c and thresholds n0 exist for those bounds as well. However, because O(n^2) is the tightest of these bounds, the function is most accurately said to be O(n^2).
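To make the definition concrete for this example, one workable pair of constants is c = 4 and n0 = 101: 3n^2 + 100n + 6 <= 4n^2 exactly when 100n + 6 <= n^2, and that inequality holds for every n >= 101.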
Big Omega
Definition: T(n) = Ω(g(n)) if there are positive constants c and n0 such that T(n) >= cg(n) when n >= n0
Big Omega provides the lower bound, or best case scenario. If Big O describes a function that is <= another function, then Big Omega describes a function that is >= another function.
Big Theta
Definition: T(n) = Θ(h(n)) iff T(n) = O(h(n)) and T(n) = Ω(h(n))
Big Theta provides asymptotically tight bounds where h(n) is both an upper and lower bound for T(n). In effect, Big Theta defines a situation where the asymptotic bounds are pragmatically or operationally indistinguishable. That is to say that the two functions grow at the same rate up to constant factors.
Little o
Definition: T(n) = o(p(n)) if for all positive constants c there exists an n0 such that T(n) < cp(n) when n > n0; Alternatively, T(n) = o(p(n)) if T(n) = O(p(n)) and T(n) != Θ(p(n))
Little o defines a situation in which a function grows strictly slower than p(n).
Though it is not strictly accurate and there is more nuance, you can generally think of these asymptotic bounds in simpler terms:
- O(n) can be read as “at worst linear time”
- Ω(n) can be read as “at least linear time,” providing a minimum runtime bound
- Θ(n) becomes “almost precisely linear time,” indicating tight bounds
- o(n) can be read as “always faster than linear time”
At first glance O(f(n)) and o(p(n)) appear to be the same thing, but this is where the nuance lies. Big O is "at worst" and little o is "always faster than," which means that an algorithm that is o(n) can never run in linear time, because it is strictly faster than linear time, whereas a function that is O(n) can run in precisely linear time.
Proofs
It is often necessary to prove that an algorithm is correct or that it runs fast. To do this you will commonly encounter the following three basic methodologies.
Counterexample
To prove that a statement is true by example would take an infinite number of examples, but proving that a statement is false only takes one counterexample. For example, you can prove the statement "2^i - 1 is prime when i > 1" false by noting that 2^4 - 1 = 15 = 3 * 5.
Contra attacks
Contra attacks involve de Morgan's laws, which state that:

The negation of "A and B" is the same as "not A or not B".

The negation of "A or B" is the same as "not A and not B".

There are two contra attacks: contrapositive and contradiction. Take the statement, "Let *a* and *b* be integers. If *ab* is even, then *a* is even OR *b* is even." The contrapositive would read, "If *a* is odd and *b* is odd, then *ab* is odd." The contradiction would read, "Assume *ab* is even, *a* is odd, and *b* is odd. The result is that *ab* is odd, which contradicts the assumption that *ab* is even."
Induction
Inductive proofs operate on the premise that a general statement can be established in a finite number of steps: prove a trivial base case directly, then show that if the statement holds for one case it must also hold for the next. In practice you might start with a concrete example or two and then substitute variables to reach the general claim you're trying to prove. Inductive arguments supply a sequence of direct justifications.
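For a concrete instance of the pattern, consider proving that 1 + 2 + ... + n = n(n + 1)/2. The base case n = 1 holds directly (1 = 1 * 2 / 2). Assuming the formula holds for some n, adding the next term gives n(n + 1)/2 + (n + 1) = (n + 1)(n + 2)/2, which is exactly the formula with n + 1 in place of n, so the statement holds for all positive integers.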
There also exists a more specialized version of the inductive proof called a loop invariant. This essentially says that there is some property L that, if true before the start of the loop, remains true at the start of each subsequent iteration, and continues to hold through to the termination of the loop altogether. The loop invariant (condition) helps to ensure that an algorithm maintains a desired property throughout its execution and ultimately achieves the intended result when it terminates.
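As a small, hypothetical sketch of the idea, consider a running-sum loop whose comments state the invariant:

```rust
// Invariant: at the start of each iteration, `total` holds the sum of
// data[0..i]. It is true before the loop (i = 0, total = 0), each pass
// preserves it, and at termination (i = data.len()) it yields the result.
fn sum(data: &[i32]) -> i32 {
    let mut total = 0;
    let mut i = 0;
    while i < data.len() {
        total += data[i]; // extend the partial sum by one element
        i += 1;           // the invariant now holds for the new i
    }
    total // i == data.len(), so total is the sum of the whole slice
}
```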
Ordering
Ordering schemes are an important part of the logic of data structures and algorithms. For example, how do you compare String types? Should strings be ordered by length or lexicographical order?
Ordering schemes can be classified by the properties they exhibit. The main properties (and their variants) are as follows:
- Transitivity: If k1 <= k2 && k2 <= k3, then k1 <= k3; in a "strict ordering" this applies to the < operator such that if k1 < k2 && k2 < k3, then k1 < k3; indifference extends comparison operations to "rough equivalency"
- Reflexivity: k <= k; in contrast, irreflexivity is when k !< k and applies mostly to "strict ordering"
- Antisymmetry: if k1 <= k2 and k2 <= k1, then k1 == k2; this ensures that no two distinct elements can relate in both directions unless they are the same
- Asymmetry: a type of antisymmetry such that if k1 < k2, then k2 !< k1, mostly used in "strict ordering" to enforce a one-way relationship
- Comparability: Every element must be directly comparable, which is the primary way to distinguish "partial ordering" from "total ordering"; k1 <= k2 || k2 <= k1
You can combine these properties (and their variants) into the following major ordering type classifications:
Type | Transitive | Reflexive | Antisymmetric | Comparability |
---|---|---|---|---|
Total Order | X | X | X | X |
Partial Order | X | X | X | |
Strict Order | X | | X (Asymmetric) | |
Weak Order | X | X | | |
Preorder | X | X | | |
Quasi-order | X | X | | |
It is also necessary to distinguish some classifications. Consider atomic and natural ordering.
Atomic (or classical alphanumeric) ordering represents an element-wise ordering based on individual comparisons within each unit. For example, the number (unit) 69 is atomically ordered because 6 < 9. The number 420 is also atomically ordered because each element follows a strict ordering where 4 > 2 > 0. The number 187 is not atomically ordered because 1 < 8 but 8 > 7, which breaks the ordering. You can extend ordering to sorting logic as well. Consider the following sequence of units (integers), separated by spaces, which illustrates an atomic sorting of atomically ordered units: 1 10 11 2 23 5. This may be counter-intuitive for humans, because how is 11 < 2? Atomic ordering does not consider the length of the unit, just the elements of each unit themselves. The sort compares the units element by element in their atomic order, so think of 11 as [1, 1] and 2 as [2, _]. Because 1 < 2, 11 comes before 2 in an atomic sort. Similarly 23 comes before 5 because 2 < 5.
By contrast, natural ordering compares the entire unit in a way that is more "natural" for humans to grok. The same sequence in natural (sorted) order appears as: 1 2 5 10 11 23. Natural ordering extends to alphanumeric sequences and may include traditional numerical systems like hexadecimal as well as arbitrary alphanumeric combinations like "z3", where 1222 < z3 in ASCII. Of course, these types of ordering schemes depend on the underlying encoding schemes for the alpha characters.
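A quick way to see the two orderings side by side is to sort the same values once as strings (atomic, element by element) and once as integers (natural); this is only an illustrative sketch.

```rust
fn main() {
    // The same six values, first as strings, then as integers
    let mut as_strings = vec!["23", "5", "1", "11", "2", "10"];
    let mut as_numbers = vec![23, 5, 1, 11, 2, 10];

    // Atomic ordering: strings are compared element by element
    as_strings.sort();
    println!("{:?}", as_strings); // ["1", "10", "11", "2", "23", "5"]

    // Natural ordering: the integers are compared as whole values
    as_numbers.sort();
    println!("{:?}", as_numbers); // [1, 2, 5, 10, 11, 23]
}
```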
You can apply this logic to two additional common ordering schemes: lexicographical and dictionary ordering. Both of these ordering schemes compare words by atomic order, where each letter represents an individual unit/value. The biggest difference is how each accounts for capital letters. In ASCII the capital letter P is 80 and the lower-case p is 112. Because P != p, dictionary ordering can only be considered a partial order, whereas lexicographical ordering represents a case-sensitive total order extension of the alphabetic ordering in Unicode. Using this logic you can compare Strings. For example, the String "Peter" translates to 80 101 116 101 114, "Paul" translates to 80 97 117 108, and "Abby" translates to 65 98 98 121 (with capital letters) in ASCII. Individually none of the strings are atomically ordered: in Peter and Paul 8 > 0 but 0 < 1, and in Abby 6 > 5 but 5 < 9. However, it's easy to see that a lexical (alphabetical, in this case) and dictionary ordering places the three names (units) in the following sequence: Abby Paul Peter
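To tie the byte values back to an actual comparison, the following sketch sorts the three names; string comparison in Rust is byte-wise, which matches the case-sensitive lexicographical ordering described above.

```rust
fn main() {
    let mut names = vec!["Peter", "Paul", "Abby"];
    // &str comparison works byte by byte over the encoded values shown
    // above, so sorting yields the case-sensitive lexicographical order.
    names.sort();
    println!("{:?}", names); // ["Abby", "Paul", "Peter"]
}
```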