We use asymptotic analysis to describe the limiting behavior of algorithms. When an implementation of an algorithm on input of size n runs in at most c*g(n) steps (for some positive real constant c and all sufficiently large n), we say the algorithm is O(g(n)). Big-O is an upper bound: it limits how fast the running time can grow. For example, an O(n) algorithm grows linearly, and its running time never exceeds cn. Also, as the size n gets very large, the gap between different complexity classes grows without bound, so the efficiency of an algorithm really matters. Here are some common classes, from best to worst (a quick numeric comparison follows the list):
constant: O(1)
logarithmic: O(log n)
linear: O(n)
quadratic: O(n^2)
cubic: O(n^3)
exponential: O(2^n)
horrible: O(n^n) or O(n!)
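To get a feel for how fast these classes diverge, here is a small sketch (my own, not from lecture) that prints rough step counts for n = 30:

import math

n = 30
for name, steps in [('O(1)', 1),
                    ('O(log n)', math.log2(n)),
                    ('O(n)', n),
                    ('O(n log n)', n * math.log2(n)),
                    ('O(n^2)', n ** 2),
                    ('O(2^n)', 2 ** n),
                    ('O(n!)', math.factorial(n))]:
    # print the class name and its (approximate) step count
    print('{:<12}{:,.0f}'.format(name, steps))

Even at n = 30, the exponential and factorial classes are already astronomically larger than the polynomial ones.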
We will examine some common sorting algorithms and compare their time complexities. First, bubble sort is very inefficient, because items are exchanged many times before reaching their final positions. For a list of size n, the first pass makes n - 1 comparisons and the second pass makes n - 2. There are n - 1 passes in total, so the number of comparisons is (n-1) + (n-2) + ... + 1 = n(n - 1)/2, and therefore bubble sort runs in time O(n^2). Second, insertion sort maintains a sorted list from the first item up to the current position, and inserts the next item into that sorted prefix by comparing it with the previous elements. Its time complexity is O(n^2) as well. Third, the algorithm for selection sort is: 1) find the minimum element in the list; 2) exchange it, only once, into its proper position; 3) find the minimum element in the remaining unsorted list; 4) and so on (see the sketch below). Its time complexity is still O(n^2), but it is slightly more efficient because it makes at most n - 1 exchanges. These are the sorting methods we learned in CSC108, and they all require O(n^2) running time.
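Here is a short sketch of the selection sort steps above (my own illustration, not the CSC108 version):

def selection_sort(L):
    """Sort list L in place in non-decreasing order."""
    for i in range(len(L) - 1):
        # find the index of the minimum element in the unsorted part
        m = i
        for j in range(i + 1, len(L)):
            if L[j] < L[m]:
                m = j
        # a single exchange per pass puts that minimum in its final position
        L[i], L[m] = L[m], L[i]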
We then learned some new sorting methods, merge sort and quick sort, which are more efficient.
from random import shuffle  # needed by merge_sort's doctest


def merge(L1, L2):
    """Return the merge of sorted lists L1 and L2.

    >>> merge([1, 3, 5], [2, 4, 6])
    [1, 2, 3, 4, 5, 6]
    >>> merge([1, 2, 3], [0, 4, 5])
    [0, 1, 2, 3, 4, 5]
    >>> merge([0], [1, 2, 3, 4])
    [0, 1, 2, 3, 4]
    """
    L = []
    i1, i2 = 0, 0
    # every slice is a (linear) copy!
    # while L1[i1:] and L2[i2:]:
    while i1 < len(L1) and i2 < len(L2):
        if L1[i1] < L2[i2]:
            L.append(L1[i1])
            i1 += 1
        else:
            L.append(L2[i2])
            i2 += 1
    return L + L1[i1:] + L2[i2:]


def merge_sort(L):
    """Produce a copy of L in non-decreasing order.

    >>> merge_sort([1, 5, 3, 4, 2])
    [1, 2, 3, 4, 5]
    >>> L = list(range(20))
    >>> shuffle(L)
    >>> merge_sort(L) == list(range(20))
    True
    """
    if len(L) < 2:
        return L[:]
    else:
        return merge(merge_sort(L[:len(L) // 2]),
                     merge_sort(L[len(L) // 2:]))

As we can see in the code, merge_sort is a wrapper that calls its helper function merge. What merge does is return a single sorted list built from two sorted lists. The process in merge_sort is: 1) divide the list into halves until each sublist has fewer than 2 elements (recursively: merge_sort); 2) merge the halves back together (recursively: merge); 3) return the merged list. Notice that halving the list gives O(log n) levels of recursion, and merging all the sublists on each level costs O(n) steps in total; the recurrence T(n) = 2T(n/2) + cn solves to O(n log n). Therefore, merge sort runs in time O(n log n).
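Since both functions carry doctests, a quick way to check them is to run the module's doctests (a minimal sketch, assuming the code above lives in one file):

if __name__ == '__main__':
    import doctest
    doctest.testmod()

Next, we will see how quick sort runs.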
def quick(L: list) -> list:
    """Produce a sorted version of L.

    >>> quick([2, 1, 3])
    [1, 2, 3]
    """
    if len(L) < 2:
        return L[:]
    else:
        # i = randint(0, len(L) - 1)
        i = 0  # pivot: the first element
        return (quick([x for x in L if x < L[i]]) +
                [L[i]] +
                quick([x for x in (L[0:i] + L[i+1:]) if x >= L[i]]))
Quick sort is another efficient sorting method. It works as follows: 1) choose a pivot element; 2) split the list into two parts, one with all the elements less than the pivot (left), the other with all the elements greater than or equal to the pivot (right); 3) quick sort the left and right parts (recursively: quick). In the end, all the elements in the list are sorted. For its time complexity: partitioning the list around the pivot costs O(n) at each level, and if the pivot lands near the middle of the list, each split halves the list, so there are O(log n) levels. Therefore, its average time complexity is O(n log n). Nevertheless, in the worst case, when the pivot is always the largest or the smallest element, the list divides into a part with 0 elements and a part with n - 1 elements; then the running time is (n-1) + (n-2) + ... + 1 = n(n-1)/2, so the worst-case time complexity is O(n^2).
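With the version of quick above, which always picks the first element as the pivot, an already-sorted list triggers that worst case on every call. A rough timing sketch (my own experiment, with n kept small to stay under Python's recursion limit):

import sys
from random import shuffle
from time import perf_counter

sys.setrecursionlimit(10000)  # sorted input recurses about n levels deep

n = 900
sorted_input = list(range(n))
shuffled_input = list(range(n))
shuffle(shuffled_input)

for name, data in [('sorted (worst case)', sorted_input),
                   ('shuffled (average case)', shuffled_input)]:
    start = perf_counter()
    quick(data)
    print(name, perf_counter() - start)

Uncommenting the randint line in quick (and importing randint) makes the worst case much less likely to occur on any particular input.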
Furthermore, there is an even more efficient sorting method: Python's built-in sort (Timsort). Its worst-case and average-case time complexity is O(n log n), and its best case, on nearly-sorted input, is O(n). It combines merge sort and insertion sort, and its behavior adapts to the list's size, existing order, run lengths, and so on. Timsort is amazing, and still somewhat abstract to me. Above all, sorting is basic, important, and frequently used in computer science, and we should really focus on its efficiency, because "Time lost is never found again" - Benjamin Franklin.
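For completeness, Python exposes Timsort through the built-in sorted function (which returns a new list) and the list.sort method (which sorts in place):

L = [3, 1, 4, 1, 5, 9, 2, 6]
print(sorted(L))  # new list: [1, 1, 2, 3, 4, 5, 6, 9]; L is unchanged
L.sort()          # sorts L in place, returns None
print(L)          # [1, 1, 2, 3, 4, 5, 6, 9]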
Thanks for reading!!
Revision:
Thanks to Weidong An for correcting a mistake in the second-last paragraph: Timsort's time complexity in the worst case and average case is O(n log n), and only in the best case is it O(n).