Sunday 30 March 2014

Week 11: Sorting and Efficiency

We focused on sorting algorithms in the past two weeks. When judging sorting algorithms, we usually look at their efficiency and memory requirements. Efficiency affects the overall performance of a program. To evaluate efficiency, we compare the processing time different algorithms need to produce the desired output, and use Big-O notation to denote the worst case, an upper bound on the number of steps a sorting algorithm requires. The number of steps required to sort an input varies depending on the size of the input. The exact number does not matter; what matters is how that number grows as the size of the input increases.

We learned bubble, insertion, and selection sort in csc108. These three sorting algorithms are relatively time-consuming. Bubble sort compares adjacent items and keeps "bubbling" the larger item to the right (moving the largest to its final place), then sorts the remaining n-1 items. Insertion sort takes the first unsorted item and inserts it into its proper position in the sorted part of the list. Selection sort selects the smallest item from the unsorted side and swaps it to the end of the sorted side. All three have a performance of O(n^2) in the worst case. Minimal sketches of the three follow below.
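Here are small Python sketches of the three algorithms, written from the descriptions above (the function names are my own; each sorts a list in place):

    def bubble_sort(lst):
        """Repeatedly bubble the largest remaining item to the right."""
        for end in range(len(lst) - 1, 0, -1):
            for i in range(end):
                if lst[i] > lst[i + 1]:
                    lst[i], lst[i + 1] = lst[i + 1], lst[i]

    def insertion_sort(lst):
        """Insert each unsorted item into its place in the sorted prefix."""
        for i in range(1, len(lst)):
            item = lst[i]
            j = i
            while j > 0 and lst[j - 1] > item:
                lst[j] = lst[j - 1]
                j -= 1
            lst[j] = item

    def selection_sort(lst):
        """Swap the smallest unsorted item to the end of the sorted prefix."""
        for i in range(len(lst)):
            smallest = min(range(i, len(lst)), key=lambda k: lst[k])
            lst[i], lst[smallest] = lst[smallest], lst[i]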

One thing to notice is that Big-O notation abstracts away the differences between algorithms with similar worst-case performance. Constant coefficients are dropped, but they actually vary across algorithms, so two algorithms with the same Big-O efficiency may not have the same running time on the same input. For example, insertion sort and selection sort have the same Big-O in their worst cases, but insertion sort often runs faster than selection sort, since it can stop shifting as soon as an item reaches its place while selection sort always scans the whole unsorted side. Big-O is useful for the big picture.
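One way to see this is to time the two sketches above on the same random input. This is only a rough illustration (exact numbers depend on the machine), and it assumes the insertion_sort and selection_sort functions defined earlier:

    import random
    import timeit

    data = random.sample(range(10000), 2000)

    for sort in (insertion_sort, selection_sort):
        # Time each sort on a fresh copy so both see the same input.
        t = timeit.timeit(lambda: sort(list(data)), number=5)
        print(sort.__name__, round(t, 3), "seconds")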

In the lectures, quick sort and merge sort were introduced. In quick sort, we pick a pivot, partition the remaining items into a "smaller than pivot" sublist and a "larger than pivot" sublist, quick sort each sublist, and then combine them into the sorted list. The processing time for quick sort depends on how the pivots are chosen. It performs well, in O(n*log(n)), when the pivot is chosen randomly, which helps avoid the possible O(n^2) worst-case performance that shows up, for instance, when sorting an already sorted list with a fixed pivot choice. In merge sort, we divide the list in half, merge sort the two halves, and then merge them in linear time. Merge sort runs in O(n*log(n)) in both the best and the worst case.
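Minimal sketches of both, following the descriptions above (the names are my own; random.choice picks the pivot, and each function returns a new sorted list rather than sorting in place):

    import random

    def quick_sort(lst):
        """Return a new sorted list, partitioning around a random pivot."""
        if len(lst) < 2:
            return lst[:]
        pivot = random.choice(lst)
        smaller = [x for x in lst if x < pivot]
        equal = [x for x in lst if x == pivot]
        larger = [x for x in lst if x > pivot]
        return quick_sort(smaller) + equal + quick_sort(larger)

    def merge(left, right):
        """Merge two sorted lists into one sorted list in linear time."""
        result = []
        i = j = 0
        while i < len(left) and j < len(right):
            if left[i] <= right[j]:
                result.append(left[i])
                i += 1
            else:
                result.append(right[j])
                j += 1
        return result + left[i:] + right[j:]

    def merge_sort(lst):
        """Return a new sorted list by sorting each half and merging."""
        if len(lst) < 2:
            return lst[:]
        mid = len(lst) // 2
        return merge(merge_sort(lst[:mid]), merge_sort(lst[mid:]))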

When evaluating sorting algorithms, we usually need to consider their best, worst, and average performance. For instance, insertion sort has a running time of O(n) when given an already sorted list and O(n^2) when sorting a reverse-sorted list; its average performance is O(n^2). However, while quick sort also has a performance of O(n^2) in the worst case, its average running time is O(n*log(n)), so overall quick sort is a more efficient sorting algorithm than insertion sort.
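The gap between insertion sort's best and worst case is easy to observe by timing it on a sorted and a reverse-sorted input. Again a rough sketch, reusing the insertion_sort function from above:

    import timeit

    n = 2000
    sorted_input = list(range(n))           # best case: O(n)
    reversed_input = list(range(n, 0, -1))  # worst case: O(n^2)

    for name, data in [("sorted", sorted_input), ("reversed", reversed_input)]:
        t = timeit.timeit(lambda: insertion_sort(list(data)), number=5)
        print(name, round(t, 3), "seconds")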

Efficiency of a sorting algorithm also depends on the data structure the algorithm runs on. The time for merge sort to run on a linked list can be much less than the time merge sort needs on a list object, because linked-list nodes can be spliced together during the merge instead of items being copied into new lists.
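To illustrate the splice-based merge, here is a minimal sketch assuming a simple Node class of my own (this is an illustration, not the course's linked list implementation):

    class Node:
        """A minimal singly linked list node."""
        def __init__(self, value, next=None):
            self.value = value
            self.next = next

    def merge_nodes(a, b):
        """Merge two sorted node chains by relinking nodes, copying nothing."""
        dummy = tail = Node(None)
        while a is not None and b is not None:
            if a.value <= b.value:
                tail.next, a = a, a.next
            else:
                tail.next, b = b, b.next
            tail = tail.next
        tail.next = a if a is not None else b
        return dummy.next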
