7.6 Sorting
As datasets get larger, it’s important to be able to manage and sort through data to find results and values more quickly and efficiently. When data is disorganized, it can be hard to find values easily. To find values, it’s much easier to do so if they are in the correct order. That way, searches can be organized at specific points, which in turn can make sure that programs will look through less values as they search.
Take this dataset for example:
1 | 2 | 7 | 44 | 5 | 76 | 9 | 0 | 72 | 4 | 325 | 2 | 753 | 345 | 234 | 53 |
---|
Finding the value 4 in this list requires you to search from the beginning all the way to the end - since the data isn’t in a particular order, there is no way of knowing for certain where the element 4 will be found.
If this list were organized by smallest number to greatest:
0 | 1 | 2 | 2 | 4 | 5 | 7 | 9 | 44 | 53 | 72 | 76 | 234 | 325 | 345 | 753 |
---|
Then finding 4 would be a much easier process, as you can search certain sections of the list for the element in question where it is most likely to be found.
In programming, there are standard algorithms that can be used to sort data in ways that make finding data more manageable. Two standards algorithms that will be explored in this chapter are Selection and Insertion Sort.
Selection sort sorts arrays by repeatedly finding the minimum value in an array, and moving it to the front. Consider the following, unorganized array:
5 | 3 | 4 | 1 | 6 | 2 |
---|
Using selection sort to sort this array, the first index in the array is picked to start.
5 | 3 | 4 | 1 | 6 | 2 |
---|
Then, the list is traversed to find the minimum value in the rest of the list, which in this case is 1. The value 1 is then swapped from its current position to the position where 5 was:
1 | 3 | 4 | 5 | 6 | 2 |
---|
Now the value 1 at index 0 is considered sorted, and the next unsorted element at index 1 will be chosen. The value at index 1 will be compared with the rest of the unsorted items until a minimum value is found. In this case, the value 2 at index 5 is the minimum value, and it is swapped with the value at index 1.
1 | 2 | 4 | 5 | 6 | 3 |
---|
This process continues until the entire list is sorted from least - greatest.
Here is what this process looks like in pseudocode:
This can then be used to create a working algorithm for selection sort:
The initial traversal goes from the first element to the second to last element (array.length - 1) because the final element in the array is considered sorted if every other element in the array has been sorted correctly. The second traversal looks for the minimum value in the array from the current element until the end of the list. It does not search the list for already sorted elements. Once the minimum element is found, it swaps positions with the current element.
Selection sort is a particularly useful sorting algorithm when the elements in a list are in reverse order:
6 | 5 | 4 | 3 | 2 | 1 |
---|
In this case, the algorithm only needs to swap values a few times, as each swap successfully puts each element in its correct position.
Insertion sort sorts an array by sorting each element compared to the elements already sorted to their left.
Given the same list as the previous example, insertion sort marks the first element in the list as sorted:
5 | 3 | 4 | 1 | 6 | 2 |
---|
The next element, 3, is then selected as the next element to sort:
5 | 3 | 4 | 1 | 6 | 2 |
---|
Rather than find the minimum value in the unsorted segment like selection sort, insertion sort compares unsorted elements to the sorted ones, placing the element in the correct position within the sorted list:
3 | 5 | 4 | 1 | 6 | 2 |
---|
The element 3 is now considered sorted, and the sort continues on to the next element, 4, and sorts it within the already sorted list:
3 | 4 | 5 | 1 | 6 | 2 |
---|
This continues until all elements in the list are sorted from least to greatest.
Here is a pseudocode implementation of insertion sort:
This can be used to create a working algorithm for insertion sort:
Insertion sort traverses all elements from index 1 until the end of the array. Each element then traverses the sorted segment of the array, looking for the correct place in the sorted list to go. This is done using a while loop, as the number of iterations depends largely on the content of the sorted list. As the element moves through the sorted list, each element it searches through is shifted one index higher. This ensures that the element can be placed in the correct index without removing any of the pre-sorted elements from the list.
Insertion sort is most useful when a list is mostly sorted:
1 | 2 | 3 | 4 | 6 | 5 |
---|
When lists are almost sorted, the number of times that insertion sort needs to iterate through the list items is limited. In this case, the while loop will not execute for the first 5 index values in this array, limiting the execution count, and reducing the amount of time the algorithm spends sorting items.
When choosing which sorting algorithm to use, it’s important to pay attention to the composition of the existing list, and determine which method will be most efficient, especially as the datasets grow.
-
Incorrect
Correct
No Answer was selected
Invalid Answer
When is it more appropriate to use Insertion Sort than Selection Sort?
Starting the insertion sort algorithm from the example, switch the order for the sort from ascending order to sort in descending order. (for example, 10, 8, 6, 4, 2).
Hint: Where are items compared? Try writing out the steps in the algorithm on paper to help.