Technology Distillate: Insertion Sort

Insertion sort algorithm is interesting because it feels more natural. If you had a randomly sorted deck of cards and picked up one card, it is already sorted. If you picked up another card you would "insert" it into a sorted order which will either be before or after the card you already have. You now have two cards and pick up the third card. At this point you will "traverse" the cards you have and inserted it at the appropriate point, repeating the process until you have the desired number of cards in your hands in some sorted order. This process is an iterative search and should not be confused with a priori, a trait human beings are likely to apply or in short a human being will likely "just know where to place the card" upon looking at what is already in hand.

Insertion sort starts by indexing the first element which it deems is in correct order since the current index is the same as the first index. The index is moved up one and the element at the new index is compared or iterated backwards until a suitable position is found to insert the current element into. A good way to solve this is to store the current element into a temporary buffer. Then as you traverse backwards you swap each previous index with current index until you find where to place the stored element and copy that into the new location. The diagram below illustrates the process which I feel describes it more clearly than words:

To build this into code we need two methods. First method is the InsertionSort method that traverses the array of n elements. At each incremental traversal a second method, Insert, is called which inserts the current element being traversed into the appropriate slot.

InsertionSort code:

public int InsertionSort(ref int[] array)
{
int swapcount = 0;
for (int i = 1; i < array.Length; i++)
{
swapcount += Insert(ref array, i);
}
return swapcount;
}

In the above code inside the for loop Insert method is called which is passed the index of the current element being looked at and the array.

Insert code:

private int Insert(ref int[] array, int i)
{
int swapcount = 0;
for(int j = i - 1; j >= 0; j--)
{
    if (array[j] >= array[i])
    {
      Swap(ref array, i, j);
      i = j;
      swapcount++;
    }
   }
return swapcount;
}

In the above Insert code the array is traversed backwards from an index previous to current index 'i' and each item is compared with the 'current' index 'i'. If current indexed element is smaller than the index previous to it a swap is made and the current index 'i' becomes the previous index 'j'. The loop stops when no more elements can be traversed backwards. One thing I did not include is storing the current index into a temp variable because the solution is so trivial we do not need a temp storage and it was only mentioned above to simplify the explanation. In the code above, optionally, a swap count is kept and returned. Also it is important to notice that unlike bubble sort where you could detect that no swap has taken place and rest assured array is sorted, you cannot do the same here. If the current element is larger than the element before it, we cannot stop the entire insertion sort since we cannot tell if elements after the current element are sorted.

This brings us to the running time which is also quadratic at O(n^2) if none of the data is sorted. However there is one exception and that is Insertion sort can achieve O(n) time if all the data is already sorted. This is because the InsertionSort method will traverse n data and Insert will return with O(1) time because the loop will terminate if the first comparison reveals that no swap should take place in the already sorted portion of the array. We can detect this with a simple modification to our Insert algorithm:

private int Insert(ref int[] array, int i)
{
int swapcount = 0;
for(int j = i - 1; j >= 0; j--)
{
    if (array[j] >= array[i])
    {
      Swap(ref array, i, j);
      i = j;
      swapcount++;
    }
    else break;
}
return swapcount;
}

The extra code is in green which detects, much like bubble sort, if no swap has taken place and breaks the for-loop. It does not abort the entire insertion sort algorithm which happens in bubble sort but only the current passes insert algorithm. Like bubble sort and selection sort, insertion sort is also an in-place algorithm because it requires little or no additional memory to perform its operation.

The best use for insertion sort algorithm is on data that is already mostly sorted in order to reduce the quadratic time. For data that is completely unsorted a better option, from the algorithms covered so far, would be selection sort because although it has a cost of O(n^2) it will have considerably less cost in swapping data.

Next: QuickSort.
Previous: Selection Sort.
Landing page: Part One.

Technology Distillate

Pages

Saturday 26 July 2014

Insertion Sort

No comments :

Post a Comment