Click here to Skip to main content
15,867,594 members
Articles / General Programming / Sorting

QuickSortByRange, QuickSelectByRange, PartitionByRange

Rate me:
Please Sign up or sign in to vote.
4.82/5 (13 votes)
26 Dec 2013CPOL4 min read 30.3K   235   20   17
A performance improvement for the classic sort / select algorithms.

Introduction

The classic QuickSort and QuickSelect algorithms have poor worst-case performance. If bad pivots are consistently chosen, such as decreasing by only a single element each time, then worst-case performance is quadratic: O(n2). Duplicate values in the array are problematic for QuickSort.

This article introduces the QuickSortByRange, QuickSelectByRange, and PartitionByRange algorithms which provide a remedy to the problem with duplicates by returning a pivot range from the PartitionByRange algorithm. This approach works well when the array contains several duplicate values.

For example, if every element of the array contains the same value, then QuickSort delivers O(n2) whereas QuickSortByRange delivers O(n). Considering that the average case performance of QuickSort is O(n log n), the all-duplicates scenario becomes a better than average case scenario for QuickSortByRange as compared to the worse than average case scenario we see with QuickSort.

Background

QuickSort and QuickSelect both rely on a partition algorithm. The partition algorithm looks at an initial (pivot) value within a range of the array and then reorganizes the values within that range so that all the values less than or equal to the pivot value are placed to the left of the pivot and the values greater than the pivot value are moved to the right of the pivot. The algorithm will move the pivot element, if necessary, to satisfy this quasi-sort and then it returns the final position of the pivot element.

For example, let's consider the following range within a larger array:

[ < , > , == , x , < , > , < , == ]

In this example, 'x' marks the pivot value, '<' indicates values less than the pivot value, '>' indicates values greater than the pivot value, and '==' indicates values equal to the pivot value.

After a single pass through the quickSort partition algorithm, these elements will have been reorganized to:

[ < , == , < , < , == , x , > , > ]

PartitionByRange takes the quasi-sort a step further. It further reorganizes the array so that elements equal to the pivot element are placed to the immediate left of the pivot element. Values less than the pivot value are placed to the left of this range of equal-value elements.

After a single pass through the PartitionByRange algorithm, these elements will have been reorganized to:

[ < , < , < , == , == , x , > , > ]

Instead of returning a single value indicating the final position of the pivot value, the PartitionByRange algorithm returns two values indicating the range of positions which contain the same value as the pivot value. This enables the QuickSortByRange and QuickSelectByRange algorithms to move the boundaries of the next pass through the PartitionByRange further to the right or left than QuickSort and QuickSelect, respectively.

Using the Code

The code is written in C# for arrays of integers. It's fairly straightforward to port the code to another language or value type.

The attached file is an HttpHandler which is designed to run on the localhost of a machine with IIS. It includes benchmarking code to replicate the results posted at the end of this article.

One important part of the PartitionByRange code is the Stack used to store the locations of values equal to the pivot value. When porting the code to another language, ensure that a LIFO structure is used to store these indexes. An array may be used in place of a stack as long as LIFO order is used.

PartitionByRange

C#
public Tuple<int, int> partitionByRange(int[] a,int left,int right,int pivot) {

  int pivotValue = a[pivot];
  Stack<int> equalValuePositions = null;
  swap(a,pivot,right);
  for(int i=left;i<right;i++)
    if(a[i] <= pivotValue) {
      swap(a,left++,i);
      if(a[i] == pivotValue) {
        if(equalValuePositions == null) equalValuePositions = new Stack<int>();
        equalValuePositions.Push(i);
      }
    }
  swap(a,left,right);
  var rightPivotIndex = left;
  if(equalValuePositions != null) foreach(int i in equalValuePositions) swap(a,--left,i);
  return new Tuple<int, int>(left,rightPivotIndex);

}

public void swap(int[] a,int i,int j){ if(i!=j){ int k=a[i]; a[i]=a[j]; a[j]=k; } }

QuickSortByRange

C#
public void quickSortByRange(int[] a) {

  qSortByRange(a,0,a.Length - 1,new Random());

}

public void qSortByRange(int[] a,int left,int right,Random r) {

  if(left < right) {
    int pivot = left + r.Next(right - left + 1);
    Tuple<int, int> pivotRange = partitionByRange(a,left,right,pivot);
    qSortByRange(a,left,pivotRange.Item1 - 1,r);
    qSortByRange(a,pivotRange.Item2 + 1,right,r);
  }

}

QuickSelectByRange

C#
public int quickSelectByRange(int[] a,int n) {

  return qSelectByRange(a,0,a.Length - 1,n,new Random());

}

public int qSelectByRange(int[] a,int left,int right,int n,Random r) {

  if(left == right) return a[left];

  int pivot = n;

  while(true) {

    Tuple<int, int> pivotRange = partitionByRange(a,left,right,pivot);
    if(n >= pivotRange.Item1 && n <= pivotRange.Item2) return a[n];
    if(n < pivotRange.Item1)
      right = pivotRange.Item1 - 1;
    else
      left = pivotRange.Item2 + 1;
    pivot = left + r.Next(right - left + 1);

  }

}

Overhead

The PartitionByRange algorithm requires a Stack (or similar LIFO structure), extra comparisons, and it returns a tuple of two integers instead of a single integer.

These overhead requirements result in a performance hit which is not overcome until the array contains at least several duplicates.

Performance

  • In arrays with no duplicate entries, the QuickSort algorithm is about 8% faster than the QuickSortByRange algorithm.
  • In an array with just a few duplicate entries, the QuickSort algorithm is about 9% faster than the QuickSortByRange algorithm.
  • In an array with several duplicates, the QuickSortByRange algorithm is about 14% faster than the QuickSort algorithm.
  • In an array with quite a few duplicates, the QuickSortByRange algorithm is about 76% faster than the QuickSort algorithm.
  • In an array with lots of duplicates, the QuickSortByRange algorithm is about 98% faster than the QuickSort algorithm.
  • In an array with all duplicates, the QuickSortByRange algorithm is about 99.9% faster than the QuickSort algorithm.

History

In the original version of this article, I incorrectly stated that QuickSortByRange delivers O(1) performance on an array where every element contains the same value.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


Written By
Engineer Comprehend Systems
United States United States
I've been fiddling around with computers since my parents bought me a Commodore VIC-20 for Christmas in 1981.

I received a B.Sc. in Mechanical Engineering from the University of Waterloo, but have spent my career in software development.

I focused on FoxPro during my early career before switching over to .NET and JavaScript.

I created Jooshe and wrote a Code Project article on it.

I wrote the front-end (Jooshe & JavaScript) and middleware (.NET) of dbiScript.


Comments and Discussions

 
GeneralMy vote of 2 Pin
Qwertie24-Dec-13 6:34
Qwertie24-Dec-13 6:34 
QuestionUm.... Pin
Qwertie23-Dec-13 13:15
Qwertie23-Dec-13 13:15 
AnswerMessage Closed Pin
23-Dec-13 14:19
Brien Givens23-Dec-13 14:19 
GeneralRe: Um.... Pin
Qwertie24-Dec-13 5:04
Qwertie24-Dec-13 5:04 
GeneralMessage Closed Pin
24-Dec-13 5:35
Brien Givens24-Dec-13 5:35 
GeneralRe: Um.... Pin
Qwertie24-Dec-13 6:32
Qwertie24-Dec-13 6:32 
GeneralRe: Um.... Pin
Brien Givens26-Dec-13 7:15
Brien Givens26-Dec-13 7:15 
GeneralRe: Um.... Pin
KP Lee4-Jan-14 20:44
KP Lee4-Jan-14 20:44 
GeneralRe: Um.... Pin
KP Lee4-Jan-14 20:50
KP Lee4-Jan-14 20:50 
GeneralRe: Um.... Pin
KP Lee4-Jan-14 21:09
KP Lee4-Jan-14 21:09 
GeneralRe: Um.... Pin
KP Lee5-Jan-14 0:21
KP Lee5-Jan-14 0:21 
SuggestionSuggestion Pin
lafffer_11923-Dec-13 10:31
lafffer_11923-Dec-13 10:31 
GeneralMy vote of 5 Pin
John Brett17-Dec-13 23:43
John Brett17-Dec-13 23:43 
Nice to see performance metrics for a performance-related article
GeneralRe: My vote of 5 Pin
Brien Givens17-Dec-13 23:48
Brien Givens17-Dec-13 23:48 
QuestionVery nice article about QuickSortByRange, QuickSelectByRange, and PartitionByRange algorithms Pin
Volynsky Alex17-Dec-13 10:24
professionalVolynsky Alex17-Dec-13 10:24 
AnswerRe: Very nice article about QuickSortByRange, QuickSelectByRange, and PartitionByRange algorithms Pin
Brien Givens17-Dec-13 10:29
Brien Givens17-Dec-13 10:29 
GeneralRe: Very nice article about QuickSortByRange, QuickSelectByRange, and PartitionByRange algorithms Pin
Volynsky Alex17-Dec-13 10:37
professionalVolynsky Alex17-Dec-13 10:37 

General General    News News    Suggestion Suggestion    Question Question    Bug Bug    Answer Answer    Joke Joke    Praise Praise    Rant Rant    Admin Admin   

Use Ctrl+Left/Right to switch messages, Ctrl+Up/Down to switch threads, Ctrl+Shift+Left/Right to switch pages.