When does the worst case of Quicksort occur?

The answer depends on the strategy for choosing the pivot. In early versions of quicksort, where the leftmost (or rightmost) element is chosen as the pivot, the worst case occurs in the following situations:

1) The array is already sorted in the same order as the desired output.
2) The array is already sorted in reverse order.
3) All elements are the same (a special case of cases 1 and 2).

Since these inputs are very common in practice, the problem is usually mitigated by choosing a random index for the pivot, choosing the middle index of the partition, or (especially for longer partitions) choosing the median of the first, middle, and last elements of the partition as the pivot (see the sketch below). With these modifications, the worst case of quicksort is far less likely to occur, but it can still occur if the input array is such that the maximum (or minimum) element is always chosen as the pivot.
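
As an illustration, here is a minimal Python sketch of median-of-three pivot selection (the function name and the assumption of a partition scheme that expects the pivot in the last position are mine, not from the article):

def median_of_three(a, lo, hi):
    """Swap the median of a[lo], a[mid], a[hi] into a[hi] so that a
    last-element partition scheme can be used unchanged."""
    mid = (lo + hi) // 2
    # Sort the three sampled positions in place: a[lo] <= a[mid] <= a[hi].
    if a[mid] < a[lo]:
        a[lo], a[mid] = a[mid], a[lo]
    if a[hi] < a[lo]:
        a[lo], a[hi] = a[hi], a[lo]
    if a[hi] < a[mid]:
        a[mid], a[hi] = a[hi], a[mid]
    # The median now sits at mid; move it into the pivot slot.
    a[mid], a[hi] = a[hi], a[mid]

On an already-sorted or reverse-sorted array, this selects the true middle element as the pivot, so the three cases above no longer produce the quadratic worst case.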


Analysis of quicksort

How is it that quicksort's worst-case and average-case running times differ? Let's start by looking at the worst-case running time. Suppose that we're really unlucky and the partition sizes are really unbalanced. In particular, suppose that the pivot chosen by the partition function is always either the smallest or the largest element in the $n$-element subarray. Then one of the partitions will contain no elements and the other partition will contain $n-1$ elements, all but the pivot. So the recursive calls will be on subarrays of sizes 0 and $n-1$.

As in merge sort, the time for a given recursive call on an $n$-element subarray is $\Theta(n)$. In merge sort, that was the time for merging, but in quicksort it's the time for partitioning.
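
For concreteness, here is a minimal Python sketch of a Lomuto-style partition (illustrative, not the course's own code); the single linear scan over the subarray is exactly where the $\Theta(n)$ per-call cost comes from:

def partition(a, lo, hi):
    """Partition a[lo..hi] around the pivot a[hi] (Lomuto scheme)
    and return the pivot's final index."""
    pivot = a[hi]
    i = lo                       # boundary of the "<= pivot" region
    for j in range(lo, hi):      # one comparison per element: Theta(n)
        if a[j] <= pivot:
            a[i], a[j] = a[j], a[i]
            i += 1
    a[i], a[hi] = a[hi], a[i]    # place the pivot between the regions
    return i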

Worst-case running time

When quicksort always has the most unbalanced partitions possible, then the original call takes $cn$ time for some constant $c$, the recursive call on $n-1$ elements takes $c(n-1)$ time, the recursive call on $n-2$ elements takes $c(n-2)$ time, and so on. Here's a tree of the subproblem sizes with their partitioning times:

[Figure: diagram of worst-case performance for quicksort]

When we total up the partitioning times for each level, we get

$$\begin{aligned} cn + c(n-1) + c(n-2) + \cdots + 2c &= c(n + (n-1) + (n-2) + \cdots + 2) \\ &= c((n+1)(n/2) - 1)\,. \end{aligned}$$

The last line is because $1 + 2 + 3 + \cdots + n$ is the arithmetic series, as we saw when we analyzed selection sort. (We subtract 1 because for quicksort, the summation starts at 2, not 1.) We have some low-order terms and constant coefficients, but when we use big-$\Theta$ notation, we ignore them. In big-$\Theta$ notation, quicksort's worst-case running time is $\Theta(n^2)$.
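
To see the quadratic behavior empirically, here is an illustrative Python sketch (not from the course) that counts the comparisons quicksort makes with a last-element pivot on already-sorted inputs; an explicit stack stands in for recursion so that deep worst-case inputs don't hit Python's recursion limit:

def quicksort_comparisons(a):
    """Return the number of element comparisons quicksort makes on a,
    using a Lomuto partition with the last element as the pivot."""
    a = list(a)
    comparisons = 0
    stack = [(0, len(a) - 1)]
    while stack:
        lo, hi = stack.pop()
        if lo >= hi:
            continue
        pivot, i = a[hi], lo
        for j in range(lo, hi):          # Lomuto partition scan
            comparisons += 1
            if a[j] <= pivot:
                a[i], a[j] = a[j], a[i]
                i += 1
        a[i], a[hi] = a[hi], a[i]
        stack.append((lo, i - 1))
        stack.append((i + 1, hi))
    return comparisons

for n in (100, 200, 400):
    print(n, quicksort_comparisons(range(n)))
# prints 4950, 19900, 79800: exactly n(n-1)/2, quadrupling as n doubles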

Best-case running time

Quicksort's best case occurs when the partitions are as evenly balanced as possible: their sizes either are equal or are within 1 of each other. The former case occurs if the subarray has an odd number of elements and the pivot is right in the middle after partitioning, so that each partition has $(n-1)/2$ elements. The latter case occurs if the subarray has an even number $n$ of elements and one partition has $n/2$ elements while the other has $n/2 - 1$. In either of these cases, each partition has at most $n/2$ elements, and the tree of subproblem sizes looks a lot like the tree of subproblem sizes for merge sort, with the partitioning times looking like the merging times:

[Figure: diagram of best-case performance for quicksort]

Using big-$\Theta$ notation, we get the same result as for merge sort: $\Theta(n \log_2 n)$.
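
Written as a recurrence (a sketch in the same style as the course's merge sort analysis), the best case satisfies

$$T(n) \le 2\,T(n/2) + cn,$$

which unrolls into about $\log_2 n$ levels, each doing at most $cn$ total partitioning work, for $\Theta(n \log_2 n)$ overall.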

Average-case running time

Showing that the average-case running time is also $\Theta(n \log_2 n)$ takes some pretty involved mathematics, and so we won't go there. But we can gain some intuition by looking at a couple of other cases to understand why it might be $O(n \log_2 n)$. (Once we have $O(n \log_2 n)$, the $\Theta(n \log_2 n)$ bound follows because the average-case running time cannot be better than the best-case running time.) First, let's imagine that we don't always get evenly balanced partitions, but that we always get at worst a 3-to-1 split. That is, imagine that each time we partition, one side gets $3n/4$ elements and the other side gets $n/4$. (To keep the math clean, let's not worry about the pivot.) Then the tree of subproblem sizes and partitioning times would look like this:

[Figure: diagram of average-case performance for quicksort]

The left child of each node represents a subproblem size 1/4 as large, and the right child represents a subproblem size 3/4 as large. Since the smaller subproblems are on the left, by following a path of left children, we get from the root down to a subproblem size of 1 faster than along any other path. As the figure shows, after $\log_4 n$ levels, we get down to a subproblem size of 1. Why $\log_4 n$ levels? It might be easiest to think in terms of starting with a subproblem size of 1 and multiplying it by 4 until we reach $n$. In other words, we're asking: for what value of $x$ is $4^x = n$? The answer is $\log_4 n$. How about going down a path of right children? The figure shows that it takes $\log_{4/3} n$ levels to get down to a subproblem of size 1. Why $\log_{4/3} n$ levels? Since each right child is 3/4 of the size of the node above it (its parent node), each parent is 4/3 times the size of its right child. Let's again think of starting with a subproblem of size 1 and multiplying the size by 4/3 until we reach $n$. For what value of $x$ is $(4/3)^x = n$? The answer is $\log_{4/3} n$.
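
To make those depths concrete, here is a worked instance (the particular value of $n$ is my own choice): for $n = 2^{20} = 1{,}048{,}576$,

$$\log_4 n = \frac{20}{2} = 10 \qquad \text{and} \qquad \log_{4/3} n = \frac{\ln n}{\ln (4/3)} \approx \frac{13.86}{0.288} \approx 48,$$

so the leftmost path bottoms out after 10 levels, while the rightmost path takes about 48.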

In each of the first $\log_4 n$ levels, there are $n$ nodes (again, including pivots that in reality are no longer being partitioned), and so the total partitioning time for each of these levels is $cn$. But what about the rest of the levels? Each has fewer than $n$ nodes, and so the partitioning time for every level is at most $cn$. Altogether, there are $\log_{4/3} n$ levels, and so the total partitioning time is $O(n \log_{4/3} n)$. Now, there's a mathematical fact that

$$\log_a n = \frac{\log_b n}{\log_b a}$$

for all positive numbers $a$, $b$, and $n$. Letting $a = 4/3$ and $b = 2$, we get that

$$\log_{4/3} n = \frac{\log_2 n}{\log_2 (4/3)}\,,$$

and so $\log_{4/3} n$ and $\log_2 n$ differ by only a factor of $\log_2 (4/3)$, which is a constant. Since constant factors don't matter when we use big-O notation, we can say that if all the splits are 3-to-1, then quicksort's running time is $O(n \log_2 n)$, albeit with a larger hidden constant factor than the best-case running time.
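
Numerically (continuing the $n = 2^{20}$ example from above): $\log_2 (4/3) \approx 0.415$, so

$$\log_{4/3} n = \frac{\log_2 n}{\log_2 (4/3)} \approx \frac{20}{0.415} \approx 48.2 \approx 2.41 \log_2 n,$$

which matches the depth computed earlier and makes the constant factor $1/\log_2 (4/3) \approx 2.41$ explicit.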

How often should we expect to see a split that's 3-to-1 or better? It depends on how we choose the pivot. Let's imagine that the pivot is equally likely to end up anywhere in an $n$-element subarray after partitioning. Then, to get a split that is 3-to-1 or better, the pivot would have to be somewhere in the "middle half" of the subarray, between the first and third quarter points. So, if the pivot is equally likely to end up anywhere in the subarray after partitioning, there's a 50% chance of getting at worst a 3-to-1 split. In other words, we expect a split of 3-to-1 or better about half the time.
The other case we'll look at to understand why quicksort's average-case running time is $O(n \log_2 n)$ is what would happen if, the half of the time that we don't get a 3-to-1 split, we got the worst-case split. Let's suppose that the 3-to-1 and worst-case splits alternate, and think of a node in the tree with $k$ elements in its subarray. Then we'd see a worst-case split of the $k$ elements, followed immediately by a 3-to-1 split of the resulting $k-1$ elements, instead of a 3-to-1 split of the $k$ elements directly. The pair of levels costs at most $2ck$ partitioning time, yet still reduces the subproblem size to at most $3(k-1)/4$. Therefore, even if we got the worst-case split half the time and a split that's 3-to-1 or better half the time, the running time would be about twice the running time of getting a 3-to-1 split every time. Again, that's just a constant factor, and it gets absorbed into the big-O notation, and so in this case, where we alternate between worst-case and 3-to-1 splits, the running time is $O(n \log_2 n)$.

Bear in mind that this analysis is not mathematically rigorous, but it gives you an intuitive idea of why the average-case running time might be $O(n \log_2 n)$.

Randomized quicksort

Suppose that your worst enemy has given you an array to sort with quicksort, knowing that you always choose the rightmost element in each subarray as the pivot, and has arranged the array so that you always get the worst-case split. How can you foil your enemy?
You could defeat your enemy by not always choosing the rightmost element in each subarray as the pivot. Instead, you could randomly choose an element in the subarray and use that element as the pivot. But wait: the partition function assumes that the pivot is in the rightmost position of the subarray. No problem: just swap the element that you chose as the pivot with the rightmost element, and then partition as before. Unless your enemy knows how you choose random locations in the subarray, you win!
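
Here is a minimal Python sketch of that strategy (illustrative, not the course's own code):

import random

def partition(a, lo, hi):
    """Lomuto partition around a[hi]; return the pivot's final index."""
    pivot, i = a[hi], lo
    for j in range(lo, hi):
        if a[j] <= pivot:
            a[i], a[j] = a[j], a[i]
            i += 1
    a[i], a[hi] = a[hi], a[i]
    return i

def randomized_quicksort(a, lo=0, hi=None):
    """Quicksort with a uniformly random pivot, swapped into the
    rightmost slot so the usual partition code works unchanged."""
    if hi is None:
        hi = len(a) - 1
    if lo < hi:
        r = random.randint(lo, hi)    # random pivot index
        a[r], a[hi] = a[hi], a[r]     # move it to the right end
        p = partition(a, lo, hi)
        randomized_quicksort(a, lo, p - 1)
        randomized_quicksort(a, p + 1, hi)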
In fact, with a little more effort, you can improve your chance of getting a split that's at worst 3-to-1. Randomly choose not one, but three elements from the subarray, and take the median of the three as the pivot (swapping it with the rightmost element). By the median, we mean the element of the three whose value is in the middle. We won't show why, but if you choose the median of three randomly chosen elements as the pivot, you have a 68.75% chance (11/16) of getting a 3-to-1 split or better. You can go even further. If you choose five elements at random and take their median as the pivot, your chance of at worst a 3-to-1 split improves to about 79.3% (203/256). If you take the median of seven randomly chosen elements, it goes up to about 85.9% (1759/2048). Median of nine? About 90.2% (59123/65536). Median of 11? About 93.1% (488293/524288). You get the picture. Of course, it doesn't necessarily pay to choose a large number of elements at random and take their median, for the time spent doing so could counteract the benefit of getting good splits almost all the time.
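
Those probabilities are easy to sanity-check with a quick simulation; here is an illustrative Python sketch (names and parameters are mine) that models the pivot's final position as uniform in the subarray and estimates how often the median of $k$ random picks lands in the middle half:

import random

def middle_half_chance(k, trials=200_000):
    """Estimate the probability that the median of k independent
    uniform picks lies in the middle half of the range."""
    hits = 0
    for _ in range(trials):
        median = sorted(random.random() for _ in range(k))[k // 2]
        hits += 0.25 <= median < 0.75
    return hits / trials

for k in (1, 3, 5, 7, 9, 11):
    print(k, middle_half_chance(k))
# expected roughly: 0.500, 0.688, 0.793, 0.859, 0.902, 0.931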

This content is a collaboration of Dartmouth Computer Science professors Thomas Cormen and Devin Balkcom, plus the Khan Academy computing curriculum team. The content is licensed CC-BY-NC-SA.
