Problems in Probability

Border Border
Fall 2011 Homework 4: Question 6
Consider the following algorithm for sorting a list of n distinct numbers into increasing order. Initially they are in a random order, with all orders equally likely. The algorithm compares the numbers in positions 1 and 2, and swaps them if needed, then it compares the new numbers in positions 2 and 3, and swaps them if needed, etc., until it has gone through the whole list. Call this one "sweep" through the list. After the first sweep, the largest number is at the end, so the second sweep (if needed) only needs to work with the first n - 1 positions. Similarly, the third sweep (if needed) only needs to work with the first n - 2 positions, etc. Sweeps are performed until n - 1 sweeps have been completed or there is a swapless sweep. For example, if the initial list is 53241 (omitting commas), then the following 4 sweeps are performed to sort the list, with a total of 10 comparisons:

(a) An inversion is a pair of numbers that are out of order (e.g., 12345 has no inversions, while 53241 has 8 inversions). Find the expected number of inversions in the original list. (b) Show that the expected number of comparisons is between and . Hint for (b): for one bound, think about how many comparisons are made if n - 1 sweeps are done; for the other bound, use Part (a).
Solution: (a) There are nC2 pairs of numbers, each of which is equally likely to be in either order. So by symmetry, linearity, and indicator r.v.s, we immediately have that the expected number of inversions is (1/2)(nC2). (b) Let X be the number of comparisons and V be the number of inversions. On the one hand, X >= V since every inversion must be repaired. So E(X) >= E(V) = (1/2)(nC2). On the other hand, there are n − 1 comparisons needed in the first sweep, n − 2 in the second sweep (if needed),..., and 1 in the (n − 1)st sweep (if needed). So X <= (n − 1) + (n − 2) + ··· + 2 + 1 = n(n − 1)/2 = nC2. Hence, we show our desired result. This algorithm is known as bubble-sort.
"Mathematics is the logic of certainty, but statistics is the logic of uncertainty."
Invisible Link
Copyright © 2011 Stat 110 Harvard. Website layout by former Stat110'er.