Fall 2011 Strategic Practice 4: Section 1 (Distributions and Expected Values for Discrete Random Variables) - Question 6
Job candidates $C_{1}, C_{2},...$ are interviewed one by one, and the interviewer compares them and keeps an updated list of rankings (if n candidates have been interviewed so far, this is a list of the n candidates, from best to worst). Assume that there is no limit on the number of candidates available, that for any n the candidates $C_{1}, C_{2},..., C_{n}$ are equally likely to arrive in any order, and that there are no ties in the rankings given by the interview. Let X be the index of the first candidate to come along who ranks as better than the very first candidate $C_{1}$ (so $C_{X}$ is better than $C_{1}$, but the candidates after 1 but prior to X (if any) are worse than $C_{1}$. For example, if $C_{2}$ and $C_{3}$ are worse than $C_{1}$ but $C_{4}$ is better than $C_{1}$, then X = 4. All 4! orderings of the first 4 candidates are equally likely, so it could have happened that the first candidate was the best out of the first 4 candidates, in which case X > 4. What is E(X) (which is a measure of how long, on average, the interviewer needs to wait to find someone better than the very first candidate)? Hint: find P(X > n) by interpreting what X > n says about how $C_{1}$ compares with other candidates, and then apply the result of the previous problem.
Solution: For n greater than or equal to 2, P(X > n) is the probability that none of C2, C3,..., Cn are better candidates than C1, i.e., the probability that the first candidate is the highest ranked out of the first n. Since any ordering of the first n candidates is equally likely, each of the first n is equally likely to be the highest ranked of the first n, so P(X > n) = 1/n. For n = 0 or n = 1, P(X > n) = 1 (note that it does not make sense to say the probability is 1/n when n = 0). Applying the result of the previous problem, E(X) = infinity. The series is the harmonic series, which diverges. How can the average waiting time to find someone better than the first candidate be infinite? In the real world, there are always only finitely many candidates so the expected waiting time is finite, just as in the St. Petersburg paradox there must in reality be an upper bound on the number of rounds. The harmonic series diverges very slowly, so even with millions of job candidates the average waiting time would not be very large.
"Mathematics is the logic of certainty, but statistics is the logic of uncertainty."