Fall 2011 Strategic Practice 10: Section 1 (Conditional Expectation and Conditional Variance) - Question 4
Emails arrive one at a time in an inbox. Let $T_{n}$ be the time at which the nth email arrives (measured on a continuous scale from some starting point in time). Suppose that the waiting times between emails are i.i.d. Expo($\lambda$), i.e., $T_{1}, T_{2}-T_{1}, T_{3}-T_{2}, \ldots$ are i.i.d. Expo($\lambda$). Each email is non-spam with probability p, and spam with probability q = 1 - p (independently of the other emails and of the waiting times). Let X be the time at which the first non-spam email arrives (so X is a continuous r.v., with X = $T_{1}$ if the 1st email is non-spam, X = $T_{2}$ if the 1st email is spam but the 2nd one isn't, etc.).
(a) Find the mean and variance of X.
(b) Find the MGF of X. What famous distribution does this imply that X has (be sure to state its parameter values)?
Hint for both parts: let N be the number of emails until the first non-spam (including that one), and write X as a sum of N terms; then condition on N.
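A quick simulation can serve as a sanity check on whatever closed forms you derive. The sketch below follows the hint directly: it keeps adding Expo($\lambda$) waiting times until an email turns out to be non-spam, so the returned time is a sum of N terms. The function names, the default values of `lam` and `p`, the sample size, and the seed are all illustrative choices, not part of the problem, and the code assumes p > 0 so that the loop terminates.

```python
import random

def sample_first_nonspam_time(lam, p, rng):
    """Draw one value of X: add Expo(lam) waits until a non-spam email arrives."""
    t = 0.0
    while True:
        t += rng.expovariate(lam)  # waiting time until the next email
        if rng.random() < p:       # this email is non-spam with probability p
            return t

def simulate(lam=2.0, p=0.3, n=100_000, seed=0):
    """Return the sample mean and sample variance of n simulated values of X."""
    rng = random.Random(seed)
    xs = [sample_first_nonspam_time(lam, p, rng) for _ in range(n)]
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / (n - 1)
    return mean, var

if __name__ == "__main__":
    mean, var = simulate()
    print("sample mean:", mean)
    print("sample variance:", var)
```

Comparing the printed values against your answers to part (a) for the same `lam` and `p` gives a rough check; agreement to a couple of decimal places is all that should be expected at this sample size.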
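Solution: (a) $N - 1 \sim$ Geom($p$); write X as a sum of N i.i.d. Expo($\lambda$) waiting times, condition on N, and use Adam's Law and Eve's Law. (b) Conditioning on N in the same way gives an MGF for X that matches the Expo($p\lambda$) MGF, so X $\sim$ Expo($p\lambda$). Consult the iTunes course for the full solution.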
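One way to fill in the steps, offered as a sketch rather than the official full solution: write $X = X_{1} + \cdots + X_{N}$, where the $X_{j}$ (notation introduced here for illustration) are the i.i.d. Expo($\lambda$) waiting times and N is independent of them. Since $N - 1 \sim$ Geom($p$), we have $E(N) = 1/p$ and $\mathrm{Var}(N) = q/p^{2}$.

$$
\begin{aligned}
E(X \mid N) &= \frac{N}{\lambda}, \qquad \mathrm{Var}(X \mid N) = \frac{N}{\lambda^{2}}, \\
E(X) &= E\big(E(X \mid N)\big) = \frac{E(N)}{\lambda} = \frac{1}{p\lambda}, \\
\mathrm{Var}(X) &= E\big(\mathrm{Var}(X \mid N)\big) + \mathrm{Var}\big(E(X \mid N)\big)
 = \frac{1}{p\lambda^{2}} + \frac{q}{p^{2}\lambda^{2}}
 = \frac{p + q}{p^{2}\lambda^{2}}
 = \frac{1}{p^{2}\lambda^{2}}.
\end{aligned}
$$

For the MGF, condition on N again and use the Expo($\lambda$) MGF $\lambda/(\lambda - t)$, valid for $t < \lambda$, together with the geometric series for $E(s^{N})$:

$$
\begin{aligned}
E\big(e^{tX} \mid N\big) &= \left(\frac{\lambda}{\lambda - t}\right)^{N}, \\
E\big(e^{tX}\big) &= \sum_{n=1}^{\infty} q^{n-1} p \left(\frac{\lambda}{\lambda - t}\right)^{n}
 = \frac{p\,\frac{\lambda}{\lambda - t}}{1 - q\,\frac{\lambda}{\lambda - t}}
 = \frac{p\lambda}{\lambda - t - q\lambda}
 = \frac{p\lambda}{p\lambda - t}, \qquad t < p\lambda.
\end{aligned}
$$

The series converges exactly when $q\lambda/(\lambda - t) < 1$, i.e., $t < p\lambda$. The mean $1/(p\lambda)$ and variance $1/(p\lambda)^{2}$ from part (a) agree with those of an Expo($p\lambda$) r.v., as they should.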
"Mathematics is the logic of certainty, but statistics is the logic of uncertainty."