In this post we investigate whether there is a statistical way to confirm or disprove this famous saying.

It is pretty obvious that, if we toss a fair coin and obtain two heads in a row, there is no reason to think that the third flip is more likely to give a head than a tail.

Things get different, however, if we don’t know whether the coin is fair or not. In this case, we need some further statistical assumptions on the coin bias if we want to make calculations. Suppose, for example, that the probability of heads \(\mu\) is uniformly distributed between \(0\) and \(1\). Now let \(X_k\) be the number of heads after the first \(k\) tosses. The probability of having three heads in a row, given that the first two flips resulted in a head, is \[p=P(X_3=3 |X_2=2).\]Using Bayes' theorem we write \begin{eqnarray}p  &=& \frac{P(X_2=2|X_3=3)P(X_3=3)}{P(X_2=2)}=\\&=&\frac{P(X_3=3)}{P(X_2=2)},\tag{1}\label{eq3114:1}\end{eqnarray}where we also used the fact that \(P(X_2=2|X_3=3)\), i.e. the probability of having two heads in the first two tosses, given that the first three are all heads, is clearly \(1\).

Conditioning on \(\mu\) yields \[P(X_k=k|\mu) = \mu^k,\]and averaging over the uniform density of \(\mu\) gives \(P(X_k=k)=\int_0^1\mu^k d\mu\), so that \eqref{eq3114:1} can be rewritten as \begin{eqnarray}p&=&\frac{\int_0^1\mu^3 d\mu}{\int_0^1 \mu^2 d\mu}=\\&=&\frac{\left[\frac{\mu^4}{4}\right]_0^1}{\left[\frac{\mu^3}{3}\right]_0^1}=\frac34.\end{eqnarray}

In words, if we got two heads in a row, it is indeed more likely (with probability \(75\)%) to get a third head rather than a tail.
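
The \(75\)% figure can be checked empirically. The following is a minimal Monte Carlo sketch, assuming the same uniform prior on \(\mu\); the function name `estimate_third_head`, the number of trials and the seed are illustrative choices, not part of the derivation above.

```python
import random

def estimate_third_head(trials=200_000, seed=0):
    """Estimate P(third toss is a head | first two tosses are heads)."""
    rng = random.Random(seed)
    two_heads = 0
    three_heads = 0
    for _ in range(trials):
        mu = rng.random()  # unknown bias, drawn uniformly from [0, 1)
        tosses = [rng.random() < mu for _ in range(3)]
        if tosses[0] and tosses[1]:
            two_heads += 1          # condition: first two tosses are heads
            if tosses[2]:
                three_heads += 1    # third toss is a head as well
    return three_heads / two_heads

print(estimate_third_head())
```

With this many trials the estimate should land close to \(0.75\), up to sampling noise.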

Let us analyze a similar, but slightly more complicated situation. Suppose an urn contains \(m\) balls, each of which can be either black or white. We don’t know what the number of white balls is. We assume, though, that the possible numbers of white balls, i.e. \(0,1,2,\dots,m\), are all equally likely (that is, the number of white balls has a discrete uniform distribution). Assume that we return each ball to the urn once it is extracted. If the first two balls are white, what is the probability that the third ball is also white? You can reach the result by following these steps.

  1. Adopt the same notation we used for the previous problem, where now \(X_k\) is the number of white balls in the first \(k\) extractions. Using Bayes' theorem allows you to write the required probability again as \[p=\frac{P(X_3=3)}{P(X_2=2)}.\]
  2. Note that the probability of getting a white ball, conditioned on the number \(j\) of white balls contained in the urn, is equal to \(\frac{j}{m}\), for \(j=0,1,\dots,m\).
  3. Use 2. and the independence of each extraction, together with the uniform distribution of \(j\), to rewrite \(p\) as \[p=\frac{\sum_{j=1}^m\left(\frac{j}{m}\right)^3}{\sum_{j=1}^m\left(\frac{j}{m}\right)^2}.\]
  4. Finally use the known results \[\sum_{j=1}^m j^2 = \frac{m(m+1)(2m+1)}6,\] and \[\sum_{j=1}^m j^3 = \frac{m^2(m+1)^2}4\] to obtain \[p=\frac{3(m+1)}{2(2m+1)}.\]
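
The steps above can be cross-checked with exact rational arithmetic. This is only a sketch: the helper names `p_with_replacement` and `closed_form` are made up for illustration, and the sum includes the \(j=0\) term, which contributes nothing.

```python
from fractions import Fraction

def p_with_replacement(m):
    """P(third draw white | first two white), drawing with replacement."""
    # j is the number of white balls, uniform on 0..m.
    # The prior weight 1/(m+1) cancels between numerator and denominator.
    num = sum(Fraction(j, m) ** 3 for j in range(m + 1))
    den = sum(Fraction(j, m) ** 2 for j in range(m + 1))
    return num / den

def closed_form(m):
    """The closed form 3(m+1) / (2(2m+1)) from step 4."""
    return Fraction(3 * (m + 1), 2 * (2 * m + 1))

for m in (1, 2, 5, 50):
    print(m, p_with_replacement(m), closed_form(m))
```

For each listed \(m\) the two printed fractions should coincide, for instance \(\frac9{10}\) when \(m=2\).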

Our result now depends on the number of balls \(m\) in the urn. Obviously, for \(m=1\), the first two extractions being white guarantees that the only ball in the urn is white, thus \(p=1\). When \(m\to \infty\), \[p\to \frac34,\] as shown in the figure below where the function \(f(x) = \frac{3(x+1)}{2(2x+1)}\), for \(x\geq 1\), is plotted together with its horizontal asymptote.

For every \(m\) we have \(p>\frac34\), since \[p-\frac34=\frac{3}{4(2m+1)}>0,\] so that again it is more likely to get a third white ball than a black one, if the first two were white.

Assume now the urn contains at least \(3\) balls. What happens if the balls are not replaced in the urn once extracted? Computations are more cumbersome, but the result is quite surprising.

  1. Express again the required probability as \(p=\frac{P(X_3=3)}{P(X_2=2)}\).
  2. Note that conditioning on the number of white balls \(j\) now yields \[P(X_2=2|j) = \frac{j(j-1)}{m(m-1)}, \ \ j=1,2,\dots,m.\]
  3. Similarly derive \[P(X_3=3|j) = \frac{j(j-1)(j-2)}{m(m-1)(m-2)}, \ \ j=2,3,\dots,m.\]
  4. Using the above results and the substitution \(j=m-k\), write \begin{eqnarray}p&=&\frac{\sum_{k=0}^{m-3}\frac{(m-k)(m-k-1)(m-k-2)}{m(m-1)(m-2)}}{\sum_{k=0}^{m-2}\frac{(m-k)(m-k-1)}{m(m-1)}}=\\&=&\frac1{m-2}\cdot\frac{\sum_{k=0}^{m-3}(m-k)(m-k-1)(m-k-2)}{\sum_{k=0}^{m-2}(m-k)(m-k-1)}=\\&=&\frac1{m-2}\cdot \frac{\mathcal N(m)}{\mathcal D(m)}.\end{eqnarray}
  5. Use induction to show that \begin{eqnarray}\mathcal N(m) &= &\sum_{k=0}^{m-3}(m-k)(m-k-1)(m-k-2)=\\&=&\frac{(m-2)(m-1)m(m+1)}4, \ \ m=3,4,\dots,\end{eqnarray}and, similarly \begin{eqnarray}\mathcal D(m) &= &\sum_{k=0}^{m-2}(m-k)(m-k-1)=\\&=&\frac{(m-1)m(m+1)}3, \ \ m=2,3,\dots.\end{eqnarray}
  6. Conclude that \[p=\frac34,\]independently of the number of balls in the urn, provided, of course, that it is greater than \(2\).
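
The independence from \(m\) is easy to confirm numerically. Below is a minimal sketch using exact fractions; the function name `p_without_replacement` is an illustrative choice, and the sums run over all \(j=0,\dots,m\), since the terms with too few white balls vanish on their own.

```python
from fractions import Fraction

def p_without_replacement(m):
    """P(third draw white | first two white), drawing without replacement."""
    if m < 3:
        raise ValueError("the urn must contain at least 3 balls")
    # j is the number of white balls, uniform on 0..m; the prior weight cancels.
    num = sum(
        Fraction(j * (j - 1) * (j - 2), m * (m - 1) * (m - 2))
        for j in range(m + 1)
    )
    den = sum(Fraction(j * (j - 1), m * (m - 1)) for j in range(m + 1))
    return num / den

for m in (3, 4, 10, 100):
    print(m, p_without_replacement(m))
```

Every line printed by the loop should show the same value, \(\frac34\).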

As a further exercise, I suggest you generalize the above situation and determine the probability that the \(t\)-th trial is a success (a white ball), given that all the previous \(t-1\) trials were successes.
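
If you want to test your answer, a brute-force checker along the following lines may help. It is only a sketch for the urn without replacement, assuming \(t\leq m\); the names `p_all_white` and `p_next_success` are hypothetical, and it relies on the fact that, given \(j\) white balls, the first \(t\) draws are all white with probability \(\binom{j}{t}/\binom{m}{t}\).

```python
from fractions import Fraction
from math import comb

def p_all_white(m, t):
    """P(first t draws are all white), without replacement, uniform prior on j."""
    # comb(j, t) is 0 whenever j < t, so those urns contribute nothing.
    total = sum(Fraction(comb(j, t), comb(m, t)) for j in range(m + 1))
    return total / (m + 1)

def p_next_success(m, t):
    """P(t-th draw white | first t-1 draws white); assumes 1 <= t <= m."""
    return p_all_white(m, t) / p_all_white(m, t - 1)

# Sanity check against the case worked out above (t = 3).
print(p_next_success(10, 3))
```

Calling `p_next_success(m, t)` for a few values of \(m\) and \(t\) gives exact fractions to compare with your formula.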