A clean calculation of the Luria-Delbruck mathematics

Status: Done
Confidence: Certain

All biology students are made to read Luria and Delbruck’s paper “Mutations of bacteria from virus sensitivity to virus resistance,” which begins with a series of ad hoc probability calculations. When the paper was written in 1943, they represented the best tools available in probability. Today they are quaint to the point of being incomprehensible. Here is the whole calculation from the paper in modern form:

Assume that we have a population of bacteria growing exponentially and deterministically with growth rate 1, so the size of the population at time $t$ is $N(t) = n e^t$, where $n$ is the size at time 0. These bacteria mutate to a state resistant to bacteriophage infections randomly and uniformly in time, but assume that the resistant bacteria are otherwise identical to their progenitors and that their numbers are always negligible compared to the whole population.

Define $X(\tau)$ as the number of bacteria that mutate in the interval of time $[\tau, \tau + d\tau)$, expressed per unit time. $X(\tau)$ is Poisson with mean $m n e^\tau$, where $m$ is the mutation rate per cell, and its values at distinct times are independent. The rest of the calculation depends on two facts about $X$:

1. $E[ X(\tau) ] = m n e^\tau$
2. $E[ (X(a) - E[X(a)] ) (X(b) - E[X(b)]) ] = m n e^a \delta( a - b)$

Let $Y(t, \tau)$ be the number of mutant cells at time $t$ which originated from cells mutating in the interval $[\tau, \tau + d\tau)$. Then

$Y(t, \tau) = e^{t - \tau} X(\tau)$

Finally, let $Z(t)$ be the total number of mutant cells at time $t$. Obviously,

$Z(t) = \int_0^t Y(t, y) dy$

Now we calculate the means and variances of $Z$.

$E[ Z(t) ]$
= { definition of $Z$, and $E[ \cdot ]$ commutes with integration }
$\int_0^t E[Y(t, y)] dy$
= { definition of $Y$, and $E[ \cdot ]$ is linear }
$\int_0^t e^{t-y} E[ X(y) ] dy$
= { definition of $X$; algebra }
$m n t e^t$

And the variance,

$E[(Z(t) - E[ Z(t) ])^2 ]$
= { definitions of $Z$ and $Y$, and $E[ \cdot ]$ is linear }
$E[ (\int_0^t e^{t-y} (X(y) - E[X(y)]) dy )^2 ]$
= { pull the factor $e^t$ out of the integral and the expectation }
$e^{2t} E[ (\int_0^t dy e^{-y} ( X(y) - E[X(y)] ) )^2]$
= { write out square of integral, renaming one dummy variable }
$e^{2t} E[ \left(\int_0^t dy e^{-y} (X(y) - E[X(y)] ) \right)\ \left( \int_0^t dx e^{-x} (X(x) - E[X(x)] ) \right) ]$
= { pull one integral inside the other }
$e^{2t} E[ \int_0^t dy \int_0^t dx \ e^{-y} e^{-x} (X(y) - E[X(y)]) (X(x) - E[X(x)]) ]$
= { expectation commutes with integrals }
$e^{2t} \int_0^t dy \int_0^t dx \ e^{-y} e^{-x} E[ (X(y) - E[X(y)]) (X(x) - E[X(x)]) ]$
= { property 2 of $X$ }
$e^{2t} \int_0^t dy \int_0^t dx \ e^{-y} e^{-x} \ m n e^x \delta(x-y)$
= { collapse the delta function, evaluate the integral }
$m n e^{2t} (1 - e^{-t})$
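The mean $m n t e^t$ and the variance $m n e^{2t} (1 - e^{-t})$ can be checked numerically. The following is a minimal sketch, not part of the original paper: it draws mutation times from a Poisson process with intensity $m n e^\tau$ on $[0, t]$ and lets each mutant clone grow deterministically by a factor $e^{t - \tau}$, as the model above assumes. The function names and the parameter values ($m = 10^{-3}$, $n = 1000$, $t = 3$) are illustrative assumptions.

```python
import math
import random

def simulate_mutants(m, n, t, rng):
    """One draw of Z(t): sum of e^(t - tau) over mutation times tau in [0, t]."""
    rate = m * n
    # Expected number of mutation events in [0, t]: integral of m n e^tau.
    total_intensity = rate * (math.exp(t) - 1.0)
    z = 0.0
    # Arrivals s of a unit-rate Poisson process, mapped to mutation times by
    # the time change s = m n (e^tau - 1), i.e. e^tau = 1 + s / (m n).
    s = rng.expovariate(1.0)
    while s < total_intensity:
        z += math.exp(t) / (1.0 + s / rate)  # clone size e^(t - tau)
        s += rng.expovariate(1.0)
    return z

def mean_and_variance(samples):
    """Sample mean and unbiased sample variance."""
    k = len(samples)
    mu = sum(samples) / k
    var = sum((x - mu) ** 2 for x in samples) / (k - 1)
    return mu, var

rng = random.Random(0)
m, n, t = 1e-3, 1000.0, 3.0  # hypothetical values chosen for illustration
samples = [simulate_mutants(m, n, t, rng) for _ in range(20000)]
mu, var = mean_and_variance(samples)

predicted_mean = m * n * t * math.exp(t)
predicted_var = m * n * math.exp(2 * t) * (1.0 - math.exp(-t))
print("mean:     simulated", mu, "predicted", predicted_mean)
print("variance: simulated", var, "predicted", predicted_var)
```

The time change avoids needing a Poisson sampler: counting unit-rate exponential arrivals below the total intensity gives the right number of events, and inverting the cumulative intensity places them with density proportional to $e^\tau$.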

The ratio of variance to mean is

$\frac{e^t (1 - e^{-t})}{t}$

and we can ignore the $e^{-t}$ when $t$ is large, leaving roughly $e^t / t$. The point of the Luria-Delbruck paper is to test whether mutants arise at random before the addition of phage, according to the process just described, or at random in response to the addition of phage. If mutation were a response to the phage, the number of resistant cells per culture would be Poisson distributed, with variance equal to mean, so the ratio would be 1. So we have a clean way to distinguish the two hypotheses.
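To make the contrast concrete, here is a small sketch that compares the variance-to-mean ratio under the two hypotheses. It is an illustration under stated assumptions, not the analysis from the paper: the helper names are invented here, the "induced" cultures are simulated Poisson counts with a hypothetical mean of 20, and $t = 5$ is an arbitrary growth time in units of the inverse growth rate.

```python
import math
import random

def predicted_ratio(t):
    """Variance-to-mean ratio of Z(t) from the text: e^t (1 - e^-t) / t."""
    return math.exp(t) * (1.0 - math.exp(-t)) / t

def fano_factor(counts):
    """Sample variance divided by sample mean."""
    k = len(counts)
    mu = sum(counts) / k
    var = sum((c - mu) ** 2 for c in counts) / (k - 1)
    return var / mu

def poisson_draw(mean, rng):
    """Poisson sample: count unit-rate exponential arrivals before `mean`."""
    count = 0
    s = rng.expovariate(1.0)
    while s < mean:
        count += 1
        s += rng.expovariate(1.0)
    return count

rng = random.Random(1)
t = 5.0  # hypothetical growth time

# Hypothetical cultures in which resistance arises only in response to phage:
# resistant counts are Poisson, so the ratio should be close to 1.
induced = [poisson_draw(20.0, rng) for _ in range(5000)]

print("induced-mutation ratio (simulated):", fano_factor(induced))
print("mutation-before-phage ratio (formula):", predicted_ratio(t))
```

Even at a modest $t$ the ratio predicted for mutation before phage exposure is far above 1, which is why comparing sample variance to sample mean across parallel cultures separates the two hypotheses.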

In the original paper, Delbruck prefers to change the lower limit of integration from 0 to the expected time of the first mutation. This produces a statistic which is on average more narrowly distributed, but is biased.