A clean calculation of the Luria-Delbruck mathematics

Status: Done
Confidence: Certain

All biology students are made to read Luria and Delbruck’s paper “Mutations of bacteria from virus sensitivity to virus resistance,” which begins with a series of ad hoc probability calculations. When the paper was written in 1943, they represented the best tools available in probability. Today they are quaint to the point of being incomprehensible. Here is the whole calculation from the paper in modern form:

Assume that we have a population of bacteria growing exponentially and deterministically with growth rate 1, so the size of the population at time $t$ is $N(t) = n e^t$, where $n$ is the size at time 0. These bacteria mutate to a state resistant to bacteriophage infections randomly and uniformly in time, but assume that the resistant bacteria are otherwise identical to their progenitors and that their numbers are always negligible compared to the whole population.
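
To make the model concrete, here is a minimal sketch in Python (NumPy) of the mutation process it implies: mutation events form an inhomogeneous Poisson process with intensity $m N(\tau) = m n e^\tau$. The parameter values and the function name `sample_mutation_times` are arbitrary illustrative choices of mine, not anything from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative parameters (not from the paper): initial population n,
# per-cell mutation rate m, and observation time t.
n, m, t = 1e3, 1e-4, 6.0

def sample_mutation_times(n, m, t, rng):
    """Sample mutation times in [0, t) from the inhomogeneous Poisson
    process with intensity m * n * e^tau.

    The cumulative intensity is Lambda(t) = m*n*(e^t - 1), so the number
    of mutations is Poisson(Lambda(t)), and given that count each time
    has CDF (e^tau - 1)/(e^t - 1), which we invert to sample."""
    expected_count = m * n * (np.exp(t) - 1.0)
    k = rng.poisson(expected_count)
    u = rng.uniform(size=k)
    return np.log1p(u * (np.exp(t) - 1.0))

taus = sample_mutation_times(n, m, t, rng)
print(f"sampled {len(taus)} mutation events; "
      f"expected {m * n * (np.exp(t) - 1.0):.1f} on average")
```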

Define $X(\tau)$ as the number of bacteria that mutate in the interval of time $[\tau, \tau + d\tau)$. $X(\tau)$ is Poisson with mean $m n e^\tau$, where $m$ is the mutation rate per cell. The rest of the calculation depends on two facts about $X$ (see the note just after the list):

  1. $E[X(\tau)] = m n e^\tau$
  2. $E[(X(a) - E[X(a)])(X(b) - E[X(b)])] = m n e^a \, \delta(a - b)$
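
The first fact is just that a Poisson variable's mean is its parameter. The second combines two standard properties of Poisson processes, written out here in the same notation as above:

$$
E\big[(X(a) - E[X(a)])(X(b) - E[X(b)])\big] =
\begin{cases}
0, & a \neq b, \text{ because counts in disjoint intervals are independent,} \\[4pt]
\operatorname{Var}[X(a)] = E[X(a)] = m n e^{a}, & a = b, \text{ because a Poisson variable's variance equals its mean,}
\end{cases}
$$

and the delta function $m n e^a \, \delta(a - b)$ is just a compact way of writing both cases at once.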

Let $Y(t, \tau)$ be the number of mutant cells at time $t$ which originated from cells mutating in the interval $[\tau, \tau + d\tau)$. Then

$$Y(t, \tau) = e^{t - \tau} X(\tau)$$

Finally, let $Z(t)$ be the total number of mutant cells at time $t$. Obviously,

$$Z(t) = \int_0^t Y(t, y)\, dy$$

Now we calculate the mean and variance of $Z$.

$E[Z(t)]$
= { definition of $Z$, and $E[\cdot]$ commutes with integration }
$\int_0^t E[Y(t, y)]\, dy$
= { definition of $Y$, and $E[\cdot]$ is linear }
$\int_0^t e^{t-y} E[X(y)]\, dy$
= { property 1 of $X$, then algebra }
$m n t e^t$
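
For completeness, the algebra in the final step is just the exponentials cancelling and a constant integrand:

$$
\int_0^t e^{t-y}\, m n e^{y}\, dy = m n e^{t} \int_0^t dy = m n\, t\, e^{t}.
$$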

And the variance,

$E[(Z(t) - E[Z(t)])^2]$
= { definition of $Z$ and of $Y$ }
$E\left[ \left( \int_0^t e^{t-y} (X(y) - E[X(y)])\, dy \right)^2 \right]$
= { pull $e^t$ out of the integral and the expectation }
$e^{2t}\, E\left[ \left( \int_0^t dy\, e^{-y} (X(y) - E[X(y)]) \right)^2 \right]$
= { write out the square of the integral, renaming one dummy variable }
$e^{2t}\, E\left[ \left( \int_0^t dy\, e^{-y} (X(y) - E[X(y)]) \right) \left( \int_0^t dx\, e^{-x} (X(x) - E[X(x)]) \right) \right]$
= { pull one integral inside the other }
$e^{2t}\, E\left[ \int_0^t dy \int_0^t dx\, e^{-y} e^{-x}\, (X(y) - E[X(y)]) (X(x) - E[X(x)]) \right]$
= { expectation commutes with integrals }
$e^{2t} \int_0^t dy \int_0^t dx\, e^{-y} e^{-x}\, E[ (X(y) - E[X(y)]) (X(x) - E[X(x)]) ]$
= { property 2 of $X$ }
$e^{2t} \int_0^t dy \int_0^t dx\, e^{-y} e^{-x}\, m n e^{x}\, \delta(x - y)$
= { collapse the delta function, then evaluate the remaining integral }
$m n e^{2t} (1 - e^{-t})$
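
As a sanity check on both formulas, here is a minimal Monte Carlo sketch (Python/NumPy; the parameter values, replicate count, and the name `simulate_Z` are illustrative choices of mine): sample mutation times as in the earlier sketch, grow each resulting clone deterministically to size $e^{t-\tau}$, and compare the empirical mean and variance of $Z(t)$ with $m n t e^t$ and $m n e^{2t}(1 - e^{-t})$.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m, t = 1e3, 1e-4, 6.0     # same illustrative parameters as above
reps = 100_000

def simulate_Z(n, m, t, rng):
    """One realization of Z(t): mutation times from the inhomogeneous
    Poisson process with intensity m*n*e^tau (same inverse-CDF sampling
    as in the earlier sketch); each clone then grows to e^(t - tau)."""
    k = rng.poisson(m * n * (np.exp(t) - 1.0))
    taus = np.log1p(rng.uniform(size=k) * (np.exp(t) - 1.0))
    return np.exp(t - taus).sum()

samples = np.array([simulate_Z(n, m, t, rng) for _ in range(reps)])

print("mean     :", samples.mean(), "   theory:", m * n * t * np.exp(t))
print("variance :", samples.var(),  "   theory:",
      m * n * np.exp(2 * t) * (1.0 - np.exp(-t)))
```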

The ratio of variance to mean is

$$\frac{e^t (1 - e^{-t})}{t}$$

and we can ignore the $e^{-t}$ if we are looking very far out in time. The point of the Luria-Delbruck paper is to test whether mutants arise according to the process just described, at random before the addition of phage, or at random in response to the addition of phage. If it were in response to the addition of phage, the incidence should be Poisson distributed, with variance equal to the mean. So we have a clean way to measure the difference between the two.
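
To put a rough number on the contrast (the time $t = 5$ is an arbitrary illustrative choice):

$$
\frac{\operatorname{Var}[Z(t)]}{E[Z(t)]} \approx \frac{e^{t}}{t}, \qquad \text{e.g. } \frac{e^{5}}{5} \approx 30,
$$

whereas a Poisson-distributed number of mutants has a variance-to-mean ratio of exactly 1.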

In the original paper, Delbruck prefers to change the lower limit of integration from 0 to the expected time of the first mutation. This produces a statistic which is on average more narrowly distributed, but is biased.