5.4.2 De Moivre–Laplace theorem

Within a system whose bins are filled according to the binomial distribution (such as Galton's "bean machine", shown here), given a sufficient number of trials (here the rows of pins, each of which causes a dropped "bean" to fall toward the left or right), a shape representing the probability distribution of k successes in n trials (see bottom of Fig. 7) matches approximately the Gaussian distribution with mean np and variance np(1−p), assuming the trials are independent and successes occur with probability p.

Consider tossing a set of n coins a very large number of times and counting the number of "heads" that result each time. The possible number of heads on each toss, k, runs from 0 to n along the horizontal axis, while the vertical axis represents the relative frequency of occurrence of the outcome k heads. The height of each dot is thus the probability of observing k heads when tossing n coins (a binomial distribution based on n trials). According to the de Moivre–Laplace theorem, as n grows large, the shape of the discrete distribution converges to the continuous Gaussian curve of the normal distribution.

In probability theory, the de Moivre–Laplace theorem, which is a special case of the central limit theorem, states that the normal distribution may be used as an approximation to the binomial distribution under certain conditions. In particular, the theorem shows that the probability mass function of the random number of "successes" observed in a series of $n$ independent Bernoulli trials, each having probability $p$ of success (a binomial distribution with $n$ trials), converges to the probability density function of the normal distribution with mean $np$ and standard deviation $\sqrt{np(1-p)}$, as $n$ grows large, assuming $p$ is not $0$ or $1$.

The theorem appeared in the second edition of The Doctrine of Chances by Abraham de Moivre, published in 1738. Although de Moivre did not use the term "Bernoulli trials", he wrote about the probability distribution of the number of times "heads" appears when a coin is tossed 3600 times.[1]

The theorem also provides one derivation of the particular Gaussian function used in the normal distribution.

Theorem

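As $n$ grows large, for $k$ in the neighborhood of $np$ we can approximate[2][3]

$$\binom{n}{k} p^k q^{n-k} \simeq \frac{1}{\sqrt{2\pi npq}}\, e^{-\frac{(k-np)^2}{2npq}}, \qquad p + q = 1,\quad p, q > 0,$$

in the sense that the ratio of the left-hand side to the right-hand side converges to 1 as $n \to \infty$.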
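The convergence of this ratio can be checked numerically. The following is a minimal sketch, not part of the original statement: the helper names (`log_binom_pmf`, `log_normal_pdf`, `ratio_at_sigma`) and the choices p = 0.3 and one standard deviation above the mean are illustrative assumptions. It works in log space so that large n does not overflow.

```python
import math

def log_binom_pmf(n: int, k: int, p: float) -> float:
    """Log of the exact binomial probability of k successes in n trials."""
    log_comb = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return log_comb + k * math.log(p) + (n - k) * math.log(1 - p)

def log_normal_pdf(n: int, k: int, p: float) -> float:
    """Log of the normal density with mean np and variance np(1-p), at k."""
    var = n * p * (1 - p)
    return -((k - n * p) ** 2) / (2 * var) - 0.5 * math.log(2 * math.pi * var)

def ratio_at_sigma(n: int, p: float, c: float) -> float:
    """Exact/approximate ratio at the integer k nearest np + c*sqrt(np(1-p))."""
    k = round(n * p + c * math.sqrt(n * p * (1 - p)))
    return math.exp(log_binom_pmf(n, k, p) - log_normal_pdf(n, k, p))

# Illustrative values only: the ratio should drift toward 1 as n grows.
for n in (100, 10_000, 1_000_000):
    print(n, ratio_at_sigma(n, 0.3, 1.0))
```

Evaluating both sides at the same integer k keeps the comparison fair, since the binomial is only defined on integers.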

Proof

The theorem can be more rigorously stated as follows: $(X - np)/\sqrt{npq}$, with $X$ a binomially distributed random variable and $q = 1 - p$, approaches the standard normal as $n \to \infty$, with the ratio of the probability mass of $X$ to the limiting normal density being 1. This can be shown for an arbitrary nonzero and finite point $c$. On the unscaled curve for $X$, this would be a point $k$ given by

$$k = np + c\sqrt{npq}.$$

For example, with $c$ at 3, $k$ stays 3 standard deviations from the mean in the unscaled curve.

The normal distribution with mean $\mu$ and standard deviation $\sigma$ is defined by the differential equation (DE)

  •  $f'(x) = -\frac{x - \mu}{\sigma^2} f(x)$ with initial condition set by the probability axiom $\int_{-\infty}^{\infty} f(x)\,dx = 1$.

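The binomial distribution approaches the normal in the limit if it satisfies this DE in the limit. As the binomial is discrete, the equation starts as a difference equation whose limit becomes a DE. Difference equations use the discrete derivative, $p(n, k+1) - p(n, k)$, the change for step size 1, where $p(n, k)$ denotes the binomial probability of $k$ successes in $n$ trials. As $n \to \infty$, the discrete derivative becomes the continuous derivative. Hence the proof need show only that, for the unscaled binomial distribution,

  •  $\frac{f'(x)}{f(x)} \cdot \left(-\frac{\sigma^2}{x - \mu}\right) \to 1$ as $n \to \infty$.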

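The required result can be shown directly, taking $x = k$, $\mu = np$, $\sigma^2 = npq$ and $k = np + c\sqrt{npq}$, and writing $p(n, k)$ for the binomial probability of $k$ successes in $n$ trials:

$$\begin{aligned}
\frac{f'(x)}{f(x)} \cdot \frac{npq}{np - k} &= \frac{p(n, k+1) - p(n, k)}{p(n, k)} \cdot \frac{\sqrt{npq}}{-c} \\
&= \frac{np - k - q}{kq + q} \cdot \frac{\sqrt{npq}}{-c} \\
&= \frac{-c\sqrt{npq} - q}{npq + cq\sqrt{npq} + q} \cdot \frac{\sqrt{npq}}{-c} \\
&\to 1.
\end{aligned}$$

The last step holds because the term $-cnpq$ dominates both the denominator and the numerator as $n \to \infty$.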

As $k$ takes only integer values, the constant $c$ is subject to a rounding error. However, the maximum of this error, $0.5/\sqrt{npq}$, is a vanishing value.[4]
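The limit used in this proof can be illustrated numerically. The sketch below is an illustration under stated assumptions rather than part of the proof: the function name `de_ratio` and the parameter values are hypothetical. It uses the exact identity p(n, k+1)/p(n, k) = (n − k)p / ((k + 1)q) for consecutive binomial probabilities, so the relative forward difference simplifies to (np − k − q)/(kq + q). It then multiplies by √(npq)/(−c), where c is recomputed from the rounded integer k.

```python
import math

def de_ratio(n: int, p: float, c: float) -> float:
    """Relative forward difference of the binomial pmf times sqrt(npq)/(-c).

    The argument above says this product tends to 1 as n grows.
    """
    q = 1 - p
    sigma = math.sqrt(n * p * q)
    k = round(n * p + c * sigma)        # nearest integer to np + c*sqrt(npq)
    c_eff = (k - n * p) / sigma         # value of c actually realised by integer k
    # (p(n,k+1) - p(n,k)) / p(n,k), simplified algebraically (no large factorials)
    rel_diff = (n * p - k - q) / (k * q + q)
    return rel_diff * sigma / (-c_eff)

# Illustrative values only (p = 0.3, two standard deviations above the mean).
for n in (100, 10_000, 1_000_000):
    print(n, de_ratio(n, 0.3, 2.0))
```

Recomputing `c_eff` from the integer k sidesteps the rounding error in c, which is at most 0.5/√(npq).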

Alternate proof

The proof consists of transforming the left-hand side (in the statement of the theorem) to the right-hand side by three approximations.

First, according to Stirling's formula, the factorial of a large number $n$ can be replaced with the approximation

$$n! \simeq n^n e^{-n} \sqrt{2\pi n} \qquad \text{as } n \to \infty.$$
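As a quick sanity check on Stirling's formula, n! ≃ nⁿ e⁻ⁿ √(2πn), the following sketch compares the two sides in log space. The function name `stirling_ratio` is an illustrative choice, and `math.lgamma(n + 1)` stands in for ln(n!).

```python
import math

def stirling_ratio(n: int) -> float:
    """n! divided by Stirling's approximation n^n * e^(-n) * sqrt(2*pi*n)."""
    log_exact = math.lgamma(n + 1)  # ln(n!)
    log_stirling = n * math.log(n) - n + 0.5 * math.log(2 * math.pi * n)
    return math.exp(log_exact - log_stirling)

# The ratio should approach 1 from above as n grows.
for n in (10, 100, 1000):
    print(n, stirling_ratio(n))
```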

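Thus, with $q = 1 - p$,

$$\begin{aligned}
\binom{n}{k} p^k q^{n-k} &= \frac{n!}{k!\,(n-k)!}\, p^k q^{n-k} \\
&\simeq \frac{n^n e^{-n} \sqrt{2\pi n}}{k^k e^{-k} \sqrt{2\pi k}\,(n-k)^{n-k} e^{-(n-k)} \sqrt{2\pi (n-k)}}\, p^k q^{n-k} \\
&= \sqrt{\frac{n}{2\pi k (n-k)}}\; \frac{n^n}{k^k (n-k)^{n-k}}\, p^k q^{n-k} \\
&= \sqrt{\frac{n}{2\pi k (n-k)}} \left(\frac{np}{k}\right)^{k} \left(\frac{nq}{n-k}\right)^{n-k}.
\end{aligned}$$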

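Next, the approximation $\frac{k}{n} \to p$ is used to match the root above to the desired root on the right-hand side:

$$\sqrt{\frac{n}{2\pi k (n-k)}} = \sqrt{\frac{1}{2\pi n \cdot \frac{k}{n}\left(1 - \frac{k}{n}\right)}} \simeq \frac{1}{\sqrt{2\pi npq}}.$$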

Finally, the expression is rewritten as an exponential and the Taylor series approximation for ln(1+x) is used:

$$\ln(1 + x) \approx x - \frac{x^2}{2} + \frac{x^3}{3} - \cdots$$

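Then, writing $k = np + c\sqrt{npq}$, so that $c = (k - np)/\sqrt{npq}$,

$$\begin{aligned}
\binom{n}{k} p^k q^{n-k} &\simeq \frac{1}{\sqrt{2\pi npq}} \left(\frac{np}{k}\right)^{k} \left(\frac{nq}{n-k}\right)^{n-k} \\
&= \frac{1}{\sqrt{2\pi npq}} \exp\left\{ -k \ln\frac{k}{np} + (k - n) \ln\frac{n-k}{nq} \right\} \\
&= \frac{1}{\sqrt{2\pi npq}} \exp\left\{ -k \ln\left(1 + c\sqrt{\frac{q}{np}}\right) + (k - n) \ln\left(1 - c\sqrt{\frac{p}{nq}}\right) \right\} \\
&\simeq \frac{1}{\sqrt{2\pi npq}} \exp\left\{ -k \left( c\sqrt{\frac{q}{np}} - \frac{c^2 q}{2np} \right) + (k - n) \left( -c\sqrt{\frac{p}{nq}} - \frac{c^2 p}{2nq} \right) \right\} \\
&\simeq \frac{1}{\sqrt{2\pi npq}} \exp\left\{ -\frac{c^2}{2} \right\} \\
&= \frac{1}{\sqrt{2\pi npq}}\, e^{-\frac{(k - np)^2}{2npq}}.
\end{aligned}$$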

Each "$\simeq$" in the above argument is a statement that two quantities are asymptotically equivalent as n increases, in the same sense as in the original statement of the theorem, i.e., that the ratio of each pair of quantities approaches 1 as n → ∞.
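The last approximation step can be illustrated numerically. With q = 1 − p and k = np + c√(npq), the exponent −k ln(k/(np)) + (k − n) ln((n − k)/(nq)) should approach −c²/2. The sketch below is an illustration under stated assumptions: the function name `exponent` and the parameter values are hypothetical, and k is treated as a real number rather than rounded to an integer.

```python
import math

def exponent(n: int, p: float, c: float) -> float:
    """Exponent left after Stirling's formula, evaluated at k = np + c*sqrt(npq).

    k is treated as a real number here (an assumption made for illustration).
    """
    q = 1 - p
    k = n * p + c * math.sqrt(n * p * q)
    return -k * math.log(k / (n * p)) + (k - n) * math.log((n - k) / (n * q))

# Illustrative values only: compare against the limiting value -c^2 / 2.
c = 1.5
for n in (100, 10_000, 1_000_000):
    print(n, exponent(n, 0.3, c), -c**2 / 2)
```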