Theory 1 - Poisson variable

Poisson variable

A random variable $X$ is Poisson, written $X \sim \operatorname{Pois}(\alpha)$, when $X$ counts the number of “arrivals” in a fixed “window.” It is applicable when:

  • The arrivals come at a constant average rate.
  • The arrivals are independent of each other.

Poisson PMF:

$$P_X(k) = e^{-\alpha} \frac{\alpha^k}{k!}, \qquad k = 0, 1, 2, \dots$$

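The “window rate” $\alpha$ is the expected number of arrivals in the window. It must be computed from the “background rate” $\lambda$ using the window size: for a window of size $t$, $\alpha = \lambda t$.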
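As a minimal sketch of the PMF above, assuming the usual relation $\alpha = \lambda t$ for a window of size $t$: the function name `poisson_pmf` and the numbers (2 arrivals per hour, a 1.5-hour window) are illustrative and not taken from these notes.

```python
import math

def poisson_pmf(k, alpha):
    """P_X(k) = e^{-alpha} * alpha^k / k! for k = 0, 1, 2, ..."""
    return math.exp(-alpha) * alpha**k / math.factorial(k)

# Hypothetical numbers, chosen only for illustration.
lam = 2.0             # background rate: arrivals per hour (assumed)
window = 1.5          # window size in hours (assumed)
alpha = lam * window  # window rate: expected arrivals in the window

# Probabilities of seeing 0, 1, ..., 5 arrivals in the window.
probs = [poisson_pmf(k, alpha) for k in range(6)]
print(alpha, probs)
```

The same background rate with a different window gives a different $\alpha$, which is why the window rate has to be recomputed whenever the window changes.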

A Poisson variable is comparable with a binomial variable. Both count the occurrences of some “arrivals” over some “space of opportunity.”

  • The binomial opportunity is a set of n repetitions of a trial.
  • The Poisson opportunity is a continuous interval of time.

In the binomial case, success occurs at some rate $p$, since $p = P[A]$ where $A$ is the success event.

In the Poisson case, arrivals occur at some rate $\alpha$.




The Poisson distribution is actually the limit of binomial distributions, obtained by taking $n \to \infty$ while $np$ remains fixed, so $p \to 0$ in perfect balance with $n \to \infty$.

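Fix $\alpha$ and define $p_n = \alpha/n$. (So $p_n$ is computed to ensure the average rate $n p_n$ does not change.) Let $X_n \sim \operatorname{Bin}(n, p_n)$ and let $Y_\alpha \sim \operatorname{Pois}(\alpha)$. Then for any $k$:

$$P_{X_n}(k) \to P_{Y_\alpha}(k) \quad \text{as } n \to \infty$$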

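For example, let $X_n \sim \operatorname{Bin}(n, p_n)$ with $p_n = 3/n$ (so $\alpha = 3$), and look at $P_{X_n}(1)$ as $n \to \infty$:

$$P_{X_n}(1) = \binom{n}{1} \left(\frac{3}{n}\right)^{1} \left(1 - \frac{3}{n}\right)^{n-1} = 3 \left(1 - \frac{3}{n}\right)^{n-1} \to 3e^{-3} \quad \text{as } n \to \infty$$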
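The convergence in this example can be checked numerically. The following is a small sketch, with `binom_pmf` as an illustrative helper name and the values of $n$ chosen arbitrarily:

```python
import math

def binom_pmf(k, n, p):
    """Binomial PMF: C(n, k) * p^k * (1 - p)^(n - k)."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

alpha = 3.0
limit = alpha * math.exp(-alpha)  # the Poisson value at k = 1, i.e. 3 e^{-3}

# Hold n * p_n = 3 fixed while n grows, and watch the gap to the limit shrink.
for n in (10, 100, 1_000, 10_000):
    value = binom_pmf(1, n, alpha / n)
    print(n, value, abs(value - limit))
```

The printed gap shrinks as $n$ grows, which is the behavior the limit statement predicts.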

Interpretation - Binomial model of rare events

Let us interpret the assumptions of this limit. For $n$ large but $p$ small such that $\alpha = np$ remains moderate, the binomial distribution describes a large number of trials, a low probability of success per trial, but a moderate total count of successes.

This setup describes a physical system with a large number of parts that may activate, but each part is unlikely to activate; and yet the number of parts is so large that the total number of arrivals is still moderate.

Theory 2 - Poisson limit of binomial

Extra - Derivation of binomial limit to Poisson

Consider a random variable $X \sim \operatorname{Bin}(n, p)$, and suppose $n$ is very large.

Suppose also that $p$ is very small, such that $E[X] = np$ is not very large, but the extremes of $n$ and $p$ counteract each other. (Notice that then $npq$ will not be large, so the normal approximation does not apply.) In this case, the binomial PMF can be approximated using a factor of $e^{-np}$. Consider the following rearrangement of the binomial PMF, where $q = 1 - p$:

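$$P_X(k) = \binom{n}{k} p^k q^{n-k} = \frac{n(n-1)\cdots(n-k+1)}{k!} \, p^k (1-p)^n \frac{1}{q^k} = (1-p)^n \frac{(np)^k}{k!} \left[ \frac{n}{n} \cdot \frac{n-1}{n} \cdot \frac{n-2}{n} \cdots \frac{n-k+1}{n} \right] \frac{1}{q^k}$$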

Since $n$ is very large, the factor in brackets is approximately 1, and since $p$ is very small, the last factor of $1/q^k$ is also approximately 1 (provided we consider $k$ small compared to $n$). So we have:

$$P_X(k) \approx (1-p)^n \frac{(np)^k}{k!}.$$

Write $\alpha = np$, a moderate number, to find:

$$P_X(k) \approx \left(1 - \frac{\alpha}{n}\right)^n \frac{\alpha^k}{k!}.$$

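Here at last we find $e^{-\alpha}$, since $\left(1 - \frac{\alpha}{n}\right)^n \to e^{-\alpha}$ as $n \to \infty$. So as $n \to \infty$:

$$P_X(k) \to e^{-\alpha} \frac{\alpha^k}{k!}.$$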
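The two "approximately 1" steps in this derivation (the bracketed product and the factor $1/q^k$) can be inspected directly. This is a rough sketch under assumed values $n = 10000$ and $p = 0.0003$; `correction_factors` is a hypothetical helper name.

```python
import math

def correction_factors(n, p, k):
    """Return the bracketed product and 1/q^k from the rearranged binomial PMF."""
    # n/n * (n-1)/n * ... * (n-k+1)/n; an empty product (k = 0) equals 1.
    bracket = math.prod((n - i) / n for i in range(k))
    inv_qk = 1.0 / (1.0 - p)**k
    return bracket, inv_qk

n, p = 10_000, 0.0003  # assumed values, giving np = 3
for k in (0, 1, 3, 10):
    bracket, inv_qk = correction_factors(n, p, k)
    print(k, bracket, inv_qk, bracket * inv_qk)
```

For $k$ small compared to $n$ both factors stay close to 1. Trying a larger $k$ shows the bracketed product drifting away from 1, which is why the derivation restricts $k$.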

Theory 3 - Divisibility

Extra - Divisibility

Consider a sequence of increasing $n$ with decreasing $p$ such that $\alpha = np$ is held fixed. For example, let $n = 1, 2, 3, \dots$ while $p = \alpha/n$.

Think of this process as increasing the number of causal agents represented: group the agents together into $n$ bunches, and consider the odds that such a bunch activates. (For the call center, a bunch is a group of users; for radioactive decay, a bunch is a unit of mass of uranium atoms.)

As $n$ doubles, we imagine the number of agents per bunch to drop by half. (For the call center, we divide a group in half, giving twice as many groups but half the odds of any one group making a call; for the uranium, we divide a chunk of mass in half, getting twice as many portions with half the odds of a decay occurring in each portion.)

This process is formally called division of a distribution, and the fact that the Poisson distribution arises as the limit of such division means that it is infinitely divisible.
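One standard way to state infinite divisibility, given here as an assumption since the notes describe it informally, is that a $\operatorname{Pois}(\alpha)$ count has the same distribution as the sum of $n$ independent $\operatorname{Pois}(\alpha/n)$ counts, for every $n$. The sketch below checks this for one assumed choice ($\alpha = 3$, $n = 4$) by convolving PMF tables; the helper names are illustrative.

```python
import math

def poisson_pmf_table(alpha, size):
    """Poisson PMF values for k = 0, 1, ..., size - 1."""
    return [math.exp(-alpha) * alpha**k / math.factorial(k) for k in range(size)]

def convolve(a, b):
    """PMF of the sum of two independent counts, for k = 0, ..., len(a) - 1.

    Entry k only needs entries 0..k of each input, so truncating the tables
    does not introduce any error in the entries that are kept.
    """
    return [sum(a[i] * b[k - i] for i in range(k + 1)) for k in range(len(a))]

alpha, n, size = 3.0, 4, 15                 # assumed values
piece = poisson_pmf_table(alpha / n, size)  # one of n equal pieces

total = piece
for _ in range(n - 1):
    total = convolve(total, piece)          # add one more independent piece

whole = poisson_pmf_table(alpha, size)
print(max(abs(t - w) for t, w in zip(total, whole)))
```

The printed maximum difference is at the level of floating-point rounding, consistent with the pieces recombining into the original distribution.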

Theory 4 - Poisson approximation

Extra - Theorem: Poisson approximation of the binomial

Suppose $X \sim \operatorname{Bin}(n, p)$ and $Y \sim \operatorname{Pois}(np)$. Then:

$$|P_X(k) - P_Y(k)| \le np^2$$

for any k.

In consequence of this theorem, a Poisson distribution may be used to approximate the probabilities of a binomial distribution for large n when it is impracticable (even for a computer) to calculate large binomial coefficients.

The theorem shows that the Poisson approximation is appropriate when $np$ is a moderate number while $np^2$ is a small number.
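The bound can be spot-checked for a particular case. The sketch below assumes $n = 5000$ and $p = 0.001$ (so $np = 5$ and $np^2 = 0.005$) and evaluates the binomial PMF through log-gamma, a choice made only so that the exact value is available for comparison without forming large binomial coefficients; the helper names are illustrative.

```python
import math

def binom_pmf_log(k, n, p):
    """Binomial PMF computed in log space via log-gamma."""
    log_coeff = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return math.exp(log_coeff + k * math.log(p) + (n - k) * math.log1p(-p))

def poisson_pmf_log(k, alpha):
    """Poisson PMF computed in log space."""
    return math.exp(-alpha + k * math.log(alpha) - math.lgamma(k + 1))

n, p = 5_000, 0.001  # assumed values: np = 5, np^2 = 0.005
bound = n * p**2

# Largest pointwise gap between the two PMFs over k = 0, ..., 50.
worst = max(abs(binom_pmf_log(k, n, p) - poisson_pmf_log(k, n * p))
            for k in range(51))
print(worst, bound)
```

In this case the observed gap sits below $np^2$, as the theorem requires; a single check of this kind illustrates the bound but does not prove it.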