Poisson process

01 Theory - Poisson variable

Theory 1 - Poisson variable

Poisson variable

A random variable X is Poisson, written $X \sim \mathrm{Pois}(\alpha)$, when X counts the number of “arrivals” in a fixed “window.” It is applicable when:

  • The arrivals come at a constant average rate.
  • The arrivals are independent of each other.

Poisson PMF:

$$P_X(k) = \frac{e^{-\alpha}\,\alpha^k}{k!}$$

The “window rate” $\alpha$ must be computed from the “background rate” $\lambda$ using the window size: $\alpha = \lambda \cdot (\text{window size})$.
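Both formulas can be sketched in code (the helper name and the example rates below are our own, chosen for illustration):

```python
from math import exp, factorial

def poisson_pmf(k: int, alpha: float) -> float:
    """P_X(k) = e^(-alpha) * alpha^k / k! for X ~ Pois(alpha)."""
    return exp(-alpha) * alpha**k / factorial(k)

# Illustrative rates: background rate lambda = 2 arrivals/hour,
# observed over a half-hour window, so alpha = 2 * 0.5 = 1.
alpha = 2 * 0.5
print(poisson_pmf(0, alpha))                          # P(no arrivals) = e^-1 ≈ 0.368
print(sum(poisson_pmf(k, alpha) for k in range(50)))  # total probability ≈ 1
```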

A Poisson variable is comparable with a binomial variable. Both count the occurrences of some “arrivals” over some “space of opportunity.”

  • The binomial opportunity is a set of n repetitions of a trial.
  • The Poisson opportunity is a continuous interval of time.

In the binomial case, success occurs at some rate p, since p=P[A] where A is the success event.

In the Poisson case, arrivals occur at some rate α.

The Poisson distribution is actually the limit of binomial distributions by taking $n \to \infty$ while $np$ remains fixed, so $p \to 0$ in perfect balance with $n \to \infty$.

Fix $\alpha$ and define $p_n = \alpha/n$. (So $p$ is computed to ensure the average rate $np$ does not change.) Let $X_n \sim \mathrm{Bin}(n, p_n)$ and let $Y_\alpha \sim \mathrm{Pois}(\alpha)$. Then for any $k$:

$$P_{X_n}(k) \to P_{Y_\alpha}(k) \quad \text{as } n \to \infty$$

For example, let $X_n \sim \mathrm{Bin}(n, p_n)$ with $p_n = \frac{3}{n}$, and look at $P_{X_n}(1)$ as $n \to \infty$:

$$P_{X_n}(1) = \binom{n}{1}\left(\frac{3}{n}\right)^1\left(1-\frac{3}{n}\right)^{n-1} = 3\left(1-\frac{3}{n}\right)^{n-1} \to 3e^{-3} \quad \text{as } n \to \infty$$
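This convergence is easy to observe numerically (a sketch; the PMF helper names are ours):

```python
from math import comb, exp, factorial

def binom_pmf(k, n, p):
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def poisson_pmf(k, alpha):
    return exp(-alpha) * alpha**k / factorial(k)

# Bin(n, 3/n) approaches Pois(3) as n grows; compare P(1) for each.
for n in (10, 100, 10_000):
    print(n, binom_pmf(1, n, 3 / n), poisson_pmf(1, 3.0))
```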

Interpretation - Binomial model of rare events

Let us interpret the assumptions of this limit. For n large but p small such that α=np remains moderate, the binomial distribution describes a large number of trials, a low probability of success per trial, but a moderate total count of successes.

This setup describes a physical system with a large number of parts that may activate, but each part is unlikely to activate; and yet the number of parts is so large that the total number of arrivals is still moderate.

Link to original

02 Illustration

Example - Arrivals at a post office

Arrivals at a post office

Client arrivals at a post office are modelled well using a Poisson variable.

Each potential client has a very low and independent chance of coming to the post office, but there are many thousands of potential clients, so the arrivals at the office actually come in moderate number.

Suppose the average rate is 5 clients per hour.

(a) Find the probability that nobody comes in the first 10 minutes of opening. (The cashier is considering being late by 10 minutes to run an errand on the way to work.)

(b) Find the probability that 5 clients come in the first hour. (I.e. the average is achieved.)

(c) Find the probability that 9 clients come in the first two hours.

Solution

(a)

Expect 5/6 clients every 10 minutes. Therefore, let $X \sim \mathrm{Pois}(5/6)$. Seek $P_X(0)$ as the answer.

PMF:

$$P_X(k) = \frac{e^{-5/6}\,(5/6)^k}{k!}$$

Insert data and compute:

$$P_X(0) = e^{-5/6} \approx 0.435$$

(b)

Rate is already correct. Let $X \sim \mathrm{Pois}(5)$.

PMF:

$$P_X(5) = \frac{e^{-5}\,5^5}{5!} \approx 0.175$$

(c)

Expect 10 clients every 2 hours. Therefore let $X \sim \mathrm{Pois}(10)$.

PMF:

$$P_X(9) = \frac{e^{-10}\,10^9}{9!} \approx 0.125$$

Notice that 0.125 is smaller than 0.175.
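The three computations can be checked in code (a sketch using the standard PMF; the helper name is ours):

```python
from math import exp, factorial

def poisson_pmf(k, alpha):
    return exp(-alpha) * alpha**k / factorial(k)

# (a) 10-minute window at 5 clients/hour: alpha = 5/6
print(poisson_pmf(0, 5 / 6))   # ≈ 0.435
# (b) 1-hour window: alpha = 5
print(poisson_pmf(5, 5))       # ≈ 0.175
# (c) 2-hour window: alpha = 10
print(poisson_pmf(9, 10))      # ≈ 0.125
```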

Link to original

Example - Random leukemia?

Random leukemia?

Suppose the rate of leukemia in a random population is expected to be 8.3 cases per million.

Suppose a given town of 150,000 residents has 12 cases. Is that suspicious?

(Compute the probability of 12 or more cases assuming a Poisson distribution.)

Solution

The expected case rate for 150,000 residents is:

$$(1.5 \times 10^5\ \text{persons}) \cdot \frac{8.3\ \text{cases}}{1 \times 10^6\ \text{persons}} = 1.245\ \text{cases}$$

Therefore, with a “window” of 150,000 residents, the case rate should be $\alpha = 1.245$. Set $X \sim \mathrm{Pois}(1.245)$.

Now compute:

$$P[X \ge 12] = \sum_{k=12}^{\infty} P_X(k) = 1 - \sum_{k=0}^{11} \frac{e^{-1.245}\,(1.245)^k}{k!} \approx 9.2 \times 10^{-9}$$

Therefore the observed case rate is highly suspicious!
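The same computation in code (a sketch; only the rates from the problem are used):

```python
from math import exp, factorial

# Window of 150,000 residents at 8.3 cases per million:
alpha = 150_000 * 8.3 / 1_000_000   # = 1.245 expected cases
p_ge_12 = 1 - sum(exp(-alpha) * alpha**k / factorial(k) for k in range(12))
print(p_ge_12)   # ≈ 9.2e-9
```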

Link to original

Example - Radioactive decay is Poisson

Radioactive decay is Poisson

Consider a macroscopic sample of Uranium.

Each atom decays independently of the others, and the likelihood of a single atom popping off is very low; but the product of this likelihood by the total number of atoms is a moderate number.

So there is some constant average rate of atoms in the sample popping off, and the number of pops per minute follows a Poisson distribution.

Link to original

Example - Typos per page

Typos per page

A draft of a textbook has an average of 6 typos per page.

What is the probability that a randomly chosen page has at least 4 typos?

(Answer: 0.849. Hint: study the complementary event.)
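A check of the stated answer, assuming the complementary-event reading $P[X \ge 4] = 1 - P[X \le 3]$ with $\alpha = 6$:

```python
from math import exp, factorial

# alpha = 6 typos per page; complementary event: P[X >= 4] = 1 - P[X <= 3]
alpha = 6
p_at_least_4 = 1 - sum(exp(-alpha) * alpha**k / factorial(k) for k in range(4))
print(p_at_least_4)   # ≈ 0.849
```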

Link to original

03 Theory - Poisson limit of binomial

Theory 2 - Poisson limit of binomial

Extra - Derivation of binomial limit to Poisson

Consider a random variable $X \sim \mathrm{Bin}(n,p)$, and suppose n is very large.

Suppose also that p is very small, such that $E[X] = np$ is not very large; the extremes of n and p counteract each other. (Notice that then $npq$ will not be large, so the normal approximation does not apply.) In this case, the binomial PMF can be approximated using a factor of $e^{-np}$. Consider the following rearrangement of the binomial PMF:

$$P_X(k) = \binom{n}{k} p^k q^{n-k} = \frac{n(n-1)\cdots(n-k+1)}{k!}\, p^k\, (1-p)^n\, \frac{1}{q^k} = (1-p)^n\, \frac{(np)^k}{k!} \left[\frac{n}{n}\cdot\frac{n-1}{n}\cdot\frac{n-2}{n}\cdots\frac{n-k+1}{n}\right] \frac{1}{q^k}$$

Since n is very large, the factor in brackets is approximately 1, and since p is very small, the last factor of $1/q^k$ is also approximately 1 (provided we consider k small compared to n). So we have:

$$P_X(k) \approx (1-p)^n\, \frac{(np)^k}{k!}.$$

Write $\alpha = np$, a moderate number, to find:

$$P_X(k) \approx \left(1 - \frac{\alpha}{n}\right)^n \frac{\alpha^k}{k!}.$$

Here at last we find $e^{-\alpha}$, since $\left(1-\frac{\alpha}{n}\right)^n \to e^{-\alpha}$ as $n \to \infty$. So as $n \to \infty$:

$$P_X(k) \to \frac{e^{-\alpha}\,\alpha^k}{k!}.$$
Link to original

04 Theory - Poisson expected value


Expectation of Poisson

Derive the formula $E[X] = \alpha$ for a Poisson variable $X \sim \mathrm{Pois}(\alpha)$.
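This is not the requested derivation, but a numeric sanity check of the formula, summing $\sum_k k\,P_X(k)$ with the series truncated (helper name ours):

```python
from math import exp, factorial

def poisson_mean(alpha, terms=100):
    """Truncated sum of k * P_X(k) for X ~ Pois(alpha)."""
    return sum(k * exp(-alpha) * alpha**k / factorial(k) for k in range(terms))

# The truncated mean matches alpha for moderate rates.
for alpha in (0.5, 2.0, 7.3):
    print(alpha, poisson_mean(alpha))
```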

Link to original

05 Theory

Theory 1

Exponential variable

A random variable X is exponential, written $X \sim \mathrm{Exp}(\lambda)$, when X measures the wait time until first arrival in a Poisson process with rate $\lambda$.

Exponential PDF:

$$f_X(t) = \begin{cases} \lambda e^{-\lambda t} & t \ge 0 \\ 0 & t < 0 \end{cases}$$
  • Poisson is continuous analog of binomial
  • Exponential is continuous analog of geometric

Notice the coefficient $\lambda$ in $f_X$. This ensures that the total probability is 1, i.e. $\int_{-\infty}^{\infty} f_X(t)\,dt = 1$:

$$\int_0^\infty e^{-\lambda t}\,dt = -\lambda^{-1}\left(e^{-\lambda t}\Big|_0^{\infty}\right) = \lambda^{-1}$$

Notice the “tail probability” is a simple exponential decay:

$$P[X > t] = e^{-\lambda t}$$

(Compute an improper integral to verify this.)
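The suggested improper integral can be checked numerically (a midpoint-rule sketch; the parameters are illustrative, and the upper limit truncates the integral where the tail is negligible):

```python
from math import exp

def tail_numeric(lam, t, upper=200.0, n=50_000):
    """Midpoint-rule estimate of the integral of lam*e^(-lam*x) over [t, upper]."""
    h = (upper - t) / n
    return sum(lam * exp(-lam * (t + (i + 0.5) * h)) * h for i in range(n))

lam, t = 0.5, 3.0
print(tail_numeric(lam, t), exp(-lam * t))   # both ≈ 0.223
```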


Erlang variable

A random variable X is Erlang, written $X \sim \mathrm{Erlang}(\ell, \lambda)$, when X measures the wait time until the $\ell$th arrival in a Poisson process with rate $\lambda$.

Erlang PDF:

$$f_X(t) = \frac{\lambda^\ell}{(\ell-1)!}\, t^{\ell-1} e^{-\lambda t}$$
  • Erlang is continuous analog of Pascal
Link to original

06 Illustration

Example - Earthquake wait time

Earthquake wait time

Suppose the San Andreas fault produces major earthquakes modeled by a Poisson process, with an average of 1 major earthquake every 100 years.

(a) What is the probability that there will not be a major earthquake in the next 20 years?

(b) What is the probability that three earthquakes will strike within the next 20 years?

Solution

(a)

Since the average wait time is 100 years, we set $\lambda = 0.01$ earthquakes per year. Set $X \sim \mathrm{Exp}(0.01)$ and compute:

$$P[X > 20] = e^{-\lambda \cdot 20} = e^{-0.01 \cdot 20} \approx 0.82$$

(b)

The same Poisson process has the same $\lambda = 0.01$ earthquakes per year. Set $X \sim \mathrm{Erlang}(3, 0.01)$, so:

$$f_X(t) = \frac{\lambda^\ell}{(\ell-1)!}\, t^{\ell-1} e^{-\lambda t} = \frac{(0.01)^3}{(3-1)!}\, t^{3-1} e^{-0.01t} = \frac{10^{-6}}{2}\, t^2 e^{-0.01t}$$

Now compute:

$$P[X \le 20] = \int_0^{20} f_X(t)\,dt = \int_0^{20} \frac{10^{-6}}{2}\, t^2 e^{-0.01t}\,dt \approx 0.00115$$

Link to original
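Both parts in code. For (b), this sketch avoids the integral by using the standard identity that the third arrival comes within time t exactly when the Poisson arrival count in [0, t] is at least 3:

```python
from math import exp, factorial

lam, t = 0.01, 20
# (a) no quake in 20 years: exponential tail
print(exp(-lam * t))                      # ≈ 0.82
# (b) third arrival within 20 years: Erlang(3, lam) CDF,
# equivalently P[N >= 3] for N ~ Pois(lam * t)
alpha = lam * t
p = 1 - sum(exp(-alpha) * alpha**k / factorial(k) for k in range(3))
print(p)                                  # ≈ 0.00115
```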

07 Theory

Theory 2

The memoryless distribution is exponential

The exponential distribution is memoryless. This means that knowledge that an event has not yet occurred does not affect the probability of its occurring in future time intervals:

$$P[X > t+s \mid X > t] = P[X > s].$$

This is easily checked using the tail probability:

$$\frac{e^{-\lambda(t+s)}}{e^{-\lambda t}} = e^{-\lambda s}$$

No other continuous distribution is memoryless: any continuous memoryless distribution agrees in probability with an exponential distribution. The reason is that the memoryless property can be rewritten as $P[X>t+s] = P[X>t]\,P[X>s]$. Consider $P[X>x]$ as a function of x, and notice that this function converts sums into products. Only an exponential function can do this.

The geometric distribution is the discrete memoryless distribution.

$$P[X > n] = \sum_{k=n+1}^{\infty} q^{k-1} p = q^n p\,(1 + q + q^2 + \cdots) = \frac{q^n p}{1-q} = q^n$$

and by substituting $n+k$ for $n$, we also know $P[X > n+k] = q^{n+k}$.

Then:

$$P[X = n+k \mid X > n] = \frac{P[X = n+k]}{P[X > n]} = \frac{q^{n+k-1}\,p}{q^n} = q^{k-1} p = P[X = k]$$
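The tail identity $P[X > n] = q^n$ makes the memoryless property a one-line check (the success probability below is illustrative, with X counting trials until the first success):

```python
# Check P[X > n+k | X > n] = P[X > k] for a geometric variable
# with success probability p per trial.
p, q = 0.3, 0.7

def tail(n):
    """P[X > n] = q**n for the geometric distribution."""
    return q**n

n, k = 4, 6
lhs = tail(n + k) / tail(n)   # conditional tail probability
print(lhs, tail(k))           # both equal q**6
```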
Link to original

Derived random variables

08 Theory

Theory 1

By composing any function $g: \mathbb{R} \to \mathbb{R}$ with a random variable $X: S \to \mathbb{R}$ we obtain a new random variable $g \circ X$, called a derived random variable.

Notation

The derived random variable $g \circ X$ may be written “g(X)”.

Expectation of derived variables

Discrete case:

$$E[g(X)] = \sum_k g(k)\, P_X(k)$$

(Here the sum is over all possible values k of X, i.e. where $P_X(k) \neq 0$.)

Continuous case:

$$E[g(X)] = \int_{-\infty}^{+\infty} g(x)\, f_X(x)\,dx$$

Notice: when applied to outcome $s \in S$:

  • k is the output of X
  • g(k) is the output of $g \circ X$

The proofs of these formulas are tricky because we must relate the PDF or PMF of X to that of g(X).


Linearity of expectation

For constants a and b:

$$E[aX + b] = a\,E[X] + b$$

For any X and Y on the same probability model:

$$E[X + Y] = E[X] + E[Y]$$

Exercise - Linearity of expectation

Using the definition of expectation, verify both linearity formulas for the discrete case.

Be careful!

Usually $E[g(X)] \neq g(E[X])$.

For example, usually $E[X \cdot X] \neq E[X] \cdot E[X]$.

We distribute E over sums but not products (unless the factors are independent).
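A tiny illustration with a fair die (our own example): $E[X^2]$ and $(E[X])^2$ disagree.

```python
# Fair six-sided die: E[X^2] != (E[X])^2
vals = [1, 2, 3, 4, 5, 6]
ex = sum(v / 6 for v in vals)        # E[X] = 3.5
ex2 = sum(v * v / 6 for v in vals)   # E[X^2] = 91/6 ≈ 15.17
print(ex2, ex**2)                    # 15.17 vs 12.25
```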


Variance squares the scale factor

For constants a and b:

$$\mathrm{Var}[aX + b] = a^2\,\mathrm{Var}[X]$$

Thus variance ignores the offset and squares the scale factor. It is not linear!


Extra - Moments

The nth moment of X is defined as the expectation of $X^n$:

Discrete case:

$$E[X^n] = \sum_k k^n\, p(k)$$

Continuous case:

$$E[X^n] = \int_{-\infty}^{+\infty} x^n\, f(x)\,dx$$

A central moment of X is a moment of the variable $X - E[X]$:

$$E\left[(X - E[X])^n\right]$$

The data of all the moments collectively determines the probability distribution. This fact can be very useful! In this way moments give an analogue of a series representation, and are sometimes more useful than the PDF or CDF for encoding the distribution.

Link to original

09 Illustration

Example - Function given by chart

Expectation of function on RV given by chart

Suppose that $g: \mathbb{R} \to \mathbb{R}$ in such a way that $g: 1 \mapsto 4$, $g: 2 \mapsto 1$, and $g: 3 \mapsto 87$, and no other values are mapped to 4, 1, 87. Define $Y = g(X)$.

X:       1     2     3
P_X(k):  1/7   2/7   4/7
Y:       4     1     87

Then:

$$E[X] = 1 \cdot \frac{1}{7} + 2 \cdot \frac{2}{7} + 3 \cdot \frac{4}{7} = \frac{17}{7}$$

And:

$$E[Y] = 4 \cdot \frac{1}{7} + 1 \cdot \frac{2}{7} + 87 \cdot \frac{4}{7} = \frac{354}{7}$$

Therefore:

$$E[5X + 2Y + 3] = 5 \cdot \frac{17}{7} + 2 \cdot \frac{354}{7} + 3 = \frac{814}{7}$$

Link to original
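The same chart computed in code (a sketch; dictionaries stand in for the chart):

```python
# X takes values 1, 2, 3 with probabilities 1/7, 2/7, 4/7; Y = g(X).
pmf = {1: 1 / 7, 2: 2 / 7, 3: 4 / 7}
g = {1: 4, 2: 1, 3: 87}

EX = sum(k * p for k, p in pmf.items())      # 17/7
EY = sum(g[k] * p for k, p in pmf.items())   # 354/7
E_lin = 5 * EX + 2 * EY + 3                  # 814/7, by linearity
print(EX, EY, E_lin)
```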

Example - PMF across many-to-one function

PMF through many-to-one function

Suppose the PMF of X is given by:

$$P_X(x) = \begin{cases} 0.2 & x = -1 \\ 0.4 & x = 0 \\ 0.4 & x = +1 \\ 0 & \text{else} \end{cases}$$

Now suppose $Y = |X|$. That is to say, $Y = g(X)$ where $g(x) = |x|$.

What is the PMF of Y?

Solution

Notice that $g(+1) = 1$ and $g(-1) = 1$. So $P_X(+1)$ and $P_X(-1)$ are combined into $P_Y(1)$:

$$P_Y(y) = \begin{cases} 0.4 & y = 0 \\ 0.6 & y = 1 \\ 0 & \text{else} \end{cases}$$

Link to original
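In code, pushing a PMF through a many-to-one g just sums the colliding masses (a sketch):

```python
from collections import defaultdict

# Push the PMF of X through g(x) = |x|, summing masses that collide.
pmf_X = {-1: 0.2, 0: 0.4, 1: 0.4}
pmf_Y = defaultdict(float)
for x, p in pmf_X.items():
    pmf_Y[abs(x)] += p
print(dict(pmf_Y))   # ≈ {1: 0.6, 0: 0.4}
```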

Example - Average pay raise

Average pay raise

Suppose the average salary at Company A is $52,000. Each employee is given a 3% raise and a $2000 bonus. What is the average salary now?

Solution

Let X be a random variable indicating the starting salary of each employee.

Then $Y = g(X) = (1.03)X + 2000$ is a random variable giving the new salary of each employee.

We can calculate the expectation by linearity:

$$E[Y] = E[g(X)] = E[(1.03)X + 2000] = (1.03)\,E[X] + 2000 = (1.03)(52000) + 2000 = 55560$$

Link to original

10 Theory

Theory 2

Suppose we are given the PDF $f_X(x)$ of X, a continuous RV.

What is $f_{g(X)}$, the PDF of the derived variable given by composing X with $g: \mathbb{R} \to \mathbb{R}$?

PDF of derived

The PDF of g(X) is not (usually) equal to $g(f_X(x))$.

Relating PDF and CDF

When the CDF of X is differentiable, we have:

$$F_X(x) = \int_{-\infty}^{x} f_X(t)\,dt \qquad f_X(x) = \frac{d}{dx} F_X(x)$$
$$F_{g(X)}(x) = \int_{-\infty}^{x} f_{g(X)}(t)\,dt \qquad f_{g(X)}(x) = \frac{d}{dx} F_{g(X)}(x)$$

Therefore, if we know $f_X(x)$, we can find $f_{g(X)}(x)$ using a 3-step process:

(1) Integrate PDF to get CDF:

$$F_X(x) = \int_{-\infty}^{x} f_X(t)\,dt$$

(2) Find $F_{g(X)}$, the CDF of g(X), by comparing conditions:

When g is monotone increasing, we have equivalent conditions:

$$g(X) \le x \iff X \le g^{-1}(x) \quad\Longrightarrow\quad P[g(X) \le x] = P[X \le g^{-1}(x)] \quad\Longrightarrow\quad F_{g(X)}(x) = F_X(g^{-1}(x))$$

(3) Differentiate CDF to recover PDF:

$$f_{g(X)}(x) = \frac{d}{dx} F_{g(X)}(x) = \frac{d}{dx} F_X(g^{-1}(x)) = f_X(g^{-1}(x)) \cdot \frac{d}{dx}\left(g^{-1}(x)\right)$$

Alternative: Method of differentials

Change variables: the measure for integration is $f_X(x)\,dx$. Set $Y = X^2$, so $dy = 2x\,dx$ and $dx = \frac{1}{2\sqrt{y}}\,dy$. Thus $f_X(x)\,dx = f_X(\sqrt{y})\,\frac{1}{2\sqrt{y}}\,dy$. So the measure of integration in terms of y has density $f_Y(y) = f_X(\sqrt{y})\,\frac{1}{2\sqrt{y}}$.

Warning: this assumes the function is one-to-one.
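A numeric cross-check of the differentials formula, for the illustrative choice $X \sim \mathrm{Exp}(1)$ (our own; the support $x \ge 0$ keeps $x \mapsto x^2$ one-to-one):

```python
from math import exp, sqrt

# For Y = X^2 with X ~ Exp(1), the differentials formula gives
#   f_Y(y) = f_X(sqrt(y)) / (2 sqrt(y)) = e^(-sqrt(y)) / (2 sqrt(y)).
def f_Y(y):
    return exp(-sqrt(y)) / (2 * sqrt(y))

# Cross-check against the CDF method: F_Y(y) = F_X(sqrt(y)) = 1 - e^(-sqrt(y)),
# differentiated numerically by central differences.
def F_Y(y):
    return 1 - exp(-sqrt(y))

y, h = 2.0, 1e-6
numeric = (F_Y(y + h) - F_Y(y - h)) / (2 * h)
print(numeric, f_Y(2.0))   # both ≈ 0.086
```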

Link to original

11 Illustration

Example - PDF of derived from CDF

PDF of derived from CDF

Suppose that $F_X(x) = \dfrac{1}{1 + e^{-x}}$.

(a) Find the PDF of X. (b) Find the PDF of $e^X$.

Solution

(a)

Formula:

$$F_X(x) = \int_{-\infty}^{x} f_X(t)\,dt \quad\Longrightarrow\quad f_X(x) = \frac{d}{dx} F_X(x)$$

Plug in:

$$f_X(x) = \frac{d}{dx}\left(1 + e^{-x}\right)^{-1} = -\left(1 + e^{-x}\right)^{-2} \cdot (-e^{-x}) = \frac{e^{-x}}{(1 + e^{-x})^2}$$

(b)

By definition:

$$F_{e^X}(x) = P[e^X \le x]$$

Since $e^X$ is increasing, we know:

$$e^X \le a \iff X \le \ln a$$

Therefore:

$$F_{e^X}(x) = F_X(\ln x) = \frac{1}{1 + e^{-\ln x}} = \frac{1}{1 + x^{-1}}$$

Then using differentiation:

$$f_{e^X}(x) = \frac{d}{dx}\left(\frac{1}{1 + x^{-1}}\right) = -\left(1 + x^{-1}\right)^{-2} \cdot (-x^{-2}) = \frac{1}{(x+1)^2}$$

Link to original
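A numeric cross-check of part (b), differentiating $F_{e^X}(x) = F_X(\ln x)$ by central differences (a sketch; the test point is arbitrary):

```python
from math import exp, log

def F_X(x):
    return 1 / (1 + exp(-x))

def F_eX(x):
    # CDF of e^X via the condition e^X <= x  <=>  X <= ln x
    return F_X(log(x))

x, h = 3.0, 1e-6
numeric = (F_eX(x + h) - F_eX(x - h)) / (2 * h)
print(numeric, 1 / (x + 1) ** 2)   # both ≈ 0.0625
```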