Sums of random variables

01 Theory

Theory 1

THEOREM: Continuous PDF of a sum

Let $f_{X,Y}(x,y)$ be any joint continuous PDF.

Suppose $Z = X + Y$. Then:

$$f_Z(z) = \int_{-\infty}^{\infty} f_{X,Y}(x,\, z - x)\, dx$$

When $X$ and $Y$ are independent, so $f_{X,Y}(x,y) = f_X(x)\, f_Y(y)$, this becomes a convolution:

$$f_Z(z) = \int_{-\infty}^{\infty} f_X(x)\, f_Y(z - x)\, dx = (f_X * f_Y)(z)$$
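
As a numerical sanity check, the sketch below compares the convolution formula against a Monte Carlo histogram for two independent Uniform(0,1) variables, whose sum has the triangular density; the grid sizes and sample counts are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Monte Carlo: sample Z = X + Y for independent Uniform(0,1) variables.
x = rng.uniform(0, 1, 100_000)
y = rng.uniform(0, 1, 100_000)
z = x + y

# Convolution formula: f_Z(z0) = integral of f_X(t) f_Y(z0 - t) dt (Riemann sum).
t = np.linspace(0, 1, 2001)
dt = t[1] - t[0]
pdf = lambda u: ((0 <= u) & (u <= 1)).astype(float)   # Uniform(0,1) PDF

for z0 in [0.5, 1.0, 1.5]:
    f_conv = np.sum(pdf(t) * pdf(z0 - t)) * dt        # convolution integral
    f_mc = np.mean(np.abs(z - z0) < 0.01) / 0.02      # histogram density estimate
    print(f"z={z0}: formula {f_conv:.3f}, simulation {f_mc:.3f}")
# Triangular density: values near 0.5, 1.0, 0.5.
```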

Extra - Derivation of PDF

The CDF of $Z = X + Y$:

$$F_Z(z) = P(X + Y \le z) = \int_{-\infty}^{\infty} \int_{-\infty}^{z - x} f_{X,Y}(x, y)\, dy\, dx$$

Find $f_Z$ by differentiating:

$$f_Z(z) = \frac{d}{dz} F_Z(z) = \frac{d}{dz} \int_{-\infty}^{\infty} \int_{-\infty}^{z - x} f_{X,Y}(x, y)\, dy\, dx$$

To calculate this derivative, change variables by setting $u = x$ and $v = x + y$. The Jacobian is $1$, so $dy\, dx$ becomes $dv\, du$, and we have:

$$f_Z(z) = \frac{d}{dz} \int_{-\infty}^{\infty} \int_{-\infty}^{z} f_{X,Y}(u,\, v - u)\, dv\, du = \int_{-\infty}^{\infty} f_{X,Y}(u,\, z - u)\, du$$


02 Illustration

Example - Sum of parabolic random variables

Sum of parabolic random variables

Suppose $X$ is an RV with PDF given by:

$$f_X(x) = \begin{cases} 3x^2, & 0 \le x \le 1 \\ 0, & \text{otherwise} \end{cases}$$

Let $X'$ be an independent copy of $X$. So $X'$ has the same PDF as $X$, but $X'$ is independent of $X$.

Find the PDF of $Z = X + X'$.

Solution

The graph of $x \mapsto f_{X'}(z - x)$ matches the graph of $f_{X'}$ except (i) flipped in a vertical mirror, (ii) shifted by $z$ to the right. So in the convolution

$$f_Z(z) = \int_{-\infty}^{\infty} f_X(x)\, f_{X'}(z - x)\, dx$$

the integrand is nonzero only where both $0 \le x \le 1$ and $0 \le z - x \le 1$.

When $0 \le z \le 1$, the integrand is nonzero only for $0 \le x \le z$:

$$f_Z(z) = \int_0^z 3x^2 \cdot 3(z - x)^2\, dx = 9 \int_0^z \left( z^2 x^2 - 2 z x^3 + x^4 \right) dx = \frac{3}{10}\, z^5$$

When $1 \le z \le 2$, the integrand is nonzero only for $z - 1 \le x \le 1$:

$$f_Z(z) = \int_{z-1}^{1} 3x^2 \cdot 3(z - x)^2\, dx = 9 \left[ \frac{z^2 x^3}{3} - \frac{z x^4}{2} + \frac{x^5}{5} \right]_{z-1}^{1}$$

Final result is:

$$f_Z(z) = \begin{cases} \dfrac{3}{10} z^5, & 0 \le z \le 1 \\[1ex] 9 \left( \dfrac{z^2}{3} - \dfrac{z}{2} + \dfrac{1}{5} - \dfrac{z^2 (z-1)^3}{3} + \dfrac{z (z-1)^4}{2} - \dfrac{(z-1)^5}{5} \right), & 1 \le z \le 2 \\[1ex] 0, & \text{otherwise} \end{cases}$$
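
This result can be checked by simulation, assuming the parabolic density $f_X(x) = 3x^2$ on $[0, 1]$ used above (sampling by the inverse CDF, $X = U^{1/3}$):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000

# Inverse-CDF sampling: F(x) = x^3 on [0,1], so X = U^(1/3).
x1 = rng.uniform(0, 1, n) ** (1 / 3)
x2 = rng.uniform(0, 1, n) ** (1 / 3)
z = x1 + x2

def f_Z(z0):
    """Piecewise PDF derived above (assumes f_X(x) = 3x^2 on [0,1])."""
    if 0 <= z0 <= 1:
        return 0.3 * z0**5
    if 1 < z0 <= 2:
        w = z0 - 1
        return 9 * (z0**2/3 - z0/2 + 1/5 - z0**2*w**3/3 + z0*w**4/2 - w**5/5)
    return 0.0

for z0 in [0.5, 1.0, 1.6]:
    mc = np.mean(np.abs(z - z0) < 0.01) / 0.02   # histogram density estimate
    print(f"z={z0}: formula {f_Z(z0):.4f}, simulation {mc:.4f}")
```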


03 Theory - extra

Videos by 3Blue1Brown:

Theory 3

Convolution

The convolution of two continuous functions $f$ and $g$ is the function $f * g$ defined by:

$$(f * g)(z) = \int_{-\infty}^{\infty} f(x)\, g(z - x)\, dx$$


For more example calculations, look at 9.6.1 and 9.6.2 at this page.
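
The discrete analogue is worth keeping in mind: the PMF of a sum of independent discrete RVs is the discrete convolution of the PMFs, which is exactly what `np.convolve` computes. For instance, for two fair dice:

```python
import numpy as np

die = np.ones(6) / 6              # PMF of one fair die (faces 1..6)
two_dice = np.convolve(die, die)  # discrete convolution: PMF of the sum (2..12)
print(np.round(two_dice, 4))      # 1/36, 2/36, ..., 6/36, ..., 1/36
```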

Applications of convolution

  • Convolutional neural networks (machine learning theory: translation invariant NN, low pre-processing)
  • Image processing: edge detection, blurring
  • Signal processing: smoothing and interpolation
  • Electronics: linear time-invariant (LTI) system response: convolution with the impulse response

Extra - Convolution

Geometric meaning of convolution

Convolution does not have a neat and precise geometric meaning, but it does have an imprecise intuitive sense.

The product of two quantities tends to be large when both quantities are large; when one of them is small or zero, the product will be small or zero. This behavior is different from the behavior of a sum, where one summand being large is sufficient for the sum to be large. A large summand overrides a small co-summand, whereas a large factor is scaled down by a small cofactor.

The upshot is that a convolution will be large when two functions have similar overall shape. (Caveat: one function must be flipped in a vertical mirror before the overlay is considered.) The argument value where the convolution is largest will correspond to the horizontal offset needed to get the closest overlay of the functions.
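
This can be seen numerically: the convolution of two bumps peaks at the offset that best aligns them. A small sketch with Gaussian bumps centered at $+2$ and $-1$:

```python
import numpy as np

t = np.linspace(-5, 5, 1001)
dt = t[1] - t[0]
f = np.exp(-(t - 2.0) ** 2)            # bump centered at +2
g = np.exp(-(t + 1.0) ** 2)            # bump centered at -1

conv = np.convolve(f, g) * dt          # numerical (f * g)
z = np.linspace(2 * t[0], 2 * t[-1], len(conv))  # argument grid of f * g
print(z[np.argmax(conv)])              # ~ 1.0 = 2 + (-1): peak at best-overlay offset
```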

Algebraic properties of convolution

$$f * g = g * f \qquad (f * g) * h = f * (g * h) \qquad f * (g + h) = f * g + f * h$$

$$(c f) * g = c\, (f * g) = f * (c g) \qquad \frac{d}{dz}(f * g) = \frac{df}{dz} * g = f * \frac{dg}{dz}$$

The last of these is not the typical Leibniz rule for derivatives of products!

All of these properties can be checked by simple calculations with iterated integrals.
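
For instance, the derivative rule can be checked by finite differences. A rough numerical sketch (the Gaussian-type bumps are arbitrary choices; they decay fast enough that boundary effects are negligible):

```python
import numpy as np

t = np.linspace(-8, 8, 4001)
dt = t[1] - t[0]
f = np.exp(-t**2)
g = np.exp(-((t - 1.0) ** 2) / 2)

lhs = np.gradient(np.convolve(f, g, mode="same") * dt, dt)  # (f * g)'
rhs = np.convolve(np.gradient(f, dt), g, mode="same") * dt  # f' * g
print(np.max(np.abs(lhs - rhs)))  # ~ 0 (up to discretization error)
```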

Convolution in more variables

Given $f, g : \mathbb{R}^n \to \mathbb{R}$, their convolution at $\mathbf{z} \in \mathbb{R}^n$ is defined by integrating the shifted products over the whole domain:

$$(f * g)(\mathbf{z}) = \int_{\mathbb{R}^n} f(\mathbf{x})\, g(\mathbf{z} - \mathbf{x})\, d\mathbf{x}$$


04 Illustration

Exercise - Convolution practice

Convolution practice

Suppose $X$ is an RV with a given density $f_X(x)$.

Suppose $Y$ is uniform on $[0, 1]$ and independent of $X$.

Find the PDF of $Z = X + Y$. Sketch the graph of this PDF.
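
One way to check an answer (for whatever density $f_X$ the exercise specifies) is numerical convolution. In the sketch below, `f_X` is a hypothetical stand-in, an Exp(1) density, and should be replaced by the actual one:

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(-2, 8, 5001)
dx = x[1] - x[0]
f_X = np.where(x >= 0, np.exp(-x), 0.0)       # HYPOTHETICAL stand-in density
f_Y = ((0 <= x) & (x <= 1)).astype(float)     # Uniform(0,1) PDF

f_Z = np.convolve(f_X, f_Y) * dx              # numerical convolution
z = np.linspace(2 * x[0], 2 * x[-1], len(f_Z))  # argument grid of the convolution
plt.plot(z, f_Z)
plt.xlabel("z"); plt.ylabel("f_Z(z)")
plt.show()
```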


05 Theory

Theory 6

Recall that in a Poisson process:

  • $\text{Exp}(\lambda)$ measures continuous wait time until one arrival
  • $\text{Erlang}(r, \lambda)$ measures continuous wait time until the $r$-th arrival

Since the wait times between arrivals are independent, we expect that a sum of independent $\text{Exp}(\lambda)$ variables has an Erlang distribution. This is true!

Erlang sum rule

Specify a given Poisson process with arrival rate $\lambda$. Suppose that:

  • $X \sim \text{Erlang}(r, \lambda)$ for any $r$
  • $Y \sim \text{Erlang}(s, \lambda)$ for any $s$
  • $X$ and $Y$ are independent

Then:

$$X + Y \sim \text{Erlang}(r + s,\ \lambda)$$

Exp plus Exp is Erlang

Recall that $\text{Exp}(\lambda) = \text{Erlang}(1, \lambda)$.

So we could say:

$$\text{Exp}(\lambda) + \text{Exp}(\lambda) \sim \text{Erlang}(2, \lambda) \quad \text{(independent summands)}$$

And:

$$\text{Erlang}(r, \lambda) + \text{Exp}(\lambda) \sim \text{Erlang}(r + 1, \lambda) \quad \text{(independent summands)}$$
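
A simulation is consistent with these rules. A minimal sketch, taking $\lambda = 3$ as an arbitrary rate and comparing a histogram estimate against the $\text{Erlang}(2, \lambda)$ density $\lambda^2 z e^{-\lambda z}$:

```python
import numpy as np

rng = np.random.default_rng(2)
lam, n = 3.0, 200_000

# Sum of two independent Exp(lam) wait times (numpy uses scale = 1/lam).
z = rng.exponential(1 / lam, n) + rng.exponential(1 / lam, n)

for z0 in [0.2, 0.5, 1.0]:
    mc = np.mean(np.abs(z - z0) < 0.005) / 0.01     # density estimate at z0
    erlang = lam**2 * z0 * np.exp(-lam * z0)        # Erlang(2, lam) PDF
    print(f"z={z0}: simulation {mc:.3f}, Erlang {erlang:.3f}")
```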


06 Illustration

Example - Exp plus Exp equals Erlang

Exp plus Exp equals Erlang

Let us verify this formula by direct calculation:

Solution

Let $X, Y \sim \text{Exp}(\lambda)$ be independent RVs.

Therefore:

$$f_X(x) = \lambda e^{-\lambda x} \ \ (x \ge 0) \qquad f_Y(y) = \lambda e^{-\lambda y} \ \ (y \ge 0)$$

Now compute the convolution, assuming $z \ge 0$:

$$f_Z(z) = \int_0^z f_X(x)\, f_Y(z - x)\, dx = \int_0^z \lambda e^{-\lambda x}\, \lambda e^{-\lambda (z - x)}\, dx = \lambda^2 e^{-\lambda z} \int_0^z dx = \lambda^2 z\, e^{-\lambda z}$$

This is the Erlang PDF:

$$f_Z(z) = \lambda^2 z\, e^{-\lambda z} \quad (z \ge 0), \qquad \text{i.e. } Z \sim \text{Erlang}(2, \lambda)$$


Exercise - Erlang induction step

Erlang induction step

By direct computation with PDFs and convolution, derive the formula:

$$\text{Erlang}(r, \lambda) + \text{Exp}(\lambda) \sim \text{Erlang}(r + 1, \lambda) \quad \text{(independent summands)}$$

Observation: By repeatedly applying the above formula, we see that a sum of $r$ independent $\text{Exp}(\lambda)$ variables is $\text{Erlang}(r, \lambda)$.


Expectation for two variables

07 Theory

Theory 1

Expectation for a function of two variables

Discrete case:

$$E[g(X, Y)] = \sum_x \sum_y g(x, y)\, p_{X,Y}(x, y)$$

Continuous case:

$$E[g(X, Y)] = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} g(x, y)\, f_{X,Y}(x, y)\, dx\, dy$$

These formulas are not trivial to prove, and we omit the proofs. (Recall the technical nature of the proof we gave for $E[g(X)]$ in the discrete case.)

Expectation sum rule

Suppose $X$ and $Y$ are any two random variables on the same probability model.

Then:

$$E[X + Y] = E[X] + E[Y]$$

We already know that expectation is linear in a single variable: $E[aX + b] = a\, E[X] + b$.

Therefore this two-variable formula implies:

$$E[aX + bY] = a\, E[X] + b\, E[Y]$$

Expectation product rule: independence

Suppose that $X$ and $Y$ are independent.

Then we have:

$$E[XY] = E[X]\, E[Y]$$
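
These two rules behave very differently with respect to independence, and a simulation makes the contrast vivid: the sum rule holds even for strongly dependent variables, while the product rule genuinely requires independence. A minimal sketch (the particular distributions are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 1_000_000

x = rng.normal(1.0, 1.0, n)
y = x**2 + rng.normal(0.0, 0.5, n)               # y is dependent on x

# Sum rule: no independence needed.
print(np.mean(x + y), np.mean(x) + np.mean(y))   # equal

# Product rule: fails for dependent pairs, holds for independent ones.
print(np.mean(x * y), np.mean(x) * np.mean(y))   # differ
u = rng.normal(1.0, 1.0, n)                      # independent of x
print(np.mean(x * u), np.mean(x) * np.mean(u))   # agree (up to noise)
```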

Extra - Proof: Expectation sum rule, continuous case

Suppose $f_X$ and $f_Y$ give marginal PDFs for $X$ and $Y$, and $f_{X,Y}$ gives their joint PDF.

Then:

$$E[X + Y] = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} (x + y)\, f_{X,Y}(x, y)\, dx\, dy = \int_{-\infty}^{\infty} x\, f_X(x)\, dx + \int_{-\infty}^{\infty} y\, f_Y(y)\, dy = E[X] + E[Y]$$

where the middle step splits the integral and integrates out the other variable to obtain each marginal.

Observe that this calculation relies on the formula for $E[g(X, Y)]$, specifically with $g(x, y) = x + y$.

Extra - Proof: Expectation product rule

By independence, $f_{X,Y}(x, y) = f_X(x)\, f_Y(y)$, so the double integral factors:

$$E[XY] = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} x y\, f_X(x)\, f_Y(y)\, dx\, dy = \left( \int_{-\infty}^{\infty} x\, f_X(x)\, dx \right) \left( \int_{-\infty}^{\infty} y\, f_Y(y)\, dy \right) = E[X]\, E[Y]$$


08 Illustration

Example - $E[X^2 + Y]$ from joint PMF chart

Expectation of X squared plus Y from joint PMF chart

Suppose the joint PMF of $X$ and $Y$ is given by this chart:

|          | $X = 1$ | $X = 2$ |
|----------|---------|---------|
| $Y = -1$ | 0.2     | 0.2     |
| $Y = 0$  | 0.35    | 0.1     |
| $Y = 1$  | 0.05    | 0.1     |

Define $W = X^2 + Y$. Find the expectation $E[W]$.

Solution

First compute the values of $X^2 + Y$ for each pair in the chart:

|          | $X = 1$ | $X = 2$ |
|----------|---------|---------|
| $Y = -1$ | 0       | 3       |
| $Y = 0$  | 1       | 4       |
| $Y = 1$  | 2       | 5       |

Now take the sum, weighted by probabilities:

$$E[X^2 + Y] = 0(0.2) + 3(0.2) + 1(0.35) + 4(0.1) + 2(0.05) + 5(0.1) = 1.95$$
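
The same weighted sum, by direct enumeration in Python (using the chart as laid out above; the $(x, y)$ keys are just an encoding of the cells):

```python
# E[X^2 + Y] by enumerating the joint PMF chart.
pmf = {(1, -1): 0.20, (2, -1): 0.20,
       (1,  0): 0.35, (2,  0): 0.10,
       (1,  1): 0.05, (2,  1): 0.10}
print(sum(p * (x**2 + y) for (x, y), p in pmf.items()))  # 1.95
```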


Exercise - Understanding expectation for two variables

Understanding expectation for two variables

Suppose you know only the values of $E[X]$ and $E[Y]$.

Which of the following can you calculate?


$E[Y]$ two ways, and $E[XY]$, from joint density

Expectation of Y two ways and Expectation of XY from joint density

Suppose $X$ and $Y$ are random variables with a given joint density $f_{X,Y}(x, y)$.

(a) Compute $E[Y]$ using two methods.

(b) Compute $E[XY]$.

Solution

(a)

(1) Method One: via the marginal PDF $f_Y$:

$$f_Y(y) = \int_{-\infty}^{\infty} f_{X,Y}(x, y)\, dx$$

Then expectation:

$$E[Y] = \int_{-\infty}^{\infty} y\, f_Y(y)\, dy$$

(2) Method Two: directly, via the two-variable formula:

$$E[Y] = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} y\, f_{X,Y}(x, y)\, dx\, dy$$

(b) Directly, via the two-variable formula:

$$E[XY] = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} x y\, f_{X,Y}(x, y)\, dx\, dy$$
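
Both methods are easy to mirror numerically. In the sketch below, the joint density $f(x, y) = x + y$ on $[0, 1]^2$ is a hypothetical stand-in, not the density from this example; for it, $E[Y] = 7/12 \approx 0.583$ and $E[XY] = 1/3$.

```python
import numpy as np

x = np.linspace(0, 1, 1001)
y = np.linspace(0, 1, 1001)
dx = dy = x[1] - x[0]
X, Y = np.meshgrid(x, y, indexing="ij")
f = X + Y                              # HYPOTHETICAL joint density on [0,1]^2

f_Y = f.sum(axis=0) * dx               # marginal PDF of Y
print(np.sum(y * f_Y) * dy)            # Method One:  E[Y]  ~ 7/12
print(np.sum(Y * f) * dx * dy)         # Method Two:  E[Y]  ~ 7/12
print(np.sum(X * Y * f) * dx * dy)     # (b):         E[XY] ~ 1/3
```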


Covariance and correlation

09 Theory

Theory 1

Write $\mu_X = E[X]$ and $\mu_Y = E[Y]$.

Observe that the random variables $X - \mu_X$ and $Y - \mu_Y$ are “centered at zero,” meaning that $E[X - \mu_X] = 0$ and $E[Y - \mu_Y] = 0$.

Covariance

Suppose $X$ and $Y$ are any two random variables on a probability model. The covariance of $X$ and $Y$ measures the typical synchronous deviation of $X$ and $Y$ from their respective means.

Then the defining formula for the covariance of $X$ and $Y$ is:

$$\text{Cov}(X, Y) = E[(X - \mu_X)(Y - \mu_Y)]$$

There is also a shorter formula:

$$\text{Cov}(X, Y) = E[XY] - E[X]\, E[Y]$$

To derive the shorter formula, first expand the product and then apply linearity:

$$E[(X - \mu_X)(Y - \mu_Y)] = E[XY] - \mu_X E[Y] - \mu_Y E[X] + \mu_X \mu_Y = E[XY] - E[X]\, E[Y]$$

Notice that covariance is always symmetric:

$$\text{Cov}(X, Y) = \text{Cov}(Y, X)$$

The self covariance equals the variance:

$$\text{Cov}(X, X) = E[(X - \mu_X)^2] = \text{Var}(X)$$

The sign of $\text{Cov}(X, Y)$ reveals the correlation type between $X$ and $Y$:

| Correlation           | Sign                     |
|-----------------------|--------------------------|
| Positively correlated | $\text{Cov}(X, Y) > 0$   |
| Negatively correlated | $\text{Cov}(X, Y) < 0$   |
| Uncorrelated          | $\text{Cov}(X, Y) = 0$   |

Correlation coefficient

Suppose $X$ and $Y$ are any two random variables on a probability model.

Their correlation coefficient is a rescaled version of covariance that measures the synchronicity of deviations:

$$\rho(X, Y) = \frac{\text{Cov}(X, Y)}{\sigma_X\, \sigma_Y}$$

The rescaling ensures:

$$-1 \le \rho(X, Y) \le 1$$

Covariance depends on the separate variances of $X$ and $Y$ as well as their relationship.

The correlation coefficient, because we have divided out $\sigma_X \sigma_Y$, depends only on their relationship.
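
Here is a small sample-based sketch of the contrast; `np.cov` and `np.corrcoef` compute the sample versions of covariance and correlation:

```python
import numpy as np

rng = np.random.default_rng(4)
x = rng.normal(0, 1, 100_000)
y = 2 * x + rng.normal(0, 1, 100_000)       # positively correlated with x

print(np.cov(x, y)[0, 1])                   # Cov(X, Y)  ~ 2
print(np.cov(x, 10 * y + 3)[0, 1])          # scaling multiplies covariance: ~ 20
print(np.corrcoef(x, y)[0, 1])              # rho ~ 2/sqrt(5) ~ 0.894
print(np.corrcoef(x, 10 * y + 3)[0, 1])     # rho unchanged by shift and scale
```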


10 Illustration

Covariance from PMF chart

Covariance from PMF chart

Suppose the joint PMF of $X$ and $Y$ is given by this chart:

|          | $X = 1$ | $X = 2$ |
|----------|---------|---------|
| $Y = -1$ | 0.2     | 0.2     |
| $Y = 0$  | 0.35    | 0.1     |
| $Y = 1$  | 0.05    | 0.1     |

Find $\text{Cov}(X, Y)$.

Solution

We need $E[XY]$ and $E[X]$ and $E[Y]$.

$$E[X] = 1(0.6) + 2(0.4) = 1.4 \qquad E[Y] = -1(0.4) + 0(0.45) + 1(0.15) = -0.25$$

$$E[XY] = (1)(-1)(0.2) + (2)(-1)(0.2) + (1)(1)(0.05) + (2)(1)(0.1) = -0.35$$

Therefore:

$$\text{Cov}(X, Y) = E[XY] - E[X]\, E[Y] = -0.35 - (1.4)(-0.25) = 0$$
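
The same result by enumeration (with the chart as laid out above):

```python
# Cov(X, Y) by enumerating the joint PMF chart.
pmf = {(1, -1): 0.20, (2, -1): 0.20,
       (1,  0): 0.35, (2,  0): 0.10,
       (1,  1): 0.05, (2,  1): 0.10}
EX  = sum(p * x     for (x, y), p in pmf.items())   # 1.4
EY  = sum(p * y     for (x, y), p in pmf.items())   # -0.25
EXY = sum(p * x * y for (x, y), p in pmf.items())   # -0.35
print(EXY - EX * EY)                                # 0.0: uncorrelated
```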


11 Theory

Theory 2

Covariance bilinearity

Given any three random variables $X$, $Y$, and $Z$, we have:

$$\text{Cov}(X + Y,\, Z) = \text{Cov}(X, Z) + \text{Cov}(Y, Z) \qquad \text{Cov}(X,\, Y + Z) = \text{Cov}(X, Y) + \text{Cov}(X, Z)$$

Covariance and correlation: shift and scale

Covariance scales with each input, and ignores shifts:

$$\text{Cov}(aX + b,\, cY + d) = ac\, \text{Cov}(X, Y)$$

Whereas correlation ignores shifts entirely, and scaling affects only its sign:

$$\rho(aX + b,\, cY + d) = \text{sign}(ac)\, \rho(X, Y)$$

Extra - Proof of covariance bilinearity

$$\text{Cov}(X + Y, Z) = E[(X + Y)Z] - E[X + Y]\, E[Z] = \big( E[XZ] - E[X] E[Z] \big) + \big( E[YZ] - E[Y] E[Z] \big) = \text{Cov}(X, Z) + \text{Cov}(Y, Z)$$

Extra - Proof of covariance shift and scale rule

$$\text{Cov}(aX + b,\, cY + d) = E\big[ (aX + b - (a\mu_X + b))\, (cY + d - (c\mu_Y + d)) \big] = ac\, E[(X - \mu_X)(Y - \mu_Y)] = ac\, \text{Cov}(X, Y)$$


Independence implies zero covariance

Suppose that $X$ and $Y$ are any two random variables on a probability model.

If $X$ and $Y$ are independent, then:

$$\text{Cov}(X, Y) = 0$$

Proof:

We know both of these:

$$\text{Cov}(X, Y) = E[XY] - E[X]\, E[Y] \qquad \text{(shorter formula)}$$

$$E[XY] = E[X]\, E[Y] \qquad \text{(product rule, by independence)}$$

But $E[XY] = E[X]\, E[Y]$, so those terms cancel and $\text{Cov}(X, Y) = 0$.

Sum rule for variance

Suppose that $X$ and $Y$ are any two random variables on a probability space.

Then:

$$\text{Var}(X + Y) = \text{Var}(X) + \text{Var}(Y) + 2\, \text{Cov}(X, Y)$$

When $X$ and $Y$ are independent:

$$\text{Var}(X + Y) = \text{Var}(X) + \text{Var}(Y)$$
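
A quick simulated example, with an arbitrarily chosen dependent pair, shows why the covariance term is needed:

```python
import numpy as np

rng = np.random.default_rng(5)
x = rng.normal(0, 1, 1_000_000)
y = x + rng.normal(0, 1, 1_000_000)     # Cov(X, Y) ~ 1, clearly not independent

lhs = np.var(x + y)
rhs = np.var(x) + np.var(y) + 2 * np.cov(x, y)[0, 1]
print(lhs, rhs)                         # both ~ 5; Var(x) + Var(y) alone gives ~ 3
```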

Extra - Proof: Sum rule for variance

$$\text{Var}(X + Y) = E\big[ ((X - \mu_X) + (Y - \mu_Y))^2 \big] = E[(X - \mu_X)^2] + 2\, E[(X - \mu_X)(Y - \mu_Y)] + E[(Y - \mu_Y)^2]$$

which is $\text{Var}(X) + 2\, \text{Cov}(X, Y) + \text{Var}(Y)$.

Extra - Proof that $-1 \le \rho(X, Y) \le 1$

(1) Create standardizations:

$$X^* = \frac{X - \mu_X}{\sigma_X} \qquad Y^* = \frac{Y - \mu_Y}{\sigma_Y}$$

Now $X^*$ and $Y^*$ satisfy:

$$E[X^*] = E[Y^*] = 0 \qquad \text{Var}(X^*) = \text{Var}(Y^*) = 1$$

Observe that $\text{Var}(W) \ge 0$ for any $W$. Variance can’t be negative.


(2) Apply the variance sum rule.

Apply to $X^*$ and $Y^*$:

$$0 \le \text{Var}(X^* + Y^*) = \text{Var}(X^*) + \text{Var}(Y^*) + 2\, \text{Cov}(X^*, Y^*)$$

Simplify:

$$0 \le 1 + 1 + 2\, \text{Cov}(X^*, Y^*)$$

Notice the effect of standardization:

$$\text{Cov}(X^*, Y^*) = \frac{\text{Cov}(X, Y)}{\sigma_X\, \sigma_Y} = \rho(X, Y)$$

Therefore $\rho(X, Y) \ge -1$.


(3) Modify and reapply the variance sum rule.

Change $Y^*$ to $-Y^*$:

$$0 \le \text{Var}(X^* - Y^*) = \text{Var}(X^*) + \text{Var}(Y^*) - 2\, \text{Cov}(X^*, Y^*)$$

Simplify:

$$0 \le 2 - 2\, \rho(X, Y) \qquad \Rightarrow \qquad \rho(X, Y) \le 1$$

12 Illustration

Variance of sum of indicators

Variance of sum of indicators

An urn contains 3 red balls and 2 yellow balls.

Suppose 2 balls are drawn without replacement, and $X$ counts the number of red balls drawn.

Find $\text{Var}(X)$.

Solution

Let $X_1$ indicate (one or zero) whether the first ball is red, and $X_2$ indicate whether the second ball is red, so $X = X_1 + X_2$.

Then $X_1 X_2$ indicates whether both drawn balls are red; so it is Bernoulli with success probability $\frac{3}{5} \cdot \frac{2}{4} = \frac{3}{10}$. Therefore $E[X_1 X_2] = \frac{3}{10}$.

We also have $E[X_1] = E[X_2] = \frac{3}{5}$, so $\text{Var}(X_1) = \text{Var}(X_2) = \frac{3}{5} \cdot \frac{2}{5} = \frac{6}{25}$ and

$$\text{Cov}(X_1, X_2) = E[X_1 X_2] - E[X_1]\, E[X_2] = \frac{3}{10} - \frac{9}{25} = -\frac{3}{50}$$

The variance sum rule gives:

$$\text{Var}(X) = \text{Var}(X_1) + \text{Var}(X_2) + 2\, \text{Cov}(X_1, X_2) = \frac{6}{25} + \frac{6}{25} - \frac{6}{50} = \frac{9}{25}$$
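
A simulation of the urn agrees with this value. A minimal sketch, drawing without replacement with `rng.choice`:

```python
import numpy as np

rng = np.random.default_rng(6)
urn = np.array([1, 1, 1, 0, 0])   # 3 red balls (1), 2 yellow balls (0)

# X = number of reds in 2 draws without replacement.
X = np.array([rng.choice(urn, size=2, replace=False).sum() for _ in range(50_000)])
print(X.var())                    # ~ 9/25 = 0.36
```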


Exercise - Covariance rules

Covariance rules

Simplify:


Exercise - Independent variables are uncorrelated

Independent variables are uncorrelated

Let $X$ be given with possible values $\{-1, 0, 1\}$ and PMF given (uniformly) by $p_X(x) = \frac{1}{3}$ for all three possible $x$. Let $Y = X^2$.

Show that $X$ and $Y$ are dependent but uncorrelated.

Hint: To speed the calculation, notice that $XY = X^3 = X$.
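
Assuming the values above, the whole check fits in a short enumeration:

```python
# X uniform on {-1, 0, 1}, Y = X^2.
vals = [-1, 0, 1]
EX  = sum(x / 3 for x in vals)        # 0
EY  = sum(x**2 / 3 for x in vals)     # 2/3
EXY = sum(x**3 / 3 for x in vals)     # 0, since x^3 = x and E[X] = 0
print(EXY - EX * EY)                  # 0.0: uncorrelated
# Yet dependent: P(Y = 0 | X = 0) = 1, while P(Y = 0) = 1/3.
```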
