Probability
Home

Convolution

01 Theory

Theory 1

THEOREM: Continuous PDF of a sum

Let $f_{X, Y} (x, y)$ be any joint continuous PDF.

Suppose $W = X + Y$ . Then:
$f_{W} (w) = \int_{- \infty}^{+ \infty} f_{X, Y} (u, w - u) d u$
When $X$ and $Y$ are independent, so $f_{X, Y} = f_{X} f_{Y}$ , this becomes convolution:
$f_{W} (w) = (f_{X} * f_{Y}) (w) = \int_{- \infty}^{+ \infty} f_{X} (u) f_{Y} (w - u) d u$
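
As a quick numerical check of the independent case, here is a minimal sketch that approximates the convolution integral with a trapezoid rule. The choice of two independent Uniform(0, 1) variables, the helper names, and the integration grid are assumptions made for illustration; they are not part of the theorem. The sum of two such variables is known to have the triangular density $1 - |w - 1|$ on $[0, 2]$ , which gives something to compare against.

```python
import numpy as np

def uniform_pdf(x):
    """PDF of Uniform(0, 1), evaluated elementwise (illustrative choice)."""
    x = np.asarray(x, dtype=float)
    return np.where((x >= 0.0) & (x <= 1.0), 1.0, 0.0)

def pdf_of_sum(f_x, f_y, w, lo=-5.0, hi=5.0, n=20001):
    """Approximate f_W(w) = integral of f_X(u) f_Y(w - u) du by the trapezoid rule.

    The finite range [lo, hi] stands in for the whole real line, so it must
    contain the region where the integrand is nonzero.
    """
    u = np.linspace(lo, hi, n)
    vals = f_x(u) * f_y(w - u)
    du = u[1] - u[0]
    return float(np.sum(0.5 * (vals[:-1] + vals[1:])) * du)

def triangular_pdf(w):
    """Known density of the sum of two independent Uniform(0, 1) variables."""
    return max(0.0, 1.0 - abs(w - 1.0))

for w in (0.25, 0.5, 1.0, 1.5, 2.5):
    approx = pdf_of_sum(uniform_pdf, uniform_pdf, w)
    print(f"w = {w:4.2f}  numeric = {approx:.4f}  exact = {triangular_pdf(w):.4f}")
```

The two columns should agree up to the grid spacing, since the uniform density has jumps that a trapezoid rule only resolves to that accuracy.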

Extra - Derivation of $X + Y$ PDF

The CDF of $X + Y$ :
$F_{X + Y} (w) = P [X + Y \leq w] = \iint_{x + y \leq w} f_{X, Y} (x, y) d x d y$
Find $f_{X + Y}$ by differentiating:
$f_{X + Y} (w) = \frac{d}{d w} F_{X + Y} (w) = \frac{d}{d w} \iint_{x + y \leq w} f_{X, Y} (x, y) d x d y$
To calculate this derivative, change variables by setting $u = x$ and $s = x + y$ . The Jacobian is 1, so $d x d y$ becomes $d u d s$ , and the region $x + y \leq w$ becomes $s \leq w$ . The fundamental theorem of calculus then removes the outer integral:
$f_{X + Y} (w) = \frac{d}{d w} \int_{- \infty}^{w} \int_{- \infty}^{+ \infty} f_{X, Y} (u, s - u) d u d s = \int_{- \infty}^{+ \infty} f_{X, Y} (u, w - u) d u$
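
The general formula does not require independence. The sketch below tries it on a hypothetical dependent joint PDF, $f_{X, Y} (x, y) = x + y$ on the unit square (an assumption chosen for illustration, not taken from these notes). For this density the integral along the line $x + y = w$ works out by hand to $w^{2}$ for $w \in [0, 1]$ and $w (2 - w)$ for $w \in [1, 2]$ , and the code compares that with a trapezoid-rule approximation.

```python
import numpy as np

def joint_pdf(x, y):
    """Hypothetical dependent joint PDF: f(x, y) = x + y on the unit square."""
    inside = (x >= 0.0) & (x <= 1.0) & (y >= 0.0) & (y <= 1.0)
    return np.where(inside, x + y, 0.0)

def pdf_of_sum_joint(w, n=20001):
    """Trapezoid-rule approximation of the integral of f(u, w - u) over u.

    The joint PDF vanishes unless 0 <= u <= 1, so integrating over [0, 1]
    is the same as integrating over the whole real line.
    """
    u = np.linspace(0.0, 1.0, n)
    vals = joint_pdf(u, w - u)
    du = u[1] - u[0]
    return float(np.sum(0.5 * (vals[:-1] + vals[1:])) * du)

def pdf_of_sum_exact(w):
    """Closed form worked out by hand for this particular joint PDF."""
    if 0.0 <= w <= 1.0:
        return w * w
    if 1.0 < w <= 2.0:
        return w * (2.0 - w)
    return 0.0

for w in (0.3, 1.0, 1.6, 2.4):
    print(f"w = {w:3.1f}  numeric = {pdf_of_sum_joint(w):.4f}  exact = {pdf_of_sum_exact(w):.4f}")
```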

02 Illustration

Example - Sum of parabolic random variables

Suppose $X$ is an RV with PDF given by:
$f_{X} (x) = \begin{cases} \frac{3}{4} (1 - x^{2}) & x \in [- 1,1] \\ 0 & \text{otherwise} \end{cases}$
Let $Y$ be an independent copy of $X$ . So $f_{Y} = f_{X}$ , but $Y$ is independent of $X$ .

Find the PDF of $X + Y$ .

Solution

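The graph of $f_{X} (w - x)$ , viewed as a function of $x$ , matches the graph of $f_{X} (x)$ except (i) flipped in a vertical mirror, (ii) shifted by $w$ to the right (that is, to the left when $w < 0$ ). Since $f_{X}$ is even, the flip changes nothing here, and $f_{X} (w - x)$ vanishes outside $x \in [w - 1, w + 1]$ .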

When $w \in [- 2,0]$ , the integrand is nonzero only for $x \in [- 1, w + 1]$ :
$\begin{aligned} f_{X + Y} (w) & = \left( \frac{3}{4} \right)^{2} \int_{- 1}^{w + 1} (1 - (w - x)^{2}) (1 - x^{2}) d x \\ & = \frac{9}{16} \left( \frac{w^{5}}{30} - \frac{2 w^{3}}{3} - \frac{4 w^{2}}{3} + \frac{16}{15} \right) \end{aligned}$
When $w \in [0, + 2]$ , the integrand is nonzero only for $x \in [w - 1, + 1]$ :
$\begin{aligned} f_{X + Y} (w) & = \left( \frac{3}{4} \right)^{2} \int_{w - 1}^{+ 1} (1 - (w - x)^{2}) (1 - x^{2}) d x \\ & = \frac{9}{16} \left( - \frac{w^{5}}{30} + \frac{2 w^{3}}{3} - \frac{4 w^{2}}{3} + \frac{16}{15} \right) \end{aligned}$
Final result is:
$f_{X + Y} (w) = \begin{cases} \frac{9}{16} \left( \frac{w^{5}}{30} - \frac{2 w^{3}}{3} - \frac{4 w^{2}}{3} + \frac{16}{15} \right) & w \in [- 2,0] \\ \frac{9}{16} \left( - \frac{w^{5}}{30} + \frac{2 w^{3}}{3} - \frac{4 w^{2}}{3} + \frac{16}{15} \right) & w \in [0,2] \\ 0 & \text{otherwise} \end{cases}$
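
The piecewise polynomial can be sanity-checked numerically. The sketch below (function names and grid size are illustrative assumptions) evaluates the convolution integral with a trapezoid rule and compares it with the closed form at a few values of $w$ .

```python
import numpy as np

def f_x(x):
    """Parabolic PDF: (3/4)(1 - x^2) on [-1, 1], zero elsewhere."""
    x = np.asarray(x, dtype=float)
    return np.where(np.abs(x) <= 1.0, 0.75 * (1.0 - x**2), 0.0)

def f_sum_closed_form(w):
    """Piecewise closed form for the PDF of X + Y from the solution."""
    if w < -2.0 or w > 2.0:
        return 0.0
    s = 1.0 if w <= 0.0 else -1.0  # the odd-power terms change sign at w = 0
    return (9.0 / 16.0) * (s * w**5 / 30.0 - s * 2.0 * w**3 / 3.0
                           - 4.0 * w**2 / 3.0 + 16.0 / 15.0)

def f_sum_numeric(w, n=40001):
    """Trapezoid-rule approximation of the convolution integral.

    f_x vanishes outside [-1, 1], so integrating over that interval suffices.
    """
    x = np.linspace(-1.0, 1.0, n)
    vals = f_x(x) * f_x(w - x)
    dx = x[1] - x[0]
    return float(np.sum(0.5 * (vals[:-1] + vals[1:])) * dx)

for w in (-1.5, -0.3, 0.0, 0.7, 1.9, 2.5):
    print(f"w = {w:5.2f}  numeric = {f_sum_numeric(w):.6f}  closed form = {f_sum_closed_form(w):.6f}")
```

Both branches give $\frac{9}{16} \cdot \frac{16}{15} = \frac{3}{5}$ at $w = 0$ , so the density is continuous where the two pieces meet.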
03 Theory - extra

Theory 3

Videos by 3Blue1Brown:

Why X+Y in probability is a beautiful mess

But what is a convolution?

Convolution

The convolution of two continuous functions $f (x)$ and $g (x)$ is defined by:
$(f * g) (x) = \int_{- \infty}^{+ \infty} f (x - t) g (t) d t$

For more example calculations, look at 9.6.1 and 9.6.2 at this page.

Applications of convolution

Convolutional neural networks (machine learning theory: translation invariant NN, low pre-processing)

Image processing: edge detection, blurring

Signal processing: smoothing and interpolation estimation

Electronics: linear translation-invariant (LTI) system response: the output is the convolution of the input with the impulse response
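
To make the smoothing application concrete, here is a minimal sketch that convolves a noisy signal with a moving-average window using `numpy.convolve`. The test signal, noise level, and window length are arbitrary assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

t = np.linspace(0.0, 2.0 * np.pi, 400)
clean = np.sin(t)                                # underlying signal (illustrative)
noisy = clean + 0.3 * rng.normal(size=t.size)    # add Gaussian noise

window = np.ones(15) / 15.0                      # moving-average kernel, sums to 1
smoothed = np.convolve(noisy, window, mode="same")

# Compare errors away from the edges, where the window hangs off the data.
interior = slice(15, -15)
err_noisy = float(np.mean((noisy[interior] - clean[interior]) ** 2))
err_smooth = float(np.mean((smoothed[interior] - clean[interior]) ** 2))
print(f"mean squared error: noisy = {err_noisy:.4f}, smoothed = {err_smooth:.4f}")
```

Averaging over a window trades a little bias for a large reduction in noise variance, so the smoothed error should come out well below the noisy one.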

Extra - Convolution

Geometric meaning of convolution

Convolution does not have a neat and precise geometric meaning, but it does have an imprecise intuitive sense.

The product of two quantities tends to be large when both quantities are large; when one of them is small or zero, the product will be small or zero. This behavior is different from the behavior of a sum, where one summand being large is sufficient for the sum to be large. A large summand overrides a small co-summand, whereas a large factor is scaled down by a small cofactor.

The upshot is that a convolution will be large when two functions have similar overall shape. (Caveat: one function must be flipped in a vertical mirror before the overlay is considered.) The argument value where the convolution is largest will correspond to the horizontal offset needed to get the closest overlay of the functions.
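
The offset interpretation can be seen in a small experiment. This is a sketch under assumptions: two Gaussian-shaped bumps centered at 1 and 2.5 (arbitrary choices), sampled on a common grid, with the integral replaced by a Riemann sum via `numpy.convolve`. Flipping the second bump puts its center at $- 2.5$ , and it must be shifted by $3.5$ to sit on top of the first, so the convolution should peak near $1 + 2.5 = 3.5$ .

```python
import numpy as np

x = np.linspace(-10.0, 10.0, 2001)   # common grid for both functions
dx = x[1] - x[0]

def bump(center, width=0.5):
    """Gaussian-shaped bump sampled on the grid x (not normalized)."""
    return np.exp(-0.5 * ((x - center) / width) ** 2)

f = bump(1.0)
g = bump(2.5)

# Riemann-sum approximation of (f * g).
# Entry k of the full convolution corresponds to the argument 2*x[0] + k*dx.
conv = np.convolve(f, g, mode="full") * dx
args = 2.0 * x[0] + dx * np.arange(conv.size)

peak = float(args[np.argmax(conv)])
print(f"convolution peaks near {peak:.2f}")
```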

Algebraic properties of convolution

$f * g = g * f$

$f * (g * h) = (f * g) * h$

$f * (g + h) = f * g + f * h$

$a (f * g) = (a f) * g = f * (a g)$

$(f * g)' = f' * g = f * g'$

The last of these is not the typical Leibniz rule for derivatives of products!

All of these properties can be checked by simple calculations with iterated integrals.
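
The same identities hold for discrete convolution of finite sequences, which makes them easy to spot-check. The sketch below uses random sequences and `numpy.convolve`; it is a numerical illustration, not a proof, and the derivative rule is replaced by its discrete analogue, where differentiation becomes convolution with the difference kernel `[1, -1]` (an assumption of this sketch).

```python
import numpy as np

rng = np.random.default_rng(0)
f, g, h = (rng.normal(size=8) for _ in range(3))
a = 2.5

conv = np.convolve  # full discrete convolution of two 1-D sequences

def diff(s):
    """Discrete stand-in for the derivative: convolve with [1, -1]."""
    return conv(np.array([1.0, -1.0]), s)

checks = {
    "commutative": np.allclose(conv(f, g), conv(g, f)),
    "associative": np.allclose(conv(f, conv(g, h)), conv(conv(f, g), h)),
    "distributive": np.allclose(conv(f, g + h), conv(f, g) + conv(f, h)),
    "scalar": np.allclose(a * conv(f, g), conv(a * f, g))
              and np.allclose(a * conv(f, g), conv(f, a * g)),
    "difference": np.allclose(diff(conv(f, g)), conv(diff(f), g))
                  and np.allclose(diff(conv(f, g)), conv(f, diff(g))),
}

for name, ok in checks.items():
    print(f"{name:12s} {ok}")
```

The difference rule follows from associativity and commutativity alone, which mirrors why the derivative falls on only one factor of a convolution.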

Convolution in more variables

Given $f, g : \mathbb{R}^{n} \to \mathbb{R}$ , their convolution at $\mathbf{x}$ is defined by integrating the shifted products over the whole domain:
$(f * g) (\mathbf{x}) = \int_{\mathbb{R}^{n}} f (\mathbf{x} - \mathbf{y}) g (\mathbf{y}) d \mathbf{y}$
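
For $n = 2$ the discrete version of this formula is the operation behind image blurring. The following is a minimal sketch using only NumPy: the helper `convolve2d_full`, the one-pixel test image, and the 3x3 box kernel are all assumptions made for illustration (dedicated libraries provide faster routines).

```python
import numpy as np

def convolve2d_full(f, g):
    """Direct 'full' discrete 2D convolution.

    out[i, j] = sum over (k, l) of f[i - k, j - l] * g[k, l],
    with f treated as zero outside its array bounds.
    """
    f = np.asarray(f, dtype=float)
    g = np.asarray(g, dtype=float)
    rows = f.shape[0] + g.shape[0] - 1
    cols = f.shape[1] + g.shape[1] - 1
    out = np.zeros((rows, cols))
    for k in range(g.shape[0]):
        for l in range(g.shape[1]):
            # Each kernel entry contributes a shifted, scaled copy of f.
            out[k:k + f.shape[0], l:l + f.shape[1]] += g[k, l] * f
    return out

image = np.zeros((5, 5))
image[2, 2] = 1.0                      # a single bright pixel
blur = np.full((3, 3), 1.0 / 9.0)      # 3x3 box-blur kernel, sums to 1

blurred = convolve2d_full(image, blur)
print(blurred.shape)
print(np.round(blurred[2:5, 2:5], 3))  # the pixel is spread over a 3x3 patch
```

Because the kernel sums to 1, the total brightness of the image is preserved while the single pixel is spread evenly over its neighbors.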
