Theory 1 - Minimum mean square error
Suppose our problem is to estimate, guess, or predict the value of a random variable $X$. Which value should we choose?
There is no single best answer to this question. The best answer is a function of additional factors in the problem context.
One method is to pick a value where the PMF or PDF of $X$ is maximal, that is, a most likely value.
Another method is to pick the expected value $E[X]$.
For the normal distribution, or any symmetric distribution with a single peak, these are the same value. In general they are not the same value.
Mean square error
Given an estimate $\hat{x}$ for a random variable $X$, the mean square error (MSE) of $\hat{x}$ is:
$$E\big[(X - \hat{x})^2\big]$$
The MSE quantifies the typical (square of the) error, meaning the difference between the true value $X$ and the estimate $\hat{x}$.
Other error measures are reasonable and useful in niche contexts. For example, the mean absolute error $E\big[|X - \hat{x}|\big]$.
In problem contexts where large errors are more costly than small errors (many real problems), the most likely value of $X$ may perform poorly as an estimate.
It turns out that the expected value $E[X]$ is the estimate with minimal MSE.
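The comparison between the most likely value and the expected value can be checked numerically. The following is a minimal sketch, assuming a small made-up PMF; the names `pmf`, `mse`, and `grid` are hypothetical and chosen only for illustration.

```python
# Hypothetical PMF chosen for illustration: value -> probability.
pmf = {0: 0.5, 1: 0.2, 10: 0.3}

def mse(estimate, pmf):
    """Mean square error E[(X - estimate)^2] for a discrete PMF."""
    return sum(p * (x - estimate) ** 2 for x, p in pmf.items())

mode = max(pmf, key=pmf.get)               # most likely value of X
mean = sum(p * x for x, p in pmf.items())  # expected value E[X]

mse_mode = mse(mode, pmf)
mse_mean = mse(mean, pmf)

# Scan a grid of candidate estimates (0.0, 0.1, ..., 10.0)
# and keep the one with the smallest MSE.
grid = [i / 10 for i in range(0, 101)]
best_on_grid = min(grid, key=lambda c: mse(c, pmf))

print(f"mode = {mode}, MSE at mode = {mse_mode:.2f}")
print(f"mean = {mean:.2f}, MSE at mean = {mse_mean:.2f}")
print(f"best estimate on grid = {best_on_grid}")
```

In this example the most likely value sits far from the rare large outcome, so its MSE is noticeably larger than the MSE at the expected value, and the grid search lands on the expected value.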
Minimal mean square error
Given a random variable $X$, its expectation $\hat{x} = E[X]$ provides the estimate with minimal mean square error. The MSE of $\hat{x} = E[X]$ is:
$$E\big[(X - E[X])^2\big] = \mathrm{Var}[X]$$
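Proof that $\hat{x} = E[X]$ gives minimal MSE
Expand the MSE:
$$E\big[(X - \hat{x})^2\big] = E[X^2] - 2\hat{x}\,E[X] + \hat{x}^2$$
Minimize this parabola in $\hat{x}$. Differentiate:
$$\frac{d}{d\hat{x}}\,E\big[(X - \hat{x})^2\big] = -2E[X] + 2\hat{x}$$
Find zeros:
$$-2E[X] + 2\hat{x} = 0 \quad\Longrightarrow\quad \hat{x} = E[X]$$
The coefficient of $\hat{x}^2$ is positive, so this zero is the minimum.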
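The same conclusion can also be reached without calculus. The following sketch, writing $\hat{x}$ for the estimate of $X$, adds and subtracts $E[X]$ inside the square and uses $E\big[X - E[X]\big] = 0$ to eliminate the cross term:

```latex
\begin{aligned}
E\big[(X - \hat{x})^2\big]
  &= E\Big[\big((X - E[X]) + (E[X] - \hat{x})\big)^2\Big] \\
  &= E\big[(X - E[X])^2\big]
     + 2\,(E[X] - \hat{x})\,E\big[X - E[X]\big]
     + (E[X] - \hat{x})^2 \\
  &= \operatorname{Var}[X] + (E[X] - \hat{x})^2
\end{aligned}
```

The second term is nonnegative and vanishes exactly when $\hat{x} = E[X]$, which leaves the minimal value $\operatorname{Var}[X]$.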
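When the estimate $\hat{x}$ is made with no information beyond the distribution of $X$, the minimal MSE estimate is $E[X]$.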
In the presence of additional information, namely that an event $A$ has occurred, the minimal MSE estimate is the conditional expectation $E[X \mid A]$.
The MSE estimate can also be conditioned on another variable, say $Y$.
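Minimal MSE of $X$ given $Y$
The minimal MSE estimate of $X$ given another variable $Y$:
$$\hat{x}_M(y) = E[X \mid Y = y]$$
The error of this estimate is $E\big[(X - \hat{x}_M(y))^2 \mid Y = y\big]$, which equals $\mathrm{Var}[X \mid Y = y]$.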
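A small discrete example may make the conditional estimate concrete. The sketch below assumes a made-up joint PMF for a pair $(X, Y)$; the dictionary `joint` and the helpers `conditional_pmf`, `cond_mean`, and `cond_mse` are hypothetical names used only here.

```python
# Hypothetical joint PMF chosen for illustration: (x, y) -> P(X = x, Y = y).
joint = {
    (0, 0): 0.3, (1, 0): 0.1,
    (1, 1): 0.2, (3, 1): 0.4,
}

def conditional_pmf(joint, y):
    """PMF of X given Y = y, as a dict x -> P(X = x | Y = y)."""
    p_y = sum(p for (_, yy), p in joint.items() if yy == y)
    return {x: p / p_y for (x, yy), p in joint.items() if yy == y}

def cond_mean(joint, y):
    """Conditional expectation E[X | Y = y]."""
    return sum(x * p for x, p in conditional_pmf(joint, y).items())

def cond_mse(joint, y, estimate):
    """Conditional MSE E[(X - estimate)^2 | Y = y]."""
    return sum(p * (x - estimate) ** 2
               for x, p in conditional_pmf(joint, y).items())

for y in (0, 1):
    m = cond_mean(joint, y)
    # Evaluated at the conditional mean, the conditional MSE
    # is the conditional variance Var[X | Y = y].
    print(f"y = {y}: E[X|Y=y] = {m:.4f}, Var[X|Y=y] = {cond_mse(joint, y, m):.4f}")
```

Moving the estimate away from the conditional mean, for instance `cond_mse(joint, 1, 2.0)`, gives a larger conditional MSE than the conditional variance.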
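Notice that the minimal MSE estimate of $X$ given $Y = y$ is a function of $y$, so it defines a random variable $\hat{X}_M(Y) = E[X \mid Y]$.
This variable is a derived variable of $Y$.
The variable $\hat{X}_M(Y)$ has the smallest MSE among all estimates of $X$ that are functions of $Y$.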
Theory 2 - Line of minimal MSE
Linear approximation is very common in applied math.
One could consider the linearization of the minimal MSE estimate $E[X \mid Y = y]$, viewed as a function of $y$.
Instead, one can minimize the MSE over all possible linear functions of $Y$.
Line of minimal MSE
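Let $L$ be the line $x = ay + b$. Let $\hat{X}_L(Y) = aY + b$. The mean square error (MSE) of $L$ is:
$$e_L(a, b) = E\big[(X - (aY + b))^2\big]$$
The linear estimator is the line $L$ with minimal MSE, and it is:
$$\hat{X}_L(Y) = \rho_{X,Y}\,\frac{\sigma_X}{\sigma_Y}\,(Y - \mu_Y) + \mu_X$$
Here $\mu_X, \mu_Y$ are the means, $\sigma_X, \sigma_Y$ the standard deviations, and $\rho_{X,Y}$ the correlation coefficient of $X$ and $Y$.
The minimal error value $e_L^*$ is:
$$e_L^* = \sigma_X^2\,\big(1 - \rho_{X,Y}^2\big)$$
The variable of minimal error, $X - \hat{X}_L(Y)$, is uncorrelated with $Y$.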
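The three claims about the line of minimal MSE (its coefficients, its minimal error, and the uncorrelated error variable) can be checked on a small example. The sketch below assumes a made-up joint PMF and the standard formulas slope $= \rho\,\sigma_X/\sigma_Y$ and minimal error $= \sigma_X^2(1 - \rho^2)$; the names `joint`, `expect`, and `line_mse` are hypothetical.

```python
import math

# Hypothetical joint PMF chosen for illustration: (x, y) -> P(X = x, Y = y).
joint = {
    (0, 0): 0.2, (1, 0): 0.1,
    (1, 1): 0.2, (2, 1): 0.1,
    (2, 2): 0.1, (4, 2): 0.3,
}

def expect(g):
    """E[g(X, Y)] under the joint PMF."""
    return sum(p * g(x, y) for (x, y), p in joint.items())

mu_x = expect(lambda x, y: x)
mu_y = expect(lambda x, y: y)
var_x = expect(lambda x, y: (x - mu_x) ** 2)
var_y = expect(lambda x, y: (y - mu_y) ** 2)
cov = expect(lambda x, y: (x - mu_x) * (y - mu_y))
rho = cov / math.sqrt(var_x * var_y)

# Coefficients of the minimal MSE line x = a*y + b.
a = rho * math.sqrt(var_x / var_y)
b = mu_x - a * mu_y

def line_mse(slope, intercept):
    """MSE E[(X - (slope*Y + intercept))^2] of an arbitrary line."""
    return expect(lambda x, y: (x - (slope * y + intercept)) ** 2)

min_mse = line_mse(a, b)               # MSE computed directly from the line
formula_mse = var_x * (1 - rho ** 2)   # MSE predicted by the formula

# The error X - (aY + b) has mean zero, so its covariance with Y
# is E[error * (Y - mu_y)].
err_cov = expect(lambda x, y: (x - (a * y + b)) * (y - mu_y))

print(f"a = {a:.4f}, b = {b:.4f}, rho = {rho:.4f}")
print(f"direct MSE = {min_mse:.6f}, formula MSE = {formula_mse:.6f}")
print(f"Cov(error, Y) = {err_cov:.2e}")
```

Perturbing either coefficient, for example `line_mse(a + 0.1, b)`, should give a larger MSE than `min_mse`.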
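Slope and $\rho_{X,Y}$
Notice:
$$\frac{\hat{X}_L(Y) - \mu_X}{\sigma_X} = \rho_{X,Y}\,\frac{Y - \mu_Y}{\sigma_Y}$$
Thus, $\rho_{X,Y}$ is the slope of the minimal MSE line for the standardized variables $(X - \mu_X)/\sigma_X$ and $(Y - \mu_Y)/\sigma_Y$.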

In each graph, the line of minimal MSE is the “best fit” line.