Packet 13

Matrices IV: Spectrum

Eigenvectors, eigenvalues: Part II

Example

Finding eigenvectors and eigenvalues

Problem: Find the eigenvectors and their eigenvalues for the matrix $A=\begin{pmatrix}4&0&1\\2&1&0\\-2&0&1\end{pmatrix}$. Solution: First find the eigenvalues:

$$\det A_\lambda=\begin{vmatrix}4-\lambda&0&1\\2&1-\lambda&0\\-2&0&1-\lambda\end{vmatrix}=(4-\lambda)\begin{vmatrix}1-\lambda&0\\0&1-\lambda\end{vmatrix}-0\begin{vmatrix}2&0\\-2&1-\lambda\end{vmatrix}+1\begin{vmatrix}2&1-\lambda\\-2&0\end{vmatrix}=(4-\lambda)(1-\lambda)^2-0+2(1-\lambda)=-\lambda^3+6\lambda^2-11\lambda+6.$$

The roots are $\lambda=1,2,3$. (The rational roots theorem leads to the guess $\lambda=2$, and then you can divide the cubic polynomial by $\lambda-2$ to obtain a quadratic one, which is easily factored.)
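As a quick numerical sanity check of this computation (using numpy, which is not part of the packet's hand methods), the eigenvalues of $A$ should come out as $1,2,3$:

```python
import numpy as np

# The matrix A from the example.
A = np.array([[4.0, 0.0, 1.0],
              [2.0, 1.0, 0.0],
              [-2.0, 0.0, 1.0]])

# The roots of det(A - lambda I) are the eigenvalues.
eigenvalues = np.linalg.eigvals(A)
print(np.sort(eigenvalues.real))  # approximately [1. 2. 3.]
```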

Now we seek eigenvectors by solving $A_\lambda\mathbf{x}=\mathbf{0}$, where $A_\lambda=A-\lambda I$, for each root $\lambda$. When $\lambda=1$:

$$A_\lambda\mathbf{x}=\begin{pmatrix}3&0&1\\2&0&0\\-2&0&0\end{pmatrix}\begin{pmatrix}x_1\\x_2\\x_3\end{pmatrix}=\begin{pmatrix}0\\0\\0\end{pmatrix}.$$

To solve for $\mathbf{x}$, row reduce $A_\lambda$ to obtain a matrix in RREF:

$$A_\lambda=\begin{pmatrix}3&0&1\\2&0&0\\-2&0&0\end{pmatrix}\to\begin{pmatrix}3&0&1\\0&0&-2/3\\0&0&2/3\end{pmatrix}\to\begin{pmatrix}3&0&0\\0&0&-2/3\\0&0&0\end{pmatrix}\to\begin{pmatrix}1&0&0\\0&0&1\\0&0&0\end{pmatrix}.$$

So we know $x_1=x_3=0$ and $x_2$ is free. Plugging these in we have $\mathbf{x}=\begin{pmatrix}0\\x_2\\0\end{pmatrix}$, and we choose $x_2=1$, obtaining $\begin{pmatrix}0\\1\\0\end{pmatrix}$ for an eigenvector.

Next, for $\lambda=2$ we have:

$$A_\lambda\mathbf{x}=\begin{pmatrix}2&0&1\\2&-1&0\\-2&0&-1\end{pmatrix}\begin{pmatrix}x_1\\x_2\\x_3\end{pmatrix}=\begin{pmatrix}0\\0\\0\end{pmatrix}.$$

To solve for $\mathbf{x}$, row reduce $A_\lambda$ to obtain a matrix in RREF:

$$A_\lambda=\begin{pmatrix}2&0&1\\2&-1&0\\-2&0&-1\end{pmatrix}\to\begin{pmatrix}2&0&1\\0&1&1\\0&0&0\end{pmatrix}\to\begin{pmatrix}1&0&1/2\\0&1&1\\0&0&0\end{pmatrix}.$$

So we know $x_1=-\tfrac12 x_3$ and $x_2=-x_3$. Plugging these in we have $\mathbf{x}=\begin{pmatrix}-\tfrac12 x_3\\-x_3\\x_3\end{pmatrix}$, and we choose $x_3=2$, obtaining $\begin{pmatrix}-1\\-2\\2\end{pmatrix}$.

Finally, for $\lambda=3$ we have:

$$A_\lambda\mathbf{x}=\begin{pmatrix}1&0&1\\2&-2&0\\-2&0&-2\end{pmatrix}\begin{pmatrix}x_1\\x_2\\x_3\end{pmatrix}=\begin{pmatrix}0\\0\\0\end{pmatrix}.$$

To solve for $\mathbf{x}$, row reduce $A_\lambda$ to obtain a matrix in RREF:

$$A_\lambda=\begin{pmatrix}1&0&1\\2&-2&0\\-2&0&-2\end{pmatrix}\to\begin{pmatrix}1&0&1\\0&-2&-2\\0&0&0\end{pmatrix}\to\begin{pmatrix}1&0&1\\0&1&1\\0&0&0\end{pmatrix}.$$

So we know $x_1=-x_3$ and $x_2=-x_3$. Plugging these in we have $\mathbf{x}=\begin{pmatrix}-x_3\\-x_3\\x_3\end{pmatrix}$, and we choose $x_3=1$, obtaining $\begin{pmatrix}-1\\-1\\1\end{pmatrix}$.
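The three eigenvector equations found above can be confirmed numerically; here is a quick check (using numpy, which is not part of the packet's hand methods):

```python
import numpy as np

# Check that (0,1,0), (-1,-2,2), (-1,-1,1) are eigenvectors of A
# with eigenvalues 1, 2, 3 respectively: A v = lambda v in each case.
A = np.array([[4.0, 0.0, 1.0],
              [2.0, 1.0, 0.0],
              [-2.0, 0.0, 1.0]])

pairs = [(1, np.array([0.0, 1.0, 0.0])),
         (2, np.array([-1.0, -2.0, 2.0])),
         (3, np.array([-1.0, -1.0, 1.0]))]
for lam, v in pairs:
    assert np.allclose(A @ v, lam * v)
print("all three eigenvector equations check out")
```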

Example

Changing to eigenbasis yields diagonal matrix

Notice that $A$ in the previous example has three eigenvectors, and that they are independent. Therefore they constitute a basis of $\mathbb{R}^3$; let's call this basis $\mathcal{C}$. Putting these basis vectors as the columns of a matrix, we create the change of basis transfer matrix from $\mathcal{C}$ to the standard basis $\mathcal{E}$:

$$T_{\mathcal{E}\leftarrow\mathcal{C}}=\begin{pmatrix}\mathbf{c}_1&\mathbf{c}_2&\mathbf{c}_3\end{pmatrix}=\begin{pmatrix}0&-1&-1\\1&-2&-1\\0&2&1\end{pmatrix}.$$

So $T_{\mathcal{E}\leftarrow\mathcal{C}}[\mathbf{x}]_{\mathcal{C}}=x_1\mathbf{c}_1+x_2\mathbf{c}_2+x_3\mathbf{c}_3=x_1\begin{pmatrix}0\\1\\0\end{pmatrix}+x_2\begin{pmatrix}-1\\-2\\2\end{pmatrix}+x_3\begin{pmatrix}-1\\-1\\1\end{pmatrix}$.

Now recall that these are eigenvectors of $A$ with eigenvalues $1,2,3$:

$$A\mathbf{c}_1=\mathbf{c}_1,\qquad A\mathbf{c}_2=2\mathbf{c}_2,\qquad A\mathbf{c}_3=3\mathbf{c}_3.$$

If we write $A$ in the basis $\mathcal{C}$ using the transfer matrix $T_{\mathcal{E}\leftarrow\mathcal{C}}$ and its inverse $T_{\mathcal{C}\leftarrow\mathcal{E}}$, we have:

$$[A]_{\mathcal{C}}=T_{\mathcal{C}\leftarrow\mathcal{E}}\,A\,T_{\mathcal{E}\leftarrow\mathcal{C}}=\begin{pmatrix}0&-1&-1\\1&-2&-1\\0&2&1\end{pmatrix}^{-1}\begin{pmatrix}4&0&1\\2&1&0\\-2&0&1\end{pmatrix}\begin{pmatrix}0&-1&-1\\1&-2&-1\\0&2&1\end{pmatrix}=\begin{pmatrix}1&0&0\\0&2&0\\0&0&3\end{pmatrix}.$$

In other words, the matrix of A is a diagonal matrix when written in the basis of its own eigenvectors.
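This change of basis can be verified numerically; the following sketch (using numpy, not part of the packet's hand methods) computes $T^{-1}AT$ with the eigenvectors as columns of $T$:

```python
import numpy as np

A = np.array([[4.0, 0.0, 1.0],
              [2.0, 1.0, 0.0],
              [-2.0, 0.0, 1.0]])
# Columns are the eigenvectors c1, c2, c3 from the example.
T = np.array([[0.0, -1.0, -1.0],
              [1.0, -2.0, -1.0],
              [0.0,  2.0,  1.0]])

# Writing A in the eigenbasis should produce diag(1, 2, 3).
A_in_C = np.linalg.inv(T) @ A @ T
print(np.round(A_in_C, 10))  # diag(1, 2, 3)
```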

The example above illustrates a general fact: if a matrix $A$ has enough independent eigenvectors $\{\mathbf{c}_1,\dots,\mathbf{c}_n\}$ to form a basis of the vector space $\mathbb{R}^n$ on which $A$ acts, then writing $A$ in the basis of these eigenvectors puts it in diagonal form.

In other words, the eigenbasis of $A$ is a natural basis for $A$. It is a basis in which the action of $A$ is extremely simple: it is given by scaling each basis vector by its respective eigenvalue.

Sometimes a matrix does not have enough independent eigenvectors. Here is an illustration:

Example

Matrix with too few eigenvectors

The matrix $A=\begin{pmatrix}1&1\\0&1\end{pmatrix}$ has $\det A_\lambda=(1-\lambda)^2$. Therefore it has a "double" eigenvalue $\lambda=1$. To find the eigenvectors:

$$A_1\mathbf{x}=\begin{pmatrix}0&1\\0&0\end{pmatrix}\begin{pmatrix}x_1\\x_2\end{pmatrix}=\begin{pmatrix}0\\0\end{pmatrix}.$$

The matrix is already in RREF. So the equation says that $x_1$ is free and $x_2=0$. Therefore we have the eigenvector $\begin{pmatrix}1\\0\end{pmatrix}$.

However, no more eigenvectors are available (other than scalar multiples of this one, which are not independent of it). Therefore it is impossible to find a basis of eigenvectors, and thus impossible to convert $A$ to diagonal form by a change of basis.
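The shortage of eigenvectors can also be seen numerically: the eigenspace for $\lambda=1$ is the null space of $A-I$, and its dimension is $2-\operatorname{rank}(A-I)$. Here is a sketch using numpy (not part of the packet's hand methods):

```python
import numpy as np

# For A = [[1,1],[0,1]] the eigenvalue 1 is double, but A - I has
# rank 1, so its null space (the eigenspace) is only 1-dimensional:
# there is no second independent eigenvector.
A = np.array([[1.0, 1.0],
              [0.0, 1.0]])
rank = np.linalg.matrix_rank(A - np.eye(2))
print(rank)      # 1
print(2 - rank)  # dimension of the eigenspace: 1
```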

Sometimes a real matrix has complex eigenvalues and eigenvectors. Here is an illustration:

Example

Matrix with complex eigenvalues and eigenvectors

The matrix $A=\begin{pmatrix}0&-1\\1&0\end{pmatrix}$ has $\det A_\lambda=\lambda^2+1$, so its eigenvalues are $\lambda=\pm i$. To find the eigenvectors:

$$A_i\mathbf{x}=\begin{pmatrix}-i&-1\\1&-i\end{pmatrix}\begin{pmatrix}x_1\\x_2\end{pmatrix}=\begin{pmatrix}0\\0\end{pmatrix}.$$

Row reduce the matrix $A_i$:

$$A_i=\begin{pmatrix}-i&-1\\1&-i\end{pmatrix}\to\begin{pmatrix}-i&-1\\0&0\end{pmatrix}\to\begin{pmatrix}1&-i\\0&0\end{pmatrix}.$$

This is now in RREF, so we see that $x_1=ix_2$ and $x_2$ is free, so we get the eigenvector $\begin{pmatrix}i\\1\end{pmatrix}$.

Following the same procedure for $\lambda=-i$, we find the eigenvector $\begin{pmatrix}-i\\1\end{pmatrix}$.
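Numpy handles complex eigenvalues directly, so this example can be checked as well (again a sketch outside the packet's hand methods):

```python
import numpy as np

A = np.array([[0.0, -1.0],
              [1.0, 0.0]])
vals, vecs = np.linalg.eig(A)
print(np.sort_complex(vals))  # the eigenvalues -i and +i, up to rounding

# Check the eigenvector (i, 1) for lambda = i directly: A v = i v.
v = np.array([1j, 1.0])
assert np.allclose(A @ v, 1j * v)
```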

It is important to be aware of these two situations. For the sake of exams, you should memorize one example of each type. (You can use the two above.)

Question 13-01

A double eigenvalue does not imply insufficient eigenvectors

What are the eigenvalues and eigenvectors of $A=\begin{pmatrix}2&0\\0&2\end{pmatrix}$? Are there enough to form a basis of $\mathbb{R}^2$?

Question 13-02

Enough eigenvectors?

Does the matrix $A=\begin{pmatrix}2&1&0\\0&2&1\\0&0&2\end{pmatrix}$ have enough eigenvectors to form a basis? If not, how many does it have?

Spectral Theorem: symmetric matrices have good eigen theory

A key point of the Spectral Theorem is that symmetric matrices never exhibit the phenomena illustrated in the two examples above: their eigenvalues are always real numbers, and there are always enough eigenvectors to form a basis.

Eigenvectors vs. singular vectors

It is helpful to think about the concept of an eigenvector as consisting of two aspects:

  • (1) Matrix acting by scaling
  • (2) Fixed lines of matrix action

Aspect (1) allows easy computation and geometric understanding of the matrix action. It also allows us to focus on vectors that are more important (those having a larger scale factor). If we have a basis of such vectors, the matrix will have a simple diagonal form when written in this basis.

Aspect (2) goes much further, even though it essentially includes aspect (1). A fixed line (eigenline) is the span of any eigenvector. Such lines are mapped to themselves by the matrix action. This aspect describes eigenvector spans as analogues of fixed points of a function. The concept of fixed points and fixed lines (‘fixity’ in general) only applies when the function or matrix keeps vectors in the same space, i.e. has the same codomain as domain.

Eigenvectors. There are various contexts where a matrix does act on vectors in a single space, and aspect (2) is very useful in these contexts. For example, a differential operator producing a system of first-order linear differential equations involves a matrix $A:V\to V$ for a vector space of functions $V$. Another example is the moment of inertia matrix describing the angular momentum of 3D bodies in physics, or the many operators of quantum mechanics.

One of the biggest advantages of aspect (2) is that it is compatible with iteration.

If $A:V\to V$ is a linear transformation (i.e. a matrix) having the same codomain as domain, we can define iterates $A^2,A^3$, etc. as matrix powers. These powers allow one to apply functions to matrices: if $f(x)=\sum_{n=0}^{\infty}a_nx^n$ is a power series, we can define $f(A)$ by the power series $f(A)=\sum_{n=0}^{\infty}a_nA^n$. (Bracketing any problems with convergence.) An extremely important series is $e^x=1+x+\frac{x^2}{2!}+\frac{x^3}{3!}+\cdots$, since the function $e^{At}$ solves the linear ODE system $X'(t)=AX(t)$.

It is hard to compute $e^{At}$, or more generally $\sum_{n=0}^{\infty}a_nA^n$, by direct multiplication and limits of partial sums. On the other hand, it is easy to compute $e^{At}$ when $A$ is a diagonal matrix, just as it is easy to compute $A^2\mathbf{v}$ when $\mathbf{v}$ is an eigenvector of $A$. (Namely, the latter is $\lambda^2\mathbf{v}$.) For example, in the $2\times 2$ case we have $\exp\begin{pmatrix}\lambda_1&0\\0&\lambda_2\end{pmatrix}=\begin{pmatrix}e^{\lambda_1}&0\\0&e^{\lambda_2}\end{pmatrix}$.

Even when $A$ is not diagonal, if an eigenbasis matrix $T_{\mathcal{E}\leftarrow\mathcal{C}}=\begin{pmatrix}\mathbf{c}_1&\mathbf{c}_2&\cdots&\mathbf{c}_n\end{pmatrix}$ can be found, then the powers of $A$ are still easy to calculate, because (e.g. in 3D):

$$A^2=T_{\mathcal{E}\leftarrow\mathcal{C}}\begin{pmatrix}\lambda_1&0&0\\0&\lambda_2&0\\0&0&\lambda_3\end{pmatrix}T_{\mathcal{C}\leftarrow\mathcal{E}}\;T_{\mathcal{E}\leftarrow\mathcal{C}}\begin{pmatrix}\lambda_1&0&0\\0&\lambda_2&0\\0&0&\lambda_3\end{pmatrix}T_{\mathcal{C}\leftarrow\mathcal{E}}=T_{\mathcal{E}\leftarrow\mathcal{C}}\begin{pmatrix}\lambda_1&0&0\\0&\lambda_2&0\\0&0&\lambda_3\end{pmatrix}\begin{pmatrix}\lambda_1&0&0\\0&\lambda_2&0\\0&0&\lambda_3\end{pmatrix}T_{\mathcal{C}\leftarrow\mathcal{E}}=T_{\mathcal{E}\leftarrow\mathcal{C}}\begin{pmatrix}\lambda_1^2&0&0\\0&\lambda_2^2&0\\0&0&\lambda_3^2\end{pmatrix}T_{\mathcal{C}\leftarrow\mathcal{E}}.$$

Similarly we have:

$$A^n=T_{\mathcal{E}\leftarrow\mathcal{C}}\begin{pmatrix}\lambda_1^n&0&0\\0&\lambda_2^n&0\\0&0&\lambda_3^n\end{pmatrix}T_{\mathcal{C}\leftarrow\mathcal{E}},\qquad e^{A}=T_{\mathcal{E}\leftarrow\mathcal{C}}\begin{pmatrix}e^{\lambda_1}&0&0\\0&e^{\lambda_2}&0\\0&0&e^{\lambda_3}\end{pmatrix}T_{\mathcal{C}\leftarrow\mathcal{E}}.$$

The critical piece of the calculation above is the cancellation $T_{\mathcal{C}\leftarrow\mathcal{E}}T_{\mathcal{E}\leftarrow\mathcal{C}}=I_3$ occurring in the middle of the calculation for $A^2$. This cancellation shows that matrix multiplication and change of basis can be performed in either order. That in turn means we can first change basis to an eigenbasis (if one exists), then perform multiplication with diagonal matrices, where it is easy, and only afterwards change back to the standard basis.
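Here is a numerical sketch of this idea applied to the matrix from the first example (numpy assumed; the truncated power series serves as an independent check of $e^A$):

```python
import numpy as np

# Compute e^A through the eigenbasis, using e^A = T e^D T^{-1}
# with D = diag(1, 2, 3), and compare against a truncated power
# series sum_{n=0}^{30} A^n / n! computed directly.
A = np.array([[4.0, 0.0, 1.0],
              [2.0, 1.0, 0.0],
              [-2.0, 0.0, 1.0]])
T = np.array([[0.0, -1.0, -1.0],   # columns: eigenvectors c1, c2, c3
              [1.0, -2.0, -1.0],
              [0.0,  2.0,  1.0]])
eD = np.diag(np.exp([1.0, 2.0, 3.0]))
eA_via_eigenbasis = T @ eD @ np.linalg.inv(T)

series = np.eye(3)                 # n = 0 term of the power series
term = np.eye(3)
for n in range(1, 31):
    term = term @ A / n            # term is now A^n / n!
    series += term

assert np.allclose(eA_via_eigenbasis, series)
print("T e^D T^{-1} matches the power series for e^A")
```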

Singular vectors. The concept of a singular vector is designed to make use of aspect (1) of eigenvectors without aspect (2). Because they do not include aspect (2), singular vectors of $A$ are not eigenvectors of $A$ except in special cases; they do not determine fixed lines of the action of $A$.

Unlike eigenvectors and eigenvalues, singular vectors and singular values are (best) defined in the context of a collection of vectors giving the singular value decomposition (SVD). So, a singular vector is one of the vectors in the SVD.

Defining the singular value decomposition

Suppose $A:\mathbb{R}^n\to\mathbb{R}^m$ is a matrix providing a linear transformation between different vector spaces.

SVD features

A singular value decomposition of $A:\mathbb{R}^n\to\mathbb{R}^m$ is given by $A=U\Sigma V^{\mathsf{T}}$, where:

  • $V=\begin{pmatrix}\mathbf{v}_1&\mathbf{v}_2&\cdots&\mathbf{v}_n\end{pmatrix}$ is an orthogonal matrix giving a rotation/reflection combo for $\mathbb{R}^n$ (input)
  • $U=\begin{pmatrix}\mathbf{u}_1&\mathbf{u}_2&\cdots&\mathbf{u}_m\end{pmatrix}$ is an orthogonal matrix giving a rotation/reflection combo for $\mathbb{R}^m$ (output)
  • $\Sigma$ has non-negative entries $\sigma_1,\sigma_2,\dots,\sigma_\ell$ on the main diagonal (possibly zeros!) and zeros elsewhere. (Here $\ell=\min\{n,m\}$.) It is not square if $n\neq m$. It could look like these, for example:
$$\begin{pmatrix}1&0&0&0\\0&3&0&0\\0&0&7&0\\0&0&0&0\\0&0&0&0\end{pmatrix},\qquad\begin{pmatrix}1&0&0&0&0&0\\0&3&0&0&0&0\\0&0&7&0&0&0\end{pmatrix}.$$

The orthonormal basis vectors $\mathbf{v}_1,\dots,\mathbf{v}_n$ are called right singular vectors. The orthonormal basis vectors $\mathbf{u}_1,\dots,\mathbf{u}_m$ are called left singular vectors. The values $\sigma_i\ge 0$ are the lengths of the images, namely $\sigma_i=|A\mathbf{v}_i|$.

(Remember: “Right = Input, Left = Output.”)

An alternate way to write the SVD uses scalar projections $\mathbf{v}_i^{\mathsf{T}}$ attached to the vectors $\mathbf{u}_i$:

$$A=\sigma_1\mathbf{u}_1\mathbf{v}_1^{\mathsf{T}}+\sigma_2\mathbf{u}_2\mathbf{v}_2^{\mathsf{T}}+\cdots+\sigma_\ell\mathbf{u}_\ell\mathbf{v}_\ell^{\mathsf{T}}.$$

Each term $\sigma_i\mathbf{u}_i\mathbf{v}_i^{\mathsf{T}}$ takes an input vector, dots it with $\mathbf{v}_i$, then applies this value to $\mathbf{u}_i$ scaled by $\sigma_i$. With this formula, we can compute $A\mathbf{x}$ for any input $\mathbf{x}$ using dot products:

$$A\mathbf{x}=\sigma_1\mathbf{u}_1(\mathbf{v}_1\cdot\mathbf{x})+\sigma_2\mathbf{u}_2(\mathbf{v}_2\cdot\mathbf{x})+\cdots+\sigma_\ell\mathbf{u}_\ell(\mathbf{v}_\ell\cdot\mathbf{x}).$$

By plugging in $\mathbf{x}=\mathbf{v}_i$, we verify that $A\mathbf{v}_1=\sigma_1\mathbf{u}_1$ and $A\mathbf{v}_2=\sigma_2\mathbf{u}_2$, etc., up to $A\mathbf{v}_\ell=\sigma_\ell\mathbf{u}_\ell$. (For $\ell<i\le n$, we have $A\mathbf{v}_i=\mathbf{0}$.)
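The rank-one-sum form of the SVD can be checked numerically. This sketch uses numpy's `np.linalg.svd` (which returns $U$, the singular values, and $V^{\mathsf{T}}$) on the matrix from the worked example later in this packet:

```python
import numpy as np

# Summing the rank-one pieces sigma_i u_i v_i^T rebuilds A.
A = np.array([[1.0, -1.0],
              [-2.0, 2.0],
              [2.0, -2.0]])
U, s, Vt = np.linalg.svd(A)
rebuilt = sum(s[i] * np.outer(U[:, i], Vt[i, :]) for i in range(len(s)))
assert np.allclose(rebuilt, A)
print(np.round(s, 6))  # singular values; sigma_1 = 3*sqrt(2), sigma_2 = 0
```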

Singular vectors vs. eigenvectors

The rule $A\mathbf{v}_i=\sigma_i\mathbf{u}_i$ looks analogous to the rule for eigenvectors.

But it's actually very different! Notice that $\mathbf{v}_i$ and $\mathbf{u}_i$ are not even vectors in the same space!

The rule $A\mathbf{v}=\sigma\mathbf{u}$ does not by itself determine any special property of $\mathbf{v}$ or $\mathbf{u}$. (Take any unit vector $\mathbf{v}$, apply $A$, then define $\sigma=|A\mathbf{v}|$ and $\mathbf{u}=\sigma^{-1}A\mathbf{v}$.)

The significance of the rule $A\mathbf{v}_i=\sigma_i\mathbf{u}_i$ manifests when it holds simultaneously for all vector pairs in orthonormal bases $\mathbf{v}_1,\dots,\mathbf{v}_n$ and $\mathbf{u}_1,\dots,\mathbf{u}_m$.

Still, the SVD says that after rotating/reflecting the basis of $\mathbb{R}^n$ and the basis of $\mathbb{R}^m$ by suitable orthogonal matrices (not necessarily the same on each), the action of $A$ can be represented as simple scalings by the $\sigma_i$.


We may think of $\mathbf{v}_1,\dots,\mathbf{v}_n$ as a special basis for the input space $\mathbb{R}^n$ vis-à-vis the matrix $A$, and $\mathbf{u}_1,\dots,\mathbf{u}_m$ as a corresponding basis for the output space $\mathbb{R}^m$. Then $A$ acts by pairing the vector $\mathbf{v}_i$ with $\mathbf{u}_i$ and stretching it by $\sigma_i$.

[Figure: Partial matrices $\mathbf{u}_i\mathbf{v}_i^{\mathsf{T}}$ for $i=1,\dots,5$.]

[Figure: Partial sum matrices $\sigma_1\mathbf{u}_1\mathbf{v}_1^{\mathsf{T}}+\cdots+\sigma_i\mathbf{u}_i\mathbf{v}_i^{\mathsf{T}}$ for $i=1,\dots,5$.]

Computing the singular value decomposition

In practice, on a computer, the SVD is calculated using numerical procedures that are akin to those used to find approximations to eigenvectors. We do not address those kinds of procedures in this course.

Still, it is possible to specify and calculate SVD elements by hand for a matrix $A$ by using the fact that $A^{\mathsf{T}}A$ is always a symmetric matrix, so the Spectral Theorem can be applied to it. This method also shows that the SVD always exists: the symmetric $A^{\mathsf{T}}A$ always has real eigenvalues and enough eigenvectors by the Spectral Theorem, even when $A$ does not. Let us see how this works.

First observe that $(A^{\mathsf{T}}A)^{\mathsf{T}}=A^{\mathsf{T}}(A^{\mathsf{T}})^{\mathsf{T}}=A^{\mathsf{T}}A$, so $A^{\mathsf{T}}A$ is always symmetric. Also notice that $A^{\mathsf{T}}A:\mathbb{R}^n\to\mathbb{R}^n$, so the input and output spaces of this matrix are the same.

SVD elements

  • $\mathbf{v}_1,\dots,\mathbf{v}_n$ = orthonormal eigenbasis of $A^{\mathsf{T}}A$, as guaranteed by the Spectral Theorem. (So $V$ here is $Q$ in the notation of that theorem.)
  • $\sigma_i=|A\mathbf{v}_i|$, defined for $1\le i\le\ell$. If $(A^{\mathsf{T}}A)\mathbf{v}_i=\lambda_i\mathbf{v}_i$, then $\sigma_i=\sqrt{\lambda_i}$.
  • The equation $\sigma_i\mathbf{u}_i=A\mathbf{v}_i$ for $1\le i\le\ell$ defines unit vectors $\mathbf{u}_i$ for those $i$ with $\sigma_i\neq 0$.
  • Gram-Schmidt provides the additional $\mathbf{u}_i$ vectors when $\sigma_i=0$.
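The first two steps of this recipe can be sketched in numpy (`np.linalg.eigh` diagonalizes the symmetric matrix $A^{\mathsf{T}}A$, returning eigenvalues in ascending order); the matrix is the one from the worked example below:

```python
import numpy as np

# Eigen-decompose A^T A to get V and the sigma_i,
# then set u_i = A v_i / sigma_i for sigma_i != 0.
A = np.array([[1.0, -1.0],
              [-2.0, 2.0],
              [2.0, -2.0]])

lam, V = np.linalg.eigh(A.T @ A)          # ascending eigenvalues
order = np.argsort(lam)[::-1]             # reorder: largest first
lam, V = lam[order], V[:, order]
sigma = np.sqrt(np.clip(lam, 0.0, None))  # sigma_i = sqrt(lambda_i)
print(np.round(sigma, 6))                 # sigma_1 = 3*sqrt(2), sigma_2 = 0

u1 = A @ V[:, 0] / sigma[0]               # first left singular vector
assert np.isclose(np.linalg.norm(u1), 1.0)
```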

Example

Computing SVD by hand

Problem: Find an SVD for the matrix $A=\begin{pmatrix}1&-1\\-2&2\\2&-2\end{pmatrix}$.

Solution: First calculate $A^{\mathsf{T}}A=\begin{pmatrix}9&-9\\-9&9\end{pmatrix}$; this matrix has eigenvalues $\lambda=18$ and $\lambda=0$. Therefore we set $\sigma_1=\sqrt{18}=3\sqrt{2}$ and $\sigma_2=0$. By solving $\begin{pmatrix}-9&-9\\-9&-9\end{pmatrix}\begin{pmatrix}x_1\\x_2\end{pmatrix}=\begin{pmatrix}0\\0\end{pmatrix}$, that is $(A^{\mathsf{T}}A-18I)\mathbf{x}=\mathbf{0}$, we obtain an eigenvector $\begin{pmatrix}1\\-1\end{pmatrix}$, which we normalize to the unit vector $\begin{pmatrix}1/\sqrt{2}\\-1/\sqrt{2}\end{pmatrix}$. By solving $\begin{pmatrix}9&-9\\-9&9\end{pmatrix}\begin{pmatrix}x_1\\x_2\end{pmatrix}=\begin{pmatrix}0\\0\end{pmatrix}$ we obtain an eigenvector $\begin{pmatrix}1\\1\end{pmatrix}$, which we normalize to the unit vector $\begin{pmatrix}1/\sqrt{2}\\1/\sqrt{2}\end{pmatrix}$. We now have $V$ and $\Sigma$:

$$\Sigma=\begin{pmatrix}3\sqrt{2}&0\\0&0\\0&0\end{pmatrix},\qquad V=\begin{pmatrix}1/\sqrt{2}&1/\sqrt{2}\\-1/\sqrt{2}&1/\sqrt{2}\end{pmatrix}.$$

Set

$$\mathbf{u}_1=\sigma_1^{-1}A\mathbf{v}_1=\frac{1}{3\sqrt{2}}\begin{pmatrix}2/\sqrt{2}\\-4/\sqrt{2}\\4/\sqrt{2}\end{pmatrix}=\begin{pmatrix}1/3\\-2/3\\2/3\end{pmatrix}$$

and we notice that $|\mathbf{u}_1|=1$. Now we need unit vectors $\mathbf{u}_2,\mathbf{u}_3$ orthogonal to each other and to $\mathbf{u}_1$. We first calculate a spanning set for the kernel of the matrix $\mathbf{u}_1^{\mathsf{T}}$, and then perform Gram-Schmidt to orthogonalize it. For the kernel:

$$\begin{pmatrix}1/3&-2/3&2/3\end{pmatrix}\begin{pmatrix}x_1\\x_2\\x_3\end{pmatrix}=0,$$

and therefore $x_1-2x_2+2x_3=0$, so we can set $x_2,x_3$ free and solve for $x_1$, thereby obtaining generators for the kernel:

$$\ker\mathbf{u}_1^{\mathsf{T}}=\left\{\begin{pmatrix}2x_2-2x_3\\x_2\\x_3\end{pmatrix}\right\}=\operatorname{span}\left\{\mathbf{w}_2=\begin{pmatrix}2\\1\\0\end{pmatrix},\ \mathbf{w}_3=\begin{pmatrix}-2\\0\\1\end{pmatrix}\right\}.$$

Now we already have $\mathbf{w}_2\perp\mathbf{u}_1$ and $\mathbf{w}_3\perp\mathbf{u}_1$ by our method of defining them. So it just remains to do the last step of Gram-Schmidt. We calculate that

$$\mathbf{w}_3-\frac{\mathbf{w}_3\cdot\mathbf{u}_1}{\mathbf{u}_1\cdot\mathbf{u}_1}\mathbf{u}_1-\frac{\mathbf{w}_3\cdot\mathbf{w}_2}{\mathbf{w}_2\cdot\mathbf{w}_2}\mathbf{w}_2=\begin{pmatrix}-2\\0\\1\end{pmatrix}-\frac{0}{1}\begin{pmatrix}1/3\\-2/3\\2/3\end{pmatrix}-\frac{-4}{5}\begin{pmatrix}2\\1\\0\end{pmatrix}=\begin{pmatrix}-2/5\\4/5\\1\end{pmatrix}.$$

Then

$$\mathbf{u}_2=\frac{1}{|\mathbf{w}_2|}\mathbf{w}_2=\begin{pmatrix}2/\sqrt{5}\\1/\sqrt{5}\\0\end{pmatrix},\qquad\mathbf{u}_3=\frac{5}{\sqrt{45}}\begin{pmatrix}-2/5\\4/5\\1\end{pmatrix}=\begin{pmatrix}-2/\sqrt{45}\\4/\sqrt{45}\\5/\sqrt{45}\end{pmatrix}.$$

We have finished, and the SVD is given by:

$$A=U\Sigma V^{\mathsf{T}}=\begin{pmatrix}1/3&2/\sqrt{5}&-2/\sqrt{45}\\-2/3&1/\sqrt{5}&4/\sqrt{45}\\2/3&0&5/\sqrt{45}\end{pmatrix}\begin{pmatrix}3\sqrt{2}&0\\0&0\\0&0\end{pmatrix}\begin{pmatrix}1/\sqrt{2}&-1/\sqrt{2}\\1/\sqrt{2}&1/\sqrt{2}\end{pmatrix}.$$
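The hand computation can be verified numerically by multiplying out $U\Sigma V^{\mathsf{T}}$ (a sketch using numpy, not part of the hand method):

```python
import numpy as np

# The factors computed by hand; U Sigma V^T should reproduce A.
s2, s45 = np.sqrt(2), np.sqrt(45)
U = np.array([[1/3, 2/np.sqrt(5), -2/s45],
              [-2/3, 1/np.sqrt(5), 4/s45],
              [2/3, 0.0, 5/s45]])
Sigma = np.array([[3*s2, 0.0],
                  [0.0, 0.0],
                  [0.0, 0.0]])
Vt = np.array([[1/s2, -1/s2],
               [1/s2, 1/s2]])
A = np.array([[1.0, -1.0],
              [-2.0, 2.0],
              [2.0, -2.0]])

assert np.allclose(U @ Sigma @ Vt, A)
assert np.allclose(U.T @ U, np.eye(3))  # U is orthogonal
print("U Sigma V^T = A confirmed")
```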
Exercise 13-01

Computing SVD by hand

Find an SVD for the matrix $A=\begin{pmatrix}2&-1\\2&2\end{pmatrix}$.

Exercise 13-02

Computing SVD by hand

Find an SVD for the matrix $A=\begin{pmatrix}3&1\\6&2\\6&2\end{pmatrix}$.

Explaining the SVD elements

First, why is $V=\begin{pmatrix}\mathbf{v}_1&\cdots&\mathbf{v}_n\end{pmatrix}$ an orthogonal matrix? Answer: since $\mathbf{v}_1,\dots,\mathbf{v}_n$ is an orthonormal basis, the matrix $V$ is orthogonal. Second, why is $U=\begin{pmatrix}\mathbf{u}_1&\cdots&\mathbf{u}_m\end{pmatrix}$ an orthogonal matrix? Answer: consider the calculation:

$$\langle A\mathbf{v}_i,A\mathbf{v}_j\rangle=(A\mathbf{v}_i)^{\mathsf{T}}A\mathbf{v}_j=\mathbf{v}_i^{\mathsf{T}}A^{\mathsf{T}}A\mathbf{v}_j=\mathbf{v}_i^{\mathsf{T}}(A^{\mathsf{T}}A\mathbf{v}_j)=\mathbf{v}_i^{\mathsf{T}}(\lambda_j\mathbf{v}_j)=\lambda_j(\mathbf{v}_i^{\mathsf{T}}\mathbf{v}_j)=\begin{cases}\lambda_j&i=j\\0&i\neq j.\end{cases}$$

Notes: The first equality is the dot product written in 'inner product' notation. The second is the transpose rule. The fourth uses the definition of $\mathbf{v}_j$ as an eigenvector of $A^{\mathsf{T}}A$. The last uses the property that $\mathbf{v}_1,\dots,\mathbf{v}_n$ are orthonormal vectors.

From this calculation it follows that $A\mathbf{v}_1,\dots,A\mathbf{v}_n$ are orthogonal vectors (at least when they aren't zero). Since $\mathbf{u}_1,\dots,\mathbf{u}_m$ are the same vectors renormalized as unit vectors, we know $\mathbf{u}_1,\dots,\mathbf{u}_m$ are orthonormal (at least those with $\sigma_i\neq 0$). In the case $\sigma_i=|A\mathbf{v}_i|=0$, the alternate definition of $\mathbf{u}_i$ using Gram-Schmidt implies that all of $\mathbf{u}_1,\dots,\mathbf{u}_m$ are orthonormal.

Third, why is $\sigma_i=\sqrt{\lambda_i}$? Answer: According to the calculation above, $|A\mathbf{v}_i|^2=\lambda_i$. But $|A\mathbf{v}_i|=\sigma_i$ by definition.

Fourth, why is $A=U\Sigma V^{\mathsf{T}}$, with these definitions of $U,\Sigma,V$? Answer: To verify the equality $A=U\Sigma V^{\mathsf{T}}$, it is sufficient to verify that the matrices on both sides have the same action on a basis of $\mathbb{R}^n$. (If that is true, then by linearity the matrices have the same action on any vector, and therefore they are the same matrices.)

Therefore, it is sufficient to verify the equality of actions on the basis $\mathbf{v}_1,\dots,\mathbf{v}_n$. Since this basis is orthonormal, we see that $V^{\mathsf{T}}\mathbf{v}_i$ is the vector with $1$ in the $i$th row and zeros elsewhere. Then $\Sigma(V^{\mathsf{T}}\mathbf{v}_i)$ is the $i$th column of $\Sigma$, which has $\sigma_i$ in the $i$th row and zeros elsewhere. Finally, $U(\Sigma V^{\mathsf{T}}\mathbf{v}_i)$ is the $i$th column of $U$ (which is $\mathbf{u}_i$) scaled by $\sigma_i$. Therefore $(U\Sigma V^{\mathsf{T}})\mathbf{v}_i=\sigma_i\mathbf{u}_i$, and this is equal to $A\mathbf{v}_i$. All these statements are valid for $i=1,\dots,\ell$. For $\ell<i\le n$, both sides send $\mathbf{v}_i$ to $\mathbf{0}$.

Backwards: Going from SVD to eigenvectors and eigenvalues of $A^{\mathsf{T}}A$

Addendum. If we already have an SVD given by $A=U\Sigma V^{\mathsf{T}}$, with $\sigma_1,\dots,\sigma_\ell$ on the diagonal of $\Sigma$, then $\mathbf{v}_1,\dots,\mathbf{v}_n$ must be eigenvectors of $A^{\mathsf{T}}A$, and $\sigma_1^2,\dots,\sigma_\ell^2,0,\dots,0$ are the corresponding eigenvalues.

Proof: Observe that

$$A^{\mathsf{T}}A=(V^{\mathsf{T}})^{\mathsf{T}}\Sigma^{\mathsf{T}}U^{\mathsf{T}}U\Sigma V^{\mathsf{T}}=V\Sigma^{\mathsf{T}}I_m\Sigma V^{\mathsf{T}}=V\Sigma^{\mathsf{T}}\Sigma V^{\mathsf{T}}=V\hat{\Sigma}V^{\mathsf{T}},$$

where $\hat{\Sigma}=\Sigma^{\mathsf{T}}\Sigma$ is the square $n\times n$ diagonal matrix with diagonal entries $\sigma_1^2,\dots,\sigma_\ell^2$ (and zeros beyond). Applying the matrices $A^{\mathsf{T}}A$ and $V\hat{\Sigma}V^{\mathsf{T}}$ to $\mathbf{v}_1,\dots,\mathbf{v}_\ell$ returns $\sigma_1^2\mathbf{v}_1,\dots,\sigma_\ell^2\mathbf{v}_\ell$, which verifies the claim. (Again, both matrices send $\mathbf{v}_i$ to $\mathbf{0}$ for $\ell<i\le n$.)

Problems due 18 Apr 2024 by 12:00pm

Problem 13-01

Computing SVD

Compute an SVD of the matrix $A=\begin{pmatrix}7&1\\5&5\\0&0\end{pmatrix}$.

Problem 13-02

SVD of $A^{\mathsf{T}}$

  • (a) Given an SVD $A=U\Sigma V^{\mathsf{T}}$, what is a formula for an SVD of $A^{\mathsf{T}}$? What are the singular values of $A^{\mathsf{T}}$?
  • (b) Compute an SVD of the matrix $A=\begin{pmatrix}3&2&2\\2&3&-2\end{pmatrix}$ by first computing an SVD of $A^{\mathsf{T}}$ and then applying your result in (a).
Problem 13-03

Optimizing $|A\mathbf{x}|$ using singular values

  • (a) Verify that $|A\mathbf{x}|^2=Q(\mathbf{x})$, where $Q(\mathbf{x})$ is the quadratic form of $A^{\mathsf{T}}A$.
  • (b) Suppose $A=\begin{pmatrix}4&6\\0&4\end{pmatrix}$. Starting with the observation in (a), find a unit vector $\mathbf{x}$ that maximizes the value of $|A\mathbf{x}|$ (and hence of $|A\mathbf{x}|^2$). Describe your answer in terms of singular values and singular vectors of $A$.
Problem 13-04

Optimizing $|A\mathbf{x}|$ to find singular vectors

Show that the second largest singular value of $A$ is equal to the maximum of $|A\mathbf{x}|$, where $\mathbf{x}$ varies over all unit vectors which are orthogonal to $\mathbf{v}$, where $\mathbf{v}$ is a right singular vector corresponding to the largest singular value of $A$.

(Notation clarification: if the singular values are listed in order of size $\sigma_1\ge\sigma_2\ge\sigma_3\ge\cdots\ge\sigma_\ell$, then the largest singular value is $\sigma_1$, the second largest is $\sigma_2$, and the right singular vector in the problem is $\mathbf{v}_1$, satisfying $A\mathbf{v}_1=\sigma_1\mathbf{u}_1$.)

Problem 13-05

Computing SVD

Compute an SVD of the matrix A=(181344219412141112822148). You may use a calculator or computer, but you must show the steps of computation as in the Examples in this packet.