Packet 13

Matrices IV: Spectrum

Eigenvectors, eigenvalues: Part II

Example

Finding eigenvectors and eigenvalues

Problem: Find the eigenvectors and their eigenvalues for the matrix $A=\begin{pmatrix}4&0&1\\2&1&0\\-2&0&1\end{pmatrix}$. Solution: First find the eigenvalues:

$$\det A_\lambda=\begin{vmatrix}4-\lambda&0&1\\2&1-\lambda&0\\-2&0&1-\lambda\end{vmatrix}=(4-\lambda)\begin{vmatrix}1-\lambda&0\\0&1-\lambda\end{vmatrix}-0\begin{vmatrix}2&0\\-2&1-\lambda\end{vmatrix}+1\begin{vmatrix}2&1-\lambda\\-2&0\end{vmatrix}=(4-\lambda)(1-\lambda)^2-0+2(1-\lambda)=-\lambda^3+6\lambda^2-11\lambda+6.$$

The roots are $\lambda=1,2,3$. (The rational roots theorem leads to the guess $\lambda=2$, and then you can divide the cubic polynomial by $\lambda-2$ to obtain a quadratic one, which is easily factored.)
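As a quick numerical sanity check of this computation (using numpy, which is not part of the packet's hand methods), the eigenvalues of $A$ should come out as $1,2,3$:

```python
import numpy as np

# The matrix A from the example.
A = np.array([[4.0, 0.0, 1.0],
              [2.0, 1.0, 0.0],
              [-2.0, 0.0, 1.0]])

# The roots of det(A - lambda I) are the eigenvalues.
eigenvalues = np.linalg.eigvals(A)
print(np.sort(eigenvalues.real))  # approximately [1. 2. 3.]
```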

Now we seek eigenvectors by solving $A_\lambda\mathbf{x}=\mathbf{0}$, where $A_\lambda=A-\lambda I$, for each root $\lambda$. When $\lambda=1$:

$$A_\lambda\mathbf{x}=\begin{pmatrix}3&0&1\\2&0&0\\-2&0&0\end{pmatrix}\begin{pmatrix}x_1\\x_2\\x_3\end{pmatrix}=\begin{pmatrix}0\\0\\0\end{pmatrix}.$$

To solve for $\mathbf{x}$, row reduce $A_\lambda$ to obtain a matrix in RREF:

$$A_\lambda=\begin{pmatrix}3&0&1\\2&0&0\\-2&0&0\end{pmatrix}\to\begin{pmatrix}3&0&1\\0&0&-2/3\\0&0&2/3\end{pmatrix}\to\begin{pmatrix}3&0&0\\0&0&-2/3\\0&0&0\end{pmatrix}\to\begin{pmatrix}1&0&0\\0&0&1\\0&0&0\end{pmatrix}.$$

So we know $x_1=x_3=0$ and $x_2$ is free. Plugging these in we have $\mathbf{x}=\begin{pmatrix}0\\x_2\\0\end{pmatrix}$, and we choose $x_2=1$, obtaining $\begin{pmatrix}0\\1\\0\end{pmatrix}$ for an eigenvector.

Next, for $\lambda=2$ we have:

$$A_\lambda\mathbf{x}=\begin{pmatrix}2&0&1\\2&-1&0\\-2&0&-1\end{pmatrix}\begin{pmatrix}x_1\\x_2\\x_3\end{pmatrix}=\begin{pmatrix}0\\0\\0\end{pmatrix}.$$

To solve for $\mathbf{x}$, row reduce $A_\lambda$ to obtain a matrix in RREF:

$$A_\lambda=\begin{pmatrix}2&0&1\\2&-1&0\\-2&0&-1\end{pmatrix}\to\begin{pmatrix}2&0&1\\0&1&1\\0&0&0\end{pmatrix}\to\begin{pmatrix}1&0&1/2\\0&1&1\\0&0&0\end{pmatrix}.$$

So we know $x_1=-\tfrac12 x_3$ and $x_2=-x_3$. Plugging these in we have $\mathbf{x}=\begin{pmatrix}-\tfrac12 x_3\\-x_3\\x_3\end{pmatrix}$, and we choose $x_3=2$, obtaining $\begin{pmatrix}-1\\-2\\2\end{pmatrix}$.

Finally, for $\lambda=3$ we have:

$$A_\lambda\mathbf{x}=\begin{pmatrix}1&0&1\\2&-2&0\\-2&0&-2\end{pmatrix}\begin{pmatrix}x_1\\x_2\\x_3\end{pmatrix}=\begin{pmatrix}0\\0\\0\end{pmatrix}.$$

To solve for $\mathbf{x}$, row reduce $A_\lambda$ to obtain a matrix in RREF:

$$A_\lambda=\begin{pmatrix}1&0&1\\2&-2&0\\-2&0&-2\end{pmatrix}\to\begin{pmatrix}1&0&1\\0&-2&-2\\0&0&0\end{pmatrix}\to\begin{pmatrix}1&0&1\\0&1&1\\0&0&0\end{pmatrix}.$$

So we know $x_1=-x_3$ and $x_2=-x_3$. Plugging these in we have $\mathbf{x}=\begin{pmatrix}-x_3\\-x_3\\x_3\end{pmatrix}$, and we choose $x_3=1$, obtaining $\begin{pmatrix}-1\\-1\\1\end{pmatrix}$.
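The three eigenvector equations found above can be confirmed numerically; here is a quick check (using numpy, which is not part of the packet's hand methods):

```python
import numpy as np

# Check that (0,1,0), (-1,-2,2), (-1,-1,1) are eigenvectors of A
# with eigenvalues 1, 2, 3 respectively: A v = lambda v in each case.
A = np.array([[4.0, 0.0, 1.0],
              [2.0, 1.0, 0.0],
              [-2.0, 0.0, 1.0]])

pairs = [(1, np.array([0.0, 1.0, 0.0])),
         (2, np.array([-1.0, -2.0, 2.0])),
         (3, np.array([-1.0, -1.0, 1.0]))]
for lam, v in pairs:
    assert np.allclose(A @ v, lam * v)
print("all three eigenvector equations check out")
```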

Example

Changing to eigenbasis yields diagonal matrix

Notice that $A$ in the previous example has three eigenvectors, and that they are independent. Therefore they constitute a basis of $\mathbb{R}^3$; let's call this basis $\mathcal{C}$. Putting these basis vectors as the columns of a matrix, we create the change of basis transfer matrix from $\mathcal{C}$ to the standard basis $\mathcal{E}$:

$$T_{\mathcal{E}\leftarrow\mathcal{C}}=\begin{pmatrix}\mathbf{c}_1&\mathbf{c}_2&\mathbf{c}_3\end{pmatrix}=\begin{pmatrix}0&-1&-1\\1&-2&-1\\0&2&1\end{pmatrix}.$$

So $T_{\mathcal{E}\leftarrow\mathcal{C}}[\mathbf{x}]_{\mathcal{C}}=x_1\mathbf{c}_1+x_2\mathbf{c}_2+x_3\mathbf{c}_3=x_1\begin{pmatrix}0\\1\\0\end{pmatrix}+x_2\begin{pmatrix}-1\\-2\\2\end{pmatrix}+x_3\begin{pmatrix}-1\\-1\\1\end{pmatrix}$.

Now recall that these are eigenvectors of $A$ with eigenvalues $1,2,3$:

$$A\mathbf{c}_1=\mathbf{c}_1,\qquad A\mathbf{c}_2=2\mathbf{c}_2,\qquad A\mathbf{c}_3=3\mathbf{c}_3.$$

If we write $A$ in the basis $\mathcal{C}$ using the transfer matrix $T_{\mathcal{E}\leftarrow\mathcal{C}}$ and its inverse $T_{\mathcal{C}\leftarrow\mathcal{E}}$, we have:

$$[A]_{\mathcal{C}}=T_{\mathcal{C}\leftarrow\mathcal{E}}\,A\,T_{\mathcal{E}\leftarrow\mathcal{C}}=\begin{pmatrix}0&-1&-1\\1&-2&-1\\0&2&1\end{pmatrix}^{-1}\begin{pmatrix}4&0&1\\2&1&0\\-2&0&1\end{pmatrix}\begin{pmatrix}0&-1&-1\\1&-2&-1\\0&2&1\end{pmatrix}=\begin{pmatrix}1&0&0\\0&2&0\\0&0&3\end{pmatrix}.$$

In other words, the matrix of A is a diagonal matrix when written in the basis of its own eigenvectors.
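This change of basis can be verified numerically; the following sketch (using numpy, not part of the packet's hand methods) computes $T^{-1}AT$ with the eigenvectors as columns of $T$:

```python
import numpy as np

A = np.array([[4.0, 0.0, 1.0],
              [2.0, 1.0, 0.0],
              [-2.0, 0.0, 1.0]])
# Columns are the eigenvectors c1, c2, c3 from the example.
T = np.array([[0.0, -1.0, -1.0],
              [1.0, -2.0, -1.0],
              [0.0,  2.0,  1.0]])

# Writing A in the eigenbasis should produce diag(1, 2, 3).
A_in_C = np.linalg.inv(T) @ A @ T
print(np.round(A_in_C, 10))  # diag(1, 2, 3)
```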

The example above illustrates a general fact: if a matrix $A$ has enough independent eigenvectors $\{\mathbf{c}_1,\dots,\mathbf{c}_n\}$ to form a basis of the vector space $\mathbb{R}^n$ on which $A$ acts, then writing $A$ in the basis of these eigenvectors puts it in diagonal form.

In other words, the eigenbasis of $A$ is a natural basis for $A$. It is a basis in which the action of $A$ is extremely simple: it is given by scaling each basis vector by its respective eigenvalue.

Sometimes a matrix does not have enough independent eigenvectors. Here is an illustration:

Example

Matrix with too few eigenvectors

The matrix $A=\begin{pmatrix}1&1\\0&1\end{pmatrix}$ has $\det A_\lambda=(1-\lambda)^2$. Therefore it has a "double" eigenvalue $\lambda=1$. To find the eigenvectors:

$$A_1\mathbf{x}=\begin{pmatrix}0&1\\0&0\end{pmatrix}\begin{pmatrix}x_1\\x_2\end{pmatrix}=\begin{pmatrix}0\\0\end{pmatrix}.$$

The matrix is already in RREF. So the equation says that $x_1$ is free and $x_2=0$. Therefore we have the eigenvector $\begin{pmatrix}1\\0\end{pmatrix}$.

However, no more eigenvectors are available (other than scalar multiples of this one, which are not independent of it). Therefore it is impossible to find a basis of eigenvectors, and thus impossible to convert $A$ to diagonal form by a change of basis.
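The shortage of eigenvectors can also be seen numerically: the eigenspace for $\lambda=1$ is the null space of $A-I$, and its dimension is $2-\operatorname{rank}(A-I)$. Here is a sketch using numpy (not part of the packet's hand methods):

```python
import numpy as np

# For A = [[1,1],[0,1]] the eigenvalue 1 is double, but A - I has
# rank 1, so its null space (the eigenspace) is only 1-dimensional:
# there is no second independent eigenvector.
A = np.array([[1.0, 1.0],
              [0.0, 1.0]])
rank = np.linalg.matrix_rank(A - np.eye(2))
print(rank)      # 1
print(2 - rank)  # dimension of the eigenspace: 1
```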

Sometimes a real matrix has complex eigenvalues and eigenvectors. Here is an illustration:

Example

Matrix with complex eigenvalues and eigenvectors

The matrix $A=\begin{pmatrix}0&-1\\1&0\end{pmatrix}$ has $\det A_\lambda=\lambda^2+1$, so its eigenvalues are $\lambda=\pm i$. To find the eigenvectors:

$$A_i\mathbf{x}=\begin{pmatrix}-i&-1\\1&-i\end{pmatrix}\begin{pmatrix}x_1\\x_2\end{pmatrix}=\begin{pmatrix}0\\0\end{pmatrix}.$$

Row reduce the matrix $A_i$:

$$A_i=\begin{pmatrix}-i&-1\\1&-i\end{pmatrix}\to\begin{pmatrix}-i&-1\\0&0\end{pmatrix}\to\begin{pmatrix}1&-i\\0&0\end{pmatrix}.$$

This is now in RREF, so we see that $x_1=ix_2$ and $x_2$ is free, so we get the eigenvector $\begin{pmatrix}i\\1\end{pmatrix}$.

Following the same procedure for $\lambda=-i$, we find the eigenvector $\begin{pmatrix}-i\\1\end{pmatrix}$.
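Numpy handles complex eigenvalues directly, so this example can be checked as well (again a sketch outside the packet's hand methods):

```python
import numpy as np

A = np.array([[0.0, -1.0],
              [1.0, 0.0]])
vals, vecs = np.linalg.eig(A)
print(np.sort_complex(vals))  # the eigenvalues -i and +i, up to rounding

# Check the eigenvector (i, 1) for lambda = i directly: A v = i v.
v = np.array([1j, 1.0])
assert np.allclose(A @ v, 1j * v)
```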

It is important to be aware of these two situations. For the sake of exams, you should memorize one example of each type. (You can use the two above.)

Question 13-01

A double eigenvalue does not imply insufficient eigenvectors

What are the eigenvalues and eigenvectors of $A=\begin{pmatrix}2&0\\0&2\end{pmatrix}$? Are there enough to form a basis of $\mathbb{R}^2$?

Question 13-02

Enough eigenvectors?

Does the matrix $A=\begin{pmatrix}2&1&0\\0&2&1\\0&0&2\end{pmatrix}$ have enough eigenvectors to form a basis? If not, how many does it have?

Spectral Theorem: symmetric matrices have good eigen theory

A key point of the Spectral Theorem is that symmetric matrices never exhibit the phenomena illustrated in the two examples above: their eigenvalues are always real numbers, and there are always enough eigenvectors to form a basis.

Eigenvectors vs. singular vectors

It is helpful to think about the concept of an eigenvector as consisting of two aspects:

  • (1) Matrix acting by scaling
  • (2) Fixed lines of matrix action

Aspect (1) allows easy computation and geometric understanding of the matrix action. It also allows us to focus on vectors that are more important (those having a larger scale factor). If we have a basis of such vectors, the matrix will have a simple diagonal form when written in this basis.

Aspect (2) goes much further, even though it essentially includes aspect (1). A fixed line (eigenline) is the span of any eigenvector. Such lines are mapped to themselves by the matrix action. This aspect describes eigenvector spans as analogues of fixed points of a function. The concept of fixed points and fixed lines (‘fixity’ in general) only applies when the function or matrix keeps vectors in the same space, i.e. has the same codomain as domain.

Eigenvectors. There are various contexts where a matrix does act on vectors in a single space, and aspect (2) is very useful in these contexts. For example, a differential operator producing a system of first-order linear differential equations involves a matrix $A:V\to V$ for a vector space of functions $V$. Another example is the moment of inertia matrix describing the angular momentum of 3D bodies in physics, or the many operators of quantum mechanics.

One of the biggest advantages of aspect (2) is that it is compatible with iteration.

If $A:V\to V$ is a linear transformation (i.e. a matrix) having the same codomain as domain, we can define iterates $A^2,A^3$, etc. as matrix powers. These powers allow one to apply functions to matrices: if $f(x)=\sum_{n=0}^{\infty}a_nx^n$ is a power series, we can define $f(A)$ by the power series $f(A)=\sum_{n=0}^{\infty}a_nA^n$. (Bracketing any problems with convergence.) An extremely important series is $e^x=1+x+\frac{x^2}{2!}+\frac{x^3}{3!}+\cdots$, since the function $e^{At}$ solves the linear ODE system $X'(t)=AX(t)$.

It is hard to compute $e^{At}$, or more generally $\sum_{n=0}^{\infty}a_nA^n$, by direct multiplication and limits of partial sums. On the other hand, it is easy to compute $e^{At}$ when $A$ is a diagonal matrix, just as it is easy to compute $A^2\mathbf{v}$ when $\mathbf{v}$ is an eigenvector of $A$. (Namely, the latter is $\lambda^2\mathbf{v}$.) For example, in the $2\times 2$ case we have $\exp\begin{pmatrix}\lambda_1&0\\0&\lambda_2\end{pmatrix}=\begin{pmatrix}e^{\lambda_1}&0\\0&e^{\lambda_2}\end{pmatrix}$.

Even when $A$ is not diagonal, if an eigenbasis matrix $T_{\mathcal{E}\leftarrow\mathcal{C}}=\begin{pmatrix}\mathbf{c}_1&\mathbf{c}_2&\cdots&\mathbf{c}_n\end{pmatrix}$ can be found, then the powers of $A$ are still easy to calculate, because (e.g. in 3D):

$$A^2=T_{\mathcal{E}\leftarrow\mathcal{C}}\begin{pmatrix}\lambda_1&0&0\\0&\lambda_2&0\\0&0&\lambda_3\end{pmatrix}T_{\mathcal{C}\leftarrow\mathcal{E}}\;T_{\mathcal{E}\leftarrow\mathcal{C}}\begin{pmatrix}\lambda_1&0&0\\0&\lambda_2&0\\0&0&\lambda_3\end{pmatrix}T_{\mathcal{C}\leftarrow\mathcal{E}}=T_{\mathcal{E}\leftarrow\mathcal{C}}\begin{pmatrix}\lambda_1&0&0\\0&\lambda_2&0\\0&0&\lambda_3\end{pmatrix}\begin{pmatrix}\lambda_1&0&0\\0&\lambda_2&0\\0&0&\lambda_3\end{pmatrix}T_{\mathcal{C}\leftarrow\mathcal{E}}=T_{\mathcal{E}\leftarrow\mathcal{C}}\begin{pmatrix}\lambda_1^2&0&0\\0&\lambda_2^2&0\\0&0&\lambda_3^2\end{pmatrix}T_{\mathcal{C}\leftarrow\mathcal{E}}.$$

Similarly we have:

$$A^n=T_{\mathcal{E}\leftarrow\mathcal{C}}\begin{pmatrix}\lambda_1^n&0&0\\0&\lambda_2^n&0\\0&0&\lambda_3^n\end{pmatrix}T_{\mathcal{C}\leftarrow\mathcal{E}},\qquad e^{A}=T_{\mathcal{E}\leftarrow\mathcal{C}}\begin{pmatrix}e^{\lambda_1}&0&0\\0&e^{\lambda_2}&0\\0&0&e^{\lambda_3}\end{pmatrix}T_{\mathcal{C}\leftarrow\mathcal{E}}.$$

The critical piece of the calculation above is the cancellation $T_{\mathcal{C}\leftarrow\mathcal{E}}T_{\mathcal{E}\leftarrow\mathcal{C}}=I_3$ occurring in the middle of the calculation for $A^2$. This cancellation shows that matrix multiplication and change of basis can be performed in either order. That in turn means we can first change basis to an eigenbasis (if one exists), then perform multiplication with diagonal matrices, where it is easy, and only afterwards change back to the standard basis.
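Here is a numerical sketch of this idea applied to the matrix from the first example (numpy assumed; the truncated power series serves as an independent check of $e^A$):

```python
import numpy as np

# Compute e^A through the eigenbasis, using e^A = T e^D T^{-1}
# with D = diag(1, 2, 3), and compare against a truncated power
# series sum_{n=0}^{30} A^n / n! computed directly.
A = np.array([[4.0, 0.0, 1.0],
              [2.0, 1.0, 0.0],
              [-2.0, 0.0, 1.0]])
T = np.array([[0.0, -1.0, -1.0],   # columns: eigenvectors c1, c2, c3
              [1.0, -2.0, -1.0],
              [0.0,  2.0,  1.0]])
eD = np.diag(np.exp([1.0, 2.0, 3.0]))
eA_via_eigenbasis = T @ eD @ np.linalg.inv(T)

series = np.eye(3)                 # n = 0 term of the power series
term = np.eye(3)
for n in range(1, 31):
    term = term @ A / n            # term is now A^n / n!
    series += term

assert np.allclose(eA_via_eigenbasis, series)
print("T e^D T^{-1} matches the power series for e^A")
```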

Singular vectors. The concept of a singular vector is designed to make use of aspect (1) of eigenvectors without aspect (2). Because they do not include aspect (2), singular vectors of $A$ are not eigenvectors of $A$ except in special cases; they do not determine fixed lines of the action of $A$.

Unlike eigenvectors and eigenvalues, singular vectors and singular values are (best) defined in the context of a collection of vectors giving the singular value decomposition (SVD). So, a singular vector is one of the vectors in the SVD.

Defining the singular value decomposition

Suppose $A:\mathbb{R}^n\to\mathbb{R}^m$ is a matrix providing a linear transformation between different vector spaces.

SVD features

A singular value decomposition of $A:\mathbb{R}^n\to\mathbb{R}^m$ is given by $A=U\Sigma V^{\mathsf{T}}$, where:

  • $V=\begin{pmatrix}\mathbf{v}_1&\mathbf{v}_2&\cdots&\mathbf{v}_n\end{pmatrix}$ is an orthogonal matrix giving a rotation/reflection combo for $\mathbb{R}^n$ (input)
  • $U=\begin{pmatrix}\mathbf{u}_1&\mathbf{u}_2&\cdots&\mathbf{u}_m\end{pmatrix}$ is an orthogonal matrix giving a rotation/reflection combo for $\mathbb{R}^m$ (output)
  • $\Sigma$ has non-negative entries $\sigma_1,\sigma_2,\dots,\sigma_\ell$ on the main diagonal (possibly zeros!) and zeros elsewhere. (Here $\ell=\min\{n,m\}$.) It is not square if $n\neq m$. It could look like these, for example:
$$\begin{pmatrix}1&0&0&0\\0&3&0&0\\0&0&7&0\\0&0&0&0\\0&0&0&0\end{pmatrix},\qquad\begin{pmatrix}1&0&0&0&0&0\\0&3&0&0&0&0\\0&0&7&0&0&0\end{pmatrix}.$$

The orthonormal basis vectors $\mathbf{v}_1,\dots,\mathbf{v}_n$ are called right singular vectors. The orthonormal basis vectors $\mathbf{u}_1,\dots,\mathbf{u}_m$ are called left singular vectors. The values $\sigma_i\ge 0$ are the lengths of the images, namely $\sigma_i=|A\mathbf{v}_i|$.

(Remember: “Right = Input, Left = Output.”)

An alternate way to write the SVD uses scalar projections $\mathbf{v}_i^{\mathsf{T}}$ attached to the vectors $\mathbf{u}_i$:

$$A=\sigma_1\mathbf{u}_1\mathbf{v}_1^{\mathsf{T}}+\sigma_2\mathbf{u}_2\mathbf{v}_2^{\mathsf{T}}+\cdots+\sigma_\ell\mathbf{u}_\ell\mathbf{v}_\ell^{\mathsf{T}}.$$

Each term $\sigma_i\mathbf{u}_i\mathbf{v}_i^{\mathsf{T}}$ takes an input vector, dots it with $\mathbf{v}_i$, then applies this value to $\mathbf{u}_i$ scaled by $\sigma_i$. With this formula, we can compute $A\mathbf{x}$ for any input $\mathbf{x}$ using dot products:

$$A\mathbf{x}=\sigma_1\mathbf{u}_1(\mathbf{v}_1\cdot\mathbf{x})+\sigma_2\mathbf{u}_2(\mathbf{v}_2\cdot\mathbf{x})+\cdots+\sigma_\ell\mathbf{u}_\ell(\mathbf{v}_\ell\cdot\mathbf{x}).$$

By plugging in $\mathbf{x}=\mathbf{v}_i$, we verify that $A\mathbf{v}_1=\sigma_1\mathbf{u}_1$ and $A\mathbf{v}_2=\sigma_2\mathbf{u}_2$, etc., up to $A\mathbf{v}_\ell=\sigma_\ell\mathbf{u}_\ell$. (For $\ell<i\le n$, we have $A\mathbf{v}_i=\mathbf{0}$.)
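The rank-one-sum form of the SVD can be checked numerically. This sketch uses numpy's `np.linalg.svd` (which returns $U$, the singular values, and $V^{\mathsf{T}}$) on the matrix from the worked example later in this packet:

```python
import numpy as np

# Summing the rank-one pieces sigma_i u_i v_i^T rebuilds A.
A = np.array([[1.0, -1.0],
              [-2.0, 2.0],
              [2.0, -2.0]])
U, s, Vt = np.linalg.svd(A)
rebuilt = sum(s[i] * np.outer(U[:, i], Vt[i, :]) for i in range(len(s)))
assert np.allclose(rebuilt, A)
print(np.round(s, 6))  # singular values; sigma_1 = 3*sqrt(2), sigma_2 = 0
```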

Singular vectors vs. eigenvectors

The rule $A\mathbf{v}_i=\sigma_i\mathbf{u}_i$ looks analogous to the rule for eigenvectors.

But it's actually very different! Notice that $\mathbf{v}_i$ and $\mathbf{u}_i$ are not even vectors in the same space!

The rule $A\mathbf{v}=\sigma\mathbf{u}$ does not by itself determine any special property of $\mathbf{v}$ or $\mathbf{u}$. (Take any unit vector $\mathbf{v}$, apply $A$, then define $\sigma=|A\mathbf{v}|$ and $\mathbf{u}=\sigma^{-1}A\mathbf{v}$.)

The significance of the rule $A\mathbf{v}_i=\sigma_i\mathbf{u}_i$ manifests when it holds simultaneously for all vector pairs in orthonormal bases $\mathbf{v}_1,\dots,\mathbf{v}_n$ and $\mathbf{u}_1,\dots,\mathbf{u}_m$.

Still, the SVD says that after rotating/reflecting the basis of $\mathbb{R}^n$ and the basis of $\mathbb{R}^m$ by suitable orthogonal matrices (not necessarily the same on each), the action of $A$ can be represented as simple scalings by the $\sigma_i$.


We may think of $\mathbf{v}_1,\dots,\mathbf{v}_n$ as a special basis for the input space $\mathbb{R}^n$ vis-à-vis the matrix $A$, and $\mathbf{u}_1,\dots,\mathbf{u}_m$ as a corresponding basis for the output space $\mathbb{R}^m$. Then $A$ acts by pairing the vector $\mathbf{v}_i$ with $\mathbf{u}_i$ and stretching it by $\sigma_i$.

[Figure: Partial matrices $\mathbf{u}_i\mathbf{v}_i^{\mathsf{T}}$ for $i=1,\dots,5$.]

[Figure: Partial sum matrices $\sigma_1\mathbf{u}_1\mathbf{v}_1^{\mathsf{T}}+\cdots+\sigma_i\mathbf{u}_i\mathbf{v}_i^{\mathsf{T}}$ for $i=1,\dots,5$.]

Computing the singular value decomposition

In practice, on a computer, the SVD is calculated using numerical procedures that are akin to those used to find approximations to eigenvectors. We do not address those kinds of procedures in this course.

Still, it is possible to specify and calculate SVD elements by hand for a matrix $A$ by using the fact that $A^{\mathsf{T}}A$ is always a symmetric matrix, so the Spectral Theorem can be applied to it. This method also shows that the SVD always exists: the symmetric $A^{\mathsf{T}}A$ always has real eigenvalues and enough eigenvectors by the Spectral Theorem, even when $A$ does not. Let us see how this works.

First observe that $(A^{\mathsf{T}}A)^{\mathsf{T}}=A^{\mathsf{T}}(A^{\mathsf{T}})^{\mathsf{T}}=A^{\mathsf{T}}A$, so $A^{\mathsf{T}}A$ is always symmetric. Also notice that $A^{\mathsf{T}}A:\mathbb{R}^n\to\mathbb{R}^n$, so the input and output spaces of this matrix are the same.

SVD elements

  • $\mathbf{v}_1,\dots,\mathbf{v}_n$ = orthonormal eigenbasis of $A^{\mathsf{T}}A$, as guaranteed by the Spectral Theorem. (So $V$ here is $Q$ in the notation of that theorem.)
  • $\sigma_i=|A\mathbf{v}_i|$, defined for $1\le i\le\ell$. If $(A^{\mathsf{T}}A)\mathbf{v}_i=\lambda_i\mathbf{v}_i$, then $\sigma_i=\sqrt{\lambda_i}$.
  • The equation $\sigma_i\mathbf{u}_i=A\mathbf{v}_i$ for $1\le i\le\ell$ defines unit vectors $\mathbf{u}_i$ for those $i$ with $\sigma_i\neq 0$.
  • Gram-Schmidt provides the additional $\mathbf{u}_i$ vectors when $\sigma_i=0$.
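The first two steps of this recipe can be sketched in numpy (`np.linalg.eigh` diagonalizes the symmetric matrix $A^{\mathsf{T}}A$, returning eigenvalues in ascending order); the matrix is the one from the worked example below:

```python
import numpy as np

# Eigen-decompose A^T A to get V and the sigma_i,
# then set u_i = A v_i / sigma_i for sigma_i != 0.
A = np.array([[1.0, -1.0],
              [-2.0, 2.0],
              [2.0, -2.0]])

lam, V = np.linalg.eigh(A.T @ A)          # ascending eigenvalues
order = np.argsort(lam)[::-1]             # reorder: largest first
lam, V = lam[order], V[:, order]
sigma = np.sqrt(np.clip(lam, 0.0, None))  # sigma_i = sqrt(lambda_i)
print(np.round(sigma, 6))                 # sigma_1 = 3*sqrt(2), sigma_2 = 0

u1 = A @ V[:, 0] / sigma[0]               # first left singular vector
assert np.isclose(np.linalg.norm(u1), 1.0)
```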

Example

Computing SVD by hand

Problem: Find an SVD for the matrix $A=\begin{pmatrix}1&-1\\-2&2\\2&-2\end{pmatrix}$.

Solution: First calculate $A^{\mathsf{T}}A=\begin{pmatrix}9&-9\\-9&9\end{pmatrix}$; this matrix has eigenvalues $\lambda=18$ and $\lambda=0$. Therefore we set $\sigma_1=\sqrt{18}=3\sqrt{2}$ and $\sigma_2=0$. By solving $\begin{pmatrix}-9&-9\\-9&-9\end{pmatrix}\begin{pmatrix}x_1\\x_2\end{pmatrix}=\begin{pmatrix}0\\0\end{pmatrix}$, that is $(A^{\mathsf{T}}A-18I)\mathbf{x}=\mathbf{0}$, we obtain an eigenvector $\begin{pmatrix}1\\-1\end{pmatrix}$, which we normalize to the unit vector $\begin{pmatrix}1/\sqrt{2}\\-1/\sqrt{2}\end{pmatrix}$. By solving $\begin{pmatrix}9&-9\\-9&9\end{pmatrix}\begin{pmatrix}x_1\\x_2\end{pmatrix}=\begin{pmatrix}0\\0\end{pmatrix}$ we obtain an eigenvector $\begin{pmatrix}1\\1\end{pmatrix}$, which we normalize to the unit vector $\begin{pmatrix}1/\sqrt{2}\\1/\sqrt{2}\end{pmatrix}$. We now have $V$ and $\Sigma$:

$$\Sigma=\begin{pmatrix}3\sqrt{2}&0\\0&0\\0&0\end{pmatrix},\qquad V=\begin{pmatrix}1/\sqrt{2}&1/\sqrt{2}\\-1/\sqrt{2}&1/\sqrt{2}\end{pmatrix}.$$

Set

$$\mathbf{u}_1=\sigma_1^{-1}A\mathbf{v}_1=\frac{1}{3\sqrt{2}}\begin{pmatrix}2/\sqrt{2}\\-4/\sqrt{2}\\4/\sqrt{2}\end{pmatrix}=\begin{pmatrix}1/3\\-2/3\\2/3\end{pmatrix}$$

and we notice that $|\mathbf{u}_1|=1$. Now we need unit vectors $\mathbf{u}_2,\mathbf{u}_3$ orthogonal to each other and to $\mathbf{u}_1$. We first calculate a spanning set for the kernel of the matrix $\mathbf{u}_1^{\mathsf{T}}$, and then perform Gram-Schmidt to orthogonalize it. For the kernel:

$$\begin{pmatrix}1/3&-2/3&2/3\end{pmatrix}\begin{pmatrix}x_1\\x_2\\x_3\end{pmatrix}=0,$$

and therefore $x_1-2x_2+2x_3=0$, so we can set $x_2,x_3$ free and solve for $x_1$, thereby obtaining generators for the kernel:

$$\ker\mathbf{u}_1^{\mathsf{T}}=\left\{\begin{pmatrix}2x_2-2x_3\\x_2\\x_3\end{pmatrix}\right\}=\operatorname{span}\left\{\mathbf{w}_2=\begin{pmatrix}2\\1\\0\end{pmatrix},\ \mathbf{w}_3=\begin{pmatrix}-2\\0\\1\end{pmatrix}\right\}.$$

Now we already have $\mathbf{w}_2\perp\mathbf{u}_1$ and $\mathbf{w}_3\perp\mathbf{u}_1$ by our method of defining them. So it just remains to do the last step of Gram-Schmidt. We calculate that

$$\mathbf{w}_3-\frac{\mathbf{w}_3\cdot\mathbf{u}_1}{\mathbf{u}_1\cdot\mathbf{u}_1}\mathbf{u}_1-\frac{\mathbf{w}_3\cdot\mathbf{w}_2}{\mathbf{w}_2\cdot\mathbf{w}_2}\mathbf{w}_2=\begin{pmatrix}-2\\0\\1\end{pmatrix}-\frac{0}{1}\begin{pmatrix}1/3\\-2/3\\2/3\end{pmatrix}-\frac{-4}{5}\begin{pmatrix}2\\1\\0\end{pmatrix}=\begin{pmatrix}-2/5\\4/5\\1\end{pmatrix}.$$

Then

$$\mathbf{u}_2=\frac{1}{|\mathbf{w}_2|}\mathbf{w}_2=\begin{pmatrix}2/\sqrt{5}\\1/\sqrt{5}\\0\end{pmatrix},\qquad\mathbf{u}_3=\frac{5}{\sqrt{45}}\begin{pmatrix}-2/5\\4/5\\1\end{pmatrix}=\begin{pmatrix}-2/\sqrt{45}\\4/\sqrt{45}\\5/\sqrt{45}\end{pmatrix}.$$

We have finished, and the SVD is given by:

$$A=U\Sigma V^{\mathsf{T}}=\begin{pmatrix}1/3&2/\sqrt{5}&-2/\sqrt{45}\\-2/3&1/\sqrt{5}&4/\sqrt{45}\\2/3&0&5/\sqrt{45}\end{pmatrix}\begin{pmatrix}3\sqrt{2}&0\\0&0\\0&0\end{pmatrix}\begin{pmatrix}1/\sqrt{2}&-1/\sqrt{2}\\1/\sqrt{2}&1/\sqrt{2}\end{pmatrix}.$$
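The hand computation can be verified numerically by multiplying out $U\Sigma V^{\mathsf{T}}$ (a sketch using numpy, not part of the hand method):

```python
import numpy as np

# The factors computed by hand; U Sigma V^T should reproduce A.
s2, s45 = np.sqrt(2), np.sqrt(45)
U = np.array([[1/3, 2/np.sqrt(5), -2/s45],
              [-2/3, 1/np.sqrt(5), 4/s45],
              [2/3, 0.0, 5/s45]])
Sigma = np.array([[3*s2, 0.0],
                  [0.0, 0.0],
                  [0.0, 0.0]])
Vt = np.array([[1/s2, -1/s2],
               [1/s2, 1/s2]])
A = np.array([[1.0, -1.0],
              [-2.0, 2.0],
              [2.0, -2.0]])

assert np.allclose(U @ Sigma @ Vt, A)
assert np.allclose(U.T @ U, np.eye(3))  # U is orthogonal
print("U Sigma V^T = A confirmed")
```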
Exercise 13-01

Computing SVD by hand

Find an SVD for the matrix $A=\begin{pmatrix}2&-1\\2&2\end{pmatrix}$.

Exercise 13-02

Computing SVD by hand

Find an SVD for the matrix $A=\begin{pmatrix}3&1\\6&2\\6&2\end{pmatrix}$.

Explaining the SVD elements

First, why is $V=\begin{pmatrix}\mathbf{v}_1&\cdots&\mathbf{v}_n\end{pmatrix}$ an orthogonal matrix? Answer: since $\mathbf{v}_1,\dots,\mathbf{v}_n$ is an orthonormal basis, the matrix $V$ is orthogonal. Second, why is $U=\begin{pmatrix}\mathbf{u}_1&\cdots&\mathbf{u}_m\end{pmatrix}$ an orthogonal matrix? Answer: consider the calculation:

$$\langle A\mathbf{v}_i,A\mathbf{v}_j\rangle=(A\mathbf{v}_i)^{\mathsf{T}}A\mathbf{v}_j=\mathbf{v}_i^{\mathsf{T}}A^{\mathsf{T}}A\mathbf{v}_j=\mathbf{v}_i^{\mathsf{T}}(A^{\mathsf{T}}A\mathbf{v}_j)=\mathbf{v}_i^{\mathsf{T}}(\lambda_j\mathbf{v}_j)=\lambda_j(\mathbf{v}_i^{\mathsf{T}}\mathbf{v}_j)=\begin{cases}\lambda_j&i=j\\0&i\neq j.\end{cases}$$

Notes: The first equality is the dot product written in 'inner product' notation. The second is the transpose rule. The fourth uses the definition of $\mathbf{v}_j$ as an eigenvector of $A^{\mathsf{T}}A$. The last uses the property that $\mathbf{v}_1,\dots,\mathbf{v}_n$ are orthonormal vectors.

From this calculation it follows that $A\mathbf{v}_1,\dots,A\mathbf{v}_n$ are orthogonal vectors (at least when they aren't zero). Since $\mathbf{u}_1,\dots,\mathbf{u}_m$ are the same vectors renormalized as unit vectors, we know $\mathbf{u}_1,\dots,\mathbf{u}_m$ are orthonormal (at least those with $\sigma_i\neq 0$). In the case $\sigma_i=|A\mathbf{v}_i|=0$, the alternate definition of $\mathbf{u}_i$ using Gram-Schmidt implies that all of $\mathbf{u}_1,\dots,\mathbf{u}_m$ are orthonormal.

Third, why is $\sigma_i=\sqrt{\lambda_i}$? Answer: According to the calculation above, $|A\mathbf{v}_i|^2=\lambda_i$. But $|A\mathbf{v}_i|=\sigma_i$ by definition.

Fourth, why is $A=U\Sigma V^{\mathsf{T}}$, with these definitions of $U,\Sigma,V$? Answer: To verify the equality $A=U\Sigma V^{\mathsf{T}}$, it is sufficient to verify that the matrices on both sides have the same action on a basis of $\mathbb{R}^n$. (If that is true, then by linearity the matrices have the same action on any vector, and therefore they are the same matrices.)

Therefore, it is sufficient to verify the equality of actions on the basis $\mathbf{v}_1,\dots,\mathbf{v}_n$. Since this basis is orthonormal, we see that $V^{\mathsf{T}}\mathbf{v}_i$ is the vector with $1$ in the $i$th row and zeros elsewhere. Then $\Sigma(V^{\mathsf{T}}\mathbf{v}_i)$ is the $i$th column of $\Sigma$, which has $\sigma_i$ in the $i$th row and zeros elsewhere. Finally, $U(\Sigma V^{\mathsf{T}}\mathbf{v}_i)$ is the $i$th column of $U$ (which is $\mathbf{u}_i$) scaled by $\sigma_i$. Therefore $(U\Sigma V^{\mathsf{T}})\mathbf{v}_i=\sigma_i\mathbf{u}_i$, and this is equal to $A\mathbf{v}_i$. All these statements are valid for $i=1,\dots,\ell$. For $\ell<i\le n$, both sides send $\mathbf{v}_i$ to $\mathbf{0}$.

Backwards: Going from SVD to eigenvectors and eigenvalues of $A^{\mathsf{T}}A$

Addendum. If we already have an SVD given by $A=U\Sigma V^{\mathsf{T}}$, with $\sigma_1,\dots,\sigma_\ell$ on the diagonal of $\Sigma$, then $\mathbf{v}_1,\dots,\mathbf{v}_n$ must be eigenvectors of $A^{\mathsf{T}}A$, and $\sigma_1^2,\dots,\sigma_\ell^2,0,\dots,0$ are the corresponding eigenvalues.

Proof: Observe that

$$A^{\mathsf{T}}A=(V^{\mathsf{T}})^{\mathsf{T}}\Sigma^{\mathsf{T}}U^{\mathsf{T}}U\Sigma V^{\mathsf{T}}=V\Sigma^{\mathsf{T}}I_m\Sigma V^{\mathsf{T}}=V\Sigma^{\mathsf{T}}\Sigma V^{\mathsf{T}}=V\hat{\Sigma}V^{\mathsf{T}},$$

where $\hat{\Sigma}=\Sigma^{\mathsf{T}}\Sigma$ is the square $n\times n$ diagonal matrix with diagonal entries $\sigma_1^2,\dots,\sigma_\ell^2$ (and zeros beyond). Applying the matrices $A^{\mathsf{T}}A$ and $V\hat{\Sigma}V^{\mathsf{T}}$ to $\mathbf{v}_1,\dots,\mathbf{v}_\ell$ returns $\sigma_1^2\mathbf{v}_1,\dots,\sigma_\ell^2\mathbf{v}_\ell$, which verifies the claim. (Again, both matrices send $\mathbf{v}_i$ to $\mathbf{0}$ for $\ell<i\le n$.)

Problems due 18 Apr 2024 by 12:00pm

Problem 13-01

Computing SVD

Compute an SVD of the matrix $A=\begin{pmatrix}7&1\\5&5\\0&0\end{pmatrix}$.

Problem 13-02

SVD of $A^{\mathsf{T}}$

  • (a) Given an SVD $A=U\Sigma V^{\mathsf{T}}$, what is a formula for an SVD of $A^{\mathsf{T}}$? What are the singular values of $A^{\mathsf{T}}$?
  • (b) Compute an SVD of the matrix $A=\begin{pmatrix}3&2&2\\2&3&-2\end{pmatrix}$ by first computing an SVD of $A^{\mathsf{T}}$ and then applying your result in (a).
Problem 13-03

Optimizing $|A\mathbf{x}|$ using singular values

  • (a) Verify that $|A\mathbf{x}|^2=Q(\mathbf{x})$, where $Q(\mathbf{x})$ is the quadratic form of $A^{\mathsf{T}}A$.
  • (b) Suppose $A=\begin{pmatrix}4&6\\0&4\end{pmatrix}$. Starting with the observation in (a), find a unit vector $\mathbf{x}$ that maximizes the value of $|A\mathbf{x}|$ (and hence of $|A\mathbf{x}|^2$). Describe your answer in terms of singular values and singular vectors of $A$.
Problem 13-04

Optimizing $|A\mathbf{x}|$ to find singular vectors

Show that the second largest singular value of $A$ is equal to the maximum of $|A\mathbf{x}|$, where $\mathbf{x}$ varies over all unit vectors which are orthogonal to $\mathbf{v}$, where $\mathbf{v}$ is a right singular vector corresponding to the largest singular value of $A$.

(Notation clarification: if the singular values are listed in order of size $\sigma_1\ge\sigma_2\ge\sigma_3\ge\cdots\ge\sigma_\ell$, then the largest singular value is $\sigma_1$, the second largest is $\sigma_2$, and the right singular vector in the problem is $\mathbf{v}_1$, satisfying $A\mathbf{v}_1=\sigma_1\mathbf{u}_1$.)

Problem 13-05

Computing SVD

Compute an SVD of the matrix A=(181344219412141112822148). You may use a calculator or computer, but you must show the steps of computation as in the Examples in this packet.