Practice problems

Penrose quasi-inverse: smallest LLS solution

Given an LLS problem with matrix A and vector 𝐛, show that the Penrose quasi-inverse solution A+𝐛 is the LLS solution of smallest length. Show that this solution lies in the cokernel of A. (Hint: use the SVD of A.)
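A quick numerical sanity check of the claim (a NumPy sketch, not a proof; the rank-deficient matrix A and vector 𝐛 below are illustrative choices, not from the problem):

```python
import numpy as np

# For a rank-deficient A, every LLS solution has the form x_plus + k with
# k in Ker(A); the pseudoinverse solution x_plus is the shortest one and
# is orthogonal to the kernel. (Illustrative A and b, not from the text.)
A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0],   # = 2 * row 1, so rank(A) = 2
              [0.0, 1.0, 1.0]])
b = np.array([1.0, 0.0, 2.0])

x_plus = np.linalg.pinv(A) @ b   # A^+ b, computed via the SVD
U, s, Vt = np.linalg.svd(A)
k = Vt[-1]                       # right singular vector with sigma = 0
x_other = x_plus + 0.7 * k       # another LLS solution: same residual,
                                 # but longer, since x_plus is orthogonal to k
```

The residuals of `x_plus` and `x_other` agree, while `x_plus` is strictly shorter and has zero component along the kernel.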

Problem 13-04

Optimizing |A𝐱| to find singular vectors

Show that the second largest singular value of A is equal to the maximum of |A𝐱| where 𝐱 varies over all unit vectors which are orthogonal to 𝐯, where 𝐯 is a right singular vector corresponding to the largest singular value of A.

(Notation clarification: if the singular values are listed in decreasing order σ1 ≥ σ2 ≥ σ3 ≥ ⋯, then the largest singular value is σ1, the second largest is σ2, and the right singular vector in the problem is 𝐯1, satisfying A𝐯1 = σ1𝐮1.)
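A numerical illustration of the claim (a NumPy sketch with a random example matrix; the sampling loop suggests the result but is not a proof):

```python
import numpy as np

# Maximize |Ax| over unit vectors x orthogonal to v1, by random sampling.
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 3))
U, s, Vt = np.linalg.svd(A)
v1 = Vt[0]                      # right singular vector for sigma1

best = 0.0
for _ in range(2000):
    x = rng.standard_normal(3)
    x -= (x @ v1) * v1          # remove the v1 component ...
    x /= np.linalg.norm(x)      # ... and renormalize
    best = max(best, np.linalg.norm(A @ x))
# best approaches sigma2 = s[1], attained near x = +/- v2
```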

Problem 14-05

Penrose quasi-inverse

For this problem, use the definition of A+ based on the SVD of A.

  • (a) Verify that AA+A=A and that A+AA+=A+ for the example A = (3 1 6; 2 6 2) studied in the section on the quasi-inverse. Think about how and why this happens in terms of the SVD.
  • (b) Show that for any matrix A, AA+𝐯 is the projection of 𝐯 onto the image of A.
  • (c) Show that for any matrix A, AA+A=A and A+AA+=A+.

It may be helpful (though not required) for this problem to use the presentation of the SVD in the form A = σ1𝐮1𝐯1𝖳 + σ2𝐮2𝐯2𝖳 + ⋯ as in Packet 13. In order to use this, figure out what the corresponding presentation of the SVD of A+ should be.
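To check your answer numerically, here is a NumPy sketch using one candidate rank-one-sum presentation of A+ (verify that it matches what you derive; the random A here is not the matrix from part (a)):

```python
import numpy as np

# Candidate presentation: if A = sigma1 u1 v1^T + sigma2 u2 v2^T + ...,
# then A^+ = (1/sigma1) v1 u1^T + (1/sigma2) v2 u2^T + ...
# summed over the NONZERO sigmas only.
rng = np.random.default_rng(1)
A = rng.standard_normal((4, 3))
U, s, Vt = np.linalg.svd(A)

A_plus = sum((1.0 / s[i]) * np.outer(Vt[i], U[:, i])
             for i in range(len(s)) if s[i] > 1e-12)

P = A @ A_plus   # should be the orthogonal projection onto Im(A)
```

The identities AA+A=A, A+AA+=A+ then follow from how the 𝐮i and 𝐯i pair up under matrix multiplication, and P = AA+ is symmetric and idempotent, i.e. an orthogonal projection.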

Problem 14-06

LLS uniqueness

  • (a) Show that: A𝐱=𝟎 if and only if A𝖳A𝐱=𝟎. (Hint: use the fact that 𝐱𝖳A𝖳A𝐱 = |A𝐱|².)
  • (b) Show that: A𝖳A is invertible if and only if A has independent columns. (Warning: A need not be square. Hint: consider the dimensions of kernels, and use (a) as well as the rank-nullity theorem.)
  • (c) Explain why the LLS problem for A and 𝐛 has a unique solution if and only if the columns of A are independent. (Use (b).)
  • (d) Continuing from (b), show that: rank(A)=rank(A𝖳A). (Hint: apply rank-nullity to both matrices. Notice that A and A𝖳A have the same number of columns.)

You may observe that (a) implies dim Ker(A) = dim Ker(A𝖳A), while (d) means dim Im(A) = dim Im(A𝖳A). In other words, A and A𝖳A always have kernels of the same dimension and images of the same dimension. This makes sense in terms of the SVD if you think about the kernel, cokernel, and image as generated by the orthonormal basis vectors 𝐮i and 𝐯i, writing A = σ1𝐮1𝐯1𝖳 + σ2𝐮2𝐯2𝖳 + ⋯ as in the previous problem (keeping careful track of which σi are zero or nonzero!).
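These equalities are easy to probe numerically. A small sketch with an illustrative matrix (not from the problem) whose third column is the sum of the first two:

```python
import numpy as np

# A has a dependent column (col 3 = col 1 + col 2), so rank(A) = 2 and the
# kernel is one-dimensional; A^T A has the same rank and the same kernel.
A = np.array([[1.0, 2.0, 3.0],
              [0.0, 1.0, 1.0],
              [1.0, 3.0, 4.0],
              [2.0, 0.0, 2.0]])
G = A.T @ A
x = np.array([1.0, 1.0, -1.0])   # witnesses col1 + col2 - col3 = 0
```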

Finding a projection to the space perpendicular to given vectors

Let:

𝐯1=(1,1,0,0,0,0), 𝐯2=(1,1,1,0,1,0), 𝐯3=(0,0,0,1,0,1), 𝐯4=(0,0,0,1,1,1).
  • Find the orthogonal complement of the span of 𝐯1,𝐯2,𝐯3,𝐯4, written as a span.
  • Find the projection of 𝐮=(1,1,1,1,1,1) to that complement using normal equations.
  • Find a matrix that performs the projection to this orthogonal complement using an augmented matrix.
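A numerical sketch of the computation (the entries of 𝐯1,…,𝐯4 below are my reading of the flattened problem statement; double-check them against your copy):

```python
import numpy as np

# Assumed vectors in R^6 (verify against the problem statement).
v1 = np.array([1, 1, 0, 0, 0, 0], float)
v2 = np.array([1, 1, 1, 0, 1, 0], float)
v3 = np.array([0, 0, 0, 1, 0, 1], float)
v4 = np.array([0, 0, 0, 1, 1, 1], float)
B = np.column_stack([v1, v2, v3, v4])

# Orthogonal projection onto the span, via the normal equations,
# then onto the orthogonal complement:
P_span = B @ np.linalg.solve(B.T @ B, B.T)
P_perp = np.eye(6) - P_span

u = np.ones(6)
p = P_perp @ u   # projection of u onto the complement
```

Note `np.linalg.solve(B.T @ B, ...)` requires the 𝐯i to be independent; with the vectors as read above, they are.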
Problem 14-03

Fitting a plane to four points

Consider the four data points in 3D space:

(x,y,z)=(1,0,0),(0,1,1),(-1,0,3),(0,-1,4).

Find the parameters a,b,c for which the plane defined by z=a+bx+cy best fits these data points.

You may notice that the data points give heights z=0,1,3,4 at the four corners of a square. Normally a plane is determined by 3 non-collinear points through which it passes. No plane passes through all four given points, but there is a plane that minimizes the sum of squares of the vertical errors.
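To check your answer, here is a sketch of the normal-equations computation in NumPy. The sign pattern below ((±1,0) and (0,±1) as the four corners of the square) is my reading of the data; verify it against your copy of the problem.

```python
import numpy as np

# Assumed data: heights z at the four corners (+-1, 0), (0, +-1).
pts = np.array([[ 1,  0, 0],
                [ 0,  1, 1],
                [-1,  0, 3],
                [ 0, -1, 4]], float)
X = np.column_stack([np.ones(4), pts[:, 0], pts[:, 1]])  # design matrix [1, x, y]
z = pts[:, 2]                                            # observation vector
params = np.linalg.solve(X.T @ X, X.T @ z)               # (a, b, c)
```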

LLS uniqueness

Suppose we are given data (x,y)=(x1,y1),(x2,y2),,(xn,yn). The linear least squares problem has a unique solution if and only if the design matrix column vectors are independent. (You may use this fact from the previous problem.)

  • (a) Suppose we are finding the line of best fit for the given data. (Model: y=b+mx.) Show that as long as we can find i and j with xi ≠ xj, there is a unique best-fit line for this data.
  • (b) Suppose we are finding the parabola of best fit for the given data. (Model: y=a+bx+cx².) Suppose that x1, x2, and x3 are all distinct values. Show that there is a unique best-fit parabola for this data.
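For (b), a numerical illustration (with hypothetical x-values) that three distinct x-values already force independent design-matrix columns:

```python
import numpy as np

# Hypothetical data: x1, x2, x3 distinct; later x-values may repeat.
x = np.array([-1.0, 0.5, 2.0, 2.0])
X = np.column_stack([np.ones_like(x), x, x**2])  # columns 1, x, x^2

# The 3x3 minor built from three distinct x-values is a Vandermonde matrix
# with nonzero determinant, so rank(X) = 3 and X^T X is invertible.
V = X[:3]
```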

LLS summation curve model

Consider the data points:

(x,y)=(1,1.8),(2,2.6),(3,3.3),(4,3.8),(5,4.1).

Find the parameters of best fit for the model y = ax + bx². Clearly identify your design matrix, observation vector, and parameter vector. Compute the error vector and the total error.
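A sketch of the setup in NumPy, so you can check your hand computation (the variable names are mine; the design matrix and data are from the problem):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([1.8, 2.6, 3.3, 3.8, 4.1])     # observation vector
X = np.column_stack([x, x**2])              # design matrix: columns x, x^2
params = np.linalg.solve(X.T @ X, X.T @ y)  # parameter vector (a, b)
residual = y - X @ params                   # error vector
total_error = residual @ residual           # sum of squared errors
```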