Packet 04

Tutorial: Theory of Matrices in 2D and 3D

Matrix multiplication

Suppose we have a matrix A and a vector u that A acts upon. We have written “Au” for the output vector given by letting A act on u.

Matrices act by sending vectors to other vectors. It is useful to consider another matrix that acts upon the output vector. For example, if 𝐯:=A𝐮, and if B is a matrix that acts upon v, we can write B(A𝐮)=B𝐯. (The colon in 𝐯:=A𝐮 means “𝐯 is defined as A𝐮.”)

Suppose we define 𝐰:=B𝐯, so that 𝐰=B(A𝐮). Then we say that w is the output of 𝐮 under a composition of actions of the matrices A and B. It turns out that if we start with two matrices A and B, the composition action (letting the matrices act consecutively) can itself be represented by a single matrix. (The composition of linear transformations is a linear transformation.) Let us find this matrix in the case when A and B are 2×2 matrices.

Suppose the entries of A and B are given by

$$
A=\begin{pmatrix}a_{11}&a_{12}\\a_{21}&a_{22}\end{pmatrix},\qquad B=\begin{pmatrix}b_{11}&b_{12}\\b_{21}&b_{22}\end{pmatrix}.
$$

Now compute the action of A followed by the action of B on the vector $\mathbf{u}=\begin{pmatrix}u_1\\u_2\end{pmatrix}$:

$$
\begin{aligned}
B(A\mathbf{u})&=\begin{pmatrix}b_{11}&b_{12}\\b_{21}&b_{22}\end{pmatrix}\left(\begin{pmatrix}a_{11}&a_{12}\\a_{21}&a_{22}\end{pmatrix}\begin{pmatrix}u_1\\u_2\end{pmatrix}\right)\\
&=\begin{pmatrix}b_{11}&b_{12}\\b_{21}&b_{22}\end{pmatrix}\begin{pmatrix}a_{11}u_1+a_{12}u_2\\a_{21}u_1+a_{22}u_2\end{pmatrix}\\
&=\begin{pmatrix}b_{11}(a_{11}u_1+a_{12}u_2)+b_{12}(a_{21}u_1+a_{22}u_2)\\b_{21}(a_{11}u_1+a_{12}u_2)+b_{22}(a_{21}u_1+a_{22}u_2)\end{pmatrix}\\
&=\begin{pmatrix}(b_{11}a_{11}+b_{12}a_{21})u_1+(b_{11}a_{12}+b_{12}a_{22})u_2\\(b_{21}a_{11}+b_{22}a_{21})u_1+(b_{21}a_{12}+b_{22}a_{22})u_2\end{pmatrix}\\
&=\begin{pmatrix}b_{11}a_{11}+b_{12}a_{21}&b_{11}a_{12}+b_{12}a_{22}\\b_{21}a_{11}+b_{22}a_{21}&b_{21}a_{12}+b_{22}a_{22}\end{pmatrix}\begin{pmatrix}u_1\\u_2\end{pmatrix}.
\end{aligned}
$$

Therefore, we define the product matrix BA by this formula:

$$
BA=\begin{pmatrix}b_{11}&b_{12}\\b_{21}&b_{22}\end{pmatrix}\begin{pmatrix}a_{11}&a_{12}\\a_{21}&a_{22}\end{pmatrix}=\begin{pmatrix}b_{11}a_{11}+b_{12}a_{21}&b_{11}a_{12}+b_{12}a_{22}\\b_{21}a_{11}+b_{22}a_{21}&b_{21}a_{12}+b_{22}a_{22}\end{pmatrix}.
$$

With this definition and the above reasoning, we have that B(A𝐮)=(BA)𝐮. (Because of this equality, we can omit the parentheses and write simply BA𝐮, since the result is the same whether you act first by A and then by B or instead you first multiply the matrices and then act by BA.)
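If you want to check this identity numerically, here is a minimal Python sketch (not part of the packet; the matrices and vector are arbitrary choices), representing matrices as nested lists:

```python
def mat_vec(M, u):
    # act on a vector: each output entry is a row of M dotted with u
    return [sum(M[i][j] * u[j] for j in range(len(u))) for i in range(len(M))]

def mat_mul(B, A):
    # entry (i, j) of BA is row i of B dotted with column j of A
    return [[sum(B[i][k] * A[k][j] for k in range(len(A)))
             for j in range(len(A[0]))] for i in range(len(B))]

A = [[1, 2], [3, 4]]
B = [[0, 1], [5, -1]]
u = [2, -3]

# acting consecutively: first A, then B ...
v = mat_vec(B, mat_vec(A, u))
# ... gives the same result as acting once by the product matrix BA
w = mat_vec(mat_mul(B, A), u)
assert v == w
```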

Just as for the formula for a matrix acting on a vector, we have multiple ways to interpret the formula for the matrix BA.

  • Each entry of the product matrix BA is given by taking the dot product of a row of B with a column of A.
  • Each column of the product matrix BA is given by letting B act upon the corresponding column vector of A. In other words, $BA=(B\mathbf{a}_1\;\;B\mathbf{a}_2)$. (Recall that the action of B upon $\mathbf{a}_i$ has two interpretations of its own!)
  • Each row of the product matrix BA is given by letting the entries in the corresponding row vector of B provide the coefficients for a linear combination of the rows of A.

The first of these interpretations is best for doing calculations, and the second is best for understanding abstract theory.
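The column-by-column interpretation can also be checked mechanically; in this illustrative sketch the matrices are again arbitrary choices:

```python
def mat_vec(M, u):
    # a matrix acting on a (column) vector
    return [sum(M[i][j] * u[j] for j in range(len(u))) for i in range(len(M))]

def mat_mul(B, A):
    # the row-dot-column formula for the product BA
    return [[sum(B[i][k] * A[k][j] for k in range(len(A)))
             for j in range(len(A[0]))] for i in range(len(B))]

A = [[1, 2], [3, 4]]
B = [[2, 0], [1, 3]]

# the two columns of A
a1 = [A[0][0], A[1][0]]
a2 = [A[0][1], A[1][1]]

# BA assembled column by column, letting B act on each column of A
col1 = mat_vec(B, a1)
col2 = mat_vec(B, a2)
assembled = [[col1[0], col2[0]], [col1[1], col2[1]]]
assert assembled == mat_mul(B, A)
```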

Bottom-Right Method for computing AB: [figure omitted] E.g.: [figure omitted]

Max Born, Nobel Prize lecture: [figure omitted]

Question 04-01

Some matrix products

Compute the following matrix products by hand:

  • (a) $\begin{pmatrix}2&3\\1&5\end{pmatrix}\begin{pmatrix}4&3&6\\1&2&3\end{pmatrix}$
  • (b) $\begin{pmatrix}1&3&5\\0&2&1\\0&0&4\end{pmatrix}\begin{pmatrix}2&1&1\\0&1&3\\0&0&2\end{pmatrix}$

In both (a) and (b) there are three matrices: two are given and their product is the third. In (a), what size vectors do these three matrices act upon? (How many components?) In (b), what do you notice about the nature of the two matrices and their resulting product? Can you guess a rule that generalizes this fact?

Question 04-02

Double rotation

Show that
$$
\begin{pmatrix}\cos\theta&-\sin\theta\\\sin\theta&\cos\theta\end{pmatrix}^2=\begin{pmatrix}\cos(2\theta)&-\sin(2\theta)\\\sin(2\theta)&\cos(2\theta)\end{pmatrix},
$$
in other words, two rotations by $\theta$ are equivalent to one rotation by $2\theta$.

Exercise 04-01

Nilpotent matrix

Define the matrix $A_\lambda$:

$$
A_\lambda=\begin{pmatrix}0&\lambda&0&0&0\\0&0&\lambda&0&0\\0&0&0&\lambda&0\\0&0&0&0&\lambda\\0&0&0&0&0\end{pmatrix}.
$$

For $\lambda=2$, calculate the powers $A_2$, $A_2^2$, $A_2^3$, $A_2^4$, $A_2^5$.

(What special property does Aλ have? A matrix with that property is said to be nilpotent.)

Identity matrix Notice that the diagonal matrix with 1’s on the diagonal acts upon another vector or matrix by doing nothing:

$$
\begin{pmatrix}1&0&0\\0&1&0\\0&0&1\end{pmatrix}\begin{pmatrix}a_{11}&a_{12}&a_{13}\\a_{21}&a_{22}&a_{23}\\a_{31}&a_{32}&a_{33}\end{pmatrix}=\begin{pmatrix}a_{11}&a_{12}&a_{13}\\a_{21}&a_{22}&a_{23}\\a_{31}&a_{32}&a_{33}\end{pmatrix}.
$$

This matrix is called the identity matrix, since it preserves the identity of whatever it acts upon. It is frequently written “I”, regardless of the size of the matrix. So I𝐮=𝐮 for any 𝐮.

Sometimes I is written more specifically as In, where n is the number of rows or columns. (Identity matrices always have the same number of rows and columns.)
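A quick illustrative sketch (the 3×3 matrix is an arbitrary choice) confirming that the identity matrix does nothing under multiplication on either side:

```python
def mat_mul(B, A):
    # the row-dot-column formula for the product BA
    return [[sum(B[i][k] * A[k][j] for k in range(len(A)))
             for j in range(len(A[0]))] for i in range(len(B))]

def identity(n):
    # the n x n matrix with 1's on the diagonal and 0's elsewhere
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

A = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
I3 = identity(3)
assert mat_mul(I3, A) == A   # I acts by doing nothing ...
assert mat_mul(A, I3) == A   # ... on either side
```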

Exercise 04-02

Reflections square to identity

Show that $[\mathrm{refl}_{\mathbf{v}}]^2=I_2$ for any $\mathbf{v}=\begin{pmatrix}v_1\\v_2\end{pmatrix}$ with $\mathbf{v}\cdot\mathbf{v}=1$.

This means that reflecting twice across the same line gives back the original vector. (In 2D the hyperplane of reflection is just a line.)

Matrices acting on matrices

It is useful to think of matrices as acting upon other matrices by multiplying on the left. The effect is the same as the effect of acting on a vector, but this (same) action is performed on each column vector separately. The effect along any given row of the output is therefore uniform across the columns.

Diagonal matrices When two diagonal matrices are multiplied, the result is a diagonal matrix. The diagonal entries of the product are simply the products of the diagonal entries:

$$
\begin{pmatrix}a_1&0&0\\0&a_2&0\\0&0&a_3\end{pmatrix}\begin{pmatrix}b_1&0&0\\0&b_2&0\\0&0&b_3\end{pmatrix}=\begin{pmatrix}a_1b_1&0&0\\0&a_2b_2&0\\0&0&a_3b_3\end{pmatrix}.
$$

A consequence is that diagonal matrices commute with each other: AB=BA when A and B are both diagonal. (As we saw with 3D rotations, this is frequently not true for general matrix multiplication.)
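As a small numeric illustration (the diagonal entries are arbitrary choices), the following sketch checks both facts at once: the product of diagonal matrices is diagonal with entrywise products on the diagonal, and diagonal matrices commute:

```python
def mat_mul(B, A):
    # the row-dot-column formula for the product BA
    return [[sum(B[i][k] * A[k][j] for k in range(len(A)))
             for j in range(len(A[0]))] for i in range(len(B))]

def diag(*entries):
    # build a diagonal matrix from its diagonal entries
    n = len(entries)
    return [[entries[i] if i == j else 0 for j in range(n)] for i in range(n)]

A = diag(2, 3, 5)
B = diag(7, 11, 13)
# the product is diagonal, with entrywise products on the diagonal
assert mat_mul(A, B) == diag(2 * 7, 3 * 11, 5 * 13)
# and diagonal matrices commute
assert mat_mul(A, B) == mat_mul(B, A)
```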

Row-scale matrices The diagonal matrices

$$
A_1(\lambda)=\begin{pmatrix}\lambda&0&0\\0&1&0\\0&0&1\end{pmatrix},\quad A_2(\lambda)=\begin{pmatrix}1&0&0\\0&\lambda&0\\0&0&1\end{pmatrix},\quad A_3(\lambda)=\begin{pmatrix}1&0&0\\0&1&0\\0&0&\lambda\end{pmatrix}
$$

have the effect of scaling rows 1, 2, 3 (respectively) by the number $\lambda$, without changing the other rows, when they multiply a vector or a matrix on the left. These matrices are called row-scale matrices.

Permutation matrices When two permutation matrices are multiplied, the result is a permutation matrix. For example:

$$
\begin{pmatrix}0&1&0\\1&0&0\\0&0&1\end{pmatrix}\begin{pmatrix}1&0&0\\0&0&1\\0&1&0\end{pmatrix}=\begin{pmatrix}0&0&1\\1&0&0\\0&1&0\end{pmatrix}.
$$
Question 04-03

Permutation matrices

Explain why the product of two permutation matrices is always a permutation matrix. (Recall the definition: a matrix is a permutation matrix when each row and each column contains exactly one 1 and the other entries are 0.)

Row-swap matrices A permutation matrix acts upon another matrix (multiplying on the left) by permuting the rows of the matrix, in the same way that it would permute the rows of a vector:

$$
\begin{pmatrix}0&1\\1&0\end{pmatrix}\begin{pmatrix}a_{11}&a_{12}\\a_{21}&a_{22}\end{pmatrix}=\begin{pmatrix}a_{21}&a_{22}\\a_{11}&a_{12}\end{pmatrix}.
$$

The most important permutation type is one which simply swaps two rows and leaves the others unchanged. This type of matrix is called a row-swap matrix.

The row-swap action is performed by a permutation matrix which has 1's on the diagonal except in two rows. In those two rows, the 1's from the diagonal are swapped with each other.

Example

Row-swap example

The 4×4 matrix which swaps rows 2 and 4 (of a vector or another matrix) works like this:

$$
\begin{pmatrix}1&0&0&0\\0&0&0&1\\0&0&1&0\\0&1&0&0\end{pmatrix}\begin{pmatrix}a_{11}&a_{12}&a_{13}&a_{14}\\a_{21}&a_{22}&a_{23}&a_{24}\\a_{31}&a_{32}&a_{33}&a_{34}\\a_{41}&a_{42}&a_{43}&a_{44}\end{pmatrix}=\begin{pmatrix}a_{11}&a_{12}&a_{13}&a_{14}\\a_{41}&a_{42}&a_{43}&a_{44}\\a_{31}&a_{32}&a_{33}&a_{34}\\a_{21}&a_{22}&a_{23}&a_{24}\end{pmatrix}.
$$

To obtain this row-swap matrix, start with the diagonal matrix with 1’s along the diagonal, and then swap rows 2 and 4.
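The recipe above can be sketched in code; this illustrative example (entries chosen only to make the rows easy to track) builds the 4×4 swap matrix by swapping rows 2 and 4 of the identity:

```python
def mat_mul(B, A):
    # the row-dot-column formula for the product BA
    return [[sum(B[i][k] * A[k][j] for k in range(len(A)))
             for j in range(len(A[0]))] for i in range(len(B))]

def row_swap(n, i, j):
    # start from the n x n identity, then swap rows i and j (0-indexed)
    P = [[1 if r == c else 0 for c in range(n)] for r in range(n)]
    P[i], P[j] = P[j], P[i]
    return P

A = [[11, 12], [21, 22], [31, 32], [41, 42]]
P = row_swap(4, 1, 3)   # swaps rows 2 and 4 in the packet's 1-indexed terms
assert mat_mul(P, A) == [[11, 12], [41, 42], [31, 32], [21, 22]]
```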

Row-add matrices A matrix that has 1’s on the diagonal and the other entries 0’s, except for a single additional entry λ, is called an elementary shear matrix, or a row-add matrix:

$$
\begin{pmatrix}1&\lambda&0\\0&1&0\\0&0&1\end{pmatrix},\qquad\begin{pmatrix}1&0&0\\0&1&0\\\lambda&0&1\end{pmatrix},\qquad\text{etc.}
$$

If the λ entry is in the ith row and jth column, this matrix acts by taking the jth row (of a matrix or vector that it’s acting upon) and adding it into the ith row with a λ coefficient:

$$
\begin{pmatrix}1&\lambda&0\\0&1&0\\0&0&1\end{pmatrix}\begin{pmatrix}a_{11}&a_{12}&a_{13}\\a_{21}&a_{22}&a_{23}\\a_{31}&a_{32}&a_{33}\end{pmatrix}=\begin{pmatrix}a_{11}+\lambda a_{21}&a_{12}+\lambda a_{22}&a_{13}+\lambda a_{23}\\a_{21}&a_{22}&a_{23}\\a_{31}&a_{32}&a_{33}\end{pmatrix}.
$$

Remember that when a matrix multiplies a vector or another matrix on the left, the top row of the (left) matrix controls the top row of the output.
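A short sketch of the row-add action (the matrix and the value of λ are arbitrary choices): the elementary matrix with λ in row i, column j adds λ times row j into row i:

```python
def mat_mul(B, A):
    # the row-dot-column formula for the product BA
    return [[sum(B[i][k] * A[k][j] for k in range(len(A)))
             for j in range(len(A[0]))] for i in range(len(B))]

def row_add(n, i, j, lam):
    # identity plus an extra entry lam in row i, column j (0-indexed);
    # acting on the left, it adds lam times row j into row i
    E = [[1 if r == c else 0 for c in range(n)] for r in range(n)]
    E[i][j] = lam
    return E

A = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
E = row_add(3, 0, 1, 10)   # add 10 * (row 2) into row 1
assert mat_mul(E, A) == [[41, 52, 63], [4, 5, 6], [7, 8, 9]]
```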

Matrix algebra

Matrix addition and scaling Matrices can be added, subtracted, and multiplied by scalars, just like vectors. These operations are done componentwise:

$$
\begin{pmatrix}a_{11}&a_{12}\\a_{21}&a_{22}\end{pmatrix}+\begin{pmatrix}b_{11}&b_{12}\\b_{21}&b_{22}\end{pmatrix}=\begin{pmatrix}a_{11}+b_{11}&a_{12}+b_{12}\\a_{21}+b_{21}&a_{22}+b_{22}\end{pmatrix},\qquad \lambda\begin{pmatrix}a_{11}&a_{12}\\a_{21}&a_{22}\end{pmatrix}=\begin{pmatrix}\lambda a_{11}&\lambda a_{12}\\\lambda a_{21}&\lambda a_{22}\end{pmatrix}.
$$

In terms of the actions performed by these matrices, addition is like addition of functions: the action of a sum of matrices is to give the sum of the outputs under the actions of the separate matrices. For functions, this would be written (f+g)(x)=f(x)+g(x). For matrices, we write (A+B)𝐮=A𝐮+B𝐮, and similarly for subtraction, scalar multiples, and other variations on basic algebra operations.
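The function-addition analogy can be verified numerically; in this sketch the matrices and vector are arbitrary choices:

```python
def mat_vec(M, u):
    # a matrix acting on a vector
    return [sum(M[i][j] * u[j] for j in range(len(u))) for i in range(len(M))]

def mat_add(A, B):
    # componentwise addition of two matrices of the same shape
    return [[A[i][j] + B[i][j] for j in range(len(A[0]))] for i in range(len(A))]

A = [[1, 2], [3, 4]]
B = [[0, -1], [2, 5]]
u = [1, 1]

# (A + B)u equals Au + Bu, just like (f + g)(x) = f(x) + g(x)
lhs = mat_vec(mat_add(A, B), u)
rhs = [x + y for x, y in zip(mat_vec(A, u), mat_vec(B, u))]
assert lhs == rhs
```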

Matrix distributivity It is important to notice that matrix multiplication is compatible with matrix addition in the usual sense of distributivity:

A(B+C)=AB+AC.

One way to prove this formula is to show that it is true after both sides are applied to an arbitrary vector 𝐮. (To complete the proof, let 𝐮 take the specific value of each standard basis vector 𝐞i in turn, and you find that column i of the matrices on each side must agree.) To show the equality after multiplying both sides on an arbitrary 𝐮, we need to check distributivity of matrix actions on vectors in general:

Question 04-04

Linearity / Distributivity of matrix actions

Show that A(𝐮+𝐯)=A𝐮+A𝐯 whenever a matrix A acts on vectors 𝐮 and 𝐯.

Hint: first try this with a 3×3 matrix and 3D vector with arbitrary entries. Then to show it in general, use the “summation notation” for the matrix action on a vector:

$$
(A\mathbf{u})_i=\sum_{j=1}^{n} a_{ij}u_j.
$$

Matrix division For real numbers, division and multiplication are inverse processes, meaning that division “undoes” the action of multiplication, and vice versa.

Any number can divide another number provided it is not zero. For matrices, the situation is more complex, because “being zero” is more complex. Many matrices have non-zero entries, yet they “act as zero” upon certain other vectors. When this happens, the action cannot be “undone”, because you cannot divide by the zero action performed on those vectors.

Example

Projection is not invertible

Consider the projection matrix $\begin{pmatrix}1&0\\0&0\end{pmatrix}$ which sends a vector $(x,y)$ to the vector $(x,0)$. This matrix acts by zero upon any vector $(0,y)$, sending $(0,y)$ to $(0,0)$. This action cannot be undone! (What specific vector would the inverse action send $(0,0)$ to?)

So, some matrices can be divisors. These are called invertible matrices. Others cannot, and these are called non-invertible matrices. A matrix $A$ is invertible when the inverse matrix $A^{-1}$ exists.

When it exists, an inverse matrix $A^{-1}$ acts by undoing multiplication by $A$. Concretely, this means:

$$
A^{-1}A\mathbf{u}=\mathbf{u},\qquad AA^{-1}\mathbf{u}=\mathbf{u}\qquad\text{for any vector }\mathbf{u}.
$$

Notice in the second formula that the action of $A$ can be undone before $A$ even acts. (Ordinary division also works like this!) Another way to look at it is that $A$ acts by undoing the effect of $A^{-1}$. (Ordinary division: multiplication by 5 undoes the effect of multiplying by 1/5.)

Recall the “identity matrix” I which satisfies I𝐮=𝐮 for any 𝐮. So we can write the above formula using matrices:

$$
A^{-1}A=I,\qquad AA^{-1}=I.
$$

Notice that we have to write both formulas because matrix multiplication is not commutative! (Ordinary division: 5(1/5)=1 and (1/5)5=1 are equivalent because ab=ba for any numbers.)

Inverse of 2×2 matrices There is a general formula for the inverse of any 2×2 matrix that should be memorized:

$$
A=\begin{pmatrix}a&b\\c&d\end{pmatrix},\qquad A^{-1}=\frac{1}{ad-bc}\begin{pmatrix}d&-b\\-c&a\end{pmatrix}.
$$
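The memorized formula translates directly into code. This sketch (helper names are my own, using exact rational arithmetic to avoid rounding) verifies $A^{-1}A=AA^{-1}=I$ for one arbitrary invertible matrix:

```python
from fractions import Fraction

def inverse_2x2(A):
    # the memorized formula: swap a and d, negate b and c, divide by ad - bc
    (a, b), (c, d) = A
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is not invertible")
    f = Fraction(1, det)
    return [[f * d, -f * b], [-f * c, f * a]]

def mat_mul(B, A):
    # the row-dot-column formula for the product BA
    return [[sum(B[i][k] * A[k][j] for k in range(len(A)))
             for j in range(len(A[0]))] for i in range(len(B))]

A = [[1, 2], [3, 5]]
Ainv = inverse_2x2(A)
I2 = [[1, 0], [0, 1]]
assert mat_mul(Ainv, A) == I2 and mat_mul(A, Ainv) == I2
```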
Exercise 04-03

Checking formula for 2×2 inverse

Check the formula for $A^{-1}$ for a $2\times 2$ matrix $A$ by computing $AA^{-1}$ and $A^{-1}A$ by hand. (You should obtain $\begin{pmatrix}1&0\\0&1\end{pmatrix}$ in both cases.)

Which 2×2 matrices A have an inverse?

(Hint: you just checked a formula that works whenever it makes sense, so there is an inverse whenever it makes sense, given by the formula. Suppose $a,b,c,d$ are such that the formula does not make sense. Can you find a vector $\mathbf{u}\neq\mathbf{0}$ which is sent to $\mathbf{0}$ by $A$?)

Exercise 04-04

Inverse of a reflection

Show (in 2D) that the matrix of a reflection is the inverse of itself. Show this first abstractly (using the definition of inverse), and then concretely (using the formulas for reflection and inverse matrices).

Exercise 04-05

Inverse of a rotation

Show (in 2D) that the inverse of the matrix of rotation by $\theta$ is the matrix of rotation by $-\theta$.

The inverse of A is useful for finding a vector 𝐮 which is sent to a given vector 𝐯 by the action of a given matrix A:

$$
A\mathbf{u}=\mathbf{v}\quad\text{if and only if}\quad \mathbf{u}=A^{-1}\mathbf{v}.
$$

If we know A and 𝐯, and we want 𝐮, this method will give us 𝐮.

Example

Inverse matrix to find preimage

Problem: Define

$$
A=\begin{pmatrix}1&2\\2&3\end{pmatrix},\qquad \mathbf{v}=\begin{pmatrix}1\\-4\end{pmatrix}.
$$

Find a vector 𝐮 with the property that A𝐮=𝐯, meaning that A sends 𝐮 to the given 𝐯.

Solution: First we compute $A^{-1}$ using the formula. We have $ad-bc=3-4=-1$, so $A^{-1}=\begin{pmatrix}-3&2\\2&-1\end{pmatrix}$. Now compute that $A^{-1}\mathbf{v}=\begin{pmatrix}-11\\6\end{pmatrix}$, and this is our answer for $\mathbf{u}$. Check this answer by computing $A\begin{pmatrix}-11\\6\end{pmatrix}=\begin{pmatrix}1\\-4\end{pmatrix}$, as expected.
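The worked example can be double-checked in a few lines of Python, reusing the example's $A$ and $\mathbf{v}$ (the helper names are my own):

```python
from fractions import Fraction

def inverse_2x2(A):
    # the 2x2 inverse formula, with exact rational arithmetic
    (a, b), (c, d) = A
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]

def mat_vec(M, u):
    # a matrix acting on a vector
    return [sum(M[i][j] * u[j] for j in range(len(u))) for i in range(len(M))]

A = [[1, 2], [2, 3]]
v = [1, -4]
u = mat_vec(inverse_2x2(A), v)   # the preimage u = A^{-1} v
assert u == [-11, 6]
assert mat_vec(A, u) == v        # check: A sends u back to v
```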

Exercise 04-06

Inverse matrix to find preimage

Repeat the previous example for this data:

$$
A=\begin{pmatrix}1&3\\2&4\end{pmatrix},\qquad \mathbf{v}=\begin{pmatrix}4\\2\end{pmatrix}.
$$

Determinant of 2×2 matrices The determinant of a matrix A, written detA, is a single number that is associated to the matrix.

For a 2×2 matrix, the determinant is given by the formula in the denominator of the formula for $A^{-1}$:

$$
\det A=ad-bc\qquad\text{when}\qquad A=\begin{pmatrix}a&b\\c&d\end{pmatrix}.
$$

We will learn many things about determinants. The important thing for now is that $A^{-1}$ exists if and only if $\det A\neq 0$.

Optional: interpretation of 2×2 determinant Just for fun, here is a visual proof (by S. Golomb) that $\det A$ gives the area of the parallelogram spanned by the column vectors of $A$, namely $\begin{pmatrix}a\\c\end{pmatrix}$ and $\begin{pmatrix}b\\d\end{pmatrix}$: [figure omitted] Here is another visual for parallelograms from obtuse vectors: [figure omitted]

Eigenvectors, eigenvalues

We saw that standard basis vectors $\mathbf{e}_i$ are eigenvectors for diagonal matrices, and the diagonal entries are the eigenvalues. If $A$ has diagonal entries $a_1,a_2,a_3$, then $A\mathbf{e}_i=a_i\mathbf{e}_i$.

Now consider the problem of finding eigenvectors and eigenvalues for a non-diagonal matrix $A$. This means finding a vector $\mathbf{v}\neq\mathbf{0}$ and a scalar $\lambda$ such that $A\mathbf{v}=\lambda\mathbf{v}$.

Suppose we start with a hypothetical $\lambda$. Write $\lambda I_2$ for the matrix $\begin{pmatrix}\lambda&0\\0&\lambda\end{pmatrix}$. So we wish to solve $A\mathbf{v}=(\lambda I_2)\mathbf{v}$, which is equivalent to $(A-\lambda I_2)\mathbf{v}=\mathbf{0}$. Now write $A_\lambda:=A-\lambda I_2$. So we wish to find $\mathbf{v}$ such that $A_\lambda\mathbf{v}=\mathbf{0}$.

Notice that if $A_\lambda$ is invertible, then there is no solution. (The only solution would be $\mathbf{v}=\mathbf{0}$, and that is not allowed for an eigenvector.) The reason is that if $A_\lambda^{-1}$ exists, we can let it act upon both sides of the equation $A_\lambda\mathbf{v}=\mathbf{0}$ to obtain $\mathbf{v}=A_\lambda^{-1}\mathbf{0}=\mathbf{0}$. Therefore, we only have a chance of solving this if $A_\lambda$ is non-invertible. Recall from above that $A_\lambda$ is non-invertible precisely when $\det A_\lambda=0$.

It turns out we can solve the equation if $\det A_\lambda=0$. Let us do this by hand for $2\times 2$ matrices.

Derivation that if $\det M=0$ then $M\mathbf{v}=\mathbf{0}$ can be solved with $\mathbf{v}\neq\mathbf{0}$

Let $M=\begin{pmatrix}a&b\\c&d\end{pmatrix}$. Assume $\det M=0$, which means $ad-bc=0$, or $ad=bc$, or $c/a=d/b$. Write $x$ for the common ratio $c/a=d/b$. Thus $M=\begin{pmatrix}a&b\\xa&xb\end{pmatrix}$, and now we wish to solve the equation:

$$
\begin{pmatrix}a&b\\xa&xb\end{pmatrix}\begin{pmatrix}v_1\\v_2\end{pmatrix}=\begin{pmatrix}0\\0\end{pmatrix}.
$$

This means $av_1+bv_2=0$ and $xav_1+xbv_2=0$. Dividing the latter by $x$ we arrive at the former, so we can take any $(v_1,v_2)$ that satisfy $v_1/v_2=-b/a$. Let us choose $(v_1,v_2)=(b,-a)$. You can now check directly that it works:

$$
\begin{pmatrix}a&b\\xa&xb\end{pmatrix}\begin{pmatrix}b\\-a\end{pmatrix}=\begin{pmatrix}0\\0\end{pmatrix}.
$$
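A numeric spot-check of this derivation (the entries $a,b$ and the ratio $x$ are arbitrary choices): any matrix whose second row is $x$ times the first has determinant zero and sends $(b,-a)$ to $\mathbf{0}$:

```python
def mat_vec(M, u):
    # a matrix acting on a vector
    return [sum(M[i][j] * u[j] for j in range(len(u))) for i in range(len(M))]

# a singular 2x2 matrix: second row is x times the first (here x = 3)
a, b = 2, 5
x = 3
M = [[a, b], [x * a, x * b]]
assert M[0][0] * M[1][1] - M[0][1] * M[1][0] == 0   # det M = 0

# the vector (b, -a) from the derivation is sent to zero
v = [b, -a]
assert mat_vec(M, v) == [0, 0]
```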

If $\det A_\lambda=0$, there must be a vector $\mathbf{v}\neq\mathbf{0}$ such that $A\mathbf{v}=\lambda\mathbf{v}$ for some scalar $\lambda$

Proof: Define $A=\begin{pmatrix}a&b\\c&d\end{pmatrix}$, and so

$$
A_\lambda=\begin{pmatrix}a&b\\c&d\end{pmatrix}-\begin{pmatrix}\lambda&0\\0&\lambda\end{pmatrix}=\begin{pmatrix}a-\lambda&b\\c&d-\lambda\end{pmatrix}.
$$

Suppose $\det A_\lambda=0$. By the formula for $\det$, this means $(a-\lambda)(d-\lambda)-bc=0$. That is a quadratic equation in $\lambda$ and can be rewritten as $\lambda^2-(a+d)\lambda+(ad-bc)=0$. By the quadratic formula, the solutions are $\lambda=\tfrac{1}{2}\left(a+d\pm\sqrt{(a-d)^2+4bc}\right)$. Write these two solutions as $\lambda_1$ and $\lambda_2$.

So we have found $\lambda_1$ and $\lambda_2$ with $\det A_{\lambda_1}=0$ and $\det A_{\lambda_2}=0$. In either case we can solve for a $\mathbf{v}_1\neq\mathbf{0}$ such that $A_{\lambda_1}\mathbf{v}_1=\mathbf{0}$ and a $\mathbf{v}_2\neq\mathbf{0}$ such that $A_{\lambda_2}\mathbf{v}_2=\mathbf{0}$ using the method in the box above.

Example

Finding eigenvectors and eigenvalues

Problem: Define $A=\begin{pmatrix}1&6\\5&2\end{pmatrix}$. Find the two eigenvalues and some (any) two corresponding eigenvectors of $A$.

Solution: First write $A_\lambda=\begin{pmatrix}1-\lambda&6\\5&2-\lambda\end{pmatrix}$. Then $\det A_\lambda=0$ means $(1-\lambda)(2-\lambda)-30=0$. Solve this quadratic equation to obtain $\lambda_1=7$ and $\lambda_2=-4$. We have found the two eigenvalues.

Next, write out the matrices

$$
A_{\lambda_1}=\begin{pmatrix}-6&6\\5&-5\end{pmatrix},\qquad A_{\lambda_2}=\begin{pmatrix}5&6\\5&6\end{pmatrix}.
$$

Our goal is to find $\mathbf{v}_1$ and $\mathbf{v}_2$ such that $A_{\lambda_i}\mathbf{v}_i=\mathbf{0}$. We can guess two solutions: let $\mathbf{v}_1=\begin{pmatrix}1\\1\end{pmatrix}$ and let $\mathbf{v}_2=\begin{pmatrix}6\\-5\end{pmatrix}$. To check that these are indeed eigenvectors of the original $A$ corresponding to $\lambda_1=7$ and $\lambda_2=-4$, compute:

$$
\begin{pmatrix}1&6\\5&2\end{pmatrix}\begin{pmatrix}1\\1\end{pmatrix}=\begin{pmatrix}7\\7\end{pmatrix}=7\begin{pmatrix}1\\1\end{pmatrix},\qquad
\begin{pmatrix}1&6\\5&2\end{pmatrix}\begin{pmatrix}6\\-5\end{pmatrix}=\begin{pmatrix}-24\\20\end{pmatrix}=-4\begin{pmatrix}6\\-5\end{pmatrix}.
$$

You could also use the procedure in the gray box “Derivation.” For $A_{\lambda_1}$ we have $x=-5/6$, and for $A_{\lambda_2}$ we have $x=1$. Then we would get $\mathbf{v}_1=\begin{pmatrix}6\\6\end{pmatrix}$ and $\mathbf{v}_2=\begin{pmatrix}6\\-5\end{pmatrix}$. Notice that the first eigenvector is a scalar multiple of ours above: $\begin{pmatrix}6\\6\end{pmatrix}=6\begin{pmatrix}1\\1\end{pmatrix}$.

This is true in general: if $A\mathbf{v}=\lambda\mathbf{v}$, then $A(x\mathbf{v})=\lambda(x\mathbf{v})$ for any scalar $x$. In other words, you can scale an eigenvector and the result is still an eigenvector with the same eigenvalue. This is because $A(x\mathbf{v})=x(A\mathbf{v})$ when $x$ is a scalar, and of course $x\lambda=\lambda x$ when both are scalars. In yet other words, the line spanned by an eigenvector $\mathbf{v}$, called an eigenline, is preserved by the matrix: every vector on this line is mapped back to something on this line.
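To double-check the example's eigenpairs numerically (same $A$, eigenvalues, and eigenvectors as in the example above; the helper is my own):

```python
def mat_vec(M, u):
    # a matrix acting on a vector
    return [sum(M[i][j] * u[j] for j in range(len(u))) for i in range(len(M))]

A = [[1, 6], [5, 2]]   # the matrix from the example
for lam, v in [(7, [1, 1]), (-4, [6, -5])]:
    # check the eigenvector equation A v = lam * v
    assert mat_vec(A, v) == [lam * entry for entry in v]
```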

Summary

  • Matrices can be multiplied by each other. Express $A=(\mathbf{a}_1\;\mathbf{a}_2\;\mathbf{a}_3)$ as a row of column vectors; then $BA$ is the matrix which is the row of column vectors $(B\mathbf{a}_1\;B\mathbf{a}_2\;B\mathbf{a}_3)$. In other words, $BA$ is computed by letting $B$ act upon the columns of $A$.
  • Matrices acting on vectors satisfy $A(\mathbf{u}+\mathbf{v})=A\mathbf{u}+A\mathbf{v}$, and acting on matrices they satisfy $A(B+C)=AB+AC$.
  • The identity matrix $I_3=\begin{pmatrix}1&0&0\\0&1&0\\0&0&1\end{pmatrix}$ has the property that $I_3A=A$ and $AI_3=A$ for any other matrix $A$.
  • Row-scale matrices act on a matrix (or vector) by scaling a chosen row. Row-swap matrices act by swapping two rows. Row-add matrices act by adding one row into another.
  • An “invertible” matrix $A$ is one having an inverse $A^{-1}$, which has the property that $A^{-1}A=AA^{-1}=I_n$. (Here $A$ is an $n\times n$ matrix.) For a $2\times 2$ matrix, its inverse is given by a formula:

$$
A=\begin{pmatrix} a&b\\c&d \end{pmatrix},\quad A^{-1}=\frac{1}{ad-bc}\begin{pmatrix} d&-b\\-c&a \end{pmatrix}.
$$

  • The inverse matrix $A^{-1}$ can be used to solve for $\mathbf{u}$ in the equation $A\mathbf{u}=\mathbf{v}$, where $A$ and $\mathbf{v}$ are both given and $\mathbf{u}$ is sought.
  • The determinant $\det A$ for the $2\times 2$ matrix $A$ (as above) is the number $ad-bc$. The matrix $A$ is invertible if and only if $\det A\neq 0$.
  • Given the matrix $A$, an eigenvector $\mathbf{v}$ is a vector ($\neq\mathbf{0}$) such that $A\mathbf{v}=\lambda\mathbf{v}$ for some scalar $\lambda$ called the eigenvalue corresponding to $\mathbf{v}$.
  • We can find eigenvalues and eigenvectors by: setting $\det(A-\lambda I_2)=0$ and solving a quadratic equation to find $\lambda_1$, $\lambda_2$; then writing $A_{\lambda_1}=A-\lambda_1 I_2$ and $A_{\lambda_2}=A-\lambda_2 I_2$; and then manually solving for vectors $\mathbf{v}_1$ and $\mathbf{v}_2$ satisfying $A_{\lambda_1}\mathbf{v}_1=\mathbf{0}$ and $A_{\lambda_2}\mathbf{v}_2=\mathbf{0}$. This is possible using the fact that one row of $A_{\lambda_i}$ will be a multiple $x$ times the other row.

Problems due 14 Feb 2024 by 12:00pm

Problem 04-01

Basic multiplications; differing sizes

Define the following matrices:

$$
A=\begin{pmatrix}1&2\\3&6\\2&1\end{pmatrix},\quad B=\begin{pmatrix}1&-1\\-1&0\end{pmatrix},\quad C=\begin{pmatrix}1&0&1\\1&3&3\end{pmatrix}.
$$

Compute the matrix products: $AB$, $AC$, $(BC)A$, $B(CA)$.

Problem 04-02

Row reduction using elementary matrices

The row-scale, row-add, and row-swap matrices can be applied in a sequence to manipulate the entries of another matrix or vector. A very useful goal is to convert a matrix into an upper-triangular matrix, which is a matrix that has all entries below the main diagonal equal to zero.

For example, the matrix $\begin{pmatrix}2&-4\\3&-1\end{pmatrix}$ can be manipulated into an upper-triangular matrix by adding $-3/2$ times row 1 into row 2, because $(-3/2)\cdot 2=-3$ and $-3+3=0$. This action is performed by the row-add matrix with $-3/2$ in the $2^{\text{nd}}$ row of the $1^{\text{st}}$ column:

$$
\begin{pmatrix}1&0\\-3/2&1\end{pmatrix}\begin{pmatrix}2&-4\\3&-1\end{pmatrix}=\begin{pmatrix}2&-4\\0&5\end{pmatrix}.
$$

Now, using a sequence of matrix actions chosen from the types row-scale and row-add, convert the matrix

$$
A = \begin{pmatrix}3&2&-2\\15&12&-8\\9&2&-6\end{pmatrix}
$$

into an upper-triangular matrix. (Hint: first create two zeros in the first column, then create the zero needed in the second column.)

Now multiply together the manipulation matrices used to perform your actions (multiplied right-to-left in order of the actions so that the actions are composed).

Finally, show that this product matrix converts the original matrix to an upper-triangular matrix by multiplying on the left.

Problem 04-03

Matrix inversion to find preimages

In each case below, show that $\det A\neq 0$, compute the inverse $A^{-1}$, and use it to find the vector $\mathbf{u}$ satisfying the equation:

  • (a) $\begin{pmatrix}1&2\\3&5\end{pmatrix}\mathbf{u}=\begin{pmatrix}5\\-2\end{pmatrix}$
  • (b) $\begin{pmatrix}2&5\\-3&-7\end{pmatrix}\mathbf{u}=\begin{pmatrix}1\\8\end{pmatrix}$

Problem 04-04

Calculate eigenvalues and eigenvectors for $2\times 2$ matrix

Imitate the Example to find the eigenvalues and eigenvectors for $A$:

$$
A=\begin{pmatrix}7&4\\-3&-1\end{pmatrix}.
$$

Problem 04-05

Actions on the right: rows v. columns

In this problem, we consider the action of a matrix upon another matrix by multiplication on the right. That is, the action of $B$ upon $A$ by sending $A$ to $AB$. (The effect is very similar to acting on the left, except that the roles of ‘rows’ and ‘columns’ are reversed!)

  • (a) Describe the action of a $3\times 3$ matrix $B$ on the right upon a row vector $(v_1\;v_2\;v_3)$. (To obtain the formula, take the matrix product treating the row vector as a $1\times 3$ matrix. Now interpret the formula analogously to the first point in the Summary.)
  • (b) Describe the action of $B$ multiplying on the right on a matrix $A$ in terms of the row vectors of $A$.
  • (c) What does the matrix $B_3(\lambda)=\begin{pmatrix}1&0&0\\0&1&0\\0&0&\lambda\end{pmatrix}$ do to another $3\times 3$ matrix $A$ when it acts on the right?
  • (d) What is the $3\times 3$ matrix that swaps columns $i$ and $j$ by acting on the right?
  • (e) What is the $3\times 3$ matrix that adds column $j$ into column $i$ by acting on the right?