Packet 11

Matrices II: Theory of Linearity

Linearity and the meaning of matrix multiplication

Matrix times vector Matrix multiplication preserves linear combinations:

A(λ1𝐮+λ2𝐯)=λ1(A𝐮)+λ2(A𝐯).

This also means, of course, the preservation of scaling and (thus) of zero: A(λ𝐱)=λ(A𝐱), A𝟎=𝟎.

This fact about matrix actions is not merely a convenient property: it is part of the very essence and definition of matrices. Linearity alone completely determines the formulas for matrix multiplication and matrix actions on vectors, as we see next.
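The preservation of linear combinations can be checked numerically. Here is a minimal sketch in Python with NumPy; the particular matrix and vectors are arbitrary choices for illustration, not taken from the packet:

```python
import numpy as np

# A hypothetical 2x3 matrix and two vectors, chosen arbitrarily.
A = np.array([[2.0, 0.0, 1.0],
              [1.0, 3.0, 5.0]])
u = np.array([1.0, 2.0, 3.0])
v = np.array([4.0, 0.0, 1.0])
lam1, lam2 = 2.0, -3.0

# Linearity: A(lam1*u + lam2*v) equals lam1*(A u) + lam2*(A v).
lhs = A @ (lam1 * u + lam2 * v)
rhs = lam1 * (A @ u) + lam2 * (A @ v)
assert np.allclose(lhs, rhs)

# Special cases: preservation of scaling and of the zero vector.
assert np.allclose(A @ (5.0 * u), 5.0 * (A @ u))
assert np.allclose(A @ np.zeros(3), np.zeros(2))
```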

Matrix action on vectors determined by linearity

Suppose that A acts upon vectors and preserves linear combinations. Since it acts on vectors, there is a well-defined image of every standard basis vector 𝐞1, …, 𝐞n. Write 𝐚i = A𝐞i for these images. Note that nothing about matrix formulas is yet involved.

Now consider any vector 𝐱. We are able to write 𝐱 using components in the standard basis:

𝐱 = x1𝐞1 + ⋯ + xn𝐞n,

and therefore linearity implies:

A𝐱 = A(x1𝐞1 + ⋯ + xn𝐞n) = x1(A𝐞1) + ⋯ + xn(A𝐞n) = x1𝐚1 + ⋯ + xn𝐚n.

Now recall the formula for a matrix A acting on a vector 𝐱: the output is the linear combination of the columns of A using the xi as coefficients. So, if we write 𝐚1, …, 𝐚n into the columns of A, then A𝐱 according to this column-combination formula is precisely the vector given by linearity. In other words, the usual matrix action is forced upon us once the images 𝐚i of the basis vectors 𝐞i are recorded in the matrix columns.
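This construction can be carried out concretely. The sketch below (Python/NumPy, with arbitrary images 𝐚i chosen for illustration) records the images of the standard basis vectors as columns and checks that the resulting matrix reproduces the linear action:

```python
import numpy as np

# Hypothetical images a_i = A e_i of the standard basis vectors (arbitrary choices).
a1 = np.array([1.0, 4.0])
a2 = np.array([2.0, 5.0])
a3 = np.array([3.0, 6.0])

# Writing the images into the columns gives the matrix of the action.
A = np.column_stack([a1, a2, a3])

# For any x, linearity forces A x = x1*a1 + x2*a2 + x3*a3.
x = np.array([2.0, -1.0, 3.0])
assert np.allclose(A @ x, x[0] * a1 + x[1] * a2 + x[2] * a3)

# In particular, A e_i recovers the recorded image a_i.
e1 = np.array([1.0, 0.0, 0.0])
assert np.allclose(A @ e1, a1)
```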

In Question 04-04 we proved that matrix multiplication, as given by the summation-notation formulas in Packet 04, does satisfy linearity:

Derivation of linearity of matrix multiplication

(A(λ1𝐮 + λ2𝐯))i = ∑_{j=1}^{n} aij(λ1uj + λ2vj)
= ∑_{j=1}^{n} (aijλ1uj + aijλ2vj)
= ∑_{j=1}^{n} (λ1(aijuj) + λ2(aijvj))
= λ1 ∑_{j=1}^{n} aijuj + λ2 ∑_{j=1}^{n} aijvj
= λ1(A𝐮)i + λ2(A𝐯)i
= (λ1(A𝐮) + λ2(A𝐯))i.

This sequence shows that the ith entry of the vector A(λ1𝐮+λ2𝐯) agrees with the ith entry of the vector λ1(A𝐮)+λ2(A𝐯). Since this is true for every entry i, the vectors must be equal.

Question 11-01

Finding a matrix using linearity

What is the 2×2 matrix that sends 2𝐞1 − 2𝐞2 to (4, 2) and 3𝐞2 to (3, 6)?

(Hint: first determine where 𝐞1 and 𝐞2 are sent using linearity. Then: the matrix is given by using these images as column vectors 𝐚1 and 𝐚2.)

Matrix times matrix We already have a definition of matrix multiplication as the “composition of matrix action,” namely that (AB)𝐱=A(B𝐱) for all vectors 𝐱.

Since A acts linearly and B acts linearly, the above definition implies that AB acts linearly:

(AB)(λ1𝐮 + λ2𝐯) = A(B(λ1𝐮 + λ2𝐯))
= A(λ1(B𝐮) + λ2(B𝐯))
= λ1(A(B𝐮)) + λ2(A(B𝐯))
= λ1((AB)𝐮) + λ2((AB)𝐯).

Therefore, by our previous reasoning, we can represent the action of AB using a matrix where the ith column of the matrix of AB is the image of 𝐞i under this composition action:

(AB)i = (AB)𝐞i = A(B𝐞i) = A𝐛i, where (AB)i denotes the ith column of AB and 𝐛i denotes the ith column of B.

In other words, the columns of AB must be the images of the columns of B under the action of A.
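This column description of the product can be verified directly. A short sketch (Python/NumPy, with randomly generated matrices as an arbitrary illustration):

```python
import numpy as np

# Arbitrary compatible matrices, generated for illustration.
rng = np.random.default_rng(0)
A = rng.integers(-5, 5, size=(3, 4)).astype(float)
B = rng.integers(-5, 5, size=(4, 2)).astype(float)

AB = A @ B

# Column i of AB is the image under A of column i of B.
for i in range(B.shape[1]):
    assert np.allclose(AB[:, i], A @ B[:, i])
```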

Linearity is fundamental

In summary: the idea of matrix product, and the formula for matrix product, come from the prescriptive hypothesis of linearity of matrix action. We can do many things in linear algebra by thinking and working in terms of linearity, instead of in terms of matrix formulas for actions and multiplications.

Question 11-02

Linearity and quadratic forms

Consider the function f: ℝ² → ℝ given by f(x, y) = xy. Is this function linear? If not, is it linear in each variable separately?

Question 11-03

Dual vectors

Suppose 𝐯 ∈ ℝⁿ is some vector. Define a function f𝐯: ℝⁿ → ℝ by the formula f𝐯(𝐱) = 𝐯 ⋅ 𝐱. Is this function linear? If so, find a 1×n matrix that represents the action of f𝐯. This matrix is often called the dual vector of 𝐯.

Exercise 11-01

Projection and inclusion

Suppose p: ℝ⁴ → ℝ² and i: ℝ² → ℝ⁴ are two linear mappings given by the formulas:

p: (u1, u2, u3, u4) ↦ (u3, u4),  i: (v1, v2) ↦ (v1, v2, 0, 0).
  • (a) Check that p and i (given by these formulas) are linear mappings.
  • (b) Compute the matrices that correspond to p and i.
  • (c) Compute the matrix of the composition i∘p by evaluating the effect of this composite function upon the standard basis vectors 𝐞1, …, 𝐞4.
  • (d) Compute the same matrix by multiplying the matrix of i and the matrix of p.
  • (e) Repeat (c) and (d) for p∘i. (This time you only need 𝐞1, 𝐞2.)

Change of basis

The previous section encourages us to think of linearity as fundamental, and the matrix formulas as secondary, derivable from linearity.

To be specific, the array of numbers that we have taken for granted as the “matrix columns” can be understood as “really” just the images 𝐚i of the standard basis vectors 𝐞i. When a new vector is given to us in the form x1𝐞1 + ⋯ + xn𝐞n, then the action of A on this vector is calculated using linearity as x1𝐚1 + ⋯ + xn𝐚n. By writing these 𝐚i into the columns of a matrix, and putting the coefficients (x1, …, xn) into the entries of a vector, we arrive at the usual matrix-on-vector multiplication formula.

Most importantly: we do not need to know what the vectors 𝐞i are to make sense of the rows and columns of the matrix A, provided we insist that the columns 𝐚i are supposed to give the images of 𝐞i.

This fact allows us to generalize the idea of a matrix to the idea of a matrix in a basis. The coefficient columns of a matrix A in a basis ℬ′ are simply the vector images of the members of the basis ℬ′. Images of other vectors are calculated by writing them in terms of the basis and then applying matrix-on-vector multiplication by A.

Suppose we are given a specific basis ℬ′ = {𝐟1, …, 𝐟n} for the space ℝⁿ that may not be the standard basis. Recall what this means: (a) that ℬ′ is independent, and (b) that the span of ℬ′ is all of ℝⁿ. (Note: we use the prime notation because we would write ℬ = {𝐞1, …, 𝐞n}, with no prime, for the standard basis.)

Using (a) and (b), every vector 𝐱 can be written with unique coefficients as a linear combination:

𝐱 = ν1𝐟1 + ⋯ + νn𝐟n.

Now suppose we have some image vectors 𝐠1, …, 𝐠n, meaning that A: 𝐟i ↦ 𝐠i for all i, and let us define the matrix Aℬ′ using this knowledge. By linearity we calculate:

A(𝐱) = ν1A(𝐟1) + ⋯ + νnA(𝐟n) = ν1𝐠1 + ⋯ + νn𝐠n.

This is just like the formulation of the action of A on 𝐱 using the basis vectors 𝐞i and their images 𝐚i under A. So, we define the matrix Aℬ′ as having column vectors 𝐠i. This matrix is given in the basis ℬ′ in the sense that when a given vector 𝐱 is written with coefficients ν1, …, νn, the action of A on 𝐱 is computed using the usual matrix-on-vector multiplication formula, but with the columns of Aℬ′ being the vectors 𝐠1, …, 𝐠n and the coefficient vector (ν1, …, νn) standing in for 𝐱.

It can be helpful, when writing the array (ν1, …, νn) of coefficients of 𝐱, to use the notation [𝐱]ℬ′. This notation refers to the array of numbers ν1, …, νn that are used to write 𝐱 as a linear combination ν1𝐟1 + ⋯ + νn𝐟n of the vectors 𝐟i that constitute the basis ℬ′. The brackets suggest the specific array in some non-standard basis, and the subscript ℬ′ specifies that basis.

Note: some writers also use brackets for matrices in a specified basis, for example writing [A]ℬ′. Going even further, some authors distinguish the matrix itself (which presupposes a basis) from the underlying linear transformation that determines it (using the presupposed basis to interpret its coefficients). In our notation, we only put brackets on vectors. The symbols A and Aℬ′ stand for distinct concrete matrices, but the visual relation between them serves as a reminder that Aℬ′ performs the same action as A when vector components are rewritten using the basis ℬ′.

Example

Matrix written in ℬ′ acting on vector written in ℬ′

Problem: The set of vectors

ℬ′ = {𝐟1 = (1, 1), 𝐟2 = (−1, 1)}

is a basis of ℝ². (These vectors are independent and there are two of them, so they span all of ℝ².)

Consider the matrix A = (2 2; 0 1) and the vector 𝐱 = (−2, 4). Compute the array [𝐱]ℬ′ that represents 𝐱 in the new basis ℬ′, and then compute Aℬ′. Calculate the image A𝐱 and compare this to the image Aℬ′[𝐱]ℬ′. They should agree!

Solution: First we find [𝐱]ℬ′. We must solve the system

ν1(1, 1) + ν2(−1, 1) = (−2, 4),  so:  ν1 − ν2 = −2,  ν1 + ν2 = 4.

The solution is ν1 = 1, ν2 = 3. Therefore [𝐱]ℬ′ = (1, 3).

Next we seek Aℬ′. The columns of Aℬ′ are given by the images under A of the basis vectors 𝐟1 and 𝐟2. So we compute:

A𝐟1 = (2 2; 0 1)(1, 1) = (4, 1),  A𝐟2 = (2 2; 0 1)(−1, 1) = (0, 1).

Therefore we have Aℬ′ = (4 0; 1 1). This matrix sends [𝐟1]ℬ′ = (1, 0) to the vector (4, 1) and it sends [𝐟2]ℬ′ = (0, 1) to the vector (0, 1).

Finally, observe on the one hand that

Aℬ′[𝐱]ℬ′ = (4 0; 1 1)(1, 3) = (4, 4),

while on the other hand

A𝐱 = (2 2; 0 1)(−2, 4) = (−4+8, 0+4) = (4, 4).

Therefore Aℬ′[𝐱]ℬ′ = A𝐱 as the problem statement had anticipated.

In this example, the output Aℬ′[𝐱]ℬ′ = (4, 4) was written as a vector in the standard basis. This means it equals 4𝐞1 + 4𝐞2. However, if we are given the input 𝐱 in terms of the new basis as 𝐟1 + 3𝐟2, then we may wish to represent the output also in terms of this basis; in other words we may want [A𝐱]ℬ′. In a sense, the matrix Aℬ′ has been adjusted to use the basis ℬ′ on the input side only, but we may want a matrix that uses the basis ℬ′ on the output side as well.

Such a matrix would be denoted by ℬ′Aℬ′. This matrix operates on vectors [𝐱]ℬ′ that are given in terms of ℬ′ and it produces vectors [𝐲]ℬ′ that are also given in terms of ℬ′. To study ℬ′Aℬ′ more effectively we introduce a new concept, the matrix of a change of basis.

Matrix of change of basis

Consider the problem of finding ν1, …, νn, the coefficients such that 𝐱 = ν1𝐟1 + ⋯ + νn𝐟n for some given 𝐱. Let us write ℬTℬ′ = (𝐟1 𝐟2 ⋯ 𝐟n) for the matrix with column vectors 𝐟1, …, 𝐟n, and call it the “transfer matrix” from basis ℬ′, or T for short.

Notice specifically that T accepts the input vector (ν1, …, νn) and (using matrix-on-vector multiplication) returns the output vector 𝐱 = T(ν1, …, νn) given in the standard basis. Therefore, in order to find (ν1, …, νn), we simply need to invert T and multiply its inverse by 𝐱:

(ν1, …, νn) = T⁻¹(x1, …, xn).

Now, because T⁻¹ changes a vector from basis ℬ into a vector in basis ℬ′, it is also fair to write:

ℬ′Tℬ = T⁻¹.

Using transfer or change of basis matrices, we can compute Aℬ′ just by taking a matrix product:

Aℬ′ = AT.

The operation T converts vectors from basis ℬ′ into vectors in basis ℬ, and then we multiply those vectors by A. The result is the same as first computing the matrix of A in the basis ℬ′, obtaining Aℬ′, and acting by that matrix.

Finally, if we wish to put the outputs of Aℬ′ back into the basis ℬ′, we just apply another transfer matrix:

ℬ′Aℬ′ = T⁻¹Aℬ′ = T⁻¹AT.
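The whole change-of-basis recipe can be verified numerically. The sketch below (Python/NumPy) uses an arbitrary matrix and an arbitrary non-standard basis, not the packet's example; it checks that the conjugated matrix acting on coordinates agrees with the original matrix acting on vectors:

```python
import numpy as np

# A hypothetical 2x2 matrix and a non-standard basis f1, f2 (arbitrary choices).
A = np.array([[2.0, 1.0],
              [0.0, 3.0]])
f1 = np.array([1.0, 1.0])
f2 = np.array([-1.0, 2.0])
T = np.column_stack([f1, f2])      # transfer matrix: B'-coordinates -> standard basis

A_bb = np.linalg.inv(T) @ A @ T    # the matrix written entirely in the basis B'

# Pick a vector by its B'-coordinates nu, so x = nu1*f1 + nu2*f2.
nu = np.array([2.0, -1.0])
x = T @ nu

# Acting on B'-coordinates with A_bb must agree with acting on x by A,
# once the output A x is also converted back into B'-coordinates.
assert np.allclose(A_bb @ nu, np.linalg.inv(T) @ (A @ x))
```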

Example

Finding ℬ′Aℬ′ using ‘change of basis’ transfer matrices

Problem: Find the transfer matrix of the example in the previous section and use it to calculate ℬ′Aℬ′.

Solution: We know that

T = (𝐟1 𝐟2) = (1 −1; 1 1).

Notice that the matrix Aℬ′ found in the example is quickly computed as the matrix product:

Aℬ′ = AT = (2 2; 0 1)(1 −1; 1 1) = (4 0; 1 1).

To find the matrix ℬ′Aℬ′, we compute the inverse of T using the usual 2×2 formula for inverses:

ℬ′Tℬ = T⁻¹ = (1 −1; 1 1)⁻¹ = (1/2 1/2; −1/2 1/2).

Then we can calculate ℬ′Aℬ′ by taking the product:

ℬ′Aℬ′ = T⁻¹Aℬ′ = (1/2 1/2; −1/2 1/2)(4 0; 1 1) = (5/2 1/2; −3/2 1/2).

Example

Directly computing change of basis matrix using row reduction

Consider the following two bases:

ℬ = {𝐛1 = (7, 5), 𝐛2 = (3, 1)},  𝒞 = {𝐜1 = (−1, 5), 𝐜2 = (2, −2)}.

Problem: Find the transfer matrices ℬT𝒞 and 𝒞Tℬ representing change of basis.

Solution: We only need to find one, since the other will be determined as the inverse of the first. Let us start with ℬT𝒞.

Let B = (𝐛1 𝐛2) and C = (𝐜1 𝐜2). Now suppose we have found a matrix X such that BX = C. Then X is in fact the transfer matrix ℬT𝒞. Why? Observe that by definition, ℬT𝒞(1, 0) = [𝐜1]ℬ = (x, y) where x𝐛1 + y𝐛2 = 𝐜1, and similarly ℬT𝒞(0, 1) = [𝐜2]ℬ = (z, w) where z𝐛1 + w𝐛2 = 𝐜2. But these two equations precisely combine to the matrix equation BX = C, where X = (x z; y w).

So, we solve the equation BX = C using row reduction on the augmented matrix (B | C):

(B | C) = (7 3 | −1 2; 5 1 | 5 −2) → (7 3 | −1 2; 0 −8/7 | 40/7 −24/7) → (1 3/7 | −1/7 2/7; 0 1 | −5 3) → (1 0 | 2 −1; 0 1 | −5 3).

It follows that ℬT𝒞 = X = (2 −1; −5 3). By computing the inverse 𝒞Tℬ = (ℬT𝒞)⁻¹ using the 2×2 formula, we have 𝒞Tℬ = (3 1; 5 2).
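The equation BX = C can also be solved numerically instead of by hand row reduction. A sketch in Python/NumPy, using an arbitrary pair of bases of ℝ² for illustration:

```python
import numpy as np

# Columns are the basis vectors (an arbitrary pair of bases of R^2).
B = np.array([[7.0, 3.0],
              [5.0, 1.0]])
C = np.array([[-1.0, 2.0],
              [ 5.0, -2.0]])

# Solving B X = C gives the transfer matrix from C-coordinates to B-coordinates.
X = np.linalg.solve(B, C)
assert np.allclose(B @ X, C)

# Coordinate-conversion property: if y holds the C-coordinates of a vector,
# then X @ y holds its B-coordinates (both describe the same standard-basis vector).
y = np.array([4.0, -3.0])
assert np.allclose(B @ (X @ y), C @ y)

# The reverse transfer matrix is simply the inverse of X.
assert np.allclose(np.linalg.inv(X) @ X, np.eye(2))
```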

Exercise 11-02

Change of basis using row reduction

Find the change of basis transfer matrices between the following two bases:

ℬ = {𝐛1 = (6, 1), 𝐛2 = (2, 0)},  𝒞 = {𝐜1 = (2, 1), 𝐜2 = (6, 2)}.
Exercise 11-03

Changing basis without knowing vectors

Assume that ℬ = {𝐛1, 𝐛2, 𝐛3} and 𝒞 = {𝐜1, 𝐜2, 𝐜3} are bases of ℝ³, and that

𝐛1 = 4𝐜1 − 𝐜2
𝐛2 = 𝐜1 + 𝐜2 + 𝐜3
𝐛3 = 𝐜2 − 2𝐜3.

Find the change of basis transfer matrix 𝒞Tℬ, and also compute [𝐱]𝒞 for the vector 𝐱 = 3𝐛1 + 4𝐛2 + 𝐛3.

Image, kernel, transpose, rank

Definitions Suppose that A: ℝⁿ → ℝᵐ is a matrix giving a linear transformation. (Notice that we don’t assume n = m.) In this section we introduce three important subspaces that are derived from A. These subspaces may be referred to and used every time we have a matrix A representing a linear transformation.

  • The image of A, written Im(A) and also called the column space or range, is the span of the columns of A, which is equivalent to the set of all possible outputs 𝐲 = A𝐱 for 𝐱 ∈ ℝⁿ.
  • The kernel of A, written Ker(A) and also called the null space (written Nul(A)), is the set of inputs that A sends to 𝟎, i.e. the set of 𝐱 ∈ ℝⁿ such that A𝐱 = 𝟎.
  • The co-kernel of A, written CoKer(A) and also called the row space of A, is the span of the row vectors of A, which is equivalent to the image of the transpose A𝖳. It is also equivalent to the orthogonal complement of the kernel Ker(A).

For the last one, we need to define the transpose of any matrix A, written as A𝖳, as the matrix which reflects A across the main diagonal, swapping the roles of rows and columns.

Be aware that if 𝐯 is an ordinary column vector, then its transpose 𝐯𝖳 is a row vector. Another way to view this: every column vector 𝐯 ∈ ℝⁿ is actually an n×1 matrix, while its transpose row vector 𝐯𝖳 is a 1×n matrix.
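The shape bookkeeping is easy to check in code. A small sketch in Python/NumPy, where the column vector is stored explicitly as an n×1 matrix (the numbers are arbitrary):

```python
import numpy as np

# A column vector in R^3, written explicitly as a 3x1 matrix.
v = np.array([[1.0], [2.0], [3.0]])
assert v.shape == (3, 1)

# Its transpose is a 1x3 row vector.
assert v.T.shape == (1, 3)

# The product v^T x of a 1x3 row with a 3x1 column is a 1x1 matrix: the dot product.
x = np.array([[4.0], [5.0], [6.0]])
assert (v.T @ x).shape == (1, 1)
assert (v.T @ x)[0, 0] == 1*4 + 2*5 + 3*6
```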

Exercise 11-04

Transpose has ‘reversing property’

Show that (AB)𝖳=B𝖳A𝖳.

Hint: You should use the summation notation for matrix products: (AB)ij = ∑_{k=1}^{n} aik bkj.

More about co-kernels The meaning of image and kernel is clear from these definitions, but the meaning of co-kernel is less so.

If A is an m×n matrix with entries (aij), then the row vectors of A are just its rows (ai1, …, ain) for each i = 1, …, m. These are 1×n matrices. We will use the notation 𝐚1, …, 𝐚m for the row vectors.

By taking transposes, we can convert these row vectors into ordinary vectors 𝐚i𝖳. Notice that the ith row vector of A equals the ith column vector of the transpose A𝖳.

Now consider the connection between row vectors and dot products. A row vector 𝐚i, being a 1×n matrix, acts upon an n×1 column vector and returns a 1×1 matrix, which is to say a scalar. This action is given by the formula:

𝐚i𝐱 = (ai1 ⋯ ain)(x1; …; xn) = ai1x1 + ⋯ + ainxn.

This formula is the same as taking the dot product 𝐚i𝖳 ⋅ 𝐱.

By generalizing this to every row of the matrix A, and recalling that the multiplication A𝐱 gives a vector whose ith entry is the ith row of A applied to 𝐱, we obtain the general fact that A𝐱 = 𝟎 if and only if 𝐱 is perpendicular to each of the row vectors transposed into ordinary vectors 𝐚i𝖳:

A𝐱 = 𝟎  ⟺  𝐚i𝖳 ⋅ 𝐱 = 0 for all i = 1, …, m.
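This equivalence is easy to confirm numerically. A sketch in Python/NumPy, using a hypothetical matrix with a nontrivial kernel (arbitrary choice):

```python
import numpy as np

# A hypothetical matrix with a nontrivial kernel (row 2 = 2 * row 1).
A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0]])

# The vector x below is in Ker(A): A x = 0.
x = np.array([3.0, 0.0, -1.0])
assert np.allclose(A @ x, 0.0)

# Equivalently, x is perpendicular to every row of A.
for row in A:
    assert np.isclose(row @ x, 0.0)
```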

Co-kernel is the orthogonal complement of the kernel

The co-kernel of A is precisely the subspace of (transposes of) vectors in ℝⁿ which are perpendicular to the kernel of A.

Derivation

We know that every 𝐱Ker(A) is perpendicular to every (transposed) row vector of A. This implies that 𝐱 is perpendicular to every vector in the span of the (transposed) row vectors of A. In other words, every 𝐱Ker(A) is perpendicular to everything in the co-kernel of A.

It remains only to show that if a vector is perpendicular to every 𝐱 ∈ Ker(A), then the vector must be in CoKer(A), which is to say it must be a linear combination of (transposed) row vectors of A.

Suppose that 𝐯 is not a linear combination of the (transposed) row vectors of A. Then (imitating the Gram-Schmidt process) write 𝐯 = 𝐯∥ + 𝐯⊥, where 𝐯∥ = proj_CoKer(A)(𝐯) and 𝐯⊥ = 𝐯 − 𝐯∥. Then 𝐯⊥ is perpendicular to all rows of A and thus it belongs to Ker(A). Furthermore, we know that 𝐯∥ ⋅ 𝐯⊥ = 0. Thus, 𝐯 ⋅ 𝐯⊥ = 𝐯⊥ ⋅ 𝐯⊥ ≠ 0. (We know 𝐯⊥ ≠ 𝟎 because 𝐯 is not in CoKer(A).) In conclusion, our vector 𝐯 is not perpendicular to the specific vector 𝐯⊥ ∈ Ker(A).

Reversing the logic (i.e. taking the contrapositive), we have shown that if a vector is perpendicular to every 𝐱Ker(A), then it must lie in CoKer(A).

Co-kernel is image of transpose

Observe that Im(A𝖳) is the same as (the transpose of) CoKer(A):

The image of A𝖳 is another name for the column space of A𝖳, and columns of A𝖳 are the same data as rows of A, just transposed.

Rank The rank of a matrix A is defined to be the dimension of the image of A. This number is the same as the number of independent columns of A. There are two fundamental theorems about the rank of a matrix which relate this number to the dimensions of the other natural subspaces derived from A.

Rank-Rank Theorem

The rank of a matrix A equals the rank of its transpose A𝖳:

rank A = dim Im(A) = dim Im(A𝖳) = rank A𝖳.

Observe that Im(A𝖳) is the (transpose of the) row space of A. So the theorem says that the row space and the column space of A always have the same number of independent vectors.

You can remember the reason for this theorem with the mnemonic:

Pivots give the independent columns and the independent rows.

Rank-Nullity Theorem

For any matrix A: ℝⁿ → ℝᵐ, we have:

dim Im(A) + dim Ker(A) = n.

To remember this theorem, observe that ‘nullity’ refers to the dimension of the null space. To remember the reason for this theorem, we have another mnemonic:

Each of the n columns either contains a pivot or corresponds to a free variable.
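Both theorems can be spot-checked numerically. A sketch in Python/NumPy with an arbitrary 3×5 matrix whose third row is the sum of the first two, so its rank is 2:

```python
import numpy as np

# An arbitrary 3x5 matrix for illustration; row 3 = row 1 + row 2, so rank is 2.
A = np.array([[1.0, 2.0, 0.0, 1.0, 3.0],
              [0.0, 1.0, 1.0, 0.0, 1.0],
              [1.0, 3.0, 1.0, 1.0, 4.0]])

r = np.linalg.matrix_rank(A)
assert r == 2

# Rank-Rank: rank A = rank A^T.
assert np.linalg.matrix_rank(A.T) == r

# Rank-Nullity: dim Im(A) + dim Ker(A) = n (number of columns).
# The kernel dimension is the number of (numerically) zero singular values
# plus the column excess, read off from the SVD.
_, s, _ = np.linalg.svd(A)
nullity = A.shape[1] - np.sum(s > 1e-10)
assert r + nullity == A.shape[1]
```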

The mnemonics for these theorems point the way towards their proofs.

  • For matrices in RREF, both theorems are obvious from the mnemonics. (Check this!)
  • For matrices not in RREF, the key facts are:
    • (i) that row reduction is performed by left-multiplying A by an invertible matrix Q representing a composite sequence of invertible row operations (row-adds, row-scales, row-swaps), and
    • (ii) that the dimension of the image of QA is equal to the dimension of the image of A whenever Q is an invertible matrix.
    • (iii) that the row space of a matrix A is equal to the row space of the matrix QA when Q is a row reduction matrix, as in (ii).

Invertible matrices preserve subspaces

The second fact (ii) is actually much more general: multiplying by an invertible matrix preserves the dimensions of all subspaces.

This fact is not hard to prove, because invertible matrices send independent / dependent vectors to independent / dependent vectors, and thus bases to bases, and thus dimensions to dimensions.

In order to make use of these theorems, you should simply compute the RREF of a matrix and count the numbers of pivots and free variables.
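The RREF computation itself can be delegated to a computer algebra system. A sketch using SymPy's `Matrix.rref`, which returns the reduced form together with the pivot column indices (the matrix here is an arbitrary choice for illustration):

```python
import sympy as sp

# An arbitrary matrix for illustration.
A = sp.Matrix([[1, 2, 0, 1],
               [2, 4, 1, 3],
               [3, 6, 1, 4]])

R, pivots = A.rref()  # reduced row echelon form and the pivot column indices

rank = len(pivots)          # number of pivot columns
free = A.cols - rank        # remaining columns correspond to free variables

# Rank-Rank: the transpose has the same number of pivots.
assert len(A.T.rref()[1]) == rank

# Rank-Nullity: pivot columns + free columns = total columns.
assert rank + free == A.cols
```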

Problems due 3 Apr 2024 by 12:00pm

Problem 11-01

Linearity

Suppose that 𝐮1,,𝐮kn are vectors and that A represents some linear transformation acting on n.

  • (a) Suppose we know that 𝐮1,,𝐮k are independent. Do we automatically know that A𝐮1,,A𝐮k are independent? (Justify your answer. If false, a counterexample suffices; if true, an argument is required.)
  • (b) Suppose we know that 𝐮1,,𝐮k are dependent. Do we automatically know that A𝐮1,,A𝐮k are dependent? (Justify your answer. If false, a counterexample suffices; if true, an argument is required.)
  • (c) If ℓ(t) parametrizes a line passing through the origin, do we automatically know that Aℓ also parametrizes a line passing through the origin? What if we know that A is invertible?
  • (d) If ℓ(t) parametrizes a line not passing through the origin, do we automatically know that Aℓ also parametrizes a line not passing through the origin? What if we know that A is invertible?
Problem 11-02

Changing bases

  • (a) Suppose that ℬ = {𝐛1, 𝐛2, 𝐛3} and 𝒞 = {𝐜1, 𝐜2, 𝐜3} are bases of V. Suppose that A = ([𝐜1]ℬ [𝐜2]ℬ [𝐜3]ℬ) is the matrix with column vectors equal to 𝐜1, 𝐜2, 𝐜3 when written in the basis ℬ. Which of the following is satisfied by A? (i) A[𝐱]ℬ = [𝐱]𝒞 or (ii) A[𝐱]𝒞 = [𝐱]ℬ. (You must justify your answer.)
  • (b) Find the change of coordinates transfer matrices 𝒞Tℬ and ℬT𝒞 between the following two bases of ℝ²: ℬ = {𝐛1 = (1, 8), 𝐛2 = (1, 5)}, 𝒞 = {𝐜1 = (1, 4), 𝐜2 = (1, 1)}.
  • (c) Assume that ℬ = {𝐛1, 𝐛2, 𝐛3} and 𝒞 = {𝐜1, 𝐜2, 𝐜3} are bases of V, and that

𝐛1 = 2𝐜1 − 𝐜2 + 𝐜3
𝐛2 = 3𝐜2 + 𝐜3
𝐛3 = −3𝐜1 + 2𝐜3.

Find the change of basis transfer matrix 𝒞Tℬ, and also compute [𝐱]𝒞 for the vector 𝐱 = 𝐛1 − 2𝐛2 + 2𝐛3.
Problem 11-03

Rank-Rank and Rank-Nullity counting practice

Answer the following questions using the Rank-Rank and Rank-Nullity theorems. It may help to work with imaginary matrices in RREF and think about the pivots.

  • (a) Suppose A is 3×8 with rank A = 2. What are dim Ker(A) and dim CoKer(A)?
  • (b) Suppose A is 7×3 with rank A = 3. What are dim Ker(A) and dim CoKer(A)?
  • (c) Suppose A is 9×5 with dim Ker(A) = 2. What are rank A and dim CoKer(A)?
Problem 11-04

Bases and dimensions from row reduction

Row reduce the following matrix A in order to find a basis for the row space Im(A𝖳), another basis for the column space Im(A), and a third basis for the kernel Ker(A). By counting basis elements to determine dimensions, verify the two rank theorems from this packet.

A=(258017135153111971171353)

(Hint: for Im(A), you should determine which columns of QA have pivots; the same columns of A will then be independent vectors! This happens because row reduction Q preserves independence / dependence relations among the column vectors of A.)