Basic vector algebra

Vectors in โ„n have n real number components:

๐ฏ=(v1,v2,โ€ฆ,vn).

Such vectors are added componentwise, and scalars multiply every component simultaneously. All the abstract operations and properties of vectors apply to vectors in $\mathbb{R}^n$:

  • Operations: addition and scalar multiplication,
  • Properties: commutativity, associativity, distributivity, zero vector.

There are $n$ standard basis vectors:

$$\mathbf{e}_i = (0, \ldots, 0, \underbrace{1}_{i\text{th}}, 0, \ldots, 0).$$

Every vector decomposes uniquely as a weighted sum in the standard basis:

$$\mathbf{v} = v_1\mathbf{e}_1 + v_2\mathbf{e}_2 + \cdots + v_n\mathbf{e}_n = \sum_{i=1}^n v_i\mathbf{e}_i.$$

Pairs of vectors in $n$D also have dot products, defined by summing the products of corresponding components:

$$\mathbf{u} \cdot \mathbf{v} = (u_1, \ldots, u_n) \cdot (v_1, \ldots, v_n) = u_1 v_1 + \cdots + u_n v_n = \sum_{i=1}^n u_i v_i.$$

The norm of an $n$D vector is still $|\mathbf{u}| = \sqrt{\mathbf{u} \cdot \mathbf{u}}$.

The dot product still has the meaning of "relative alignment between vectors," and can still be used to determine the angle between vectors using the cosine formula, $\mathbf{u} \cdot \mathbf{v} = |\mathbf{u}||\mathbf{v}|\cos\theta$. However, this angle is considerably less important in $n$D.
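As a quick computational check of these formulas, here is a minimal sketch in plain Python (the names `dot`, `norm`, and `angle` are illustrative, not from any particular library):

```python
import math

def dot(u, v):
    # u . v = sum of componentwise products
    return sum(ui * vi for ui, vi in zip(u, v))

def norm(u):
    # |u| = sqrt(u . u)
    return math.sqrt(dot(u, u))

def angle(u, v):
    # Recover theta from u . v = |u| |v| cos(theta)
    return math.acos(dot(u, v) / (norm(u) * norm(v)))

u = (1.0, 0.0, 0.0)
v = (1.0, 1.0, 0.0)
print(dot(u, v))    # 1.0
print(angle(u, v))  # pi/4: u and v are 45 degrees apart
```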

Spans

Linear combinations of vectors in $n$D:

$$\mathbf{v} = a_1\mathbf{u}_1 + a_2\mathbf{u}_2 + \cdots + a_n\mathbf{u}_n, \qquad a_i \in \mathbb{R}.$$

A span is the collection of all vectors which can be obtained as linear combinations of certain given vectors. For example, the collection of all possible $\mathbf{v}$ that can be written as above, for any choice of the $a_i$, in terms of the $\mathbf{u}_i$ given at the outset, is denoted by either of:

$$\operatorname{span}\{\mathbf{u}_1, \ldots, \mathbf{u}_n\}, \qquad \langle \mathbf{u}_1, \ldots, \mathbf{u}_n \rangle.$$

It is still an important fact that a span passes through the origin $\mathbf{0} = (0, 0, \ldots, 0)$. This is because linear combinations do not include a constant term, so the point $\mathbf{0}$ can always be achieved by setting $a_i = 0$ for all $i$.

Matrices: linear actions on vectors

Matrices represent linear transformations ('actions') on vectors. An action $A : \mathbb{R}^3 \to \mathbb{R}^3$, for example, is linear if:

  • $A(\mathbf{u} + \mathbf{v}) = A(\mathbf{u}) + A(\mathbf{v})$
  • $A(\lambda\mathbf{u}) = \lambda A(\mathbf{u})$

Recall vector decomposition, $\mathbf{x} = x_1\mathbf{e}_1 + x_2\mathbf{e}_2 + x_3\mathbf{e}_3$. This gives $\mathbf{x}$ in terms of weights in the standard basis. Applying a linear transformation to this weight decomposition, we have:

$$A(\mathbf{x}) = x_1 A(\mathbf{e}_1) + x_2 A(\mathbf{e}_2) + x_3 A(\mathbf{e}_3).$$

Now let us write $\mathbf{a}_1 = A(\mathbf{e}_1)$, $\mathbf{a}_2 = A(\mathbf{e}_2)$, and $\mathbf{a}_3 = A(\mathbf{e}_3)$. Therefore $A(\mathbf{x}) = x_1\mathbf{a}_1 + x_2\mathbf{a}_2 + x_3\mathbf{a}_3$. Then build a matrix for $A$ by putting these 'column' vectors $\mathbf{a}_i$ into a row of columns:

$$A = (\mathbf{a}_1\ \mathbf{a}_2\ \mathbf{a}_3) = \begin{pmatrix} (\mathbf{a}_1)_1 & (\mathbf{a}_2)_1 & (\mathbf{a}_3)_1 \\ (\mathbf{a}_1)_2 & (\mathbf{a}_2)_2 & (\mathbf{a}_3)_2 \\ (\mathbf{a}_1)_3 & (\mathbf{a}_2)_3 & (\mathbf{a}_3)_3 \end{pmatrix} = \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix}.$$

(The last matrix defines the individual entries of $A$ with subscript $ij$, where $i$ is the row and $j$ is the column. Mnemonic: "downwards index goes first because it's heavier.") In general, the entry $a_{ij}$ is the $i$th entry of the $j$th column vector of $A$.

This matrix $A$ acts upon the vector $\mathbf{x}$ by letting its $\mathbf{e}_i$ weights $x_i$ become the $\mathbf{a}_i$ weights for the new vector:

$$A\mathbf{x} = (\mathbf{a}_1\ \mathbf{a}_2\ \mathbf{a}_3)\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = x_1\mathbf{a}_1 + x_2\mathbf{a}_2 + x_3\mathbf{a}_3 = \begin{pmatrix} a_{11}x_1 + a_{12}x_2 + a_{13}x_3 \\ a_{21}x_1 + a_{22}x_2 + a_{23}x_3 \\ a_{31}x_1 + a_{32}x_2 + a_{33}x_3 \end{pmatrix}.$$

Determinant

The determinant of a $2 \times 2$ matrix is given by the formula:

$$A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}, \qquad \det A = \begin{vmatrix} a & b \\ c & d \end{vmatrix} = ad - bc.$$

The determinant of a $3 \times 3$ matrix is given by the formula:

$$A = \begin{pmatrix} a & b & c \\ d & e & f \\ g & h & i \end{pmatrix}, \qquad \det A = \begin{vmatrix} a & b & c \\ d & e & f \\ g & h & i \end{vmatrix} = a\begin{vmatrix} e & f \\ h & i \end{vmatrix} - b\begin{vmatrix} d & f \\ g & i \end{vmatrix} + c\begin{vmatrix} d & e \\ g & h \end{vmatrix}.$$

Higher determinants

You may notice a pattern here that holds true in general and allows calculation of $n \times n$ determinants: traverse any row or column, using its entries as weights (with alternating signs) in a weighted sum of the smaller determinants of the minor matrices (obtained by deleting the row and column in which the given entry lives). This process reduces an $n \times n$ determinant to a linear combination of $(n-1) \times (n-1)$ determinants, and the reduction can be iterated until the $2 \times 2$ case is reached.
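The iterated reduction is naturally recursive. A minimal sketch in plain Python, expanding along the first row (the name `det` is illustrative; the recursion bottoms out at the $1 \times 1$ case, which is consistent with the $2 \times 2$ formula $ad - bc$):

```python
def det(M):
    # Cofactor expansion along the first row:
    # det M = sum_j (-1)^j * M[0][j] * det(minor_0j)
    n = len(M)
    if n == 1:
        return M[0][0]
    total = 0.0
    for j in range(n):
        # Minor: delete row 0 and column j
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * M[0][j] * det(minor)
    return total

print(det([[1, 2], [3, 4]]))                   # 1*4 - 2*3 = -2.0
print(det([[2, 0, 0], [0, 3, 0], [0, 0, 4]]))  # 2*3*4 = 24.0
```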

Independence

A collection of vectors $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_k\}$ is called independent when the only solution to the equation

$$x_1\mathbf{v}_1 + x_2\mathbf{v}_2 + \cdots + x_k\mathbf{v}_k = \mathbf{0}$$

is given by setting $x_i = 0$ for every $i$.

A dependency relation is any equation giving one of the vectors in terms of the others, for example $\mathbf{v}_2 = 3\mathbf{v}_1 - 4\mathbf{v}_3$.

A collection of vectors is independent if and only if no vector is in the span of the others, which is if and only if none of them can be written as a linear combination of the others, i.e. there are no dependency relations.

The independence or potential dependency relations among a set of vectors are best studied by putting the vectors into a matrix:

$$A = (\mathbf{v}_1\ \mathbf{v}_2\ \cdots\ \mathbf{v}_k).$$

The vectors are independent if and only if this matrix has rank k. That means: row reduction converts it to a matrix with k pivots, one per column.

When $k = n$, meaning the number of vectors is the same as the number of components in each vector, $A$ is made of independent vectors if and only if it is invertible, which is if and only if it has nonvanishing determinant, i.e. $\det A \neq 0$.
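Both criteria (rank $k$, and $\det A \neq 0$ when $k = n$) are easy to check numerically. A small sketch, assuming NumPy is available (the example vectors are made up for illustration):

```python
import numpy as np

# Columns are the vectors v1, v2, v3; independent iff rank == k (here k = 3).
A = np.column_stack([[1, 0, 0], [1, 1, 0], [1, 1, 1]])
print(np.linalg.matrix_rank(A))  # 3 -> independent
print(np.linalg.det(A))          # nonzero, consistent with k = n

# Here v2 = 2*v1 is a dependency relation, so the rank drops below k.
B = np.column_stack([[1, 0, 0], [2, 0, 0], [0, 1, 0]])
print(np.linalg.matrix_rank(B))  # 2 -> dependent
```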

Matrix algebra: composition as multiplication

Matrices can be multiplied by each other. Express $A = (\mathbf{a}_1\ \mathbf{a}_2\ \mathbf{a}_3)$ as a row of column vectors; then $BA$ is the matrix which is the row of column vectors $(B\mathbf{a}_1\ B\mathbf{a}_2\ B\mathbf{a}_3)$. In other words, $B$ simply acts upon each column of $A$ in parallel.

This rule means that $BA$ acts on a vector $\mathbf{x}$ by composition of the actions of $A$ first and then of $B$ second:

$$(BA)\mathbf{x} = x_1 B\mathbf{a}_1 + x_2 B\mathbf{a}_2 + x_3 B\mathbf{a}_3 = B(x_1\mathbf{a}_1) + B(x_2\mathbf{a}_2) + B(x_3\mathbf{a}_3) = B(x_1\mathbf{a}_1 + x_2\mathbf{a}_2 + x_3\mathbf{a}_3) = B(A(\mathbf{x})).$$

The identity matrix $I_3 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$ acts by doing nothing, and therefore it does nothing in matrix multiplication: $I_3 A = A I_3 = A$. (For higher $n$ we have $I_n A = A I_n = A$.)

The inverse matrix, written $A^{-1}$, is a matrix which does the opposite of the action of $A$, undoing its effect. So $A A^{-1}\mathbf{x} = A^{-1} A \mathbf{x} = \mathbf{x}$ for every $\mathbf{x}$. In terms of matrix multiplication, $A A^{-1} = A^{-1} A = I_n$.

The inverse does not always exist, since some actions cannot be undone! For example the matrix $\begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}$ sends $\begin{pmatrix} x_1 \\ x_2 \end{pmatrix}$ to $\begin{pmatrix} x_1 \\ 0 \end{pmatrix}$, deleting the data of $x_2$. The data cannot be recovered, so this matrix has no inverse.
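These multiplication facts can be spot-checked numerically. A small sketch, assuming NumPy is available (the matrices are made up for illustration):

```python
import numpy as np

A = np.array([[1.0, 2.0], [3.0, 4.0]])
B = np.array([[0.0, 1.0], [1.0, 0.0]])  # swaps the two coordinates

# BA column-by-column: B acts on each column of A in parallel.
BA_cols = np.column_stack([B @ A[:, j] for j in range(2)])
assert np.allclose(BA_cols, B @ A)

# (BA)x = B(Ax): composition, A first, then B.
x = np.array([1.0, -1.0])
assert np.allclose((B @ A) @ x, B @ (A @ x))

# The projection that deletes x2 has vanishing determinant: no inverse exists.
P = np.array([[1.0, 0.0], [0.0, 0.0]])
print(np.linalg.det(P))  # essentially 0
```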

Eigenvectors

The action of a diagonal matrix is to scale each row by the diagonal entry in that row:

$$\begin{pmatrix} a_1 & 0 & 0 \\ 0 & a_2 & 0 \\ 0 & 0 & a_3 \end{pmatrix}\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} a_1 x_1 \\ a_2 x_2 \\ a_3 x_3 \end{pmatrix}.$$

This action is very easy to express algebraically in terms of standard basis vectors:

$$A\mathbf{e}_1 = a_1\mathbf{e}_1, \qquad A\mathbf{e}_2 = a_2\mathbf{e}_2, \qquad A\mathbf{e}_3 = a_3\mathbf{e}_3.$$

In other words, standard basis vectors are sent to scalar multiples of themselves.

Whenever a vector is mapped to a multiple of itself, even when $A$ is not diagonal, the vector is called an eigenvector ("self-vector"), and the scalar is called an eigenvalue ("self-value"):

$$A\mathbf{x} = \lambda\mathbf{x}.$$

Meaning of eigenvectors

Eigenvectors are "special" or "natural" vectors associated to the matrix $A$, since they reveal a critical proportionality between the vector entries in $\mathbf{x}$ such that $A$ does not disturb this proportionality. If an eigenvector is interpreted as a direction, then the action of $A$ simply rescales vectors in that direction. As an example, every rotation in 3D can be written as a matrix, and an eigenvector of this matrix would point along the axis of rotation, since rotation does not change the direction of axis vectors. The eigenvalue in this case would be $\lambda = 1$.

The eigenvalue equation above is equivalent to the equation $(A - \lambda I_n)\mathbf{x} = \mathbf{0}$. Frequently the shorthand notation $A_\lambda = A - \lambda I_n$ is used, so the last equation becomes $A_\lambda\mathbf{x} = \mathbf{0}$.

In linear algebra one develops further machinery to analyze equations $A\mathbf{x} = \mathbf{b}$ for any matrix $A$ and vector $\mathbf{b}$. Setting $A = A_\lambda$ and $\mathbf{b} = \mathbf{0}$, the equation $A\mathbf{x} = \mathbf{b}$ becomes $A_\lambda\mathbf{x} = \mathbf{0}$, so the machinery of linear algebra would be applicable here. Nonetheless, we will not learn that machinery for this ODE course. Our goal is only to see how matrix algebra techniques are applicable to solving linear systems.

Finding eigenvalues and eigenvectors

The characteristic polynomial $\operatorname{ch}_A(\lambda)$ of a matrix $A$ is defined as the determinant $\det A_\lambda$, in which $\lambda$ is treated as an indeterminate variable.

For example: if $A = \begin{pmatrix} 1 & -2 \\ -4 & -1 \end{pmatrix}$ then $A_\lambda = \begin{pmatrix} 1-\lambda & -2 \\ -4 & -1-\lambda \end{pmatrix}$ and (using the determinant formula) we have:

$$\operatorname{ch}_A(\lambda) = \begin{vmatrix} 1-\lambda & -2 \\ -4 & -1-\lambda \end{vmatrix} = (1-\lambda)(-1-\lambda) - 8 = \lambda^2 - 9.$$

Key fact for finding eigenvalues

The eigenvalues of a matrix $A$ are the roots of its characteristic polynomial $\operatorname{ch}_A(\lambda)$.

In the previous example, therefore, $\lambda = \pm 3$ are the eigenvalues.

Once the eigenvalues are found by solving $\operatorname{ch}_A(\lambda) = 0$ for $\lambda$, the solutions can be plugged back in for $\lambda$ in the equation $A_\lambda\mathbf{x} = \mathbf{0}$, and then we can try to find the eigenvectors $\mathbf{x}$ by solving this equation. That is normally done using the linear algebra technique of "row reduction," but in this course you will only solve such equations in the $2 \times 2$ case, when it is very easy: one of the rows of $A_\lambda$ will be a multiple of the other whenever $\lambda$ is an eigenvalue, so the equation $A_\lambda\mathbf{x} = \mathbf{0}$ reduces to a single relation of proportionality between $x_1$ and $x_2$.

Notice that if ๐ฑ is an eigenvector, then ฮฑ๐ฑ is also an eigenvector for any scalar ฮฑโ‰ 0. So really we are studying eigenlines, namely the spans of eigenvectors. The relation of proportionality between x1 and x2 gives the direction of this line.

For $\lambda = +3$ in the previous example, solving $A_\lambda\mathbf{x} = \mathbf{0}$ would mean solving $\begin{pmatrix} -2 & -2 \\ -4 & -4 \end{pmatrix}\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}$. Therefore $-x_1 - x_2 = 0$ and thus $x_1 = -x_2$, so an eigenvector is given by $\begin{pmatrix} 1 \\ -1 \end{pmatrix}$ and a line of eigenvectors is given by the span $\alpha\begin{pmatrix} 1 \\ -1 \end{pmatrix} = \begin{pmatrix} \alpha \\ -\alpha \end{pmatrix}$ for all $\alpha$. For $\lambda = -3$ we find $\begin{pmatrix} 4 & -2 \\ -4 & 2 \end{pmatrix}\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}$, and solving this we find $2x_1 = x_2$; therefore $\begin{pmatrix} 1 \\ 2 \end{pmatrix}$ is an eigenvector, as is everything in the span $\begin{pmatrix} \alpha \\ 2\alpha \end{pmatrix}$.
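The hand computation above can be verified numerically. A small sketch, assuming NumPy is available:

```python
import numpy as np

A = np.array([[1.0, -2.0], [-4.0, -1.0]])

# ch_A(lambda) = lambda^2 - 9, so the eigenvalues should be +-3.
assert np.allclose(sorted(np.linalg.eigvals(A)), [-3.0, 3.0])

# Check the eigenvectors found by hand:
# (1, -1) for lambda = +3, and (1, 2) for lambda = -3.
v_plus = np.array([1.0, -1.0])
v_minus = np.array([1.0, 2.0])
assert np.allclose(A @ v_plus, 3 * v_plus)
assert np.allclose(A @ v_minus, -3 * v_minus)
```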

Matrix calculus

An important technique for differential equations that is frequently not covered in linear algebra courses is that of differentiating matrices and evaluating matrix power series such as $e^A$. This is important for systems of differential equations because the general linear homogeneous constant-coefficient system:

$$\mathbf{x}'(t) = A\mathbf{x}(t), \qquad \mathbf{x}(0) = \mathbf{x}_0$$

has for its solution the powerful formula:

$$\mathbf{x}(t) = e^{At}\mathbf{x}_0.$$

We unpack this formula and some related material now.

A time-varying matrix or vector may be given as an array of functions of time:

$$\mathbf{x}(t) = \begin{pmatrix} x_1(t) \\ x_2(t) \\ x_3(t) \end{pmatrix}, \qquad A(t) = \begin{pmatrix} a_{11}(t) & a_{12}(t) & a_{13}(t) \\ a_{21}(t) & a_{22}(t) & a_{23}(t) \\ a_{31}(t) & a_{32}(t) & a_{33}(t) \end{pmatrix}.$$

Since matrices are added / subtracted / scaled by adding / subtracting / scaling their entries in parallel, it is easy to define derivatives (and integrals, for that matter) entrywise:

$$\mathbf{x}'(t) = \begin{pmatrix} x_1'(t) \\ x_2'(t) \\ x_3'(t) \end{pmatrix}, \qquad A'(t) = \begin{pmatrix} a_{11}'(t) & a_{12}'(t) & a_{13}'(t) \\ a_{21}'(t) & a_{22}'(t) & a_{23}'(t) \\ a_{31}'(t) & a_{32}'(t) & a_{33}'(t) \end{pmatrix}.$$

A 'matrix product' rule is a little less trivial, but also true:

$$\frac{d}{dt}\bigl(A(t)B(t)\bigr) = A'(t)B(t) + A(t)B'(t).$$

This rule follows from the general formula (using a summation) for each single entry in the matrix product.

Now it is possible to apply any power series function to a matrix. This is because matrix multiplication (for the powers), along with scaling and summing, are enough to evaluate the series. For example, recall that

$$e^x = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \frac{x^4}{4!} + \cdots.$$

It is easy enough to plug a matrix $A$ into this formula, so we use the formula to define $e^A$:

$$e^A = I_n + A + \frac{1}{2!}A^2 + \frac{1}{3!}A^3 + \frac{1}{4!}A^4 + \cdots.$$

(Does such a thing converge? It is possible to show that it does, for any $A$, but we cannot do so here.)

More generally, if $f(x) = \sum_{n=0}^{\infty} a_n x^n$ is any power series function, then we can define the evaluation $f(A)$ wherever the series converges:

$$f(A) = \sum_{n=0}^{\infty} a_n A^n.$$

Now, notice a significant observation about diagonal matrices: multiplication of diagonal matrices is just multiplication of the respective diagonal entries:

$$\begin{pmatrix} a_1 & 0 & 0 \\ 0 & a_2 & 0 \\ 0 & 0 & a_3 \end{pmatrix}\begin{pmatrix} b_1 & 0 & 0 \\ 0 & b_2 & 0 \\ 0 & 0 & b_3 \end{pmatrix} = \begin{pmatrix} a_1 b_1 & 0 & 0 \\ 0 & a_2 b_2 & 0 \\ 0 & 0 & a_3 b_3 \end{pmatrix}.$$

Since the powers in a power series are defined using matrix multiplication, this tells us that powers are easy for diagonal matrices:

$$\begin{pmatrix} a_1 & 0 & 0 \\ 0 & a_2 & 0 \\ 0 & 0 & a_3 \end{pmatrix}^n = \begin{pmatrix} a_1^n & 0 & 0 \\ 0 & a_2^n & 0 \\ 0 & 0 & a_3^n \end{pmatrix}.$$

This, in turn, implies that power series functions applied to diagonal matrices are simply given by applying the function to the diagonal entries:

$$f(A) = f\begin{pmatrix} a_1 & 0 & 0 \\ 0 & a_2 & 0 \\ 0 & 0 & a_3 \end{pmatrix} = \begin{pmatrix} f(a_1) & 0 & 0 \\ 0 & f(a_2) & 0 \\ 0 & 0 & f(a_3) \end{pmatrix}$$

so in particular:

$$e^A = \exp A = \exp\begin{pmatrix} a_1 & 0 & 0 \\ 0 & a_2 & 0 \\ 0 & 0 & a_3 \end{pmatrix} = \begin{pmatrix} e^{a_1} & 0 & 0 \\ 0 & e^{a_2} & 0 \\ 0 & 0 & e^{a_3} \end{pmatrix}.$$

(Sometimes โ€œexpโกxโ€ or โ€œexpโกAโ€ is used instead of ex or eA when the notation is more convenient.)

Non-diagonal matrix powers

If $A$ is non-diagonal, then computing $e^A$ can be extremely challenging! It is necessary to develop an efficient way to find higher matrix powers. There are some tools for this in linear algebra: basically you change coordinates so that the matrix becomes diagonal or almost diagonal, take its powers in the new coordinates, and then change back to the original coordinates.

The matrix exponential has some nice properties that follow from the power series formula:

  • $e^{0 I_n} = I_n$
  • $e^{A+B} = e^A e^B$ (when $AB = BA$, but not necessarily when $AB \neq BA$!)
  • $e^{At+As} = e^{At}e^{As}$ (always)
  • $(e^A)^{-1} = e^{-A}$
  • $e^{r I_n} = e^r I_n$

Finally, consider differentiation. Using the power series definition for $e^{At}$, we discover that:

$$\frac{d}{dt}\bigl(e^{At}\bigr) = \frac{d}{dt}\Bigl(I + At + \frac{A^2 t^2}{2!} + \frac{A^3 t^3}{3!} + \cdots\Bigr) = A + A^2 t + \frac{A^3 t^2}{2!} + \frac{A^4 t^3}{3!} + \cdots = A\Bigl(I + At + \frac{A^2 t^2}{2!} + \frac{A^3 t^3}{3!} + \cdots\Bigr) = A e^{At}.$$

Therefore:

Matrix exponential solves ODE

The function $X(t) = e^{At}$ solves the matrix ODE:

$$X'(t) = A X(t).$$

Given the way matrix multiplication works, this formula means (with $3 \times 3$ matrices) that we have 3 distinct vector solutions to the ODE, namely:

$$(\mathbf{x}_1'(t)\ \mathbf{x}_2'(t)\ \mathbf{x}_3'(t)) = (A\mathbf{x}_1(t)\ A\mathbf{x}_2(t)\ A\mathbf{x}_3(t)).$$

Equating the corresponding columns, this means:

$$\mathbf{x}_1'(t) = A\mathbf{x}_1(t), \qquad \mathbf{x}_2'(t) = A\mathbf{x}_2(t), \qquad \mathbf{x}_3'(t) = A\mathbf{x}_3(t).$$

Notice that at each given time $t$, the vectors $\mathbf{x}_1, \mathbf{x}_2, \mathbf{x}_3$ are linearly independent! This is because they are the column vectors of the matrix $X = e^{At}$, which is invertible because its inverse is always given by the matrix $e^{-At}$.

In conclusion, by using matrix exponentials we have a remarkably efficient way to generate and describe a set of $n$ independent solutions to the ODE $\mathbf{x}'(t) = A\mathbf{x}(t)$.
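Both claims, that $X(t) = e^{At}$ satisfies $X' = AX$ and that its columns stay independent, can be checked numerically. A minimal sketch, assuming NumPy is available; the series-based `expm` helper is illustrative (a library routine such as `scipy.linalg.expm` would be used in practice), and the derivative is approximated by a central finite difference:

```python
import numpy as np

def expm(M, terms=40):
    # Truncated power series for e^M (illustrative only).
    result, power = np.eye(M.shape[0]), np.eye(M.shape[0])
    for k in range(1, terms):
        power = power @ M / k
        result = result + power
    return result

A = np.array([[1.0, -2.0], [-4.0, -1.0]])
t, h = 0.5, 1e-5

X = expm(A * t)                                          # X(t) = e^{At}
dX = (expm(A * (t + h)) - expm(A * (t - h))) / (2 * h)   # numerical X'(t)

# X'(t) = A X(t): each column of X is a vector solution of x' = Ax.
assert np.allclose(dX, A @ X, atol=1e-4)

# The columns stay independent: X(t) is invertible with inverse e^{-At}.
assert np.allclose(X @ expm(-A * t), np.eye(2), atol=1e-9)
```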