Basic vector algebra

Vectors in $\mathbb{R}^n$ have $n$ real number components:
$$\vec{v} = (v_1, v_2, \ldots, v_n).$$

Such vectors are added componentwise, and scalars multiply every component simultaneously. All the abstract operations and properties of vectors apply to vectors in $\mathbb{R}^n$:

  • Operations: addition and scalar multiplication,
  • Properties: commutativity, associativity, distributivity, zero vector.

There are $n$ standard basis vectors:
$$\vec{e}_1 = (1, 0, \ldots, 0), \quad \vec{e}_2 = (0, 1, 0, \ldots, 0), \quad \ldots, \quad \vec{e}_n = (0, \ldots, 0, 1).$$

Unique decomposition of vectors (weighted sum) using the standard basis:
$$\vec{v} = v_1\vec{e}_1 + v_2\vec{e}_2 + \cdots + v_n\vec{e}_n.$$

Pairs of vectors in $\mathbb{R}^n$ also have dot products, defined by summing component products:
$$\vec{v} \cdot \vec{w} = v_1 w_1 + v_2 w_2 + \cdots + v_n w_n.$$

The norm of an $n$D vector is still $\|\vec{v}\| = \sqrt{\vec{v} \cdot \vec{v}} = \sqrt{v_1^2 + \cdots + v_n^2}$.

The dot product still has the meaning of “relative alignment between vectors,” and can still be used to determine the angle between vectors using the cosine formula, $\vec{v} \cdot \vec{w} = \|\vec{v}\|\,\|\vec{w}\|\cos\theta$. However, this angle is considerably less important in $n$D.
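As a quick numerical illustration (a minimal sketch assuming NumPy, with an arbitrary pair of 3-component vectors chosen here), the dot product, norm, and cosine formula can be computed directly:

```python
import numpy as np

# Two arbitrary vectors in R^3 (chosen only for illustration).
v = np.array([1.0, 2.0, 2.0])
w = np.array([3.0, 0.0, 4.0])

dot = np.dot(v, w)            # sum of componentwise products
norm_v = np.linalg.norm(v)    # sqrt(v . v)
norm_w = np.linalg.norm(w)

# Angle from the cosine formula:  v . w = |v| |w| cos(theta)
theta = np.arccos(dot / (norm_v * norm_w))

print(dot, norm_v, norm_w, theta)
```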

Spans

Linear combinations of vectors $\vec{v}_1, \ldots, \vec{v}_k$ in $\mathbb{R}^n$:
$$\vec{w} = c_1\vec{v}_1 + c_2\vec{v}_2 + \cdots + c_k\vec{v}_k.$$

A span is the collection of all vectors which can be obtained as linear combinations of certain given vectors. For example, the collection of all possible $\vec{w}$ that can be written as above, for any weights $c_1, \ldots, c_k$, in terms of the vectors $\vec{v}_1, \ldots, \vec{v}_k$ given at the outset, is called the span of those vectors, written $\operatorname{span}(\vec{v}_1, \ldots, \vec{v}_k)$.

It is still an important fact that a span passes through the origin $\vec{0}$. This is because linear combinations do not include a constant term, so the point $\vec{0}$ can always be achieved by setting $c_i = 0$ for all $i$.
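As a sketch of how span membership can be tested numerically (assuming NumPy, with sample vectors invented for illustration), one can solve for the weights by least squares and check that the combination reproduces the target vector:

```python
import numpy as np

# Given vectors (columns of V) and a target vector w -- all chosen for illustration.
v1 = np.array([1.0, 0.0, 1.0])
v2 = np.array([0.0, 1.0, 1.0])
V = np.column_stack([v1, v2])
w = np.array([2.0, 3.0, 5.0])   # w = 2*v1 + 3*v2, so it lies in the span

# Solve V c = w in the least-squares sense; exact reconstruction means w is in the span.
c, residual, rank, _ = np.linalg.lstsq(V, w, rcond=None)
print(c)                          # weights, approximately [2, 3]
print(np.allclose(V @ c, w))      # True: w is a linear combination of v1, v2
```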

Matrices: linear actions on vectors

Matrices represent linear transformations (‘actions’) on vectors. An action $\vec{v} \mapsto A(\vec{v})$, for example, is linear if:
$$A(c\vec{v} + d\vec{w}) = c\,A(\vec{v}) + d\,A(\vec{w}) \quad \text{for all vectors } \vec{v}, \vec{w} \text{ and scalars } c, d.$$

Recall the vector decomposition $\vec{v} = v_1\vec{e}_1 + \cdots + v_n\vec{e}_n$. This gives $\vec{v}$ in terms of weights in the standard basis. Applying a linear transformation $A$ to this weight decomposition, we have:
$$A(\vec{v}) = v_1 A(\vec{e}_1) + v_2 A(\vec{e}_2) + \cdots + v_n A(\vec{e}_n).$$

Now let us write $\vec{a}_1 = A(\vec{e}_1)$ and $\vec{a}_2 = A(\vec{e}_2)$ and so on, up to $\vec{a}_n = A(\vec{e}_n)$. Therefore $A(\vec{v}) = v_1\vec{a}_1 + v_2\vec{a}_2 + \cdots + v_n\vec{a}_n$. Then build a matrix for $A$ by putting these ‘column’ vectors into a row of columns:
$$A = \begin{pmatrix} | & | & & | \\ \vec{a}_1 & \vec{a}_2 & \cdots & \vec{a}_n \\ | & | & & | \end{pmatrix} = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{pmatrix}.$$

(The last matrix defines the individual entries of $A$ with subscripts $a_{ij}$, where $i$ is the row and $j$ is the column. Mnemonic: “downwards index goes first because it’s heavier.”) In general, the entry $a_{ij}$ is the $i$th row of the $j$th column vector $\vec{a}_j$ of $A$.

This matrix acts upon the vector $\vec{v}$ by letting its weights $v_1, \ldots, v_n$ become the weights for the new vector:
$$A\vec{v} = v_1\vec{a}_1 + v_2\vec{a}_2 + \cdots + v_n\vec{a}_n.$$
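The “weights become weights of the columns” description can be checked directly (a small NumPy sketch with an arbitrary $2 \times 2$ matrix chosen for illustration):

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])       # columns are a_1 = (1,3), a_2 = (2,4)
v = np.array([5.0, 6.0])

# Matrix-vector product ...
direct = A @ v

# ... equals the weighted sum of the columns, with weights v_1, v_2.
weighted = v[0] * A[:, 0] + v[1] * A[:, 1]

print(direct, weighted, np.allclose(direct, weighted))
```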

Determinant

The determinant of a $2 \times 2$ matrix is given by the formula:
$$\det\begin{pmatrix} a & b \\ c & d \end{pmatrix} = ad - bc.$$

The determinant of a $3 \times 3$ matrix is given by the formula:
$$\det\begin{pmatrix} a & b & c \\ d & e & f \\ g & h & i \end{pmatrix} = a(ei - fh) - b(di - fg) + c(dh - eg).$$

Higher determinants

You may notice a pattern here that holds true in general and allows calculation of $n \times n$ determinants: traverse any row or column, using these entries as weights (with alternating signs), writing a weighted sum of the smaller determinants of the minor matrices (which are given by deleting the row and column in which the given entry in our traversal process lives). This process therefore reduces an $n \times n$ determinant to a linear combination of $(n-1) \times (n-1)$ determinants, and the reduction can be iterated until the $2 \times 2$ case is reached.
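The reduction just described can be written as a short recursive routine (a sketch expanding along the first row; the sample matrix is arbitrary, and NumPy's built-in `np.linalg.det` is used only as a cross-check):

```python
import numpy as np

def det_cofactor(M):
    """Determinant by cofactor expansion along the first row."""
    n = M.shape[0]
    if n == 1:
        return M[0, 0]
    if n == 2:
        return M[0, 0] * M[1, 1] - M[0, 1] * M[1, 0]
    total = 0.0
    for j in range(n):
        # Minor: delete row 0 and column j, then recurse on the smaller determinant.
        minor = np.delete(np.delete(M, 0, axis=0), j, axis=1)
        total += (-1) ** j * M[0, j] * det_cofactor(minor)
    return total

M = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
print(det_cofactor(M), np.linalg.det(M))   # both ~ 8.0
```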

Independence

A collection of vectors $\vec{v}_1, \ldots, \vec{v}_k$ is called independent when the only solution to the equation
$$c_1\vec{v}_1 + c_2\vec{v}_2 + \cdots + c_k\vec{v}_k = \vec{0}$$
is given by setting $c_i = 0$ for every $i$.

A dependency relation is any equation giving one of the vectors in terms of the others, for example $\vec{v}_1 = c_2\vec{v}_2 + \cdots + c_k\vec{v}_k$.

A collection of vectors is independent if and only if no vector lies in the span of the others, which holds if and only if none of them can be written as a linear combination of the others, i.e. there are no dependency relations.

The independence or potential dependency relations among a set of vectors are best studied by putting the vectors into a matrix as columns:
$$M = \begin{pmatrix} | & | & & | \\ \vec{v}_1 & \vec{v}_2 & \cdots & \vec{v}_k \\ | & | & & | \end{pmatrix}.$$

The vectors are independent if and only if this matrix has rank $k$. That means: row reduction converts it to a matrix with $k$ pivots, one per column.

When $k = n$, meaning the number of vectors is the same as the number of components in each vector, then $M$ is made of independent vectors if and only if it is invertible, which holds if and only if it has nonvanishing determinant, i.e. $\det M \neq 0$.
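A sketch of the rank and determinant tests in NumPy (the sample vectors are chosen for illustration and include a deliberate dependency):

```python
import numpy as np

# Three vectors in R^3, placed as the columns of M.
M = np.column_stack([
    np.array([1.0, 0.0, 2.0]),
    np.array([0.0, 1.0, 1.0]),
    np.array([1.0, 1.0, 3.0]),   # = column1 + column2, a dependency relation
])

k = M.shape[1]
print(np.linalg.matrix_rank(M) == k)   # False: the columns are dependent
print(np.linalg.det(M))                # ~ 0, confirming non-invertibility
```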

Matrix algebra: composition as multiplication

Matrices can be multiplied by each other. Express $B = (\vec{b}_1 \ \vec{b}_2 \ \cdots \ \vec{b}_n)$ as a row of column vectors; then $AB$ is the matrix which is the row of column vectors $(A\vec{b}_1 \ A\vec{b}_2 \ \cdots \ A\vec{b}_n)$. In other words, $A$ simply acts upon each column of $B$ in parallel.

This rule means that $AB$ acts on a vector by composition of the actions: $B$ first and $A$ second:
$$(AB)\vec{v} = A(B\vec{v}).$$

The identity matrix $I = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$ acts by doing nothing, and therefore it does nothing in matrix multiplication: $AI = IA = A$. (For higher $n$ we have $I = \begin{pmatrix} 1 & & \\ & \ddots & \\ & & 1 \end{pmatrix}$, with ones on the diagonal and zeros elsewhere.)

The inverse matrix, written $A^{-1}$, is a matrix which does the opposite of the action of $A$, undoing the effect of $A$. So $A^{-1}(A\vec{v}) = \vec{v}$ for every $\vec{v}$. In terms of matrix multiplication, $A^{-1}A = AA^{-1} = I$.

The inverse does not always exist, since some actions cannot be undone! For example the matrix $\begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}$ sends $\begin{pmatrix} x \\ y \end{pmatrix}$ to $\begin{pmatrix} x \\ 0 \end{pmatrix}$, deleting the data of $y$. The data cannot be recovered, so this matrix has no inverse.
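A short NumPy sketch of these rules, with arbitrary sample matrices and a projection standing in for a non-invertible action:

```python
import numpy as np

A = np.array([[0.0, -1.0],
              [1.0,  0.0]])      # rotation by 90 degrees
B = np.array([[2.0, 0.0],
              [0.0, 3.0]])       # scaling
v = np.array([1.0, 1.0])

# Composition: (A B) v  ==  A (B v)   -- B acts first, then A.
print(np.allclose((A @ B) @ v, A @ (B @ v)))          # True

I = np.eye(2)
print(np.allclose(A @ I, A), np.allclose(I @ A, A))   # identity does nothing

A_inv = np.linalg.inv(A)
print(np.allclose(A_inv @ A, I))                      # inverse undoes A

# A projection has no inverse: it deletes information.
P = np.array([[1.0, 0.0],
              [0.0, 0.0]])
try:
    np.linalg.inv(P)
except np.linalg.LinAlgError as err:
    print("not invertible:", err)
```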

Eigenvectors

The action of a diagonal matrix is to scale each row by the diagonal entry in that row:
$$\begin{pmatrix} d_1 & 0 \\ 0 & d_2 \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} d_1 x \\ d_2 y \end{pmatrix}.$$

This action is very easy to express algebraically in terms of standard basis vectors:
$$\begin{pmatrix} d_1 & 0 \\ 0 & d_2 \end{pmatrix}\vec{e}_1 = d_1\vec{e}_1, \qquad \begin{pmatrix} d_1 & 0 \\ 0 & d_2 \end{pmatrix}\vec{e}_2 = d_2\vec{e}_2.$$

In other words, standard basis vectors are sent to scalar multiples of themselves.

Whenever a vector $\vec{v}$ is mapped to a multiple of itself, even when $A$ is not diagonal, the vector is called an eigenvector (“self-vector”) of $A$, and the scalar $\lambda$ is called an eigenvalue (“self-value”):
$$A\vec{v} = \lambda\vec{v}.$$

Meaning of eigenvectors

Eigenvectors are “special” or “natural” vectors associated to the matrix $A$, since they reveal a critical proportionality between the vector entries in $\vec{v}$ such that $A$ does not disturb this proportionality. If an eigenvector is interpreted as a direction, then the action of $A$ simply rescales vectors in that direction. As an example, every rotation in 3D can be written as a $3 \times 3$ matrix, and an eigenvector of this matrix would point along the axis of rotation, since rotation does not change the direction of axis vectors. The eigenvalue in this case would be $\lambda = 1$.

The eigenvalue equation above is equivalent to the equation $A\vec{v} - \lambda I\vec{v} = \vec{0}$. Frequently the shorthand notation $A - \lambda I$ is used, so the last equation becomes $(A - \lambda I)\vec{v} = \vec{0}$.

In linear algebra one develops further machinery to analyze equations $M\vec{x} = \vec{b}$ for any matrix $M$ and vector $\vec{b}$. Setting $M = A - \lambda I$ and $\vec{b} = \vec{0}$, the equation becomes $(A - \lambda I)\vec{v} = \vec{0}$, so the machinery of linear algebra would be applicable here. Nonetheless, we will not learn that machinery in this ODE course. Our goal is only to see how matrix algebra techniques are applicable to solving linear systems.

Finding eigenvalues and eigenvectors

The characteristic polynomial of a matrix $A$ is defined as the determinant $p(\lambda) = \det(A - \lambda I)$, in which $\lambda$ is considered an indeterminate variable.

For example: if $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ then $A - \lambda I = \begin{pmatrix} a - \lambda & b \\ c & d - \lambda \end{pmatrix}$ and (using the $2 \times 2$ determinant formula) we have:
$$p(\lambda) = (a - \lambda)(d - \lambda) - bc = \lambda^2 - (a + d)\lambda + (ad - bc).$$

Key fact for finding eigenvalues

The eigenvalues of a matrix $A$ are the roots $\lambda$ of its characteristic polynomial $p(\lambda) = \det(A - \lambda I)$.

In the previous example, therefore, the two roots of this quadratic are the eigenvalues.

Once the eigenvalues are found by solving $p(\lambda) = 0$ for $\lambda$, the solutions can be plugged back in for $\lambda$ in the equation $(A - \lambda I)\vec{v} = \vec{0}$, and then we can try to find the eigenvectors $\vec{v}$ by solving this equation. That is normally done using the linear algebra technique of “row reduction,” but in this course you will only solve such equations in the $2 \times 2$ case, when it is very easy: one of the rows of $A - \lambda I$ will be a multiple of the other whenever $\lambda$ is an eigenvalue, so the equation reduces to a single relation of proportionality between $v_1$ and $v_2$.

Notice that if $\vec{v}$ is an eigenvector, then $c\vec{v}$ is also an eigenvector for any nonzero scalar $c$. So really we are studying eigenlines, namely the spans of eigenvectors. The relation of proportionality between $v_1$ and $v_2$ gives the direction of this line.

For each eigenvalue $\lambda$ found in the previous example, solving $(A - \lambda I)\vec{v} = \vec{0}$ means solving the single relation of proportionality between $v_1$ and $v_2$ that its (mutually proportional) rows give. That proportionality produces an eigenvector $\vec{v}$, and a whole line of eigenvectors is given by the span $c\vec{v}$ for all scalars $c$; the other eigenvalue is handled the same way and gives a second eigenvector together with its span.
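The whole procedure can be sketched numerically (the sample matrix below is my own illustrative choice, not necessarily the example used above; NumPy is assumed):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])       # sample symmetric matrix

# Characteristic polynomial: lambda^2 - tr(A) lambda + det(A); its roots are the eigenvalues.
trace, det = np.trace(A), np.linalg.det(A)
print(np.roots([1.0, -trace, det]))          # [3., 1.]

# Same eigenvalues, plus eigenvectors, from the library routine.
eigvals, eigvecs = np.linalg.eig(A)
for lam, v in zip(eigvals, eigvecs.T):       # columns of eigvecs are the eigenvectors
    print(lam, v, np.allclose(A @ v, lam * v))   # A v = lambda v holds for each pair
```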

Matrix calculus

An important technique for differential equations that is frequently not covered in linear algebra courses is that of differentiating matrices and taking matrix power series such as $e^{At}$. This is important for systems of differential equations because the general form of a linear homogeneous constant coefficient system:
$$\vec{x}\,'(t) = A\,\vec{x}(t)$$

has for its general fundamental solution the powerful formula:
$$X(t) = e^{At}.$$

We unpack this formula and some related material now.

A time-varying matrix or vector may be given as an array of functions of time:
$$A(t) = \begin{pmatrix} a_{11}(t) & a_{12}(t) \\ a_{21}(t) & a_{22}(t) \end{pmatrix}, \qquad \vec{x}(t) = \begin{pmatrix} x_1(t) \\ x_2(t) \end{pmatrix}.$$

Since matrices are added / subtracted / scaled by adding / subtracting / scaling their coefficients in parallel, it is easy to define derivatives (and integrals, for that matter) coefficient by coefficient:
$$\frac{d}{dt}A(t) = \begin{pmatrix} a_{11}'(t) & a_{12}'(t) \\ a_{21}'(t) & a_{22}'(t) \end{pmatrix}.$$

A ‘matrix product’ rule is a little less trivial, but also true:
$$\frac{d}{dt}\bigl(A(t)B(t)\bigr) = A'(t)\,B(t) + A(t)\,B'(t).$$

This rule follows from applying the ordinary product rule to the general formula $(AB)_{ij} = \sum_k a_{ik} b_{kj}$ for each single entry in the matrix product.
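A finite-difference check of the matrix product rule (a sketch; the two matrices of functions below are arbitrary choices):

```python
import numpy as np

def A(t):
    return np.array([[np.cos(t), t],
                     [t**2,      1.0]])

def B(t):
    return np.array([[np.exp(t), 0.0],
                     [np.sin(t), t]])

def dA(t):   # entrywise derivatives of A
    return np.array([[-np.sin(t), 1.0],
                     [2 * t,      0.0]])

def dB(t):   # entrywise derivatives of B
    return np.array([[np.exp(t), 0.0],
                     [np.cos(t), 1.0]])

t, h = 0.7, 1e-6
numeric = (A(t + h) @ B(t + h) - A(t - h) @ B(t - h)) / (2 * h)   # centered difference of d/dt (A B)
product_rule = dA(t) @ B(t) + A(t) @ dB(t)
print(np.allclose(numeric, product_rule, atol=1e-5))               # True
```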

Now it is possible to apply any power series function to a matrix. This is because matrix multiplication (for the powers), along with scaling and summing, is enough to evaluate the series. For example, recall that
$$e^x = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots = \sum_{k=0}^{\infty} \frac{x^k}{k!}.$$

It is easy enough to plug a matrix into this formula, so we use the formula to define $e^{A}$:
$$e^{A} = I + A + \frac{A^2}{2!} + \frac{A^3}{3!} + \cdots = \sum_{k=0}^{\infty} \frac{A^k}{k!}.$$

(Does such a thing converge? It is possible to show that it does, for any matrix $A$, but we cannot do so here.)
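A sketch comparing a truncated power series against SciPy's `expm` (the truncation length and sample matrix are arbitrary choices):

```python
import numpy as np
from scipy.linalg import expm
from math import factorial

A = np.array([[0.0, 1.0],
              [-1.0, 0.0]])     # sample matrix

# Truncated series  I + A + A^2/2! + ... + A^N/N!
N = 20
series = sum(np.linalg.matrix_power(A, k) / factorial(k) for k in range(N + 1))

print(np.allclose(series, expm(A)))   # True: the truncated series matches expm(A)
```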

More generally, if $f(x) = \sum_k c_k x^k$ is any power series function, then we can define the evaluation $f(A)$ wherever the series converges:
$$f(A) = \sum_{k=0}^{\infty} c_k A^k.$$

Now, notice a significant fact about diagonal matrices: multiplication of diagonal matrices is just multiplication of the respective diagonal entries:
$$\begin{pmatrix} a & 0 \\ 0 & b \end{pmatrix}\begin{pmatrix} c & 0 \\ 0 & d \end{pmatrix} = \begin{pmatrix} ac & 0 \\ 0 & bd \end{pmatrix}.$$

Since the powers in a power series are defined using matrix multiplication, this tells us that powers are easy for diagonal matrices:
$$\begin{pmatrix} a & 0 \\ 0 & b \end{pmatrix}^k = \begin{pmatrix} a^k & 0 \\ 0 & b^k \end{pmatrix}.$$

This, in turn, implies that power series functions applied to diagonal matrices are simply given by applying the function to the diagonal entries:
$$f\!\begin{pmatrix} a & 0 \\ 0 & b \end{pmatrix} = \begin{pmatrix} f(a) & 0 \\ 0 & f(b) \end{pmatrix},$$

so in particular:
$$\exp\begin{pmatrix} a & 0 \\ 0 & b \end{pmatrix} = \begin{pmatrix} e^a & 0 \\ 0 & e^b \end{pmatrix}.$$

(Sometimes “$\exp(A)$” or “$\exp(At)$” is used instead of $e^{A}$ or $e^{At}$ when the notation is more convenient.)
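A quick check of the diagonal case (assuming NumPy and SciPy; the diagonal entries are chosen arbitrarily):

```python
import numpy as np
from scipy.linalg import expm

D = np.diag([2.0, -1.0])

# exp of a diagonal matrix = diagonal matrix of exp of the entries
print(expm(D))
print(np.diag(np.exp([2.0, -1.0])))   # same matrix
```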

Non-diagonal matrix powers

If $A$ is non-diagonal, then computing $e^{A}$ can be extremely challenging! It is necessary to develop an efficient way to find higher matrix powers $A^k$. There are some tools for this in linear algebra: basically you change coordinates so that the matrix appears to be diagonal or almost diagonal, you power it up in the new coordinates, and then you change back to the original coordinates.
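A sketch of the change-of-coordinates idea for a diagonalizable sample matrix (chosen for illustration): write $A = P D P^{-1}$ with $D$ diagonal, so that $A^k = P D^k P^{-1}$.

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])            # diagonalizable sample matrix

eigvals, P = np.linalg.eig(A)         # columns of P are eigenvectors
D = np.diag(eigvals)                  # A = P D P^{-1} in the eigenvector coordinates

k = 10
A_to_k = P @ np.linalg.matrix_power(D, k) @ np.linalg.inv(P)   # power up D, change back
print(np.allclose(A_to_k, np.linalg.matrix_power(A, k)))       # True
```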

The matrix exponential has some nice properties that follow from the power series formula:

  • $e^{A+B} = e^{A} e^{B}$ (When $AB = BA$, but not necessarily when $AB \neq BA$!)
  • $\left(e^{A}\right)^{-1} = e^{-A}$ (Always.)
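These properties can be checked numerically (a sketch with arbitrary sample matrices: one commuting pair, one non-commuting pair; SciPy's `expm` is assumed):

```python
import numpy as np
from scipy.linalg import expm

A = np.array([[1.0, 0.0], [0.0, 2.0]])
B = np.array([[3.0, 0.0], [0.0, -1.0]])    # both diagonal, so A B = B A
print(np.allclose(expm(A + B), expm(A) @ expm(B)))   # True: A and B commute

C = np.array([[0.0, 1.0], [0.0, 0.0]])
E = np.array([[0.0, 0.0], [1.0, 0.0]])     # C E != E C
print(np.allclose(expm(C + E), expm(C) @ expm(E)))   # False: the rule fails without commuting

print(np.allclose(expm(A) @ expm(-A), np.eye(2)))    # True: inverse of e^A is e^{-A}, always
```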

Finally, consider differentiation. Using the power series definition for $e^{At}$, we discover that:
$$\frac{d}{dt} e^{At} = A + A^2 t + \frac{A^3 t^2}{2!} + \cdots = A\left(I + At + \frac{(At)^2}{2!} + \cdots\right).$$

Therefore:
$$\frac{d}{dt} e^{At} = A\, e^{At}.$$

Matrix exponential solves ODE

The function $X(t) = e^{At}$ solves the matrix ODE:
$$X'(t) = A\,X(t).$$

Given the way matrix multiplication works, this formula means (with $n \times n$ matrices) that we have $n$ distinct vector solutions to the ODE $\vec{x}\,' = A\vec{x}$, namely the columns of the matrix exponential:
$$e^{At} = \begin{pmatrix} | & | & & | \\ \vec{x}_1(t) & \vec{x}_2(t) & \cdots & \vec{x}_n(t) \\ | & | & & | \end{pmatrix}.$$

Equating the corresponding columns, this means:
$$\frac{d}{dt}\vec{x}_i(t) = A\,\vec{x}_i(t) \quad \text{for each } i = 1, \ldots, n.$$

Notice that at each given time $t$, the vectors $\vec{x}_1(t), \ldots, \vec{x}_n(t)$ are linearly independent! This is because they are the column vectors of the matrix $e^{At}$, and this entity is invertible because its inverse is always given by the matrix $e^{-At}$.

In conclusion, by using matrix exponentials we have a remarkably efficient way to generate and describe a set of $n$ independent solutions to the ODE given by $\vec{x}\,' = A\vec{x}$.
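As a final sketch (sample coefficient matrix and time chosen arbitrarily), one can check numerically that $X(t) = e^{At}$ satisfies $X' = AX$ and stays invertible:

```python
import numpy as np
from scipy.linalg import expm

A = np.array([[0.0, 1.0],
              [-2.0, -3.0]])          # sample coefficient matrix

X = lambda s: expm(A * s)
t, h = 1.0, 1e-6

# Centered finite difference of X(t) versus A X(t): they agree.
dX = (X(t + h) - X(t - h)) / (2 * h)
print(np.allclose(dX, A @ X(t), atol=1e-5))     # True: X' = A X

# The columns of X(t) stay independent: e^{At} is invertible, with inverse e^{-At}.
print(np.allclose(X(t) @ X(-t), np.eye(2)))     # True
```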