Packet 05

Vectors I: Generators

Basics of nD vectors

Vectors in โ„n have n real number components:

๐ฏ=(v1,v2,โ€ฆ,vn).

Such vectors are added componentwise, and scalars multiply every component simultaneously. All the abstract operations and properties of vectors apply to vectors in $\mathbb{R}^n$:

  • Operations: addition and scalar multiplication,
  • Properties: commutativity, associativity, distributivity, zero vector.

There are $n$ standard basis vectors:

$$\mathbf{e}_i = (0, \dots, 0, \underbrace{1}_{i\text{th}}, 0, \dots, 0).$$

Decomposition works as in 3D:

$$\mathbf{v} = v_1\mathbf{e}_1 + v_2\mathbf{e}_2 + \cdots + v_n\mathbf{e}_n, \qquad \mathbf{v} = \sum_{i=1}^{n} v_i \mathbf{e}_i.$$

Pairs of vectors in nD also have dot products, defined by summing component products:

$$\mathbf{u} \cdot \mathbf{v} = (u_1, \dots, u_n) \cdot (v_1, \dots, v_n) = u_1 v_1 + \cdots + u_n v_n = \sum_{i=1}^{n} u_i v_i.$$

The norm of an nD vector is still $|\mathbf{u}| = \sqrt{\mathbf{u} \cdot \mathbf{u}}$.

The dot product still has the meaning of "relative alignment between vectors," and can still be used to determine the angle between vectors using the cosine formula, $\mathbf{u} \cdot \mathbf{v} = |\mathbf{u}||\mathbf{v}|\cos\theta$. However, this angle is considerably less important in nD.
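For concreteness, the componentwise dot product, the norm, and the cosine formula can be sketched in plain Python (the vectors here are arbitrary illustrative choices, not from the text):

```python
import math

def dot(u, v):
    """Dot product of two nD vectors: sum of component products."""
    return sum(ui * vi for ui, vi in zip(u, v))

def norm(u):
    """Norm of an nD vector: |u| = sqrt(u . u)."""
    return math.sqrt(dot(u, u))

u = (1, 2, 2, 0)
v = (2, 0, 0, 0)
print(dot(u, v))   # 2
print(norm(u))     # 3.0

# Angle between u and v via the cosine formula u.v = |u||v| cos(theta).
theta = math.acos(dot(u, v) / (norm(u) * norm(v)))
```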

Projection in nD is very important. It is computed with the same formula:

$$\mathbf{u}_{\parallel} = \mathrm{proj}_{\mathbf{v}}(\mathbf{u}) = (\mathbf{u} \cdot \mathbf{e}_{\mathbf{v}})\, \mathbf{e}_{\mathbf{v}} = \left( \frac{\mathbf{u} \cdot \mathbf{v}}{\mathbf{v} \cdot \mathbf{v}} \right) \mathbf{v},$$

where $\mathbf{e}_{\mathbf{v}} = \mathbf{v}/|\mathbf{v}|$ is the unit vector along $\mathbf{v}$.

We also have $\mathbf{u}_{\perp} = \mathbf{u} - \mathbf{u}_{\parallel}$. Notice that $\mathbf{u}_{\perp}$ now lives in the hyperplane perpendicular to $\mathbf{v}$. Given various $\mathbf{u}$ and a fixed $\mathbf{v}$, the various $\mathbf{u}_{\parallel}$ are all parallel to $\mathbf{v}$, but the various $\mathbf{u}_{\perp}$ are not all parallel to each other.
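A quick numerical sketch of the projection formula (the vectors below are illustrative choices): computing $\mathbf{u}_{\parallel}$ and $\mathbf{u}_{\perp}$ and checking that the perpendicular part really is perpendicular to $\mathbf{v}$.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def proj(u, v):
    """Projection of u onto v: ((u.v)/(v.v)) v."""
    c = dot(u, v) / dot(v, v)
    return tuple(c * vi for vi in v)

u = (3, 1, 0, 2)
v = (1, 1, 1, 1)
u_par = proj(u, v)
u_perp = tuple(a - b for a, b in zip(u, u_par))

print(u_par)           # (1.5, 1.5, 1.5, 1.5)
print(dot(u_perp, v))  # 0.0, since u_perp lies in the hyperplane perpendicular to v
```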

The Cauchy-Schwarz and triangle inequalities become more important in nD:

$$|\mathbf{u} \cdot \mathbf{v}| \le |\mathbf{u}||\mathbf{v}|, \qquad |\mathbf{u} + \mathbf{v}| \le |\mathbf{u}| + |\mathbf{v}|.$$

The vector formula for a line through $\mathbf{r}_0$ in the direction of $\mathbf{v}$, namely $\mathbf{r}(t) = \mathbf{r}_0 + t\mathbf{v}$, still works in nD. However, the formula for a plane:

$$(\mathbf{r} - \mathbf{r}_0) \cdot \mathbf{n} = 0$$

determines a hyperplane in nD, meaning an $(n-1)$D space inside of $\mathbb{R}^n$. We could write this space in scalar form as

$$k_1(x_1 - a_1) + k_2(x_2 - a_2) + \cdots + k_n(x_n - a_n) = 0,$$

where:

$$\mathbf{n} = (k_1, \dots, k_n), \qquad \mathbf{r}_0 = (a_1, \dots, a_n), \qquad \mathbf{r} = (x_1, \dots, x_n).$$
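To make the hyperplane condition concrete, here is a small sketch in Python. The point $\mathbf{r}_0$ and normal $\mathbf{n}$ below are hypothetical choices, not taken from the text:

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def on_hyperplane(r, r0, n, tol=1e-9):
    """Test the hyperplane condition (r - r0) . n = 0."""
    diff = tuple(a - b for a, b in zip(r, r0))
    return abs(dot(diff, n)) < tol

# Hypothetical hyperplane in R^4 through r0 with normal n.
r0 = (1, 0, 0, 0)
n  = (1, 1, 1, 1)
print(on_hyperplane((0, 1, 0, 0), r0, n))  # True
print(on_hyperplane((1, 1, 0, 0), r0, n))  # False
```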

Spans

Linear combinations of vectors in nD work just as in 3D:

$$\mathbf{v} = a_1\mathbf{u}_1 + a_2\mathbf{u}_2 + \cdots + a_n\mathbf{u}_n, \qquad a_i \in \mathbb{R}.$$

A span is the collection of all vectors which can be obtained as linear combinations of certain given vectors. For example, the collection of all possible $\mathbf{v}$ that can be written as above, for any $a_i$, in terms of the $\mathbf{u}_i$ given at the outset, is written either as:

$$\mathrm{span}\{\mathbf{u}_1, \dots, \mathbf{u}_n\}, \qquad \langle \mathbf{u}_1, \dots, \mathbf{u}_n \rangle.$$

It is still an important fact that a span passes through the origin $\mathbf{0} = (0, 0, \dots, 0)$. This is because linear combinations do not include a constant term, so the point $\mathbf{0}$ can always be achieved by setting $a_i = 0$ for all $i$.

Example

Computing a span by hand

Consider the vectors $\mathbf{u} = (1, 1, 0, -1, 2)$ and $\mathbf{v} = (0, 2, 3, 1, -1)$. Problem: Show that the set of vectors perpendicular to both $\mathbf{u}$ and $\mathbf{v}$ is a span.

Solution: Let $\mathbf{x} = (x_1, \dots, x_5)$ be an arbitrary vector such that $\mathbf{x} \cdot \mathbf{u} = \mathbf{x} \cdot \mathbf{v} = 0$. By writing these dot products using components, we find a system of two equations:

$$x_1 + x_2 - x_4 + 2x_5 = 0, \qquad 2x_2 + 3x_3 + x_4 - x_5 = 0.$$

Solve for $x_1$ and $x_3$ in terms of the others:

$$x_1 = -x_2 + x_4 - 2x_5, \qquad x_3 = -\tfrac{2}{3}x_2 - \tfrac{1}{3}x_4 + \tfrac{1}{3}x_5.$$

Now we can let $x_2$, $x_4$, and $x_5$ take any values, and use these equations to specify $x_1$ and $x_3$ so that the system of equations holds, implying that $\mathbf{x} \cdot \mathbf{u} = \mathbf{x} \cdot \mathbf{v} = 0$. Conversely, given values of $x_2$, $x_4$, and $x_5$, these equations fully determine the only possible values of $x_1$ and $x_3$. Therefore, the set of possible $\mathbf{x}$ is given by varying $x_2$, $x_4$, and $x_5$ in the vector:

$$\mathbf{x} = \begin{pmatrix} -x_2 + x_4 - 2x_5 \\ x_2 \\ -\tfrac{2}{3}x_2 - \tfrac{1}{3}x_4 + \tfrac{1}{3}x_5 \\ x_4 \\ x_5 \end{pmatrix}.$$

Observe that this vector can be written as the linear combination $x_2\mathbf{w}_2 + x_4\mathbf{w}_4 + x_5\mathbf{w}_5$, with

$$\mathbf{w}_2 = \begin{pmatrix} -1 \\ 1 \\ -2/3 \\ 0 \\ 0 \end{pmatrix}, \qquad \mathbf{w}_4 = \begin{pmatrix} 1 \\ 0 \\ -1/3 \\ 1 \\ 0 \end{pmatrix}, \qquad \mathbf{w}_5 = \begin{pmatrix} -2 \\ 0 \\ 1/3 \\ 0 \\ 1 \end{pmatrix}.$$

So indeed the set of vectors $\mathbf{x}$ perpendicular to $\mathbf{u}$ and $\mathbf{v}$ is the span $\langle \mathbf{w}_2, \mathbf{w}_4, \mathbf{w}_5 \rangle$.
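We can double-check this example numerically. The following Python sketch uses exact rational arithmetic to verify that each generator is perpendicular to both $\mathbf{u}$ and $\mathbf{v}$:

```python
from fractions import Fraction as F

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# The vectors from the example above.
u = (1, 1, 0, -1, 2)
v = (0, 2, 3, 1, -1)
w2 = (-1, 1, F(-2, 3), 0, 0)
w4 = (1, 0, F(-1, 3), 1, 0)
w5 = (-2, 0, F(1, 3), 0, 1)

for w in (w2, w4, w5):
    print(dot(w, u), dot(w, v))  # 0 0 for every generator
```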

Convex combinations

A convex combination of vectors is a linear combination with a certain constraint on the coefficients:

$$\mathbf{v} = c_1\mathbf{u}_1 + c_2\mathbf{u}_2 + \cdots + c_n\mathbf{u}_n, \qquad c_i \in [0, 1], \qquad \sum_{i=1}^{n} c_i = 1.$$
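A minimal Python sketch of the constraint; the points and coefficients below are illustrative assumptions:

```python
def convex_combination(coeffs, vectors):
    """Linear combination whose coefficients lie in [0, 1] and sum to 1."""
    assert all(0 <= c <= 1 for c in coeffs)
    assert abs(sum(coeffs) - 1) < 1e-9
    n = len(vectors[0])
    return tuple(sum(c * v[i] for c, v in zip(coeffs, vectors)) for i in range(n))

# Hypothetical example: a weighted average of three points in R^2.
pts = [(0, 0), (2, 0), (0, 2)]
print(convex_combination([0.5, 0.25, 0.25], pts))  # (0.5, 0.5)
```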

Subspaces

A subspace is any collection of vectors that satisfies the rules of a vector space, meaning the operations and properties. Since the properties automatically hold for vectors from the original space, the key point is that a subspace is a collection of vectors including the origin, all of whose linear combinations are still in the subspace.

Symbolically: if $W \subset \mathbb{R}^n$ is any subset of vectors containing $\mathbf{0}$, and if $a\mathbf{w}_1 + b\mathbf{w}_2 \in W$ automatically whenever $\mathbf{w}_1, \mathbf{w}_2 \in W$, then $W$ is a subspace.

Question 05-01

Subspaces

Suppose a set $W$ satisfies the symbolic hypothesis above:

$$\mathbf{w}_1, \mathbf{w}_2 \in W \quad\text{implies}\quad a\mathbf{w}_1 + b\mathbf{w}_2 \in W, \quad \text{for any } a, b.$$

Show that any linear combination of n vectors from W must lie in W.

Exercise 05-01

Subspaces given by perpendicularity

Let WโŠ‚โ„n be defined as the set of vectors which are perpendicular to the given set {๐ฏ1,โ€ฆ,๐ฏk}. Show that W is a subspace.

To say that W is a subspace is to say that W is a vector space in its own right, even though vectors in W may also live in a bigger space.

Example

Simple subspaces of โ„n

Let WโŠ‚โ„n be the collection of all possible vectors ๐ฐ=(w1,โ€ฆ,wnโˆ’1,0), meaning that the final term is always zero, and the other terms can be anything.

This set is a subspace because it contains (0,0,โ€ฆ,0), and any linear combination a๐ฐ1+b๐ฐ2 will still have a zero in the final component, so it lies in W.

Question 05-02

Iterating from 2 to n

How many distinct subspaces of $\mathbb{R}^n$ can you describe using the general idea of the previous example?

Spans and subspaces

A span is always a subspace: it contains $\mathbf{0}$, and any linear combination of some vectors in a span can be written as a linear combination of the original vectors defining the span (by expanding each vector in terms of the original vectors, and then collecting like terms).

The converse is also true: any subspace is the span of some vectors. This is not so easy to prove! Here is the proof:

Why every subspace is a span

We consider $\mathbb{R}^k \subset \mathbb{R}^n$ by using the first $k$ components, setting the last $n-k$ components to zero. We prove that if we assume every subspace of $\mathbb{R}^k$ is a span, then every subspace of $\mathbb{R}^{k+1}$ must also be a span. (The result will follow because it is clearly true for $\mathbb{R}^1$, and we can use this fact to build up to $\mathbb{R}^n$ one dimension at a time.)

So we assume that every subspace of $\mathbb{R}^k$ is a span. Now suppose $W \subset \mathbb{R}^{k+1}$ is any subspace. Consider the last components of the vectors $\mathbf{w} \in W$. If all of these last components are zero, then actually $W \subset \mathbb{R}^k$, and by the assumption, it is a span.

Suppose, on the other hand, that at least one vector $\mathbf{w}^\star \in W$ has a last component $w^\star_{k+1} \neq 0$. Now we create the subspace $W' \subset \mathbb{R}^k$ defined as

$$W' = W \cap \mathbb{R}^k = \{ (w_1, w_2, \dots, w_{k+1}) \in W \mid w_{k+1} = 0 \}.$$

The collection $W'$ is a subspace of $\mathbb{R}^k$, so it is a span by the assumption, and we can find a generating set of vectors:

$$W' = \langle \mathbf{w}_1, \dots, \mathbf{w}_k \rangle.$$

Now we propose the following key fact:

$$W = \langle \mathbf{w}_1, \dots, \mathbf{w}_k, \mathbf{w}^\star \rangle.$$

To prove this, let $\mathbf{w} = (w_1, \dots, w_{k+1}) \in W$ be any vector. If $w_{k+1} = 0$, then $\mathbf{w} \in W'$, so it is in the proposed span above. If $w_{k+1} \neq 0$, then consider the vector $\mathbf{u} = \mathbf{w} - \frac{w_{k+1}}{w^\star_{k+1}} \mathbf{w}^\star$. This vector has zero in the last component, so $\mathbf{u} \in W'$, which means it can be expanded as a linear combination of $\mathbf{w}_1, \dots, \mathbf{w}_k$. By vector algebra, $\mathbf{w} = \mathbf{u} + \frac{w_{k+1}}{w^\star_{k+1}} \mathbf{w}^\star$, and we can substitute the expansion of $\mathbf{u}$ to see that $\mathbf{w}$ is in the proposed span.

The difference between the concept of span and the concept of subspace is a matter of connotation. When working with a 'span', we have in mind a collection of vectors that generates the span using linear combinations. When working with a 'subspace', we have in mind the abstract rules of vector spaces.

Incidentally, the above proof shows that any subspace of $\mathbb{R}^n$ can be written as the span of $n$ or fewer vectors.

Dimension

The dimension of a subspace $W \subset \mathbb{R}^n$, written $\dim W$, is the smallest possible number of vectors needed to span $W$.

This definition is very intuitive: a space has n dimensions if n different numbers are needed to locate every item in the space. These numbers are the coefficients of the spanning vectors in a minimal spanning set for the space.

The definition can also be hard to work with in a rigorous way. How could we prove that a given space could not be spanned by an even smaller number of vectors?

For example, of course $\mathbb{R}^n$ should be $n$-dimensional. That is why we have been saying "nD." Of course it is spanned by the $n$ standard basis vectors $\mathbf{e}_1, \dots, \mathbf{e}_n$. But how do we know it cannot be spanned by a smaller number?

In the next section the concept of independence will be developed to handle this question more generally. However, it is worthwhile practice with the general theory of spans to demonstrate this fact by learning about Steinitz Exchange:

Showing โ„n is n-dimensional: Steinitz Exchange Process

Suppose some set $\{\mathbf{v}_1, \dots, \mathbf{v}_k\}$ with only $k$ vectors could span $\mathbb{R}^n$ with $n > k$, so:

$$\mathbb{R}^n = \langle \mathbf{v}_1, \dots, \mathbf{v}_k \rangle.$$

This means that $\mathbf{e}_1 \in \langle \mathbf{v}_1, \dots, \mathbf{v}_k \rangle$. Then it turns out we can "exchange" $\mathbf{e}_1$ for one of the $\mathbf{v}_i$:

$$\mathbb{R}^n = \langle \mathbf{e}_1, \mathbf{v}'_2, \dots, \mathbf{v}'_k \rangle,$$

where the notation $\mathbf{v}'_2, \dots, \mathbf{v}'_k$ means some subset of $\mathbf{v}_1, \dots, \mathbf{v}_k$ with only $k-1$ elements, but we don't care which is which. The reason we can do this is important: since $\mathbf{e}_1 \in \langle \mathbf{v}_1, \dots, \mathbf{v}_k \rangle$, we can write a linear combination

$$\mathbf{e}_1 = a_1\mathbf{v}_1 + \cdots + a_k\mathbf{v}_k$$

with at least one $a_{i^\star} \neq 0$. The equation can be solved for $\mathbf{v}_{i^\star}$ in terms of the other vectors in the equation, and therefore every vector in $\mathbb{R}^n$ can be expressed as a linear combination of those other vectors. But that is the meaning of "$\mathbb{R}^n = \langle \mathbf{e}_1, \mathbf{v}'_2, \dots, \mathbf{v}'_k \rangle$."

Now iterate this process: we can exchange $\mathbf{e}_2$ for another vector out of $\mathbf{v}'_2, \dots, \mathbf{v}'_k$:

$$\mathbb{R}^n = \langle \mathbf{e}_1, \mathbf{e}_2, \mathbf{v}''_3, \dots, \mathbf{v}''_k \rangle.$$

The reason we can do this is similar, but with a slight twist. As before, write a linear combination:

$$\mathbf{e}_2 = a_1\mathbf{e}_1 + a_2\mathbf{v}'_2 + \cdots + a_k\mathbf{v}'_k.$$

This time, however, we know that at least one of $a_2, \dots, a_k$ is nonzero, say $a_{i^\star}$, because there is no way to generate $\mathbf{e}_2$ using combinations of $\mathbf{e}_1$ alone. So we solve for $\mathbf{v}'_{i^\star}$ in terms of the other vectors in the equation. This means we can eliminate $\mathbf{v}'_{i^\star}$ from the spanning set, provided we add in $\mathbf{e}_2$.

Now iterate the process $k$ times, replacing some $\mathbf{v}_{i^\star}$ at each successive stage with a new vector $\mathbf{e}_j$, observing that the linear combination for $\mathbf{e}_j$ cannot have its nonzero coefficients only on the previously exchanged $\mathbf{e}_i$.

After $k$ iterations, we have $\mathbb{R}^n = \langle \mathbf{e}_1, \dots, \mathbf{e}_k \rangle$. This is impossible! Since $k < n$, there is no way to get nonzero entries in components higher than $k$.
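As a sanity check, here is the smallest hypothetical case written out: with $n = 2$ and $k = 1$, a single exchange already forces the contradiction.

```latex
% Hypothetical case n = 2, k = 1: suppose \mathbb{R}^2 = \langle \mathbf{v}_1 \rangle.
% One exchange replaces \mathbf{v}_1 by \mathbf{e}_1:
\[
\mathbb{R}^2 = \langle \mathbf{v}_1 \rangle
\quad\Longrightarrow\quad
\mathbb{R}^2 = \langle \mathbf{e}_1 \rangle.
\]
% But \mathbf{e}_2 = (0,1) is not a multiple of \mathbf{e}_1 = (1,0),
% so \mathbf{e}_2 \notin \langle \mathbf{e}_1 \rangle, a contradiction:
% no single vector can span \mathbb{R}^2.
```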

Problems due Tuesday 20 Feb 2024 by 11:59pm

Problem 05-01

Spans of column vectors

Matrices $M_1, M_2$ are $n \times n$ matrices, and $k < n$. Matrix $M_1$ has ones everywhere on and above the main diagonal, down to the $k$th row, and zeros elsewhere, including everything after the $k$th row. Matrix $M_2$ has ones on the main diagonal only, down to the $k$th row, and zeros everywhere else. Show that the span of the column vectors of $M_1$ is the same subspace as the span of the column vectors of $M_2$. What subspace is it?

(Hint: first try to verify the assertion for small values of k and n, for example when n=2 and k=1, and then n=3 and k=1 or k=2. Then see if you can generalize your method to all k<n.)

Problem 05-02

Computing dimensions by hand

  • Show that the span $\left\langle \begin{pmatrix} 1 \\ 1 \end{pmatrix}, \begin{pmatrix} 1 \\ -1 \end{pmatrix} \right\rangle$ has dimension 2 as a subspace by using the "exchange" technique.
  • Show that the span $\left\langle \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}, \begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix}, \begin{pmatrix} 0 \\ 2 \\ 2 \end{pmatrix} \right\rangle$ has dimension 3 as a subspace by using the "exchange" technique.
  • Show that the span $\left\langle \begin{pmatrix} 1 \\ -2 \\ 3 \end{pmatrix}, \begin{pmatrix} 4 \\ 0 \\ 2 \end{pmatrix}, \begin{pmatrix} 3 \\ -2 \\ 4 \end{pmatrix} \right\rangle$ has dimension 2 as a subspace.
Problem 05-03

Suppose we are given a collection of data points $(x_1, y_1), \dots, (x_n, y_n)$. Later in the course, we will learn how to compute the linear regression of this data, namely the line of best fit. In this problem, you will learn about the correlation coefficient of the data: a measure of how linear the data is. There is no point in computing a linear regression if the data is not very linear!

First, let $\mathbf{x} = (x_1, x_2, \dots, x_n)$ and $\mathbf{y} = (y_1, y_2, \dots, y_n)$. Then let $\bar{x}$ be the average of the $x_i$, and $\bar{y}$ be the average of the $y_i$. Therefore $\mathbf{x} - \bar{x}$ and $\mathbf{y} - \bar{y}$ have zero average. (By subtracting a scalar from a vector, we really mean to subtract the scalar from each component.) Then define the correlation coefficient:

$$\rho(\mathbf{x}, \mathbf{y}) = \frac{(\mathbf{x} - \bar{x}) \cdot (\mathbf{y} - \bar{y})}{|\mathbf{x} - \bar{x}|\,|\mathbf{y} - \bar{y}|}.$$

Correlation coefficient

Problem: Compute the correlation coefficient of the data:

$$(-3, 4), \quad (1, -1), \quad (0, -1), \quad (4, -3), \quad (-2, 1).$$

The correlation coefficient $\rho(\mathbf{x}, \mathbf{y})$ is frequently written '$r$'. The formula describes $r$ as a dot product of unit vectors, so $-1 \le r \le +1$. When $r = \pm 1$, the data are perfectly linear. When $r = +1$, the data lie on a line with positive slope, meaning that if you increase $x$ then you expect $y$ to increase as well. When $r = -1$, the data lie on a line with negative slope, so if you increase $x$ then you expect $y$ to decrease.

In a future Packet we will see why $r = \rho(\mathbf{x}, \mathbf{y})$ measures linearity. For now, consider the following. Suppose the data is exactly linear, so $y_i = m x_i + b$ for some $m$ and $b$. Then as vectors, we have $\mathbf{y} = m\mathbf{x} + b$. If we subtract the average of both sides, we have

$$\mathbf{y} - \bar{y} = m\mathbf{x} + b - \overline{m\mathbf{x} + b} = m(\mathbf{x} - \bar{x}) + b - b = m(\mathbf{x} - \bar{x}).$$

Therefore ๐ฒโˆ’y is a scalar multiple of ๐ฑโˆ’x, and the dot product of their unit vectors tells us whether they align or anti-align.

This problem (and its sequel) illustrates the use of nD vectors to study data that is presented as 2D vectors.