Scientific Computing Using Python - PHYS:4905 - Fall 2018

Lecture Notes #09 - 9/18/2018 - Prof. Kaaret

These notes borrow from Linear Algebra by Cherney, Denton, Thomas, and Waldron.

Linear Transformations

Now that we have the concept of vector spaces, we can introduce a new way to think about functions - that they are transformations of one vector space to another vector space.  For example, we may have the vector spaces V and W and the function L, which would then transform V → W.

Since this is a course on linear algebra, we are interested in linear transformations.  The function L is linear if

     L(ru+sv) = rL(u) + sL(v)

for all u, v ∊ V and r, s ∊ ℝ.  Sometimes linear functions are called linear maps, linear operators, or linear transformations.  The name "map" comes from the idea of mapping one vector space into another vector space.

Note that we could instead have required

     L(u+v) = L(u) + L(v)   and   L(ru) = rL(u)

These two separate statements of additivity and homogeneity are together equivalent to the single condition above.
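
As a quick numerical illustration in Python (a minimal sketch using NumPy; the matrix A below is an arbitrary example, not something from the notes above), we can spot-check the linearity condition for random inputs:

import numpy as np

# Multiplication by a fixed matrix is a linear function (example matrix).
A = np.array([[5.0, 2.0],
              [3.0, 2.0]])

def L(v):
    return A @ v

# Spot-check L(r*u + s*v) == r*L(u) + s*L(v) for random vectors and scalars.
rng = np.random.default_rng(0)
u, v = rng.standard_normal(2), rng.standard_normal(2)
r, s = rng.standard_normal(2)
print(np.allclose(L(r*u + s*v), r*L(u) + s*L(v)))  # True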


Why Are Linear Operators so Special?

In general, specifying a real function of one variable requires an infinite number of real variables.  This is clearly true if we allow any real function without any requirements like continuity.  Such a function is a map from the set of real numbers to another set of real numbers, so we need to specify the output real number separately for each input real number.  It also holds true if we require continuity.  In that case, we have a choice of what the derivative is at each point, so again, we need an infinite number of real numbers.

In contrast, linear functions are completely specified by just a few numbers.  Let's say that L is a linear operator and

     L \, \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 5 \\ 3 \end{pmatrix}      and     L \, \begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} 2 \\ 2 \end{pmatrix}

Can you figure out what L \, \begin{pmatrix} 1 \\ 1 \end{pmatrix} is?

We can go back to the requirement of additivity given above, L(u+v) = L(u) + L(v).  Then,

L \, \begin{pmatrix} 1 \\ 1 \end{pmatrix} = L \, \left[ \begin{pmatrix} 1 \\ 0 \end{pmatrix} + \begin{pmatrix} 0 \\ 1 \end{pmatrix} \right] = L \, \begin{pmatrix} 1 \\ 0 \end{pmatrix} + L \, \begin{pmatrix} 0 \\ 1 \end{pmatrix}

We wrote down the values of L for those two vectors above, so we can substitute those in and do a little math,

= L \, \begin{pmatrix} 1 \\ 0 \end{pmatrix} + L \, \begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} 5 \\ 3 \end{pmatrix} + \begin{pmatrix} 2 \\ 2 \end{pmatrix} = \begin{pmatrix} 7 \\ 5 \end{pmatrix}

If you think about it for a little while, you may conclude (correctly) that you can do this for any vector. Any arbitrary vector can be written in terms of our 'basis' vectors

     \begin{pmatrix} x \\ y \end{pmatrix} = x \, \begin{pmatrix} 1 \\ 0 \end{pmatrix} + y \, \begin{pmatrix} 0 \\ 1 \end{pmatrix}

We can then use that sum of vectors and do the same calculations that we did before

     L \, \begin{pmatrix} x \\ y \end{pmatrix} = L \, \left[ x \, \begin{pmatrix} 1 \\ 0 \end{pmatrix} + y \, \begin{pmatrix} 0 \\ 1 \end{pmatrix} \right] = x L \, \begin{pmatrix} 1 \\ 0 \end{pmatrix} + y L \, \begin{pmatrix} 0 \\ 1 \end{pmatrix} = x \begin{pmatrix} 5 \\ 3 \end{pmatrix} + y \begin{pmatrix} 2 \\ 2 \end{pmatrix} = \begin{pmatrix} 5x + 2y \\ 3x + 2y \end{pmatrix}

Therefore, we can figure out what L does to any vector in the vector space ℝ² that L is defined over.

Exactly how much information, or how many numbers, did we need to specify a linear function on ℝ²?

Looking back at the definition above, we need two column vectors, each with two elements.  This is four numbers.  We could also write the four numbers as a matrix

     \begin{pmatrix} 5 & 2 \\ 3 & 2 \end{pmatrix}

In general, a linear transformation operating on the vector space ℝⁿ is completely specified by how it acts on n vectors that each have exactly one non-zero component, called the basis vectors.  The result on each basis vector is a vector with n components.  Hence, we need n² numbers to completely specify the function.  Those n² numbers can be written as an n×n matrix.
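
We can carry out this bookkeeping in NumPy (a minimal sketch; the numbers are the ones from the example above).  The images of the basis vectors become the columns of the matrix:

import numpy as np

# Images of the standard basis vectors under L, from the example above.
L_e1 = np.array([5.0, 3.0])   # L(1, 0)
L_e2 = np.array([2.0, 2.0])   # L(0, 1)

# Stack those images as columns to get the matrix of L.
M = np.column_stack([L_e1, L_e2])

print(M @ np.array([1.0, 1.0]))   # [7. 5.], i.e. L(1, 1)
print(M @ np.array([3.0, -2.0]))  # [11. 5.], i.e. (5x + 2y, 3x + 2y) at x=3, y=-2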


Basis Vectors

We just saw that a linear operator acting on ℝ² is completely specified by how it acts on the two vectors \begin{pmatrix} 1 \\ 0 \end{pmatrix}  and  \begin{pmatrix} 0 \\ 1 \end{pmatrix}.  Do we need to use those particular two vectors?

The crucial steps in our little 'proof' above were using additivity and homogeneity to break down the operation of L on an arbitrary vector into operations on our basis vectors.

     L \, \begin{pmatrix} x \\ y \end{pmatrix} = L \, \left[ x \, \begin{pmatrix} 1 \\ 0 \end{pmatrix} + y \, \begin{pmatrix} 0 \\ 1 \end{pmatrix} \right] = x L \, \begin{pmatrix} 1 \\ 0 \end{pmatrix} + y L \, \begin{pmatrix} 0 \\ 1 \end{pmatrix}

We could do this because any arbitrary vector can be written in terms of the basis vectors

     \begin{pmatrix} x \\ y \end{pmatrix} = x \, \begin{pmatrix} 1 \\ 0 \end{pmatrix} + y \, \begin{pmatrix} 0 \\ 1 \end{pmatrix}

Could we write any arbitrary vector in terms of some other set of basis vectors?

How about  \begin{pmatrix} 1 \\ 1 \end{pmatrix}  and  \begin{pmatrix} 1 \\ -1 \end{pmatrix} ?

The question is whether we can find a and b such that

\begin{pmatrix} x \\ y \end{pmatrix} = a \, \begin{pmatrix} 1 \\ 1 \end{pmatrix} + b \, \begin{pmatrix} 1 \\ -1 \end{pmatrix}
Yes, we can!  You can set this up as a system of linear equations.  The solution is a = (x+y)/2, b = (x-y)/2.  Substituting in, we can check that this solution is valid.

\begin{pmatrix} x \\ y \end{pmatrix} = \frac{x+y}{2} \begin{pmatrix} 1 \\ 1 \end{pmatrix} + \frac{x-y}{2} \begin{pmatrix} 1 \\ -1 \end{pmatrix}
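
Rather than solving the system by hand, we can let NumPy do it (a minimal sketch; np.linalg.solve finds the coefficients a and b for a particular x and y):

import numpy as np

# The two candidate basis vectors, stored as the columns of a matrix.
B = np.array([[1.0,  1.0],
              [1.0, -1.0]])

x, y = 4.0, 2.0
a, b = np.linalg.solve(B, np.array([x, y]))
print(a, b)                       # 3.0 1.0, matching (x+y)/2 and (x-y)/2
print(a * B[:, 0] + b * B[:, 1])  # [4. 2.], recovering the original vector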


How about  \begin{pmatrix} 1 \\ 1 \end{pmatrix}  and  \begin{pmatrix} 2 \\ 2 \end{pmatrix} ?


How about  \begin{pmatrix} 1 \\ 1 \end{pmatrix}  and  \begin{pmatrix} 1 \\ -2 \end{pmatrix} ?


How about  \begin{pmatrix} 1 \\ 0 \end{pmatrix}  and  \begin{pmatrix} 0 \\ 1 \end{pmatrix}  and  \begin{pmatrix} 1 \\ 0 \end{pmatrix} ?

In general, the number of basis vectors needed is equal to the number of dimensions of the vector space on which the linear operator acts.  Any more creates redundancy in how we 'decompose' vectors.

Also the basis vectors must span the entire space, meaning that it must be possible to write any arbitrary vector as a linear combination of the basis vectors.

Do \begin{pmatrix} 1 \\ 1 \end{pmatrix}  and  \begin{pmatrix} 2 \\ 2 \end{pmatrix} span ℝ² ?

No, because those two vectors are not linearly independent.  You can get the second vector by multiplying the first vector by a constant.

Linear independence is easy to check in the two-dimensional case, but how do we generalize to three dimensions?  Is this set of vectors linearly independent, and do they span ℝ³ ?

\begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} \;\;\; \begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix} \;\;\; \begin{pmatrix} 2 \\ 1 \\ 0 \end{pmatrix}

You cannot get any of these vectors by multiplying another vector by a constant.
However, you can get the third vector by adding the first two.

A set of vectors is linearly dependent if you can get one vector (or more) in the set from a linear combination of the other vectors in the set.  Or, in math speak, the vectors v_1, v_2, ..., v_n are linearly dependent if there exist scalars c_1, c_2, ..., c_n, not all zero, such that

      c_1 v_1 + c_2 v_2 + ... + c_n v_n = 0

Once we get to determinants, we will see how to efficiently check whether a set of vectors is linearly dependent.
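
Until then, one practical check (a sketch; this uses NumPy's matrix_rank rather than the determinant method mentioned above) is to compare the rank of the matrix whose columns are the vectors against the number of vectors:

import numpy as np

# The three vectors from the R^3 example above, stored as columns.
V = np.array([[1.0, 1.0, 2.0],
              [0.0, 1.0, 1.0],
              [0.0, 0.0, 0.0]])

rank = np.linalg.matrix_rank(V)
print(rank)                # 2, which is less than 3
print(rank == V.shape[1])  # False, so the vectors are linearly dependent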

A basis is a set of vectors that can be used to uniquely express any other vector in the vector space.  The vectors must be linearly independent.  If the set of vectors is finite, then the number of vectors in the basis is the dimension of the vector space.


Basis Notation

In physics, we often write 3-vectors in a notation like

      v = x \, \hat{i} + y \, \hat{j} + z \, \hat{k}

The set of unit vectors \{ \hat{i}, \hat{j}, \hat{k} \} is the basis.  In this notation, we need to write down the basis vectors whenever we write a vector.  The vector written above makes sense; trying to write a vector without the basis vectors, e.g. v = x + y + z, would not.

In matrix notation, we define a set of vectors as a basis.  The 'standard' basis for ℝ³ is

e_1 = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} \, , \;\;\; e_2 = \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix} \, , \;\;\; e_3 = \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}

To use matrix notation for vectors, we need to assign an order to the vectors in our basis.  We denote that by using parentheses instead of curly brackets.  The basis given above is then E = (e_1, e_2, e_3), and we can write the vector in column vector form or in algebraic form,

     v = \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} x \\ y \\ z \end{pmatrix}_E = x \, e_1 + y \, e_2 + z \, e_3 = \begin{pmatrix} e_1, & e_2, & e_3 \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix}

The notation with the subscript E gives "the vector v in the basis E".  This is equivalent to writing out the vector in algebraic form with the basis vectors given explicitly and also to the product of the set of basis vectors with the column vector.  To evaluate the latter, do the multiplication in the same fashion as multiplying a matrix times a column vector.

We can use a different basis to describe the vectors.  For example, to form the basis B, we could pick a new set of vectors,

b_1 = \begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix} \, , \;\;\; b_2 = \begin{pmatrix} 1 \\ -1 \\ 0 \end{pmatrix} \, , \;\;\; b_3 = \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}

We can write the vector in this basis,

     v = \begin{pmatrix} x' \\ y' \\ z' \end{pmatrix}_B = x' \, b_1 + y' \, b_2 + z' \, b_3 = \begin{pmatrix} b_1, & b_2, & b_3 \end{pmatrix} \begin{pmatrix} x' \\ y' \\ z' \end{pmatrix}

Note that this changes the values of the components. 

     v = \begin{pmatrix} x' \\ y' \\ z' \end{pmatrix}_B = \begin{pmatrix} x' + y' \\ x' - y' \\ z' \end{pmatrix}

Note that only in the standard basis do the basis coefficients equal the components of the column vector.  Converting from one basis to another is equivalent to solving a system of linear equations.
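
In NumPy, that system of equations is again one call to np.linalg.solve (a minimal sketch using the basis B defined above, stored column-wise):

import numpy as np

# The basis B from above: b1, b2, b3 as the columns of a matrix.
B = np.array([[1.0,  1.0, 0.0],
              [1.0, -1.0, 0.0],
              [0.0,  0.0, 1.0]])

v = np.array([3.0, 1.0, 5.0])     # components of v in the standard basis

coords_B = np.linalg.solve(B, v)  # components (x', y', z') in the basis B
print(coords_B)                   # [2. 1. 5.]

print(B @ coords_B)               # [3. 1. 5.], back in the standard basis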

Many physics problems are much simpler if you change to an appropriate basis.  Indeed, some problems, like finding the energy levels of a quantum mechanical system, are essentially exercises in figuring out an appropriate basis that simplifies the matrix manipulations.



From Linear Operators to Matrices

Linear functions are completely defined by how they act on the basis vectors of their input space.  One needs to compute what the linear transformation does to every input basis vector and then write the answers in terms of the output basis vectors.

More formally, if L is a linear operator from the vector space V to the vector space W, and we define a basis B = (b_1, b_2, ...) for V and a basis B' = (β_1, β_2, ...) for W, then L is completely specified by a set of numbers m_i^j that give

L(b_i) = m_i^1 β_1 + m_i^2 β_2 + ⋯ + m_i^n β_n

Note that this equation is defined for all i from 1 to the dimension of the basis of V, and that the sum over j runs from 1 to n, the dimension of W.  If both bases are standard bases, then the numbers m_i^j give the matrix representation of the operator.  If not, one can still compute the matrix representation of the operator.
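
As a sketch of that computation (assuming, for illustration, that V = W = ℝ³, that L is given by a matrix A in the standard basis, and that we use the basis B from above for both input and output): apply L to each input basis vector and solve for its coordinates in the output basis; column i of the result holds the numbers m_i^j.

import numpy as np

# A linear operator on R^3, specified by its matrix in the standard basis.
A = np.array([[5.0, 2.0, 0.0],
              [3.0, 2.0, 0.0],
              [0.0, 0.0, 1.0]])

# Input and output basis vectors, stored as the columns of B.
B = np.array([[1.0,  1.0, 0.0],
              [1.0, -1.0, 0.0],
              [0.0,  0.0, 1.0]])

# Solve B @ M = A @ B column by column: column i of M gives the
# coordinates of L(b_i) in the output basis, i.e. the numbers m_i^j.
M = np.linalg.solve(B, A @ B)
print(M)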