Scientific Computing Using Python - PHYS:4905
Lecture Notes #19 - Prof. Kaaret
These notes borrow from Linear Algebra
by Cherney, Denton, Thomas, and Waldron.
String Theory
If we tie down both ends of a string, we can excite standing waves
on the string. The ends of the string must be at nodes (zero
amplitude of displacement), since they are tied down. Standing
waves must have wavelength λ = 2L/n, where L is the
length of the string and n = 1, 2, 3, .... The different allowed wavelengths are
“harmonics”.
The waves on the string obey the wave equation
∂²y/∂t² = c² ∂²y/∂x²,
where c is the speed of the wave.
The solutions to the wave equation are functions of the form
y(x, t) = a sin(kx) sin(ωt).
To check, we need to insert this into the wave equation.
Taking derivatives, we find
∂²y/∂t² = −ω² a sin(kx) sin(ωt)
and
∂²y/∂x² = −k² a sin(kx) sin(ωt).
Putting these into the wave equation, we find
−ω² a sin(kx) sin(ωt) = −c² k² a sin(kx) sin(ωt).
The a and the sine functions cancel, so we have a solution
if ω = ck.
For our string theory problem, the boundary condition at x =
0 is automatically satisfied, since sin(0) = 0.
To satisfy the boundary condition at x = L, we
require kL = nπ with n = 1, 2, 3, ..., since sine has
zeros at 0, π, 2π, 3π, ...
The solutions to our string theory problem are then any linear
combination of functions of the form
y_n(x, t) = a_n sin(k_n x) sin(ω_n t),
where k_n = nπ/L and ω_n = c k_n = nπc/L.
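Since this is a Python course, we can let SymPy check the algebra for us. The snippet below is just a verification sketch (the symbol names are mine): it confirms that y_n satisfies the wave equation and vanishes at both ends of the string.

```python
import sympy as sp

# Symbols: position x, time t, string length L, wave speed c, amplitude a, harmonic number n
x, t, L, c, a = sp.symbols('x t L c a', positive=True)
n = sp.symbols('n', integer=True, positive=True)

# Standing-wave trial solution with k_n = n*pi/L and omega_n = c*k_n
k = n * sp.pi / L
omega = c * k
y = a * sp.sin(k * x) * sp.sin(omega * t)

# Wave equation residual: d^2y/dt^2 - c^2 * d^2y/dx^2 should be identically zero
residual = sp.diff(y, t, 2) - c**2 * sp.diff(y, x, 2)
print(sp.simplify(residual))      # 0

# Boundary conditions: the string is tied down at x = 0 and x = L
print(y.subs(x, 0))               # 0
print(y.subs(x, L))               # 0, since sin(n*pi) = 0 for integer n
```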
A Linear Operator for String Theory
Now we'll translate this to linear functions and vector spaces.
We can define a linear function
W = ∂²/∂t² − c² ∂²/∂x²
that acts on the vector space V of functions y(x, t) for which all
partial derivatives of the form ∂^(k+m) y / ∂x^k ∂t^m,
where k and m are any positive integers, exist.
The function W is linear if
W(a y1 + b y2) = a W(y1) + b W(y2)
for all y1, y2 in V and all scalars a and b. Is
that true of W acting on any y that is an element of
V?
Note that vector spaces need to have closure under addition and
scalar multiplication and that a linear operator defined on the
space also needs to have closure - the result of applying the linear
operator must also be in the vector space. Do you see why the
condition on partial derivatives is needed?
Now our wave equation can be written as W y = 0.
It is called a homogeneous equation because one side of the equation
is zero. The solutions of this equation are a "sub-space" of
the vector space V. This particular sub-space, the set
of all elements y in the vector space V for which W y
= 0, is called the "null space" of W. It is also
called the "nullspace" (nulling out the space ;) and the "kernel"
(which unfortunately has nothing to do with Colonel Sanders).
It is sometimes written as ker(W) or Null(W).
The nullspace is also a vector space. Most of the properties
needed to make the nullspace a vector space transfer automatically
from V. The only one that you might worry about is
closure. (People often have a hard time getting closure.)
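The same idea shows up in finite dimensions. As a small aside (the example matrix here is my own, not from the lecture), here is one way to compute the null space of a matrix numerically with NumPy, using the singular value decomposition: the null space is spanned by the right singular vectors whose singular values are (numerically) zero.

```python
import numpy as np

def null_space(A, tol=1e-12):
    """Return a matrix whose columns form an orthonormal basis of the null space of A."""
    U, s, Vh = np.linalg.svd(A)
    rank = int(np.sum(s > tol))
    return Vh[rank:].T   # right singular vectors belonging to (numerically) zero singular values

# Example matrix with a one-dimensional null space spanned by (1, 1, -1)
A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])
N = null_space(A)
print(N)                       # one column, proportional to (1, 1, -1)
print(np.allclose(A @ N, 0))   # True: A sends every null-space vector to zero
```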
Eigenvalues and Eigenvectors
Let's look at just the spatial parts of our functions. We can
introduce a vector space U that is the set of functions f(x)
for which all derivatives exist. (The derivatives are, of
course, with respect to x and are total derivatives since
the functions depend only on x.) We can introduce a
linear operator L on this vector space that is the spatial
part of W,
L = −c² d²/dx².
If we take the spatial part of any of the solutions to our string
theory problem,
f_n(x) = a_n sin(nπx/L),
then we find that
L f_n = c² (nπ/L)² f_n,
or
L f_n = λ_n f_n,
where λ_n = (nπc/L)². The equation just above is called the
eigenvalue-eigenvector equation or just the eigenvector
equation. "Eigen" is a German adjective meaning own, as
in "my own house", and characteristic,
as in "he answered with his
characteristic cynicism". The eigenvector of a linear
operator are the vectors whose direction is unchanged when acted
on by the linear operator, i.e. the result of L acting
on fn
is a scalar multiplied by fn. The
scalar is the eigenvalue of that eigenvector, which in our
equation above is λn. The eigenvectors provide
a means to characterize a linear operator. For this
reason, the textbook says that "This is perhaps one of the most
important equations in all of linear algebra!".
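To see the eigenvalue equation in action, we can apply the spatial operator to sin(nπx/L) with SymPy. This is a quick verification sketch, assuming the form L = −c² d²/dx² reconstructed above; the expected eigenvalue is λ_n = (nπc/L)².

```python
import sympy as sp

x, L, c = sp.symbols('x L c', positive=True)
n = sp.symbols('n', integer=True, positive=True)

f = sp.sin(n * sp.pi * x / L)          # spatial part of the n-th standing wave
Lf = -c**2 * sp.diff(f, x, 2)          # apply the spatial operator L = -c^2 d^2/dx^2
lam = (n * sp.pi * c / L)**2           # expected eigenvalue lambda_n

print(sp.simplify(Lf - lam * f))       # 0, so L f_n = lambda_n f_n
```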
Invariant Directions
The eigenvectors are directions where the output of the linear
operator is parallel to the input. These are called 'invariant
directions', since they are not changed by the linear
operator. Finding the eigenvectors, or the invariant
directions, and then changing to a basis defined by the eigenvectors
greatly simplifies calculations with the linear operator.
Let's try an example. Consider a linear operator L acting
on ℝ², defined by the transformations of the two standard basis
vectors, L(e1) and L(e2). The matrix for L in the standard basis
then has these two output vectors as its columns.
Let's make an inspired guess at an eigenvector and apply L to it.
The output vector is the same as the input vector. This means
that the vector is along an invariant direction and is an
eigenvector. In this case, the output vector has the same
magnitude as the input vector, so the eigenvalue = 1.
Note that any vector in the same direction as our guess
can be written as a scalar, c, multiplied by it. We
then have
L(c v) = c L(v) = c v.
So the new vector is also an
eigenvector of L with the same eigenvalue of 1.
Now let's make another inspired guess. This vector is another
eigenvector of L, but this time the magnitude of the output is twice
the magnitude of the input, so the eigenvalue is 2. Any vector
in the same direction gets stretched by L by a factor of 2.
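We can check inspired guesses like these numerically. The matrix below is a stand-in chosen for illustration (it also has eigenvalues 1 and 2, with invariant directions (3, 5) and (1, 2)); it is not necessarily the matrix used in lecture.

```python
import numpy as np

# Stand-in matrix with eigenvalues 1 and 2 (an illustrative assumption, not the lecture's matrix)
M = np.array([[-4.0, 3.0],
              [-10.0, 7.0]])

v1 = np.array([3.0, 5.0])   # first inspired guess
v2 = np.array([1.0, 2.0])   # second inspired guess

print(M @ v1)               # [3. 5.]  -> unchanged, so eigenvalue 1
print(M @ v2)               # [2. 4.]  -> stretched by 2, so eigenvalue 2

# Any multiple of an eigenvector is an eigenvector with the same eigenvalue
c = 7.0
print(np.allclose(M @ (c * v1), c * v1))   # True
```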
Diagonalization
We can write any vector w in ℝ² as a linear
combination of the two eigenvectors, which we will call v1
(eigenvalue 1) and v2 (eigenvalue 2):
w = c1 v1 + c2 v2.
Applying L, we find
L(w) = c1 L(v1) + c2 L(v2) = c1 v1 + 2 c2 v2.
Transforming to a basis using v1 and v2 as basis
vectors, the vector w would be written as (c1, c2)
and the matrix representation of L would be
the 2×2 diagonal matrix with 1 and 2 on the diagonal and zeros elsewhere. The
matrix is diagonal and the diagonal entries are the
eigenvalues. As you might imagine, this greatly simplifies
calculations involving L.
The process of transforming to a basis consisting of the
eigenvectors is called diagonalization. To do this, we
can't always rely on inspired guesses, but instead need an
algorithm.
Let's try finding the eigenvectors for the linear transformation L
described in the standard basis by a 2×2 matrix M.
We want to find invariant directions. That means that L
will transform the column vector (x, y)
to a multiple of that vector. In matrix notation,
M (x, y) = λ (x, y).
Rewriting the right hand side as a matrix expression,
λ (x, y) = λI (x, y), where I is the identity matrix,
and subtracting the same matrix from both sides, we find
(M − λI) (x, y) = 0.
A system of equations where the column vector part is zero is called
a homogeneous system. Homogeneous systems have non-zero
solutions only if the inverse to the matrix part does not
exist. To see this, think about the situation if that inverse
did exist: then we could multiply both sides from the left by the inverse
and we would find that x = y = 0. The inverse of the matrix
exists if its determinant is not zero, so we need λ to make the
determinant of M − λI equal to zero in order to get the non-zero
solutions that let us diagonalize the linear
operator (diagonalizable is a fun word to say).
In equations, we need
det(M − λI) = 0.
Solving this, we find that L has two eigenvalues, which for our example are
λ = 1 and λ = 2.
The left hand side, det(M − λI), is a polynomial in λ
that is, up to an overall sign, the so-called characteristic polynomial.
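In Python, this step amounts to building the polynomial det(M − λI) and finding its roots. The sketch below uses the same stand-in matrix as above (an illustrative choice, not necessarily the lecture's), for which the polynomial is λ² − 3λ + 2.

```python
import numpy as np

M = np.array([[-4.0, 3.0],
              [-10.0, 7.0]])    # illustrative stand-in matrix

# For a 2x2 matrix, det(M - lambda*I) = lambda^2 - tr(M)*lambda + det(M).
coeffs = [1.0, -np.trace(M), np.linalg.det(M)]
print(np.sort(np.roots(coeffs)))        # [1. 2.]

# For larger matrices, np.poly(M) builds the characteristic polynomial coefficients.
print(np.sort(np.roots(np.poly(M))))    # [1. 2.]
```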
Now we need to find the corresponding eigenvectors. The
eigenvectors must satisfy the equation
M (x, y) = λ (x, y)
with the appropriate value of λ. As we did above, we can
write this as a homogeneous system,
(M − λI) (x, y) = 0.
Plugging in the first eigenvalue, λ = 1, we
get two equations for x and y.
The top row gives us one relation between x and y.
The bottom row gives the same relation. We are free to choose
any length for the vector as long as it points in the invariant
direction, so we can choose x = 1 and get the first
eigenvector.
Doing the same for the second eigenvalue, λ = 2, we
find the second eigenvector.
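Numerically, we can find each eigenvector by computing the null space of M − λI (using the SVD as before) and then rescaling so that x = 1. Again, the matrix is the illustrative stand-in, not necessarily the one from lecture.

```python
import numpy as np

M = np.array([[-4.0, 3.0],
              [-10.0, 7.0]])    # illustrative stand-in matrix

def eigenvector_for(M, lam):
    """Solve the homogeneous system (M - lam*I) v = 0 and scale so the first component is 1."""
    A = M - lam * np.eye(M.shape[0])
    U, s, Vh = np.linalg.svd(A)
    v = Vh[-1]                  # right singular vector for the (numerically) zero singular value
    return v / v[0]             # the "choose x = 1" normalization

for lam in (1.0, 2.0):
    v = eigenvector_for(M, lam)
    print(lam, v, np.allclose(M @ v, lam * v))
# 1.0 [1.         1.66666667] True   (proportional to (3, 5))
# 2.0 [1. 2.] True
```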
Algorithm for diagonalization
Let's review our algorithm for finding eigenvalues and eigenvectors
so that we can put a matrix into diagonal form.
First, we found the characteristic polynomial of the matrix M
representing the linear operator L. In equation form the
characteristic polynomial is det(λI − M).
Second, we found the roots of the characteristic polynomial.
Actually, we found the roots of det(M − λI), which differs from
the characteristic polynomial by at most an overall sign.
The roots are the same, but writing it this way tends to save on
minus signs.
Third, we plugged each eigenvalue into the equation (M − λI) v = 0
to find the associated eigenvector. Note that only the
direction of the eigenvector is important. You are free to
choose the magnitude.
Doesn't this newfound knowledge make you want to go out and
diagonalize some matrices?
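In practice, NumPy bundles all three steps into a single call, np.linalg.eig. Here is a quick comparison, again using the stand-in matrix from above.

```python
import numpy as np

M = np.array([[-4.0, 3.0],
              [-10.0, 7.0]])    # illustrative stand-in matrix

# np.linalg.eig returns the eigenvalues and a matrix whose columns are the eigenvectors.
evals, evecs = np.linalg.eig(M)
print(evals)                    # eigenvalues 1 and 2 (the order is not guaranteed)
for lam, v in zip(evals, evecs.T):
    print(np.allclose(M @ v, lam * v))    # True for every eigenvalue/eigenvector pair
```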
Having issues
One issue that might come up is that the roots of the characteristic
polynomial may include complex values. The Fundamental Theorem
of Algebra states that any polynomial can be factored into a product
of first order polynomials, but only if you allow complex numbers
in the first order polynomials and therefore complex roots. In
physics, we usually only consider operators that have real
eigenvalues, and we'll discuss conditions that ensure this later.
Another issue that might come up is that some of the roots of the
characteristic polynomial may be the same. The number of times
that any given root appears in the collection of eigenvalues is
called its multiplicity. If you have a root with a
multiplicity greater than one, then your equation in step
three for that root will have multiple,
linearly-independent solutions. In fact, for a diagonalizable
operator, the number of linearly independent vectors satisfying the
equation will equal the multiplicity. In that
case, you are free to choose as your eigenvectors any
set of linearly independent vectors satisfying the equation,
with the number of vectors in the set equal
to the multiplicity. The
set of eigenvectors with the same eigenvalue λ_n spans
a sub-space that is called the eigenspace of λ_n.
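Here is a small illustration of multiplicity (the matrix is my own example, not from the lecture): the eigenvalue 4 appears twice, and the dimension of its eigenspace, which is the rank deficiency of A − 4I, equals that multiplicity.

```python
import numpy as np

# Symmetric matrix with eigenvalues 2, 4, and 4 (illustrative example)
A = np.array([[3.0, 1.0, 0.0],
              [1.0, 3.0, 0.0],
              [0.0, 0.0, 4.0]])

evals, evecs = np.linalg.eig(A)
print(np.sort(evals))                        # [2. 4. 4.] -> the root 4 has multiplicity 2

# The eigenspace of lambda = 4 is the null space of A - 4I; its dimension equals the multiplicity.
B = A - 4.0 * np.eye(3)
print(3 - np.linalg.matrix_rank(B))          # 2
```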
Change of basis
Another condition that must hold for a linear operator to be
diagonalizable (still fun to say) is that it must be possible to
write the operator as a matrix in a basis consisting of the
eigenvectors of the operator. As we saw above in our 2×2
matrix example, the matrix for the linear operator will be diagonal
in this basis with the diagonal entries equal to the eigenvalues.
Let's call the set of basis vectors for the standard basis
B = (e1, e2, ..., en). In the
standard basis, we have e1 = (1, 0, ..., 0), e2 = (0, 1, ..., 0), and so on.
Our basis vectors for the new basis, where the linear operator is
diagonal, are V = (v1, v2, ..., vn), where the
vk are the eigenvectors.
If our linear operator is diagonalizable, then we can transform from
the basis B to the basis V by writing each vk as a linear
combination of the ej. In equations,
vk = Σj ej Pjk,
or, collecting the basis vectors into rows,
(v1 v2 ... vn) = (e1 e2 ... en) P.
The Pjk are constants and we can think of them as forming a square
matrix P, as shown above. The matrix P is called a change
of basis matrix. Note that the values Pjk are
numbers, but the ej and the vk are
vectors.
Let's look at the first vector in our new basis, v1. We can
write this vector in terms of the B basis vectors as
v1 = e1 P11 + e2 P21 + ... + en Pn1.
This is the same as the equation on the left with k = 1, or
doing the multiplication in the equation on the right for only the
first column. From this, we can see that the components of the
first column of the matrix P are the same as the components of the
vector v1 in the
basis B. Therefore, the columns of the change of basis
matrix are the components of the new basis vectors in terms of the
old basis vectors.
The change of basis matrix lets us take a vector u written
in components in the basis V, which we will call u_V, and
write it in components in the basis B, which we will call u_B, by doing a
matrix multiplication,
u_B = P u_V.
To see that this is correct, think about the first
eigenvector. In the basis V, it is (1, 0, ..., 0). In the
basis B, it is the first column of P.
Multiplying (1, 0, ..., 0) by P gives you exactly the first column of P.
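A quick numerical check of u_B = P u_V, using as P the eigenvector matrix of the stand-in example (so the specific numbers are illustrative assumptions):

```python
import numpy as np

# Columns of P are the eigenvectors of the stand-in matrix, written in the standard basis B.
P = np.array([[3.0, 1.0],
              [5.0, 2.0]])

# The first eigenvector is (1, 0) in the V basis; in the B basis it is the first column of P.
uV = np.array([1.0, 0.0])
print(P @ uV)                  # [3. 5.]

# Any other vector transforms the same way: u_B = P u_V.
uV = np.array([-2.0, 4.0])
print(P @ uV)                  # [-2. -2.]
```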
If we change from the basis V to the basis B, we
must also be able to change from basis B to basis V.
Let's call the matrix that does that Q, so
u_V = Q u_B.
Clearly, if we change u from the basis B to the basis V and then
back to the basis B, we should get the same vector that we started
with, so
P Q u_B = u_B for every u_B.
This means that the matrix Q must be the inverse of the matrix P,
Q = P⁻¹.
It also means that P must be invertible for us to actually be able to
do the transformation of basis.
So, to transform a vector u from the basis B
to the basis V, we do the matrix multiplication
u_V = P⁻¹ u_B.
To transform the matrix M representing the linear operator L
from the basis B to the basis V, we use the
following equation; since the new matrix is diagonal, we call
it D:
D = P⁻¹ M P.
We conclude that a matrix M is diagonalizable if there
exists an invertible matrix P and a diagonal matrix D
that satisfy the equation above.
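Putting this together for the stand-in example (again, the specific numbers are illustrative assumptions), D = P⁻¹ M P comes out diagonal with the eigenvalues on the diagonal:

```python
import numpy as np

M = np.array([[-4.0, 3.0],
              [-10.0, 7.0]])   # illustrative stand-in matrix
P = np.array([[3.0, 1.0],
              [5.0, 2.0]])     # columns are the eigenvectors of M

D = np.linalg.inv(P) @ M @ P
print(np.round(D, 10))
# [[1. 0.]
#  [0. 2.]]
```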
A worked example
Let's take the linear transformation L from the
Diagonalization section above. In the standard basis, the
linear transformation is described by the matrix M.
We found its two eigenvectors, and we found that the corresponding
eigenvalues are 1 and 2.
We can construct the change of basis matrix P using the
eigenvectors: its columns are the eigenvectors written in the
standard basis.
To change the matrix M into the diagonal matrix D,
we need the inverse of P. Reviewing from lecture #11,
we can either do this directly from the equation for the inverse of
a 2x2 matrix given at the beginning of that lecture, or use the
formula at the end, which gives the inverse in terms of the adjoint
and determinant of the matrix. Note that the latter
generalizes to larger matrices. We find the inverse P⁻¹,
and then compute D = P⁻¹ M P.
This matches our prescription above that the new matrix for the
linear transformation L should be diagonal in this basis
with the diagonal entries equal to the eigenvalues.
Note that changing basis changes the matrix representing a linear
transformation but does not change the linear transformation
itself. The linear transformation maps input vectors to output
vectors with a one-to-one correspondence of input vector to output
vector. This mapping stays the same no matter which basis we
use. Linear transformations are the actual objects of study of
linear algebra, not matrices. Matrices are merely a convenient
way of doing computations.
To test this, let's calculate the action of L on the vector
u in both bases. We choose u to have
both components equal to 1 in the standard basis, u_B = (1, 1). Then
we can compute the output vector M u_B directly in the standard basis.
Transforming u into the eigenvector basis V, we find
u_V = P⁻¹ u_B.
Doing the matrix multiplication in the V basis, we find D u_V, and
transforming back to the standard basis gives
P D u_V = P D P⁻¹ u_B = M u_B,
which is the same output vector.
The linear transformation describes the same mapping from the input
to the output vector, regardless of the basis that we use.
Isn't that nice?
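Here is the whole worked example carried out in NumPy with the stand-in matrix (the specific numbers are illustrative, not necessarily the lecture's): acting with M in the standard basis and acting with D in the eigenvector basis give the same output vector.

```python
import numpy as np

M = np.array([[-4.0, 3.0],
              [-10.0, 7.0]])   # illustrative stand-in matrix
P = np.array([[3.0, 1.0],
              [5.0, 2.0]])     # change of basis matrix: columns are the eigenvectors
Pinv = np.linalg.inv(P)
D = Pinv @ M @ P               # diagonal matrix of eigenvalues

uB = np.array([1.0, 1.0])      # u with both components equal to 1 in the standard basis

out_standard = M @ uB          # route 1: act with M in the standard basis
uV = Pinv @ uB                 # route 2: change to the eigenvector basis...
out_eigen = P @ (D @ uV)       # ...act with D there, then change back

print(out_standard)                          # [-1. -3.]
print(uV)                                    # [ 1. -2.]
print(np.allclose(out_standard, out_eigen))  # True: the same mapping in either basis
```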
To solidify your understanding, try finding the eigenvalues and
eigenvectors of a linear operator L written in the
standard basis as a 2×2 matrix of your choice. Then find
the change of basis matrix and calculate the action of L on
a vector u that you specify in the standard basis.