Scientific Computing Using Python - PHYS:4905 - Fall 2018

Lecture Notes #20 - 11/8/2018 - Prof. Kaaret

These notes borrow from Linear Algebra by Cherney, Denton, Thomas, and Waldron.

Standard basis

Our standard notion of the length of a vector is the square root of the dot product of the vector with itself,

\begin{Vmatrix} x \end{Vmatrix} = \sqrt{x•x} = \sqrt{(x_1)^2 + (x_2)^2 + ⋯ + (x_n)^2}
The set of basis vectors for the standard basis E = (e_1, e_2, ...), written in the standard basis, is

e_1 = \begin{pmatrix} 1 \\ 0 \\ 0 \\ ⋮ \end{pmatrix}, \;\;\; e_2 = \begin{pmatrix} 0 \\ 1 \\ 0 \\ ⋮ \end{pmatrix}, \;\;\; e_3 = \begin{pmatrix} 0 \\ 0 \\ 1 \\ ⋮ \end{pmatrix}, \;\;\; ⋯
Each of the basis vectors has unit length, \sqrt{e_i • e_i} = 1.


In addition, each of the basis vectors is orthogonal, or perpendicular, to all of the others,
e_i • e_j = 0 when i ≠ j.

To write this more compactly, we introduce the Kronecker delta, which is a deposit of sediments at the mouth of the Kronecker river and apparently also a sexibeat.

We use a lower case delta to write the Kronecker delta, which is defined as

δ_{ij} = \begin{cases} 1 & i = j \\ 0 & i ≠ j \end{cases}
The relations above for the basis vectors of the standard basis can then be summarized as e_i • e_j = δ_{ij}.

The dot product is a special operator for vectors.  We can form an equivalent calculation using standard matrix multiplication by taking the transpose of the first vector,

e_i • e_j = e_i^T e_j
This is called the inner product.  Sometimes the inner product is written as <e_i, e_j>, and physicists doing quantum mechanics like to use the closely related notation <e_i | e_j>.
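
We can check these relations quickly in NumPy (a minimal sketch; the dimension 3 is just for illustration):

import numpy as np

# Rows of the identity matrix are the standard basis vectors of R^3
e = np.eye(3)

# e_i . e_j gives the Kronecker delta
for i in range(3):
    for j in range(3):
        print(i, j, np.dot(e[i], e[j]))   # 1.0 when i == j, 0.0 otherwise

# The same number as a matrix product: e_0^T e_1 written as row times column
row = e[0].reshape(1, 3)   # e_0^T as a row vector
col = e[1].reshape(3, 1)   # e_1 as a column vector
print(row @ col)           # [[0.]], a 1x1 matrix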

We can also form the outer product by taking the transpose of the second vector.  In this case, the result is a square matrix with dimensions equal to the number of components of the vector,

Π_i = e_i e_i^T
If you do the multiplication, you'll see that Π_i is a square diagonal matrix with a 1 in the ith diagonal position and zeros everywhere else.

We can write a diagonal matrix D with diagonal entries λ_1, …, λ_n as

D = λ_1 Π_1 + ⋯ + λ_n Π_n
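
Here is a quick NumPy illustration of the outer products Π_i and the sum above (the diagonal entries are made-up numbers):

import numpy as np

n = 3
e = np.eye(n)                       # standard basis vectors as rows
lam = np.array([2.0, -1.0, 5.0])    # illustrative diagonal entries

# Pi_i = e_i e_i^T has a single 1 on the diagonal at position i
Pi = [np.outer(e[i], e[i]) for i in range(n)]
print(Pi[1])

# D = lambda_1 Pi_1 + ... + lambda_n Pi_n reproduces the diagonal matrix
D = sum(lam[i] * Pi[i] for i in range(n))
print(np.allclose(D, np.diag(lam)))   # True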

Orthonormal bases

These properties of the basis can also hold for other bases.

Orthogonal bases have basis vectors that are mutually perpendicular, u_i • u_j = 0 for i ≠ j.

Orthonormal bases have the additional property that all basis vectors are unit vectors, u_i • u_j = δ_{ij}.


If we have an orthonormal basis, then it is easy to find the components of any vector v in that basis. 

We can always write v as a linear combination of the basis vectors, v = \underset{i}{∑} c^i \, u_i

Then if we take the dot product of v with the basis vector u_j, we find

v • u_j = \underset{i}{∑} c^i \, u_i • u_j = \underset{i}{∑} c^i \, δ_{ij} = c^j
Thus, the jth component of the vector v in the basis \{ u_1, …, u_n \} is v • u_j and we can write the vector as

v = \underset{i}{∑} (v • u_i) \, u_i
Or, if you prefer, in terms of the inner product,

v = \underset{i}{∑} <v, u_i> \, u_i
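
As a numerical sketch, we can find the components of a vector in an orthonormal basis of R^2 by taking dot products (the basis here is just a rotated standard basis, chosen for illustration):

import numpy as np

theta = np.pi / 6
u1 = np.array([np.cos(theta), np.sin(theta)])     # orthonormal basis vectors
u2 = np.array([-np.sin(theta), np.cos(theta)])

v = np.array([3.0, -2.0])                         # an arbitrary vector

c1, c2 = np.dot(v, u1), np.dot(v, u2)             # components are dot products

# Reconstruct v = sum_i (v . u_i) u_i
print(np.allclose(c1 * u1 + c2 * u2, v))          # True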

Inner product in an orthonormal basis

You are familiar with the dot product from using vectors in physics, and it makes great sense as a way to measure the lengths of vectors and the angles between vectors.  However, in some more general vector spaces, it makes no sense and we must use an appropriate inner product instead.  It turns out that by using an orthonormal basis, we can always relate the inner product to the dot product.

For example, consider the space of first order polynomials p defined on the interval [0, 1].  We define the inner product as

<p, p'> = \int_{0}^{1} p(x) \, p'(x) \, dx
We can use as a basis B the functions 1 and x.  Then a vector \begin{pmatrix} a \\ b \end{pmatrix} in our space describes the first order polynomial p(x) = a + bx.  This basis is neither ortho nor normal (for example, <1, x> = \int_{0}^{1} x \, dx = \frac{1}{2} ≠ 0).

Instead, let's use a basis O consisting of the functions 1 and 2\sqrt{3} \, (x - \frac{1}{2}).

The vectors are orthogonal because

\int_{0}^{1} 1 • 2\sqrt{3} \left(x - \frac{1}{2}\right) dx = 2\sqrt{3} \int_{0}^{1} \left(x - \frac{1}{2}\right) dx = 2\sqrt{3} \, \left[\frac{x^2}{2} - \frac{x}{2} \right]_{x=0}^{x=1} = 0

The vectors also have unit length

\int_{0}^{1} 1•1 \, dx = 1
and

\int_{0}^{1} 2\sqrt{3}(x - 0.5) • 2\sqrt{3}(x - 0.5) \, dx = 4×3×\int_{0}^{1} (x - 0.5)^2 dx = 12 × \frac{1}{12} = 1
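
We can confirm these integrals numerically with scipy.integrate.quad (a minimal sketch):

import numpy as np
from scipy.integrate import quad

u1 = lambda x: 1.0
u2 = lambda x: 2 * np.sqrt(3) * (x - 0.5)

# Inner product <f, g> = integral of f(x) g(x) over [0, 1]
inner = lambda f, g: quad(lambda x: f(x) * g(x), 0, 1)[0]

print(inner(u1, u2))   # ~0, the basis vectors are orthogonal
print(inner(u1, u1))   # ~1, unit length
print(inner(u2, u2))   # ~1, unit length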

An arbitrary vector v = a + bx can be written in the orthonormal basis O as v = \begin{pmatrix} a + \frac{b}{2} \\ \frac{b}{2\sqrt{3}} \end{pmatrix}

To see this, note that the coefficients are equal to the inner product of the vector with the basis vector, <v, u_i>.  We can evaluate the first coefficient as

c^1 = <v, u_1> = \int_{0}^{1} (a+bx)•1 \, dx = \int_{0}^{1} a \, dx + \int_{0}^{1} bx \, dx = a + \frac{b}{2}
We'll leave evaluating the second coefficient as an exercise for the reader.

In an orthonormal basis, the inner product of v = a + bx with v' = a' + b'x is equal to the dot product of the two vectors, which is

\begin{pmatrix} a + \frac{b}{2} \\ \frac{b}{2\sqrt{3}} \end{pmatrix} • \begin{pmatrix} a' + \frac{b'}{2} \\ \frac{b'}{2\sqrt{3}} \end{pmatrix} = \left( a + \frac{b}{2} \right) \left( a' + \frac{b'}{2} \right) + \frac{bb'}{12} = aa' + \frac{1}{2}(ab' + a'b) + \frac{1}{3} bb'
We can check by doing the inner product on the two vectors using the definition in terms of the integral,

\int_{0}^{1} (a+bx)(a'+b'x) \, dx = aa' + \frac{1}{2}(ab' + a'b) + \frac{1}{3} bb'
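
We can verify this equality symbolically with SymPy (a sketch; the primed coefficients are named ap and bp here):

import sympy as sp

x, a, b, ap, bp = sp.symbols('x a b ap bp')

# Inner product defined as the integral over [0, 1]
lhs = sp.integrate((a + b*x) * (ap + bp*x), (x, 0, 1))

# Dot product of the coefficient vectors in the orthonormal basis
rhs = (a + b/2) * (ap + bp/2) + b*bp / 12

print(sp.simplify(lhs - rhs))   # 0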

Changing between orthonormal bases

The change of basis matrix from an orthonormal basis T = \{ u_1, …, u_n \} to another orthonormal basis R = \{ w_1, …, w_n \} is

P = (p_i^j) = (u_j • w_i)
The columns of P form an orthonormal set of vectors, since they are the vectors of the orthonormal basis R written in the orthonormal basis T.

It can be shown that P is an orthogonal matrix, meaning that

P^{-1} = P^T
We'll leave the proof to the linear algebra textbook.

If we start with a diagonal matrix D and change to a new basis using an orthogonal matrix P, then the matrix M in the new basis is

M = PDP^{-1} = PDP^T
Interestingly, the matrix M must be symmetric, M^T = M.  Let's try to check this.

M^T = (PDP^T)^T = (P^T)^T D^T P^T
since the transpose of a product of matrices equals the product of the transposes with the order swapped (from lecture #10),

= PDP^T
since the transpose of the transpose is the original matrix and the transpose of a diagonal matrix is itself,

= PDP^{-1} = M
since P is orthogonal and using our equation for the transformation of matrices from lecture #19.
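
We can check both claims numerically, building an orthogonal matrix from the QR factorization of a random matrix (a minimal sketch; the size 4 is arbitrary):

import numpy as np

rng = np.random.default_rng(0)

# QR factorization of a random matrix gives an orthogonal Q; use it as P
P, _ = np.linalg.qr(rng.standard_normal((4, 4)))
print(np.allclose(P.T @ P, np.eye(4)))   # True: P^{-1} = P^T

D = np.diag([1.0, 2.0, 3.0, 4.0])
M = P @ D @ P.T                          # the diagonal matrix in the new basis

print(np.allclose(M, M.T))               # True: M is symmetric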

This has interesting implications.

Diagonalizing Symmetric Matrices

A matrix M is symmetric if MT = M.  Symmetric matrices arise quite often in real world applications.  For example, if you create a table of distances between cities, it will be symmetric since the distance from city A to city B is the same as the distance from city B to city A.  Or, if you have a collection of objects and a matrix representing the magnitudes of the forces between them, it will be symmetric because, by Newton's third law, the force exerted by object A on object B is equal in magnitude (and opposite in direction) to the force exerted by object B on object A.

Symmetric matrices always have real eigenvalues.  Let's look at a general 2×2 symmetric matrix with real elements.  To find the eigenvalues, we find the roots of the characteristic polynomial,

P_λ \begin{pmatrix} a & b \\ b & d \end{pmatrix} = \det \begin{pmatrix} λ - a & -b \\ -b & λ - d \end{pmatrix} = (λ-a)(λ-d) - b^2 = λ^2 - (a+d)λ - b^2 + ad = 0
We find the roots of the quadratic polynomial using the quadratic formula,

λ = \frac{a+d}{2} ± \sqrt{b^2 + \left(\frac{a-d}{2}\right)^2}
The quantity under the square root, b^2 + \left(\frac{a-d}{2}\right)^2, has two terms that are both squares.  Thus, it cannot be negative.  Therefore, the square root will always be real and the eigenvalues must be real.  (Ya gotta be real, man!)
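
A quick numerical check in NumPy (the matrix entries are made up): the quadratic-formula roots match the eigenvalues returned by np.linalg.eigvalsh, and both are real.

import numpy as np

a, b, d = 2.0, -3.0, 1.0
M = np.array([[a, b],
              [b, d]])

# Roots of the characteristic polynomial from the formula above
root = np.sqrt(b**2 + ((a - d) / 2)**2)
lam = np.array([(a + d) / 2 - root, (a + d) / 2 + root])

print(lam)                      # two real eigenvalues
print(np.linalg.eigvalsh(M))    # same values, in ascending order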


But, it gets even better.  Let's suppose that our symmetric matrix M has two distinct eigenvalues λ ≠ μ and eigenvectors x and y, so

Mx = λx, \;\;\;\; My = μy
Let's see if the eigenvectors are orthogonal.  The dot product is symmetric in its operands,

x^T y = x•y = y•x = y^T x
Let's calculate

x^T My = x^T μy = μ \, x^T y
we used the eigenvalue-eigenvector equation for y, and then moved μ out of the inner product since it is a scalar.

Now let's flip things around (transpose them),

x^T My = (x^T My)^T = y^T M^T x = y^T Mx
We can do the first step because the inner product gives a scalar and the transpose of a scalar is always equal to itself.  The second step is our usual rule that the transpose of a product of matrices equals the product of the transposes with the order swapped, along with the fact that M is symmetric.  Now, we can use the eigenvalue-eigenvector equation for x,

= y^T λx = λ \, y^T x = λ \, x^T y
In the first step, we move the scalar λ out front.  In the second step, we use the symmetry of the dot product, y^T x = x^T y.

Then, we can write zero in a funny way,

0 = x^T M y - x^T M y = μ \, x^T y - λ \, x^T y = (μ-λ) \, x^T y = (μ-λ) \, x•y
We assumed that λ ≠ μ, so we must have x•y = 0, thus our eigenvectors are orthogonal. 

In general, the eigenvectors of a symmetric matrix with distinct eigenvalues are orthogonal.  Since we can also change the length of an eigenvector to be whatever we want, this means that we can always form an orthonormal basis using the eigenvectors of a symmetric matrix.
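
As a final numerical check (a sketch using a random symmetrized matrix), np.linalg.eigh returns the eigenvectors of a symmetric matrix as the columns of an orthonormal matrix:

import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((5, 5))
M = (A + A.T) / 2                  # symmetrize a random matrix

vals, vecs = np.linalg.eigh(M)     # real eigenvalues; eigenvectors as columns

# The eigenvectors form an orthonormal basis: V^T V = I
print(np.allclose(vecs.T @ vecs, np.eye(5)))   # True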