Scientific Computing Using Python - PHYS:4905 - Fall 2018

Lecture #10 - 9/25/2018 - Prof. Kaaret

These notes borrow from Linear Algebra by Cherney, Denton, Thomas, and Waldron.

Matrices

An r × k matrix is a rectangular array of numbers M = (m_j^i) where i = 1, 2, ..., r and j = 1, 2, ..., k.  Each number m is an element of the matrix.

We can write the matrix in the form

M = \begin{pmatrix} m_1^1 & m_2^1 & \cdots & m_k^1 \\ m_1^2 & m_2^2 & \cdots & m_k^2 \\ \vdots & \vdots & & \vdots \\ m_1^r & m_2^r & \cdots & m_k^r \end{pmatrix}

An r × k matrix has r rows and k columns.

An  r × 1 matrix is a column vector,
 
v = \begin{pmatrix} v^1 \\ v^2 \\ \vdots \\ v^r \end{pmatrix}

A 1 × r  matrix is a row vector,

v^T = \begin{pmatrix} v^1 & v^2 & \cdots & v^r \end{pmatrix}

Some people like to add commas to separate the elements of row vectors.
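
In Python, numpy arrays can hold matrices and vectors.  A minimal sketch (the array contents are made up, just for illustration):

    import numpy as np

    M = np.array([[1, 2, 3],
                  [4, 5, 6]])         # a 2 x 3 matrix: 2 rows, 3 columns
    print(M.shape)                    # (2, 3)

    v = np.array([[1], [2], [3]])     # a 3 x 1 column vector
    w = np.array([[1, 2, 3]])         # a 1 x 3 row vector
    print(v.shape, w.shape)           # (3, 1) (1, 3)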


Operations on Matrices

Scalar multiplication: We can multiply a matrix by a scalar, which means that we multiply each element by the scalar.

rM = (r m_j^i)
Addition: We can add matrices, which means that we add the corresponding elements of the two matrices.  Note that the two matrices have to have the same dimensions for addition to be defined.

M + N = (m_j^i + n_j^i)
Interestingly, this means that the set of r × k matrices forms a vector space that we call 𝕄_k^r.
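
Both operations are element-by-element, so numpy's * and + do exactly this.  A small sketch with made-up matrices:

    import numpy as np

    M = np.array([[1.0, 2.0], [3.0, 4.0]])
    N = np.array([[5.0, 6.0], [7.0, 8.0]])

    print(3 * M)     # scalar multiplication: every element is multiplied by 3
    print(M + N)     # addition: corresponding elements are added; shapes must match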

Matrix multiplication: We can multiply matrices,

(MN)_j^i = \sum_{p=1}^{k} m_p^i n_j^p
Matrix multiplication is not simply multiplying each element of M by the corresponding element of N.  Rather, it is (exactly) like the dot product - you multiply several elements in M by elements in N and then sum the products. 

The simplest example is multiplying a 1 × r matrix (a row vector) times an r × 1 matrix (a column vector). For example,

\begin{pmatrix} 1 & 2 & 3 \end{pmatrix} \begin{pmatrix} 4 \\ 5 \\ 6 \end{pmatrix} = 1×4 + 2×5 + 3×6 = 32

This is the same as the dot product, which is why the function dot works for both the dot product and matrix multiplication in Python.  Note that the row and column vectors need to have the same length.  This is equivalent to the second dimension of the first matrix equaling the first dimension of the second matrix.
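
In numpy, the dot function (or the @ operator) covers this case directly.  A sketch of the example above:

    import numpy as np

    row = np.array([1, 2, 3])
    col = np.array([4, 5, 6])
    print(np.dot(row, col))     # 1*4 + 2*5 + 3*6 = 32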

Graphically, you do matrix multiplication by taking each column of the second matrix, flopping it over into a row vector (taking the transpose), multiplying each pair of elements, and then summing those products.  The entries of the matrix multiplication MN are made from the dot products of the rows of M with the columns of N.

\begin{pmatrix} a & b \\ c & d \end{pmatrix} \begin{pmatrix} e & f \\ g & h \end{pmatrix} = \begin{pmatrix} ae + bg & af + bh \\ ce + dg & cf + dh \end{pmatrix}
Doing this for each of the m columns of the second matrix on each of the r rows gives you r × m numbers that form your r × m  output matrix.  Note that the dot product (multiply and add operation) doesn't make sense unless the number of columns of the first matrix equals the number of rows of the second matrix.  So, the two matrices do not need to have the same dimensions, but the second dimension of the first matrix must equal the first dimension of the second matrix.

What do you get if you multiply a 3×3 matrix times a column vector?

What do you get if you multiply a 3×3 matrix times a row vector?

What do you get if you multiply a 3×3 matrix times a 3×3 matrix?

What do you get if you multiply a 3×1 matrix times a 1×2 matrix?

In general, (r × k) times (k × m) is (r × m).
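
We can check this shape rule numerically.  A sketch using random matrices of the assumed sizes r = 3, k = 4, m = 2:

    import numpy as np

    r, k, m = 3, 4, 2
    M = np.random.rand(r, k)    # an r x k matrix
    N = np.random.rand(k, m)    # a k x m matrix
    P = np.dot(M, N)            # allowed: M has k columns and N has k rows
    print(P.shape)              # (3, 2), i.e. r x m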


Matrix terminology

A matrix is called square if its two dimensions are equal.  Makes sense, since you then have a square pattern of numbers when you write it down.

The transpose T = M^T of an r × k matrix is a k × r matrix T = (m_i^j) where i = 1, 2, ..., r and j = 1, 2, ..., k.  The elements are swapped across the diagonal.  If the matrix M is not square, then the transpose will have a different shape.  For example, the transpose of a column vector is a row vector.

Taking the transpose twice gets you back to the original matrix.

The transpose of product of matrices equals the product of the transposes with the order swapped,

(MN)^T = N^T M^T
If the transpose equals the matrix, M = M^T, then the matrix is symmetric.  Of course, only square matrices can be symmetric (just from their geometry).
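
These transpose rules are easy to verify with numpy's .T attribute.  A sketch with arbitrary matrices:

    import numpy as np

    M = np.array([[1, 2, 3], [4, 5, 6]])        # 2 x 3
    N = np.array([[1, 0], [2, 1], [0, 3]])      # 3 x 2

    print(M.T.shape)                            # (3, 2): the transpose swaps dimensions
    print(np.array_equal(M.T.T, M))             # True: transposing twice gives M back
    print(np.array_equal(np.dot(M, N).T,
                         np.dot(N.T, M.T)))     # True: (MN)^T = N^T M^T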

A square matrix that is zero for all elements off the diagonal is called a diagonal matrix.

A diagonal matrix with all diagonal entries equal to 1 is called the identity matrix.  The identity matrix is special because IM = MI = M.  The identity matrix works just like the multiplicative identity in real numbers (1).  The identity matrix is square and there are an infinite number of identity matrices with different numbers of rows/columns.
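
numpy builds identity matrices with np.eye.  A small sketch checking IM = MI = M for a made-up M:

    import numpy as np

    M = np.array([[1.0, 2.0], [3.0, 4.0]])
    I = np.eye(2)                            # the 2 x 2 identity matrix
    print(np.allclose(np.dot(I, M), M))      # True: IM = M
    print(np.allclose(np.dot(M, I), M))      # True: MI = M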

We can define powers of matrices, e.g. M^2 = MM, but only for square matrices.  Why?

Similar to the fact that x^0 = 1 for all real numbers, we define M^0 = I.  This allows us to evaluate any polynomial on any square matrix.
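
numpy's matrix_power function computes integer powers of a square matrix, which makes polynomial evaluation straightforward.  A sketch (the matrix and the polynomial are arbitrary, not the exercise below):

    import numpy as np
    from numpy.linalg import matrix_power

    M = np.array([[1.0, 2.0], [0.0, 1.0]])
    print(matrix_power(M, 0))                   # the identity, since M^0 = I
    print(matrix_power(M, 2))                   # M^2 = MM, not the element-wise square

    # an arbitrary polynomial in M, e.g. p(M) = M^2 + 3M + 2I
    p = matrix_power(M, 2) + 3 * M + 2 * np.eye(2)
    print(p)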

Exercise: Let f(x) = 3x + 2x^3 and M = \begin{pmatrix} 2 & 1 \\ 2 & 3 \end{pmatrix}.  What is f(M)?

Associativity and non-commutativity

Matrix multiplication is associative.  Specifically, (MN)R = M(NR).

Matrix multiplication is, in general, not commutative.  Specifically, for two generic square matrices, MN ≠ NM.

Note that multiplication by some matrices is commutative, e.g. the identity matrix.  However, you cannot assume that matrix multiplication is commutative for an arbitrary pair of matrices, so you cannot commute two matrices when doing algebra with matrices.
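
A quick numerical check with two made-up matrices shows that MN and NM generally differ:

    import numpy as np

    M = np.array([[1, 2], [3, 4]])
    N = np.array([[0, 1], [1, 0]])

    print(np.dot(M, N))     # [[2 1]
                            #  [4 3]]
    print(np.dot(N, M))     # [[3 4]
                            #  [1 2]]
    print(np.array_equal(np.dot(M, N), np.dot(N, M)))   # False: MN != NM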


Trace

The trace of a square matrix is the sum of its diagonal entries,

tr(M) = \sum_{i=1}^{n} m_i^i
Taking the transpose does not affect the trace, since the transpose doesn't change any of the diagonal elements, tr(M) = tr(M^T).

Even though matrix multiplication does not commute, the trace of a product of matrices does not depend on the order of multiplication, tr(MN) = tr(NM).

The trace operator is a linear operation that transforms matrices to the real numbers.
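
numpy's trace function makes these properties easy to verify.  A sketch reusing two arbitrary matrices:

    import numpy as np

    M = np.array([[1, 2], [3, 4]])
    N = np.array([[0, 1], [1, 0]])

    print(np.trace(M))                          # 1 + 4 = 5
    print(np.trace(M.T) == np.trace(M))         # True: tr(M^T) = tr(M)
    print(np.trace(np.dot(M, N)),
          np.trace(np.dot(N, M)))               # 5 5: equal even though MN != NM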

Block matrices

Sometimes it is efficient to break a matrix into blocks.  For example,



You can then work out matrix multiplication using the smaller matrices in the blocks. 
For example, we can find the square of M using these blocks.  First, we do the matrix math on the blocks in schematic form,


Then, we can calculate each of these expressions (which are polynomials in A, B, C, and D) using the matrices for each block as defined above,



Then, we substitute these results back into the schematic form of our matrix multiplication and get the answer,



This is exactly what we would have gotten if we did the full matrix multiplication.

This example doesn't actually save much work.  Block matrices are much more useful if some of the blocks are zero or an identity matrix.
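
As a sketch of the idea (the blocks A, B, C, D here are made up, not the ones from the example above), numpy can assemble a matrix from blocks with np.block, and we can check that the schematic block formula for M^2 matches the full multiplication:

    import numpy as np

    # made-up blocks, just for illustration
    A = np.array([[1, 2], [3, 4]])    # 2 x 2
    B = np.array([[5], [6]])          # 2 x 1
    C = np.array([[7, 8]])            # 1 x 2
    D = np.array([[9]])               # 1 x 1

    M = np.block([[A, B],
                  [C, D]])            # assemble the full 3 x 3 matrix from the blocks

    # schematic block form of M^2:
    #   M^2 = [[AA + BC,  AB + BD],
    #          [CA + DC,  CB + DD]]
    top = np.hstack([A @ A + B @ C, A @ B + B @ D])
    bot = np.hstack([C @ A + D @ C, C @ B + D @ D])
    print(np.array_equal(np.vstack([top, bot]), M @ M))   # True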

Exercise: The matrix \begin{pmatrix} \cos θ & \sin θ \\ -\sin θ & \cos θ \end{pmatrix} rotates a vector in a two-dimensional space through an angle θ.  We can use a block matrix to do a rotation in three dimensions.  The matrix

M(θ) = \begin{pmatrix} \cos θ & \sin θ & 0 \\ -\sin θ & \cos θ & 0 \\ 0 & 0 & 1 \end{pmatrix}

will rotate a 3-dimensional vector (x, y, z) around the z axis (or in the xy plane) through an angle θ (a short numerical check of this appears after the questions below).

Using block matrices, find the matrix product M(α) M(β).
What does this matrix do when applied to a 3-dimensional vector (x, y, z)?
What matrix would rotate a 3-dimensional vector (x, y, z) around the x axis?
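
A short numerical check of the rotation claim above, assuming an arbitrary angle and vector (this is not a solution to the exercise questions):

    import numpy as np

    def M(theta):
        # rotation about the z axis, in the sign convention used above
        c, s = np.cos(theta), np.sin(theta)
        return np.array([[ c,   s,  0.0],
                         [-s,   c,  0.0],
                         [0.0, 0.0, 1.0]])

    v = np.array([1.0, 0.0, 2.0])       # an arbitrary (x, y, z) vector
    print(np.dot(M(np.pi / 2), v))      # x and y are rotated; z is unchanged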


Inverse matrix

The inverse of a square matrix M is the matrix M^{-1} such that M^{-1} M = I = M M^{-1}.

Not all matrices have inverses.  If the inverse does exist, then the matrix is called invertible or nonsingular.  If a matrix does not have an inverse then it is called singular or non-invertible.

The inverse of the inverse is my friend, or actually, the original matrix, (M^{-1})^{-1} = M.

The inverse of the product is the product of the inverses with the order swapped, (AB)^{-1} = B^{-1} A^{-1}.

The inverse of the transpose is the transpose of the inverse, (M^{-1})^T = (M^T)^{-1}.
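
numpy computes inverses with np.linalg.inv, so these properties are easy to check.  A sketch with matrices chosen to be invertible:

    import numpy as np

    A = np.array([[2.0, 1.0], [1.0, 1.0]])   # invertible (determinant 1)
    B = np.array([[1.0, 2.0], [0.0, 1.0]])   # invertible (determinant 1)

    Ainv = np.linalg.inv(A)
    print(np.allclose(np.dot(Ainv, A), np.eye(2)))        # A^{-1} A = I
    print(np.allclose(np.linalg.inv(Ainv), A))            # (A^{-1})^{-1} = A
    print(np.allclose(np.linalg.inv(np.dot(A, B)),
                      np.dot(np.linalg.inv(B), Ainv)))    # (AB)^{-1} = B^{-1} A^{-1}
    print(np.allclose(np.linalg.inv(A.T), Ainv.T))        # (A^T)^{-1} = (A^{-1})^T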


Solving linear equations with the inverse

The process of Gaussian elimination to find the solution of a system of linear equations is equivalent to finding an inverse matrix.  Let's look at this in matrix form.

We start with a system of linear equations that we can write in matrix form as MX = V, where M is a matrix and X and V are column vectors.  The problem is specified by M and V.  The solution is X.

We write the system of equations in augmented matrix form as (M | V).  We apply a bunch of elementary row operations to M that reduce it to the identity matrix I.  The operations taken together are equivalent to multiplying by M^{-1}.  We apply the same operations to V to get the solution.  This is the same as multiplying V by M^{-1}, so the solution of the system of linear equations is
X = M^{-1} V.

If we want to solve another system of linear equations that has the same M, but a different V that we'll call W, we have already done most of the work.  We just need to find M^{-1} W.
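
In numpy, np.linalg.solve solves MX = V directly and is usually preferred over forming the inverse explicitly, but the picture above can be checked either way.  A sketch with a made-up system:

    import numpy as np

    M = np.array([[2.0, 1.0], [1.0, 3.0]])
    V = np.array([3.0, 5.0])
    W = np.array([1.0, -1.0])

    X = np.linalg.solve(M, V)                 # solve M X = V directly
    print(X)

    Minv = np.linalg.inv(M)                   # or compute M^{-1} once...
    print(np.allclose(np.dot(Minv, V), X))    # ...X = M^{-1} V gives the same solution
    print(np.dot(Minv, W))                    # ...and reuse it for a new right-hand side W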


When does the inverse exist?

A square matrix M is invertible if and only if the homogeneous system of equations Mx = 0 has no non-zero solutions.

If M^{-1} exists, then we can multiply both sides of Mx = 0 by the inverse.  We get

M^{-1} M x = M^{-1} 0 → I x = 0 → x = 0.

So the only solution of Mx = 0 must be x = 0.
If M^{-1} does not exist, then we cannot multiply both sides of Mx = 0 by the inverse.

This condition must be satisfied for a matrix to be invertible, but is it sufficient to ensure the existence of the inverse?