Scientific Computing Using Python - PHYS:4905 - Fall 2018
Lecture #10 - 9/25/2018 - Prof. Kaaret
These notes borrow from Linear Algebra
by Cherney, Denton, Thomas, and Waldron.
Matrices
An r × k matrix M is a rectangular array of numbers $m_{ij}$, where i = 1,
2, ..., r and j = 1, 2, ..., k. Each number $m_{ij}$ is an element of the
matrix. We can write the matrix in the form

$$M = \begin{pmatrix} m_{11} & m_{12} & \cdots & m_{1k} \\ m_{21} & m_{22} & \cdots & m_{2k} \\ \vdots & \vdots & \ddots & \vdots \\ m_{r1} & m_{r2} & \cdots & m_{rk} \end{pmatrix}$$

An r × k matrix has r rows and k columns.
An r × 1 matrix is a column vector,

$$v = \begin{pmatrix} v_1 \\ v_2 \\ \vdots \\ v_r \end{pmatrix}$$

A 1 × r matrix is a row vector,

$$w = \begin{pmatrix} w_1 & w_2 & \cdots & w_r \end{pmatrix}$$

Some people like to add commas to separate the elements of row vectors.
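In NumPy, matrices and vectors are two-dimensional arrays. As a quick illustration (the values here are our own example, not from the notes):

    import numpy as np

    M = np.array([[1, 2, 3],
                  [4, 5, 6]])          # a 2 x 3 matrix
    col = np.array([[1], [2], [3]])    # a 3 x 1 column vector
    row = np.array([[1, 2, 3]])        # a 1 x 3 row vector

    print(M.shape, col.shape, row.shape)   # (2, 3) (3, 1) (1, 3)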
Operations on Matrices
Scalar multiplication: We can multiply a matrix by a scalar,
which means that we multiply each element by the scalar.
Addition: We can add matrices, which means that we add the
corresponding elements of the two matrices. Note that the two
matrices have to have the same dimensions for addition to be defined.
Interestingly, this means that the set of r × k
matrices forms a vector space that we call $\mathbb{R}^{r \times k}$.
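Both operations are elementwise in NumPy as well (illustrative values of our own):

    import numpy as np

    A = np.array([[1, 2], [3, 4]])
    B = np.array([[10, 20], [30, 40]])

    print(3 * A)     # scalar multiplication: each element times 3
    print(A + B)     # addition: elementwise sum; shapes must match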
Matrix multiplication: We can multiply two matrices M and N to form
the product MN. Matrix multiplication is not simply multiplying each
element of M by the corresponding element of N. Rather, it is
(exactly) like the dot product: you multiply several elements of M by
elements of N and then sum the products.
The simplest example is multiplying a 1 × r matrix (a row vector)
times an r × 1 matrix (a column vector). For example,

$$\begin{pmatrix} u_1 & u_2 & \cdots & u_r \end{pmatrix} \begin{pmatrix} v_1 \\ v_2 \\ \vdots \\ v_r \end{pmatrix} = u_1 v_1 + u_2 v_2 + \cdots + u_r v_r$$

This is the same as the dot product, which is why the function dot
works for both the dot product and matrix multiplication in
Python. Note that the row and column vectors need to have the
same length. This is equivalent to the second dimension of the
first matrix equaling the first dimension of the second matrix.
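A quick check in NumPy (our own illustrative vectors):

    import numpy as np

    u = np.array([1, 2, 3])
    v = np.array([4, 5, 6])

    print(np.dot(u, v))   # 1*4 + 2*5 + 3*6 = 32, the dot product
    print(u @ v)          # the @ operator does the same matrix multiplication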
Graphically, you do matrix multiplication by taking each column of
the second matrix, flopping it over into a row vector (taking the
transpose), multiplying each pair of elements, and then summing
those products. The entries of the product MN are dot products:
the (i, j) entry of MN is the dot product of row i of M with column j
of N.
Doing this for each of the m columns of the second matrix on
each of the r rows gives you r × m numbers
that form your r × m output matrix. Note
that the dot product (multiply and add operation) doesn't make sense
unless the number of columns of the first matrix equals the number
of rows of the second matrix. So, the two matrices do not need
to have the same dimensions, but the second dimension of the first
matrix must equal the first dimension of the second matrix.
What do you get if you multiply a 3×3 matrix times a column vector?
What do you get if you multiply a 3×3 matrix times a row vector?
What do you get if you multiply a 3×3 matrix times a 3×3 matrix?
What do you get if you multiply a 3×1 matrix times a 1×2 matrix?
In general, (r × k) times (k
× m) is (r × m).
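You can verify the shape rule in NumPy (shapes chosen as our own illustration; this also checks the questions above):

    import numpy as np

    M = np.ones((3, 3))
    col = np.ones((3, 1))
    print((M @ col).shape)    # (3, 1): a column vector

    A = np.ones((3, 1))
    B = np.ones((1, 2))
    print((A @ B).shape)      # (3, 2): (3 x 1) times (1 x 2)

    row = np.ones((1, 3))
    # M @ row raises ValueError: (3 x 3) times (1 x 3) is undefined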
Matrix terminology
A matrix is called square if its two dimensions are
equal. Makes sense, since you then have a square pattern of
numbers when you write it down.
The transpose of an r × k matrix M with elements $m_{ij}$ is the
k × r matrix $M^T$ with elements

$$(M^T)_{ji} = m_{ij},$$

where i = 1, 2, ..., r and j = 1, 2, ..., k. The
elements are swapped across the diagonal. If the matrix M is
not square, then the transpose will have a different shape.
For example, the transpose of a column vector is a row vector.
Taking the transpose twice gets you back to the original matrix,
$(M^T)^T = M$. The transpose of a product of matrices equals the
product of the transposes with the order swapped,

$$(MN)^T = N^T M^T.$$
If the transpose equals the matrix, $M^T = M$, then
the matrix is symmetric. Of course, only square matrices can
be symmetric (just from their geometry).
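In NumPy, .T gives the transpose (our own illustrative matrices):

    import numpy as np

    M = np.array([[1, 2, 3],
                  [4, 5, 6]])
    print(M.T.shape)                 # (3, 2): transpose of a 2 x 3 matrix
    print(np.array_equal(M.T.T, M))  # True: transposing twice gives M back

    S = np.array([[1, 2], [2, 3]])
    print(np.array_equal(S, S.T))    # True: S is symmetric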
A square matrix that is zero for all elements off the diagonal is
called a diagonal matrix.
A diagonal matrix with all diagonal entries equal to 1 is called the
identity matrix. The identity matrix is special
because IM = MI = M. It works just like the
multiplicative identity of the real numbers, 1. The identity
matrix is square, and there are infinitely many identity
matrices, one for each number of rows/columns.
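NumPy builds these directly (sizes chosen as our own illustration):

    import numpy as np

    D = np.diag([1.0, 2.0, 3.0])   # a 3 x 3 diagonal matrix
    I = np.eye(3)                  # the 3 x 3 identity matrix

    M = np.arange(9.0).reshape(3, 3)
    print(np.allclose(I @ M, M) and np.allclose(M @ I, M))   # True: IM = MI = M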
We can define powers of matrices, e.g. $M^2 = MM$ and $M^3 = MMM$, but
only for square matrices. Why?
Similar to the fact that $x^0 = 1$
for all real numbers, we define
$M^0 = I$.
This allows us to evaluate any polynomial on any square matrix.
Exercise: Let M be a given square matrix and let f(x) be a given
polynomial. What is f(M)?
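A minimal sketch of evaluating a polynomial on a matrix (the matrix M and the polynomial f below are our own illustrative choices, not the ones from the exercise):

    import numpy as np

    M = np.array([[1.0, 1.0],
                  [0.0, 1.0]])
    I = np.eye(2)

    # f(x) = x**2 + 2*x + 1, so f(M) = M @ M + 2 M + I
    f_of_M = M @ M + 2 * M + I
    print(f_of_M)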
Associativity and non-commutativity
Matrix multiplication is associative. Specifically, (MN)R
= M(NR).
Matrix multiplication is, in general, not commutative.
Specifically, for two generic square matrices, $MN \neq NM$.
Note that multiplication by some matrices is commutative, e.g. the
identity matrix. However, you cannot assume that matrix
multiplication is commutative for an arbitrary pair of matrices, so
you cannot commute two matrices when doing algebra with matrices.
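A quick NumPy check with two illustrative matrices of our own:

    import numpy as np

    M = np.array([[1, 2], [3, 4]])
    N = np.array([[0, 1], [1, 0]])

    print(M @ N)   # [[2 1], [4 3]]
    print(N @ M)   # [[3 4], [1 2]]  -- not equal, so MN != NM here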
Trace
The trace of a square matrix is the sum of its diagonal entries,

$$\mathrm{tr}(M) = \sum_{i} m_{ii}.$$
Taking the transpose does not affect the trace, since the transpose
doesn't change any of the diagonal elements, tr(M) = tr(M^T).
Even though matrix multiplication does not commute, the trace of a
product of matrices does not depend on the order of multiplication,
tr(MN) = tr(NM).
The trace is a linear operation that maps square matrices to
the real numbers.
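In NumPy (our own illustrative matrices):

    import numpy as np

    M = np.array([[1, 2], [3, 4]])
    N = np.array([[5, 6], [7, 8]])

    print(np.trace(M))                        # 1 + 4 = 5
    print(np.trace(M @ N), np.trace(N @ M))   # equal, even though MN != NM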
Block matrices
Sometimes it is efficient to break a matrix into blocks. For
example, we can partition a matrix as

$$M = \begin{pmatrix} A & B \\ C & D \end{pmatrix},$$

where the blocks A, B, C, and D are themselves (smaller) matrices.
You can then work out matrix multiplication using the smaller
matrices in the blocks.
For example, we can find the square of M using these
blocks. First, we do the matrix math on the blocks in
schematic form,

$$M^2 = \begin{pmatrix} A & B \\ C & D \end{pmatrix} \begin{pmatrix} A & B \\ C & D \end{pmatrix} = \begin{pmatrix} A^2 + BC & AB + BD \\ CA + DC & CB + D^2 \end{pmatrix}.$$

Then, we can calculate each of these expressions (which are
polynomials in A, B, C, and D) using
the matrices for each block as defined above.
Then, we substitute these results back into the schematic form of
our matrix multiplication and get the answer.
This is exactly what we would have gotten if we did the full matrix
multiplication.
This example doesn't actually save much work. Block matrices
are much more useful if some of the blocks are zero or an identity
matrix.
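A sketch in NumPy, with blocks of our own choosing (np.block assembles a matrix from blocks):

    import numpy as np

    A = np.array([[1, 0], [0, 1]])
    B = np.array([[2, 0], [0, 2]])
    C = np.zeros((2, 2), dtype=int)
    D = np.array([[3, 0], [0, 3]])

    M = np.block([[A, B], [C, D]])   # assemble the 4 x 4 matrix from blocks

    # square of M computed block-wise, using the schematic form above
    M2_blocks = np.block([[A @ A + B @ C, A @ B + B @ D],
                          [C @ A + D @ C, C @ B + D @ D]])

    print(np.array_equal(M @ M, M2_blocks))   # True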
Exercise: The matrix

$$R(\theta) = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}$$

rotates a vector in a two-dimensional space through an angle
$\theta$. We can
use a block matrix to do a rotation in three dimensions. The
matrix

$$R_z(\theta) = \begin{pmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{pmatrix}$$

will rotate a 3-dimensional vector (x, y, z)
around the z axis (or in the xy plane) through an
angle $\theta$.
Using block matrices, find the matrix product of two such rotations,
$R_z(\theta)\,R_z(\phi)$.
What does this matrix do when applied to a 3-dimensional vector (x,
y, z)?
What matrix would rotate a 3-dimensional vector (x, y,
z) around the x axis?
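A sketch in NumPy of building $R_z$ from blocks and applying it (the angle and vector are our own illustrative choices):

    import numpy as np

    def rot_z(theta):
        # 3-D rotation about the z axis through angle theta (radians)
        c, s = np.cos(theta), np.sin(theta)
        R2 = np.array([[c, -s], [s, c]])          # 2-D rotation block
        return np.block([[R2, np.zeros((2, 1))],
                         [np.zeros((1, 2)), np.ones((1, 1))]])

    theta = np.pi / 2
    v = np.array([1.0, 0.0, 0.0])
    print(rot_z(theta) @ v)   # approximately (0, 1, 0): the x axis rotates to the y axis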
Inverse matrix
The inverse of a square matrix M is the matrix $M^{-1}$ such that

$$M^{-1} M = M M^{-1} = I.$$
Not all matrices have inverses. If the inverse does exist,
then the matrix is called invertible or nonsingular.
If a matrix does not have an inverse then it is called singular
or non-invertible.
The inverse of the inverse is my friend, or actually, the original
matrix,

$$(M^{-1})^{-1} = M.$$

The inverse of a product is the product of the inverses with the
order swapped,

$$(MN)^{-1} = N^{-1} M^{-1}.$$

The inverse of the transpose is the transpose of the inverse,

$$(M^T)^{-1} = (M^{-1})^T.$$
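Checking these with NumPy (our own illustrative invertible matrices):

    import numpy as np

    M = np.array([[1.0, 2.0], [3.0, 4.0]])
    N = np.array([[0.0, 1.0], [1.0, 1.0]])

    Minv = np.linalg.inv(M)
    print(np.allclose(Minv @ M, np.eye(2)))                   # M^-1 M = I
    print(np.allclose(np.linalg.inv(M @ N),
                      np.linalg.inv(N) @ np.linalg.inv(M)))   # (MN)^-1 = N^-1 M^-1
    print(np.allclose(np.linalg.inv(M.T), Minv.T))            # (M^T)^-1 = (M^-1)^T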
Solving linear equations with the inverse
The process of Gaussian elimination to find the solution of a system
of linear equations is equivalent to finding an inverse
matrix. Let's look at this in matrix form.
We start with a system of linear equations that we can write in
matrix form as MX = V, where M is a matrix
and X and V are column vectors. The problem is
specified by M and V. The solution is X.
We write the system of equations in augmented matrix form as (M
| V). We apply a bunch of elementary row operations to
M that reduce it to the identity matrix I. The
operations taken together are equivalent to multiplying by
$M^{-1}$. We apply the same operations to V to get
the solution. This is the same as multiplying V by
$M^{-1}$, so the solution of the system of linear equations is
$X = M^{-1} V$.
If we want to solve another system of linear equations that has the
same M, but a different V that we'll call W,
we have already done most of the work. We just need to
find $M^{-1} W$.
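In NumPy (an illustrative system of our own; np.linalg.solve is generally preferred over forming the inverse explicitly):

    import numpy as np

    M = np.array([[2.0, 1.0], [1.0, 3.0]])
    V = np.array([3.0, 5.0])

    X = np.linalg.solve(M, V)      # solve M X = V
    print(np.allclose(M @ X, V))   # True

    # with the inverse in hand, another right-hand side W is cheap
    Minv = np.linalg.inv(M)
    W = np.array([1.0, 2.0])
    print(Minv @ W)                # solution of M X = W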
When does the inverse exist?
A square matrix M is invertible if and only if the homogeneous system
of equations Mx = 0 has no non-zero solutions.
If $M^{-1}$ exists, then we can multiply both sides
of Mx = 0 by the inverse. We get

$$M^{-1} M x = M^{-1} 0, \quad \text{i.e.,} \quad x = 0.$$

So the only solution of Mx = 0 must be x = 0.
If M -1 does not exist, then we cannot multiply
both sides of Mx = 0 by the inverse.
This condition must be satisfied for a matrix to be invertible, but
is it sufficient to ensure the existence of the inverse?
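A NumPy illustration with a singular matrix (our own example):

    import numpy as np

    M = np.array([[1.0, 2.0], [2.0, 4.0]])   # second row = 2 * first row
    print(np.linalg.det(M))                   # 0.0 (up to round-off): M is singular

    x = np.array([2.0, -1.0])
    print(M @ x)                              # [0. 0.]: a non-zero solution of Mx = 0

    try:
        np.linalg.inv(M)
    except np.linalg.LinAlgError as err:
        print("inverse does not exist:", err)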