Scientific Computing Using Python - PHYS:4905
Lecture #05 - Prof. Kaaret
Matrices and Gaussian elimination
These notes borrow from Linear Algebra
by Cherney, Denton, Thomas, and Waldron and from the Wikipedia pages on Augmented
matrix and Gaussian elimination. You need to use the Firefox
browser to view the web page with the matrices rendered properly.
Matrices
An r × k matrix is a rectangular
array of numbers m_ij,
where i = 1,
2, ..., r and j = 1, 2, ..., k. Each
number m_ij is an element of the matrix.
We can write the matrix in the form
\[
M = \begin{pmatrix} m_{11} & m_{12} & \cdots & m_{1k} \\ m_{21} & m_{22} & \cdots & m_{2k} \\ \vdots & \vdots & \ddots & \vdots \\ m_{r1} & m_{r2} & \cdots & m_{rk} \end{pmatrix}
\]
An r × k matrix has r rows and k
columns. A matrix is called square if its two
dimensions are equal. Makes sense, since you then have a
square pattern of numbers when you write it down.
An r × 1 matrix is a column vector,
\[
v = \begin{pmatrix} v_1 \\ v_2 \\ \vdots \\ v_r \end{pmatrix}.
\]
A 1 × r matrix is a row vector,
\[
u = \begin{pmatrix} u_1 & u_2 & \cdots & u_r \end{pmatrix}.
\]
Some people like to add commas to separate the elements of row
vectors, writing (u_1, u_2, ..., u_r).
Vectors written in this form are just like vectors written in terms
of x, y, z components as you did in Physics I. You write a
vector specifying the position of an object in 3D space by writing
the components as a column vector or as the sum of the components
multiplied by unit vectors along each axis,
\[
\vec{r} = \begin{pmatrix} x \\ y \\ z \end{pmatrix} = x\hat{x} + y\hat{y} + z\hat{z}.
\]
The set of unit vectors \{\hat{x}, \hat{y}, \hat{z}\}
is called a basis and we'll talk a lot about bases later.
Operations on Matrices
Scalar multiplication: We can multiply a matrix by a scalar,
which means that we multiply each element by the scalar.
Addition: We can add matrices, which means that we add the
corresponding elements of the two matrices. Note that the two matrices must have
the same dimensions for addition to be defined.
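Here's how these two operations look in NumPy, as a preview of what we'll do in Python later (the numbers are just made up):

```python
import numpy as np

M = np.array([[1, 2],
              [3, 4]])
N = np.array([[10, 20],
              [30, 40]])

print(3 * M)   # scalar multiplication: every element of M is multiplied by 3
print(M + N)   # addition: corresponding elements are added; shapes must match
```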
Matrix multiplication: We can multiply two matrices M and N to form the product MN.
Matrix multiplication is not simply multiplying each element of M by
the corresponding element of N. Rather, it is (exactly) like
the dot product - you multiply several elements in M by elements in
N and then sum the products.
The simplest example is multiplying a 1 × r
matrix (a row vector) times an r × 1 matrix (a column
vector). For example,
\[
\begin{pmatrix} a & b & c \end{pmatrix} \begin{pmatrix} d \\ e \\ f \end{pmatrix} = ad + be + cf.
\]
This is the same as the dot product - you multiply each pair of
corresponding components and then sum the products. Note that
the row and column vectors need to have the same length. This
is equivalent to the second dimension of the first matrix equaling
the first dimension of the second matrix.
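In NumPy, this row-times-column product is just a dot product. A quick sketch:

```python
import numpy as np

row = np.array([[1, 2, 3]])      # 1 x 3 row vector
col = np.array([[4], [5], [6]])  # 3 x 1 column vector

print(row @ col)                     # [[32]], i.e. 1*4 + 2*5 + 3*6
print(np.dot([1, 2, 3], [4, 5, 6]))  # the ordinary dot product, also 32
```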
Let's look at multiplying a 2×2 matrix times a 2-component column
vector. To do this graphically, we take the column vector and
rotate it by 90° counterclockwise (which basically makes it into a
row vector). To get the top line of the resultant column
vector, we line up the rotated vector with the top line of the
matrix and take the product of each component of the vector with the
corresponding component in the matrix. In this case a×e
and b×g. Finally, we sum the products. To
get the bottom line of the resultant column vector, we line up the
rotated vector with the bottom line of the matrix and follow the
same procedure. In the end, we find
\[
\begin{pmatrix} a & b \\ c & d \end{pmatrix} \begin{pmatrix} e \\ g \end{pmatrix} = \begin{pmatrix} ae + bg \\ ce + dg \end{pmatrix}.
\]
Now let's try multiplying a 2×2 matrix by another 2×2
matrix. You can think of the second 2×2 matrix as a pair of
column vectors and use the same procedure as above. Try
it. The answer is
\[
\begin{pmatrix} a & b \\ c & d \end{pmatrix} \begin{pmatrix} e & f \\ g & h \end{pmatrix} = \begin{pmatrix} ae + bg & af + bh \\ ce + dg & cf + dh \end{pmatrix}.
\]
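You can check this result numerically in NumPy, with arbitrary numbers standing in for a through h:

```python
import numpy as np

M = np.array([[1, 2],    # stands in for [[a, b],
              [3, 4]])   #               [c, d]]
N = np.array([[5, 6],    # stands in for [[e, f],
              [7, 8]])   #               [g, h]]

# [[1*5 + 2*7, 1*6 + 2*8],
#  [3*5 + 4*7, 3*6 + 4*8]] = [[19, 22], [43, 50]]
print(M @ N)
```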
In general, doing this for each of the m columns of the
second matrix with each of the r rows of the first matrix
gives you r × m numbers that form your r × m
output matrix. Note that the dot product (multiply and add
operation) doesn't make sense unless the number of columns of the
first matrix equals the number of rows of the second matrix.
So, to multiply two matrices, they do not need to have the same
dimensions, but the second dimension of the first matrix must equal
the first dimension of the second matrix.
What do you get if you multiply a 3×3 matrix times a column vector?
What do you get if you multiply a 3×3 matrix times a row vector?
What do you get if you multiply a 3×3 matrix times a 3×3 matrix?
What do you get if you multiply a 3×1 matrix times a 1×2 matrix?
In general, multiplying a matrix of the form (r ×
k) times a matrix of the form (k × m) makes a
matrix of the form (r × m). For matrix
multiplication to be defined, the number of columns of the first
matrix must equal the number of rows of the second matrix.
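NumPy enforces exactly this rule. A quick sketch showing the shapes:

```python
import numpy as np

A = np.ones((3, 2))   # a 3 x 2 matrix
B = np.ones((2, 4))   # a 2 x 4 matrix

print((A @ B).shape)  # (3, 4): an (r x k)(k x m) product is r x m

try:
    B @ A             # inner dimensions 4 and 3 don't match
except ValueError as err:
    print("multiplication not defined:", err)
```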
A Special Matrix
The number 1 is called the 'identity' in multiplication because any
number multiplied by the identity is the same number. Is there
a similar identity for matrices?
A square matrix that is zero for all elements off the diagonal is
called a diagonal matrix. A diagonal matrix
with all diagonal entries equal to 1 is called the identity
matrix. The identity matrix (I) is special
because IM = MI = M. The identity
matrix works just like the multiplicative identity in real numbers
(1). The identity matrix is square, and there is an identity
matrix of each size, so there are infinitely many of them.
Try multiplying the 2×2 identity matrix times a 2×2 matrix.
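In NumPy, np.eye builds the identity matrix, so you can check IM = MI = M numerically:

```python
import numpy as np

I = np.eye(2)             # the 2 x 2 identity matrix
M = np.array([[1.0, 2.0],
              [3.0, 4.0]])

print(I @ M)   # equals M
print(M @ I)   # also equals M
```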
Linear equations
Linear equations are equations in which all of the terms are either
constants or linear in the variables, i.e. they include terms like 2x
but none like x^2 or x^3.
Linear equations can depend on several variables, but the individual
terms have to be linear in the variables and cannot have products of
the variables, i.e. 3x + 4y is allowed, but 3xy
is not.
A system of linear equations is a set of linear equations all
involving the same set of variables. For example,
\[
\begin{aligned} x + y + z &= 2 \\ x + 2y + 2z &= 4 \\ x + y + 4z &= 5 \end{aligned}
\]
Solving linear equations
Now let's try to find the solution of our set of linear equations,
i.e. find values for x, y, z that satisfy
all of the equations. First, let's think about what sorts of
operations we can do on the equations that keep the solutions of the
equations the same. We'll call those elementary
operations. Then by applying the elementary operations, we can
attempt to simplify the equations to the point where we can read off
the solutions.
Will the solutions be changed by swapping the order of two
equations?
No, they won't. So, we can define our first elementary
operation as swapping two equations. This seems a bit trivial
when thinking of the set as equations, but will be useful when we
translate them to matrices.
What other operations can you do that leave the solutions unchanged?
We can multiply any equation by a non-zero number. Since all the
terms get multiplied by the same factor, this leaves the solutions
unchanged.
Also, we can add any two equations. Since we apply the
summation to both sides of the equation, again the solutions will be
unchanged. More generally, we can add a multiple of any
equation to any other equation, replacing the original equation.
Let's try to solve this set of linear equations,
\[
\begin{aligned} x + y &= 27 \\ 2x - y &= 0. \end{aligned}
\]
Replace the first equation by the sum of the two equations. We
get
\[
\begin{aligned} 3x &= 27 \\ 2x - y &= 0. \end{aligned}
\]
Divide the first equation by 3.
\[
\begin{aligned} x &= 9 \\ 2x - y &= 0. \end{aligned}
\]
Now we can read off the solution for x as x =
9.
Now, replace the second equation by the second equation minus 2
times the first equation.
\[
\begin{aligned} x &= 9 \\ -y &= -18. \end{aligned}
\]
Multiply the second equation by -1.
\[
\begin{aligned} x &= 9 \\ y &= 18. \end{aligned}
\]
We can read off the solution for y as y = 18.
Let's check that these values are actually a solution of our
original equations: 9 + 18 = 27 and 2(9) - 18 = 0, so both equations are satisfied.
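You can also let NumPy do the check, or solve the system outright with np.linalg.solve:

```python
import numpy as np

A = np.array([[1.0, 1.0],     # x + y = 27
              [2.0, -1.0]])   # 2x - y = 0
b = np.array([27.0, 0.0])

x = np.array([9.0, 18.0])     # our candidate solution
print(A @ x)                  # [27. 0.], which matches b

print(np.linalg.solve(A, b))  # [ 9. 18.], so NumPy agrees
```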
Augmented matrices
We can write our system of linear equations in matrix form,
\[
\begin{pmatrix} 1 & 1 \\ 2 & -1 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 27 \\ 0 \end{pmatrix}.
\]
We could do all of the manipulations above in matrix form.
Note that we would never touch the column vector with x and
y; all of the operations would affect only the numbers in the
matrix and in the column vector on the right hand side of the
equation. So, we can write this in an even more compact form,
leaving out the column vector with x and y,
\[
\left(\begin{array}{cc|c} 1 & 1 & 27 \\ 2 & -1 & 0 \end{array}\right).
\]
This is called an augmented matrix. Note that we've removed
all of the variables, so this form is very amenable to use with
numerical matrix operations.
Our final set of equations, where we could read off the solutions,
in augmented matrix form is
\[
\left(\begin{array}{cc|c} 1 & 0 & 9 \\ 0 & 1 & 18 \end{array}\right).
\]
The left hand side of the augmented matrix (the original matrix) is
the Identity Matrix which has ones along the diagonal and zeros
elsewhere. Finding a solution of our linear equations is
equivalent to changing the left hand side of the augmented matrix to
the identity matrix. The only changes allowed are those that
keep the solutions of the system of equations unchanged,
specifically the elementary operations.
Our three elementary operations translated to the augmented matrix
representation are
- Row swap - exchange any two rows.
- Scalar multiplication - multiply any row by a non-zero
constant.
- Row addition - add a multiple of one row to another row.
These are called elementary row operations
or EROs.
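In NumPy, each ERO is a one-line operation on the rows of an array. A quick sketch using the augmented matrix from above:

```python
import numpy as np

aug = np.array([[1.0, 1.0, 27.0],
                [2.0, -1.0, 0.0]])      # the augmented matrix [A | b]

aug[[0, 1]] = aug[[1, 0]]               # row swap: exchange rows 0 and 1
aug[0] = 0.5 * aug[0]                   # scalar multiplication of a row
aug[1] = aug[1] - aug[1, 0] * aug[0]    # add a multiple of one row to another
print(aug)
```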
Echelon form
We can derive the solutions to a set of equations if we can change
the augmented matrix so it is in Row Echelon Form.
This is an example of a matrix in row echelon form:
\[
\left(\begin{array}{ccc|c} 1 & 1 & 1 & 2 \\ 0 & 1 & 1 & 2 \\ 0 & 0 & 3 & 3 \end{array}\right).
\]
All entries below the diagonal are zero. The leftmost
nonzero entry in each row is called the leading coefficient or more
commonly the pivot of that row. Echelon means "level
or rank in an organization". In this context, the first row is
the upper echelon because it has the most non-zero entries, while
the bottom row is the bottom echelon because it has the fewest
non-zero entries.
You can't directly read off the solution to the equation from this
form, but you can derive it by "back substitution". The bottom
row is equivalent to the equation 3z = 3, which has the
solution z = 1. The middle row is equivalent to the
equation y + z = 2. Since we know z =
1, we can substitute z into this equation and find y + 1 = 2
or y = 1. Substituting y and z into
the equation derived from the top row, we find x = 0.
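Back substitution is straightforward to code. Here's a minimal sketch for the example matrix above (the function back_substitute is my own name, not a library routine):

```python
import numpy as np

def back_substitute(aug):
    """Solve an upper-triangular augmented matrix [A | b] by back substitution."""
    n = aug.shape[0]
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):   # work from the bottom row up
        # subtract the already-known terms, then divide by the diagonal entry
        x[i] = (aug[i, -1] - aug[i, i + 1:n] @ x[i + 1:]) / aug[i, i]
    return x

aug = np.array([[1.0, 1.0, 1.0, 2.0],
                [0.0, 1.0, 1.0, 2.0],
                [0.0, 0.0, 3.0, 3.0]])
print(back_substitute(aug))   # [0. 1. 1.], i.e. x = 0, y = 1, z = 1
```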
To make the back substitution easier, it is preferable to get the
augmented matrix into a form that has ones along the diagonal (as
much as possible) and the minimum number of non-zero entries off the
diagonal. Some people call this the Reduced Row Echelon
Form and define it with the following rules.
- In every row, the leftmost non-zero entry is 1 (and is called
the pivot of that row).
- The pivot of any given row is always to the right of the pivot
of the row above it.
- The pivot is the only non-zero entry in its column.
If the system of equations has one unique solution, then it will be
possible to change the left hand part of the augmented matrix to be
the identity matrix. Once you've done that, you'll be able to
read off the solutions from the right hand part of the augmented
matrix (the column vector part). If it is not possible to
change the left hand part into the identity matrix, then the set of
linear equations is either redundant, in which case there are infinitely
many solutions, or inconsistent, in which case there are no solutions.
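NumPy can diagnose which case you're in by comparing matrix ranks; here's a sketch (the helper classify is my own, not a library routine):

```python
import numpy as np

def classify(A, b):
    """Classify the square system A x = b by comparing matrix ranks."""
    rank_A = np.linalg.matrix_rank(A)
    rank_aug = np.linalg.matrix_rank(np.column_stack([A, b]))
    if rank_A == A.shape[1]:
        return "one unique solution"
    elif rank_A == rank_aug:
        return "redundant: infinitely many solutions"
    else:
        return "inconsistent: no solutions"

A = np.array([[1.0, 1.0],
              [2.0, 2.0]])                # second row is twice the first
print(classify(A, np.array([2.0, 4.0])))  # redundant: infinitely many solutions
print(classify(A, np.array([2.0, 5.0])))  # inconsistent: no solutions
```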
Gaussian elimination
The process of applying elementary row operations
to simplify the augmented matrix to Reduced Row Echelon Form
is called 'Gaussian elimination'. We can write down an
algorithm for Gaussian elimination.
1. Check that the first entry of the first row is non-zero.
If not, swap that row with a row that does have a non-zero first
entry. (ERO - row swap)
2. Multiply the first row by a scalar to make the first entry
equal 1. (ERO - scalar multiplication)
3. Use that 1 as a pivot to zero out the first entry of all of
the other rows. More explicitly, for each other row,
multiply the top row by the negative of the first entry in the
row and add that to the row. (ERO - row addition) Note
that the first row does not change during this step.
4. In the first column, you should now have a one in the first
row and zeros in the other rows.
5. Move to the second row.
6. Check that the second entry of the current row is
non-zero. If not, swap that row with a row (other than the
first row) that does have a non-zero second entry. (ERO - row
swap)
7. Multiply the second row by a scalar to make the second entry
equal 1. (ERO - scalar multiplication)
8. Use that 1 as a pivot to zero out the second entry of all of
the other rows. More explicitly, for each other row,
multiply the second row by the negative of the second entry in
the row and add that to the row. (ERO - row addition) Note
that the second row does not change during this step, but the
first row probably will.
9. In the second column, you should now have a one in the second
row and zeros in the other rows.
10. Repeat steps 5 through 9 for the remaining
rows. Note that for the third row, you work on the third
entry, etc. (A Python sketch of this procedure follows the list.)
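Here is a minimal Python sketch of this algorithm (the function name and the zero-tolerance check are my own choices); we will develop this properly in Python next class:

```python
import numpy as np

def gaussian_eliminate(aug, tol=1e-12):
    """Return the reduced row echelon form of an augmented matrix [A | b]."""
    aug = aug.astype(float)   # work on a floating-point copy
    nrows = aug.shape[0]
    for i in range(nrows):
        # Steps 1/6: make sure the pivot entry is non-zero, swapping rows if needed.
        if abs(aug[i, i]) < tol:
            for j in range(i + 1, nrows):
                if abs(aug[j, i]) > tol:
                    aug[[i, j]] = aug[[j, i]]   # ERO: row swap
                    break
            else:
                raise ValueError("no usable pivot: system is redundant or inconsistent")
        # Steps 2/7: scale the row so the pivot equals 1.  (ERO: scalar multiplication)
        aug[i] = aug[i] / aug[i, i]
        # Steps 3/8: zero out the pivot column in every other row.  (ERO: row addition)
        for j in range(nrows):
            if j != i:
                aug[j] = aug[j] - aug[j, i] * aug[i]
    return aug

aug = np.array([[1.0, 1.0, 27.0],
                [2.0, -1.0, 0.0]])
print(gaussian_eliminate(aug))   # [[1. 0. 9.], [0. 1. 18.]]
```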
You should practice Gaussian elimination as we will be using it
repeatedly in this class.
Practice
Write each of the following systems of linear equations as an augmented
matrix and follow the algorithm above to perform Gaussian
elimination and find the solution.
and
Assignment
The assignment for today's class is HW #5
and involves doing Gaussian elimination and matrix operations with
no programming. The assignment is due at the beginning of
the next class. It will be very useful to do it before
class, since that class will be devoted to doing Gaussian
elimination in Python.