Scientific Computing Using Python - PHYS:4905 - Fall 2018
Lecture #05 - 9/4/2018 - Prof. Kaaret
These notes borrow from Linear Algebra
by Cherney, Denton, Thomas, and Waldron and the WikiPedia pages on Augmented
matrix and Gaussian
elimination.
Linear equations
Linear equations are equations in which all of the terms are either
constants or linear in the variables, i.e. they include terms like 2x
but none like x2 or x3.
Linear equations can depend on several variables, but the individual
terms have to be linear in the variables and cannot have products of
the variables, i.e. 3x + 4y is allowed, but 3xy
is not.
A system of linear equations is a set of linear equations all
involving the same set of variables. For example,
Solving linear equations
Now let's try to find the solution of our set of linear equations,
i.e. find values for x, y, z that satisfy
all of the equations. First, let's think about what sorts of
operations we can do on the equations that keep the solutions of the
equations the same. We'll call those elementary
operations. Then by applying the elementary operations, we can
attempt to simplfy the equations to the point where we can read off
the solutions.
Will the solutions be changed by swapping the order of two
equations?
No, they won't. So, we can define our first elementary
operation as swapping two equations. This seems a bit trivial
when thinking of the set as equations, but will be useful when we
translate them to matrices.
What other operations can you do that leave the solutions unchanged?
We can multiply any equation by a non-zero number. Since all the
terms get multiplied by the same factor, this leaves the solutions
unchanged.
Also, we can add any two equations. Since we apply the
summation to both sides of the equation, again the solutions will be
unchanged. More generally, we can add a multiple of any
equation to any other equation, replacing the original equation.
Let's try to solve this set of linear equations.
Replace the first equation by the sum of the two equations. We
get
Divide the first equation by 3.
Now we can read off the solution for x as x =
9.
Now, replace the second equation by the second equation minus 2
times the first equation.
Multiply the second equation by -1.
We can read off the solution for y as y = 18.
Let's check that these values are actually a solution of our
original equations.
Augmented matrices
We can write our system of linear equation in matrix form.
We could do all of the manipulations above in matrix form.
Note that we would never touch the column vector with x and
y, all of the operations would affect only the numbers in the
matrix and in the column vector on the right hand side of the
equation. So, we can write this in an even more compact form,
leaving out the column vector with x and y.
This is called an augmented matrix. Note that we've removed
all of the variables, so this form is very amenable to use with
numerical matrix operations.
Our final set of equations, where we could read off the solutions,
in augmented matrix form is
The left hand side of the augmented matrix (the original matrix) is
the Identity Matrix which has ones along the diagonal and zeros
elsewhere. Finding a solution of our linear equations is
equivalent to changing the left hand side of the augmented matrix to
the identity matrix.
Our three elementary operations translated to the augmented matrix
representation are
- Row swap - exchange any two rows.
- Scalar multiplication - multiply any row by a non-zero
constant.
- Row addition - add a multiple of one row to another row.
These are called elementary row operations
or EROs.
Echelon form
We can derive the solutions to a set of equations if we can change
the augmented matrix so it is in Row Echelon Form.
This is an example of a matrix in row echelon form:
All entries to the left of the diagonal are zero. The leftmost
nonzero entry in each row is called the leading coefficient or more
commonly the pivot of that row. Echelon means "level
or rank in an organization". In this context, the first row is
the upper echelon because it has the most non-zero entries, while
the bottom row is the bottom echelon because it has the fewest
non-zero entries.
You can't directly read off the solution to the equation from this
form, but you can derive it by "back substitution". The bottom
row is equivalent to the equation 3z = 3, which has the
solution z = 1. The middle row is equivalent to the
equation y + z = 2. Since we know z =
1, we can substitute z into this equation and find y + 1 = 2
or y = 1. Substituting y and z into
the equation derived from the top row, we find x = 0.
To make the back substitution easier, it is preferable to get the
augmented matrix into Reduced Row Echelon Form or
RREF. The RREF of an augmented matrix is the equivalent matrix
(same solutions) that has the minimum number of non-zero entries and
ones along the diagonal. The RREF of an augmented matrix is
defined as:
- In every row, the left most non-zero entry is 1 (and is called
the pivot of that row).
- The pivot of any given row is always to the right of the pivot
of the row above it.
- The pivot is the only non-zero entry in its column.
An augmented matrix with the left hand part equal to the identity
matrix is in RREF. If the RREF of an augmented matrix does not
have the identity matrix as its left hand part, usually this means
that the set of linear equations is redundant, so there are many
solutions, or inconsistent, so there are no solutions.
Gaussian elimination
The process of applying elementary row operations
to simplfy the augmented matrix to Reduced Row Echelon Form
is called 'Gaussian elimination'. We can write down an
algorithm for Gaussian elimination.
- Check that the first entry of the first row is non-zero.
If not, swap that row with a row that does have a non-zero first
entry. (ERO - Row swap)
- Multiply the first row by a scalar to make the first entry
equal 1. (ERO - scalar multiplication)
- Use that 1 as a pivot to zero out the first entry of all of
the other rows. More explicitly, for each other row,
multiply the top row by the negative of the first entry in the
row and add that to the row. (ERO - row addition) Note
that the first row does not change during this step.
- In the first column, you should now have a one in the first
row and zeros in the other rows.
- Move to the second row.
- Check that the second entry of the the current row is
non-zero. If not, swap that row with a row (other than the
first row) that does have a non-zero second entry. (ERO - Row
swap)
- Multiply the second row by a scalar to make the second entry
equal 1. (ERO - scalar multiplication)
- Use that 1 as a pivot to zero out the second entry of all of
the other rows. More explicitly, for each other row,
multiply the second row by the negative of the second entry in
the row and add that to the row. (ERO - row addition) Note
that the second row does not change during this step, but the
first row probably will.
- In the second column, you should now have a one in the second
row and zeros in the other rows.
- Repeat the process in lines 5 through 9 for the remaining
rows. Note that for the third row, you work on the third
entry, etc...
You should practice Gaussian elimination as we will be using it
repeatedly in this class.
Assignment
The assignment for today's class involves doing Gaussian
elimination and matrix operations with no programming. The
assignment is not due until September 11, but it will be very
useful to do it before class on Thursday, September 6, since that
class will be devoted to doing Gaussian elimination in Python.
HW #5 is due at the beginning of class on Tuesday,
September 11.
https://homepage.physics.uiowa.edu/~pkaaret/2018f_p4905/hw05.html