Scientific Computing Using Python - PHYS:4905 - Fall 2018

Lecture #05 - 9/4/2018 - Prof. Kaaret

These notes borrow from Linear Algebra by Cherney, Denton, Thomas, and Waldron and the WikiPedia pages on Augmented matrix and Gaussian elimination.

Linear equations

Linear equations are equations in which all of the terms are either constants or linear in the variables, i.e. they include terms like 2x but none like x2 or x3.  Linear equations can depend on several variables, but the individual terms have to be linear in the variables and cannot have products of the variables, i.e. 3x + 4y is allowed, but 3xy is not.

A system of linear equations is a set of linear equations all involving the same set of variables.  For example,

x+2y+3z=03x+4y+7z=26x+5y+9z=11\begin{matrix} x + 2y + 3z = 0 \\ 3x + 4y + 7z = 2 \\ 6x + 5y +9z = 11 \\ \end{matrix}

Solving linear equations

Now let's try to find the solution of our set of linear equations, i.e. find values for x, y, z that satisfy all of the equations.  First, let's think about what sorts of operations we can do on the equations that keep the solutions of the equations the same.  We'll call those elementary operations.  Then by applying the elementary operations, we can attempt to simplfy the equations to the point where we can read off the solutions.

Will the solutions be changed by swapping the order of two equations?

No, they won't.  So, we can define our first elementary operation as swapping two equations.  This seems a bit trivial when thinking of the set as equations, but will be useful when we translate them to matrices.

What other operations can you do that leave the solutions unchanged?

We can multiply any equation by a non-zero number. Since all the terms get multiplied by the same factor, this leaves the solutions unchanged.

Also, we can add any two equations.  Since we apply the summation to both sides of the equation, again the solutions will be unchanged.  More generally, we can add a multiple of any equation to any other equation, replacing the original equation.

Let's try to solve this set of linear equations.
x+y=272x-y=0\begin{matrix} x + y = 27 \\ 2x - y = 0\\ \end{matrix}
Replace the first equation by the sum of the two equations.  We get
3x+0=272x-y=0\begin{matrix} 3x + 0 = 27 \\ 2x - y = 0 \\ \end{matrix}
Divide the first equation by 3.
x+0=92x-y=0\begin{matrix} x + 0 = 9 \\ 2x - y = 0 \\ \end{matrix}
Now we can read off the solution for x as x = 9. 
Now, replace the second equation by the second equation minus 2 times the first equation.
x+0=90-y=-18\begin{matrix} x + 0 = 9 \\ 0 - y = -18 \\ \end{matrix}
Multiply the second equation by -1.
x+0=90+y=18\begin{matrix} x + 0 = 9 \\ 0 + y = 18 \\ \end{matrix}
We can read off the solution for y as y = 18.
Let's check that these values are actually a solution of our original equations.

x+y=9+18=272x-y=2×9-18=0\begin{matrix} x + y = 9 + 18 = 27 \\ 2x - y = 2×9 - 18 = 0 \\ \end{matrix}

Augmented matrices

We can write our system of linear equation in matrix form.
x+y=272x-y=0(112-1)(xy)=(270)\begin{matrix} x + y = 27 \\ 2x - y = 0\\ \end{matrix} \;\;\; ⟺ \;\;\; \begin{pmatrix} 1 & 1 \\ 2 & -1 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 27 \\ 0 \end{pmatrix}
We could do all of the manipulations above in matrix form.  Note that we would never touch the column vector with x and y, all of the operations would affect only the numbers in the matrix and in the column vector on the right hand side of the equation.  So, we can write this in an even more compact form, leaving out the column vector with x and y.

(11|272-1|0)\left(\begin{array}{cc|l} 1 & 1 & | \, 27 \\ 2 & -1 & | \, 0 \\ \end{array}\right)

This is called an augmented matrix.  Note that we've removed all of the variables, so this form is very amenable to use with numerical matrix operations.

Our final set of equations, where we could read off the solutions, in augmented matrix form is

(10|901|18)\left(\begin{array}{cc|l} 1 & 0 & | \, 9 \\ 0 & 1 & | \, 18 \\ \end{array}\right)

The left hand side of the augmented matrix (the original matrix) is the Identity Matrix which has ones along the diagonal and zeros elsewhere.  Finding a solution of our linear equations is equivalent to changing the left hand side of the augmented matrix to the identity matrix.

Our three elementary operations translated to the augmented matrix representation are

These are called elementary row operations or EROs.

Echelon form

We can derive the solutions to a set of equations if we can change the augmented matrix so it is in Row Echelon Form.  This is an example of a matrix in row echelon form:

(231|6011|2003|3)\left(\begin{array}{ccc|l} 2 & 3 & 1 | \, 6 \\ 0 & 1 & 1 | \, 2 \\ 0 & 0 & 3 | \, 3 \\ \end{array}\right)

All entries to the left of the diagonal are zero.  The leftmost nonzero entry in each row is called the leading coefficient or more commonly the pivot of that row.  Echelon means "level or rank in an organization".  In this context, the first row is the upper echelon because it has the most non-zero entries, while the bottom row is the bottom echelon because it has the fewest non-zero entries.

You can't directly read off the solution to the equation from this form, but you can derive it by "back substitution".  The bottom row is equivalent to the equation 3z = 3, which has the solution z = 1.  The middle row is equivalent to the equation y + z = 2.  Since we know z = 1, we can substitute z into this equation and find y + 1 = 2 or y = 1.  Substituting y and z into the equation derived from the top row, we find x = 0.

To make the back substitution easier, it is preferable to get the augmented matrix into Reduced Row Echelon Form or RREF.  The RREF of an augmented matrix is the equivalent matrix (same solutions) that has the minimum number of non-zero entries and ones along the diagonal.  The RREF of an augmented matrix is defined as:
An augmented matrix with the left hand part equal to the identity matrix is in RREF.  If the RREF of an augmented matrix does not have the identity matrix as its left hand part, usually this means that the set of linear equations is redundant, so there are many solutions, or inconsistent, so there are no solutions.

Gaussian elimination

The process of applying elementary row operations to simplfy the augmented matrix to Reduced Row Echelon Form is called 'Gaussian elimination'.  We can write down an algorithm for Gaussian elimination.
  1. Check that the first entry of the first row is non-zero.  If not, swap that row with a row that does have a non-zero first entry. (ERO - Row swap)
  2. Multiply the first row by a scalar to make the first entry equal 1. (ERO - scalar multiplication)
  3. Use that 1 as a pivot to zero out the first entry of all of the other rows.  More explicitly, for each other row, multiply the top row by the negative of the first entry in the row and add that to the row. (ERO - row addition)  Note that the first row does not change during this step.
  4. In the first column, you should now have a one in the first row and zeros in the other rows.
  5. Move to the second row.
  6. Check that the second entry of the the current row is non-zero.  If not, swap that row with a row (other than the first row) that does have a non-zero second entry. (ERO - Row swap)
  7. Multiply the second row by a scalar to make the second entry equal 1. (ERO - scalar multiplication)
  8. Use that 1 as a pivot to zero out the second entry of all of the other rows.  More explicitly, for each other row, multiply the second row by the negative of the second entry in the row and add that to the row. (ERO - row addition)  Note that the second row does not change during this step, but the first row probably will.
  9. In the second column, you should now have a one in the second row and zeros in the other rows.
  10. Repeat the process in lines 5 through 9 for the remaining rows.  Note that for the third row, you work on the third entry, etc...
You should practice Gaussian elimination as we will be using it repeatedly in this class.

Assignment

The assignment for today's class involves doing Gaussian elimination and matrix operations with no programming.  The assignment is not due until September 11, but it will be very useful to do it before class on Thursday, September 6, since that class will be devoted to doing Gaussian elimination in Python.

HW #5 is due at the beginning of class on Tuesday, September 11.
https://homepage.physics.uiowa.edu/~pkaaret/2018f_p4905/hw05.html