Scientific Computing Using Python - PHYS:4905 - Fall 2018
     Lecture #10 - 9/25/2018 - Prof. Kaaret
    These notes borrow from Linear Algebra
    by Cherney, Denton, Thomas, and Waldron.
    
    Matrices
    
    An r × k matrix $M = (m_{ij})$
    is a rectangular array of numbers $m_{ij}$,
    where i = 1, 2, ..., r and j = 1, 2, ..., k.  Each
    number $m_{ij}$ is an element of the matrix.
    
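    We can write the matrix in the form

    $$M = \begin{pmatrix} m_{11} & m_{12} & \cdots & m_{1k} \\ m_{21} & m_{22} & \cdots & m_{2k} \\ \vdots & \vdots & \ddots & \vdots \\ m_{r1} & m_{r2} & \cdots & m_{rk} \end{pmatrix}$$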
    
    An r × k matrix has r rows and k
    columns.
    
    An r × 1 matrix is a column vector,

    $$v = \begin{pmatrix} v_1 \\ v_2 \\ \vdots \\ v_r \end{pmatrix}.$$
    
    A 1 × r matrix is a row vector,

    $$v = \begin{pmatrix} v_1 & v_2 & \cdots & v_r \end{pmatrix}.$$
    
    Some people like to add commas to separate the elements of row
    vectors.
    
    
    Operations on Matrices
    Scalar multiplication: We can multiply a matrix by a scalar c,
    which means that we multiply each element by the scalar,

    $$cM = (c\,m_{ij}).$$
     
    Addition: We can add matrices, which means that we add the
    corresponding elements of the two matrices.  Note that the two
    matrices have to have the same dimensions for addition to be defined,

    $$M + N = (m_{ij} + n_{ij}).$$
    
    Interestingly, this means that the set of r × k
    matrices forms a vector space (written $M^r_k$ in the
    Cherney et al. text).
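
    Scalar multiplication and addition can be tried out directly with
    NumPy arrays.  The following is a minimal sketch; the matrices M and N
    are made-up values chosen only for illustration.  Note that NumPy will
    sometimes "broadcast" arrays of different shapes, so the shapes below
    were picked so that the mismatch is rejected.

```python
import numpy as np

# Hypothetical 2 x 3 matrices, chosen only for illustration.
M = np.array([[1, 2, 3],
              [4, 5, 6]])
N = np.array([[10, 20, 30],
              [40, 50, 60]])

scaled = 2 * M      # every element is multiplied by the scalar
total = M + N       # corresponding elements are added

print(scaled)
print(total)

# A (2 x 3) matrix cannot be added to a (3 x 2) matrix.
try:
    M + N.T
    addition_defined = True
except ValueError:
    addition_defined = False
print(addition_defined)   # False
```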
    
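    Matrix multiplication: We can multiply an r × k matrix M by a
    k × m matrix N to get an r × m matrix MN with elements

    $$(MN)_{ij} = \sum_{p=1}^{k} M_{ip} N_{pj}.$$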
    
    Matrix multiplication is not simply multiplying each element of M by
    the corresponding element of N.  Rather, it is (exactly) like
    the dot product - you multiply several elements in M by elements in
    N and then sum the products.  
    
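    The simplest example is multiplying a 1 × r
    matrix (a row vector) by an r × 1 matrix (a column
    vector). For example,

    $$\begin{pmatrix} 1 & 2 & 3 \end{pmatrix} \begin{pmatrix} 4 \\ 5 \\ 6 \end{pmatrix} = 1 \cdot 4 + 2 \cdot 5 + 3 \cdot 6 = 32.$$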
    
    This is the same as the dot product, which is why the NumPy function
    dot works for both the dot product and matrix multiplication in
    Python.  Note that the row and column vectors need to have the
    same length.  This is equivalent to the second dimension of the
    first matrix equaling the first dimension of the second matrix.
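
    As a quick check of the statement above, the sketch below calls
    np.dot once on a row vector and a column vector stored as
    2-dimensional arrays, and once on the same numbers stored as plain
    1-dimensional arrays.  The numbers are illustrative assumptions.

```python
import numpy as np

# Hypothetical example values, chosen only for illustration.
row = np.array([[1, 2, 3]])        # 1 x 3 matrix (row vector)
col = np.array([[4], [5], [6]])    # 3 x 1 matrix (column vector)

# Matrix product of a 1 x 3 and a 3 x 1 matrix: a 1 x 1 matrix.
product = np.dot(row, col)

# The same numbers as 1-D arrays give the ordinary dot product.
u = np.array([1, 2, 3])
v = np.array([4, 5, 6])
scalar = np.dot(u, v)

print(product)   # [[32]]
print(scalar)    # 32
```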
    
    Graphically, you do matrix multiplication by taking each column of
    the second matrix, flopping it over into a row vector (taking the
    transpose), multiplying each pair of elements, and then summing
    those products.  The entries of the matrix multiplication MN
    are made from the dot products of the rows of M with the columns of
    N.
    
    
    Doing this for each of the m columns of the second matrix on
    each of the r rows gives you r × m numbers
    that form your r × m  output matrix.  Note
    that the dot product (multiply and add operation) doesn't make sense
    unless the number of columns of the first matrix equals the number
    of rows of the second matrix.  So, the two matrices do not need
    to have the same dimensions, but the second dimension of the first
    matrix must equal the first dimension of the second matrix.
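
    The row-by-column recipe can be written out explicitly.  The function
    below is only a sketch for illustration (the name
    matmul_by_dot_products and the test matrices are made up here); in
    practice one would use the @ operator or np.dot.

```python
import numpy as np

def matmul_by_dot_products(M, N):
    """Multiply M (r x k) by N (k x m) using row-by-column dot products."""
    r, k = M.shape
    k2, m = N.shape
    if k != k2:
        raise ValueError("columns of M must equal rows of N")
    out = np.zeros((r, m))
    for i in range(r):          # each row of the first matrix
        for j in range(m):      # each column of the second matrix
            out[i, j] = np.dot(M[i, :], N[:, j])
    return out

M = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])      # 2 x 3
N = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [2.0, 2.0]])           # 3 x 2

result = matmul_by_dot_products(M, N)
print(result.shape)                  # (2, 2)
print(np.allclose(result, M @ N))    # True
```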
    
    What do you get if you multiply a 3×3 matrix times a column vector?
    
    What do you get if you multiply a 3×3 matrix times a row vector?
    
    What do you get if you multiply a 3×3 matrix times a 3×3 matrix?
    
    What do you get if you multiply a 3×1 matrix times a 1×2 matrix?
    
    In general, (r × k) times (k
    × m) is (r × m).
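
    One way to check your answers to the questions above is to ask NumPy
    for the shape of each product.  This is a sketch using arrays of ones
    as stand-ins; only the shapes matter here.

```python
import numpy as np

A = np.ones((3, 3))      # 3 x 3 matrix
col = np.ones((3, 1))    # column vector
row = np.ones((1, 3))    # row vector

shape_col = (A @ col).shape
shape_square = (A @ A).shape
shape_outer = (np.ones((3, 1)) @ np.ones((1, 2))).shape
print(shape_col, shape_square, shape_outer)

# For (3 x 3) times (1 x 3) the inner dimensions, 3 and 1, do not
# match, so NumPy refuses to form the product.
try:
    A @ row
    row_product_defined = True
except ValueError:
    row_product_defined = False
print(row_product_defined)
```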
    
    Matrix terminology 
    A matrix is called square if its two dimensions are
    equal.  Makes sense, since you then have a square pattern of
    numbers when you write it down.
    
    The transpose of an r × k matrix $M = (m_{ij})$
    is the k × r matrix $M^T$ with elements
    $(M^T)_{ji} = m_{ij}$, where i = 1,
    2, ..., r and j = 1, 2, ..., k.  The
    elements are swapped across the diagonal.  If the matrix M is
    not square, then the transpose will have a different shape. 
    For example, the transpose of a column vector is a row vector. 
    
    
    Taking the transpose twice gets you back to the original matrix.
    
    The transpose of a product of matrices equals the product of the
    transposes with the order swapped,

    $$(MN)^T = N^T M^T.$$
    
    If the transpose equals the matrix,
    $M^T = M$, then
    the matrix is symmetric.  Of course, only square matrices can
    be symmetric (just from their geometry).
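
    The transpose properties above can be checked numerically.  In the
    sketch below the matrices M, N, and S are made-up examples; in NumPy
    the transpose of an array is available as the .T attribute.

```python
import numpy as np

M = np.array([[1, 2, 3],
              [4, 5, 6]])        # 2 x 3
N = np.array([[1, 0],
              [2, 1],
              [0, 3]])           # 3 x 2

print(M.T.shape)                               # (3, 2)
print(np.array_equal(M.T.T, M))                # True
print(np.array_equal((M @ N).T, N.T @ M.T))    # True

# A symmetric matrix equals its own transpose.
S = np.array([[2, 7],
              [7, 5]])
print(np.array_equal(S, S.T))                  # True
```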
    
    A square matrix that is zero for all elements off the diagonal is
    called a diagonal matrix.
    
    A diagonal matrix with all diagonal entries equal to 1 is called the
    identity matrix.  The identity matrix is special
    because IM = MI = M.  The identity matrix works just like the
    multiplicative identity in real numbers (1).  The identity
    matrix is square and there are an infinite number of identity
    matrices with different numbers of rows/columns.
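
    In NumPy, np.eye(n) builds the n × n identity matrix and np.diag
    builds a diagonal matrix from a list of diagonal entries.  The sketch
    below uses a made-up non-square M to show that the size of the
    identity has to match the side it multiplies on.

```python
import numpy as np

M = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])   # 2 x 3

I2 = np.eye(2)                    # 2 x 2 identity
I3 = np.eye(3)                    # 3 x 3 identity

left = I2 @ M                     # identity on the left must be 2 x 2
right = M @ I3                    # identity on the right must be 3 x 3
print(np.array_equal(left, M))    # True
print(np.array_equal(right, M))   # True

# A diagonal matrix: zero everywhere off the diagonal.
D = np.diag([1.0, 2.0, 3.0])
print(D)
```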
    
    We can define powers of matrices, e.g.
    $M^2 = MM$ and $M^3 = MMM$, but
    only for square matrices.  Why?
    
    Similar to the fact that
    $x^0 = 1$ for all real numbers, we define
    $M^0 = I$.
    This allows us to evaluate any polynomial on any square matrix.
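
    As an illustration, the sketch below evaluates a polynomial on a
    square matrix using np.linalg.matrix_power.  Both the polynomial
    f(x) = 1 + 2x + x^2 and the matrix M are hypothetical choices made
    for this example only.

```python
import numpy as np
from numpy.linalg import matrix_power

def f_of_matrix(M):
    """Evaluate the hypothetical polynomial f(x) = 1 + 2x + x^2 at M."""
    identity = matrix_power(M, 0)      # M^0 = I
    return identity + 2 * M + matrix_power(M, 2)

M = np.array([[1, 1],
              [0, 1]])

fM = f_of_matrix(M)
print(matrix_power(M, 0))   # [[1 0]
                            #  [0 1]]
print(fM)                   # [[4 4]
                            #  [0 4]]
```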
    
    Exercise: Let 
    and  .  What is
    f(M)?
    
    Associativity and non-commutativity
    Matrix multiplication is associative.  Specifically, (MN)R
    = M(NR). 
    
    Matrix multiplication is, in general, not commutative. 
    Specifically, for two generic square matrices,
    $MN \neq NM$.
    
    
    Note that multiplication by some matrices is commutative, e.g. the
    identity matrix.  However, you cannot assume that matrix
    multiplication is commutative for an arbitrary pair of matrices, so
    you cannot commute two matrices when you do algebra with matrices.
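
    A small numerical check makes both points concrete.  The matrices M,
    N, and R below are made-up examples; integer entries are used so that
    the comparisons are exact.

```python
import numpy as np

M = np.array([[1, 2],
              [3, 4]])
N = np.array([[0, 1],
              [1, 0]])
R = np.array([[2, 0],
              [0, 3]])
I = np.eye(2, dtype=int)

# Associativity: (MN)R = M(NR)
print(np.array_equal((M @ N) @ R, M @ (N @ R)))   # True

# Non-commutativity: MN and NM differ for this pair
print(M @ N)                                      # [[2 1]
                                                  #  [4 3]]
print(N @ M)                                      # [[3 4]
                                                  #  [1 2]]

# The identity matrix does commute with M
print(np.array_equal(I @ M, M @ I))               # True
```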
    
    
    Trace
    The trace of a square n × n matrix $M = (m_{ij})$ is the sum of its
    diagonal entries,

    $$\mathrm{tr}(M) = \sum_{i=1}^{n} m_{ii}.$$
    
    Taking the transpose does not affect the trace, since the transpose
    doesn't change any of the diagonal elements, tr(M) = tr($M^T$).
    
    
    Even though matrix multiplication does not commute, the trace of a
    product of matrices does not depend on the order of multiplication,
    tr(MN) = tr(NM).
    
    The trace operator is a linear operation that transforms matrices to
    the real numbers.
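
    The trace properties above can be verified with np.trace.  This is a
    sketch using made-up integer matrices M and N; the last check
    illustrates linearity, tr(aM + bN) = a tr(M) + b tr(N).

```python
import numpy as np

M = np.array([[1, 2],
              [3, 4]])
N = np.array([[0, 1],
              [1, 0]])

print(np.trace(M))                              # 5
print(np.trace(M) == np.trace(M.T))             # True

# MN and NM are different matrices, but their traces agree.
print(np.trace(M @ N), np.trace(N @ M))         # 5 5

# Linearity of the trace
lhs = np.trace(2 * M + 3 * N)
rhs = 2 * np.trace(M) + 3 * np.trace(N)
print(lhs, rhs)                                 # 10 10
```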
    
    Block matrices
    Sometimes it is efficient to break a matrix into blocks.  For
    example,
    
    
    
    You can then work out matrix multiplication using the smaller
    matrices in the blocks.  
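    For example, we can find the square of M using these
    blocks.  First, we do the matrix math on the blocks in
    schematic form,

    $$M^2 = \begin{pmatrix} A & B \\ C & D \end{pmatrix} \begin{pmatrix} A & B \\ C & D \end{pmatrix} = \begin{pmatrix} A^2 + BC & AB + BD \\ CA + DC & CB + D^2 \end{pmatrix}.$$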
    
    Then, we can calculate each of these expressions (which are
    polynomials in A, B, C, and D) using
    the matrices for each block as defined above,
    
    
    
    Then, we substitute these results back into the schematic form of
    our matrix multiplication and get the answer,
    
    
    
    This is exactly what we would have gotten if we did the full matrix
    multiplication.
    
    This example doesn't actually save much work.  Block matrices
    are much more useful if some of the blocks are zero or an identity
    matrix.
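
    Block multiplication can be checked with np.block, which assembles a
    matrix from smaller arrays.  The blocks A, B, C, and D below are
    made-up values (they are not the blocks of the example above); the
    sketch assumes M is partitioned as [[A, B], [C, D]].

```python
import numpy as np

# Hypothetical blocks: A is 2 x 2, B is 2 x 1, C is 1 x 2, D is 1 x 1.
A = np.array([[1, 2],
              [3, 4]])
B = np.array([[5],
              [6]])
C = np.array([[7, 8]])
D = np.array([[9]])

M = np.block([[A, B],
              [C, D]])            # full 3 x 3 matrix

# Square M block by block, keeping the order of each product.
M2_blocks = np.block([
    [A @ A + B @ C, A @ B + B @ D],
    [C @ A + D @ C, C @ B + D @ D],
])

print(np.array_equal(M2_blocks, M @ M))   # True
```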
    
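    Exercise: The matrix

    $$\begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}$$

    rotates a vector in a two-dimensional space through an angle
    $\theta$.  We can
    use a block matrix to do a rotation in three dimensions.  The
    matrix

    $$\begin{pmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{pmatrix}$$

    will rotate a 3-dimensional vector (x, y, z)
    around the z axis (or in the xy plane) through an
    angle
    $\theta$.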
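
    The block construction of the 3-dimensional rotation can be sketched
    in NumPy.  The function names rotation_2d and rotation_z are made up
    for this example, and the code assumes the common convention that a
    positive angle rotates counterclockwise in the xy plane.

```python
import numpy as np

def rotation_2d(theta):
    """2 x 2 rotation matrix (counterclockwise convention assumed)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s],
                     [s,  c]])

def rotation_z(theta):
    """3 x 3 rotation about the z axis, assembled from blocks."""
    return np.block([
        [rotation_2d(theta), np.zeros((2, 1))],
        [np.zeros((1, 2)),   np.ones((1, 1))],
    ])

# Rotate a vector by 90 degrees about z: x goes to y, z is unchanged.
v = np.array([1.0, 0.0, 5.0])
rotated = rotation_z(np.pi / 2) @ v
print(np.allclose(rotated, [0.0, 1.0, 5.0]))   # True
```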
    
    Using block matrices, find the matrix product
    . 
    
    What does this matrix do when applied to a 3-dimensional vector (x,
    y, z)?
    What matrix would rotate a 3-dimensional vector (x, y,
    z) around the x axis?
    
    
    Inverse matrix
    The inverse of a square matrix M is the matrix $M^{-1}$ such
    that
    $M^{-1}M = I = MM^{-1}$.
    
    Not all matrices have inverses.  If the inverse does exist,
    then the matrix is called invertible or nonsingular. 
    If a matrix does not have an inverse then it is called singular
    or non-invertible.
    
    The inverse of the inverse is my friend, or actually, the original
    matrix,
    $(M^{-1})^{-1} = M$.
    
    The inverse of the product is the product of the inverses with the
    order swapped,
    $(MN)^{-1} = N^{-1}M^{-1}$.

    The inverse of the transpose is the transpose of the inverse,
    $(M^T)^{-1} = (M^{-1})^T$.
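
    These identities can be checked numerically with np.linalg.inv.  The
    matrices below are made-up invertible examples, and np.allclose is
    used because floating-point results are only approximately equal.

```python
import numpy as np
from numpy.linalg import inv

M = np.array([[2.0, 1.0],
              [1.0, 1.0]])
N = np.array([[1.0, 2.0],
              [0.0, 1.0]])
I = np.eye(2)

M_inv = inv(M)
print(np.allclose(M_inv @ M, I) and np.allclose(M @ M_inv, I))   # True
print(np.allclose(inv(M_inv), M))                                # True
print(np.allclose(inv(M @ N), inv(N) @ inv(M)))                  # True
print(np.allclose(inv(M.T), inv(M).T))                           # True
```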
    
    
    Solving linear equations with the inverse
    The process of Gaussian elimination to find the solution of a system
    of linear equations is equivalent to finding an inverse
    matrix.  Let's look at this in matrix form.
    
    We start with a system of linear equations that we can write in
    matrix form as MX = V, where M is a matrix
    and X and V are column vectors.  The problem is
    specified by M and V.  The solution is X.
    
    We write the system of equations in augmented matrix form as (M
    | V).  We apply a bunch of elementary row operations to
    M that reduce it to the identity matrix I.  The
    operations taken together are equivalent to multiplying by
    $M^{-1}$.  We apply the same operations to V to get
    the solution.  This is the same as multiplying V by
    $M^{-1}$, so the solution of the system of linear equations is
    $X = M^{-1}V$.
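
    In NumPy the two routes look like this.  The system below (M and V)
    is a made-up example; np.linalg.solve solves MX = V directly without
    forming the inverse, while np.linalg.inv forms the inverse
    explicitly.

```python
import numpy as np

# Hypothetical system: 2x + y = 3, x + 3y = 5
M = np.array([[2.0, 1.0],
              [1.0, 3.0]])
V = np.array([3.0, 5.0])

X_solve = np.linalg.solve(M, V)      # solve MX = V directly
X_inverse = np.linalg.inv(M) @ V     # X = M^{-1} V

print(X_solve)                            # approximately [0.8 1.4]
print(np.allclose(X_solve, X_inverse))    # True
print(np.allclose(M @ X_solve, V))        # True
```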
    
    
    If we want to solve another system of linear equations that has the
    same M, but a different V that we'll call W,
    we have already done most of the work.  We just need to
    find $M^{-1}W$.
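
    Reusing the work for a second right-hand side can be sketched as
    follows.  M, V, and W are made-up values; the inverse is computed
    once and applied to each vector.  Alternatively, np.linalg.solve
    accepts several right-hand sides at once as the columns of a matrix.

```python
import numpy as np

M = np.array([[2.0, 1.0],
              [1.0, 3.0]])
V = np.array([3.0, 5.0])
W = np.array([1.0, 0.0])

M_inv = np.linalg.inv(M)     # done once
X_V = M_inv @ V              # solution for right-hand side V
X_W = M_inv @ W              # solution for right-hand side W

# Same solutions from a single call with V and W as columns.
both = np.linalg.solve(M, np.column_stack([V, W]))

print(np.allclose(M @ X_W, W))          # True
print(np.allclose(both[:, 0], X_V))     # True
print(np.allclose(both[:, 1], X_W))     # True
```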
    
    
    When does the inverse exist?
    A square matrix M is invertible if and only if the homogeneous system
    of equations Mx = 0 has no non-zero solutions.
    
    If $M^{-1}$ exists, then we can multiply both sides
    of Mx = 0 by the inverse.  We get

    $$M^{-1}Mx = M^{-1}0 \quad\Rightarrow\quad x = 0.$$

    So the only solution of Mx = 0 must be x = 0.
    If $M^{-1}$ does not exist, then we cannot multiply
    both sides of Mx = 0 by the inverse.
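
    The argument above can be illustrated numerically.  In this sketch
    the matrices are made-up examples: M_ok is invertible, so Mx = 0
    forces x = 0, while M_singular has a second row that is twice its
    first row and sends a non-zero vector to zero.  This only illustrates
    the statement; it is not a proof.

```python
import numpy as np

# Invertible example: the only solution of Mx = 0 is x = 0.
M_ok = np.array([[2.0, 1.0],
                 [1.0, 3.0]])
x_ok = np.linalg.solve(M_ok, np.zeros(2))
print(np.allclose(x_ok, 0.0))                 # True

# Singular example: the second row is twice the first row.
M_singular = np.array([[1.0, 2.0],
                       [2.0, 4.0]])
x_nonzero = np.array([2.0, -1.0])
print(M_singular @ x_nonzero)                 # [0. 0.]

# The rank is less than 2, so M_singular has no inverse.
rank = np.linalg.matrix_rank(M_singular)
print(rank)                                   # 1
```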
    
    This condition must be satisfied for a matrix to be invertible, but
    is it sufficient to ensure the existence of the inverse?