Sunday, November 04, 2012

Back-substitution and inverting matrices

Matrix triangulation and back-substitution algorithms can be used in combination with Gaussian elimination to solve systems of equations or to find the inverse of a matrix. I previously covered Gaussian elimination; continuing on, we can now solve the system of equations using back-substitution.

The matrix we had to solve was:
    1     2     1     4    13 
    0    -4     2    -5     2 
    0     0    -5  -7.5   -35 
    0     0     0    -9   -18 
First we normalise the upper-triangle matrix by dividing each row so that its leading coefficient is one:
    1     2     1     4    13 
    0     1  -0.5  1.25  -0.5 
    0     0     1   1.5     7 
    0     0     0     1     2 
(This simplifies the back-substitution; we can also skip this step and fold it into the back-substitution itself.)

For back-substitution we work our way backwards from the bottom of the matrix to the top, progressively eliminating each variable. As with Gaussian elimination we select a pivot row, but this time subtract multiples of it from the rows above it. First, we start with the last row, and subtract 1.5 times that row from the row above.
    1     2     1     4    13 
    0     1  -0.5   1.2  -0.5 
    0     0     1     0     4 <-- subtract pivot row * 1.5
    0     0     0     1     2 <-- pivot 
Similarly, we continue on to the second row, subtracting 1.25 times, and the top row, subtracting four times.
    1     2     1     0     5 <-- subtract pivot row * 4
    0     1  -0.5     0    -3 <-- subtract pivot row * 1.25
    0     0     1     0     4 
    0     0     0     1     2 <-- pivot
Again, we repeat the process for the third column:
    1     2     0     0     1 
    0     1     0     0    -1 
    0     0     1     0     4 <-- pivot
    0     0     0     1     2 
And finally, the second column:
    1     0     0     0     3 
    0     1     0     0    -1 <-- pivot
    0     0     1     0     4 
    0     0     0     1     2 
Now we have our solution to the system of equations from our original gaussian elimination problem.
a = 3, b = -1, c = 4 and d = 2.
In words/pseudo-code, the process is:
  • Pivot through all the rows, starting from the bottom and working up to the top.
  • For each row above the pivot, calculate how many times we need to subtract the pivot row from it (divide its entry in the pivot column by the pivot).
  • For each element in that row, subtract the corresponding element of the pivot row, multiplied by the value calculated above.
In code:
for (int p=n-1;p>0;p--) { //pivot backwards through all the rows
    for (int r=p-1;r>=0;r--) { //for each row above the pivot
        float multiple = mat[r][p] / mat[p][p]; //how many multiples of the pivot row do we need (to subtract)?
        for (int c=p;c<m;c++) { //columns left of the pivot are already zero, so start at the pivot column
            mat[r][c] = mat[r][c] - mat[p][c]*multiple; //subtract the pivot row element (multiple times)
        }
    }
}
(complete code here)

This process can be applied to find the inverse of a general matrix. Beginning with any matrix we want to invert, we augment it with the identity matrix. For example:
    2     4    -2     1     0     0 
    4     9    -3     0     1     0 
   -2    -3     7     0     0     1 
Now we can apply gaussian elimination to generate:
    2     4    -2     1     0     0 
    0     1     1    -2     1     0 
    0     0     4     3    -1     1 
Then we normalise the upper triangle to get:
    1     2    -1   0.5     0     0 
    0     1     1    -2     1     0 
    0     0     1  0.75 -0.25  0.25 
And finally, back-substitution to get our solved inverse:
    1     0     0  6.75 -2.75  0.75 
    0     1     0 -2.75  1.25 -0.25 
    0     0     1  0.75 -0.25  0.25 
In this entire discussion I have left out ill-conditioned and singular matrices, but I'll leave modifying the code for that as an exercise for the reader.

Gaussian Elimination

Gaussian Elimination uses elementary row operations to convert a matrix into an upper triangular (row echelon) form. It forms the basis of a number of operations in linear algebra for solving systems of equations, inverting matrices, and minimizing systems of equations, among other things (I'll cover these in later posts). (Reducing all the way to row-reduced echelon form, RREF, additionally requires the back-substitution step.) The Gaussian Elimination algorithm itself is straightforward (you probably learnt it in high school). Given a system of equations, e.g.
  a + 2b +  c + 4d = 13
 2a +      4c + 3d = 28
 4a + 2b + 2c +  d = 20
-3a +  b + 3c + 2d = 6
We can form an augmented matrix to represent it, and use Gaussian elimination to solve it. The goal is to produce a triangular matrix, so that we can then solve the equations by back-substitution. In other words, we eventually want each variable to be represented by one row, with that variable's coefficient zero in every other row (i.e. solved). Gaussian elimination takes us part of the way there, producing a triangular system that we can later solve.

Representing the above equations as a matrix, we have:
    1     2     1     4    13 
    2     0     4     3    28 
    4     2     2     1    20 
   -3     1     3     2     6 
The first step is to select a pivot row, which we can use to eliminate/reduce the other rows. When we eliminate a row, we want that variable's value to be zero. In this example, we pick the first row, and subtract it twice from the row below, so that the row below has zero a's.
    1     2     1     4    13 <-- pivot
    0    -4     2    -5     2 <-- subtract pivot row * 2
    4     2     2     1    20 
   -3     1     3     2     6 
Likewise, we subtract the pivot row four times from the third row, and negative three times from the final row.
    1     2     1     4    13 <-- pivot
    0    -4     2    -5     2 
    0    -6    -2   -15   -32 <-- subtract pivot row * 4
    0     7     6    14    45 <-- subtract pivot row * -3
Great. Our first variable (a) has been eliminated. We now repeat this step, starting from the second row, with the variable 'b'. We don't want to use the first row, as we want to preserve that row's representation of the 'a' variable.
    1     2     1     4    13 
    0    -4     2    -5     2 <-- pivot
    0     0    -5  -7.5   -35 <-- subtract pivot row * 1.5
    0     0   9.5  5.25  48.5 <-- subtract pivot row * -1.75
Now, we repeat the process again, starting from the third row.
    1     2     1     4    13 
    0    -4     2    -5     2 
    0     0    -5  -7.5   -35 <-- pivot
    0     0     0    -9   -18 <-- subtract pivot row * -1.9
Done. In pseudo-code/words, the algorithm is:
  • For each row (except the last), select a pivot. (In my example, I just take the first available row each time.)
  • For each row below the pivot, calculate how many times we need to subtract the pivot row (i.e. divide its entry in the pivot column by the pivot).
  • For each element in this row, subtract the corresponding element in the pivot row, multiplied by the value we calculated above.
The code to achieve this is:
//input a m (col) by n (row) matrix ('mat')
//p is the pivot - which row we will use to eliminate
for (int p=0;p<n-1;p++) { //pivot through all the rows
    for (int r=p+1; r < n; r++) { //for each row below the pivot
        float multiple = mat[r][p] / mat[p][p]; //how many multiples of the pivot row do we need (to eliminate this row)?
        for (int c = 0; c<m; c++) { //for each element in this row
            mat[r][c] = mat[r][c] - mat[p][c]*multiple; //subtract the pivot row element (multiple times)
        }
    }
}
(full code here) Next time, we continue on to solve the equations - available here!