Basic matrix operations

Definitions, notation and terminology

A matrix is a two-dimensional arrangement of numbers, in the form of rows and columns. Each element in a matrix is a number identified by its position in a particular row and a particular column. For example, element \(a_{23}\) is the element in row \(2\) and column \(3\). The general notation for an element in a matrix is \(a_{ij}\), where \(i\) is the row of the element and \(j\) is its column.

A matrix is denoted by square brackets \([\,]\) around the arrangement of numbers. For example, the following is a matrix of 2 rows and 3 columns: \[\begin{bmatrix} 2 & 0 & 3 \\ 1 & 2 & -1\end{bmatrix}\]

A row vector is a one-row matrix. A column vector is a one-column matrix. For example, matrix \(A\) that follows is a row vector, while matrix \(B\) is a column vector. \[A=\begin{bmatrix} 2 & 0 & 3 \end{bmatrix};\qquad B=\begin{bmatrix} 2 \\ 0 \end{bmatrix}\]

The size of a matrix is the number of rows and columns it contains. An \(m\times n\) matrix contains \(m\) rows and \(n\) columns. In the example above, \(A\) is a \(1\times 3\) matrix and \(B\) is a \(2\times 1\) matrix.
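These sizes are easy to check numerically. The following sketch uses NumPy (our choice for illustration; the text itself does not prescribe any software), where a matrix's size is its `shape`:

```python
import numpy as np

# The row vector A and column vector B from the example above.
A = np.array([[2, 0, 3]])       # 1 row, 3 columns
B = np.array([[2], [0]])        # 2 rows, 1 column

print(A.shape)  # (1, 3)
print(B.shape)  # (2, 1)
```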

A matrix that has an equal number of rows and columns is called a square matrix. A square matrix of size \(1\times 1\) is generally considered equivalent to a scalar, so \(\begin{bmatrix} 5\end{bmatrix}=5\). The elements \(a_{11},a_{22},a_{33},\dots,a_{nn}\) of a square matrix are on the main diagonal (red elements below) of the matrix. \[\begin{bmatrix} \color{red}{1} & 0 & 3 \\ 2 & \color{red}{-1} & 1 \\ 0 & 4 & \color{red}{5} \end{bmatrix}\]

We may denote the size of a matrix by subscript notation. For example, \(A_{2,3}\) denotes a matrix \(A\) that has 2 rows and 3 columns. For a square matrix, one number suffices, since the number of rows equals the number of columns; the notation \(A_3\) may therefore be used to represent a \(3\times 3\) matrix. Some caution is needed here: the meaning of the subscript must be interpreted from context.

If all elements above the main diagonal are zero (like matrix \(A\) below), the matrix is called a lower triangular matrix. If all elements below the main diagonal are zero (like matrix \(B\) below), the matrix is called an upper triangular matrix. If all elements other than those on the main diagonal are zero (like matrix \(C\) below), the matrix is called a diagonal matrix. \[A=\begin{bmatrix} 2 & 0 & 0 \\ -1 & 3 & 0 \\ 0 & 1 & 2 \end{bmatrix};\qquad B=\begin{bmatrix} 3 & 1 & 3 \\ 0 & 0 & 2 \\ 0 & 0 & -1 \end{bmatrix};\qquad C=\begin{bmatrix} 4 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & -3 \end{bmatrix}\]

A matrix all of whose elements are zero is called a zero matrix, and is denoted by \(0\). Note that there are infinitely many zero matrices of different sizes. A diagonal matrix whose main diagonal contains only \(1\)'s is called an identity matrix, and is denoted by \(I\). Note that an identity matrix must be square, and that there are infinitely many identity matrices of different sizes.
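NumPy (assumed here for illustration) has constructors for these special matrices: `np.zeros`, `np.eye` for the identity, and `np.tril`/`np.triu` for lower/upper triangular parts. A matrix is lower triangular exactly when it equals its own lower triangular part:

```python
import numpy as np

Z = np.zeros((2, 3))   # a 2x3 zero matrix
I = np.eye(3)          # the 3x3 identity matrix

# Matrix A from the triangular examples above.
M = np.array([[2, 0, 0],
              [-1, 3, 0],
              [0, 1, 2]])

# np.tril(M) keeps the main diagonal and everything below it,
# zeroing the rest; equality with M means M is lower triangular.
print(np.array_equal(M, np.tril(M)))  # True
```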

Matrix addition and subtraction

Two matrices \(A\) and \(B\) can be added (or subtracted) if and only if they are of the same size. We do this by adding (or subtracting) their corresponding elements. The result is a matrix of the same size.

\[A=\begin{bmatrix} 2 & 3 \\ -1 & 1 \end{bmatrix};\qquad B=\begin{bmatrix} 1 & 0 \\ -2 & 4 \end{bmatrix}\] \[A+B=\begin{bmatrix} 2+1 & 3+0 \\ -1-2 & 1+4 \end{bmatrix}= \begin{bmatrix} 3 & 3 \\ -3 & 5\end{bmatrix}\]

Matrix addition is both commutative and associative. The zero matrix (with the appropriate size) acts like the zero in addition of numbers. Consider matrices \(A_{m,n}\), \(B_{m,n}\) and \(C_{m,n}\). \begin{align} A+B &= B+A \\ A+B+C&=(A+B)+C=A+(B+C) \\ A + 0_{m,n} &= 0_{m,n} + A = A \end{align}
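The worked example and the properties above can be checked numerically; in NumPy (assumed here), element-wise `+` implements matrix addition:

```python
import numpy as np

A = np.array([[2, 3], [-1, 1]])
B = np.array([[1, 0], [-2, 4]])

S = A + B
print(S)  # [[ 3  3]
          #  [-3  5]]  -- matching the worked example

# Commutativity, and the zero matrix as the additive identity:
Z = np.zeros((2, 2), dtype=int)
print(np.array_equal(A + B, B + A))  # True
print(np.array_equal(A + Z, A))      # True
```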

Scalar multiplication

Multiplying a matrix by a scalar (a number) is equivalent to multiplying all elements of the matrix by that scalar. The result is a matrix of the same size as the original.

\[2\times\begin{bmatrix} 2 & 5 \\ -1 & 3 \end{bmatrix}= \begin{bmatrix} 2\times2 & 2\times5 \\ 2\times(-1) & 2\times3 \end{bmatrix}= \begin{bmatrix} 4 & 10 \\ -2 & 6 \end{bmatrix}\]
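In NumPy (again an assumption, not part of the text), multiplying an array by a scalar multiplies every element, reproducing the example above:

```python
import numpy as np

A = np.array([[2, 5], [-1, 3]])
print(2 * A)  # [[ 4 10]
              #  [-2  6]]
```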

Matrix multiplication

When multiplying a row matrix \(A_{1,n}\) by a column matrix \(B_{n,1}\), we multiply each element \(a_{1i}\) by the corresponding element \(b_{i1}\) and add the products. The result is a single number. For example, \[\begin{bmatrix} \color{red}{2} & \color{blue}{3} & \color{green}{-1} \end{bmatrix}\times \begin{bmatrix} \color{red}{1} \\ \color{blue}{-1} \\ \color{green}{2} \end{bmatrix}= \color{red}{(2)(1)}+\color{blue}{(3)(-1)}+\color{green}{(-1)(2)}=2-3-2=-3\]
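With NumPy (assumed for illustration), the `@` operator performs this row-by-column product; a \(1\times 3\) matrix times a \(3\times 1\) matrix gives a \(1\times 1\) matrix:

```python
import numpy as np

row = np.array([[2, 3, -1]])      # 1x3 row matrix
col = np.array([[1], [-1], [2]])  # 3x1 column matrix

P = row @ col
print(P)  # [[-3]]  -- a 1x1 matrix holding (2)(1)+(3)(-1)+(-1)(2)
```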

Note that the number of columns in the first matrix must match the number of rows in the second; otherwise some element would be left without a partner to multiply. In other words, we can only multiply a \(1\times n\) row matrix by an \(n\times 1\) column matrix.

When we multiply larger matrices, the same rule applies: the number of columns in the first matrix must match the number of rows in the second. To get the element in row \(i\) and column \(j\) of the result, we multiply row \(i\) from the first matrix by column \(j\) from the second. \[\begin{bmatrix} 1 & 2 \\ -3 & 1 \end{bmatrix}\times \begin{bmatrix} 2 & 1 & 0 \\ 0 & 3 & 1 \end{bmatrix}= \begin{bmatrix} (1)(2)+(2)(0) & (1)(1)+(2)(3) & (1)(0)+(2)(1) \\ (-3)(2)+(1)(0) & (-3)(1)+(1)(3) & (-3)(0)+(1)(1) \end{bmatrix}= \begin{bmatrix} 2 & 7 & 2 \\ -6 & 0 & 1 \end{bmatrix}\]
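The same `@` operator handles the general case; this sketch (NumPy assumed) reproduces the \(2\times 2\) times \(2\times 3\) example above:

```python
import numpy as np

A = np.array([[1, 2], [-3, 1]])        # 2x2
B = np.array([[2, 1, 0], [0, 3, 1]])   # 2x3

C = A @ B                              # 2x3 result
print(C)  # [[ 2  7  2]
          #  [-6  0  1]]
```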

We can only multiply an \(\color{blue}{m}\times\color{red}{n}\) matrix by an \(\color{red}{n}\times\color{blue}{k}\) matrix, and the result will be an \(\color{blue}{m}\times\color{blue}{k}\) matrix.

Matrix multiplication is generally not commutative, but it is associative. The zero matrix (with the appropriate size) acts like the zero in multiplication of numbers. Consider matrices \(A_{m,n}\), \(B_{n,k}\) and \(C_{k,r}\). \begin{align} A\times B &\color{red}{{}\neq{}} B\times A \\ A\times B\times C&=(A\times B)\times C=A\times(B\times C) \\ A \times 0_{n,p} &= 0_{m,p} \\ 0_{p,m}\times A &= 0_{p,n} \end{align}
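A quick numerical check of these properties, with small square matrices chosen by us for illustration (NumPy assumed):

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[0, 1], [1, 0]])
C = np.array([[2, 0], [1, 1]])

# Not commutative: A@B and B@A differ in general.
print(np.array_equal(A @ B, B @ A))              # False

# Associative: grouping does not matter.
print(np.array_equal((A @ B) @ C, A @ (B @ C)))  # True

# The zero matrix annihilates under multiplication.
Z = np.zeros((2, 2), dtype=int)
print(np.array_equal(A @ Z, Z))                  # True
```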

In the expression \(A\times B\) we say that \(B\) is pre-multiplied by \(A\); in \(B\times A\), we say that \(B\) is post-multiplied by \(A\). An identity matrix (with the appropriate size) acts like \(1\) in multiplication of numbers. For a matrix \(A_{m,n}\), \[A\times I_n=I_m\times A=A\]
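The identity property can be checked the same way; `np.eye(n)` builds \(I_n\) (NumPy assumed). For a \(2\times 3\) matrix, post-multiplication uses \(I_3\) and pre-multiplication uses \(I_2\):

```python
import numpy as np

A = np.array([[2, 3, -1], [1, 0, 4]])  # 2x3

print(np.array_equal(A @ np.eye(3), A))  # True: A x I_3 = A
print(np.array_equal(np.eye(2) @ A, A))  # True: I_2 x A = A
```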

A matrix can be multiplied by itself if and only if it is a square matrix. A matrix is then said to be raised to a power. For example, given a matrix \(A_{n,n}\), \begin{align} A^2 &= A\times A \\ A^t &= \overset{t\text{-times}}{\overbrace{A\times A\times \cdots\times A}} \end{align}
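NumPy (assumed) provides `np.linalg.matrix_power` for raising a square matrix to an integer power, equivalent to repeated multiplication:

```python
import numpy as np

A = np.array([[1, 1], [0, 1]])

# A^3 computed two ways: repeated @ and matrix_power.
P = np.linalg.matrix_power(A, 3)
print(np.array_equal(P, A @ A @ A))  # True
print(P)  # [[1 3]
          #  [0 1]]
```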

Matrix transpose

The transpose of a matrix \(A\), denoted by \(A^T\), is the matrix obtained by interchanging the rows and columns of \(A\). Thus, row \(1\) of \(A\) becomes column \(1\) of \(A^T\), row \(2\) becomes column \(2\), and so on. The result of transposing a matrix \(A_{m,n}\) is a matrix \(A^T_{n,m}\).

\[A=\begin{bmatrix} 2 & 3 & -1 \\ 1 & 0 & 4 \end{bmatrix};\qquad A^T=\begin{bmatrix} 2 & 1 \\ 3 & 0 \\ -1 & 4 \end{bmatrix}\]

The transpose of the transpose gives the original matrix. \[(A^T)^T=A\]
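In NumPy (assumed throughout these sketches), `A.T` gives the transpose; transposing twice returns the original matrix, as stated above:

```python
import numpy as np

A = np.array([[2, 3, -1], [1, 0, 4]])  # 2x3

T = A.T
print(T.shape)                     # (3, 2)
print(np.array_equal(T.T, A))      # True: (A^T)^T = A
```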