Eigendecomposition as Change of Basis

Mar 28 2016

This is part of a series of notes taken during my review of linear algebra, using Axler's excellent textbook Linear Algebra Done Right, which will be heavily referenced.

$$ \newcommand{\X}{X} \newcommand{\Y}{Y} \newcommand{\la}{\langle} \newcommand{\ra}{\rangle} \newcommand{\bv}{\mathbf{v}} \newcommand{\bu}{\mathbf{u}} \newcommand{\bw}{\mathbf{w}} \newcommand{\be}{\mathbf{e}} \newcommand{\bs}{\mathbf{s}} \newcommand{\bff}{\mathbf{f}} $$

As before, we work with a finite-dimensional vector space \(V\), and reuse some of the notations from an earlier post on change of basis.

Recall that an operator \(T \in \mathcal{L}(V)\) is diagonalizable if and only if there exists a basis \(\be = (e_1,...,e_n)\) consisting entirely of eigenvectors of \(T\). That is, \(T\) has a diagonal matrix

$$[T]_\be = \begin{bmatrix} \lambda_1 & &\\ & \ddots &\\ & & \lambda_n \end{bmatrix} $$

with respect to basis \(\be = e_1,...,e_n\) if and only if \(Te_j=\lambda_j e_j\) for each \(j\). Equivalently, the existence of such a basis \(\be = e_1,...,e_n\), together with any arbitrary basis \(\bff\) of \(V\) (which always exists), lets the matrices of the identity operator \(I\) with respect to these two bases "diagonalize" the matrix \([T]_{\bff}\):

$$[T]_\be= [I]_\bff^\be [T]_\bff [I]_\be^\bff$$

In words: a linear operator has a particularly simple matrix description in its eigenbasis. Most commonly, the arbitrary basis \(\bff\) is chosen to be the standard basis \(\bs\) of \(V\) (recall \(\be\) is already taken to denote the eigenbasis, i.e., the basis of eigenvectors). Let \(\Lambda = [T]_\be\), \(A=[T]_\bs\), and \(S=[I]_\be^\bs\), the matrix that expresses the eigenbasis in terms of the standard basis; then we have the famous factorizations:

$$\Lambda = S^{-1} A S $$
$$A = S \Lambda S^{-1} $$
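
As a quick numerical sanity check, here is a minimal NumPy sketch of both factorizations; the matrix \(A\) is a toy example of my own choosing, not anything from the text:

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])      # A = [T]_s: T written in the standard basis

eigvals, S = np.linalg.eig(A)   # columns of S are eigenvectors, so S = [I]_e^s
Lam = np.diag(eigvals)          # Lambda = [T]_e

# Lambda = S^{-1} A S   and   A = S Lambda S^{-1}
assert np.allclose(np.linalg.inv(S) @ A @ S, Lam)
assert np.allclose(S @ Lam @ np.linalg.inv(S), A)
```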

In the special case of a normal operator, the spectral theorem lets us choose the eigenbasis \(\be\) to be orthonormal, so \(Q=[I]_\be^\bs\) is unitary and \(Q^{-1}=Q^*\); the factorizations become even nicer:

$$\Lambda = Q^* A Q $$
$$A = Q \Lambda Q^* $$
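
The same check for the normal case, here using a real symmetric matrix (again a toy example of mine), so that NumPy's `eigh` returns an orthonormal eigenbasis:

```python
import numpy as np

B = np.array([[2.0, 1.0],
              [1.0, 2.0]])      # symmetric, hence normal

w, Q = np.linalg.eigh(B)        # eigh guarantees orthonormal eigenvector columns
Lam = np.diag(w)

assert np.allclose(Q.T @ Q, np.eye(2))   # Q^{-1} = Q^T (Q^* in the complex case)
assert np.allclose(Q @ Lam @ Q.T, B)
```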

Let's look at the decomposition \([T]_\bff= [I]_\be^\bff [T]_\be [I]_\bff^\be\) and see what it's really saying. As before, \(\bff\) is an arbitrary basis of \(V\), but let's choose it to be \(\bs\), since \(\bs\) is the default basis for matrices, and let \([x]_\bs\) be the coordinates of a vector \(x\) in this basis (trivially obtained: if \(x=(1,2,3)\), then \([x]_\bs=[1 \quad 2 \quad 3]^\top\)). \(T\) acts on \(x\), and we seek the resulting vector \(Tx\), which naturally corresponds to \([Tx]_\bs\), the coordinates of \(Tx\) in \(\bs\). The answer is the matrix product \([T]_\bs [x]_\bs = [I]_\be^\bs [T]_\be [I]_\bs^\be [x]_\bs\), which decomposes into a series of three multiplications. First, \(S^{-1} =[I]_\bs^\be\) translates the coordinates of \(x\) from \(\bs\) to \(\be\):

$$[I]_\bs^\be [x]_\bs = S^{-1}[x]_\bs = [x]_\be = \begin{bmatrix} c_1 & ... & c_n \end{bmatrix}^\top $$

Next, \(\Lambda = [T]_\be\) describes the stretching of individual eigenvectors, corresponding to the rescaling of coordinates by eigenvalues:

$$ [T]_\be [I]_\bs^\be [x]_\bs = [T]_\be [x]_\be = \Lambda \begin{bmatrix} c_1 & ... & c_n \end{bmatrix}^\top = \begin{bmatrix} \lambda_1 c_1 & ... & \lambda_n c_n \end{bmatrix}^\top $$

In the last step, \(S=[I]_\be^\bs\) translates the resulting "eigen-coordinates" above back to the original basis \(\bs\):

$$ [Tx]_\bs = [T]_\bs [x]_\bs = [I]_\be^\bs [T]_\be [I]_\bs^\be [x]_\bs = [I]_\be^\bs [T]_\be [x]_\be = S \begin{bmatrix} \lambda_1 c_1 & ... & \lambda_n c_n \end{bmatrix}^\top = c_1 \lambda_1 [e_1]_\bs + \cdots + c_n \lambda_n [e_n]_\bs $$

Remember, column \(j\) of \(S= [I]_\be^\bs\) is, by definition, \([e_j]_\bs\): the \(j\)th eigenvector of \(T\) expressed in the original basis \(\bs\).
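
To close the loop, here is the same three-step walk in NumPy, reusing the toy matrix \(A\) from the first snippet; the test vector \(x\) is an arbitrary choice of mine:

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])
eigvals, S = np.linalg.eig(A)

x = np.array([1.0, 2.0])          # [x]_s, coordinates in the standard basis

c = np.linalg.solve(S, x)         # step 1: [x]_e = S^{-1} [x]_s  (the c_j)
scaled = eigvals * c              # step 2: rescale coordinate j by lambda_j (apply Lambda)
Tx = S @ scaled                   # step 3: back to s: sum_j c_j lambda_j [e_j]_s

assert np.allclose(Tx, A @ x)     # matches applying [T]_s directly
```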