Canonical Forms

Throughout this chapter we shall be concerned with an $n$-dimensional vector space $V$. We shall not assume that it is a complex vector space: the scalar coefficients may lie in an arbitrary field. The treatment we shall give is taken from N. Jacobson’s Notes on Algebra.


 

1. The space of linear transformations

The set of all linear transformations defined on $V$ is an $n^2$-dimensional vector space. For if $x_1, \dots, x_n$ is a basis in $V$ we may define the linear transformations $E_{ij}$ in $V$ by $E_{ij}x_k = \delta_{jk}x_i$. (We have: $E_{ij}E_{kl}x_m = \delta_{lm}E_{ij}x_k = \delta_{lm}\delta_{jk}x_i$ so that $E_{ij}E_{kl} = \delta_{jk}E_{il}$.) The $E_{ij}$ are linearly independent, for $\sum_{i,j}\alpha_{ij}E_{ij}x_k = 0$ for all $k$ implies $\sum_i \alpha_{ik}x_i = 0$, whence $\alpha_{ik} = 0$ for all $i$, $k$. Moreover every linear transformation $A$ is a linear combination of the $E_{ij}$: if the matrix of $A$ is $(\alpha_{ij})$, so that $Ax_j = \sum_i \alpha_{ij}x_i$, then $\sum_{i,j}\alpha_{ij}E_{ij}x_k = \sum_i \alpha_{ik}x_i = Ax_k$, so that $A = \sum_{i,j}\alpha_{ij}E_{ij}$.
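The computation just made is easy to check by machine. The following sketch (Python with the sympy library; the helper `E` and the sample matrix are our own illustrations, not part of the text) verifies the multiplication rule for the matrix units and the expansion of a transformation in terms of them.

```python
from sympy import Matrix, zeros

n = 3

def E(i, j):
    # matrix unit: 1 in row i, column j, zeros elsewhere
    M = zeros(n, n)
    M[i, j] = 1
    return M

# the multiplication rule E_ij E_kl = delta_jk E_il
assert E(0, 1) * E(1, 2) == E(0, 2)
assert E(0, 1) * E(2, 0) == zeros(n, n)

# every transformation with matrix (a_ij) is the combination sum a_ij E_ij
A = Matrix([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
B = zeros(n, n)
for i in range(n):
    for j in range(n):
        B += A[i, j] * E(i, j)
assert B == A
```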

2. The minimal polynomial of a transformation

If $A$ is an arbitrary linear transformation, the $n^2 + 1$ linear transformations $1, A, A^2, \dots, A^{n^2}$ are linearly dependent, from the result of the preceding section, so that there exists a polynomial $f$, of degree at most $n^2$, for which $f(A) = 0$.

Definition 1. The polynomial $m$ of minimal degree and leading coefficient $1$ for which $m(A) = 0$ is the minimal polynomial of $A$.

We observe that it is uniquely determined by the conditions we placed on it: if $f$ is such that $f(A) = 0$, we may divide $f$ by $m$ and obtain $f = qm + r$, where the degree of $r$ is less than that of $m$. From the relation $0 = f(A) = q(A)m(A) + r(A)$ we see that $r(A) = 0$, whence, by the minimality of the degree of $m$, $r$ must be identically zero. Consequently $m$ divides any $f$ for which $f(A) = 0$: this proves that it is uniquely determined by our conditions. Moreover if $B = SAS^{-1}$, then $m(B) = S\,m(A)\,S^{-1} = 0$, so that the minimal polynomial of $B$ must be a divisor of the minimal polynomial of $A$. Interchanging the roles of $A$ and $B$ we obtain the result that the minimal polynomials of two similar transformations (i.e., of two transformations $A$ and $B$ related by an equation of the form $B = SAS^{-1}$) are equal.
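The definition suggests an algorithm: scan the powers $1, A, A^2, \dots$ for the first linear dependence. A sketch in Python with sympy (the function name and the sample matrices are ours, for illustration only):

```python
from sympy import Matrix, eye, symbols, Poly

t = symbols('t')

def minimal_polynomial(A):
    """Monic polynomial m of least degree with m(A) = 0, found by scanning
    the powers I, A, A^2, ... for the first linear dependence."""
    n = A.shape[0]
    powers = [eye(n)]
    while True:
        powers.append(powers[-1] * A)
        k = len(powers) - 1
        # stack each power as a column of length n^2 and look for a relation
        stacked = Matrix.hstack(*[P.reshape(n * n, 1) for P in powers])
        null = stacked.nullspace()
        if null:
            c = null[0] / null[0][k]      # normalize: leading coefficient 1
            return Poly(sum(c[i] * t**i for i in range(k + 1)), t)

A = Matrix([[2, 1], [0, 2]])
assert minimal_polynomial(A).as_expr() == t**2 - 4*t + 4   # (t - 2)^2

# similar transformations have equal minimal polynomials
S = Matrix([[1, 0], [1, 1]])
assert minimal_polynomial(S * A * S.inv()) == minimal_polynomial(A)
```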

3. The order of a vector

Since for an arbitrary vector $x$, the vectors $x, Ax, \dots, A^n x$, being $n+1$ in number and so more than $n$, are linearly dependent, there exists a polynomial $f$ for which $f(A)x = 0$. We verify easily, as in the previous section, that the requirements that $f$ have minimal degree and have leading coefficient one uniquely determine $f$: we call this polynomial the order of $x$ (relative to $A$) and denote it by $m_x$. Since, moreover, $m_x$ is a factor of every polynomial $f$ for which $f(A)x = 0$, the least common multiple (L.C.M.) of all the $m_x$ exists and is a factor of the minimal polynomial $m$. If $x_1, \dots, x_n$ is a basis in $V$ and $f$ is the L.C.M. of $m_{x_1}, \dots, m_{x_n}$, then for every $x = \sum_i \alpha_i x_i$ in $V$ we have $f(A)x = \sum_i \alpha_i f(A)x_i = 0$, or $f(A) = 0$, so that $f$ is divisible by $m$. It follows that $m$ is exactly equal to the L.C.M. of $m_{x_1}, \dots, m_{x_n}$, or equivalently, to the L.C.M. of all the $m_x$.
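The same scan, applied to the vectors $x, Ax, A^2x, \dots$, computes the order of a vector, and the closing statement of this section can be checked on an example (Python with sympy; the helper and the sample matrix are our own):

```python
from functools import reduce
from sympy import Matrix, symbols, Poly, lcm

t = symbols('t')

def order_of(A, x):
    """Monic polynomial f of least degree with f(A)x = 0."""
    vecs = [x]
    while True:
        vecs.append(A * vecs[-1])
        k = len(vecs) - 1
        null = Matrix.hstack(*vecs).nullspace()
        if null:
            c = null[0] / null[0][k]      # leading coefficient 1
            return Poly(sum(c[i] * t**i for i in range(k + 1)), t)

A = Matrix([[2, 0, 0], [0, 2, 0], [0, 0, 3]])
basis = [Matrix([1, 0, 0]), Matrix([0, 1, 0]), Matrix([0, 0, 1])]
orders = [order_of(A, x) for x in basis]       # t - 2, t - 2, t - 3
m = reduce(lcm, orders)                        # the minimal polynomial
assert m.as_expr() == t**2 - 5*t + 6           # (t - 2)(t - 3)
```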

4. Quotient spaces

For an arbitrary linear subspace $M$ in $V$ we write $x \equiv y \pmod{M}$ if $x - y$ is in $M$. The set of all vectors in $V$, with the same meaning of addition and scalar multiplication as before, but with the notion of equality of two vectors redefined to mean their congruence $\bmod\ M$, is a vector space: we denote it by $V/M$. We prove that if the dimension of $M$ is $m$ then that of $V/M$ is $n - m$. For let $N$ be an arbitrary linear subspace whose intersection with $M$ consists of $0$ alone and which together with $M$ spans $V$. If $y_1, \dots, y_{n-m}$ is a basis in $N$, then the $y_i$ are a basis in $V/M$. If $\sum_i \alpha_i y_i \equiv 0 \pmod{M}$, i.e., if $\sum_i \alpha_i y_i$ lies in $M$, then, since $M \cap N = \{0\}$, we must have $\sum_i \alpha_i y_i = 0$, and the linear independence of the $y_i$ implies that $\alpha_i = 0$ for all $i$. Hence the $y_i$ are linearly independent in $V/M$. To prove that they span $V/M$ we observe that every $x$ in $V$ may be written in the form $x = u + v$, with $u$ in $M$ and $v$ in $N$. In other words, since $x - v$ is in $M$, every $x$ is congruent, $\bmod\ M$, to some $v$ in $N$, as was to be proved.

If $A$ is a linear transformation which leaves $M$ invariant, i.e., for which $x$ in $M$ implies $Ax$ in $M$, then $x \equiv y \pmod{M}$ implies $Ax \equiv Ay \pmod{M}$, so that $A$ induces a linear transformation in $V/M$. The space $V/M$ is the quotient space of $V$ modulo $M$ (or by $M$).

5. Cyclic spaces

Definition 2. For any vector $x$, and given linear transformation $A$, we denote by $Z(x)$ the linear subspace spanned by all the vectors $x, Ax, A^2x, \dots$: we call such linear subspaces cyclic.

If the order of $x$ is $m_x = t^k + \alpha_{k-1}t^{k-1} + \dots + \alpha_1 t + \alpha_0$, then $x, Ax, \dots, A^{k-1}x$ are linearly independent, but together with $A^k x$ they form a linearly dependent set, so that the dimension of $Z(x)$ is $k$. The matrix of $A$ in the space $Z(x)$ with respect to the basis $x, Ax, \dots, A^{k-1}x$ (the $i$-th row of the matrix giving the coordinates of the image of the $i$-th basis vector) is

$$\begin{pmatrix} 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \vdots & & \vdots \\ 0 & 0 & 0 & \cdots & 1 \\ -\alpha_0 & -\alpha_1 & -\alpha_2 & \cdots & -\alpha_{k-1} \end{pmatrix}$$

We note that $m_x$ is the minimal polynomial of $A$ in $Z(x)$ and therefore depends only on the subspace and not on the particular generator $x$: we call $m_x$ the order of the cyclic subspace $Z(x)$.
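A sketch of this matrix in Python with sympy (the helper `companion` and the sample polynomial are ours): the characteristic polynomial of the block recovers $m_x$, and $m_x$ annihilates the block, in accordance with the remark above.

```python
from sympy import Matrix, eye, zeros, symbols

t = symbols('t')

def companion(coeffs):
    """Matrix of A on the cyclic space Z(x), in the basis x, Ax, ..., as in
    the text: coeffs = [a0, a1, ..., a_{k-1}] for m_x = t^k + ... + a1*t + a0."""
    k = len(coeffs)
    C = zeros(k, k)
    for i in range(k - 1):
        C[i, i + 1] = 1               # A sends each basis vector to the next
    for j in range(k):
        C[k - 1, j] = -coeffs[j]      # last row: A^k x = -a0 x - ... 
    return C

# sample order m_x = t^3 - 2t + 4
C = companion([4, -2, 0])
assert C.charpoly(t).as_expr() == t**3 - 2*t + 4
# m_x is also the minimal polynomial of A on Z(x): m_x(C) = 0
assert C**3 - 2*C + 4*eye(3) == zeros(3, 3)
```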

6. Vectors of maximal order

The fundamental lemma on which the work of this chapter is based is the following.

Lemma 1. Given any linear transformation $A$, there exists a cyclic subspace whose order is the minimal polynomial, $m$, of $A$.

Proof. For let $x_1, \dots, x_n$ be a basis in $V$ and let $m_i$ be the order of $x_i$. If $m_i = \prod_j p_j^{e_{ij}}$ is the factorization of $m_i$ into powers of the distinct irreducible polynomials $p_1, \dots, p_r$ and if we write $e_j = \max_i e_{ij}$, for $j = 1, \dots, r$, then $m = p_1^{e_1} \cdots p_r^{e_r}$ is the minimal polynomial of $A$, being the L.C.M. of the $m_i$. We write $y_{ij} = (m_i / p_j^{e_{ij}})(A)\,x_i$; then the order of $y_{ij}$ is $p_j^{e_{ij}}$. Hence among the $y_{ij}$ we may find vectors $z_1, \dots, z_r$ whose orders are, respectively, $p_1^{e_1}, \dots, p_r^{e_r}$. We shall prove that $z = z_1 + \dots + z_r$ has the order $m$.

For any polynomial $f$, denote by $N(f)$ the linear subspace of vectors $x$ for which $f(A)x = 0$. It is easy to verify that $N(f)$ is invariant under $A$. We assert that if $f$ and $g$ are relatively prime then $N(f)$ and $N(g)$ have only $0$ in common. For if $x$ lies in both then $m_x$ must be a divisor of both $f$ and $g$, whence $m_x = 1$, and $x = 0$. Hence, in our case, the intersection of any two of the subspaces $N(p_1^{e_1}), \dots, N(p_r^{e_r})$ consists of $0$ alone (indeed each $N(p_j^{e_j})$ meets the sum of the others in $0$ alone, the sum being contained in $N\bigl(\prod_{k \ne j} p_k^{e_k}\bigr)$), and the $N(p_j^{e_j})$ are invariant under $A$. It follows (since $f(A)z_j$ is in $N(p_j^{e_j})$) that $f(A)z = f(A)z_1 + \dots + f(A)z_r = 0$ implies $f(A)z_j = 0$ for every $j$, so that $m_z$ must be divisible by $p_1^{e_1}, \dots, p_r^{e_r}$, and therefore by $m$. This completes the proof of our lemma.

Since the order of any vector is a polynomial of degree at most $n$, we have proved that the degree of $m$ is at most $n$: if it is equal to $n$ then $V$ itself is cyclic.
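A minimal numerical illustration of the lemma (Python with sympy; the sample matrix is ours): the prime power factors of $m$ are realized as orders of basis vectors, and their sum is a vector of maximal order.

```python
from sympy import Matrix, eye, zeros

# m = (t - 2)(t - 3), with prime power factors t - 2 and t - 3; the basis
# vectors e1, e2 have these orders, and z = e1 + e2 has order m itself
A = Matrix([[2, 0], [0, 3]])
I = eye(2)
z = Matrix([1, 1])
assert (A - 2*I) * (A - 3*I) * z == zeros(2, 1)   # m(A)z = 0
assert (A - 2*I) * z != zeros(2, 1)               # no proper factor kills z
assert (A - 3*I) * z != zeros(2, 1)
```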

7. The rational canonical form

On the basis of the lemma of the preceding section we shall now prove that $V$ is the direct sum of cyclic subspaces: i.e., that we can find vectors $x_1, \dots, x_s$ of orders $m_1, \dots, m_s$ respectively (where $m_i$ is a polynomial of degree $d_i$) so that the vectors $x_1, Ax_1, \dots, A^{d_1-1}x_1$, $x_2, Ax_2, \dots, A^{d_2-1}x_2$, $\dots$, $x_s, Ax_s, \dots, A^{d_s-1}x_s$ are a basis in $V$.

Let $m_1$ be the minimal polynomial of $A$ and let $x_1$ be a vector whose order is $m_1$. Let $m_2$ be the minimal polynomial of $A$ in the quotient space $V/Z(x_1)$, and let $y$ be a vector of this space whose order (in $V/Z(x_1)$) is $m_2$. This means that $m_2(A)y \equiv 0 \pmod{Z(x_1)}$, i.e., that $m_2(A)y$ is in $Z(x_1)$, so that $m_2(A)y = f(A)x_1$ for a suitable polynomial $f$.

Since $m_1(A) = 0$ in $V$ (and, a fortiori, in $V/Z(x_1)$), $m_2$ must be a factor of $m_1$: say $m_1 = g\,m_2$. We have then $0 = m_1(A)y = g(A)\,m_2(A)\,y = g(A)f(A)x_1$, so that $gf$ must be divisible by the order $m_1 = g\,m_2$ of $x_1$:

$$gf \equiv 0 \pmod{g\,m_2}.$$

It follows (upon division by $g$) that

$$f = h\,m_2$$

for a suitable polynomial $h$.

We define $x_2 = y - h(A)x_1$. Since $x_2 \equiv y \pmod{Z(x_1)}$, the order of $x_2$ in $V/Z(x_1)$ is the same as that of $y$, i.e., it is $m_2$. We have moreover

$$m_2(A)x_2 = m_2(A)y - h(A)\,m_2(A)\,x_1 = f(A)x_1 - f(A)x_1 = 0,$$

so that the order of $x_2$ in $V$ is also $m_2$.

The intersection of the linear subspaces $Z(x_1)$ and $Z(x_2)$ consists of zero alone, for $f(A)x_2$ in $Z(x_1)$ implies that $m_2$ is a divisor of $f$, and therefore that $f(A)x_2 = 0$.

The above constitutes the first step of an induction proof: we go on to indicate the second step.

Denote by $Z$ the linear subspace spanned together by $Z(x_1)$ and $Z(x_2)$ and consider the quotient space $V/Z$. Let $y$ be a vector whose order (in $V/Z$) is the minimal polynomial, $m_3$, of $A$ in this space. Then $m_3(A)y = f_1(A)x_1 + f_2(A)x_2$ (where the auxiliary polynomials $f_1$, $f_2$ are not related to the $f$ used above). Since $m_2(A) = 0$ in $V/Z(x_1)$, and therefore in $V/Z$, $m_3$ must divide $m_2$: say $m_2 = g\,m_3$. Hence $0 = m_2(A)y = g(A)f_1(A)x_1 + g(A)f_2(A)x_2$, so that (the intersection of $Z(x_1)$ and $Z(x_2)$ being $0$) $g(A)f_1(A)x_1 = g(A)f_2(A)x_2 = 0$, whence, finally, $gf_2$ is divisible by the order $m_2 = g\,m_3$ of $x_2$, and $f_2 = h_2\,m_3$ for a suitable polynomial $h_2$.

Similarly, since $m_1 = (m_1/m_2)\,m_2 = (m_1/m_2)\,g\,m_3$, $gf_1$ must be divisible by the order $m_1$ of $x_1$, so that $f_1$ is divisible by $(m_1/m_2)\,m_3$ and, in particular, by $m_3$. Consequently $f_1 = h_1\,m_3$, so that

$$m_3(A)y = m_3(A)h_1(A)x_1 + m_3(A)h_2(A)x_2.$$

We write $x_3 = y - h_1(A)x_1 - h_2(A)x_2$; then $x_3 \equiv y \pmod{Z}$, so that the order of $x_3$ in $V/Z$ is still $m_3$; and moreover

$$m_3(A)x_3 = m_3(A)y - m_3(A)h_1(A)x_1 - m_3(A)h_2(A)x_2 = 0,$$

so that the order of $x_3$ in $V$ is $m_3$.

It follows that the linear subspace $Z(x_3)$ has only the vector $0$ in common with $Z$, for $f(A)x_3$ in $Z$ implies that $f$ is divisible by $m_3$ and therefore that $f(A)x_3 = 0$.

Proceeding in this way by induction we obtain finally the desired result. The polynomials $m_1, m_2, \dots, m_s$ (which incidentally have the property that each divides all preceding ones) are the invariant factors of $A$.

Using the basis $x_1, Ax_1, \dots, A^{d_1-1}x_1, \dots, x_s, Ax_s, \dots, A^{d_s-1}x_s$, the matrix of $A$ has the so-called rational canonical form:

$$\begin{pmatrix} C_1 & & & \\ & C_2 & & \\ & & \ddots & \\ & & & C_s \end{pmatrix}$$

where each $C_i$ has the form described in (V.5) above, belonging to the polynomial $m_i$.

It is easily verified, using this form, that the characteristic polynomial, $\det(tI - A)$, is the product of all the $m_i$ and is therefore divisible by the minimal polynomial $m_1$. As a corollary, therefore, we obtain the Hamilton-Cayley theorem:

Theorem 1. For every $A$, $p(A) = 0$, where $p$ is the characteristic polynomial of $A$.
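A machine check of the theorem on a sample matrix (Python with sympy; the matrix is our choice), evaluating $p$ at $A$ by Horner's rule:

```python
from sympy import Matrix, eye, zeros, symbols

t = symbols('t')

A = Matrix([[1, 2], [3, 4]])
p = A.charpoly(t)                 # t^2 - 5t - 2
# evaluate p at the matrix A; the theorem asserts the result is zero
result = zeros(2, 2)
for c in p.all_coeffs():
    result = result * A + c * eye(2)
assert result == zeros(2, 2)
```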

8. Elementary transformations

In the present section we consider matrices (not linear transformations) whose elements depend on a parameter $t$.

Definition 3. Let $M(t)$ be a matrix whose elements are polynomials in $t$, and let $D_k(t)$ be the highest common factor (H.C.F.), with leading coefficient $1$, of all $k$-rowed minors of $M(t)$. $D_k$ is the $k$-th determinantal factor of $M(t)$.

Two matric polynomials, $M(t)$ and $N(t)$, are called equivalent if there exist non-singular matric polynomials $P(t)$ and $Q(t)$ (i.e., ones whose determinants are non-zero constants) for which $N = PMQ$. Since the rows (or columns) of $N$ are linear combinations of the rows (or columns) of $M$, it follows that the $k$-rowed minors of $N$ are linear combinations of the $k$-rowed minors of $M$ and therefore that the $k$-th determinantal factor of $N$ is divisible by that of $M$. Interchanging the roles of $M$ and $N$ we see that if $M$ is equivalent to $N$ then their determinantal factors are equal.
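The determinantal factors can be computed directly from the definition (Python with sympy; the helper and the sample matrix are ours):

```python
from itertools import combinations
from functools import reduce
from sympy import Matrix, eye, symbols, gcd

t = symbols('t')

def determinantal_factor(M, k):
    """Monic H.C.F. of all k-rowed minors of the matric polynomial M."""
    minors = [M.extract(list(r), list(c)).det()
              for r in combinations(range(M.rows), k)
              for c in combinations(range(M.cols), k)]
    d = reduce(gcd, minors)
    return d.as_poly(t).monic().as_expr()

M = t * eye(2) - Matrix([[1, 2], [3, 4]])
assert determinantal_factor(M, 1) == 1           # H.C.F. of the entries
assert determinantal_factor(M, 2) == t**2 - 5*t - 2
```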

The matric polynomial $N(t)$ is obtained from $M(t)$ by elementary transformations if it is obtained by

  1. the interchange of any two rows (or columns),
  2. the multiplication of any row (or column) by a constant different from $0$, or
  3. the addition of the $f(t)$-fold of any row (or column) to any other row (or column), where $f(t)$ is an arbitrary polynomial.

It is easily verified that the elementary transformations can be effected by suitable multiplication by non-singular matric polynomials, so that if $N$ is obtained from $M$ by elementary transformations then $M$ and $N$ are equivalent. In particular therefore $M$ and $N$ have the same determinantal factors.

9. Similarity of linear transformations

To apply the results of the preceding section to linear transformations, let $A$, $B$, and $S$ be linear transformations connected by the equation $B = SAS^{-1}$. Then in any coordinate system we have the matrix equation $tI - B = S(tI - A)S^{-1}$, so that the determinantal factors of $tI - A$ and $tI - B$ are equal. In other words the determinantal factors of $tI - A$ are invariant under a change of basis. Hence if for any linear transformation $A$ we want to compute the determinantal factors of $tI - A$, we may assume that the matrix of $A$ is in rational canonical form. The generic block of $tI - A$ will then have the form

$$\begin{pmatrix} t & -1 & 0 & \cdots & 0 \\ 0 & t & -1 & \cdots & 0 \\ \vdots & & \ddots & \ddots & \vdots \\ 0 & 0 & \cdots & t & -1 \\ \alpha_0 & \alpha_1 & \cdots & \alpha_{k-2} & t + \alpha_{k-1} \end{pmatrix}$$

(corresponding to the invariant factor $m_i = t^k + \alpha_{k-1}t^{k-1} + \dots + \alpha_0$).

Adding to the first column $t$ times the second, $t^2$ times the third, etc., we obtain

$$\begin{pmatrix} 0 & -1 & 0 & \cdots & 0 \\ 0 & t & -1 & \cdots & 0 \\ \vdots & & \ddots & \ddots & \vdots \\ 0 & 0 & \cdots & t & -1 \\ m_i(t) & \alpha_1 & \cdots & \alpha_{k-2} & t + \alpha_{k-1} \end{pmatrix}$$

and by elementary transformations on the rows this becomes

$$\begin{pmatrix} m_i(t) & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{pmatrix}.$$

After these transformations on each block, the matrix of $tI - A$ will have the diagonal form

$$\operatorname{diag}\bigl(m_1(t), 1, \dots, 1,\; m_2(t), 1, \dots, 1,\; \dots,\; m_s(t), 1, \dots, 1\bigr).$$

It follows that $D_{n-j} = m_{j+1}m_{j+2}\cdots m_s$ for $j = 0, 1, \dots, s-1$ (each non-zero $(n-j)$-rowed minor of this diagonal matrix being a product of $n-j$ diagonal entries, whose H.C.F. is obtained by omitting $m_1, \dots, m_j$), and $D_{n-j} = 1$ for $j \ge s$. Hence $m_j = D_{n-j+1}/D_{n-j}$ for $j = 1, \dots, s$, which shows, incidentally, that the invariant factors are invariant under change of basis and that we may therefore speak of the invariant factors of a linear transformation. As an easy consequence of our results we obtain the fact that a necessary and sufficient condition that two linear transformations $A$ and $B$ be similar, i.e., that there exist a non-singular linear transformation $S$ for which $B = SAS^{-1}$, is that $A$ and $B$ have the same invariant factors. The necessity of this condition we have just proved; its sufficiency follows from the fact that if $A$ and $B$ have the same invariant factors then their matrices, in two suitably chosen coordinate systems, will have the same rational canonical form.
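An illustration of the quotient formula $m_j = D_{n-j+1}/D_{n-j}$ (Python with sympy; the scalar transformation $2I$ is our sample):

```python
from functools import reduce
from sympy import Matrix, eye, symbols, gcd, cancel

t = symbols('t')

# for A = 2I, tI - A = diag(t-2, t-2): D_1 = t - 2, D_2 = (t - 2)^2,
# so both invariant factors equal t - 2
M = t * eye(2) - 2 * eye(2)
D1 = reduce(gcd, list(M))         # H.C.F. of the 1-rowed minors
D2 = M.det()
m1 = cancel(D2 / D1)              # m_1 = D_2 / D_1, the minimal polynomial
m2 = D1                           # m_2 = D_1 / D_0, with D_0 = 1
assert m1 == t - 2 and m2 == t - 2
```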

10. Elementary divisors

Let the minimal polynomial of $A$ be $m = q_1 q_2 \cdots q_r$, where the $q_j = p_j^{e_j}$ are the prime power factors of $m$, and write $g_j = m/q_j$. Then the H.C.F. of all the $g_j$ is $1$, so that we may find polynomials $h_j$ for which

$$g_1 h_1 + g_2 h_2 + \dots + g_r h_r = 1.$$

Let $V_j$ be the range of $g_j(A)h_j(A)$, i.e., the linear subspace of all vectors of the form $g_j(A)h_j(A)x$. If $y$ is in $V_j$ then $q_j(A)y = 0$ (since $q_j g_j = m$), and if $x$ is any vector then $x = \sum_j g_j(A)h_j(A)x = \sum_j y_j$, where $y_j$ is in $V_j$. If $y$ lies in the intersection of $V_j$ with the sum of the remaining $V_k$, then $q_j(A)y = 0$ and $g_j(A)y = 0$ (each $q_k$, $k \ne j$, being a factor of $g_j$); since $q_j$ and $g_j$ are relatively prime we may find polynomials $u$ and $v$ for which $u q_j + v g_j = 1$, so that

$$y = u(A)q_j(A)y + v(A)g_j(A)y = 0.$$

To sum up: $V$ is the direct sum of $V_1, \dots, V_r$, i.e., the linear subspaces $V_j$ span $V$ and each contains only the vector $0$ in its intersection with the sum of the others. Moreover the minimal polynomial of $A$ in the invariant linear subspace $V_j$ is exactly $q_j$: it divides $q_j$, and if it were a proper divisor, the product of the minimal polynomials of $A$ in $V_1, \dots, V_r$ would be a proper divisor of $m$ annihilating all of $V$.
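An illustration of the decomposition (Python with sympy; the sample matrix, with $m = (t-2)(t-3)$, is ours): here $g_1 = t-3$, $g_2 = t-2$, and $(-1)g_1 + (1)g_2 = 1$, so the components are cut out by $g_1(A)h_1(A) = -(A-3I)$ and $g_2(A)h_2(A) = A-2I$.

```python
from sympy import Matrix, eye, zeros

A = Matrix([[2, 0, 0], [0, 2, 0], [0, 0, 3]])
I = eye(3)
P1 = -(A - 3*I)                          # component of x in V1
P2 = A - 2*I                             # component of x in V2
assert P1 + P2 == I                      # every x is the sum of components
assert (A - 2*I) * P1 == zeros(3, 3)     # q1(A) annihilates V1 = range(P1)
assert (A - 3*I) * P2 == zeros(3, 3)     # q2(A) annihilates V2 = range(P2)
```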

If we apply the above considerations to each cyclic subspace of $V$ that is exhibited by the rational canonical form we obtain an even finer decomposition of $V$ into cyclic subspaces whose orders are the prime power factors of the invariant factors. These orders are called the elementary divisors of $A$. The elementary divisors are determined by the invariant factors: we show that the converse is true. If $V$ is the direct sum of cyclic subspaces of orders $f_1, \dots, f_N$ which are powers of irreducible polynomials, and if for each irreducible polynomial $p$ the notation is so chosen that the powers of $p$ occurring among the $f_i$ are $p^{c_1(p)}, p^{c_2(p)}, \dots$ with $c_1(p) \ge c_2(p) \ge \cdots$, then it is easy to verify that the $j$-th invariant factor, $m_j$, of $A$ is

$$m_j = \prod_p p^{c_j(p)},$$

the product being extended over the distinct irreducible polynomials $p$ that occur (with $c_j(p) = 0$ when fewer than $j$ powers of $p$ occur among the $f_i$).

From the unique factorization theorem it follows that the $f_i$ are the elementary divisors of $A$, and therefore that the elementary divisors determine the invariant factors. It follows easily from our preceding results that a necessary and sufficient condition that two linear transformations be similar is that they have the same elementary divisors.

11. The classical canonical form

In the preceding sections we have decomposed $V$ into a direct sum of cyclic subspaces and then decomposed each subspace into a direct sum of smaller cyclic subspaces whose orders are powers of irreducible polynomials. In the present section we shall indicate one more decomposition, of these last prime power spaces, which will not in general be into cyclic spaces.

Consider a cyclic space $Z(x)$ of order $p^e$, where $p$ is an irreducible polynomial of degree $d$. For $i = 0, 1, \dots, e-1$, and $j = 0, 1, \dots, d-1$, we write

$$y_{ij} = A^j p(A)^i x.$$

Since the degrees of the polynomials $t^j p(t)^i$ are all different and less than $de$, the $y_{ij}$ are linearly independent and therefore form a basis for $Z(x)$. In this basis the matrix of $A$ has the form

$$\begin{pmatrix} B & N & & \\ & B & N & \\ & & \ddots & N \\ & & & B \end{pmatrix}$$

where $B$ is the $d$-rowed matrix, of the form described in (V.5), belonging to the polynomial $p$, and $N$ is the $d$-rowed matrix whose only non-zero element is a $1$ in the lower left-hand corner.

In case the ground field is algebraically closed, so that the only irreducible polynomials are those of degree one (i.e., $p(t) = t - \lambda$), this matrix reduces to

$$\begin{pmatrix} \lambda & 1 & & \\ & \lambda & 1 & \\ & & \ddots & 1 \\ & & & \lambda \end{pmatrix}.$$

The matrix of $A$ in the entire space will then consist of a number of blocks of this type on the main diagonal: this form of the matrix of $A$ is called the classical canonical (or Jordan) form.
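Over an algebraically closed field sympy computes this form directly; a sketch (Python with sympy; the sample matrices, a transformation similar to one $2$-block at $2$ plus one $1$-block at $3$, are ours):

```python
from sympy import Matrix

S = Matrix([[1, 0, 0], [1, 1, 0], [0, 1, 1]])
B = S * Matrix([[2, 1, 0], [0, 2, 0], [0, 0, 3]]) * S.inv()
P, J = B.jordan_form()                 # B = P J P^{-1}
assert B == P * J * P.inv()
assert sorted(J.diagonal()) == [2, 2, 3]
assert J[0, 1] + J[1, 2] == 1          # exactly one superdiagonal 1
```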

As an application of these ideas we mention that (in the case of an algebraically closed field) a necessary and sufficient condition that a matrix be similar to a diagonal matrix is that the roots of the minimal equation be simple, i.e., have multiplicity one (equivalently, that the elementary divisors be of degree one). For if these roots are simple then the blocks of the classical canonical form will consist of only one element each, so that no superdiagonal $1$'s appear, and the matrix is diagonal. This proves the sufficiency of our condition. To prove necessity we observe that for a diagonal matrix whose diagonal elements are $\lambda_1, \dots, \lambda_n$ the product of the factors $t - \lambda_i$, extended over a set of subscripts chosen so as to contain every distinct eigenvalue exactly once, is precisely the minimal polynomial, which therefore has simple roots. As a particular application of this result we see that every idempotent linear transformation $E$, which satisfies the equation $E^2 = E$, may be realized by a diagonal matrix whose diagonal elements consist of only $0$'s and $1$'s.
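A check of the last remark (Python with sympy; the sample idempotent is ours): the minimal polynomial of an idempotent divides $t^2 - t$, whose roots $0$ and $1$ are simple.

```python
from sympy import Matrix

E = Matrix([[1, 1], [0, 0]])   # E^2 = E, but E is not itself diagonal
assert E * E == E
P, J = E.jordan_form()
# the classical form is diagonal with only 0's and 1's
assert J in (Matrix([[0, 0], [0, 1]]), Matrix([[1, 0], [0, 0]]))
```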