Dual Spaces and Tensor Products


 

1. Transformations of rank one

Before beginning the proper subject matter of the present chapter we digress to a discussion of interest in itself whose results we shall need later. It follows easily from the spectral theory of normal transformations (or, equivalently, from the possibility of representing a normal transformation by a diagonal matrix) that every normal transformation is a sum of normal transformations of rank one; similarly every Hermitian (or non-negative) transformation is a sum of Hermitian (or non-negative) transformations of rank one. It becomes, therefore, of interest to investigate transformations of rank one.

Theorem 1. A necessary and sufficient condition that a linear transformation $A$ have rank one is that in every coordinate system the matrix $(\alpha_{ij})$ of $A$ have the form $\alpha_{ij} = \beta_i\gamma_j$.

Proof. If $A$ has rank one then the set of all vectors of the form $Ax$ is one dimensional, so that there exists a vector $y$ with the property that $Ax$ is for every $x$ a constant multiple (depending on $x$) of $y$: $Ax = \phi(x)\,y$. (It is easily verified that $\phi(x)$ is a linear function of $x$, but we shall not need this fact.) Now if $\{x_1, \dots, x_n\}$ is a coordinate system the matrix $(\alpha_{ij})$ of $A$ is characterized by $Ax_j = \sum_i \alpha_{ij} x_i$, whence, writing $y = \sum_i \beta_i x_i$ and $\gamma_j = \phi(x_j)$, we have $Ax_j = \phi(x_j)\,y = \sum_i \beta_i\gamma_j x_i$, so that

$$\alpha_{ij} = \beta_i\gamma_j.$$

Conversely if $\alpha_{ij} = \beta_i\gamma_j$, we may find a linear function $\phi$ for which $\phi(x_j) = \gamma_j$, and we may define $y = \sum_i \beta_i x_i$. The linear transformation $A$ defined by $Ax = \phi(x)\,y$ is clearly of rank one (if not all the $\alpha_{ij}$ vanish), and we have $Ax_j = \phi(x_j)\,y = \sum_i \beta_i\gamma_j x_i$, so that the matrix of $A$ is $(\alpha_{ij})$.

If $A$ has rank one and is Hermitian, and if the matrix of $A$ in some coordinate system is $(\alpha_{ij}) = (\beta_i\gamma_j)$, then we must have $\beta_i\gamma_j = \bar\beta_j\bar\gamma_i$. If, for some $i$, $\beta_i \neq 0$ and $\gamma_i = 0$, then $\beta_i\gamma_j = \bar\beta_j\bar\gamma_i = 0$, and hence $\gamma_j = 0$, for all $j$, whence $A = 0$. Since we assumed that the rank of $A$ is one, this is impossible and we can find an $i$ for which $\beta_i \neq 0$ and $\gamma_i \neq 0$. Using this $i$, the relation $\beta_i\gamma_j = \bar\beta_j\bar\gamma_i$ implies that $\beta_j = \rho\,\bar\gamma_j$ with some constant $\rho = \bar\beta_i/\gamma_i$ independent of $j$. Since the diagonal elements of a Hermitian matrix, i.e., the $\alpha_{jj} = \rho\,|\gamma_j|^2$, are real, we can even conclude that $\rho$ is real, so that in this case $(\alpha_{ij})$ has the form $\alpha_{ij} = \rho\,\bar\gamma_i\gamma_j$ with a real $\rho$.

If $A$ has rank one and is non-negative, then the discussion of the preceding paragraph applies, and the fact that the diagonal elements $\alpha_{jj} = \rho\,|\gamma_j|^2$ of a non-negative matrix are non-negative implies that $\rho$ is non-negative. In this case we may write $\beta_i = \sqrt{\rho}\,\bar\gamma_i$, and the relation $\alpha_{ij} = \rho\,\bar\gamma_i\gamma_j$ shows that $(\alpha_{ij})$ has the form $\alpha_{ij} = \beta_i\bar\beta_j$.

It is easy to see that the conditions given in the last two paragraphs are not only necessary but also sufficient. If $\alpha_{ij} = \rho\,\bar\gamma_i\gamma_j$ with a real $\rho$, then clearly $(\alpha_{ij})$ is Hermitian and has rank one. If, moreover, $\rho \geq 0$, so that $\alpha_{ij} = \beta_i\bar\beta_j$ with $\beta_i = \sqrt{\rho}\,\bar\gamma_i$, and if $x = \sum_j \xi_j x_j$, then

$$(Ax, x) = \sum_i\sum_j \alpha_{ij}\xi_j\bar\xi_i = \Big(\sum_i \beta_i\bar\xi_i\Big)\Big(\sum_j \bar\beta_j\xi_j\Big) = \Big|\sum_j \bar\beta_j\xi_j\Big|^2 \geq 0,$$

so that $A$ is non-negative.
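
These computations are easy to check numerically. The following is a minimal sketch in Python with NumPy (our own illustration, with randomly chosen data; all names are invented): it builds a matrix of the form $\alpha_{ij} = \beta_i\bar\beta_j$ and verifies that it is Hermitian, of rank one, and non-negative.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
beta = rng.standard_normal(n) + 1j * rng.standard_normal(n)

# Rank-one matrix alpha_ij = beta_i * conj(beta_j).
A = np.outer(beta, beta.conj())

assert np.allclose(A, A.conj().T)        # Hermitian
assert np.linalg.matrix_rank(A) == 1     # rank one

# (Ax, x) = |sum_j conj(beta_j) xi_j|^2 >= 0 for every x.
for _ in range(100):
    x = rng.standard_normal(n) + 1j * rng.standard_normal(n)
    quad = np.vdot(x, A @ x)             # (Ax, x) in the text's notation
    assert abs(quad.imag) < 1e-10 and quad.real >= -1e-10
```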

2. The Hadamard product of non-negative matrices

As a consequence of the preceding section it is very easy to prove a remarkable theorem on non-negative matrices, due to I. Schur.

Theorem 2. If $A$ and $B$ are non-negative linear transformations whose matrices in some coordinate system are $(\alpha_{ij})$ and $(\beta_{ij})$ respectively, then the linear transformation $C$ whose matrix $(\gamma_{ij})$ in this coordinate system is defined by $\gamma_{ij} = \alpha_{ij}\beta_{ij}$ is also non-negative.

Proof. Since we may write both $A$ and $B$ as a sum of non-negative transformations of rank one, $C$ may be written as a sum of transformations the matrices of which are obtained from the matrices of two non-negative transformations of rank one in the same way as the matrix of $C$ was obtained from the matrices of $A$ and $B$. Since a sum of non-negative transformations is non-negative, it is therefore sufficient to prove the theorem in the case where $A$ and $B$ both have rank one. In this case $\alpha_{ij} = \mu_i\bar\mu_j$, $\beta_{ij} = \nu_i\bar\nu_j$, and therefore $\gamma_{ij} = \alpha_{ij}\beta_{ij} = \pi_i\bar\pi_j$, where $\pi_i = \mu_i\nu_i$, whence it follows that $C$ is non-negative (and has rank one).
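
A minimal numerical sketch of Schur's theorem (ours, assuming nothing beyond NumPy): two non-negative matrices are assembled as sums of rank-one pieces, exactly as in the proof, and the entrywise product is checked to be non-negative.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5

def random_nonnegative(n, rank=3):
    # Sum of rank-one non-negative matrices mu mu^* (cf. section 1).
    return sum(np.outer(m, m.conj())
               for m in (rng.standard_normal((rank, n))
                         + 1j * rng.standard_normal((rank, n))))

A = random_nonnegative(n)
B = random_nonnegative(n)
C = A * B                                 # Hadamard (entrywise) product

# A non-negative matrix is Hermitian with non-negative eigenvalues.
assert np.allclose(C, C.conj().T)
assert np.linalg.eigvalsh(C).min() >= -1e-10
```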

3. The dual space of a vector space

Definition 1. Let $V$ be an arbitrary vector space; we denote by $V'$ the set of all linear functions defined on $V$. If in $V'$, $0$, $y' + z'$, and $\alpha y'$ are defined by $0(x) = 0$, $(y' + z')(x) = y'(x) + z'(x)$, and $(\alpha y')(x) = \alpha\,y'(x)$ respectively, then $V'$ becomes a vector space: we call $V'$ the dual space of $V$.

In the present chapter we shall discuss the theory of dual spaces. We call attention to the fact that all our definitions and theorems will be phrased without reference to any basis or coordinate system and that, although we shall make liberal use of bases, we use them only when that is unavoidable: namely in considerations of dimensionality, where bases enter by definition. Throughout this chapter we shall mean by a basis a linear basis, i.e., a maximal set of linearly independent elements; in case $V$ is an inner product space we shall in each case specify whether or not we need an orthogonal basis.

If $V$ is of dimension $n$ so is $V'$. For let $\{x_1, \dots, x_n\}$ be a basis in $V$. For each $i$ we may define a linear function $x_i'$ by the requirement that $x_i'(x_j) = \delta_{ij}$. Then $\sum_i \alpha_i x_i' = 0$, i.e., $\sum_i \alpha_i x_i'(x_j) = 0$ for all $j$, implies $\alpha_j = 0$ for all $j$, so that the $x_i'$ are linearly independent. Moreover if $y'$ is arbitrary in $V'$ and $x = \sum_j \xi_j x_j$ in $V$, then $y'(x) = \sum_j \xi_j\,y'(x_j)$ and $x_i'(x) = \xi_i$, so that $y' = \sum_i y'(x_i)\,x_i'$. In other words $\{x_1', \dots, x_n'\}$ is a basis in $V'$, so that $V'$ has dimension $n$.
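
In coordinates the dual basis can be computed explicitly: if the basis vectors $x_j$ are taken as the columns of an invertible matrix, the rows of the inverse matrix represent the functionals $x_i'$. A short sketch (our own illustration; the names are invented):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 4
M = rng.standard_normal((n, n))          # columns are the basis x_1, ..., x_n

dual = np.linalg.inv(M)                  # row i represents the functional x_i'

# x_i'(x_j) = delta_ij:
assert np.allclose(dual @ M, np.eye(n))

# Any functional y' (a row vector) expands as y' = sum_i y'(x_i) x_i'.
y_prime = rng.standard_normal(n)
coeffs = y_prime @ M                     # the values y'(x_i)
assert np.allclose(coeffs @ dual, y_prime)
```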

Since $V$ and $V'$ are both $n$ dimensional vector spaces it is possible, in many ways, to set up a one to one correspondence between them that preserves sums and scalar products: in other words $V$ and $V'$ are isomorphic. These isomorphisms, however, are perfectly arbitrary and yield no information about the structure of vector spaces. If, however, we consider not $V'$ but its dual space, which we may denote by $V''$ (i.e., $V''$ is the set of all linear functions defined on $V'$), then it is possible to set up a 'natural' isomorphism between $V$ and $V''$.

Given any vector $x_0$ in $V$, we make correspond to it an element $x_0''$ in $V''$ by defining, for every $y'$ in $V'$, $x_0''(y') = y'(x_0)$. (It is easy to verify that $x_0''(y')$ is indeed a linear function of $y'$.) The correspondence $x_0 \to x_0''$ is linear: i.e., if $x_0 \to x_0''$, $y_0 \to y_0''$, and $z_0 = \alpha x_0 + \beta y_0$, then $z_0'' = \alpha x_0'' + \beta y_0''$. For, by definition we have for each $y'$ in $V'$,

$$z_0''(y') = y'(z_0) = y'(\alpha x_0 + \beta y_0) = \alpha\,y'(x_0) + \beta\,y'(y_0) = \alpha\,x_0''(y') + \beta\,y_0''(y').$$

We now show that the correspondence is one to one. If $x_0$ and $y_0$ correspond to the same $z''$, then we have for every $y'$ in $V'$, $y'(x_0) = z''(y') = y'(y_0)$, or $y'(x_0 - y_0) = 0$. If we introduce, as above, a basis $\{x_1, \dots, x_n\}$ in $V$ and the corresponding linear functions $x_1', \dots, x_n'$ in $V'$, defined by $x_i'(x_j) = \delta_{ij}$, then we see that $y'(x_0 - y_0) = 0$ for all $y'$, where $x_0 - y_0 = \sum_j \xi_j x_j$, implies in particular that $x_i'(x_0 - y_0) = \xi_i = 0$, so that $x_0 - y_0 = 0$. Hence $x_0$ and $y_0$ can correspond to the same $z''$ only if $x_0 = y_0$, as was to be proved. Finally we remark that every $z''$ in $V''$ corresponds in this correspondence to some $x_0$ in $V$. The simplest proof of this fact is that since $V$ and therefore $V'$ are $n$ dimensional vector spaces, the dual space $V''$ of $V'$ is also $n$ dimensional. Hence if we can exhibit $n$ linearly independent elements of $V''$ which do correspond to elements of $V$, the desired result will follow. Let $\{x_1, \dots, x_n\}$ be a basis in $V$: then $x_1'', \dots, x_n''$ is a set of $n$ linearly independent elements, and therefore a basis, in $V''$. For $\sum_i \alpha_i x_i'' = 0$, i.e., $\sum_i \alpha_i x_i''(y') = 0$ for all $y'$ in $V'$, implies that

$$y'\Big(\sum_i \alpha_i x_i\Big) = \sum_i \alpha_i\,y'(x_i) = \sum_i \alpha_i x_i''(y') = 0,$$

whence, as above, $\sum_i \alpha_i x_i = 0$ and therefore $\alpha_1 = \dots = \alpha_n = 0$.

Thus the correspondence $x_0 \to x_0''$ is an isomorphism, the so-called 'natural isomorphism', between $V$ and $V''$.

4. The dual space of an inner product space

The considerations of the preceding section apply, of course, to inner product spaces. In the case of inner product spaces, however, it is not necessary to go to $V''$: we shall establish a natural correspondence between $V$ and $V'$.

Let $V$ be an $n$ dimensional inner product space and $V'$ its dual space. The theorem on the representation of linear functions (cf. (I.15)) shows that every $y'$ in $V'$ has the form $y'(x) = (x, y)$. This relation establishes a correspondence, which we already know to be one to one, between the $y'$ in $V'$ and the $y$ in $V$. If in this correspondence $y'$ corresponds to $y$ and $z'$ to $z$, then we have $(\alpha y' + \beta z')(x) = \alpha(x, y) + \beta(x, z) = (x, \bar\alpha y + \bar\beta z)$, so that $\alpha y' + \beta z'$ corresponds to $\bar\alpha y + \bar\beta z$. Thus the correspondence is not an isomorphism but a conjugate isomorphism between $V'$ and $V$.

This correspondence can also be used to define an inner product in $V'$. At first glance it might seem plausible to define $(y', z')$ to be $(y, z)$, where $y' \leftrightarrow y$ and $z' \leftrightarrow z$, but due to the fact that the correspondence is a conjugate isomorphism we have the relation

$$(\alpha y', z') = (\bar\alpha y, z) = \bar\alpha\,(y, z) = \bar\alpha\,(y', z'),$$

so that this definition does not satisfy the requirements of the definition of an inner product in (I.3). If, however, we define $(y', z') = (z, y)$ then it is readily verified that $(y', z')$ is an inner product in $V'$, so that $V'$ is an inner product space.
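
A small numerical sketch of this bookkeeping (ours; the helper `ip` is an invented name for the inner product, linear in its first argument):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 4

def ip(x, y):
    # The text's inner product: linear in x, conjugate linear in y.
    return np.vdot(y, x)

y, z = (rng.standard_normal(n) + 1j * rng.standard_normal(n) for _ in range(2))
x = rng.standard_normal(n) + 1j * rng.standard_normal(n)
alpha = 2.0 + 1.0j

# alpha * y' corresponds to conj(alpha) * y, since (x, conj(alpha) y) = alpha (x, y):
assert np.isclose(ip(x, alpha.conjugate() * y), alpha * ip(x, y))

# With (y', z') defined as (z, y), the form is linear in the first argument:
# (alpha y', z') corresponds to (z, conj(alpha) y) = alpha (z, y).
assert np.isclose(ip(z, alpha.conjugate() * y), alpha * ip(z, y))
```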

5. Reflexivity of inner product spaces

If we apply the results of the preceding section not to the inner product space $V$ but to its dual $V'$ we obtain a conjugate isomorphism between $V'$ and $V''$. Thereby we have induced a one to one correspondence between $V$ itself and $V''$: it is readily verified (since the operation of conjugation is involutory) that this correspondence is an isomorphism. We now show that this isomorphism is the same as the natural isomorphism between $V$ and $V''$ described in (IV.3). Let $x_0$ be an arbitrary vector in $V$; to it there corresponds the element $x_0'$ in $V'$, $x_0'(x) = (x, x_0)$; to this element, in turn, there corresponds the element $x_0''$ in $V''$, $x_0''(y') = (y', x_0')$. We must show that $x_0''(y') = y'(x_0)$. Let $y'$ be an arbitrary element of $V'$, say $y'(x) = (x, y_0)$; we have $x_0''(y') = (y', x_0') = (x_0, y_0) = y'(x_0)$, as was to be proved.

6. Direct sum of vector spaces

Definition 2. If $U$ and $V$ are arbitrary vector spaces we define their direct sum, $W = U \oplus V$, to be the set of all pairs $\langle x, y \rangle$ with $x$ in $U$ and $y$ in $V$.

If in $W$ we define $\alpha_1\langle x_1, y_1\rangle + \alpha_2\langle x_2, y_2\rangle = \langle \alpha_1 x_1 + \alpha_2 x_2,\ \alpha_1 y_1 + \alpha_2 y_2\rangle$ then $W$ becomes a vector space. If, moreover, $U$ and $V$ are inner product spaces we may define in $W$ $(\langle x_1, y_1\rangle, \langle x_2, y_2\rangle) = (x_1, x_2) + (y_1, y_2)$, and $W$ becomes thereby an inner product space. In fact, although for vector spaces this definition yields something new, for inner product spaces it can be subsumed under the discussion of the projection theorem (I.13). In other words $U$ and $V$ can be thought of as two orthogonal linear subspaces of $W$, and in an arbitrary inner product space a linear subspace and its orthogonal complement yield a decomposition of the space into a direct sum.

If $A$ and $B$ are linear transformations in $U$ and $V$ respectively we may define a linear transformation $A \oplus B$, the direct sum of $A$ and $B$, in $W$ by $(A \oplus B)\langle x, y\rangle = \langle Ax, By\rangle$. It is easy to discuss the matricial representation of $A \oplus B$ and its relation to addition, multiplication, scalar multiplication, inverse, dual, etc. We omit this discussion here, and merely state without proof two propositions that will be useful to us later.

  (i) If $U$ and $V$ have dimensions $m$ and $n$ respectively the dimension of $W = U \oplus V$ is $m + n$. If $\{x_1, \dots, x_m\}$ and $\{y_1, \dots, y_n\}$ are bases in $U$ and $V$ respectively then the totality of all vectors of either of the two forms $\langle x_i, 0\rangle$ or $\langle 0, y_j\rangle$ is a basis in $W$. If the matrix of the direct sum transformation $A \oplus B$ is computed in this basis it will have the form

$$\begin{pmatrix} [A] & 0 \\ 0 & [B] \end{pmatrix},$$

where $[A]$ and $[B]$ are the matrices of $A$ and $B$ in the bases $\{x_i\}$ and $\{y_j\}$ respectively, and where the zeros represent rectangular blocks each element of which is zero.
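
A short NumPy illustration of proposition (i) (ours; the names are invented):

```python
import numpy as np

rng = np.random.default_rng(4)
m, n = 3, 2
A = rng.standard_normal((m, m))
B = rng.standard_normal((n, n))

# Matrix of the direct sum of A and B in the combined basis <x_i, 0>, <0, y_j>.
direct_sum = np.block([[A, np.zeros((m, n))],
                       [np.zeros((n, m)), B]])

# (A (+) B)<x, y> = <Ax, By>:
x, y = rng.standard_normal(m), rng.standard_normal(n)
assert np.allclose(direct_sum @ np.concatenate([x, y]),
                   np.concatenate([A @ x, B @ y]))
```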

  (ii) The most general linear function $w'$ on $W = U \oplus V$ is of the form $w'(\langle x, y\rangle) = x'(x) + y'(y)$, where $x'$ and $y'$ are linear functions on $U$ and $V$ respectively. In other words the dual space of a direct sum is the direct sum of the dual spaces.

7. Tensor product of vector spaces

The main purpose of this chapter is to define for vector spaces (and inner product spaces) the notion of a tensor product. In other words if $U$ and $V$ are given vector spaces we shall define for every vector $x$ in $U$ and $y$ in $V$ a product $x \otimes y$, which is to be an element of a suitable vector space, in such a way that $x \otimes y$ depends linearly on either variable if the other one is fixed and so that (in case $U$ and $V$ are inner product spaces) we have

$$(x_1 \otimes y_1,\ x_2 \otimes y_2) = (x_1, x_2)(y_1, y_2).$$

In order to clarify the definition we shall give, we proceed heuristically on the basis of the proposition (ii) in the preceding section. If we denote the (as yet undefined) tensor product of $U$ and $V$ by $U \otimes V$, we may expect that $(U \otimes V)' = U' \otimes V'$. Since it is technically easier to do so, instead of defining $U \otimes V$ itself we shall instead define $U' \otimes V'$; we shall then write, by definition, $U \otimes V = (U' \otimes V')'$. Also we may expect that if $x'$ and $y'$ are linear functions in $U'$ and $V'$ respectively then it is their product, $w(x, y) = x'(x)\,y'(y)$, that should in some sense be the general element of $U' \otimes V'$. This product is a function $w$, defined for $x$ in $U$ and $y$ in $V$, with the property that for each fixed value of one variable it is a linear function of the other: in other words $w(x, y)$ is a bilinear function of $x$ and $y$. This discussion is meant to motivate the formal work that we begin in the next paragraph.

Let $U$ and $V$ be vector spaces of dimensions $m$ and $n$ respectively; we denote by $W$ the set of all bilinear functions $w = w(x, y)$ defined for $x$ in $U$ and $y$ in $V$. Let $W'$ be the dual space of $W$ (i.e., $W'$ is the set of all linear functions $z = z(w)$ defined for $w$ in $W$): we call $W' = U \otimes V$ the tensor product of $U$ and $V$. To every pair of vectors $x, y$ with $x$ in $U$ and $y$ in $V$ we make correspond the element $z$ in $U \otimes V$ defined by $z(w) = w(x, y)$. (It is easy to verify that $z(w)$ is a linear function of $w$.) We write $z = x \otimes y$ and call $z$ the tensor product of $x$ and $y$. We shall consistently use the notation $x$ for vectors of $U$, $y$ for vectors of $V$, and $z$ for vectors of the vector space $U \otimes V$.

8. Dimension of a tensor product

We observe that the dimension of $U \otimes V$ is $mn$. For, exactly as in (IV.3) above, we may choose bases $\{x_1, \dots, x_m\}$ and $\{y_1, \dots, y_n\}$ in $U$ and $V$ respectively, and then we may find $mn$ bilinear functions $w_{ij}$ subject to the requirement that $w_{ij}(x_p, y_q) = \delta_{ip}\delta_{jq}$. It is then easy to show that the $w_{ij}$ are linearly independent and that every bilinear function is a linear combination of them, so that $W$, and with it its dual space $U \otimes V$, has dimension $mn$.

We shall also need the fact that the elements $x_i \otimes y_j$ are a basis in $U \otimes V$. According to the preceding paragraph we need only prove that they are linearly independent. If $\sum_i\sum_j \alpha_{ij}\,(x_i \otimes y_j)(w) = 0$ for all $w$, then we should have, in particular for $w = w_{pq}$, $\sum_i\sum_j \alpha_{ij}\,w_{pq}(x_i, y_j) = \alpha_{pq} = 0$ for all $p$ and $q$, as was to be proved.
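
In coordinates, an element of $U \otimes V$ may be identified with the $m \times n$ array of its values on the $w_{ij}$; under this identification $x \otimes y$ becomes the outer product of the coordinate vectors of $x$ and $y$, and $x_i \otimes y_j$ becomes the array with a single $1$ in position $(i, j)$. A sketch of this model (our own illustration):

```python
import numpy as np

rng = np.random.default_rng(5)
m, n = 3, 4

# Coordinates of x in U and y in V (with respect to the bases x_i and y_j).
xi = rng.standard_normal(m)
eta = rng.standard_normal(n)

# z = x (x) y, identified with the array of values z(w_ij) = w_ij(x, y) = xi_i eta_j.
z = np.outer(xi, eta)

# x_i (x) y_j is the array with a single 1 in position (i, j) ...
basis = [np.outer(np.eye(m)[i], np.eye(n)[j]) for i in range(m) for j in range(n)]
# ... and these mn arrays are linearly independent.
assert np.linalg.matrix_rank(np.array([b.ravel() for b in basis])) == m * n

# z is the combination sum_ij xi_i eta_j (x_i (x) y_j):
assert np.allclose(z, sum(xi[i] * eta[j] * basis[i * n + j]
                          for i in range(m) for j in range(n)))
```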

9. The dual of a tensor product

If $z_1 = x \otimes y_1$ and $z_2 = x \otimes y_2$, and if $z = x \otimes (\alpha_1 y_1 + \alpha_2 y_2)$, then $z = \alpha_1 z_1 + \alpha_2 z_2$. For we have, for every bilinear function $w$: $z(w) = w(x, \alpha_1 y_1 + \alpha_2 y_2) = \alpha_1 w(x, y_1) + \alpha_2 w(x, y_2) = \alpha_1 z_1(w) + \alpha_2 z_2(w)$. Similarly we can show that $(\alpha_1 x_1 + \alpha_2 x_2) \otimes y = \alpha_1(x_1 \otimes y) + \alpha_2(x_2 \otimes y)$, so that $x \otimes y$ depends linearly on each of its factors when the other is held fixed. It follows from the preceding paragraph that every element $z$ in $U \otimes V$ is a sum of tensor products (not necessarily uniquely). It is also easy to prove, using the bilinear character of $x \otimes y$, that every linear function of $z$ (i.e., every element in the dual space of $U \otimes V$), when applied to $z = x \otimes y$, is a bilinear function of $x$ and $y$, and consequently a sum of products of the form $x'(x)\,y'(y)$, where $x'$ and $y'$ are linear functions defined on $U$ and $V$ respectively. Hence for general vector spaces our definition of tensor product fulfills the conditions (heuristically derived above) of our program. Before investigating the relation of tensor product spaces to linear transformations, we examine the situation in inner product spaces.

10. Tensor product of inner product spaces

If $U$ and $V$ are inner product spaces the construction of the preceding sections applies unaltered: the only new problem is to introduce into the tensor product an inner product related in a suitable way to the given inner products in the factor spaces. It is technically easier to define the inner product not in $U \otimes V$ but in $W$, and then to apply the general theory of duals of inner product spaces to find an inner product in $U \otimes V = W'$.

If $w$ is any element of $W$, $w$ can be written as a sum of products of the form $x'(x)\,y'(y)$, or, since $U$ and $V$ are inner product spaces, $w$ can be written as a sum of expressions of the form $(x, u)(y, v)$. Hence if $w_1$ and $w_2$ are any two elements of $W$ we may write

$$w_1(x, y) = \sum_i (x, u_i)(y, v_i) \qquad\text{and}\qquad w_2(x, y) = \sum_j (x, s_j)(y, t_j).$$

We write, by definition,

$$(w_1, w_2) = \sum_i\sum_j (s_j, u_i)(t_j, v_i).$$

(The conjugate nature of the relation between vectors and linear functions again necessitates putting the vectors $s_j$, $t_j$ belonging to $w_2$ before the vectors $u_i$, $v_i$ belonging to $w_1$.) Before we can even start to prove that this definition fulfills the conditions of the definition of an inner product, we must prove that it defines $(w_1, w_2)$ independently of the representations of $w_1$ and $w_2$ as sums. To do this we observe that $\sum_i (s_j, u_i)(t_j, v_i) = w_1(s_j, t_j)$, so that $(w_1, w_2) = \sum_j w_1(s_j, t_j)$, whence $(w_1, w_2)$ is independent of the particular representation of $w_1$. Since, moreover, in any given representations of $w_1$ and $w_2$, $(w_1, w_2) = \overline{(w_2, w_1)}$, it follows that $(w_1, w_2)$ is also independent of the representation of $w_2$.

It is easy to verify that the expression $(w_1, w_2)$ is linear in $w_1$, conjugate linear in $w_2$, and Hermitian symmetric. It remains to prove that it is positive definite: i.e., that $(w, w) \geq 0$ for all $w$, and that $(w, w) = 0$ if and only if $w = 0$. This, surprisingly, is not trivial: it requires Schur's theorem, proved in (IV.2).

We have

$$(w, w) = \sum_i\sum_j (u_j, u_i)(v_j, v_i).$$

Let $\lambda_1, \lambda_2, \dots$ be arbitrary complex numbers. Then $\sum_i\sum_j (u_j, u_i)\lambda_j\bar\lambda_i = \big(\sum_j \lambda_j u_j, \sum_i \lambda_i u_i\big) \geq 0$, so that the matrix whose general element is $(u_j, u_i)$ is non-negative. Similarly we may show that the matrix whose general element is $(v_j, v_i)$ is non-negative; it follows from Schur's theorem that the matrix whose general element is the product $(u_j, u_i)(v_j, v_i)$ is also non-negative. Hence $\sum_i\sum_j (u_j, u_i)(v_j, v_i)\lambda_j\bar\lambda_i \geq 0$ for every choice of the complex numbers $\lambda_i$: choosing $\lambda_i = 1$ for all $i$ proves that $(w, w) \geq 0$.
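
The chain of inequalities can be traced numerically: the two Gram matrices are non-negative, and $(w, w)$ is the sum of the entries of their entrywise product. A sketch (ours, with randomly chosen $u_i$ and $v_i$):

```python
import numpy as np

rng = np.random.default_rng(6)
k, m, n = 3, 4, 5   # w is a sum of k terms (x, u_i)(y, v_i)
U = rng.standard_normal((k, m)) + 1j * rng.standard_normal((k, m))  # rows u_i
V = rng.standard_normal((k, n)) + 1j * rng.standard_normal((k, n))  # rows v_i

# Gram matrices G[i, j] = (u_j, u_i) and H[i, j] = (v_j, v_i); both non-negative.
G = U.conj() @ U.T
H = V.conj() @ V.T
assert np.linalg.eigvalsh(G).min() >= -1e-10
assert np.linalg.eigvalsh(H).min() >= -1e-10

# (w, w) = sum_ij (u_j, u_i)(v_j, v_i): the lambda_i = 1 case applied to the
# Hadamard product G * H, which is non-negative by Schur's theorem.
w_norm_sq = (G * H).sum()
assert abs(w_norm_sq.imag) < 1e-10 and w_norm_sq.real >= -1e-10
```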

In order to prove that $(w, w) = 0$ implies $w = 0$ we proceed as follows. For the expression $(w_1, w_2)$, which now has all the other properties of an inner product, we may prove the Schwarz inequality as in (I.4):

$$|(w_1, w_2)|^2 \leq (w_1, w_1)(w_2, w_2).$$

It follows that the vanishing of $(w, w)$ implies the vanishing of $(w, w_2)$ for all $w_2$. Let $u$ and $v$ be arbitrary vectors and take, in particular, $w_2(x, y) = (x, u)(y, v)$. The vanishing of $(w, w_2)$ implies that $w(u, v) = 0$; hence the vanishing of $(w, w_2)$ for all $w_2$ implies that $w(u, v) = 0$ for every pair $u$, $v$ of vectors, or, in other words, that $w = 0$.

This concludes the introduction of an inner product in $W$. Applying the results of (IV.4) we obtain an inner product in the dual space of $W$, so that $U \otimes V = W'$ becomes an inner product space.

11. The inner product in a tensor product

It is now easy to prove that the inner product defined in $U \otimes V$ has the property that

$$(x_1 \otimes y_1,\ x_2 \otimes y_2) = (x_1, x_2)(y_1, y_2).$$

We write $z_1 = x_1 \otimes y_1$ and $z_2 = x_2 \otimes y_2$, and we define $w_1$ and $w_2$ to be the particular bilinear functions defined by

$$w_1(x, y) = (x, x_1)(y, y_1) \qquad\text{and}\qquad w_2(x, y) = (x, x_2)(y, y_2).$$

For an arbitrary $w$ we have $z_1(w) = w(x_1, y_1) = (w, w_1)$ and $z_2(w) = w(x_2, y_2) = (w, w_2)$, so that in the conjugate isomorphism between $W'$ and $W$ the vectors $z_1$ and $z_2$ correspond to $w_1$ and $w_2$ respectively. (This is similar to the proof in (IV.5) of the equality of the two natural correspondences between an inner product space and its second dual.) Hence we have, finally, $(z_1, z_2) = (w_2, w_1) = (x_1, x_2)(y_1, y_2)$, as was to be proved.

The fact just proved justifies the terminology of tensor product and describes completely the structure of $U \otimes V$ and its relation to $U$ and $V$. It follows also that if $\{x_1, \dots, x_m\}$ and $\{y_1, \dots, y_n\}$ are orthogonal bases in $U$ and $V$ respectively then $(x_i \otimes y_j,\ x_p \otimes y_q) = (x_i, x_p)(y_j, y_q) = \delta_{ip}\delta_{jq}$, so that the $x_i \otimes y_j$ form an orthonormal set in $U \otimes V$. Since we have already seen that they form a maximal linearly independent set it follows that they are a complete orthonormal set, or an orthogonal basis, in $U \otimes V$.
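
If coordinates are taken with respect to orthonormal bases, $x \otimes y$ may be modeled by the Kronecker product of the coordinate vectors, a representation that anticipates (IV.13); the displayed property and the orthonormality of the $x_i \otimes y_j$ are then immediate to check. A sketch (our own choice of model):

```python
import numpy as np

rng = np.random.default_rng(7)
m, n = 3, 4

def ip(a, b):
    # The text's inner product: linear in a, conjugate linear in b.
    return np.vdot(b, a)

x1, x2 = (rng.standard_normal(m) + 1j * rng.standard_normal(m) for _ in range(2))
y1, y2 = (rng.standard_normal(n) + 1j * rng.standard_normal(n) for _ in range(2))

# Model x (x) y as the Kronecker product of the coordinate vectors; then
# (x1 (x) y1, x2 (x) y2) = (x1, x2)(y1, y2):
assert np.isclose(ip(np.kron(x1, y1), np.kron(x2, y2)), ip(x1, x2) * ip(y1, y2))

# The vectors e_i (x) f_j form an orthonormal basis of the mn-dimensional space.
E = np.array([np.kron(np.eye(m)[i], np.eye(n)[j])
              for i in range(m) for j in range(n)])
assert np.allclose(E @ E.conj().T, np.eye(m * n))
```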

12. Tensor product of transformations

We are now in a position to examine the relation of linear transformations to the theory of tensor products. If $A$ and $B$ are linear transformations defined in $U$ and $V$ respectively, we define a linear transformation $C'$ in $W$ by $(C'w)(x, y) = w(Ax, By)$, and then a linear transformation $C$ in $W' = U \otimes V$ by $(Cz)(w) = z(C'w)$. In brief:

$$(Cz)(w) = z(C'w), \qquad (C'w)(x, y) = w(Ax, By).$$

If we apply $C$ to a particular $z$ of the form $z = x \otimes y$ (i.e., $z(w) = w(x, y)$) we obtain

$$(C(x \otimes y))(w) = (C'w)(x, y) = w(Ax, By) = (Ax \otimes By)(w),$$

so that $C(x \otimes y) = Ax \otimes By$.

Since we have already remarked that every $z$ is a sum of tensor products, the relation $C(x \otimes y) = Ax \otimes By$ completely characterizes $C$. The linear transformation $C$ in the space $U \otimes V$ is called the tensor product of the linear transformations $A$ and $B$: we write $C = A \otimes B$.
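
In the coordinate model of the preceding sketch the characterizing relation reads `np.kron(A, B) @ np.kron(x, y) == np.kron(A @ x, B @ y)`; a quick check (ours):

```python
import numpy as np

rng = np.random.default_rng(8)
m, n = 3, 4
A = rng.standard_normal((m, m))
B = rng.standard_normal((n, n))
x = rng.standard_normal(m)
y = rng.standard_normal(n)

# (A (x) B)(x (x) y) = Ax (x) By, the relation that characterizes A (x) B.
assert np.allclose(np.kron(A, B) @ np.kron(x, y), np.kron(A @ x, B @ y))
```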

13. Kronecker products of matrices

Let $A$ and $B$ be linear transformations and $\{x_1, \dots, x_m\}$ and $\{y_1, \dots, y_n\}$ orthogonal bases in $U$ and $V$ respectively; let $[A] = (\alpha_{ij})$ and $[B] = (\beta_{pq})$ be the matrices of $A$ and $B$ in these bases. We find the matrix of the linear transformation $A \otimes B$ in the orthogonal basis $\{x_i \otimes y_j\}$ of $U \otimes V$. Naturally the matrix depends on the way in which these $mn$ vectors are ordered in a linear order: we suppose first that the order is the lexicographical one, i.e.,

$$x_1 \otimes y_1,\ \dots,\ x_1 \otimes y_n;\ x_2 \otimes y_1,\ \dots,\ x_2 \otimes y_n;\ \dots;\ x_m \otimes y_1,\ \dots,\ x_m \otimes y_n.$$

We have

$$(A \otimes B)(x_j \otimes y_q) = Ax_j \otimes By_q = \Big(\sum_i \alpha_{ij} x_i\Big) \otimes \Big(\sum_p \beta_{pq} y_p\Big) = \sum_i\sum_p \alpha_{ij}\beta_{pq}\,(x_i \otimes y_p),$$

so that the matrix of $A \otimes B$ has, in the row with index $(i, p)$ and the column with index $(j, q)$, the element $\alpha_{ij}\beta_{pq}$, or, in a condensed notation whose meaning is clear,

$$\begin{pmatrix} \alpha_{11}[B] & \alpha_{12}[B] & \cdots & \alpha_{1m}[B] \\ \alpha_{21}[B] & \alpha_{22}[B] & \cdots & \alpha_{2m}[B] \\ \vdots & \vdots & & \vdots \\ \alpha_{m1}[B] & \alpha_{m2}[B] & \cdots & \alpha_{mm}[B] \end{pmatrix}.$$

If we had adopted, instead, the converse lexicographic ordering, i.e., $x_1 \otimes y_1,\ x_2 \otimes y_1,\ \dots,\ x_m \otimes y_1;\ x_1 \otimes y_2,\ \dots$, we should have found the matrix of $A \otimes B$ to be

$$\begin{pmatrix} \beta_{11}[A] & \beta_{12}[A] & \cdots & \beta_{1n}[A] \\ \beta_{21}[A] & \beta_{22}[A] & \cdots & \beta_{2n}[A] \\ \vdots & \vdots & & \vdots \\ \beta_{n1}[A] & \beta_{n2}[A] & \cdots & \beta_{nn}[A] \end{pmatrix}.$$

The first of these two matrices is known as the Kronecker product, $[A] \otimes [B]$, of $[A]$ and $[B]$ (in this order!); the second one is $[B] \otimes [A]$. Since a permutation of the elements of an orthogonal basis is a trivial kind of change of basis (i.e., it is effected by a unitary transformation $P$) we obtain that

$$[B] \otimes [A] = P^{-1}\big([A] \otimes [B]\big)P.$$
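
The permutation $P$ that exchanges the two lexicographic orderings (sometimes called a perfect shuffle) can be written down explicitly; a numerical check of the similarity (our own construction):

```python
import numpy as np

rng = np.random.default_rng(9)
m, n = 3, 4
A = rng.standard_normal((m, m))
B = rng.standard_normal((n, n))

# Permutation that maps position i*n + j (x-index first) to position j*m + i
# (y-index first), i.e., re-sorts the basis x_i (x) y_j between the orderings.
P = np.zeros((m * n, m * n))
for i in range(m):
    for j in range(n):
        P[j * m + i, i * n + j] = 1

# kron(B, A) = P kron(A, B) P^{-1}, with P unitary (P^{-1} = P^T).
assert np.allclose(P @ np.kron(A, B) @ P.T, np.kron(B, A))
```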

14. Properties of tensor product transformations

We now proceed to describe some of the elementary properties of tensor product transformations.

14.1. If $C_1 = A_1 \otimes B_1$ and $C_2 = A_2 \otimes B_2$ then $C_1 C_2 = A_1 A_2 \otimes B_1 B_2$. For we have

$$C_1 C_2 (x \otimes y) = C_1 (A_2 x \otimes B_2 y) = A_1 A_2 x \otimes B_1 B_2 y = (A_1 A_2 \otimes B_1 B_2)(x \otimes y).$$

14.2. If $A$ and $B$ are invertible then so is $A \otimes B$, and $(A \otimes B)^{-1} = A^{-1} \otimes B^{-1}$. For, by 14.1, $(A \otimes B)(A^{-1} \otimes B^{-1}) = AA^{-1} \otimes BB^{-1} = 1$. As immediate consequences of this result we obtain the formulas
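
Both properties are easy to confirm in the Kronecker-matrix model of (IV.13); a short check (ours):

```python
import numpy as np

rng = np.random.default_rng(10)
m, n = 3, 4
A1, A2 = rng.standard_normal((2, m, m))
B1, B2 = rng.standard_normal((2, n, n))

# 14.1: (A1 (x) B1)(A2 (x) B2) = A1 A2 (x) B1 B2.
assert np.allclose(np.kron(A1, B1) @ np.kron(A2, B2),
                   np.kron(A1 @ A2, B1 @ B2))

# 14.2: (A (x) B)^{-1} = A^{-1} (x) B^{-1} for invertible A and B.
assert np.allclose(np.linalg.inv(np.kron(A1, B1)),
                   np.kron(np.linalg.inv(A1), np.linalg.inv(B1)))
```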