Definition 1. A vector space, $V$, is a set of elements, $x$, $y$, $z$, etc., called vectors or points, which satisfies the following axioms.
1.1. To every pair, $x$ and $y$, of vectors in $V$ there corresponds a vector in $V$, called the sum of $x$ and $y$, written $x + y$.
1.2. Addition is commutative: $x + y = y + x$.
1.3. Addition is associative: $x + (y + z) = (x + y) + z$.
1.4. There exists in $V$ a unique vector $0$ such that for all $x$ in $V$, $x + 0 = x$.
1.5. To every pair $(\alpha, x)$, where $\alpha$ is a complex number and $x$ is in $V$, there corresponds a vector $\alpha x$ in $V$, called the product of $\alpha$ and $x$, such that $\alpha(\beta x) = (\alpha\beta)x$, $1 \cdot x = x$, $\alpha(x + y) = \alpha x + \alpha y$, and $(\alpha + \beta)x = \alpha x + \beta x$.
We shall refer to complex numbers as scalars or constants; later in these notes we shall discuss vector spaces in which the scalars may lie in an arbitrary field (not necessarily the field of complex numbers). To differentiate between the various cases we shall refer to the vector spaces just defined as complex vector spaces; if (1.5) is satisfied for real numbers $\alpha$ only, we shall refer to real vector spaces. Even though throughout the first three chapters we shall be concerned with complex vector spaces only, it will be very helpful to keep in mind the best known example of a real vector space, namely, two-dimensional real Euclidean space. This space, $\mathbb{R}^2$, is the set of all pairs, $(\xi_1, \xi_2)$, of real numbers. Addition, zero, and scalar multiplication are defined by the formulas $(\xi_1, \xi_2) + (\eta_1, \eta_2) = (\xi_1 + \eta_1, \xi_2 + \eta_2)$, $0 = (0, 0)$, and $\alpha(\xi_1, \xi_2) = (\alpha\xi_1, \alpha\xi_2)$.
For any vector $x$ we write, by definition, $-x = (-1)x$; instead of $x + (-y)$ we write $x - y$. We observe that $x - x = 0$ (so that the elements of $V$ form under addition an Abelian group with $-x$ playing the role of the inverse of $x$) and that for any scalar $\alpha$ and an arbitrary vector $x$, $\alpha \cdot 0 = 0 \cdot x = 0$.
(We comment on the fact that the symbol $0$ is used in two different senses, once as a scalar and once as a vector. Later we shall even assign a third meaning to the symbol, but the relation between the various interpretations of it is such that no confusion should arise from this practice.)
2. Linear dependence and dimension
Definition 2. The vectors $x_1, \dots, x_n$ in $V$ are linearly dependent if there exist scalars $\alpha_1, \dots, \alpha_n$, not all zero, such that $\alpha_1 x_1 + \dots + \alpha_n x_n = 0$.
If no such scalars $\alpha_i$ exist, the vectors $x_i$ ($i = 1, \dots, n$) are linearly independent.
We shall assume throughout these notes that there exists a positive integer $n$ such that every set of more than $n$ vectors is linearly dependent. If, moreover, a set of $n$ linearly independent vectors actually exists, we shall say that the linear dimension of $V$ is $n$.
3. Inner products and inner product spaces
Two fundamental notions in Euclidean geometry, which occur already in the study of the space $\mathbb{R}^2$, are the notions of angle and length. In the following, we shall abstract the essential properties of these two notions and carry them over to abstract vector spaces. It turns out to be more convenient for our purposes to study the analog not of the angle between two vectors but of the cosine of this angle. Suppose then that $x = (\xi_1, \xi_2)$ and $y = (\eta_1, \eta_2)$ are two vectors in $\mathbb{R}^2$; we denote the angle between $x$ (or $y$) and the positive real axis by $\theta_x$ (or $\theta_y$). Then the cosine of the angle between $x$ and $y$ becomes $\cos(\theta_x - \theta_y) = \frac{\xi_1\eta_1 + \xi_2\eta_2}{|x|\,|y|}$ (where $|x|$ denotes the distance of $x$ from the origin, $|x| = \sqrt{\xi_1^2 + \xi_2^2}$). It is the expression $\xi_1\eta_1 + \xi_2\eta_2$, which we shall denote by $(x, y)$, that is of interest to us. It is easy to verify that $(x, y)$, considered as a numerically valued function of the pair of vectors $x$ and $y$, is symmetric in $x$ and $y$, depends linearly on both $x$ and $y$, and has the property that $(x, x)$, which is the square of the distance between $x$ and the origin, vanishes if and only if $x = 0$ and is otherwise positive.
The properties of $(x, y)$ are in one respect too special for our purposes. For consider the example of one-dimensional complex Euclidean space: i.e., the set $\mathbb{C}$ of all complex numbers $x$. In this space, as well as in $\mathbb{R}^2$, the notions of angle and length are defined, but if the expression $(x, y)$ (which in case $|x| = |y| = 1$ is equal to the cosine of the angle between $x$ and $y$) were linear in $x$ and $y$, then we should have $(i, i) = i^2(1, 1) = -(1, 1)$ (where $i = \sqrt{-1}$). This would contradict the fact that the angle between $i$ and itself is zero, so that its cosine is $+1$. It is easily verified that in fact the cosine of the angle between $x$ and $y$ (where $x$ and $y$ are two complex numbers of distance one from the origin) is the real part of $x\bar{y}$, where $\bar{y}$ denotes the complex conjugate of $y$. In other words, in this case the expression $x\bar{y}$, which is neither symmetric nor linear in $y$, takes the place of the expression $xy$ which might have been suggested by analogy with the situation in $\mathbb{R}^2$. Using the results of the last two paragraphs as a heuristic indication of what we should require in the general case, we proceed to the formal definition.
Definition 3. An inner product, $(x, y)$, in a vector space, $V$, is a complex numerically valued function of the ordered pair of vectors $x$, $y$, such that
3.1. $(x, y) = \overline{(y, x)}$,
3.2. $(\alpha_1 x_1 + \alpha_2 x_2, y) = \alpha_1(x_1, y) + \alpha_2(x_2, y)$,
3.3. $(x, x) \ge 0$; $(x, x) = 0$ implies $x = 0$.
An inner product space is a vector space $V$ in which an inner product is defined.
If $V$ is an inner product space with the inner product $(x, y)$ then we have $(x, \alpha_1 y_1 + \alpha_2 y_2) = \bar{\alpha}_1(x, y_1) + \bar{\alpha}_2(x, y_2)$.
In analogy with the situation in Euclidean space we define in an inner product space the length, $\|x\|$, of a vector $x$ by the formula $\|x\| = \sqrt{(x, x)}$. We note that $\|x\| \ge 0$, that $\|x\| = 0$ if and only if $x = 0$, and that $\|\alpha x\| = |\alpha|\,\|x\|$.
4. Schwarz Inequality
We shall now prove for an inner product space the following inequality, known as Schwarz's Inequality or Cauchy's Inequality.
Theorem 1. $|(x, y)| \le \|x\|\,\|y\|$.
Proof. Since $0 \le \|x - y\|^2 = (x - y, x - y) = \|x\|^2 - (x, y) - (y, x) + \|y\|^2$, and since $(x, y) + (y, x) = (x, y) + \overline{(x, y)} = 2\,\Re(x, y)$ (where we use $\Re$ and $\Im$ to denote the real and imaginary parts of a complex number, respectively), it follows that $2\,\Re(x, y) \le \|x\|^2 + \|y\|^2$.
Since, furthermore, for any complex number $\gamma$ we can find a complex number $\theta$ of absolute value one such that $\theta\gamma$ is real and non-negative, we may find such a number $\theta$ when $\gamma = (x, y)$; then $\theta(x, y) = |(x, y)|$.
Applying the preceding inequality with $\theta x$ in place of $x$ (and noting that $\|\theta x\| = \|x\|$) we obtain $2|(x, y)| = 2\,\Re(\theta x, y) \le \|x\|^2 + \|y\|^2$.
We may assume that $x \ne 0$ and $y \ne 0$, since otherwise the inequality is trivial. Let $t$ be a positive number for which $t^2 = \|y\|/\|x\|$ and apply the last inequality to the vectors $tx$ and $t^{-1}y$ in place of $x$ and $y$. We obtain $2|(x, y)| \le t^2\|x\|^2 + t^{-2}\|y\|^2 = 2\|x\|\,\|y\|$, and this is the desired result.■
5. Triangle Inequality in inner product spaces
A geometric interpretation and application of the Schwarz inequality is the following. For any two vectors $x$ and $y$ in the inner product space we may define the distance between $x$ and $y$ to be $\|x - y\|$. It is clear that this distance is non-negative and symmetric in $x$ and $y$, and that it vanishes if and only if $x = y$. It is precisely the Schwarz inequality which implies that this distance has the remaining property usually required of a distance function, namely that the triangle inequality, $\|x - z\| \le \|x - y\| + \|y - z\|$, is valid. For we have $\|x + y\|^2 = \|x\|^2 + 2\,\Re(x, y) + \|y\|^2 \le \|x\|^2 + 2\|x\|\,\|y\| + \|y\|^2 = (\|x\| + \|y\|)^2$, so that $\|x + y\| \le \|x\| + \|y\|$, and replacing $x$ by $x - y$ and $y$ by $y - z$, we obtain the triangle inequality.
6. $n$-dimensional complex Euclidean space
As an example of an inner product space, we consider $n$-dimensional complex Euclidean space, $\mathbb{C}^n$, which is the set of all ordered sets $x = (\xi_1, \dots, \xi_n)$ of $n$ complex numbers. If $x = (\xi_1, \dots, \xi_n)$ and $y = (\eta_1, \dots, \eta_n)$ are two points of this space and $\alpha$ and $\beta$ are complex numbers, we define $\alpha x + \beta y$ to be $(\alpha\xi_1 + \beta\eta_1, \dots, \alpha\xi_n + \beta\eta_n)$; if in addition we define $0 = (0, \dots, 0)$ and $(x, y) = \sum_{i=1}^n \xi_i\bar{\eta}_i$, it is easy to verify that $\mathbb{C}^n$ is an inner product space. Applying the Schwarz inequality to a pair of vectors $x$ and $y$ in this space we get the arithmetic relation $\left|\sum_{i=1}^n \xi_i\bar{\eta}_i\right|^2 \le \left(\sum_{i=1}^n |\xi_i|^2\right)\left(\sum_{i=1}^n |\eta_i|^2\right)$.
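As a small numerical sketch (not part of the original text; the helper names `inner` and `norm` and the particular vectors are ours), the inner product on $\mathbb{C}^n$ just defined and the Schwarz inequality can be checked directly:

```python
from math import sqrt

def inner(x, y):
    # (x, y) = sum_i xi_i * conj(eta_i), as defined above for C^n
    return sum(a * b.conjugate() for a, b in zip(x, y))

def norm(x):
    # ||x|| = sqrt((x, x)); (x, x) is real and non-negative
    return sqrt(inner(x, x).real)

x = [1 + 2j, 0 - 1j, 3 + 0j]
y = [2 - 1j, 1 + 1j, 0 + 4j]

# Schwarz inequality: |(x, y)| <= ||x|| ||y||
assert abs(inner(x, y)) <= norm(x) * norm(y)
# axiom 3.1: (x, y) is the complex conjugate of (y, x)
assert abs(inner(x, y) - inner(y, x).conjugate()) < 1e-12
```

The conjugation in `inner` is exactly the feature motivated in Section 3: without it, $(x, x)$ would not be real and non-negative for complex coordinates.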
Since in an arbitrary inner product space the expression $(x, y)/\|x\|\,\|y\|$ plays the role of the cosine of the angle between $x$ and $y$, the Schwarz inequality can also be considered as a generalization of the fact that for real angles $\theta$, $|\cos\theta| \le 1$.
7. Linear subspaces
Definition 4. A linear subspace (also called a vector subspace or simply a subspace) in a vector space $V$ is a non-empty set, $M$, of vectors in $V$ which contains, along with every pair of vectors $x$ and $y$, the vector $\alpha x + \beta y$ for arbitrary complex numbers $\alpha$ and $\beta$.
It follows immediately from the definitions that a linear subspace in a vector space (or an inner product space) is a vector space (or an inner product space).
8. Orthonormal sets
Definition 5. For vectors $x$ and $y$ in an inner product space we say that $x$ is orthogonal to $y$, in symbols $x \perp y$, if $(x, y) = 0$. Two linear subspaces, $M$ and $N$, are orthogonal if every vector in each is orthogonal to every vector in the other. A set, $\{x_i\}$, of vectors is an orthonormal set if $(x_i, x_j) = \delta_{ij}$ (where $\delta_{ij}$ is the Kronecker symbol, which is equal to $1$ when $i = j$ and is equal to $0$ otherwise).
If $\{x_1, \dots, x_k\}$ is a finite orthonormal set, then $\alpha_1 x_1 + \dots + \alpha_k x_k = 0$ implies that $\alpha_j = (\alpha_1 x_1 + \dots + \alpha_k x_k, x_j) = 0$ for each $j$, so that an orthonormal set is linearly independent. Hence if $n$ is the linear dimension of the space, no orthonormal set can have more than $n$ elements; the maximal number of elements of an orthonormal set is the orthogonal dimension of the space.
If an orthonormal set is not contained in any larger orthonormal set, it is called complete.
It follows from the definition of the terms that the orthogonal dimension is not greater than the linear dimension.
9. Intersection and linear sum of linear subspaces
The intersection of any collection of linear subspaces is again a linear subspace. If $M$ and $N$ are linear subspaces we shall denote their intersection by $M \cap N$. If $S$ is an arbitrary set of vectors, there exist linear subspaces containing $S$ (since the whole space is such), and we may form the intersection of all linear subspaces containing $S$. This linear subspace is the linear subspace spanned by $S$; if $M$ and $N$ are linear subspaces we shall denote the linear subspace spanned by the vectors of $M$ and $N$ by $M + N$. $M + N$ is the linear sum of $M$ and $N$.
10. Bessel’s Inequality
Theorem 2. If $\{x_1, \dots, x_k\}$ is an orthonormal set, $x$ is any vector, and $\alpha_i = (x, x_i)$, then $\sum_i |\alpha_i|^2 \le \|x\|^2$.
Proof. Let $x'$ be the vector $x' = x - \sum_i \alpha_i x_i$. Then, since $(x_i, x_j) = \delta_{ij}$, we have $0 \le \|x'\|^2 = (x', x') = \|x\|^2 - \sum_i |\alpha_i|^2$, and this is the desired result.■
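A numerical sketch of Bessel's inequality (the example vectors are ours, not from the text): with an orthonormal set that is not complete, the inequality is strict.

```python
def inner(x, y):
    # the C^n inner product: (x, y) = sum_i xi_i * conj(eta_i)
    return sum(a * b.conjugate() for a, b in zip(x, y))

# {e_1, e_2}: an orthonormal set in C^3, not complete (e_3 is missing)
xs = [[1 + 0j, 0j, 0j], [0j, 1 + 0j, 0j]]
x = [1 + 1j, 2j, 3 + 0j]

alphas = [inner(x, xi) for xi in xs]          # alpha_i = (x, x_i)
bessel_sum = sum(abs(a) ** 2 for a in alphas)  # = 2 + 4 = 6
norm_sq = inner(x, x).real                     # = 2 + 4 + 9 = 15

assert bessel_sum <= norm_sq      # Bessel's inequality
assert bessel_sum < norm_sq       # strict here: x has a third coordinate
```

Equality would hold exactly when $x$ lies in the subspace spanned by the set, which is the content of the corollary below.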
Corollary 1. $x' = x - \sum_i \alpha_i x_i$ is orthogonal to each $x_j$ and therefore to the linear subspace spanned by the $x_i$; a necessary and sufficient condition that $x$ belong to this linear subspace is that $x = \sum_i \alpha_i x_i$.
Proof. We have $(x', x_j) = (x, x_j) - \sum_i \alpha_i(x_i, x_j) = \alpha_j - \alpha_j = 0$. This proves the first part of the corollary. The sufficiency of the condition of the second part is obvious. If, on the other hand, $x$ belongs to the linear subspace spanned by the $x_i$, then $x'$, being a linear combination of $x$ and the $x_i$, must also belong to this linear subspace. Since, however, $x'$ is orthogonal to every vector in this linear subspace, it must be orthogonal to itself, i.e., $x' = 0$, and the condition is necessary.■
11. Characterization of complete orthonormal sets
We shall now prove that the following five conditions on an orthonormal set $\{x_i\}$ are equivalent to each other.
11.1. $\{x_i\}$ is complete.
11.2. $(x, x_i) = 0$ for all $i$ implies $x = 0$.
11.3. The linear subspace spanned by $\{x_i\}$ is the whole space, $V$.
11.4. For every $x$ in $V$, $x = \sum_i \alpha_i x_i$, where $\alpha_i = (x, x_i)$.
11.5. For every pair $x$, $y$ in $V$, $(x, y) = \sum_i \alpha_i\bar{\beta}_i$, where $\alpha_i = (x, x_i)$ and $\beta_i = (y, x_i)$. (11.5 is Parseval's identity.)
We shall prove this by establishing the following circular chain of implications: (11.1) $\Rightarrow$ (11.2) $\Rightarrow$ (11.3) $\Rightarrow$ (11.4) $\Rightarrow$ (11.5) $\Rightarrow$ (11.1).
(11.1) $\Rightarrow$ (11.2): If there exists a vector $x \ne 0$ with $(x, x_i) = 0$ for all $i$, then the set of vectors formed by $x/\|x\|$ and the $x_i$ is an orthonormal set properly containing $\{x_i\}$, so that $\{x_i\}$ is not complete.
(11.2) $\Rightarrow$ (11.3): If $\{x_i\}$ does not span the whole space then, by the corollary to Bessel's inequality, there exists a vector $x$ which is not of the form $\sum_i \alpha_i x_i$ with $\alpha_i = (x, x_i)$. It follows from this corollary that the vector $x' = x - \sum_i \alpha_i x_i$ is different from zero and is orthogonal to each $x_i$, contradicting (11.2).
(11.3) $\Rightarrow$ (11.4): This implication is a direct consequence of the corollary to Bessel's inequality.
(11.4) $\Rightarrow$ (11.5): Assuming (11.4) we have $x = \sum_i \alpha_i x_i$ and $y = \sum_j \beta_j x_j$, whence $(x, y) = \sum_i \sum_j \alpha_i\bar{\beta}_j(x_i, x_j) = \sum_i \alpha_i\bar{\beta}_i$.
(11.5) $\Rightarrow$ (11.1): If $\{x_i\}$ were contained in a larger orthonormal set, say if the unit vector $x_0$ is orthogonal to each $x_i$, then taking $x = y = x_0$ in (11.5) we obtain $1 = (x_0, x_0) = \sum_i |(x_0, x_i)|^2 = 0$, a contradiction.
We observe that (11.5) with $x = y$, i.e., $\|x\|^2 = \sum_i |\alpha_i|^2$, is the natural generalization of the Pythagorean theorem.
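Parseval's identity (11.5) and the Pythagorean relation can be checked numerically with a complete orthonormal set in $\mathbb{C}^2$ (a minimal sketch; the rotated basis and test vectors below are our choices):

```python
from math import sqrt

def inner(x, y):
    # the C^n inner product
    return sum(a * b.conjugate() for a, b in zip(x, y))

s = 1 / sqrt(2)
# a complete orthonormal set in C^2 (rotated 45 degrees from the axes)
basis = [[s + 0j, s + 0j], [s + 0j, -s + 0j]]

x, y = [1 + 2j, 3 - 1j], [0 + 1j, 2 + 0j]
alphas = [inner(x, b) for b in basis]   # alpha_i = (x, x_i)
betas = [inner(y, b) for b in basis]    # beta_i  = (y, x_i)

# Parseval: (x, y) = sum_i alpha_i * conj(beta_i)
parseval = sum(a * b.conjugate() for a, b in zip(alphas, betas))
assert abs(inner(x, y) - parseval) < 1e-12

# with x = y: ||x||^2 = sum_i |alpha_i|^2 (generalized Pythagorean theorem)
assert abs(inner(x, x).real - sum(abs(a) ** 2 for a in alphas)) < 1e-12
```

If the set were not complete (drop one basis vector), the second assertion would fail and only Bessel's inequality would survive.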
12. Erhard Schmidt orthogonalization process
Although we have proved several relations among properties of complete orthonormal sets, we have not yet established that such sets exist. The purpose of the construction that follows is to prove the existence of complete orthonormal sets.
Let $V$ be an inner product space of linear dimension $n$, and let $\{z_1, \dots, z_n\}$ be a set of $n$ linearly independent vectors. We define $x_1 = z_1/\|z_1\|$; then the set consisting of the single vector $x_1$ is an orthonormal set. We proceed by induction. Suppose that the vectors $x_1, \dots, x_r$ have been defined so that they form an orthonormal set and so that $x_j$ is a linear combination of $z_1, \dots, z_j$ for $j = 1, \dots, r$. We write $y = z_{r+1} + \beta_1 x_1 + \dots + \beta_r x_r$, and we determine the coefficients $\beta_j$ so that $y$ should be orthogonal to each $x_j$, $j = 1, \dots, r$. Since $(y, x_j) = (z_{r+1}, x_j) + \beta_j$, we may choose $\beta_j = -(z_{r+1}, x_j)$. Since each $x_j$ is a linear combination of $z_1, \dots, z_j$, and $y$ is therefore a linear combination of $z_1, \dots, z_{r+1}$ in which the coefficient of $z_{r+1}$ does not vanish (in fact it is equal to $1$), the linear independence of the $z_j$ implies that $y \ne 0$. Hence we may define $x_{r+1} = y/\|y\|$, and the set $\{x_1, \dots, x_{r+1}\}$ will again satisfy our induction hypothesis. The vectors $x_1, \dots, x_n$ so constructed are an orthonormal set which must be complete, for if it were contained in a larger orthonormal set then there would exist a set of more than $n$ linearly independent vectors. This construction proves, therefore, that complete orthonormal sets do indeed exist and that the orthogonal dimension of $V$ is equal to its linear dimension.
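The inductive construction above translates directly into code. The following is a sketch of the Schmidt process for complex vectors (function names and the sample independent set are ours); given linearly independent $z_1, \dots, z_n$ it produces an orthonormal set $x_1, \dots, x_n$:

```python
from math import sqrt

def inner(x, y):
    return sum(a * b.conjugate() for a, b in zip(x, y))

def normalize(x):
    n = sqrt(inner(x, x).real)
    return [a / n for a in x]

def gram_schmidt(zs):
    xs = []
    for z in zs:
        # y = z + sum_j beta_j x_j with beta_j = -(z, x_j),
        # so that y is orthogonal to each x_j already constructed
        y = list(z)
        for x in xs:
            c = inner(z, x)
            y = [yi - c * xi for yi, xi in zip(y, x)]
        # independence of the z's guarantees y != 0, so we may normalize
        xs.append(normalize(y))
    return xs

# three linearly independent vectors in C^3 (an arbitrary example)
zs = [[1 + 0j, 1 + 0j, 0j], [0j, 1 + 0j, 1j], [1 + 0j, 0j, 1 + 0j]]
xs = gram_schmidt(zs)

# orthonormality: (x_i, x_j) = delta_ij
for i, xi in enumerate(xs):
    for j, xj in enumerate(xs):
        assert abs(inner(xi, xj) - (1 if i == j else 0)) < 1e-12
```

Subtracting the projections of the original $z$ (rather than of the running $y$) matches the text's choice $\beta_j = -(z_{r+1}, x_j)$; since the $x_j$ are orthonormal, the two variants agree exactly.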
For a complete orthonormal set, we shall also use the terms ‘coordinate system,’ ‘frame of reference,’ and ‘orthogonal basis.’ By ‘basis’ or ‘linear basis’ we mean a set of $n$ linearly independent vectors, where $n$ is the dimension of the space. (Since we have proved the equality of the orthogonal and linear dimensions, we shall in the future not distinguish between them.) If $\{x_i\}$ is a coordinate system then, by (11.4), every $x$ can be written in the form $x = \sum_i \alpha_i x_i$, where $\alpha_i = (x, x_i)$. The numbers $\alpha_i$ are the coordinates of $x$ with respect to $\{x_i\}$.
13. The projection theorem
Theorem 3. Let $M$ be a linear subspace in $V$. Every vector $x$ in $V$ can be written in the form $x = y + z$, where $y$ is in $M$ and $z$ is orthogonal to every vector in $M$, in one and only one way.
Proof. $M$, being a linear subspace in an inner product space, is itself an inner product space, and we may find an orthonormal set $\{x_1, \dots, x_m\}$ which is complete in $M$. For any $x$, write $y = \sum_i \alpha_i x_i$, where $\alpha_i = (x, x_i)$. Then clearly $y$ is in $M$ and, by the corollary to Bessel's inequality, $z = x - y$ is orthogonal to every vector in $M$. This establishes the existence of the representation; to prove uniqueness suppose that $x = y + z$ and $x = y' + z'$, where $y'$ and $z'$ have the properties of $y$ and $z$, respectively. Then we should have $y - y' = z' - z$: i.e., a vector in $M$ is equal to a vector which is orthogonal to $M$. It follows that this vector is orthogonal to itself and is therefore zero, whence $y = y'$ and $z = z'$, as was to be proved.■
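The existence half of the proof is constructive, and can be sketched in code: project $x$ on $M$ by summing $\alpha_i x_i$ over a complete orthonormal set of $M$ (helper names and the particular subspace are our illustrative choices):

```python
def inner(x, y):
    return sum(a * b.conjugate() for a, b in zip(x, y))

def project(x, basis):
    """Projection y = sum_i alpha_i x_i of x on the subspace M
    spanned by the orthonormal set `basis`."""
    y = [0j] * len(x)
    for b in basis:
        c = inner(x, b)                       # alpha_i = (x, x_i)
        y = [yi + c * bi for yi, bi in zip(y, b)]
    return y

# M: the subspace of C^3 spanned by the first two coordinate axes
M = [[1 + 0j, 0j, 0j], [0j, 1 + 0j, 0j]]
x = [2 + 1j, -1 + 0j, 3j]

y = project(x, M)                # component in M
z = [a - b for a, b in zip(x, y)]  # component orthogonal to M

# z is orthogonal to every vector of the orthonormal set spanning M
for b in M:
    assert abs(inner(z, b)) < 1e-12
# here the projection simply zeroes the third coordinate
assert all(abs(a - b) < 1e-12 for a, b in zip(y, [2 + 1j, -1 + 0j, 0j]))
```

For this axis-aligned $M$ the projection visibly reduces to dropping the orthogonal coordinate, which is the familiar perpendicular projection of Euclidean geometry.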
The vector $y$ is the projection of $x$ on $M$; it is easy to verify that this notion is a generalization of the well known notion of perpendicular projection in Euclidean space. We shall investigate the properties of projections more thoroughly in the next chapter.
14. Calculus of linear subspaces
We make some comments on the calculus of linear subspaces. If $M$ is an arbitrary linear subspace, we denote by $M^\perp$ the set of all vectors orthogonal to $M$. It is clear that $M^\perp$ is a linear subspace; we call $M^\perp$ the orthocomplement of $M$.
The projection theorem proves that $M + M^\perp$, in the sense established in (9), is the whole space $V$. If $M$ and $N$ are any two orthogonal linear subspaces, $M + N$ is a linear subspace; if $\{x_i\}$ is a complete orthonormal set in $M$ and $\{y_j\}$ is a complete orthonormal set in $N$, then the set of vectors consisting of all $x_i$ and all $y_j$ is a complete orthonormal set in $M + N$. If $M$ is any linear subspace and $x$ is a vector such that $x \perp M^\perp$ (i.e., if $x$ lies in $M^{\perp\perp}$), then $x$ is in $M$. For, by the projection theorem, we may write $x = y + z$, with $y$ in $M$ and $z$ in $M^\perp$. Since $y$, an element of $M$, is orthogonal to $M^\perp$, the vector $z = x - y$ must also be orthogonal to $M^\perp$. Since $z$, therefore, is both in $M^\perp$ and orthogonal to $M^\perp$, $z$ must be zero, so that $x = y$ is in $M$. It follows that $M^{\perp\perp} = M$.
15. Representation of linear functions
Of interest in analysis in general, and in our study in particular, are linear functions on vector spaces.
Definition 6. A linear function is a (complex) numerically valued function $L(x)$ of the vector $x$ such that for any two vectors $x$ and $y$ and any two complex numbers $\alpha$ and $\beta$, $L(\alpha x + \beta y) = \alpha L(x) + \beta L(y)$.
(A geometric interpretation of linear functions is the fact that the set of all vectors $x$ for which $L(x) = 0$, where $L$ is a linear function not identically zero, is a hyperplane in $V$. Because two linear functions $L$ and $L'$ related by $L'(x) = \gamma L(x)$ for all $x$, where $\gamma$ is any nonzero constant, correspond to the same hyperplane, it turns out to be a little more convenient to study the linear functions themselves and not the hyperplanes, that is, the classes of linear functions defined by them. It is easy to verify, though we shall not make use of this fact in this form, that the hyperplanes of a vector space of dimension $n$ are the linear subspaces of dimension $n - 1$. It is also easy to verify that the set of all linear functions defined on a vector space is itself a vector space; one interpretation of the theorem to be proved below is that the set of all linear functions on an inner product space is itself an inner product space.)
Theorem 4. If $L$ is a linear function in an inner product space $V$, there exists a unique vector $y$ in $V$ such that for all $x$, $L(x) = (x, y)$.
Proof. Since the theorem is obvious if $L$ is identically zero (take $y = 0$), we may assume that this is not the case. Let $M$ be the set of all vectors $x$ for which $L(x) = 0$, and let $N$ be the orthocomplement of $M$. Then $N$ contains a vector $z$ for which $L(z) \ne 0$; by multiplication by a suitable constant we may assume $L(z) = 1$.
Consider now any vector $x$ in $N$. For any complex number $\alpha$, $x - \alpha z$ is also in $N$, and we have $L(x - \alpha z) = L(x) - \alpha$.
If we choose $\alpha = L(x)$, then $L(x - \alpha z) = 0$, so that $x - \alpha z$ is in $M$; since we have already seen that it must be in $N$, it follows that $x - \alpha z = 0$, or $x = L(x)z$. (This shows that $N$ is one dimensional and therefore that the dimension of $M$ is $n - 1$.) We write $y = z/\|z\|^2$, and if $x$ is an arbitrary vector in $V$ we use the projection theorem to write $x$ in the form $x = u + v$, with $u$ in $M$ and $v$ in $N$. Then $(x, y) = (u, y) + (v, y) = (v, y)$, since $u$ is orthogonal to $y$; and since $v$ is in $N$, we have $v = L(v)z$, so that $(x, y) = L(v)(z, z)/\|z\|^2 = L(v) = L(u) + L(v) = L(x)$. This completes the proof of the existence of $y$.
To prove uniqueness, suppose that for every $x$, $(x, y_1) = (x, y_2)$. Then we should have $(x, y_1 - y_2) = 0$ for all $x$: i.e., the vector $y_1 - y_2$ is orthogonal to every vector, and therefore in particular to itself, so that $y_1 - y_2 = 0$, as was to be proved.■
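In $\mathbb{C}^n$ the representing vector $y$ of Theorem 4 can be read off coordinatewise: with the standard coordinate system $\{e_i\}$, $L(e_i) = (e_i, y) = \bar{\eta}_i$, so $\eta_i = \overline{L(e_i)}$. A sketch (the function `L` below is an arbitrary example of ours, not from the text):

```python
def inner(x, y):
    return sum(a * b.conjugate() for a, b in zip(x, y))

def representing_vector(L, n):
    """Recover y with L(x) = (x, y) for all x in C^n:
    eta_i = conj(L(e_i)), where e_i is the i-th standard basis vector."""
    basis = [[1 + 0j if i == j else 0j for j in range(n)] for i in range(n)]
    return [L(e).conjugate() for e in basis]

# a sample linear function on C^3 (hypothetical choice of coefficients)
def L(x):
    return 2 * x[0] - 1j * x[1] + (1 + 1j) * x[2]

y = representing_vector(L, 3)   # here y = [2, 1j, 1 - 1j]

# L(x) = (x, y) for arbitrary test vectors
for x in ([1j, 2 + 0j, 0j], [0j, 1 + 1j, -3j]):
    assert abs(L(x) - inner(x, y)) < 1e-12
```

The conjugation in `representing_vector` reflects axiom 3.1: the inner product is conjugate-linear in its second argument, which is why $y$ carries the conjugates of the coefficients of $L$.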