Convergence in Distribution of a Sequence of Random Variables

In this section we define the notion of convergence in distribution of a sequence of random variables to a random variable, which is the notion of convergence most used in applications of probability theory. The notion of convergence in distribution of a sequence of random variables can be defined in a large number of equivalent ways, each of which is important for certain purposes. Instead of choosing any one of them as the definition, we prefer to introduce all the equivalent concepts simultaneously.

Theorem 3A. Definitions and Theorems Concerning Convergence in Distribution. For $n = 1, 2, \ldots$, let $Z_n$ be a random variable with distribution function $F_{Z_n}(\cdot)$ and characteristic function $\phi_{Z_n}(\cdot)$. Similarly, let $Z$ be a random variable with distribution function $F_Z(\cdot)$ and characteristic function $\phi_Z(\cdot)$. We define the sequence $\{Z_n\}$ as converging in distribution to the random variable $Z$, denoted by

$$\mathcal{L}(Z_n) \to \mathcal{L}(Z) \qquad \text{as } n \to \infty,$$

and read "the law of $Z_n$ converges to the law of $Z$," if any one (and consequently all) of the following equivalent statements holds:

(i) For every bounded continuous function $g(\cdot)$ of a real variable there is convergence of the expectation of $g(Z_n)$ to the expectation of $g(Z)$; that is, as $n$ tends to $\infty$,

$$E[g(Z_n)] \to E[g(Z)].$$

(ii) At every real number $u$ there is convergence of the characteristic functions; that is, as $n$ tends to $\infty$,

$$\phi_{Z_n}(u) \to \phi_Z(u).$$

(iii) At every two points $a$ and $b$, where $a < b$, at which the distribution function $F_Z(\cdot)$ of the limit random variable $Z$ is continuous, there is convergence of the probability that $Z_n$ lies in the interval $(a, b]$ to the probability that $Z$ lies in $(a, b]$; that is, as $n$ tends to $\infty$,

$$P[a < Z_n \le b] \to P[a < Z \le b].$$

(iv) At every real number $z$ that is a point of continuity of the distribution function $F_Z(\cdot)$ there is convergence of the distribution functions; that is, as $n$ tends to $\infty$,

$$F_{Z_n}(z) \to F_Z(z), \qquad \text{if } z \text{ is a continuity point of } F_Z(\cdot).$$

(v) For every continuous function $g(\cdot)$ of a real variable, as $n$ tends to $\infty$,

$$F_{g(Z_n)}(z) \to F_{g(Z)}(z)$$

at every real number $z$ at which the distribution function $F_{g(Z)}(\cdot)$ is continuous.

Let us indicate briefly the significance of the most important of these statements. The practical meaning of convergence in distribution is expressed by (iii); the reader should compare the statement of the central limit theorem in section 5 of Chapter 8 to see that (iii) constitutes an exact mathematical formulation of the assertion that the probability law of $Z_n$ "approximates" that of $Z$. From the point of view of establishing in practice that a sequence of random variables converges in distribution, one uses (ii), which constitutes a criterion for convergence in distribution in terms of characteristic functions. Finally, (v) represents a theoretical fact of the greatest usefulness in applications, for it asserts that if $\{Z_n\}$ converges in distribution to $Z$, then the sequence of random variables $\{g(Z_n)\}$, obtained as functions of the $Z_n$, converges in distribution to $g(Z)$ if the function $g(\cdot)$ is continuous.
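Criterion (i) may be checked numerically in a small sketch (the distributions and the test function here are illustrative choices, not taken from the text): let $Z_n$ be uniformly distributed on the points $1/n, 2/n, \ldots, n/n$, a sequence that converges in distribution to a random variable $Z$ uniformly distributed on $(0, 1]$; for the bounded continuous function $g(z) = \cos z$ one has $E[g(Z)] = \sin 1$, and the discrete expectations approach this value.

```python
import math

def E_g_Zn(n):
    """E[g(Z_n)] for g(z) = cos z, with Z_n uniform on {1/n, 2/n, ..., n/n}."""
    return sum(math.cos(k / n) for k in range(1, n + 1)) / n

# E[g(Z)] = integral of cos z over (0, 1] = sin 1, for Z uniform on (0, 1].
limit = math.sin(1.0)

for n in (10, 100, 1000):
    print(n, abs(E_g_Zn(n) - limit))
```

The printed differences shrink roughly like $1/n$, as expected for a Riemann-sum approximation of the limiting expectation.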

We defer the proof of the equivalence of these statements to section 5.

The Continuity Theorem of Probability Theory. The inversion formulas of section 3 of Chapter 9 prove that there is a one-to-one correspondence between distribution and characteristic functions; given a distribution function $F(\cdot)$ and its characteristic function

$$\phi(u) = \int_{-\infty}^{\infty} e^{iuz}\, dF(z),$$

there is no other distribution function of which $\phi(\cdot)$ is the characteristic function. The results stated in theorem 3A show that the one-to-one correspondence between distribution and characteristic functions, regarded as a transformation between functions, is continuous in the sense that a sequence $\{F_n(\cdot)\}$ of distribution functions converges to a distribution function $F(\cdot)$ at all points of continuity of $F(\cdot)$ if and only if the sequence of characteristic functions

$$\phi_n(u) = \int_{-\infty}^{\infty} e^{iuz}\, dF_n(z) \qquad (3.6)$$

converges at each real number $u$ to the characteristic function $\phi(\cdot)$ of $F(\cdot)$. Consequently, theorem 3A is often referred to as the continuity theorem of probability theory.

Theorem 3A has the following extremely important extension, of which the reader should be aware. Suppose that the sequence of characteristic functions $\{\phi_n(\cdot)\}$, defined by (3.6), has the property of converging at all real $u$ to a function $\phi(u)$, which is continuous at $u = 0$. It may be shown that there is then a distribution function $F(\cdot)$ of which $\phi(\cdot)$ is the characteristic function. In view of this fact, the continuity theorem of probability theory is sometimes formulated in the following way:

Consider a sequence of distribution functions $\{F_n(\cdot)\}$, with characteristic functions $\{\phi_n(\cdot)\}$ defined by (3.6). In order that a distribution function $F(\cdot)$ exist such that $F_n(z) \to F(z)$ at all points $z$ that are continuity points of $F(\cdot)$, it is necessary and sufficient that a function $\phi(\cdot)$, continuous at $u = 0$, exist such that

$$\lim_{n \to \infty} \phi_n(u) = \phi(u) \qquad \text{for every real number } u.$$
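The requirement that the limit function be continuous at $u = 0$ is essential, as a small numerical sketch shows (the distributions here are an illustrative choice): if $F_n$ is the uniform distribution on $(-n, n)$, its characteristic function is $\sin(nu)/(nu)$, which tends to 0 at every $u \ne 0$ but equals 1 at $u = 0$. The limit function is discontinuous at 0, and indeed the probability mass escapes to infinity, so no limiting distribution function exists.

```python
import math

def phi_n(n, u):
    """Characteristic function of the uniform distribution on (-n, n)."""
    if u == 0.0:
        return 1.0
    return math.sin(n * u) / (n * u)

# For any fixed u != 0 the values tend to 0 as n grows, while phi_n(n, 0) = 1
# for every n, so the pointwise limit is not continuous at u = 0.
for n in (1, 10, 1000):
    print(n, phi_n(n, 0.5), phi_n(n, 0.0))
```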

Expansions for the Characteristic Function. In the use of characteristic functions to prove theorems concerning convergence in distribution, a major role is played by expansions for the characteristic function, and for the logarithm of the characteristic function, of a random variable, such as those given in lemmas 3A and 3B. Throughout this chapter we employ the following convention regarding the use of the symbol $\theta$. The symbol $\theta$ is used to describe any real or complex valued quantity satisfying the inequality $|\theta| \le 1$. It is to be especially noted that the symbol $\theta$ does not denote the same number each time it occurs, but only that the number represented by it has modulus no greater than 1.

Lemma 3A. Let $X$ be a random variable whose mean $E[X]$ exists and is equal to 0 and whose variance $\sigma^2 = E[X^2]$ is finite. Then (i) for any real number $u$,

$$\phi_X(u) = 1 - \frac{u^2 \sigma^2}{2} + \theta u^2 \sigma^2; \qquad (3.7)$$

(ii) for any $u$ such that $u^2 \sigma^2 \le \frac{1}{3}$, $\log \phi_X(u)$ exists and satisfies

$$\log \phi_X(u) = -\frac{u^2 \sigma^2}{2} + 2\theta u^2 \sigma^2 \qquad (3.8)$$

for some number $\theta$ such that $|\theta| \le 1$. Further, if the third absolute moment $E[|X|^3]$ is finite, then for $u$ such that $u^2 \sigma^2 \le \frac{1}{3}$,

$$\log \phi_X(u) = -\frac{u^2 \sigma^2}{2} + \theta \left( \frac{|u|^3 E[|X|^3]}{6} + 3 u^4 \sigma^4 \right). \qquad (3.9)$$

Proof

Equation (3.7) follows immediately by integrating, with respect to the distribution function of $X$, the easily verified expansion

$$e^{iux} = 1 + iux - \frac{u^2 x^2}{2} + \theta u^2 x^2,$$

which holds because $|e^{iy} - 1 - iy| \le \frac{y^2}{2}$ for every real number $y$.

To show (3.8), we write [by (3.7)] that $\phi_X(u) = 1 + z$, in which

$$z = -\frac{u^2 \sigma^2}{2} + \theta u^2 \sigma^2.$$

Now $|z| \le \frac{3}{2} u^2 \sigma^2$, so that $|z| \le \frac{1}{2}$ if $u$ is such that $u^2 \sigma^2 \le \frac{1}{3}$. For any complex number $z$ of modulus $|z| \le \frac{1}{2}$,

$$\log(1 + z) = z + \theta |z|^2,$$

since $\left| \log(1 + z) - z \right| = \left| \frac{z^2}{2} - \frac{z^3}{3} + \cdots \right| \le \frac{|z|^2}{2} \left( 1 + |z| + |z|^2 + \cdots \right) \le |z|^2$. Since $|z|^2 \le \frac{9}{4} u^4 \sigma^4 \le \frac{3}{4} u^2 \sigma^2$ when $u^2 \sigma^2 \le \frac{1}{3}$, the total remainder has modulus no greater than $2 u^2 \sigma^2$. The proof of (3.8) is completed.

Finally, (3.9) follows in the same manner, since, when $E[|X|^3]$ is finite, the sharper expansion $e^{iux} = 1 + iux - \frac{u^2 x^2}{2} + \frac{\theta |u|^3 |x|^3}{6}$ may be integrated to give $z = -\frac{u^2 \sigma^2}{2} + \frac{\theta |u|^3 E[|X|^3]}{6}$, while, as before, $|z|^2 \le \frac{9}{4} u^4 \sigma^4 \le 3 u^4 \sigma^4$.
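The expansions of lemma 3A may be checked numerically in a sketch (the choice of $X$ is illustrative): if $X$ takes the values $+1$ and $-1$ each with probability $\frac{1}{2}$, then $\sigma^2 = 1$, $E[|X|^3] = 1$, and $\phi_X(u) = \cos u$, so the remainder in the expansion of $\log \phi_X(u)$ can be evaluated exactly and compared with the remainder bounds of the type appearing in (3.8) and (3.9).

```python
import math

# X = +1 or -1 with probability 1/2 each: phi(u) = cos u, sigma^2 = 1, E|X|^3 = 1.
def remainder(u):
    """Exact remainder |log phi(u) + u^2/2| in the expansion of log phi."""
    return abs(math.log(math.cos(u)) + u * u / 2)

def crude_bound(u):
    """Remainder bound 2 u^2 sigma^2 of the (3.8) type."""
    return 2 * u * u

def sharp_bound(u):
    """Remainder bound |u|^3 E|X|^3 / 6 + 3 u^4 sigma^4 of the (3.9) type."""
    return abs(u) ** 3 / 6 + 3 * u ** 4

for u in (0.1, 0.3, 0.5):   # u^2 <= 1/3 in each case
    print(u, remainder(u), sharp_bound(u), crude_bound(u))
```

In each printed row the exact remainder lies below the third-moment bound, which in turn lies below the cruder variance bound.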

Lemma 3B. In the same way that (3.7) and (3.8) are obtained, one may obtain expansions for the characteristic function of a random variable $X$ whose mean $E[X]$ exists: for any real number $u$,

$$\phi_X(u) = 1 + iu E[X] + \theta |u| \epsilon(u),$$

and, for $u$ such that $\left| \phi_X(u) - 1 \right| \le \frac{1}{2}$,

$$\log \phi_X(u) = iu E[X] + \theta \left( |u| \epsilon(u) + 9 u^2 \left( E[|X|] \right)^2 \right),$$

in which $\epsilon(u) = E\left[ \min\left( 2|X|, \frac{1}{2} |u| X^2 \right) \right]$ tends to 0 as $u$ tends to 0.
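For a concrete check of the mean-only expansion (the choice of $X$ is illustrative): if $X$ is exponentially distributed with mean 1, then $\phi_X(u) = 1/(1 - iu)$, and the remainder $\phi_X(u) - 1 - iuE[X]$ equals $-u^2/(1 - iu)$ exactly, so its modulus divided by $|u|$ tends to 0 with $u$, as the first expansion of lemma 3B requires.

```python
# Illustrative check of the mean-only expansion for X exponential with mean 1,
# whose characteristic function is phi(u) = 1 / (1 - i u).
def remainder_over_u(u):
    """|phi(u) - 1 - i u E[X]| / |u|, which must tend to 0 as u -> 0."""
    phi = 1.0 / (1.0 - 1j * u)
    return abs(phi - 1.0 - 1j * u) / abs(u)

for u in (1.0, 0.1, 0.01, 0.001):
    print(u, remainder_over_u(u))
```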

Example 3A. Asymptotic normality of binomial random variables. In section 2 of Chapter 6 it is stated that a binomial random variable is approximately normally distributed. This assertion may be given a precise formulation in terms of the notion of convergence in distribution. Let $S_n$ be the number of successes in $n$ independent repeated Bernoulli trials, with probability $p$ of success at each trial, and let

$$S_n^* = \frac{S_n - np}{\sqrt{npq}}, \qquad q = 1 - p.$$

Let $S^*$ be any random variable that is normally distributed with mean 0 and variance 1. We now show that the sequence $\{S_n^*\}$ converges in distribution to $S^*$. To prove this assertion, we first write the characteristic function of $S_n^*$ in the form

$$\phi_{S_n^*}(u) = E\left[ \exp\left( iu \frac{S_n - np}{\sqrt{npq}} \right) \right] = \exp\left( -iu \sqrt{\frac{np}{q}} \right) \left( q + p \exp\left( \frac{iu}{\sqrt{npq}} \right) \right)^n.$$

Therefore,

$$\phi_{S_n^*}(u) = \left[ \psi\left( \frac{u}{\sqrt{n}} \right) \right]^n, \qquad (3.16)$$

where we define

$$\psi(v) = \exp\left( -iv \sqrt{\frac{p}{q}} \right) \left( q + p \exp\left( \frac{iv}{\sqrt{pq}} \right) \right) = E\left[ \exp\left( iv \frac{X - p}{\sqrt{pq}} \right) \right],$$

in which $X$ is a random variable equal to 1 with probability $p$ and to 0 with probability $q$.

Now $\psi(\cdot)$ is the characteristic function of the random variable $(X - p)/\sqrt{pq}$, with mean, mean square, and absolute third moment given by

$$E\left[ \frac{X - p}{\sqrt{pq}} \right] = 0, \qquad E\left[ \left( \frac{X - p}{\sqrt{pq}} \right)^2 \right] = 1, \qquad E\left[ \left| \frac{X - p}{\sqrt{pq}} \right|^3 \right] = \frac{p^2 + q^2}{\sqrt{pq}}.$$

By (3.9), we have the expansion for $\log \psi(v)$, valid for $v$ such that $v^2 \le \frac{1}{3}$:

$$\log \psi(v) = -\frac{v^2}{2} + \theta \left( \frac{|v|^3}{6} \cdot \frac{p^2 + q^2}{\sqrt{pq}} + 3 v^4 \right), \qquad (3.19)$$

in which $\theta$ is some number such that $|\theta| \le 1$.

In view of (3.16) and (3.19), we see that for fixed $u$ and for $n$ so large that $u^2/n \le \frac{1}{3}$,

$$\log \phi_{S_n^*}(u) = n \log \psi\left( \frac{u}{\sqrt{n}} \right) = -\frac{u^2}{2} + \theta \left( \frac{|u|^3}{6\sqrt{n}} \cdot \frac{p^2 + q^2}{\sqrt{pq}} + \frac{3 u^4}{n} \right),$$

which tends to $-u^2/2$ as $n$ tends to infinity, so that $\phi_{S_n^*}(u)$ tends to $e^{-u^2/2}$, the characteristic function of $S^*$. By statement (ii) of theorem 3A, it follows that the sequence $\{S_n^*\}$ converges in distribution to $S^*$.
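The convergence just proved can be observed numerically in a sketch (the values of $n$, $p$, and $u$ are illustrative choices): the exact characteristic function of $S_n^*$, computed in product form, approaches $e^{-u^2/2}$ as $n$ grows.

```python
import cmath
import math

def phi_star(n, p, u):
    """Exact characteristic function of S_n* = (S_n - np)/sqrt(npq)."""
    q = 1.0 - p
    v = u / math.sqrt(n)
    psi = cmath.exp(-1j * v * math.sqrt(p / q)) \
        * (q + p * cmath.exp(1j * v / math.sqrt(p * q)))
    return psi ** n

u, p = 1.0, 0.3
for n in (10, 100, 10000):
    print(n, abs(phi_star(n, p, u) - math.exp(-u * u / 2)))
```

The printed distance to the standard normal characteristic function shrinks at the rate $1/\sqrt{n}$ predicted by the third-moment term of the expansion.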

Characteristic functions may be used to prove theorems concerning convergence in probability to a constant. In particular, the reader may easily verify the following lemma.

Lemma 3C. A sequence of random variables $\{Z_n\}$ converges in probability to 0 if and only if it converges in distribution to 0, which is the case if and only if, for every real number $u$,

$$\lim_{n \to \infty} \phi_{Z_n}(u) = 1.$$
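Lemma 3C may be illustrated with a simple sequence (an illustrative choice, not from the text): if $Z_n$ is exponentially distributed with mean $1/n$, then $Z_n$ converges in probability to 0, and its characteristic function $\phi_{Z_n}(u) = 1/(1 - iu/n)$ indeed tends to 1 at every real number $u$.

```python
def phi_Zn(n, u):
    """Characteristic function of Z_n, exponential with mean 1/n."""
    return 1.0 / (1.0 - 1j * u / n)

for u in (0.5, 3.0, 50.0):
    print(u, [abs(phi_Zn(n, u) - 1.0) for n in (10, 1000, 100000)])
```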

Theorem 3B. The law of large numbers for a sequence of independent, identically distributed random variables $X_1, X_2, \ldots$ with common finite mean $m$. As $n$ tends to $\infty$, the sample mean

$$\bar{X}_n = \frac{1}{n} \left( X_1 + \cdots + X_n \right)$$

converges in probability to the mean $m = E[X]$, in which $X$ is a random variable obeying the common probability law of the $X_j$.

Proof

Define $m = E[X]$, and let $\phi(\cdot)$ denote the characteristic function common to the random variables $X_j$, so that

$$\phi_{\bar{X}_n}(u) = \left[ \phi\left( \frac{u}{n} \right) \right]^n.$$

To prove that the sample mean $\bar{X}_n$ converges in probability to the mean $m$, it suffices to show that $\bar{X}_n - m$ converges in distribution to 0. Now, for a given value of $u$ and for $n$ so large that $\left| \phi(u/n) - 1 \right| \le \frac{1}{2}$, by lemma 3B

$$\log \phi_{\bar{X}_n - m}(u) = n \log \phi\left( \frac{u}{n} \right) - ium = \theta \left( |u|\, \epsilon\left( \frac{u}{n} \right) + \frac{9 u^2 \left( E[|X|] \right)^2}{n} \right),$$

in which $\epsilon(\cdot)$ is the function appearing in lemma 3B. This tends to 0 as $n$ tends to $\infty$, since, for each fixed $u$, $\epsilon(u/n)$ tends to 0 and $\phi(u/n)$ tends to 1 as $n$ tends to $\infty$. Consequently, $\phi_{\bar{X}_n - m}(u)$ tends to 1 for every real number $u$, and the conclusion follows from lemma 3C. The proof is complete.
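The theorem can be watched in action with a small Monte Carlo sketch (the exponential distribution, sample sizes, trial count, and tolerance are illustrative choices): the sample mean of $n$ independent exponential random variables with mean 1 concentrates near 1 as $n$ grows.

```python
import random

random.seed(7)

def sample_mean(n):
    """Sample mean of n independent exponential random variables with mean 1."""
    return sum(random.expovariate(1.0) for _ in range(n)) / n

def fraction_close(n, trials=200, tol=0.05):
    """Fraction of trials in which |sample mean - 1| <= tol."""
    return sum(abs(sample_mean(n) - 1.0) <= tol for _ in range(trials)) / trials

for n in (10, 100, 10000):
    print(n, fraction_close(n))
```

As $n$ increases, the fraction of sample means within the fixed tolerance of the true mean rises toward 1, which is exactly the convergence in probability asserted by theorem 3B.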

Exercises

3.1. Prove lemma 3C.

3.2. Let $X_1, X_2, \ldots$ be independent random variables, each assuming each of the values $+1$ and $-1$ with probability $\frac{1}{2}$. Let $Z_n = \sum_{k=1}^{n} X_k / 2^k$. Find the characteristic function of $Z_n$ and show that, as $n$ tends to $\infty$, for each $u$ it tends to the characteristic function of a random variable uniformly distributed over the interval $-1$ to $1$. Consequently, evaluate approximately probabilities of the form $P[Z_n \le z]$.

3.3. Let $X_1, X_2, \ldots, X_n$ be independent random variables, identically distributed as a random variable $X$. For $n = 1, 2, \ldots$, let

$$S_n = X_1 + \cdots + X_n, \qquad S_n^* = \frac{S_n - E[S_n]}{\sqrt{\mathrm{Var}[S_n]}}.$$

Assuming that $X$ is (i) binomial distributed with parameters $N$ and $p$, (ii) Poisson distributed with parameter $\lambda$, (iii) $\chi^2$ distributed with $N$ degrees of freedom, show, for each real number $z$, that $P[S_n^* \le z]$ tends to $\Phi(z)$ as $n$ tends to $\infty$. Consequently, evaluate approximately probabilities of the form $P[S_n \le z]$.

3.4. For any integer $r \ge 1$ and $0 < p < 1$, let $T_r$ denote the minimum number of trials required to obtain $r$ successes in a sequence of independent repeated Bernoulli trials in which the probability of success at each trial is $p$. Let $X$ be a random variable $\chi^2$ distributed with $2r$ degrees of freedom. Show that, at each real number $z$, $\lim_{p \to 0} P[2pT_r \le z] = P[X \le z]$. State in words the meaning of this result.

3.5. Let $S_n$ be binomial distributed with parameters $n$ and $p = \lambda/n$, in which $\lambda > 0$ is a fixed constant. Let $S$ be Poisson distributed with parameter $\lambda$. For each real number $z$, show that $\lim_{n \to \infty} P[S_n \le z] = P[S \le z]$. State in words the meaning of this result.

3.6. Let $X_\lambda$ be a random variable Poisson distributed with parameter $\lambda$. By use of characteristic functions, show that, as $\lambda$ tends to $\infty$,

$$\mathcal{L}\left( \frac{X_\lambda - \lambda}{\sqrt{\lambda}} \right) \to \mathcal{L}(Z),$$

in which $Z$ is normally distributed with mean 0 and variance 1.

3.7. Show that convergence in probability implies convergence in distribution; that is, show that if $\{Z_n\}$ converges in probability to $Z$, then $\mathcal{L}(Z_n) \to \mathcal{L}(Z)$.