Conditional Probability of an Event Given a Random Variable. Conditional Distributions

In this section we introduce a notion that is basic to the theory of random processes, the notion of the conditional probability of a random event $A$, given a random variable $X$. This notion forms the basis of the mathematical treatment of jointly distributed random variables that are not independent.

Given two events, $A$ and $B$, on the same probability space, the conditional probability of the event $A$, given the event $B$, has been defined by

$$P[A \mid B] = \frac{P[AB]}{P[B]} \qquad \text{if } P[B] > 0. \tag{11.1}$$

Now suppose we are given an event $A$ and a random variable $X$, both defined on the same probability space. We wish to define, for any real number $x$, the conditional probability of the event $A$, given the event that the observed value of $X$ is equal to $x$, denoted in symbols by $P[A \mid X = x]$. Now if $P[X = x] > 0$, we may define this conditional probability by (11.1). However, for any random variable $X$, $P[X = x] = 0$ for all (except, at most, a countable number of) values of $x$. Consequently, the conditional probability of the event $A$, given that $X = x$, must be regarded as being undefined insofar as (11.1) is concerned.

The meaning that one intuitively assigns to $P[A \mid X = x]$ is that it represents the probability that $A$ has occurred, knowing that $X$ was observed as equal to $x$. Therefore, it seems natural to define

$$P[A \mid X = x] = \lim_{h \to 0} P[A \mid x \le X \le x + h], \tag{11.2}$$

if the conditioning events $\{x \le X \le x + h\}$ have positive probability for every $h > 0$. However, we have to be very careful how we define the limit in (11.2). As stated, (11.2) is essentially false, in the sense that the limit does not exist in general. However, we can define a limiting operation, similar to (11.2) in spirit, although different in detail, that in advanced probability theory is shown always to exist.

Given a real number $h > 0$, define $I_h(x)$ as that interval, of length $h$, starting at a multiple of $h$, that contains $x$; in symbols,

$$I_h(x) = \{x' : jh \le x' < (j + 1)h\}, \qquad \text{where } j \text{ is the integer such that } jh \le x < (j + 1)h. \tag{11.3}$$

Then we define the conditional probability of the event $A$, given that the random variable $X$ has an observed value equal to $x$, by

$$P[A \mid X = x] = \lim_{h \to 0} P[A \mid X \in I_h(x)]. \tag{11.4}$$

It may be proved that the conditional probability $P[A \mid X = x]$, defined by (11.4), has the following properties.

First, the convergence set $C$ of points $x$ on the real line at which the limit in (11.4) exists has probability one, according to the probability function of the random variable $X$; that is, $P[X \in C] = 1$. For practical purposes this suffices, since we expect that all observed values of $X$ lie in the set $C$, and we wish to define $P[A \mid X = x]$ only at points $x$ that could actually arise as observed values of $X$.

Second, from a knowledge of $P[A \mid X = x]$ one may obtain $P[A]$ by the following formulas:

$$P[A] = \int_{-\infty}^{\infty} P[A \mid X = x] \, dF_X(x) = \int_{-\infty}^{\infty} P[A \mid X = x] \, f_X(x) \, dx = \sum_{x} P[A \mid X = x] \, p_X(x), \tag{11.5}$$

in which the last two expressions hold if $X$ is respectively continuous or discrete. More generally, for every Borel set $B$ of real numbers, the probability of the intersection of the event $A$ and the event $\{X \in B\}$ that the observed value of $X$ is in $B$ is given by

$$P[A \cap \{X \in B\}] = \int_B P[A \mid X = x] \, dF_X(x). \tag{11.6}$$

Indeed, in advanced studies of probability theory the conditional probability $P[A \mid X = x]$ is defined not constructively by (11.4) but descriptively, as the unique (almost everywhere) function of $x$ satisfying (11.6) for every Borel set $B$ of real numbers. This characterization of $P[A \mid X = x]$ is used to prove (11.15).
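The discrete case of the formula for obtaining $P[A]$ from $P[A \mid X = x]$ can be illustrated with a small numerical sketch. The setup below (a uniform $X$ on $\{1, 2, 3\}$ and a coin-tossing event $A$) is a hypothetical example, not taken from the text:

```python
from fractions import Fraction

# Illustration of P[A] = sum over x of P[A | X = x] * P[X = x]
# (the discrete case of formula (11.5)).
# Hypothetical setup: X is uniform on {1, 2, 3}; given X = x,
# a fair coin is tossed x times and A is the event "all tosses are heads".
p_X = {x: Fraction(1, 3) for x in (1, 2, 3)}
p_A_given_X = {x: Fraction(1, 2) ** x for x in (1, 2, 3)}

p_A = sum(p_A_given_X[x] * p_X[x] for x in p_X)
print(p_A)  # 7/24
```

Here exact rational arithmetic makes the weighting of each conditional probability by $P[X = x]$ transparent: $P[A] = \frac{1}{3}\left(\frac{1}{2} + \frac{1}{4} + \frac{1}{8}\right) = \frac{7}{24}$.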

Example 11A . A young man and a young lady plan to meet between 5:00 and 6:00 P.M., each agreeing not to wait more than ten minutes for the other. Assume that they arrive independently at random times between 5:00 and 6:00 P.M. Find the conditional probability that the young man and the young lady will meet, given that the young man arrives at 5:30 P.M.

 

Solution

Let $X$ be the man’s arrival time (in minutes after 5:00 P.M.) and let $Y$ be the lady’s arrival time (in minutes after 5:00 P.M.). If the man arrives at a time $x$, there will be a meeting if and only if the lady’s arrival time satisfies $x \le Y \le x + 10$ or $x - 10 \le Y \le x$; that is, if and only if $|Y - x| \le 10$. Let $M$ denote the event that the man and lady meet. Then, for any $x$ between 0 and 60,

$$P[M \mid X = x] = P[\,|Y - X| \le 10 \mid X = x\,] = P[\,|Y - x| \le 10 \mid X = x\,] = P[\,|Y - x| \le 10\,], \tag{11.7}$$

in which we have used (11.9) and (11.11). Next, using the fact that $Y$ is uniformly distributed between 0 and 60, we obtain (as graphed in Fig. 11A)

$$P[M \mid X = x] = \begin{cases} (x + 10)/60 & \text{for } 0 \le x < 10 \\ 1/3 & \text{for } 10 \le x \le 50 \\ (70 - x)/60 & \text{for } 50 < x \le 60. \end{cases} \tag{11.8}$$

Consequently, $P[M \mid X = 30] = 1/3$, so that the conditional probability that the young man and the young lady will meet, given that the young man arrives at 5:30 P.M., is $1/3$. Further, by applying (11.5), we determine that $P[M] = \int_0^{60} P[M \mid X = x] \, \frac{1}{60} \, dx = \frac{11}{36}$.
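The exact values in Example 11A can also be checked by simulation. The following Monte Carlo sketch (hypothetical code, not part of the text) estimates the conditional meeting probability given a 5:30 arrival, and the unconditional meeting probability:

```python
import random

# Monte Carlo check of Example 11A (a sketch; the section's argument is exact).
# Arrival times X (man) and Y (lady) are independent uniforms on [0, 60];
# M is the event |X - Y| <= 10.
random.seed(0)
n = 200_000

# P[M | X = 30] = P[20 <= Y <= 40]: simulate only the lady's arrival.
meet_given_30 = sum(abs(30 - random.uniform(0, 60)) <= 10 for _ in range(n)) / n

# P[M]: simulate both arrivals.
meet = sum(abs(random.uniform(0, 60) - random.uniform(0, 60)) <= 10
           for _ in range(n)) / n

print(round(meet_given_30, 2))  # close to 1/3 ≈ 0.33
print(round(meet, 2))           # close to 11/36 ≈ 0.31
```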

 


Fig. 11A. The conditional probability $P[M \mid X = x]$, graphed as a function of $x$.

 

In (11.7) we performed certain manipulations that arise frequently when one is dealing with conditional probabilities. We now justify these manipulations.

Consider two jointly distributed random variables $X$ and $Y$. Let $g(x, y)$ be a Borel function of two variables. Let $z$ be a fixed real number. Let $A$ be the event $\{g(X, Y) \le z\}$ that the random variable $g(X, Y)$ has an observed value less than or equal to $z$. Next, let $x$ be a fixed real number, and let $A_x$ be the event $\{g(x, Y) \le z\}$ that the random variable $g(x, Y)$, which is a function only of $Y$, has an observed value less than or equal to $z$. It appears formally reasonable that

$$P[g(X, Y) \le z \mid X = x] = P[g(x, Y) \le z \mid X = x]. \tag{11.9}$$

In words, a statement involving the random variable $g(X, Y)$, conditioned by the hypothesis that the observed value of $X$ is a given number $x$, has the same conditional probability, given $X = x$, as the corresponding statement obtained by replacing the random variable $X$ by its observed value $x$. The proof of (11.9) is omitted, since it is beyond the scope of this book.

It may help to comprehend (11.9) if we state it in terms of the events $A = \{g(X, Y) \le z\}$ and $A_x = \{g(x, Y) \le z\}$. Equation (11.9) asserts that the functions of $x'$,

$$P[A \mid X = x'] \qquad \text{and} \qquad P[A_x \mid X = x'], \tag{11.10}$$

have the same value at $x' = x$.

Another important formula is the following. If the random variables $X$ and $Y$ are independent, then

$$P[g(x, Y) \le z \mid X = x] = P[g(x, Y) \le z], \tag{11.11}$$

since it holds that

$$P[g(x, Y) \le z \mid X \in I_h(x)] = P[g(x, Y) \le z] \qquad \text{for every } h > 0,$$

the event $\{g(x, Y) \le z\}$ being defined in terms of the random variable $Y$ alone and therefore independent of the event $\{X \in I_h(x)\}$. We thus obtain the basic fact that if the random variables $X$ and $Y$ are independent,

$$P[g(X, Y) \le z \mid X = x] = P[g(x, Y) \le z]. \tag{11.12}$$
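The substitution and independence rules can be checked empirically by conditioning on a thin band around $x$. In the sketch below, the choices $g(x, y) = x + y$, uniform $X$ and $Y$, $x_0 = 0.3$, and $z = 1$ are hypothetical illustrations, not taken from the text:

```python
import random

# Empirical sketch of the rule P[g(X,Y) <= z | X = x] = P[g(x,Y) <= z]
# for independent X and Y. Hypothetical choices: g(x, y) = x + y,
# X and Y uniform on [0, 1], x0 = 0.3, z = 1.0; the event "X = x0"
# is approximated by the thin band |X - x0| < h.
random.seed(1)
g = lambda x, y: x + y
x0, z, h = 0.3, 1.0, 0.005

hits = total = 0
for _ in range(2_000_000):
    x, y = random.random(), random.random()
    if abs(x - x0) < h:
        total += 1
        hits += g(x, y) <= z
cond = hits / total  # estimate of P[g(X, Y) <= z | X near x0]

# Substituted probability P[g(x0, Y) <= z], which here equals P[Y <= 0.7].
subst = sum(g(x0, random.random()) <= z for _ in range(200_000)) / 200_000
print(abs(cond - subst) < 0.02)
```

Shrinking the band width $h$ mirrors the limit over the intervals $I_h(x)$ in (11.4), at the cost of fewer samples landing in the band.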

We next define the notion of the conditional distribution function of one random variable $Y$, given another random variable $X$, denoted by $F_{Y \mid X}(y \mid x)$. For any real numbers $x$ and $y$, it is defined by

$$F_{Y \mid X}(y \mid x) = P[Y \le y \mid X = x]. \tag{11.13}$$

The conditional distribution function has the basic property that for any real numbers $x$ and $y$ the joint distribution function $F_{X, Y}(x, y)$ may be expressed in terms of $F_{Y \mid X}$ by

$$F_{X, Y}(x, y) = \int_{-\infty}^{x} F_{Y \mid X}(y \mid x') \, dF_X(x'). \tag{11.15}$$

To prove (11.15), let $X$ and $Y$ be two jointly distributed random variables. For two given real numbers $x$ and $y$ define $A = \{Y \le y\}$. Then (11.15) may be written

$$P[A \cap \{X \le x\}] = \int_{-\infty}^{x} P[A \mid X = x'] \, dF_X(x'). \tag{11.16}$$

If in (11.6) we take $B = \{x' : x' \le x\}$, (11.16) is obtained.

Now suppose that the random variables $X$ and $Y$ are jointly continuous. We may then define the conditional probability density function of the random variable $Y$, given the random variable $X$, denoted by $f_{Y \mid X}(y \mid x)$. It is defined for any real numbers $x$ and $y$ by

$$f_{Y \mid X}(y \mid x) = \frac{\partial}{\partial y} F_{Y \mid X}(y \mid x). \tag{11.17}$$

We now prove the basic formula: if $f_X(x) > 0$, then

$$f_{Y \mid X}(y \mid x) = \frac{f_{X, Y}(x, y)}{f_X(x)}. \tag{11.18}$$

To prove (11.18), we differentiate (11.15) with respect to $x$ (first replacing $dF_X(x')$ by $f_X(x') \, dx'$). Then

$$\frac{\partial}{\partial x} F_{X, Y}(x, y) = F_{Y \mid X}(y \mid x) \, f_X(x). \tag{11.19}$$

Now differentiating (11.19) with respect to $y$, we obtain

$$f_{X, Y}(x, y) = f_{Y \mid X}(y \mid x) \, f_X(x), \tag{11.20}$$

from which (11.18) follows immediately.
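The quotient formula (11.18) can be verified numerically for a particular joint density. The sketch below takes a bivariate normal with hypothetical parameter values (the names m_x, sigma_x, rho, etc. are this example's own notation) and checks that the quotient $f_{X,Y}(x, y)/f_X(x)$ coincides with a normal density in $y$:

```python
import math

# Numerical check of (11.18) for a bivariate normal joint density.
# Hypothetical parameters: means m_x, m_y; standard deviations sigma_x,
# sigma_y; correlation rho. The quotient f_{X,Y}(x, y) / f_X(x) should be
# the normal density in y with mean m_y + rho*(sigma_y/sigma_x)*(x - m_x)
# and variance sigma_y**2 * (1 - rho**2).
m_x, m_y, sigma_x, sigma_y, rho = 1.0, -2.0, 1.5, 0.5, 0.6

def normal_pdf(t, m, s):
    return math.exp(-((t - m) ** 2) / (2 * s * s)) / (s * math.sqrt(2 * math.pi))

def joint_pdf(x, y):  # bivariate normal density
    u, v = (x - m_x) / sigma_x, (y - m_y) / sigma_y
    q = (u * u - 2 * rho * u * v + v * v) / (1 - rho ** 2)
    return math.exp(-q / 2) / (2 * math.pi * sigma_x * sigma_y * math.sqrt(1 - rho ** 2))

x, y = 2.0, -1.7
quotient = joint_pdf(x, y) / normal_pdf(x, m_x, sigma_x)
cond_mean = m_y + rho * (sigma_y / sigma_x) * (x - m_x)
cond_sd = sigma_y * math.sqrt(1 - rho ** 2)
assert abs(quotient - normal_pdf(y, cond_mean, cond_sd)) < 1e-12
```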

Example 11B. Let $X$ and $Y$ be jointly normally distributed random variables whose probability density function is given by (9.31). Then the conditional probability density of $Y$, given $X = x$, is equal to

$$f_{Y \mid X}(y \mid x) = \frac{1}{\sigma_Y \sqrt{2\pi} \sqrt{1 - \rho^2}} \exp\left\{ -\frac{1}{2 \sigma_Y^2 (1 - \rho^2)} \left[ y - m_Y - \rho \frac{\sigma_Y}{\sigma_X} (x - m_X) \right]^2 \right\}. \tag{11.21}$$

In words, the conditional probability law of the random variable $Y$, given $X = x$, is the normal probability law with parameters $m_Y + \rho \frac{\sigma_Y}{\sigma_X}(x - m_X)$ and $\sigma_Y \sqrt{1 - \rho^2}$. To prove (11.21), one need only verify that it is equal to the quotient $f_{X, Y}(x, y) / f_X(x)$. Similarly, one may establish the following result.

Example 11C. Let $X$ and $Y$ be jointly distributed random variables. Let

Then, for

In the foregoing examples we have considered the problem of obtaining the conditional probability law of $Y$, given $X$, knowing the joint probability law of $X$ and $Y$. We next consider the converse problem of obtaining the individual probability law of $Y$ from a knowledge of the conditional probability law of $Y$, given $X$, and of the individual probability law of $X$.

Example 11D. Consider the decay of particles in a cloud chamber (or, similarly, the breakdown of equipment or the occurrence of accidents). Assume that the time $T$ of any particular particle to decay is a random variable obeying an exponential probability law with parameter $\theta$. However, it is not assumed that the value of $\theta$ is the same for all particles. Rather, it is assumed that there are particles of different types (or equipment of different types or individuals of different accident proneness). More specifically, it is assumed that for a particle randomly selected from the cloud chamber the parameter $\theta$ is a particular value of a random variable $\Theta$ obeying a gamma probability law with a probability density function

$$f_\Theta(\theta) = \frac{a^r}{\Gamma(r)} \, \theta^{r - 1} e^{-a\theta} \qquad \text{for } \theta > 0, \qquad f_\Theta(\theta) = 0 \text{ for } \theta \le 0,$$

in which the parameters $r$ and $a$ are positive constants characterizing the experimental conditions under which the particles are observed.

The assumption that the time $T$ of a particle to decay obeys an exponential probability law is now expressed as an assumption on the conditional probability law of $T$ given $\Theta$:

$$f_{T \mid \Theta}(t \mid \theta) = \theta e^{-\theta t} \qquad \text{for } t > 0.$$

We find the individual probability law of the time $T$ (of a particle selected at random to decay) as follows; for $t > 0$,

$$f_T(t) = \int_0^{\infty} f_{T \mid \Theta}(t \mid \theta) \, f_\Theta(\theta) \, d\theta = \frac{a^r}{\Gamma(r)} \int_0^{\infty} \theta^{r} e^{-(a + t)\theta} \, d\theta = \frac{r \, a^r}{(a + t)^{r + 1}}.$$
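Example 11D's mixture integral can be checked numerically. In the sketch below, the gamma parameters are written $r$ and $a$, with hypothetical values $r = 2$, $a = 3$, and the mixture $\int_0^\infty \theta e^{-\theta t} f_\Theta(\theta)\, d\theta$ is compared with the closed form $r a^r / (a + t)^{r+1}$:

```python
import math

# Numerical check of the gamma mixture of exponentials in Example 11D.
# Hypothetical parameter values: gamma shape r = 2, parameter a = 3.
r, a = 2.0, 3.0

def f_theta(theta):  # gamma density of the parameter Theta
    return a ** r / math.gamma(r) * theta ** (r - 1) * math.exp(-a * theta)

def f_T_closed(t):   # closed form obtained by carrying out the integration
    return r * a ** r / (a + t) ** (r + 1)

# Approximate f_T(t) = integral of theta*exp(-theta*t)*f_theta(theta) dtheta
# by a midpoint Riemann sum over a truncated range of theta.
def f_T_numeric(t, upper=60.0, steps=200_000):
    d = upper / steps
    return sum(th * math.exp(-th * t) * f_theta(th) * d
               for th in (d * (k + 0.5) for k in range(steps)))

for t in (0.5, 1.0, 4.0):
    assert abs(f_T_numeric(t) - f_T_closed(t)) < 1e-5
```

The resulting individual law of $T$ is heavier-tailed than any single exponential, reflecting the mixture over particle types.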

The reader interested in further study of the foregoing model, as well as a number of other interesting topics, should consult J. Neyman, “The Problem of Inductive Inference”, Communications on Pure and Applied Mathematics , Vol. 8 (1955), pp. 13–46.

The foregoing notions may be extended to several random variables. In particular, let us consider $n$ random variables $X_1, X_2, \ldots, X_n$ and a random variable $Y$, all of which are jointly distributed. By suitably adapting the foregoing considerations, we may define a function

$$F_{X_1, \ldots, X_n \mid Y}(x_1, \ldots, x_n \mid y),$$

called the conditional distribution function of the random variables $X_1, X_2, \ldots, X_n$, given the random variable $Y$, which may be shown to satisfy, for all real numbers $x_1, \ldots, x_n$ and $y$,

$$F_{X_1, \ldots, X_n, Y}(x_1, \ldots, x_n, y) = \int_{-\infty}^{y} F_{X_1, \ldots, X_n \mid Y}(x_1, \ldots, x_n \mid y') \, dF_Y(y').$$

Theoretical Exercises

11.1 . Let be a random variable, and let be a fixed number. Define the random variable by and the event by . Evaluate and in terms of the distribution function of . Explain the difference in meaning between these concepts.

11.2. If $X$ and $Y$ are independent Poisson random variables, show that the conditional distribution of $X$, given $X + Y$, is binomial.
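As a numerical sanity check of this exercise's claim (not a substitute for the proof), one can compare the conditional probabilities directly; the means lam1 = 2 and lam2 = 3 below are arbitrary hypothetical choices:

```python
import math

# Check numerically that if X and Y are independent Poisson with means
# lam1 and lam2, then P[X = k | X + Y = n] is binomial with parameters
# n and p = lam1 / (lam1 + lam2). Hypothetical values: lam1 = 2, lam2 = 3, n = 6.
lam1, lam2, n = 2.0, 3.0, 6

def poisson(k, lam):
    return math.exp(-lam) * lam ** k / math.factorial(k)

p = lam1 / (lam1 + lam2)
for k in range(n + 1):
    # P[X = k, Y = n - k] / P[X + Y = n], using X + Y ~ Poisson(lam1 + lam2)
    cond = poisson(k, lam1) * poisson(n - k, lam2) / poisson(n, lam1 + lam2)
    binom = math.comb(n, k) * p ** k * (1 - p) ** (n - k)
    assert abs(cond - binom) < 1e-12
```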

11.3. Given jointly distributed random variables $X$ and $Y$, prove that $F_{Y \mid X}(y \mid x) = F_Y(y)$, for any $y$ and almost all $x$, if and only if $X$ and $Y$ are independent.

11.4. Prove that for any jointly distributed random variables $X$ and $Y$

For contrast evaluate

Exercises

In exercises 11.1 to 11.3 let $X$ and $Y$ be independent random variables. Let . Let . Find (i) , (ii) , (iii) , (iv) .

11.1. If $X$ and $Y$ are each uniformly distributed over the interval 0 to 2.

 

Answer

(i) 1; (ii), (iii), (iv) .

 

11.2. If $X$ and $Y$ are each normally distributed with parameters $m$ and $\sigma$.

11.3. If $X$ and $Y$ are each exponentially distributed with parameter $\lambda$.

 

Answer

(i) 0.865; (ii) 0.632; (iii) 0.368; (iv) 0.5.

 

In exercises 11.4 to 11.6 let $X$ and $Y$ be independent random variables. Let and . Let . Find (i) , (ii) , (iii) , (iv) , (v) .

11.4. If $X$ and $Y$ are each uniformly distributed over the interval 0 to 2.

11.5. If $X$ and $Y$ are each normally distributed with parameters $m$ and $\sigma$.

 

Answer

(i) 0.276; (ii) 0.5; (iii) 0.2; (iv) 0.5, (v) .

 

11.6. If $X$ and $Y$ are each exponentially distributed with parameter $\lambda$.

11.7. Let $X$ and $Y$ be jointly normally distributed random variables (representing the observed amplitudes of a noise voltage recorded a known time interval apart). Assume that their joint probability density function is given by (9.31) with (i) , (ii) , . Find .

 

Answer

(i) 0.28; (ii) 0.61.

 

11.8. Let $X$ and $Y$ be jointly normally distributed random variables, representing the daily sales (in thousands of units) of a certain product in a certain store on two successive days. Assume that the joint probability density function of $X$ and $Y$ is given by (9.31), with , . Find so that (i) , (ii) 0.05, (iii) . Suppose the store desires to have on hand on a given day enough units of the product so that with probability 0.95 it can supply all demands for the product on that day. How large should its inventory be on a given morning if (iv) yesterday’s sales were 2000 units, (v) yesterday’s sales are not known?