Describing a Random Variable

Although, by definition, a random variable is a function on a probability space, in probability theory we are rarely concerned with the functional form of a random variable X, for we are not interested in computing the value that the function X assumes at any individual member s of the sample description space S on which X is defined. Indeed, we do not usually wish to know the space S on which X is defined. Rather, we are interested in the probability that an observed value of the random variable X will lie in a given set B of real numbers. We are interested in a random variable as a mechanism that gives rise to a numerical valued random phenomenon, and the questions we shall ask about a random variable are precisely the same as those asked about numerical valued random phenomena. Similarly, the techniques we use to describe random variables are precisely the same as those used to describe numerical valued random phenomena.

To begin with, we define the probability function of a random variable X, denoted by P_X[·], as a set function defined for every Borel set B of real numbers, whose value P_X[B] is the probability that X is in B. We sometimes write the intuitively meaningful expression P[X is in B] for the mathematically correct expression P_X[B]. Similarly, we adopt the following expressions for any real numbers a, b, and c:

P[a < X ≤ b] = P_X[{x: a < x ≤ b}],   P[X = c] = P_X[{c}],   P[X ≤ b] = P_X[{x: x ≤ b}].

One obtains the probability function P_X[·] of the random variable X from the probability function P[·], which exists on the sample description space S on which X is defined as a function, by means of the following basic formula: for any Borel set B of real numbers

P_X[B] = P[{s: X(s) is in B}].    (2.2)

Equation (2.2) represents the definition of P_X[·]; it is clear that it embodies the intuitive meaning of P_X[B] given above, since the function X will have an observed value lying in the set B if and only if the observed value s of the underlying random phenomenon is such that X(s) is in B.
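As a concrete sketch of formula (2.2), the snippet below builds a finite probability space and computes the induced probability function P_X of a random variable defined on it. The fair die and the particular function X are assumptions made for illustration; they are not part of the text's examples.

```python
from fractions import Fraction

# Assumed illustration: the die outcomes 1..6, equally likely,
# with X(s) = 1 if s is odd and 2 if s is even.
S = [1, 2, 3, 4, 5, 6]
P = {s: Fraction(1, 6) for s in S}       # probability function on S
X = lambda s: 1 if s % 2 == 1 else 2     # the random variable as a function on S

def P_X(B):
    """P_X[B] = P[{s in S : X(s) in B}] -- the induced probability function."""
    return sum(P[s] for s in S if X(s) in B)

print(P_X({1}))       # 1/2
print(P_X({1, 2}))    # 1
```

Note that P_X is determined entirely by P and the function X; no separate assignment of probabilities on the real line is needed.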

Example 2A. The probability function of the number of white balls in a sample. To illustrate the use of (2.2), let us compute the probability function of the random variable X defined by (1.1). Assuming equally likely descriptions on the sample description space S, one determines for any set B of real numbers that the value of P_X[B] depends on the intersection of B with the set of possible values of X; in particular, P_X[B] = 1 if B contains every possible value of X, and P_X[B] = 0 if B contains none of them.

We may represent the probability function P_X[·] of a random variable X as a distribution of a unit mass over the real line in such a way that the amount of mass over any set B of real numbers is equal to the value P_X[B] of the probability function of X at B. We have seen in Chapter 4 that a distribution of probability mass may be specified in various ways: by means of probability mass functions, probability density functions, and distribution functions. We now introduce these notions in connection with random variables. However, the reader should bear constantly in mind that, as mathematical functions defined on the real line, these notions have the same mathematical properties, whether they arise from random variables or from numerical valued random phenomena.

The probability law of a random variable X is defined as a probability function P[·] over the real line that coincides with the probability function P_X[·] of the random variable X. By definition, probability theory is concerned with the statements that can be made about a random variable, knowing only its probability law. Consequently, a proposition stated about a probability function P[·] is, from the point of view of probability theory, a proposition stated about all random variables X whose probability functions P_X[·] coincide with P[·].

Two random variables X and Y are said to be identically distributed if their probability functions are equal; that is, P_X[B] = P_Y[B] for all Borel sets B.

The distribution function of a random variable X, denoted by F_X(·), is defined for any real number x by

F_X(x) = P[X ≤ x] = P_X[{y: y ≤ x}].

The distribution function F_X(·) of a random variable X possesses all the properties stated in section 3 of Chapter 4 for the distribution function of a numerical valued random phenomenon. The distribution function of X uniquely determines the probability function of X.

The distribution function may be used to classify random variables into types. A random variable X is said to be discrete or continuous, depending on whether its distribution function F_X(·) is discrete or continuous.

The probability mass function of a random variable X, denoted by p_X(·), is a function whose value p_X(x) at any real number x represents the probability that the observed value of the random variable X will be equal to x; in symbols,

p_X(x) = P[X = x].

A real number x for which p_X(x) is positive is called a probability mass point of the random variable X. From the distribution function F_X(·) one may obtain the probability mass function by

p_X(x) = F_X(x) - F_X(x-),

in which F_X(x-) denotes the limit of F_X(x - h) as h tends to 0 through positive values.

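The jump relation p_X(x) = F_X(x) - F_X(x-) can be checked numerically; the step distribution function below, with mass 1/2 at 0 and mass 1/2 at 1, is an assumed example, and the left-hand limit is approximated by evaluating F just below x.

```python
# F is the distribution function of an assumed two-point random variable
# (mass 1/2 at 0 and mass 1/2 at 1).
def F(x):
    if x < 0:
        return 0.0
    if x < 1:
        return 0.5
    return 1.0

def mass_at(x, eps=1e-9):
    """Approximate p_X(x) = F_X(x) - F_X(x-): the jump of F at x."""
    return F(x) - F(x - eps)

print(mass_at(0.0))   # 0.5
print(mass_at(0.5))   # 0.0
print(mass_at(1.0))   # 0.5
```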
A random variable X is discrete if the sum of its probability mass function over the points at which it is positive (there are at most a countably infinite number of such points) is equal to 1; in symbols, X is discrete if

Σ p_X(x) = 1, the sum being extended over the points x at which p_X(x) > 0.

In other words, a random variable X is discrete when the unit mass distributed over the real line in accordance with the probability function P_X[·] is distributed by attaching a positive mass to each of a finite or countably infinite number of points.

If a random variable X is discrete, it suffices to know its probability mass function p_X(·) in order to know its probability function P_X[·], for we have the following formula expressing P_X[·] in terms of p_X(·). If X is discrete, then for any Borel set B of real numbers

P_X[B] = Σ p_X(x), the sum being extended over the probability mass points x of X that lie in B.

Thus, for a discrete random variable X, to evaluate the probability P_X[B] that the random variable will have an observed value lying in B, one has only to list the probability mass points of X which lie in B. One then adds the probability masses attached to these probability mass points to obtain P_X[B].
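This recipe, listing the mass points in B and adding their masses, can be sketched as follows; the probability mass function of the total on two fair dice is an assumed example.

```python
from fractions import Fraction

# Assumed example: pmf of the sum of two fair dice,
# built by counting equally likely outcome pairs.
pmf = {}
for a in range(1, 7):
    for b in range(1, 7):
        pmf[a + b] = pmf.get(a + b, Fraction(0)) + Fraction(1, 36)

def P_X(B):
    """P_X[B]: sum the masses attached to the mass points lying in B."""
    return sum(p for x, p in pmf.items() if x in B)

print(P_X({7}))                  # 1/6
print(P_X(set(range(2, 13))))    # 1
```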

The distribution function of a discrete random variable X is given in terms of its probability mass function by

F_X(x) = Σ p_X(y), the sum being extended over the probability mass points y of X such that y ≤ x.

The distribution function of a discrete random variable is what might be called a piecewise constant or “step” function, as diagrammed in Fig. 3A of Chapter 4. It consists of a series of horizontal lines over the intervals between probability mass points; at a probability mass point x, the graph of F_X(·) jumps upward by an amount p_X(x).

Example 2B. A random variable X has a binomial distribution with parameters n and p, where n is a positive integer and 0 ≤ p ≤ 1, if it is a discrete random variable whose probability mass function is given by, for any real number x,

p_X(x) = (n choose x) p^x (1 - p)^(n - x)   for x = 0, 1, …, n;   p_X(x) = 0 otherwise.

Thus for a random variable X, which has a binomial distribution with parameters n and p, and for any Borel set B of real numbers,

P_X[B] = Σ (n choose x) p^x (1 - p)^(n - x), the sum being extended over the integers x in B satisfying 0 ≤ x ≤ n.

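A minimal sketch of the binomial probability mass function follows; the particular parameters n = 4 and p = 1/2 are assumed for the check, and exact rational arithmetic is used so that the masses sum to 1 exactly.

```python
from math import comb
from fractions import Fraction

# Assumed parameters for illustration: n = 4 trials, success probability p = 1/2.
n, p = 4, Fraction(1, 2)

def pmf(x):
    """Binomial mass: C(n, x) p^x (1-p)^(n-x) for x = 0, 1, ..., n; 0 otherwise."""
    if x in range(n + 1):
        return comb(n, x) * p**x * (1 - p)**(n - x)
    return Fraction(0)

print(pmf(2))                          # 3/8
print(sum(pmf(x) for x in range(5)))   # 1
```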
Example 2C. Identically distributed random variables. Some insight into the notion of identically distributed random variables may be gained by considering the following simple example of two random variables that are distinct as functions and yet are identically distributed. Suppose one is tossing a fair die; consider the random variables X and Y, defined as follows:

Outcome of die:  1  2  3  4  5  6
Value of X:      1  1  1  2  2  2
Value of Y:      2  2  2  1  1  1

It is clear that both X and Y are discrete random variables, whose probability mass functions agree for all x; indeed, p_X(x) = p_Y(x) = 1/2 for x = 1 or 2, and p_X(x) = p_Y(x) = 0 otherwise. Consequently, the probability functions P_X[·] and P_Y[·] agree for all Borel sets B.
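The point of Example 2C can be verified directly: the two functions below disagree at every die outcome, yet their probability mass functions coincide. The particular 1-or-2 assignment is an assumed reading of the example.

```python
from fractions import Fraction

# Assumed sample space: a fair die.  X and Y are distinct as functions on S,
# yet turn out to be identically distributed.
S = [1, 2, 3, 4, 5, 6]
X = lambda s: 1 if s % 2 == 1 else 2   # 1 on odd outcomes, 2 on even
Y = lambda s: 2 if s % 2 == 1 else 1   # the opposite assignment

def pmf(Z):
    """Probability mass function induced by the function Z on the fair die."""
    out = {}
    for s in S:
        out[Z(s)] = out.get(Z(s), Fraction(0)) + Fraction(1, 6)
    return out

print(any(X(s) != Y(s) for s in S))    # True: distinct as functions
print(pmf(X) == pmf(Y))                # True: identically distributed
```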

If a random variable X is continuous, there exists a nonnegative function f_X(·), called the probability density function of the random variable X, which has the following property: for any Borel set B of real numbers

P_X[B] = ∫ f_X(x) dx, the integral being extended over the set B.

In words, for a continuous random variable X, once the probability density function f_X(·) is known, the value of the probability function P_X[B] at any Borel set B may be obtained by integrating the probability density function over the set B.

The distribution function of a continuous random variable X is given in terms of its probability density function by

F_X(x) = ∫ from -∞ to x of f_X(y) dy.

In turn, the probability density function of a continuous random variable X can be obtained from its distribution function by differentiation:

f_X(x) = (d/dx) F_X(x)    (2.12)

at all points at which the derivative on the right-hand side of (2.12) exists.
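Relation (2.12) can be checked numerically with a central difference quotient; the exponential distribution function F_X(x) = 1 - e^(-x), with density f_X(x) = e^(-x) for x ≥ 0, is an assumed example.

```python
import math

# Assumed example: an exponential random variable with parameter 1.
F = lambda x: 1 - math.exp(-x)   # distribution function, x >= 0
f = lambda x: math.exp(-x)       # its density

def dF(x, h=1e-6):
    """Central-difference approximation to (d/dx) F_X(x)."""
    return (F(x + h) - F(x - h)) / (2 * h)

# The numerical derivative of F agrees with f at points where F is differentiable.
for x in (0.5, 1.0, 2.0):
    print(abs(dF(x) - f(x)) < 1e-6)   # True at each point
```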

Example 2D. A random variable X is said to be normally distributed if it is continuous and if constants m and σ exist, where -∞ < m < ∞ and σ > 0, such that the probability density function is given by, for any real number x,

f_X(x) = (1/(σ√(2π))) exp(-(x - m)²/(2σ²)).

Then for any real numbers a and b,

P[a < X ≤ b] = (1/(σ√(2π))) ∫ from a to b of exp(-(x - m)²/(2σ²)) dx.

For a random variable X, which is normally distributed with parameters m and σ, the distribution function may be expressed as

F_X(x) = Φ((x - m)/σ),

in which Φ(·) denotes the distribution function of a normally distributed random variable with parameters m = 0 and σ = 1.

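A sketch of the normal interval probability P[a < X ≤ b] follows, computing the standard normal distribution function Φ via the error function; the parameters m = 0 and σ = 1 used in the printed checks are assumptions for illustration.

```python
import math

def Phi(z):
    """Standard normal distribution function, expressed through math.erf."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def prob_interval(a, b, m=0.0, sigma=1.0):
    """P[a < X <= b] for X normal with parameters m and sigma."""
    return Phi((b - m) / sigma) - Phi((a - m) / sigma)

# Familiar one- and two-sigma probabilities for the assumed standard case.
print(round(prob_interval(-1, 1), 4))   # 0.6827
print(round(prob_interval(-2, 2), 4))   # 0.9545
```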
We conclude this section by making explicit mention of our conventions concerning the use of the letters p, f, and F, and the subscripts X and Y. We shall always use p to denote a probability mass function and then add as a subscript the random variable (which could be denoted by X, Y, etc.) of which it is the probability mass function. Thus p_X(·) denotes the probability mass function of the random variable X, whereas p_X(x) denotes the value of p_X(·) at the point x. Similarly, we write f_X(·), f_Y(·) to denote the probability density functions, respectively, of X, Y, and F_X(·), F_Y(·) to denote the distribution functions, respectively, of X, Y.

Exercises

In exercises 2.1 to 2.8 describe the probability law of the random variable X given.

2.1 . The number of aces in a hand of 13 cards drawn without replacement from a bridge deck.

 

Answer

p_X(x) = (4 choose x)(48 choose 13 - x)/(52 choose 13) for x = 0, 1, 2, 3, 4; p_X(x) = 0 otherwise.
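The hypergeometric answer to exercise 2.1 can be checked directly: the masses below sum to 1 exactly, and, as one would expect, a hand with no ace is more likely than a hand with exactly two.

```python
from math import comb
from fractions import Fraction

# Number of aces in 13 cards drawn without replacement from a 52-card deck:
# hypergeometric masses C(4,x) C(48,13-x) / C(52,13).
def pmf(x):
    return Fraction(comb(4, x) * comb(48, 13 - x), comb(52, 13))

print(sum(pmf(x) for x in range(5)) == 1)   # True: the masses sum to 1
print(float(pmf(0)) > float(pmf(2)))        # True: no ace beats exactly two aces
```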

 

2.2 . The sum of numbers on 2 balls drawn with replacement (without replacement) from an urn containing 6 balls, numbered 1 to 6.

2.3 . The maximum of the numbers on 2 balls drawn with replacement (without replacement) from an urn containing 6 balls, numbered 1 to 6.

 

Answer

Without replacement, p_X(x) = (x - 1)/15 for x = 2, 3, 4, 5, 6; p_X(x) = 0 otherwise. With replacement, p_X(x) = (2x - 1)/36 for x = 1, 2, …, 6; p_X(x) = 0 otherwise.
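A brute-force enumeration confirms the two probability laws for the maximum in exercise 2.3: (2x - 1)/36 with replacement and (x - 1)/15 without; the enumeration over ordered pairs below is an assumed but equivalent model of the draws.

```python
from fractions import Fraction
from itertools import product, permutations

def pmf_with(x):
    """With replacement: count ordered pairs (a, b) from 1..6 whose max is x."""
    hits = sum(1 for a, b in product(range(1, 7), repeat=2) if max(a, b) == x)
    return Fraction(hits, 36)

def pmf_without(x):
    """Without replacement: count ordered pairs of distinct balls whose max is x."""
    hits = sum(1 for a, b in permutations(range(1, 7), 2) if max(a, b) == x)
    return Fraction(hits, 30)

print(all(pmf_with(x) == Fraction(2 * x - 1, 36) for x in range(1, 7)))   # True
print(all(pmf_without(x) == Fraction(x - 1, 15) for x in range(2, 7)))    # True
```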

 

2.4 . The number of white balls drawn in a sample of size 2 drawn with replacement (without replacement) from an urn containing 6 balls, of which 4 are white.

2.5 . The second digit in the decimal expansion of a number chosen on the unit interval in accordance with a uniform probability law.

 

Answer

p_X(x) = 1/10 for x = 0, 1, …, 9; p_X(x) = 0 otherwise.

 

2.6 . The number of times a fair coin is tossed until heads appears (i) for the first time, (ii) for the second time, (iii) the third time.

2.7 . The number of cards drawn without replacement from a deck of 52 cards until (i) a spade appears, (ii) an ace appears.

 

Answer

(i) p_X(x) = (52 - x choose 12)/(52 choose 13) for x = 1, 2, …, 40; p_X(x) = 0 otherwise.

 

(ii) p_X(x) = (52 - x choose 3)/(52 choose 4) for x = 1, 2, …, 49; p_X(x) = 0 otherwise.
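Both answers to exercise 2.7 have the form C(52 - x, k - 1)/C(52, k), where k is the number of special cards in the deck (13 spades, 4 aces); the check below verifies the first-draw probability and that each family of masses sums to 1.

```python
from math import comb
from fractions import Fraction

def first_hit(k, x):
    """P[the first of the k special cards in a 52-card deck appears on draw x]."""
    return Fraction(comb(52 - x, k - 1), comb(52, k))

print(first_hit(13, 1))                                    # 1/4: first card is a spade
print(sum(first_hit(13, x) for x in range(1, 41)) == 1)    # True
print(sum(first_hit(4, x) for x in range(1, 50)) == 1)     # True
```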

2.8 . The number of balls in the first urn if 10 distinguishable balls are distributed in 4 urns in such a manner that each ball is equally likely to be placed in any urn.

In exercises 2.9 to 2.16 find the indicated quantity for the random variable X described.

2.9 . X is normally distributed with parameters m and σ.

 

Answer

.

 

2.10 . X is Poisson distributed with parameter λ.

2.11 . X obeys a binomial probability law with parameters n and p.

 

Answer

.

 

2.12 . X obeys an exponential probability law with parameter λ.

2.13 . X obeys a geometric probability law with parameter p.

 

Answer

.

 

2.14 . X obeys a hypergeometric probability law with parameters N and n.

2.15 . X is uniformly distributed over the interval a to b.

 

Answer

.

 

2.16 . X is Cauchy distributed with parameters α and β.