Random Samples, Randomly Chosen Points (Geometrical Probability), and Random Division of an Interval

The concepts now assembled enable us to explain some of the major meanings assigned to the word “random” in the mathematical theory of probability.

One meaning arises in connection with the notion of a random sample of a random variable $X$. Let us consider a random variable $X$, of which it is possible to make $n$ repeated measurements, denoted by $X_1, X_2, \ldots, X_n$. For example, $X_1, X_2, \ldots, X_n$ may be the lifetimes of each of $n$ electric light bulbs, or they may be the numbers on balls drawn (with or without replacement) from an urn containing balls numbered 1 to 100, and so on. The set of $n$ measurements $X_1, X_2, \ldots, X_n$ is spoken of as a sample of size $n$ of the random variable $X$, by which is meant that each of the measurements $X_k$, for $k = 1, 2, \ldots, n$, is a random variable whose distribution function $F_{X_k}(\cdot)$ is equal, as a function of $x$, to the distribution function $F_X(\cdot)$ of the random variable $X$. If, further, the random variables $X_1, X_2, \ldots, X_n$ are independent, then we say that $X_1, X_2, \ldots, X_n$ is a random sample (or an independent sample) of size $n$ of the random variable $X$. Thus the adjective “random,” when used to describe a sample of a random variable, indicates that the members of the sample are independent identically distributed random variables.

Example 7A. Suppose that the life in hours of electronic tubes of a certain type is known to be approximately normally distributed with parameters $m$ and $\sigma$. What is the probability that a random sample of four tubes will contain no tube with a lifetime of less than 180 hours?

 

Solution

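Let $X_1, X_2, X_3$, and $X_4$ denote the respective lifetimes of the four tubes in the sample. The assumption that the tubes constitute a random sample of a random variable normally distributed with parameters $m$ and $\sigma$ is to be interpreted as assuming that the random variables $X_1, X_2, X_3$, and $X_4$ are independent, with individual probability density functions, for $k = 1, 2, 3, 4$,

$$f_{X_k}(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{1}{2}\left(\frac{x-m}{\sigma}\right)^2\right]. \tag{7.1}$$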

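The probability that each tube in the sample has a lifetime greater than, or equal to, 180 hours, is given by

$$P[X_1 \geq 180, X_2 \geq 180, X_3 \geq 180, X_4 \geq 180] = \prod_{k=1}^{4} P[X_k \geq 180] = \left[1 - \Phi\left(\frac{180-m}{\sigma}\right)\right]^4,$$

since $P[X_k \geq 180] = 1 - \Phi((180-m)/\sigma)$ for each $k$, where $\Phi(\cdot)$ denotes the standard normal distribution function.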
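The computation in Example 7A can be sketched numerically as follows. This is a minimal illustration, not the text's own calculation: the function names are invented here, and the parameter values `m=160.0` and `sigma=20.0` are hypothetical, chosen only so that the code prints a number.

```python
import math


def normal_cdf(z: float) -> float:
    """Standard normal distribution function Phi(z), via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def prob_all_exceed(threshold: float, m: float, sigma: float, n: int) -> float:
    """P[X_1 >= threshold, ..., X_n >= threshold] for a random sample of size n.

    Independence lets the joint probability factor into n equal terms.
    """
    p_single = 1.0 - normal_cdf((threshold - m) / sigma)
    return p_single ** n


if __name__ == "__main__":
    # Hypothetical parameters (assumptions, not taken from the text).
    print(prob_all_exceed(180.0, m=160.0, sigma=20.0, n=4))
```

The fourth power is exactly where the independence assumption enters; for a sample that is identically distributed but not independent, this factorization would not be justified.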

A second meaning of the word “random” arises when it is used to describe a sample drawn from a finite population. A sample, each of whose components is drawn from a finite population, is said to be a random sample if at each draw all candidates available for selection have an equal probability of being selected. The word “random” was used in this sense throughout Chapter 2.

Example 7B. As in example 7A, consider electronic tubes of a certain type whose lifetimes are normally distributed with parameters $m$ and $\sigma$. Let a random sample of four tubes be put into a box. Choose a tube at random from the box. What is the probability that the tube selected will have a lifetime greater than 180 hours?

 

Solution

For $k = 0, 1, \ldots, 4$ let $A_k$ be the event that the box contains $k$ tubes with a lifetime greater than 180 hours. Since the tube lifetimes are independent random variables with probability density functions given by (7.1), it follows that

$$P[A_k] = \binom{4}{k} \left(P[X > 180]\right)^k \left(P[X \leq 180]\right)^{4-k}.$$

Let $B$ be the event that the tube selected from the box has a lifetime greater than 180 hours. The assumption that the tube is selected at random is to be interpreted as assuming that

$$P[B \mid A_k] = \frac{k}{4}, \qquad k = 0, 1, \ldots, 4.$$

The probability of the event $B$ is then given by

$$P[B] = \sum_{k=0}^{4} P[B \mid A_k] P[A_k] = \sum_{k=0}^{4} \frac{k}{4} \binom{4}{k} p^k q^{4-k},$$

where we have let $p = P[X > 180]$ and $q = 1 - p$. Then

$$P[B] = p \sum_{k=1}^{4} \binom{3}{k-1} p^{k-1} q^{4-k} = p,$$

so that the probability that a tube selected at random from a random sample will have a lifetime greater than 180 hours is the same as the probability that any tube of the type under consideration will have a lifetime greater than 180 hours. A similar result was obtained in example 4D of Chapter 3. A theorem generalizing and unifying these results is given in theoretical exercise 4.1 of Chapter 4.
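The identity at the end of Example 7B can be checked numerically. The sketch below assumes only that the number of long-lived tubes in the box is binomially distributed and that the tube is drawn with conditional probability $k/n$; the function name and the test values of `p` are illustrative choices, not values from the text.

```python
from math import comb


def prob_selected_exceeds(p: float, n: int = 4) -> float:
    """Total probability that a tube drawn at random from a box of n tubes
    is long-lived, when each tube is long-lived independently with
    probability p: the sum over k of (k/n) * C(n,k) * p^k * q^(n-k)."""
    q = 1.0 - p
    return sum((k / n) * comb(n, k) * p**k * q**(n - k) for k in range(n + 1))


if __name__ == "__main__":
    for p in (0.1, 0.5, 0.9):
        # The two printed numbers agree up to rounding error.
        print(p, prob_selected_exceeds(p))
```

The agreement holds for every box size `n`, which is the sense in which drawing at random from a random sample reproduces the original probability law.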

The word “random” has a third meaning, which is frequently encountered. The phrase “a point randomly chosen from the interval $a$ to $b$” is used for brevity to describe a random variable obeying a uniform probability law over the interval $a$ to $b$, whereas the phrase “$n$ points chosen randomly from the interval $a$ to $b$” is used for brevity to describe $n$ independent random variables obeying uniform probability laws over the interval $a$ to $b$. Problems involving randomly chosen points have long been discussed by probabilists under the heading of “geometrical probabilities”. In modern terminology problems involving geometrical probabilities may be formulated as problems involving independent random variables, each obeying a uniform probability law.

Example 7C. Two points are selected randomly on a line of length $a$ so as to be on opposite sides of the mid-point of the line. Find the probability that the distance between them is less than .

 

Solution

Introduce a coordinate system on the line so that its left-hand endpoint is 0 and its right-hand endpoint is $a$. Let $X_1$ be the coordinate of the point selected randomly in the interval 0 to $a/2$, and let $X_2$ be the coordinate of the point selected randomly in the interval $a/2$ to $a$. We assume that the random variables $X_1$ and $X_2$ are independent and that each obeys a uniform probability law over its interval. The joint probability density function of $X_1$ and $X_2$ is then $f_{X_1, X_2}(x_1, x_2) = 4/a^2$ for $0 < x_1 < a/2$ and $a/2 < x_2 < a$, and is 0 otherwise. The event that the distance between the two points selected is less than the stated bound is then the event that $X_2 - X_1$ is less than that bound. The probability of this event is the probability attached to the hatched area in Fig. 7A. However, this probability can be represented as the ratio of the area of the cross-hatched triangle and the area of the shaded rectangle; thus

 

Fig. 7A.

 

Example 7D. Consider again the random variables $X_1$ and $X_2$ defined in example 7C. Find the probability that the three line segments (from 0 to $X_1$, from $X_1$ to $X_2$, and from $X_2$ to $a$) could be made to form the three sides of a triangle.

 

Solution

In order that the three line segments mentioned can form a triangle, it is necessary and sufficient that the following inequalities be fulfilled (why?):

$$X_1 < (X_2 - X_1) + (a - X_2), \qquad X_2 - X_1 < X_1 + (a - X_2), \qquad a - X_2 < X_1 + (X_2 - X_1);$$

equivalently, $X_1 < a/2$, $X_2 - X_1 < a/2$, and $X_2 > a/2$. The probability of these inequalities being fulfilled is the probability of the hatched area in Fig. 7B, which is clearly $\frac{1}{2}$. It might be noted that if each of the two points, with coordinates $X_1$ and $X_2$, is chosen randomly on the interval 0 to $a$, then the probability is only $\frac{1}{4}$ that the three line segments determined by the two points could be made to form the three sides of a triangle.

 

Fig. 7B.
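The two triangle probabilities discussed in Example 7D can be estimated by simulation. The following is a rough Monte Carlo sketch: the function names, the default line length `a=1.0`, the number of trials, and the seed are all arbitrary choices made for illustration. It uses the fact that three segments of total length $a$ form a triangle exactly when the longest is shorter than $a/2$.

```python
import random


def can_form_triangle(x1: float, x2: float, a: float) -> bool:
    """True if the segments cut from [0, a] at x1 and x2 can form a triangle."""
    lo, hi = min(x1, x2), max(x1, x2)
    sides = (lo, hi - lo, a - hi)
    # The sides sum to a, so each is less than the sum of the other two
    # exactly when each is less than a / 2.
    return max(sides) < a / 2


def estimate_triangle_probability(opposite_sides: bool, a: float = 1.0,
                                  trials: int = 200_000, seed: int = 0) -> float:
    """Estimate the triangle probability by repeated sampling.

    opposite_sides=True : one point uniform on (0, a/2), the other on (a/2, a).
    opposite_sides=False: both points uniform on (0, a).
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        if opposite_sides:
            x1, x2 = rng.uniform(0.0, a / 2), rng.uniform(a / 2, a)
        else:
            x1, x2 = rng.uniform(0.0, a), rng.uniform(0.0, a)
        hits += can_form_triangle(x1, x2, a)
    return hits / trials


if __name__ == "__main__":
    print(estimate_triangle_probability(True))   # points on opposite sides of the mid-point
    print(estimate_triangle_probability(False))  # both points anywhere on the line
```

Forcing the points onto opposite sides of the mid-point removes the two ways in which an end segment can be too long, which is why the first estimate is noticeably larger than the second.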

 

Problems involving geometrical probability have played a major role in the development of the modern conception of probability. In the nineteenth century the Laplacean definition of probability was widely accepted. It was thought that probability problems could be given unique solutions by means of finding the proper framework of “equally likely” descriptions. To contradict this point of view, examples were constructed that admitted of several equally plausible, but incompatible, solutions. We now discuss an example similar to one given by Joseph Bertrand in his treatise Calcul des probabilités, Paris, 1889, p. 4, and later called by Poincaré, “Bertrand’s paradox”. It was pointed out to the author by one of his students that this example should serve as a warning to all persons who adopt practical policies on the basis of theoretical solutions, without first establishing that the assumptions underlying the solutions are in accord with the experimentally observed facts.

Example 7E. Bertrand’s paradox. Let a chord be chosen randomly in a circle of radius $r$. What is the probability that the length of the chord will be less than the radius $r$?

 

Solution

It is not clear what is meant by a randomly chosen chord. In order to give meaning to this phrase, we shall reformulate the problem as one involving randomly chosen points. We shall state two methods for randomly choosing points to determine a chord. In this manner we obtain two distinct answers for the probability that the length of a randomly chosen chord will be less than the radius $r$.

 

One method is as follows: let $X_1$ and $Y_1$ be points chosen randomly in the interval 0 to $2\pi$ and the interval 0 to $r$, respectively. Draw a chord by letting $X_1$ be the angle made by the chord with a fixed reference line and by letting $Y_1$ be the (perpendicular) distance of the mid-point of the chord from the center of the circle (see Fig. 7C). A second method of randomly choosing a chord is as follows: let $X_2$ and $Y_2$ be points chosen randomly in the interval 0 to $2\pi$ and the interval 0 to $\pi/2$, respectively. Draw a chord by letting $X_2$ and $Y_2$ be the angles indicated in Fig. 7D. The reader may be able to think of other methods of choosing points to determine a chord. Six different solutions of Bertrand’s paradox are given in Czuber, Wahrscheinlichkeitsrechnung, B. G. Teubner, Leipzig, 1908, pp. 106–109.

Fig. 7C.

Fig. 7D.

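The length $\mathscr{L}$ of the chord may be expressed in terms of the random variables $X_1, Y_1, X_2$, and $Y_2$:

$$\mathscr{L} = 2\sqrt{r^2 - Y_1^2} \qquad \text{or} \qquad \mathscr{L} = 2r\cos Y_2.$$

Consequently $\mathscr{L} < r$ if and only if $Y_1 > \frac{\sqrt{3}}{2} r$, or $Y_2 > \frac{\pi}{3}$. In both cases the required probability is equal to the ratio of the areas of the hatched regions in Figs. 7C and 7D to the areas of the corresponding shaded regions. The first solution yields the answer

$$P[\mathscr{L} < r] = 1 - \frac{\sqrt{3}}{2} \approx 0.134, \tag{7.10}$$

whereas the second solution yields the answer

$$P[\mathscr{L} < r] = \frac{1}{3} \tag{7.11}$$

for the probability that the length of the chord chosen will be less than the radius of the circle.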

It should be noted that random experiments could be performed in such a way that either (7.10) or (7.11) would be the correct probability in the sense of the frequency definition of probability. If a disk of diameter $2r$ were cut out of cardboard and thrown at random on a table ruled with parallel lines a distance $2r$ apart, then one and only one of these lines would cross the disk. All distances from the center would be equally likely, and (7.10) would represent the probability that the chord drawn by the line across the disk would have a length less than $r$. On the other hand, if the disk were held by a pivot through a point on its edge, which point lay upon a certain straight line, and spun randomly about this point, then (7.11) would represent the probability that the chord drawn by the line across the disk would have a length less than $r$.
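The two physical experiments just described can be imitated by simulation. The sketch below rests on two stated assumptions rather than on the figures: in the first method the distance of the chord's mid-point from the center is uniform on 0 to $r$, giving a chord of length $2\sqrt{r^2 - y^2}$; in the second the angle between the chord and the diameter through one of its endpoints is uniform on 0 to $\pi/2$, giving a chord of length $2r\cos y$. Function names, trial counts, and the seed are illustrative.

```python
import math
import random


def chord_length_by_distance(rng: random.Random, r: float) -> float:
    """Method one (assumed): mid-point distance from the center uniform on (0, r)."""
    y = rng.uniform(0.0, r)
    return 2.0 * math.sqrt(r * r - y * y)


def chord_length_by_angle(rng: random.Random, r: float) -> float:
    """Method two (assumed): angle between the chord and the diameter through
    one endpoint uniform on (0, pi/2)."""
    y = rng.uniform(0.0, math.pi / 2)
    return 2.0 * r * math.cos(y)


def estimate_short_chord(method, r: float = 1.0,
                         trials: int = 200_000, seed: int = 0) -> float:
    """Estimate P[chord length < r] under the given sampling method."""
    rng = random.Random(seed)
    return sum(method(rng, r) < r for _ in range(trials)) / trials


if __name__ == "__main__":
    print(estimate_short_chord(chord_length_by_distance), 1 - math.sqrt(3) / 2)
    print(estimate_short_chord(chord_length_by_angle), 1 / 3)
```

The direction of the chord does not affect its length under either method, so the corresponding angle is not sampled. The two estimates differ by roughly a factor of 2.5, which is the point of the paradox: "chosen randomly" does not determine a probability law until the sampling mechanism is specified.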

The following example has many important extensions and practical applications.

Example 7F. The probability of an uncrowded road. Along a straight road, $L$ miles long, are $n$ distinguishable persons, distributed at random. Show that the probability that no two persons will be less than a distance $d$ miles apart is equal to, for $d$ such that $(n-1)d \leq L$,

$$\left(1 - \frac{(n-1)d}{L}\right)^n. \tag{7.12}$$

Solution

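For $j = 1, 2, \ldots, n$ let $X_j$ denote the position of the $j$th person. We assume that $X_1, X_2, \ldots, X_n$ are independent random variables, each uniformly distributed over the interval 0 to $L$. Their joint probability density function is then given by

$$f_{X_1, \ldots, X_n}(x_1, \ldots, x_n) = \frac{1}{L^n} \quad \text{for } 0 \leq x_1, x_2, \ldots, x_n \leq L, \qquad = 0 \quad \text{otherwise.}$$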

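Next, for each permutation, or ordered $n$-tuple chosen without replacement, $(i_1, i_2, \ldots, i_n)$ of the integers 1 to $n$, define

$$I(i_1, i_2, \ldots, i_n) = \{(x_1, x_2, \ldots, x_n): \ x_{i_1} < x_{i_2} < \cdots < x_{i_n}\}.$$

Thus $I(i_1, i_2, \ldots, i_n)$ is a zone of points in $n$-dimensional Euclidean space. There are $n!$ such zones that are mutually exclusive. The union of all zones does not include all the points in $n$-dimensional space, since an $n$-tuple $(x_1, x_2, \ldots, x_n)$ that contains two equal components does not lie in any zone. However, we are able to ignore the set of points not included in any of the zones, since this set has probability zero under a continuous probability law. Now the event $A$ that no two persons are less than a distance $d$ apart may be represented as the set of $n$-tuples $(x_1, x_2, \ldots, x_n)$ for which the distance $|x_i - x_j|$ between any two components is greater than $d$. To find the probability of $A$, we must first find the probability of the intersection of $A$ and each zone $I(i_1, i_2, \ldots, i_n)$. We may represent this intersection as follows:

$$A\,I(i_1, i_2, \ldots, i_n) = \{(x_1, x_2, \ldots, x_n): \ x_{i_1} + d < x_{i_2}, \ x_{i_2} + d < x_{i_3}, \ \ldots, \ x_{i_{n-1}} + d < x_{i_n}\}.$$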

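Consequently,

$$P[A\,I(i_1, \ldots, i_n)] = \frac{1}{L^n} \int_0^{L-(n-1)d} dx_{i_1} \int_{x_{i_1}+d}^{L-(n-2)d} dx_{i_2} \cdots \int_{x_{i_{n-1}}+d}^{L} dx_{i_n} = \int_0^{1-(n-1)d'} du_1 \int_{u_1+d'}^{1-(n-2)d'} du_2 \cdots \int_{u_{n-1}+d'}^{1} du_n,$$

in which we have made the change of variables $u_1 = x_{i_1}/L, \ldots, u_n = x_{i_n}/L$, and have set $d' = d/L$. The last written integral is readily evaluated and is seen to be equal to, for $(n-1)d' \leq 1$,

$$\frac{1}{n!}\left(1 - (n-1)d'\right)^n.$$

The probability of $A$ is equal to the product of $n!$ and the probability of the intersection of $A$ and any zone $I(i_1, i_2, \ldots, i_n)$. The proof of (7.12) is now complete.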
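The result of Example 7F can be compared with a simulation. The sketch below assumes the closed form $(1 - (n-1)d/L)^n$ for $(n-1)d \leq L$ (and probability zero when $(n-1)d > L$, since the persons then cannot all fit), and draws positions independently and uniformly on 0 to $L$; the function names, trial count, and seed are hypothetical.

```python
import random


def uncrowded_probability(n: int, d: float, L: float) -> float:
    """Closed-form probability that no two of n random points on a road of
    length L are less than d apart."""
    if (n - 1) * d > L:
        return 0.0  # the points cannot all be separated by more than d
    return (1.0 - (n - 1) * d / L) ** n


def simulate_uncrowded(n: int, d: float, L: float,
                       trials: int = 100_000, seed: int = 0) -> float:
    """Monte Carlo estimate of the same probability."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        xs = sorted(rng.uniform(0.0, L) for _ in range(n))
        # After sorting, it suffices to check neighbouring positions.
        if all(b - a > d for a, b in zip(xs, xs[1:])):
            hits += 1
    return hits / trials


if __name__ == "__main__":
    n, d, L = 3, 0.1, 1.0
    print(uncrowded_probability(n, d, L), simulate_uncrowded(n, d, L))
```

Sorting the sample plays the same role as the zones in the proof: within any one ordering of the persons, the event reduces to a condition on successive gaps.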

In a similar manner one may solve the following problem.

Example 7G. Packing cylinders randomly on a rod. Consider a horizontal rod of length $L$ on which $n$ equal cylinders, each of length $c$, are distributed at random. The probability that no two cylinders will be less than $d$ apart is equal to, for $d$ such that $nc + (n-1)d \leq L$,

$$\left(1 - \frac{(n-1)d}{L - nc}\right)^n.$$

The foregoing considerations, together with (6.2) of Chapter 2, establish an extremely useful result.

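The Random Division of an Interval or a Circle. Suppose that a straight line of length $L$ is divided into $n$ sub-intervals by $n - 1$ points chosen at random on the line or that a circle of circumference $L$ is divided into $n$ sub-intervals by $n$ points chosen at random on the circle. Then the probability $P_k$ that exactly $k$ of the sub-intervals will exceed $d$ in length is given by

$$P_k = \binom{n}{k} \sum_{j} (-1)^{j-k} \binom{n-k}{j-k} \left(1 - \frac{jd}{L}\right)^{n-1}, \tag{7.19}$$

in which the sum extends over the integers $j = k, k+1, \ldots, n$ for which $jd \leq L$.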

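It may clarify the meaning of (7.19) to express it in terms of random variables. Let $X_1, X_2, \ldots, X_{n-1}$ be the coordinates of the $n - 1$ points chosen randomly on the line (a similar discussion may be given for the circle). Then $X_1, X_2, \ldots, X_{n-1}$ are independent random variables, each uniformly distributed on the interval 0 to $L$. Define new random variables $Y_1, Y_2, \ldots, Y_{n-1}$: $Y_1$ is equal to the minimum of $X_1, X_2, \ldots, X_{n-1}$; $Y_2$ is equal to the second smallest number among $X_1, X_2, \ldots, X_{n-1}$; and so on, up to $Y_{n-1}$, which is equal to the maximum of $X_1, X_2, \ldots, X_{n-1}$. The random variables $Y_1, Y_2, \ldots, Y_{n-1}$ thus constitute a reordering of the random variables $X_1, X_2, \ldots, X_{n-1}$, according to increasing magnitude. For this reason, the random variables $Y_1, Y_2, \ldots, Y_{n-1}$ are called the order statistics corresponding to $X_1, X_2, \ldots, X_{n-1}$. The random variable $Y_k$, for $k = 1, 2, \ldots, n-1$, is usually spoken of as the $k$th smallest value among $X_1, X_2, \ldots, X_{n-1}$.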

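The lengths $W_1, W_2, \ldots, W_n$ of the successive subintervals into which the randomly chosen points divide the line may now be expressed:

$$W_1 = Y_1, \qquad W_j = Y_j - Y_{j-1} \ \text{ for } j = 2, \ldots, n-1, \qquad W_n = L - Y_{n-1}.$$

The probability $P_k$ is the probability that exactly $k$ of the $n$ events $[W_1 > d], [W_2 > d], \ldots, [W_n > d]$ will occur. To prove (7.19), one needs only to verify that for any integer $j = 1, 2, \ldots, n$ the probability that $j$ specified subintervals will exceed $d$ in length is equal to

$$\left(1 - \frac{jd}{L}\right)^{n-1} \quad \text{if } jd \leq L, \qquad 0 \quad \text{otherwise.}$$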

References to the large variety of problems to which (7.19) is applicable may be found in two papers: J. O. Irwin, “A Unified Derivation of Some Well-known Frequency Distributions of Interest in Biometry and Statistics,” Journal of the Royal Statistical Society A, Vol. 118 (1955), pp. 389–398, and L. Takacs, “On a general probability theorem and its application in the theory of stochastic processes,” Proceedings of the Cambridge Philosophical Society, Vol. 54 (1958), pp. 219–224.
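As a numerical illustration of the random division of a line, the sketch below evaluates the inclusion-exclusion form $P_k = \binom{n}{k} \sum_j (-1)^{j-k} \binom{n-k}{j-k} (1 - jd/L)^{n-1}$, summed over $j \geq k$ with $jd \leq L$, and compares it with a simulation that cuts the line at $n - 1$ uniformly chosen points. That this expression is the intended form of (7.19) is an assumption of the sketch; function names, trial count, and seed are illustrative, and the code supposes $n \geq 2$.

```python
from math import comb
import random


def prob_exactly_k_exceed(n: int, k: int, d: float, L: float = 1.0) -> float:
    """Probability that exactly k of the n subintervals exceed d in length,
    by inclusion-exclusion over sets of j specified subintervals (n >= 2)."""
    total = 0.0
    j = k
    while j <= n and j * d <= L:
        total += (-1) ** (j - k) * comb(n - k, j - k) * (1.0 - j * d / L) ** (n - 1)
        j += 1
    return comb(n, k) * total


def simulate_division(n: int, d: float, L: float = 1.0,
                      trials: int = 100_000, seed: int = 0) -> list:
    """Empirical distribution of the number of subintervals longer than d."""
    rng = random.Random(seed)
    counts = [0] * (n + 1)
    for _ in range(trials):
        cuts = sorted(rng.uniform(0.0, L) for _ in range(n - 1))  # order statistics
        ends = [0.0] + cuts + [L]
        k = sum(b - a > d for a, b in zip(ends, ends[1:]))
        counts[k] += 1
    return [c / trials for c in counts]


if __name__ == "__main__":
    n, d = 5, 0.15
    exact = [prob_exactly_k_exceed(n, k, d) for k in range(n + 1)]
    print(exact)
    print(simulate_division(n, d))
```

A quick consistency check is that the computed probabilities sum to 1 over $k = 0, 1, \ldots, n$, as they must for any formula of this inclusion-exclusion type.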

Theoretical Exercises

7.1. Buffon’s Needle Problem. A smooth table is ruled with equidistant parallel lines at distance $a$ apart. A needle of length $l < a$ is dropped on the table. Show that the probability that it will cross one of the lines is $2l/(\pi a)$. For an account of some experiments made in connection with the Buffon Needle Problem see J. V. Uspensky, Introduction to Mathematical Probability, McGraw-Hill, New York, 1937, pp. 112–113.

7.2. A straight line of unit length is divided into $n$ subintervals by $n - 1$ points chosen at random. For $m = 1, 2, \ldots, n$ and $0 \leq mx \leq 1$, show that the probability that none of $m$ specified subintervals will be less than $x$ in length is equal to $(1 - mx)^{n-1}$. Hence, using (6.3) of Chapter 2, conclude that the probability that at least 1 of the subintervals will exceed $x$ in length is equal to

$$n(1 - x)^{n-1} - \binom{n}{2}(1 - 2x)^{n-1} + \binom{n}{3}(1 - 3x)^{n-1} - \cdots,$$

the series continuing as long as the terms are positive.

Exercises

7.1 . A young man and a young lady plan to meet between 5 and 6 P.M., each agreeing not to wait more than 10 minutes for the other. Find the probability that they will meet if they arrive independently at random times between 5 and 6 P.M.

 

Answer

$\frac{11}{36}$.

 

7.2 . Consider light bulbs produced by a machine for which it is known that the life in hours of a light bulb produced by the machine is a random variable with probability density function

Consider a box containing 100 such bulbs, selected randomly from the output of the machine.

(i) What is the probability that a bulb selected randomly from the box will have a lifetime greater than 1020 hours?

(ii) What is the probability that a sample of 5 bulbs selected randomly from the box will contain (a) at least 1 bulb, (b) 4 or more bulbs with a lifetime greater than 1020 hours?

(iii) Find approximately the probability that the box will contain between 30 and 40 bulbs, inclusive, with a lifetime greater than 1020 hours.

7.3 . Six soldiers take up random positions on a road 2 miles long. What is the probability that the distance between any two soldiers will be more than (i) , (ii) , (iii) of a mile?

 

Answer

(i) 0; (ii) ; (iii) .

 

7.4 . Another version of Bertrand’s paradox . Let a chord be drawn at random in a given circle. What is the probability that the length of the chord will be greater than the side of the equilateral triangle inscribed in that circle?

7.5 . A point is chosen randomly on each of 2 adjacent sides of a square. Find the probability that the area of the triangle formed by the sides of the square and the line joining the 2 points will be (i) less than of the area of the square, (ii) greater than of the area of the square.

 

Answer

(i) ; (ii) 0.

 

7.6 . Three points are chosen randomly on the circumference of a circle. What is the probability that there will be a semicircle in which all will lie?

7.7. A line is divided into 3 subintervals by choosing 2 points randomly on the line. Find the probability that the 3 line segments thus formed could be made to form the sides of a triangle.

 

Answer

$\frac{1}{4}$.

 

7.8 . Find the probability that the roots of the equation will be real if (i) and are randomly chosen between 0 and 1, (ii) is randomly chosen between 0 and 1, and is randomly chosen between -1 and 1.

7.9 . In the interval to , points are chosen randomly. Find (i) the probability that the point lying farthest to the right will be to the right of the number 0.6, (ii) the probability that the point lying farthest to the left will be to the left of the number 0.6, (iii) the probability that the point lying next farthest to the left will be to the right of the number 0.6.

 

Answer

(i) ; (ii) ; (iii) .

 

7.10 . A straight line of unit length is divided into 10 subintervals by 9 points chosen at random. For any (i) number , (ii) number find the probability that none of the subintervals will exceed in length.