Markov Dependent Bernoulli Trials

Of interest in many problems of applied probability theory is the evolution in time of the state of a random phenomenon. For example, suppose one has two urns (I and II), each of which contains one white and one black ball. One ball is drawn simultaneously from each urn and placed in the other urn. One is often concerned with questions such as, what is the probability that after 100 repetitions of this procedure urn I will contain two white balls? The theory of Markov¹ chains is applicable to questions of this type.
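The urn question just posed can be answered by iterating the transition probabilities among the three possible states of urn I. The following Python sketch (the state encoding and names are ours, not part of the text) computes the distribution after 100 swaps exactly:

```python
from fractions import Fraction

# State = number of white balls in urn I (0, 1, or 2).  Drawing one ball
# from each urn and swapping them gives these transition probabilities
# (row = current state, column = next state).
h, q = Fraction(1, 2), Fraction(1, 4)
P = [
    [0, 1, 0],        # urn I all black: it must gain a white ball
    [q, h, q],        # one white ball in each urn
    [0, 1, 0],        # urn I all white: it must lose a white ball
]

def step(dist):
    """Apply the transition matrix to a distribution over the states."""
    return [sum(dist[i] * P[i][j] for i in range(3)) for j in range(3)]

dist = [0, 1, 0]      # initially each urn holds one white and one black
for _ in range(100):
    dist = step(dist)

prob_two_white = dist[2]
```

The distribution settles very quickly near $(\frac{1}{6}, \frac{2}{3}, \frac{1}{6})$, so the probability that urn I contains two white balls after 100 repetitions is approximately $\frac{1}{6}$.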

The theory of Markov chains relates to every field of physical and social science (see the forthcoming book by A. T. Bharucha-Reid, Introduction to the Theory of Markov Processes and their Applications; for applications of the theory of Markov chains to the description of social or psychological phenomena, see the book by Kemeny and Snell cited in the next paragraph). There is an immense literature concerning the theory of Markov chains. In this section and the next we can provide only a brief introduction.

Excellent elementary accounts of this theory are to be found in the works of W. Feller, An Introduction to Probability Theory and Its Applications, second edition, Wiley, New York, 1957, and J. G. Kemeny and J. L. Snell, Finite Markov Chains, Van Nostrand, Princeton, New Jersey, 1959. The reader is referred to these books for proof of the assertions made in section 6.

The natural generalization of the notion of independent Bernoulli trials is the notion of Markov dependent Bernoulli trials. Given $n$ trials of an experiment, which has only two possible outcomes (denoted by $s$ or $f$, for “success” or “failure”), we recall that they are said to be independent Bernoulli trials if for any integer $k$ from 1 to $n-1$ and events $A_{1}, A_{2}, \ldots, A_{k+1}$, depending, respectively, on the first, second,…, $(k+1)$st trials, \begin{align} P\left[A_{k+1} \mid A_{1} A_{2} \cdots A_{k}\right]=P\left[A_{k+1}\right] . \tag{5.1} \end{align} We define the trials as Markov dependent Bernoulli trials if, instead of (5.1), it holds that \begin{align} P\left[A_{k+1} \mid A_{1} A_{2} \cdots A_{k}\right]=P\left[A_{k+1} \mid A_{k}\right] . \tag{5.2} \end{align} In words, (5.2) says that at the $k$th trial the conditional probability of any event $A_{k+1}$, depending on the next trial, does not depend on what has happened in past trials but only on what is happening at the present time. One sometimes says that the trials have no memory.
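Markov dependent Bernoulli trials are easy to simulate: to generate the next outcome one needs only the previous outcome and the conditional probabilities of success given success and given failure. A short Python sketch (all names and the sample parameters are ours):

```python
import random

def markov_trials(n, p1_s, P_ss, P_fs, seed=0):
    """Generate n Markov dependent Bernoulli trials as a string of 's'
    and 'f'.  p1_s is the probability of success at the first trial;
    P_ss and P_fs are the conditional probabilities of success given
    success (respectively, failure) at the preceding trial."""
    rng = random.Random(seed)
    outcome = "s" if rng.random() < p1_s else "f"
    trials = [outcome]
    for _ in range(n - 1):
        p = P_ss if outcome == "s" else P_fs
        outcome = "s" if rng.random() < p else "f"
        trials.append(outcome)
    return "".join(trials)

# With P(s,s) = 0.9 and P(f,s) = 0.2 (so P(f,f) = 0.8), successes and
# failures come in long runs; the overall frequency of 's' nevertheless
# tends to (1 - P(f,f)) / (2 - P(s,s) - P(f,f)) = 2/3.
sample = markov_trials(10_000, 0.5, 0.9, 0.2)
```

Note the contrast with independent trials: here the outcomes cluster, yet the long-run frequency of success is still a fixed number determined by the conditional probabilities alone.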

Suppose that the quantities \begin{align} P(s, s)= & \text { probability of success at the }(k+1) \text {st trial, } \\ & \text { given that there was success at the } k \text {th trial, } \\ P(f, s)= & \text { probability of success at the }(k+1) \text {st trial, } \\ & \text { given that there was failure at the } k \text {th trial, } \tag{5.3} \\ P(f, f)= & \text { probability of failure at the }(k+1) \text {st trial, } \\ & \text { given that there was failure at the } k \text {th trial, } \\ P(s, f)= & \text { probability of failure at the }(k+1) \text {st trial, } \\ & \text { given that there was success at the } k \text {th trial, } \end{align} are independent of $k$. We then say that the trials are Markov dependent repeated Bernoulli trials.

Example 5A . Let the weather be observed on consecutive days. Let $s$ describe a day on which rain falls, and let $f$ describe a day on which no rain falls. Suppose one assumes that weather observations constitute a series of Markov dependent Bernoulli trials (or, in the terminology of section 6, a Markov chain with two states). Then $P(s, s)$ is the probability of rain tomorrow, given rain today; $P(s, f)$ is the probability of no rain tomorrow, given rain today; $P(f, f)$ is the probability of no rain tomorrow, given no rain today; and $P(f, s)$ is the probability of rain tomorrow, given no rain today. It is now natural to ask for such probabilities as that of rain the day after tomorrow, given no rain today; we denote this probability by $P_{2}(f, s)$ and obtain a formula for it.

In the case of independent repeated Bernoulli trials the probability function on the sample description space of the trials is completely specified once we have the probability of success at each trial. In the case of Markov dependent repeated Bernoulli trials it suffices to specify the quantities \begin{align} p_{1}(s) & =\text { probability of success at the first trial, } \\ p_{1}(f) & =\text { probability of failure at the first trial, } \tag{5.4} \end{align} as well as the conditional probabilities in (5.3). The probability of any event can be computed in terms of these quantities. For example, for $k=1,2, \ldots$ let \begin{align} p_{k}(s) & =\text { probability of success at the } k \text {th trial, } \\ p_{k}(f) & =\text { probability of failure at the } k \text {th trial. } \tag{5.5} \end{align} 

The quantities $p_{k}(s)$ satisfy the following equations for $k=2,3, \ldots$: \begin{align} p_{k}(s) & =p_{k-1}(s) P(s, s)+p_{k-1}(f) P(f, s) \tag{5.6} \\ & =p_{k-1}(s) P(s, s)+\left[1-p_{k-1}(s)\right][1-P(f, f)] \\ & =p_{k-1}(s)[P(s, s)+P(f, f)-1]+[1-P(f, f)] \end{align} 

To justify (5.6) , we reason as follows: if $S_{k}$ is the event that there is success at the $k$th trial, and $F_{k}$ is the event that there is failure at the $k$th trial, then \begin{align} p_{k}(s)=P\left[S_{k}\right]=P\left[S_{k} \mid S_{k-1}\right] P\left[S_{k-1}\right]+P\left[S_{k} \mid F_{k-1}\right] P\left[F_{k-1}\right] . \tag{5.7} \end{align}

From (5.7) and the fact that \begin{align} P\left[S_{k} \mid S_{k-1}\right]=P(s, s), \quad P\left[S_{k} \mid F_{k-1}\right]=P(f, s), \tag{5.8} \end{align} one obtains (5.6) .
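The recursion (5.6) lends itself to direct computation: starting from $p_{1}(s)$, each successive $p_{k}(s)$ is obtained from its predecessor. A Python sketch of the iteration (the function name is ours):

```python
def success_probs(p1_s, P_ss, P_ff, n):
    """Iterate the recursion (5.6),
        p_k(s) = p_{k-1}(s) [P(s,s) + P(f,f) - 1] + [1 - P(f,f)],
    returning the list [p_1(s), p_2(s), ..., p_n(s)]."""
    r = P_ss + P_ff - 1          # factor multiplying p_{k-1}(s) in (5.6)
    probs = [p1_s]
    for _ in range(n - 1):
        probs.append(probs[-1] * r + (1 - P_ff))
    return probs
```

As a sanity check, for $P(s, s)=P(f, f)=\frac{1}{2}$ the trials are in fact independent with success probability $\frac{1}{2}$, and the iteration returns $\frac{1}{2}$ from the second trial on, whatever $p_{1}(s)$ may be.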

Equation (5.6) constitutes a recursive relationship for $p_{k}(s)$, known as a difference equation. Throughout this section we make the assumption that \begin{align} 0<P(s, s)+P(f, f)<2 . \tag{5.9} \end{align} 

By using theoretical exercise 4.5, it follows from (5.6) and (5.9) that for $k=1,2, \ldots$ \begin{align} p_{k}(s)=\left[p_{1}(s)\right. & \left.-\frac{1-P(f, f)}{2-P(s, s)-P(f, f)}\right][P(s, s)+P(f, f)-1]^{k-1} \tag{5.10} \\ & +\left[\frac{1-P(f, f)}{2-P(s, s)-P(f, f)}\right]. \end{align} 

By interchanging the roles of $s$ and $f$ in (5.10) , we obtain, similarly, for $k=1,2, \ldots$ \begin{align} p_{k}(f)=\left[p_{1}(f)\right. & \left.-\frac{1-P(s, s)}{2-P(s, s)-P(f, f)}\right][P(s, s)+P(f, f)-1]^{k-1} \tag{5.11} \\ & +\left[\frac{1-P(s, s)}{2-P(s, s)-P(f, f)}\right]. \end{align} 

It is readily verifiable that the expressions in (5.10) and (5.11) sum to one, as they ought.
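The closed forms may also be checked numerically: (5.10) should agree with the recursion (5.6), and (5.10) and (5.11) should sum to one. A Python sketch (the function names and sample parameters are ours):

```python
def pk_s_closed(p1_s, P_ss, P_ff, k):
    """Closed form (5.10) for p_k(s); requires P(s,s) + P(f,f) != 2."""
    r = P_ss + P_ff - 1
    limit_s = (1 - P_ff) / (2 - P_ss - P_ff)
    return (p1_s - limit_s) * r ** (k - 1) + limit_s

def pk_f_closed(p1_f, P_ss, P_ff, k):
    """Closed form (5.11) for p_k(f)."""
    r = P_ss + P_ff - 1
    limit_f = (1 - P_ss) / (2 - P_ss - P_ff)
    return (p1_f - limit_f) * r ** (k - 1) + limit_f

# Check (5.10) against the recursion (5.6), and (5.10) + (5.11) = 1,
# for arbitrarily chosen parameters.
p1_s, P_ss, P_ff = 0.3, 0.7, 0.6
p = p1_s
for k in range(1, 11):
    assert abs(pk_s_closed(p1_s, P_ss, P_ff, k) - p) < 1e-12
    assert abs(pk_s_closed(p1_s, P_ss, P_ff, k)
               + pk_f_closed(1 - p1_s, P_ss, P_ff, k) - 1) < 1e-12
    p = p * (P_ss + P_ff - 1) + (1 - P_ff)   # advance the recursion (5.6)
```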

In many problems involving Markov dependent repeated Bernoulli trials we do not know the probability of success at the first trial. We can only compute the quantities

\begin{align} P_{k}(s, s)= & \text { conditional probability of success at the } \\ & (k+1) \text { st trial, given success at the first trial, } \\ P_{k}(s, f)= & \text { conditional probability of failure at the } \\ & (k+1) \text { st trial, given success at the first trial, } \\ P_{k}(f, f)= & \text { conditional probability of failure at the } \tag{5.12} \\ & (k+1) \text { st trial, given failure at the first trial, } \\ P_{k}(f, s)= & \text { conditional probability of success at the } \\ & (k+1) \text { st trial, given failure at the first trial. } \end{align} 

Since \begin{align} P_{k}(s, f)=1-P_{k}(s, s), \quad P_{k}(f, s)=1-P_{k}(f, f), \tag{5.13} \end{align} it suffices to obtain formulas for $P_{k}(s, s)$ and $P_{k}(f, f)$.

In the same way that we obtained (5.6) we obtain \begin{align} P_{k}(s, s) & =P_{k-1}(s, s) P(s, s)+P_{k-1}(s, f) P(f, s) \tag{5.14} \\ & =P_{k-1}(s, s) P(s, s)+\left[1-P_{k-1}(s, s)\right][1-P(f, f)] \\ & =P_{k-1}(s, s)[P(s, s)+P(f, f)-1]+[1-P(f, f)] \end{align} 

By using theoretical exercise 4.5, it follows from (5.14) and (5.9) that for $k=1,2, \ldots$ \begin{align} P_{k}(s, s) & =\left[P_{1}(s, s)-\frac{1-P(f, f)}{2-P(s, s)-P(f, f)}\right] \tag{5.15} \\ & \times[P(s, s)+P(f, f)-1]^{k-1}+\left[\frac{1-P(f, f)}{2-P(s, s)-P(f, f)}\right] \end{align} which, since $P_{1}(s, s)=P(s, s)$, can be simplified to \begin{align} P_{k}(s, s)=\frac{1-P(s, s)}{2-P(s, s)-P(f, f)} & {[P(s, s)+P(f, f)-1]^{k} } \tag{5.16} \\ + & {\left[\frac{1-P(f, f)}{2-P(s, s)-P(f, f)}\right] . } \end{align} 

By interchanging the roles of $s$ and $f$, we obtain, similarly, \begin{align} P_{k}(f, f)=\frac{1-P(f, f)}{2-P(s, s)-P(f, f)}[P(s, s)+P(f, f)-1]^{k} \tag{5.17}\\ \quad+\frac{1-P(s, s)}{2-P(s, s)-P(f, f)} \end{align} By using (5.13) , we obtain \begin{align} P_{k}(s, f)=-\frac{1-P(s, s)}{2-P(s, s)-P(f, f)}&[P(s, s)+P(f, f)-1]^{k}\\ & +\frac{1-P(s, s)}{2-P(s, s)-P(f, f)} \tag{5.18} \end{align} \begin{align} P_{k}(f, s)=-\frac{1-P(f, f)}{2-P(s, s)-P(f, f)}&[P(s, s)+P(f, f)-1]^{k} \\ & +\frac{1-P(f, f)}{2-P(s, s)-P(f, f)}.\tag{5.19} \end{align} Equations (5.16) to (5.19) represent the basic conclusions in the theory of Markov dependent Bernoulli trials (in the case that (5.9) holds).
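As with (5.10), the closed forms (5.16) to (5.19) can be verified against the recursion (5.14). A Python sketch (names and parameters ours):

```python
def Pk(P_ss, P_ff, k):
    """Return (P_k(s,s), P_k(s,f), P_k(f,f), P_k(f,s)) from the closed
    forms (5.16)-(5.19); requires P(s,s) + P(f,f) != 2."""
    D = 2 - P_ss - P_ff
    r = P_ss + P_ff - 1
    Pk_ss = (1 - P_ss) / D * r ** k + (1 - P_ff) / D   # (5.16)
    Pk_ff = (1 - P_ff) / D * r ** k + (1 - P_ss) / D   # (5.17)
    return Pk_ss, 1 - Pk_ss, Pk_ff, 1 - Pk_ff          # (5.18), (5.19) via (5.13)

# Check (5.16) against the recursion (5.14), noting that P_1(s,s) = P(s,s).
P_ss, P_ff = 0.7, 0.6
q = P_ss
for k in range(1, 11):
    assert abs(Pk(P_ss, P_ff, k)[0] - q) < 1e-12
    q = q * (P_ss + P_ff - 1) + (1 - P_ff)   # advance the recursion (5.14)
```

For large $k$ the first-trial outcome is forgotten: $P_{k}(s, s)$ and $P_{k}(f, s)$ approach the same limit, as (5.16) and (5.19) show.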

Example 5B . Consider a communications system which transmits the digits 0 and 1. Each digit transmitted must pass through several stages, at each of which there is a probability $p$ that the digit that enters will be unchanged when it leaves. Suppose that the system consists of three stages. What is the probability that a digit entering the system as 0 will be (i) transmitted by the third stage as 0, (ii) transmitted by each stage as 0 (that is, never changed from 0)? Evaluate these probabilities for $p=\frac{1}{3}$.

 

Solution

In observing the passage of the digit through the communications system, we are observing a 4-tuple $\left(x_{1}, x_{2}, x_{3}, x_{4}\right)$, whose first component $x_{1}$ is 1 or 0, depending on whether the digit entering the system is 1 or 0. For $k=2,3,4$ the $k$th component $x_{k}$ is equal to 1 or 0, depending on whether the digit leaving the $(k-1)$st stage is 1 or 0. We now use the foregoing formulas, identifying $f$ with 1, say, and 0 with $s$. Our basic assumption is that

\begin{align} P(0,0)=P(1,1)=p, \quad P(0,1)=P(1,0)=1-p . \tag{5.20} \end{align}

The probability that a digit entering the system as 0 will be transmitted by the third stage as 0 is given by \begin{align} P_{3}(0,0)= & \frac{1-P(0,0)}{2-P(0,0)-P(1,1)}[P(0,0)+P(1,1)-1]^{3} \tag{5.21} \\ & \quad+\frac{1-P(1,1)}{2-P(0,0)-P(1,1)} \\ = & \frac{1-p}{2-2 p}(2 p-1)^{3}+\frac{1-p}{2-2 p} \\ = & \frac{1}{2}\left[1+(2 p-1)^{3}\right] . \end{align} If $p=\frac{1}{3}$, then $P_{3}(0,0)=\frac{1}{2}\left[1+\left(-\frac{1}{3}\right)^{3}\right]=\frac{13}{27}$. The probability that a digit entering the system as 0 will be transmitted by each stage as 0 is given by the product $P(0,0) P(0,0) P(0,0)=p^{3}$, which for $p=\frac{1}{3}$ is equal to $\frac{1}{27}$.
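The arithmetic of example 5B is easily checked with exact rational arithmetic. The following Python lines (variable names ours, taking $p=\frac{1}{3}$) reproduce the two answers:

```python
from fractions import Fraction

p = Fraction(1, 3)   # probability a stage transmits a digit unchanged

# (i) probability that a 0 entering the system leaves the third stage
# as 0, from the simplified form of (5.21)
P3_00 = Fraction(1, 2) * (1 + (2 * p - 1) ** 3)

# (ii) probability that the 0 is never changed at any of the 3 stages
never_changed = p ** 3
```

These give $P_{3}(0,0)=\frac{13}{27}$ and $p^{3}=\frac{1}{27}$.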

Example 5C . Suppose that the digit transmitted through the communications system described in example 5B (with $p=\frac{1}{3}$) is chosen by a chance mechanism; digit 0 is chosen with probability $\frac{1}{3}$ and digit 1, with probability $\frac{2}{3}$. What is the conditional probability that a digit transmitted by the third stage as 0 in fact entered the system as 0?

 

Solution

The conditional probability that a digit transmitted by the third stage as 0 entered the system as 0 is given by

\begin{align} P\left[x_{1}=0 \mid x_{4}=0\right]=\frac{p_{1}(0) P_{3}(0,0)}{p_{4}(0)} . \tag{5.22} \end{align}

To justify (5.22), note that $p_{1}(0) P_{3}(0,0)$ is the probability that the first digit is 0 and the fourth digit is 0, whereas $p_{4}(0)$ is the probability that the fourth digit is 0. Now, under the assumption that $p_{1}(0)=\frac{1}{3}$, it follows from (5.10) that \begin{align} p_{4}(0)= & {\left[p_{1}(0)-\frac{1-P(1,1)}{2-P(0,0)-P(1,1)}\right][P(0,0)+P(1,1)-1]^{3} } \tag{5.23} \\ & +\frac{1-P(1,1)}{2-P(0,0)-P(1,1)} \\ = & \left(\frac{1}{3}-\frac{1}{2}\right)\left(-\frac{1}{3}\right)^{3}+\frac{1}{2}=\frac{41}{81} . \end{align} 

From (5.21) and (5.23) it follows that the conditional probability that a digit leaving the system as 0 entered the system as 0 is given by \begin{align} P\left[x_{1}=0 \mid x_{4}=0\right]=\frac{\frac{1}{3} \cdot \frac{13}{27}}{\frac{41}{81}}=\frac{13}{41} . \end{align} 
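The whole of example 5C fits in a few lines of exact arithmetic; a Python sketch (variable names ours):

```python
from fractions import Fraction

p = Fraction(1, 3)      # a stage transmits a digit unchanged w.p. 1/3
p1_0 = Fraction(1, 3)   # digit 0 enters the system w.p. 1/3

# P_3(0,0) from (5.21) and p_4(0) from (5.23); since P(0,0) = P(1,1) = p,
# the equilibrium probability of each digit is 1/2.
P3_00 = Fraction(1, 2) * (1 + (2 * p - 1) ** 3)
p4_0 = (p1_0 - Fraction(1, 2)) * (2 * p - 1) ** 3 + Fraction(1, 2)

# Bayes' formula (5.22)
posterior = p1_0 * P3_00 / p4_0
```

This gives $p_{4}(0)=\frac{41}{81}$ and the conditional probability $\frac{13}{41}$, in agreement with the computation above.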

Example 5D . Let us use the considerations of examples 5B and 5C to solve the following problem, first proposed by the celebrated cosmologist A. S. Eddington (see W. Feller, “The Problem of liars and Markov chains”, American Mathematical Monthly , Vol. 58 (1951), pp. 606–608). “If $A$, $B$, $C$, and $D$ each speak the truth once in 3 times (independently), and $A$ affirms that $B$ denies that $C$ declares that $D$ is a liar, what is the probability that $D$ was telling the truth”?

 

Solution

We consider a sample description space of 4-tuples $\left(x_{1}, x_{2}, x_{3}, x_{4}\right)$ in which $x_{1}$ equals 0 or 1, depending on whether $D$ is truthful or a liar, $x_{2}$ equals 0 or 1, depending on whether the statement made by $C$ implies that $D$ is truthful or a liar, $x_{3}$ equals 0 or 1, depending on whether the statement made by $B$ implies that $D$ is truthful or a liar, and $x_{4}$ equals 0 or 1, depending on whether the statement made by $A$ implies that $D$ is truthful or a liar. The sample description space thus defined constitutes a series of Markov dependent repeated Bernoulli trials with

\begin{align} p_{1}(0)=\frac{1}{3}, \quad P(0,0)=P(1,1)=\frac{1}{3}, \quad P(0,1)=P(1,0)=\frac{2}{3} . \end{align}

The persons $C$, $B$, and $A$ can be regarded as forming a communications system. We are seeking the conditional probability that the digit entering the system was 0 (which is equivalent to $D$ being truthful), given that the digit transmitted by the third stage was 0 (if $A$ affirms that $B$ denies that $C$ declares that $D$ is a liar, then $A$ is asserting that $D$ is truthful). In view of example 5C, the required probability is $\frac{13}{41}$.
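The value $\frac{13}{41} \approx 0.317$ can be corroborated by simulating Eddington's problem directly. A Python sketch (sample size and seed are ours; each of $C$, $B$, $A$ tells the truth with probability $\frac{1}{3}$, and a lie reverses the current assertion about $D$):

```python
import random

rng = random.Random(7)
asserted_truthful = 0   # trials in which A's statement implies D truthful
d_truthful = 0          # ... and D actually was truthful
for _ in range(200_000):
    digit = 0 if rng.random() < 1 / 3 else 1   # D truthful (0) or lying (1)
    was_truthful = digit == 0
    for _ in range(3):                          # stages C, B, A in turn
        if rng.random() >= 1 / 3:               # a lie reverses the assertion
            digit = 1 - digit
    if digit == 0:                              # A asserts that D is truthful
        asserted_truthful += 1
        d_truthful += was_truthful
estimate = d_truthful / asserted_truthful       # close to 13/41
```

The estimate agrees with $\frac{13}{41}$ to within ordinary sampling error.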

Statistical Equilibrium. For large values of $k$ the values of $p_{k}(s)$ and $p_{k}(f)$ are approximately given by \begin{align} p_{k}(s) & =\frac{1-P(f, f)}{2-P(s, s)-P(f, f)} \tag{5.24} \\ p_{k}(f) & =\frac{1-P(s, s)}{2-P(s, s)-P(f, f)} \end{align} 

To justify (5.24) , use (5.10) and (5.11) and the fact that (5.9) implies that \begin{align} [P(s, s)+P(f, f)-1]^{k-1} \rightarrow 0 \quad \text { as } k \rightarrow \infty . \end{align} 

Equation (5.24) has the following significant interpretation: After a large number of Markov dependent repeated Bernoulli trials has been performed, one is in a state of statistical equilibrium, in the sense that the probability of success at the $k$th trial is the same for all large values of $k$ and indeed is functionally independent of the initial conditions, represented by $p_{1}(s)$ and $p_{1}(f)$.
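The approach to statistical equilibrium is easy to exhibit numerically: by (5.10), two widely different initial probabilities lead to essentially the same $p_{k}(s)$ once $k$ is moderately large. A Python sketch (names and parameters ours):

```python
def pk_s(p1_s, P_ss, P_ff, k):
    """p_k(s) from the closed form (5.10)."""
    r = P_ss + P_ff - 1
    limit_s = (1 - P_ff) / (2 - P_ss - P_ff)
    return (p1_s - limit_s) * r ** (k - 1) + limit_s

P_ss, P_ff = 0.9, 0.7
equilibrium = (1 - P_ff) / (2 - P_ss - P_ff)   # = 0.75, from (5.24)

# Two very different initial conditions, 50 trials later:
a = pk_s(0.01, P_ss, P_ff, 50)
b = pk_s(0.99, P_ss, P_ff, 50)
```

Both values agree with 0.75 to roughly ten decimal places: the initial condition has been forgotten.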

From (5.24) one sees that the trials are asymptotically fair, in the sense that $p_{k}(s)$ is approximately equal to $\frac{1}{2}$ for large values of $k$, if and only if \begin{align} P(s, s)=P(f, f) . \end{align} 

Example 5E . If the communications system described in example 5B consists of a very large number of stages, then approximately half the digits transmitted by the system will be 0’s, irrespective of the proportion of 0’s among the digits entering the system.

Exercises

5.1 . Consider a series of Markov dependent Bernoulli trials such that . Find .

 

Answer

.

 

5.2 . Consider a series of Markov dependent Bernoulli trials such that . Find .

5.3 . Consider a series of Markov dependent Bernoulli trials such that . Find the conditional probability of a success at the first trial, given that there was a success at the fourth trial.

 

Answer

.

 

5.4 . Consider a series of Markov dependent Bernoulli trials such that . Find .

5.5 . If $A$, $B$, and $C$ each speak the truth once in 3 times (independently), and $A$ affirms that $B$ denies that $C$ is a liar, what is the probability that $C$ was telling the truth?

 

Answer

0.35.

 

5.6 . Suppose the probability is equal to $p$ that the weather (rain or no rain) on any arbitrary day is the same as on the preceding day. Let $p_{1}$ be the probability of rain on the first day of the year. Find the probability of rain on the $n$th day of the year. Evaluate the limit of this probability as $n$ tends to infinity.

5.7 . Consider a game played as follows: a group of persons is arranged in a line. The first person starts a rumor by telling his neighbor that the last person in line is a nonconformist. Each person in line then repeats this rumor to his neighbor; however, with probability $p$, he reverses the sense of the rumor as it is told to him. What is the probability that the last person in line will be told he is a nonconformist if (i) , (ii) , (iii) the number of persons in line is very large?

 

Answer

(i) ; (ii) ; (iii) if .

 

5.8 . Suppose you are confronted with 2 coins, $A$ and $B$. You are to make $n$ tosses, using the coin you prefer at each toss. You will be paid 1 dollar for each time the coin falls heads. Coin $A$ has probability of falling heads, and coin $B$ has probability of falling heads. Unfortunately, you are not told which of the coins is coin $A$. Consequently, you decide to toss the coins according to the following system. For the first toss you choose a coin at random. For all succeeding tosses you select the coin used on the preceding toss if it fell heads, and otherwise switch coins. What is the probability that coin $A$ will be the coin tossed on the $k$th toss if (i) , (ii) , (iii) , (iv) $k$ is very large? What is the probability that the coin tossed on the $k$th toss will fall heads if (i) , (ii) , (iii) , (iv) $k$ is very large? Hint: On each trial let $s$ denote the use of coin $A$ and $f$ denote the use of coin $B$.

5.9 . A certain young lady is being wooed by a certain young man. The young lady tries not to be late for their dates too often. If she is late on 1 date, she is sure to be on time on the next date. If she is on time, then there is a chance of her being late on the next date. In the long run, how often is she late?

 

Answer

.

 

5.10 . Suppose that people in a certain group may be classified into 2 categories (say, city dwellers and country dwellers; Republicans and Democrats; Easterners and Westerners; skilled and unskilled workers, and so on). Let us consider a group of engineers, some of whom are Easterners and some of whom are Westerners. Suppose that each person has a certain probability of changing his status: The probability that an Easterner will become a Westerner is 0.04, whereas the probability that a Westerner will become an Easterner is 0.01. In the long run, what proportion of the group will be (i) Easterners, (ii) Westerners, (iii) will move from East to West in a given year, (iv) will move from West to East in a given year? Comment on your answers.


  1. The theory of Markov chains derives its name from the celebrated Russian probabilist, A. A. Markov (1856–1922). ↩︎