ENGR 4323/5323 Digital and Analog Communication
Ch 8: Fundamentals of Probability Theory
Engineering and Physics, University of Central Oklahoma
Dr. Mohamed Bingabr

Chapter Outline
• Concept of Probability
• Random Variables
• Statistical Averages (Means)
• Correlation
• Linear Mean Square Estimation
• Sum of Random Variables
• Central Limit Theorem

Deterministic and Random Signals
Deterministic signals: Signals that can be described by a mathematical equation or a graph. Their future values can be predicted with 100% certainty.
Random signals: Unpredictable message signals and noise waveforms. Signals of this type are information-bearing and play key roles in communications.

Concept of Probability
Experiment: In probability theory, an experiment is a process whose outcome cannot be fully predicted. Example: throwing a die.
Sample space: The set of all possible outcomes of an experiment. {1, 2, 3, 4, 5, 6}
Sample point (element): A single outcome of an experiment. {3}
Event: A subset of the sample space whose points share a common characteristic. {2, 4, 6} (even number)
Complement of event A (A^c): The event containing all points not in A. {1, 3, 5}

Concept of Probability (cont.)
Null event (∅): The event that has no sample point.
Union of events A and B (A ∪ B): The event that contains all points in A or in B.
Intersection (joint event) of A and B (A ∩ B, AB): The event that contains all points common to A and B.
Mutually exclusive: Events A and B are mutually exclusive if, when A occurs, B cannot occur; that is, A ∩ B = ∅.
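The die-throw definitions above map directly onto set operations. The following is a small sketch in Python; the event B ("at most 3") is a hypothetical second event introduced only for illustration and does not appear on the slides.

```python
# Die-throw experiment modelled with Python sets.
S = frozenset({1, 2, 3, 4, 5, 6})   # sample space
A = frozenset({2, 4, 6})            # event: even number
B = frozenset({1, 2, 3})            # hypothetical event: at most 3 (illustration only)

A_c = S - A                         # complement of A
union = A | B                       # points in A or in B
joint = A & B                       # points common to A and B

# A and its complement share no sample point, so they are mutually exclusive.
mutually_exclusive = A.isdisjoint(A_c)

print(sorted(A_c), sorted(union), sorted(joint), mutually_exclusive)
```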
Relative frequency and probability: If event A is of interest and an experiment is conducted N times, the relative frequency of occurrence of A approaches the probability of A as N grows:
$$f(A) = \lim_{N\to\infty} \frac{N(A)}{N} = P(A)$$

[Figure: Venn diagram of events A and B in sample space S]

Concept of Probability (cont.)
Probability of a union:
$$P(A \cup B) = P(A) + P(B) - P(A \cap B)$$
If A and B are mutually exclusive (A ∩ B = ∅), then
$$P(A \cup B) = P(A) + P(B)$$
Conditional probability: The probability of event A given that event B has occurred (P(B) > 0).
$$P(A|B) = \frac{P(A \cap B)}{P(B)}$$
Independent events: The occurrence of one event is not influenced by the occurrence of the other.
$$P(A \cap B) = P(A)\,P(B), \qquad P(A|B) = P(A)$$

Bernoulli Trials
A Bernoulli trial is an experiment with two possible outcomes, success or failure. If the probability of success is p, then the probability of failure is (1 − p).
$$P(k \text{ successes in a specific order in } n \text{ trials}) = p^k (1-p)^{n-k}$$
$$\text{Number of ways to arrange } k \text{ successes in } n \text{ trials} = \binom{n}{k} = \frac{n!}{k!\,(n-k)!}$$
$$P(k \text{ successes in } n \text{ trials}) = \binom{n}{k} p^k (1-p)^{n-k} = \frac{n!}{k!\,(n-k)!}\, p^k (1-p)^{n-k}$$

Example 1
A binary symmetric channel (BSC) has an error probability Pe = 0.001 (i.e., the probability of receiving 0 when 1 is transmitted, or vice versa). Note that the channel behavior is symmetrical with respect to 0 and 1. A sequence of 8 binary digits is transmitted over this channel. Determine the probability of receiving exactly 2 digits in error.

Example 2
In binary communication, one technique used to increase the reliability of a channel is to repeat a message several times. For example, we can send each message (0 or 1) three times. Hence, the transmitted digits are 000 (for message 0) or 111 (for message 1). Because of channel noise, we may receive any one of the eight possible combinations of three binary digits. The decision as to which message was transmitted is made by the majority rule. Let Pe be the error probability of one digit and P(ε) the probability of making a wrong decision in this scheme. Find P(ε) in terms of Pe.
If Pe = 0.01, what is P(ε)?

Multiplication Rule for Conditional Probability
$$P(A \cap B) = P(A)\,P(B|A)$$
$$P(A_1 A_2 \cdots A_n) = P(A_1)\,P(A_2|A_1)\,P(A_3|A_1 A_2) \cdots P(A_n|A_1 A_2 \cdots A_{n-1})$$
Example
Suppose a box of diodes consists of N_g good diodes and N_b bad diodes. If five diodes are randomly selected, one at a time, without replacement, determine the probability of obtaining the sequence of diodes in the order good, bad, good, good, bad.

The Total Probability Theorem
Let n disjoint events A_1, …, A_n form a partition of the sample space S such that
$$\bigcup_{i=1}^{n} A_i = S \quad \text{and} \quad A_i \cap A_j = \emptyset \ \text{ if } i \neq j$$
Then the probability of an event B can be written as
$$P(B) = \sum_{i=1}^{n} P(B|A_i)\,P(A_i)$$
This theorem simplifies the analysis of a complex event of interest, B, by identifying all of its different causes A_i.

Example
The decoding of a data packet may be in error because of N distinct error patterns E_1, E_2, …, E_N that it encounters. These error patterns are mutually exclusive, each with probability P(E_i) = p_i. When the error pattern E_i occurs, the data packet is incorrectly decoded with probability q_i. Find the probability that the data packet is incorrectly decoded.

Bayes' Theorem
Bayes' theorem determines the likelihood of a particular cause of an event among many disjoint possible causes.
Theorem: Let n disjoint events A_1, …, A_n form a partition of the sample space S. Let B be an event with P(B) > 0. Then for j = 1, …, n,
$$P(A_j|B) = \frac{P(B|A_j)\,P(A_j)}{P(B)} = \frac{P(B|A_j)\,P(A_j)}{\sum_{i=1}^{n} P(B|A_i)\,P(A_i)}$$

Example
A communication system always encounters one of three possible interference waveforms: F1, F2, or F3. The probabilities of these interferences are 0.8, 0.16, and 0.04, respectively. The communication system fails with probability 0.01, 0.1, and 0.4 when it encounters F1, F2, and F3, respectively. Given that the system has failed, find the probability that the failure is a result of F1, F2, or F3, respectively.
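The total probability theorem and Bayes' theorem can be applied to the interference example in a few lines. This is a minimal sketch; the function name `posterior` and the variable names are illustrative choices, not part of the slides.

```python
def posterior(priors, likelihoods):
    """Return P(B) and the list of P(A_j | B) for a partition A_1..A_n.

    priors[j]      = P(A_j)
    likelihoods[j] = P(B | A_j)
    """
    joint = [p * q for p, q in zip(priors, likelihoods)]   # P(B|A_j) P(A_j)
    p_b = sum(joint)                                       # total probability theorem
    return p_b, [j / p_b for j in joint]                   # Bayes' theorem

# Numbers taken from the interference example.
p_interference = [0.8, 0.16, 0.04]     # P(F1), P(F2), P(F3)
p_fail_given = [0.01, 0.1, 0.4]        # P(fail | F1), P(fail | F2), P(fail | F3)

p_fail, p_cause_given_fail = posterior(p_interference, p_fail_given)
print(p_fail, p_cause_given_fail)
```

The posterior probabilities sum to 1 by construction, which is a quick sanity check on the inputs.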
Random Variable
A discrete random variable takes numerical values obtained by mapping the sample points (outcomes of an experiment) to numbers. For example, the outcomes of tossing a coin are {H, T}; assigning 1 to heads and −1 to tails gives a random variable x with values in {1, −1}.
$$\sum_i P_x(x_i) = 1$$

Random Variable (cont.)
For two independent random variables x and y (tossing two coins):
$$P_{xy}(x_i, y_j) = P_x(x_i)\,P_y(y_j)$$
$$\sum_i \sum_j P_{xy}(x_i, y_j) = 1$$

Conditional Probabilities
If x and y are two RVs, then the conditional probability of x = x_i given y = y_j is denoted by P_{x|y}(x_i|y_j).
$$\sum_i P_{x|y}(x_i|y_j) = \sum_j P_{y|x}(y_j|x_i) = 1$$
$$\sum_i \sum_j P_{xy}(x_i, y_j) = 1$$
$$\sum_i P_{xy}(x_i, y_j) = \sum_i P_{x|y}(x_i|y_j)\,P_y(y_j) = P_y(y_j)$$
$$\sum_j P_{xy}(x_i, y_j) = \sum_j P_{y|x}(y_j|x_i)\,P_x(x_i) = P_x(x_i)$$

Conditional Probabilities (cont.)
$$P_y(y_j) = \sum_i P_{y|x}(y_j|x_i)\,P_x(x_i)$$
$$P_x(x_i) = \sum_j P_{x|y}(x_i|y_j)\,P_y(y_j)$$

Example
The error probability of a binary symmetric channel (BSC) is Pe. The probability of transmitting 1 is Q, and that of transmitting 0 is 1 − Q. Determine the probabilities of receiving 1 and 0 at the receiver.

Example
Over a certain binary communication channel, the symbol 0 is transmitted with probability 0.4 and 1 is transmitted with probability 0.6. It is given that P(ε|0) = 10^{-6} and P(ε|1) = 10^{-4}, where P(ε|x_i) is the probability of a detection error given that x_i is transmitted. Determine P(ε), the error probability of the channel.

Cumulative Distribution Function (CDF)
The CDF F_x(x) of an RV x is the probability that x takes a value less than or equal to x:
$$F_x(x) = P(\mathrm{x} \le x)$$
Properties of the CDF:
1) F_x(x) ≥ 0
2) F_x(∞) = 1
3) F_x(−∞) = 0
4) F_x(x) is a nondecreasing function.

Example
In an experiment, a trial consists of four successive tosses of a coin. If we define an RV x as the number of heads appearing in a trial, determine P_x(x) and F_x(x).

Continuous Random Variable
The random variable takes a continuous range of values.
p_x(x) is the probability density function (pdf), which describes the relative frequency of occurrence of different values of x.
Properties of the probability density function:
$$p_x(x) \ge 0, \qquad \int_{-\infty}^{\infty} p_x(x)\,dx = 1$$
Cumulative distribution function:
$$F_x(x) = \int_{-\infty}^{x} p_x(u)\,du, \qquad p_x(x) = \frac{dF_x(x)}{dx}$$
$$P(x_1 < \mathrm{x} \le x_2) = \int_{x_1}^{x_2} p_x(x)\,dx = F_x(x_2) - F_x(x_1)$$

[Figure: Continuous Random Variable]

The Gaussian (Normal) Random Variable
Standard Gaussian RV (m = 0, σ = 1):
$$p_x(x) = \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2}$$
$$F_x(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-u^2/2}\,du$$
$$Q(y) = \frac{1}{\sqrt{2\pi}} \int_{y}^{\infty} e^{-x^2/2}\,dx$$
$$Q(x) = 1 - F_x(x)$$
$$F_x(x) = P(\mathrm{x} \le x) = 1 - Q(x), \qquad P(\mathrm{x} > x) = Q(x)$$

The Gaussian (Normal) Random Variable (cont.)
General Gaussian RV (mean m, standard deviation σ):
$$p_x(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x-m)^2/2\sigma^2}$$
$$F_x(x) = \frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{x} e^{-(u-m)^2/2\sigma^2}\,du$$
$$F_x(x) = P(\mathrm{x} \le x) = 1 - Q\!\left(\frac{x-m}{\sigma}\right), \qquad P(\mathrm{x} > x) = Q\!\left(\frac{x-m}{\sigma}\right)$$

Example
Over a certain binary channel, messages m = 0 and 1 are transmitted with equal probability using pulses of opposite polarity. The received pulse corresponding to 1 is p(t), shown in the figure, and the received pulse corresponding to 0 is −p(t). Let the peak amplitude of p(t) be A_p at t = T_p. The channel noise n(t) has a normal distribution with zero mean and standard deviation σ_n. Because of the channel noise, the received pulse will be
$$r(t) = \pm p(t) + n(t)$$
What is the probability of error Pe?

Example (cont.)
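$$P_e = \sum_i P(\epsilon, m_i) = \sum_i P(m_i)\,P(\epsilon|m_i) = P(0)\,P(\epsilon|0) + P(1)\,P(\epsilon|1)$$
$$P(\epsilon|0) = P(n > A_p) = Q\!\left(\frac{A_p}{\sigma_n}\right), \qquad P(\epsilon|1) = P(n < -A_p) = Q\!\left(\frac{A_p}{\sigma_n}\right)$$
$$P_e = Q\!\left(\frac{A_p}{\sigma_n}\right)$$

Joint Distribution
For two RVs x and y, the joint CDF F_{xy}(x, y) is
$$F_{xy}(x, y) = P(\mathrm{x} \le x \text{ and } \mathrm{y} \le y)$$
$$p_{xy}(x, y) = \frac{\partial^2}{\partial x\,\partial y} F_{xy}(x, y)$$
$$P(x_1 < \mathrm{x} \le x_2,\ y_1 < \mathrm{y} \le y_2) = \int_{x_1}^{x_2}\!\int_{y_1}^{y_2} p_{xy}(x, y)\,dy\,dx$$
$$p_x(x) = \int_{-\infty}^{\infty} p_{xy}(x, y)\,dy, \qquad p_y(y) = \int_{-\infty}^{\infty} p_{xy}(x, y)\,dx$$

Conditional Densities
$$p_{x|y}(x|y) = \frac{p_{xy}(x, y)}{p_y(y)}, \qquad p_{y|x}(y|x) = \frac{p_{xy}(x, y)}{p_x(x)}$$
Bayes' rule:
$$p_{x|y}(x|y)\,p_y(y) = p_{y|x}(y|x)\,p_x(x)$$
Independent random variables:
$$p_{x|y}(x|y) = p_x(x), \qquad p_{y|x}(y|x) = p_y(y), \qquad p_{xy}(x, y) = p_x(x)\,p_y(y)$$

Rayleigh Density
Example: Derive the Rayleigh probability density function (pdf).
$$p_r(r) = \begin{cases} \dfrac{r}{\sigma^2}\, e^{-r^2/2\sigma^2}, & r \ge 0 \\ 0, & r < 0 \end{cases}$$

Statistical Averages (Means)
The average value or expected value of an RV x:
$$\bar{\mathrm{x}} = E[\mathrm{x}] = \sum_{i=1}^{n} x_i\,P_x(x_i) \quad \text{(discrete)}$$
$$\bar{\mathrm{x}} = E[\mathrm{x}] = \int_{-\infty}^{\infty} x\,p_x(x)\,dx \quad \text{(continuous)}$$
Mean of a function g(x) of a random variable x:
$$\overline{g(\mathrm{x})} = \sum_{i=1}^{n} g(x_i)\,P_x(x_i) \quad \text{(discrete)}$$
$$\overline{g(\mathrm{x})} = \int_{-\infty}^{\infty} g(x)\,p_x(x)\,dx \quad \text{(continuous)}$$
For example, the random variable x can be the letters of the alphabet and the function g(x) can be the PCM code.

Example
The output voltage of a sinusoid generator is A cos(ωt). This output is sampled randomly. The sampled output is an RV x, which can take on any value in the range (−A, A). Determine the mean value \bar{x} and the mean square value \overline{x^2} of the sampled output.

Statistical Averages (Means) (cont.)
Mean of the sum:
$$\overline{\mathrm{x} + \mathrm{y}} = \bar{\mathrm{x}} + \bar{\mathrm{y}}$$
Mean of the product:
$$\overline{g_1(\mathrm{x})\,g_2(\mathrm{y})} = \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} g_1(x)\,g_2(y)\,p_{xy}(x, y)\,dx\,dy$$
If RVs x and y are independent, then
$$\overline{\mathrm{x}\mathrm{y}} = \bar{\mathrm{x}}\,\bar{\mathrm{y}}$$
$$\overline{g_1(\mathrm{x})\,g_2(\mathrm{y})} = \int_{-\infty}^{\infty} g_1(x)\,p_x(x)\,dx \int_{-\infty}^{\infty} g_2(y)\,p_y(y)\,dy$$

Moments
The nth moment of an RV x:
$$\overline{\mathrm{x}^n} = \int_{-\infty}^{\infty} x^n\,p_x(x)\,dx$$
The nth central moment of an RV x:
$$\overline{(\mathrm{x} - \bar{\mathrm{x}})^n} = \int_{-\infty}^{\infty} (x - \bar{\mathrm{x}})^n\,p_x(x)\,dx$$
The variance σ_x² and standard deviation σ_x:
$$\sigma_x^2 = \overline{(\mathrm{x} - \bar{\mathrm{x}})^2} = \overline{\mathrm{x}^2} - \bar{\mathrm{x}}^2$$

Example
Find the mean, the variance, and the mean square of the uniform quantization error in PCM.

Example
Find the variance and the mean square error caused by channel noise in PCM.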
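The moment definitions above can be checked numerically. The following is a minimal sketch for the uniform quantization error example, assuming (this is not stated on the slide) that the error q is uniformly distributed over (−Δ/2, Δ/2) for a quantizer step size Δ. The helper `moment`, the midpoint-rule integration, and the value chosen for `delta` are illustrative.

```python
def moment(pdf, lo, hi, n, center=0.0, steps=100_000):
    """Approximate the integral of (x - center)^n * pdf(x) over [lo, hi]
    with the midpoint rule."""
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * h
        total += (x - center) ** n * pdf(x) * h
    return total

delta = 0.5                                  # assumed quantizer step size (illustrative)
lo, hi = -delta / 2, delta / 2

def uniform_pdf(q):
    return 1.0 / delta                       # assumed uniform pdf over (-delta/2, delta/2)

mean = moment(uniform_pdf, lo, hi, 1)                    # first moment
mean_square = moment(uniform_pdf, lo, hi, 2)             # second moment
variance = moment(uniform_pdf, lo, hi, 2, center=mean)   # second central moment

print(mean, mean_square, variance, delta ** 2 / 12)
```

Under the uniform assumption the mean is zero, so the variance and the mean square coincide, and both agree with the closed form Δ²/12 obtained by integrating q²/Δ over the interval.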
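Variance of a Sum of Independent RVs
$$\mathrm{z} = \mathrm{x} + \mathrm{y}, \qquad \sigma_z^2 = \sigma_x^2 + \sigma_y^2$$
Example
Find the total mean square error in PCM.
[Figure: block diagram m → Quantization → m̂ → Channel → m̃, with quantization error q = m − m̂ and channel error ε = m̂ − m̃]

Chebyshev's Inequality
For a zero-mean RV x:
$$P(|\mathrm{x}| \le k\sigma_x) \ge 1 - \frac{1}{k^2}$$
In general:
$$P(|\mathrm{x} - \bar{\mathrm{x}}| \le k\sigma_x) \ge 1 - \frac{1}{k^2}$$
The standard deviation σ of an RV x is a measure of the width of its PDF. In communications, the standard deviation is also used to estimate the bandwidth of a signal spectrum.

Correlation
The covariance σ_xy is a measure of the nature of the dependence between the RVs x and y.
$$\sigma_{xy} = \overline{(\mathrm{x} - \bar{\mathrm{x}})(\mathrm{y} - \bar{\mathrm{y}})} = \overline{\mathrm{x}\mathrm{y}} - \bar{\mathrm{x}}\,\bar{\mathrm{y}}$$
The correlation coefficient is a normalized covariance:
$$\rho_{xy} = \frac{\sigma_{xy}}{\sigma_x \sigma_y}, \qquad -1 \le \rho_{xy} \le 1$$
Independent variables are uncorrelated; the converse is not necessarily true.

Linear Mean Square Estimation
When two random variables x and y are related (dependent), it is possible to estimate the value of y from knowledge of the value of x. Minimum mean square error is one possible criterion for the estimation of y.
$$\overline{\epsilon^2} = \overline{(\mathrm{y} - \hat{\mathrm{y}})^2}, \qquad \hat{\mathrm{y}} = a\mathrm{x}$$
The optimum estimate chooses a to minimize the mean square error:
$$\frac{\partial \overline{\epsilon^2}}{\partial a} = 2a\,\overline{\mathrm{x}^2} - 2\,\overline{\mathrm{x}\mathrm{y}} = 0 \quad\Rightarrow\quad a = \frac{\overline{\mathrm{x}\mathrm{y}}}{\overline{\mathrm{x}^2}} = \frac{R_{xy}}{R_{xx}}$$
For this choice of a, the error is orthogonal to the data, \overline{\epsilon \mathrm{x}} = 0, so
$$\overline{\epsilon^2} = \overline{(\mathrm{y} - a\mathrm{x})^2} = \overline{(\mathrm{y} - a\mathrm{x})\,\mathrm{y}} - a\,\overline{\epsilon \mathrm{x}} = \overline{(\mathrm{y} - a\mathrm{x})\,\mathrm{y}}$$
$$\overline{\epsilon^2} = R_{yy} - aR_{xy}$$

Using n Random Variables for Estimation
Use n random variables x_1, x_2, …, x_n to estimate a random variable x_0.
$$\hat{\mathrm{x}}_0 = a_1\mathrm{x}_1 + a_2\mathrm{x}_2 + \cdots + a_n\mathrm{x}_n$$
$$\overline{\epsilon^2} = \overline{\left[\mathrm{x}_0 - (a_1\mathrm{x}_1 + a_2\mathrm{x}_2 + \cdots + a_n\mathrm{x}_n)\right]^2}$$
$$\frac{\partial \overline{\epsilon^2}}{\partial a_i} = -2\,\overline{\left[\mathrm{x}_0 - (a_1\mathrm{x}_1 + a_2\mathrm{x}_2 + \cdots + a_n\mathrm{x}_n)\right]\mathrm{x}_i} = 0$$
$$R_{0i} = a_1R_{i1} + a_2R_{i2} + \cdots + a_nR_{in}$$
$$\begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ a_n \end{bmatrix} = \begin{bmatrix} R_{11} & R_{12} & \cdots & R_{1n} \\ R_{21} & R_{22} & \cdots & R_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ R_{n1} & R_{n2} & \cdots & R_{nn} \end{bmatrix}^{-1} \begin{bmatrix} R_{01} \\ R_{02} \\ \vdots \\ R_{0n} \end{bmatrix}, \qquad \text{where } R_{ij} = \overline{\mathrm{x}_i \mathrm{x}_j}$$
$$\overline{\epsilon^2} = R_{00} - (a_1R_{01} + a_2R_{02} + \cdots + a_nR_{0n})$$

Example
In differential pulse code modulation (DPCM), instead of transmitting sample values directly, we estimate (predict) the value of each sample from knowledge of the previous n samples. The estimation error ε_k, the difference between the actual value and the estimated value of the kth sample, is quantized and transmitted. Because the estimation error ε_k is smaller than the sample value m_k, the SNR is increased for the same number of quantization levels.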
The SNR improvement is equal to \overline{m^2}/\overline{\epsilon^2}, where \overline{m^2} and \overline{\epsilon^2} are the mean square values of the speech signal and the estimation error ε, respectively. Find the optimum linear second-order predictor and the corresponding SNR improvement.

Sum of Random Variables
$$\mathrm{z} = \mathrm{x} + \mathrm{y}$$
How does the pdf of z relate to the pdfs of x and y?
$$F_z(z) = P(\mathrm{z} \le z) = P(\mathrm{x} \le \infty,\ \mathrm{y} \le z - \mathrm{x}) = \int_{-\infty}^{\infty} dx \int_{-\infty}^{z-x} p_{xy}(x, y)\,dy$$
$$p_z(z) = \frac{dF_z(z)}{dz} = \int_{-\infty}^{\infty} p_{xy}(x, z - x)\,dx$$
If x and y are independent random variables,
$$p_z(z) = \int_{-\infty}^{\infty} p_x(x)\,p_y(z - x)\,dx$$
The PDF of z is the convolution of the PDFs of x and y.

Sum of Gaussian Random Variables
The sum of jointly Gaussian random variables is also a Gaussian random variable, whether or not they are independent.
$$\mathrm{y} = \mathrm{x}_1 + \mathrm{x}_2$$
y is a Gaussian RV with mean
$$\bar{\mathrm{y}} = \bar{\mathrm{x}}_1 + \bar{\mathrm{x}}_2$$
If x_1 and x_2 are independent, then
$$\sigma_y^2 = \sigma_{x_1}^2 + \sigma_{x_2}^2$$
If x_1 and x_2 are jointly Gaussian but not necessarily independent, then
$$\sigma_y^2 = \sigma_{x_1}^2 + \sigma_{x_2}^2 + 2\sigma_{x_1x_2}$$

Sum of Gaussian Random Variables (cont.)
The fact that the sum of jointly Gaussian random variables is also a Gaussian random variable has important practical applications. For example, if x_k is a sequence of jointly Gaussian signal samples passing through a discrete-time filter with impulse response {h_i}, then the filter output y is also Gaussian:
$$\mathrm{y}_k = \sum_{i=0}^{\infty} h_i\,\mathrm{x}_{k-i}$$

The Central Limit Theorem
The sum of a large number of independent RVs tends to be a Gaussian random variable, independently of the probability densities of the variables added.

The Central Limit Theorem (for the Sample Mean)
Let x_1, x_2, …, x_n be independent random variables drawn from the same distribution with mean µ and variance σ², where 0 < σ² < ∞. Then, for large n, the sample mean
$$\bar{\mathrm{x}}_n = \frac{\mathrm{x}_1 + \cdots + \mathrm{x}_n}{n}$$
is approximately a Gaussian random variable with mean µ and variance σ²/n:
$$\lim_{n\to\infty} P\!\left(\frac{\bar{\mathrm{x}}_n - \mu}{\sigma/\sqrt{n}} \le x\right) = \int_{-\infty}^{x} \frac{1}{\sqrt{2\pi}}\, e^{-v^2/2}\,dv$$
$$\lim_{n\to\infty} P\!\left(\frac{\bar{\mathrm{x}}_n - \mu}{\sigma/\sqrt{n}} > x\right) = Q(x)$$
Likewise, for large n the sum \sum_{i=1}^{n} x_i is approximately a Gaussian random variable with mean nµ and variance nσ².
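The convolution result for sums and the central limit theorem can be illustrated together in a discrete setting, where the probability mass function (pmf) of a sum of independent RVs is the discrete convolution of the individual pmfs. The sketch below uses fair dice, which is an illustrative choice not taken from the slides, and compares the pmf of the sum of n dice with a Gaussian pdf of the same mean and variance. The function names are hypothetical.

```python
import math

def convolve(p, q):
    """Discrete convolution of two pmfs given as lists."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out

def sum_of_dice_pmf(n):
    """pmf of the sum of n fair dice; index k corresponds to the sum k + n."""
    die = [1.0 / 6.0] * 6
    pmf = die[:]
    for _ in range(n - 1):
        pmf = convolve(pmf, die)
    return pmf

def gap_to_gaussian(n):
    """Largest absolute difference between the pmf of the sum of n dice and
    the Gaussian pdf with mean 3.5 n and variance (35/12) n."""
    mean, var = 3.5 * n, 35.0 / 12.0 * n
    pmf = sum_of_dice_pmf(n)
    def gauss(x):
        return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)
    return max(abs(p - gauss(k + n)) for k, p in enumerate(pmf))

for n in (1, 2, 5, 30):
    print(n, gap_to_gaussian(n))
```

The gap shrinks as n grows, even though a single die has a flat pmf that looks nothing like a Gaussian.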
Example
Consider a communication system that transmits a data packet of 1024 bits. Each bit can be in error with probability 10^{-2}. Find the (approximate) probability that more than 30 of the 1024 bits are in error.
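One way to approach this example is to treat the number of bit errors as a sum of 1024 independent Bernoulli trials and apply the central limit theorem, then compare against the exact binomial tail. The sketch below assumes independent bit errors and uses no continuity correction; the function names are illustrative.

```python
import math

def q_function(x):
    """Gaussian tail probability Q(x) = P(standard normal > x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def binomial_tail(n, p, k_min):
    """Exact P(at least k_min successes in n Bernoulli trials).

    The first term comes from the binomial formula; later terms use the
    ratio P(k+1)/P(k) = (n-k)/(k+1) * p/(1-p)."""
    term = math.comb(n, k_min) * p ** k_min * (1.0 - p) ** (n - k_min)
    total = 0.0
    for k in range(k_min, n + 1):
        total += term
        term *= (n - k) / (k + 1) * p / (1.0 - p)
    return total

n, p = 1024, 1e-2
mean = n * p                              # n * mu for a Bernoulli bit error
sigma = math.sqrt(n * p * (1.0 - p))      # sqrt(n) * sigma

clt_estimate = q_function((30 - mean) / sigma)   # Gaussian approximation of P(errors > 30)
exact = binomial_tail(n, p, 31)                  # exact P(errors >= 31)

print(clt_estimate, exact)
```

The threshold here lies several standard deviations above the mean, where the Gaussian approximation can be noticeably off, so the exact sum is a useful cross-check on the approximate answer.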