CENG 272 Statistical Computations

Lecture Notes On
CENG 272
Statistical Computations
Prepared by: Dr. Emre Sermutlu
Based on the book: Probability and Statistics for Engineers
and Scientists, Ninth Edition, Walpole, Myers, Myers, Ye, Pearson
Education
Last Update: March 12, 2015
Week 1– Introduction
A population is a collection of individual items of a particular type.
A sample is a subset of the population, selected by a definite procedure.
In a biased sample, the members of the population do not all have the same probability of
being selected.
Sample Mean:
$$\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n}$$
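Sample Median: If the observations in the sample are arranged in increasing order as $x_1, x_2, \ldots, x_n$, the
median is:
$$\tilde{x} = \begin{cases} x_{(n+1)/2} & \text{if } n \text{ is odd} \\ \frac{1}{2}\left(x_{n/2} + x_{n/2+1}\right) & \text{if } n \text{ is even} \end{cases}$$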
An alternative to the mean and the median is the trimmed mean. For example, we can eliminate the largest 10% and the smallest 10% of the data and find the mean of the remaining elements.
This is called the 10% trimmed mean.
Variance:
$$s^2 = \sum_{i=1}^{n} \frac{(x_i - \bar{x})^2}{n-1}$$
Standard Deviation:
$$s = \sqrt{s^2}$$
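The sample statistics defined above can be computed directly from their formulas. The following is a minimal Python sketch and not part of the original notes: the data values are made up, and rounding the number of trimmed values down with int() is an assumption, since the notes do not say how to handle fractional counts.

```python
import math

def sample_mean(xs):
    return sum(xs) / len(xs)

def sample_median(xs):
    s = sorted(xs)
    n = len(s)
    mid = n // 2
    if n % 2 == 1:
        return s[mid]                      # x_{(n+1)/2} in 1-based indexing
    return (s[mid - 1] + s[mid]) / 2       # average of x_{n/2} and x_{n/2+1}

def trimmed_mean(xs, fraction):
    """Drop the smallest and largest `fraction` of the data, then average."""
    s = sorted(xs)
    k = int(len(s) * fraction)             # values cut from each end (rounded down: an assumption)
    kept = s[k:len(s) - k]
    return sum(kept) / len(kept)

def sample_variance(xs):
    m = sample_mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

def sample_std(xs):
    return math.sqrt(sample_variance(xs))

data = [2, 4, 4, 5, 7, 9, 10, 12, 13, 40]  # made-up observations
print(sample_mean(data), sample_median(data), trimmed_mean(data, 0.10))
print(sample_variance(data), sample_std(data))
```

With this data the single large value 40 pulls the mean well above the median, while the 10% trimmed mean stays close to the median.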
In statistics, any process that generates a set of data is called an experiment.
The set of all possible outcomes of a statistical experiment is called the sample space
and is denoted by S.
For example, if we toss a coin twice, the sample space is: S = {HH, HT, TH, TT}.
An event is a subset of a sample space. The complement of an event A is the set of
all elements of S that are not in A, denoted by $A'$.
Two events A and B are mutually exclusive or disjoint if A ∩ B = ∅.
Exercise 1-1: An experiment consists of tossing a die and then flipping a coin. Describe
the sample space.
Exercise 1-2: An experiment consists of tossing a die and then flipping a coin once if
the number is even, twice if it is odd. Describe the sample space.
Exercise 1-3: A student is registered for 2 courses. He can get one of 5 different letter
grades (A, B, C, D, F) in each course. Describe the sample space of his grade distributions.
Find the event that he passes all courses, the event that he fails all courses, and the complements of these two events.
Multiplication Rule: If an operation can be performed in n ways, and if for each of
these ways a second operation can be performed in m ways, then the two operations can
be performed together in nm ways.
Permutation: A permutation is an arrangement of a set of objects. The number of
permutations of n objects is n!. The number of permutations of n objects taken r at a time is:
$$P(n, r) = \frac{n!}{(n-r)!}$$
The number of distinct permutations of n things of which $n_1$ are of one kind, $n_2$ are
of a second kind, etc. is:
$$\frac{n!}{n_1!\, n_2! \cdots n_k!}$$
Combination: The number of combinations of n distinct objects taken r at a time is:
$$\binom{n}{r} = \frac{n!}{r!\,(n-r)!}$$
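The counting formulas above map directly onto Python's standard library. This is an illustrative sketch only; the helper name multiset_permutations and the example word are assumptions introduced here, not part of the notes.

```python
import math

def multiset_permutations(*counts):
    """n! / (n1! n2! ... nk!) where n is the sum of the counts."""
    result = math.factorial(sum(counts))
    for c in counts:
        result //= math.factorial(c)       # each division is exact
    return result

print(math.factorial(4))   # permutations of 4 objects: 24
print(math.perm(5, 2))     # P(5, 2) = 5!/3! = 20
print(math.comb(5, 2))     # C(5, 2) = 5!/(2! 3!) = 10

# Distinct arrangements of the letters of STATISTICS:
# S appears 3 times, T 3 times, A once, I twice, C once.
print(multiset_permutations(3, 3, 1, 2, 1))   # 50400
```

math.perm and math.comb are available from Python 3.8 onwards.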
Exercise 1-4: How many 12 digit numbers contain exactly four 9’s?
Exercise 1-5: A football team plays 20 matches in a season. The matches result in win,
loss or tie. In how many different ways can the team end the season with:
a) No loss?
b) 10 wins, 4 losses, 6 ties?
Exercise 1-6: 6 people, A, B, C, D, E, F will sit around a circular table.
a) In how many ways can they do that?
b) A wants to sit together with B. In how many ways can they do that?
c) C does not want to sit together with D. In how many ways can they do that?
Probability
Probability of an event A denotes the weight of A in S. Therefore:
$$P(S) = 1, \qquad P(\emptyset) = 0, \qquad 0 \le P(A) \le 1$$
Furthermore, if A and B are mutually exclusive events,
$$P(A \cup B) = P(A) + P(B)$$
In general, for two events A and B we have:
$$P(A \cup B) = P(A) + P(B) - P(A \cap B)$$
and for three events A, B and C we have:
$$P(A \cup B \cup C) = P(A) + P(B) + P(C) - P(A \cap B) - P(A \cap C) - P(B \cap C) + P(A \cap B \cap C)$$
For complementary events,
$$P(A) + P(A') = 1$$
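When all outcomes are equally likely, these rules can be verified by listing the sample space. The sketch below is an illustration added here, under the assumption of a pair of fair dice; the events A and B are hypothetical choices made for the example.

```python
from fractions import Fraction
from itertools import product

# Sample space: all 36 equally likely outcomes of tossing a pair of dice.
S = list(product(range(1, 7), repeat=2))

def prob(event):
    """P(event) = (number of outcomes in the event) / (size of S)."""
    favourable = sum(1 for outcome in S if event(outcome))
    return Fraction(favourable, len(S))

def A(outcome):          # hypothetical event: the sum is at least 10
    return sum(outcome) >= 10

def B(outcome):          # hypothetical event: both dice show the same number
    return outcome[0] == outcome[1]

p_union = prob(lambda o: A(o) or B(o))
p_inter = prob(lambda o: A(o) and B(o))

# Addition rule and complement rule, checked with exact fractions.
assert p_union == prob(A) + prob(B) - p_inter
assert prob(A) + prob(lambda o: not A(o)) == 1
print(p_union)  # 5/18
```

Using Fraction keeps every probability exact, so the identities hold without rounding error.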
Exercise 1-7: There are 5 black and 4 red balls in a bag. We randomly choose three
balls without replacement. Find the probabilities that we get
a) 3 black
b) At least 2 black
c) At least 1 black
Exercise 1-8: We toss a pair of dice. What is the probability that:
a) The sum is 7?
b) The maximum number is 4?
c) We have a double number?
Exercise 1-9: A fair coin is tossed 5 times. Find the probability of getting
a) No heads
b) Exactly one head
c) 3 or more heads.
Exercise 1-10: We choose a number from {1, 2, . . . , n} randomly. We repeat this n
times. What is the probability that we choose 1 at least once?
Exercise 1-11: In a game of chance, your probability of winning is 0.7. You play this
game five times. What is the probability that
a) You win 3 or more games?
b) You lose all of them?
Solution:
a) $\binom{5}{3} 0.7^3\, 0.3^2 + \binom{5}{4} 0.7^4\, 0.3 + \binom{5}{5} 0.7^5 = 0.83692$
b) $0.3^5 = 0.00243$
Exercise 1-12: There are 17 balls in a box. 5 are blue, 8 are red and 4 are green. We
randomly choose 5 balls. What is the probability that we choose equal number of blue
and red balls?
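Solution: Possible choices are 1 blue, 1 red, 3 green and 2 blue, 2 red, 1 green.
$$\frac{\binom{5}{1}\binom{8}{1}\binom{4}{3}}{\binom{17}{5}} + \frac{\binom{5}{2}\binom{8}{2}\binom{4}{1}}{\binom{17}{5}} = 0.0259 + 0.1810 = 0.2069$$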
Exercise 1-13: A file server has 4 hard disks. Each disk has a 5% probability of failure
within one year. If one (or more) disk fails, the whole system fails.
a) What is the probability that the system will fail within one year?
b) We decide to improve the system reliability by adding an extra disk. Now we have
5 disks, and the system works if 4 or 5 disks are working, fails otherwise. What is the
probability that the new system will fail within one year?
Solution: a) $1 - 0.95^4 = 0.1855$
b) $1 - 0.95^5 - 5 \cdot 0.95^4 \cdot 0.05 = 0.0226$
Exercise 1-14: In a computer game, there are three results: Win, Draw, Lose. The
probabilities are 0.4, 0.5 and 0.1, respectively. You get 2 points for Win, 1 for Draw and 0 for Lose.
What is the probability that you get 16 points after playing this game for 10 rounds?
Solution: To get 16 points, we may get 8W + 0D + 2L, 7W + 2D + 1L or 6W + 4D + 0L.
We can find the probabilities using the multinomial distribution:
$$\frac{10!}{8!\,0!\,2!}\, 0.4^8\, 0.5^0\, 0.1^2 + \frac{10!}{7!\,2!\,1!}\, 0.4^7\, 0.5^2\, 0.1^1 + \frac{10!}{6!\,4!\,0!}\, 0.4^6\, 0.5^4\, 0.1^0$$
$$= 0.0003 + 0.0147 + 0.0538 = 0.0688$$
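The same result can be reached by enumerating every split of the 10 rounds into wins, draws and losses and keeping those worth 16 points. This is a sketch for checking the arithmetic; the function name multinomial_pmf is an assumption introduced here.

```python
from math import factorial

def multinomial_pmf(counts, probs):
    """n!/(x1! ... xk!) * p1^x1 * ... * pk^xk for the given counts."""
    coeff = factorial(sum(counts))
    for c in counts:
        coeff //= factorial(c)
    p = float(coeff)
    for c, q in zip(counts, probs):
        p *= q ** c
    return p

probs = (0.4, 0.5, 0.1)        # Win, Draw, Lose
rounds, target = 10, 16

total = 0.0
for wins in range(rounds + 1):
    for draws in range(rounds + 1 - wins):
        losses = rounds - wins - draws
        if 2 * wins + draws == target:         # 2 points per win, 1 per draw
            total += multinomial_pmf((wins, draws, losses), probs)

print(round(total, 4))  # 0.0688
```

The enumeration finds exactly the three cases listed in the solution, so nothing is missed by hand.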
Week 2– Conditional Probability
The conditional probability of B, given A, denoted by $P(B \mid A)$, is defined as:
$$P(B \mid A) = \frac{P(A \cap B)}{P(A)}, \quad \text{provided } P(A) > 0$$
Two events A and B are independent if and only if
$P(B \mid A) = P(B)$ or $P(A \mid B) = P(A)$
Otherwise, A and B are dependent.
Theorem: Two events A and B are independent if and only if
P (A ∩ B) = P (A)P (B)
Exercise 2-1: In a classroom of 50 students, 28 are girls and 22 are boys. 16 of the girls
are from Ankara, and 10 of the boys are from Ankara. We randomly choose a student.
a) Given that the student is a girl, what is the probability that she’s from Ankara?
b) Given that the student is from Ankara, what is the probability that the student is a
girl?
c) Are these events independent?
Exercise 2-2: In a city, cars are colored black, white or red. 10% of all cars are black,
60% are white, the rest are red. In the past one year, 4% of all cars had an accident. 15%
of all cars that had an accident are black, 45% are white, the rest are red.
a) Given that a car is red, what is the probability that it had an accident?
b) Are these events independent?
Solution: Using the values 0.04 × 0.15 = 0.006, 0.04 × 0.45 = 0.018 and 0.04 × 0.40 = 0.016,
we can fill the table as follows:
              Black   White   Red
Accident      0.006   0.018   0.016
No accident   0.094   0.582   0.284
a) $P(\text{Acc.} \mid R) = \frac{P(\text{Acc.} \cap R)}{P(R)} = \frac{0.016}{0.016 + 0.284} = 0.053$
b) These events are dependent:
P(Acc.) = 4%, P(Acc. | R) = 5.3% ⇒ P(Acc.) ≠ P(Acc. | R)
Exercise 2-3: In a country, people are unemployed with 0.20 probability. 60% of the
population is young. The probability that a person is unemployed given that he/she is
young is 0.25.
What is the probability that an old person is unemployed? Are these events independent?
Answer: 0.125, NO, they are dependent
Exercise 2-4: A driver uses road 1 with probability 0.3 and road 2 with probability 0.7.
On road 1, the probability that he sees a police car is 0.5, on road 2 it is 0.2. Given that
he saw a police car, what is the probability he took road 1?
Exercise 2-5: There are 20 balls in a box. There is a 50% probability that all are white,
30% probability that 18 are white and 2 are black, 20% probability that 15 are white and
5 are black. We randomly choose two balls and see that both are white. What is the
probability that all balls are white?
Exercise 2-6: The probability that a married man watches Muhteşem Yüzyıl is 0.5. The
probability that a married woman watches it is 0.6. The probability that a man watches
it, given that his wife does, is 0.7.
Given that a married man watches Muhteşem Yüzyıl, what is the probability that his
wife watches it? Are these events independent?
Exercise 2-7: There are two roads, A and B, that I can take in the mornings. I prefer
A 80% of the time. If I choose A, I arrive at work early with probability 0.1, on time with
probability 0.8 and late with probability 0.1. For road B, these probabilities are 0.5, 0.3
and 0.2.
a) What is the probability that I arrive at work early?
b) Given that I arrived early, what is the probability that I took road B?
Solution:
a) $0.8 \cdot 0.1 + 0.2 \cdot 0.5 = 0.18$
b) $\frac{0.1}{0.18} = 0.5556$
Exercise 2-8: We have a shipment of 20 components. There is a 60% probability that it is
good (no defectives), 30% probability that it is medium (1 defective) and 10% probability
that it is bad (2 defectives). We randomly choose two components, test them, and see
that neither is defective.
a) What is the probability that the shipment is good?
b) What is the probability that the shipment is medium?
c) What is the probability that the shipment is bad?
Solution: The probability that there are no defectives among the two components we
test is:
$$P(ND) = 0.60 \times 1 + 0.30 \times \frac{\binom{19}{2}}{\binom{20}{2}} + 0.10 \times \frac{\binom{18}{2}}{\binom{20}{2}} = 0.60 + 0.27 + 0.0805 = 0.9505$$
a) $P(G \mid ND) = \frac{P(G \cap ND)}{P(ND)} = \frac{0.60}{0.9505} = 0.631$
b) $P(M \mid ND) = \frac{P(M \cap ND)}{P(ND)} = \frac{0.27}{0.9505} = 0.284$
c) $P(B \mid ND) = \frac{P(B \cap ND)}{P(ND)} = \frac{0.0805}{0.9505} = 0.085$
Exercise 2-9: 65% of the customers of a coffee shop are women, the rest are men. A
woman orders cappuccino with 50% probability, she orders espresso with 30% probability
and orders something else with 20% probability. For a man these numbers are 25%, 50%
and 25%.
Given that a customer ordered espresso, what is the probability that the customer is
a man?
Solution: Let E denote that the customer orders espresso, M denote that the customer is a
man and W denote that the customer is a woman.
$$P(M \mid E) = \frac{P(M \cap E)}{P(E)} = \frac{0.35 \times 0.5}{0.35 \times 0.5 + 0.65 \times 0.3} = 0.47$$
Exercise 2-10: Three assistants, Buğra, Alphan and Oğuz, are grading homeworks.
Buğra grades 30% of the homeworks, Alphan grades 45% and Oğuz grades the rest.
Buğra makes 2 mistakes per 100 homeworks, Alphan makes 3 and Oğuz makes 5.
I have a homework that was graded wrongly, but I don't know who graded it. What
is the probability it was Oğuz?
Solution: Let O denote that Oğuz has graded the homework and M denote that a mistake was
made.
$$P(O \mid M) = \frac{P(O \cap M)}{P(M)} = \frac{0.25 \times 0.05}{0.25 \times 0.05 + 0.45 \times 0.03 + 0.30 \times 0.02} = 0.39$$
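The last few solutions all follow one pattern: multiply each prior probability by the conditional probability of the observation, then divide by the sum of these products. A minimal sketch of that pattern is given below; the function name posterior is an assumption introduced for illustration, and the numbers are those of Exercise 2-10.

```python
def posterior(priors, likelihoods):
    """Bayes' rule over a finite set of mutually exclusive hypotheses.

    priors[h]      = P(h)
    likelihoods[h] = P(observation | h)
    Returns P(h | observation) for every h.
    """
    joint = {h: priors[h] * likelihoods[h] for h in priors}
    evidence = sum(joint.values())             # total probability of the observation
    return {h: joint[h] / evidence for h in joint}

# Exercise 2-10: who graded the homework that contains a mistake?
priors = {"Buğra": 0.30, "Alphan": 0.45, "Oğuz": 0.25}
mistake_rate = {"Buğra": 0.02, "Alphan": 0.03, "Oğuz": 0.05}

result = posterior(priors, mistake_rate)
print(round(result["Oğuz"], 2))  # 0.39
```

The returned probabilities always sum to 1, which is a quick sanity check on any hand calculation of this type.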
Exercise 2-11: I feel sad 5%, happy 35% and normal 60% of the time. On any given
day, if I feel normal, I do not go to the canteen. If I feel sad, I go to the canteen with probability
70%. If I feel happy, I go to the canteen with probability 30%.
At the canteen, I order either black coffee or coffee with milk. On happy days the
probabilities are 50% - 50%, on sad days 10% - 90%.
Today I was at the canteen, drinking coffee with milk. What is the probability that I am
feeling happy?
Solution: The event Sad + Canteen + Coffee with Milk has probability 0.05 × 0.70 × 0.90 =
0.0315. The alternative event Happy + Canteen + Coffee with Milk has probability 0.35 ×
0.30 × 0.50 = 0.0525. We know one of these happened, so using conditional probability
formulas, I am happy with probability:
$$\frac{0.0525}{0.0525 + 0.0315} = 0.625 = 62.5\%$$
Week 3– Probability Distributions
A Random Variable is a function that associates a real number with each element
in the sample space. Random variables can be discrete or continuous.
Exercise 3-1: There are 10 balls in a box. 4 of them are black, 6 are white. We draw
two balls without replacement. The number of black balls is a random variable. (discrete)
Exercise 3-2: We take cell phones of 4 students and redistribute them randomly. The
number of students getting the correct phone is a random variable. (discrete)
Exercise 3-3: The time between passing of two trucks along the way is a random variable.
(continuous)
Discrete Probability Distributions
f(x) is a probability function, or probability distribution, of the discrete random variable X if:
• $f(x) \ge 0$
• $\sum_x f(x) = 1$
• $P(X = x) = f(x)$
Exercise 3-4: Among a shipment of 20 laptops, 3 are defective. We purchase 2. Find
the probability distribution for the number of defectives.
Let the discrete random variable X have the probability distribution f(x). The cumulative distribution function F(x) is defined as
$$F(x) = P(X \le x) = \sum_{t \le x} f(t)$$
Exercise 3-5: A box contains 2 black and 5 white balls. We randomly select 3. If X is
the number of black balls we choose, find the probability distribution of X. Then, find
the cumulative distribution function.
Answer: f(0) = 2/7, f(1) = 4/7, f(2) = 1/7; F(x) = 0 for x < 0, 2/7 for 0 ≤ x < 1, 6/7 for 1 ≤ x < 2, 1 for x ≥ 2
Continuous Probability Distributions
f(x) is a probability density function of the continuous random variable X if:
• $f(x) \ge 0$
• $\int_{-\infty}^{\infty} f(x)\,dx = 1$
• $P(a < X < b) = \int_a^b f(x)\,dx$
Exercise 3-6: Determine c such that $f(x) = c(x^2 + 4)$ for x = 0, 1, 2, 3 is a probability
distribution.
Answer: 1/30
Exercise 3-7: Let the error in an experiment be given by
$$f(x) = \begin{cases} \frac{x^2}{3} & -1 < x < 2 \\ 0 & \text{elsewhere} \end{cases}$$
a) Verify that f(x) is a density function.
b) Find $P(0 < X \le 1)$
Exercise 3-8: Consider the density function
$$f(x) = \begin{cases} k\sqrt{x} & 0 < x < 1 \\ 0 & \text{elsewhere} \end{cases}$$
a) Find k.
b) Find $P(0.3 < X < 0.6)$
Answer: 3/2, 0.3
The Cumulative Distribution Function F(x) of a continuous random variable X
with density function f(x) is:
$$F(x) = P(X \le x) = \int_{-\infty}^{x} f(t)\,dt$$
Therefore $P(a < X < b) = F(b) - F(a)$.
Exercise 3-9: The time to failure, in hours, of a piece of electronic equipment has the density
$$f(x) = \begin{cases} 0 & x < 0 \\ \exp(-x/2000)/2000 & x \ge 0 \end{cases}$$
a) Find F(x).
b) Find the probability that the equipment lasts at least 1000 hours.
c) Find the probability that it fails before 2000 hours.
Answer: b) 0.6065, c) 0.6321
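The relation between a density and its cumulative distribution function can be checked numerically. The sketch below uses the density of Exercise 3-9; the closed form $F(x) = 1 - e^{-x/2000}$ for $x \ge 0$ follows from integrating that density, and the midpoint-rule integrator is a simple helper assumed here for illustration rather than a method taken from the notes.

```python
import math

def pdf(x):
    return math.exp(-x / 2000) / 2000 if x >= 0 else 0.0

def cdf(x):
    # Closed form obtained by integrating pdf from 0 to x.
    return 1 - math.exp(-x / 2000) if x >= 0 else 0.0

def integrate(f, a, b, steps=100_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))

p_survive_1000 = 1 - cdf(1000)          # P(X >= 1000)
p_fail_2000 = cdf(2000)                 # P(X < 2000)
numeric = integrate(pdf, 0, 2000)       # same probability, by numerical integration

print(round(p_survive_1000, 4), round(p_fail_2000, 4))  # 0.6065 0.6321
```

The numerical integral agrees with F(2000) − F(0), which is the statement P(a < X < b) = F(b) − F(a) in action.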
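Exercise 3-10: The probability density function of a continuous random variable X is:
$$f(x) = \begin{cases} k(1-x)^4 & 0 \le x \le 1 \\ 0 & \text{elsewhere} \end{cases}$$
a) Find k
b) Find $P(0.8 < X)$
Solution: a) With the substitution u = 1 − x,
$$\int_0^1 k(1-x)^4\,dx = k\int_1^0 u^4\,(-du) = \frac{k}{5} = 1 \Rightarrow k = 5$$
b) Since f(x) = 0 for x > 1,
$$P(0.8 < X) = \int_{0.8}^{1} 5(1-x)^4\,dx = 0.2^5 = 0.00032$$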
Exercise 3-11: The waiting time, in hours, for a police radar is a continuous random
variable with probability density function:
$$f(x) = \begin{cases} 0 & x < 0 \\ 8\exp(-8x) & x \ge 0 \end{cases}$$
Find the probability of waiting less than 12 minutes.
Answer: 0.7981
Exercise 3-12: The particle size (in micrometers) distribution in a chemical mixture is
given by
$$f(x) = \begin{cases} 3x^{-4} & x > 1 \\ 0 & \text{elsewhere} \end{cases}$$
Find the probability that the particle size is greater than 4 micrometers.
Exercise 3-13: A continuous random variable X has the probability density function
$$f(x) = \begin{cases} 0 & x < 2 \\ k(1+x) & 2 \le x \le 5 \\ 0 & 5 < x \end{cases}$$
a) Find k.
b) Find $P(4 \le X \le 8)$
Solution: a)
$$\int_{-\infty}^{\infty} f(x)\,dx = \int_2^5 k(1+x)\,dx = \left.\frac{k(1+x)^2}{2}\right|_2^5 = \frac{27k}{2} = 1 \Rightarrow k = \frac{2}{27}$$
b)
$$P(4 \le X \le 8) = \int_4^8 f(x)\,dx = \int_4^5 \frac{2}{27}(1+x)\,dx = \frac{11}{27} = 0.4074$$
Exercise 3-14: Emre hoca announces exam results x hours after the exam ends. The
probability density function of x is:
$$f(x) = \begin{cases} ke^{-x/24} & 18 < x \\ 0 & \text{elsewhere} \end{cases}$$
a) Find k.
b) Find the probability that an exam result is announced within 36 hours of the end of the
exam.
Solution: a) $k\int_{18}^{\infty} e^{-x/24}\,dx = 1 \Rightarrow k = \frac{e^{3/4}}{24} = 0.0882$
b) $P = k\int_{18}^{36} e^{-x/24}\,dx = 0.5276$
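Exercise 3-15: The concentration of a pollutant is a continuous random variable with
probability density function:
$$f(x) = \begin{cases} \frac{c}{x^4} & x > 1 \\ 0 & \text{elsewhere} \end{cases}$$
a) Find c.
b) Find $P(3 < X < 4)$
Solution:
a) $\int_1^{\infty} \frac{c}{x^4}\,dx = \left.\frac{c}{-3x^3}\right|_1^{\infty} = \frac{c}{3} = 1 \Rightarrow c = 3$
b) $P(3 < X < 4) = \int_3^4 \frac{3}{x^4}\,dx = \left.-\frac{1}{x^3}\right|_3^4 = -\frac{1}{64} + \frac{1}{27} = 0.0214$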
Exercise 3-16: The time to failure, in years, of a piece of electronic equipment has the density
$$f(t) = \begin{cases} 0 & t < 0 \\ \frac{e^{-t/3}}{3} & t \ge 0 \end{cases}$$
The company will replace any product that had a lifetime of less than 1 year. What proportion of the products will they replace?
Solution:
$$\int_0^1 \frac{e^{-t/3}}{3}\,dt = \left.-e^{-t/3}\right|_0^1 = 1 - e^{-1/3} = 0.283$$
They will replace 28.3% of the products.
Week 4– Joint Probability Distributions
Discrete case:
f(x, y) is a probability mass function, or joint probability distribution, of the discrete
random variables X and Y if:
• $f(x, y) \ge 0$
• $\sum_x \sum_y f(x, y) = 1$
• $P(X = x, Y = y) = f(x, y)$
Continuous case:
f(x, y) is a joint density function of the continuous random variables X and Y if:
• $f(x, y) \ge 0$
• $\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(x, y)\,dx\,dy = 1$
• $P((X, Y) \in A) = \iint_A f(x, y)\,dx\,dy$ for any region A in the xy-plane
Exercise 4-1: A box contains 3 blue, 2 red and 3 green pens. We randomly choose 2
pens. If X is the number of blue and Y is the number of red pens, find
a) the joint probability function f (x, y).
b) $P((X, Y) \in A)$, where A is the region $\{(x, y) \mid x + y \le 1\}$
Exercise 4-2: Let
$$f(x, y) = \begin{cases} \frac{2}{5}(2x + 3y) & 0 \le x \le 1,\ 0 \le y \le 1 \\ 0 & \text{elsewhere} \end{cases}$$
a) Verify that it is a probability density function.
b) Calculate the probability that $0 < x < \frac{1}{2}$ and $\frac{1}{4} < y < \frac{1}{2}$.
Answer: 13/160
Marginal Distributions
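Given a joint probability distribution f(x, y), we can find the probability distribution
of X only or Y only as follows. In the discrete case:
$$g(x) = \sum_y f(x, y) \quad \text{and} \quad h(y) = \sum_x f(x, y)$$
In the continuous case:
$$g(x) = \int_{-\infty}^{\infty} f(x, y)\,dy \quad \text{and} \quad h(y) = \int_{-\infty}^{\infty} f(x, y)\,dx$$
Exercise 4-3: Let
$$f(x, y) = \begin{cases} \frac{x(1 + 3y^2)}{4} & 0 < x < 2,\ 0 < y < 1 \\ 0 & \text{elsewhere} \end{cases}$$
Find marginal distributions g(x) and h(y).
Answer: $g(x) = x/2$, $h(y) = (1 + 3y^2)/2$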
Exercise 4-4: Let
$$f(x, y) = \begin{cases} 10xy^2 & 0 < x < y < 1 \\ 0 & \text{elsewhere} \end{cases}$$
Find marginal distributions g(x) and h(y).
Answer: $g(x) = 10x(1 - x^3)/3$, $h(y) = 5y^4$
Statistical Independence
The random variables X and Y are said to be statistically independent if and only if
f (x, y) = g(x)h(y)
Exercise 4-5: Let X and Y have the joint distribution given in the table:
f(x, y)   x = 1   x = 2
y = 1     0.2     0.3
y = 2     0.4     0.1
a) Find the marginal distributions of X and Y.
b) Are they statistically independent?
Exercise 4-6: Given a joint density function
$$f(x, y) = \begin{cases} \frac{3y^2}{26e^x} & 0 \le x,\ 1 \le y \le 3 \\ 0 & \text{elsewhere} \end{cases}$$
are X and Y statistically independent?
Exercise 4-7: Given a joint density function
$$f(x, y) = \begin{cases} \frac{4 + 6x + 3y}{128} & 0 \le x + y \le 4 \\ 0 & \text{elsewhere} \end{cases}$$
are X and Y statistically independent?
Exercise 4-8: The age and income distribution in a country is given by the following table,
in percentages:
Income \ Age             20-34   35-49   50-64   65+
Less than $20 000          8       7       4      3
$20 000-$40 000           13      10       8      6
$40 000-$60 000            5       6       8      7
Greater than $60 000       2       2       5      6
a) Find the marginal distributions for age and income.
b) Are they independent?
Exercise 4-9: A coffee factory investigates the relation between wind speed and quality
of coffee produced that day. They obtain the following table for probabilities:
Wind \ Quality     Low     Average   High
Calm (No wind)     0.03    0.225     0.045
Light Wind         0.05    0.375     0.075
Strong Wind        0.02    0.15      0.03
a) Find the marginal distributions for quality and wind speed.
b) Are they independent?
c) Find the probability that we obtain high quality coffee, given that there is strong wind.
d) Find the probability that there is strong wind, given that we obtain high quality coffee.
Solution: a) Quality g(x): Low: 0.1, Average: 0.75, High: 0.15
Wind h(y): Calm: 0.3, Light: 0.5, Strong: 0.2
b) Multiplying these numbers pairwise gives exactly the table above. In other words, f(x, y) =
g(x) · h(y). Therefore, wind speed and quality are independent.
c) $\frac{0.03}{0.02 + 0.15 + 0.03} = 0.15$
d) $\frac{0.03}{0.045 + 0.075 + 0.03} = 0.20$
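The marginal distributions and the independence check for a discrete joint table can be automated. The following sketch uses the table of Exercise 4-9; the function names and the numerical tolerance are assumptions made for the example.

```python
import math

# Joint probabilities f(wind, quality), copied from the table in Exercise 4-9.
joint = {
    ("Calm", "Low"): 0.03, ("Calm", "Average"): 0.225, ("Calm", "High"): 0.045,
    ("Light", "Low"): 0.05, ("Light", "Average"): 0.375, ("Light", "High"): 0.075,
    ("Strong", "Low"): 0.02, ("Strong", "Average"): 0.15, ("Strong", "High"): 0.03,
}

def marginals(table):
    """Return (row marginal, column marginal) as dictionaries."""
    rows, cols = {}, {}
    for (r, c), p in table.items():
        rows[r] = rows.get(r, 0.0) + p
        cols[c] = cols.get(c, 0.0) + p
    return rows, cols

def is_independent(table, tol=1e-9):
    """True if every cell equals the product of its two marginals (within tol)."""
    rows, cols = marginals(table)
    return all(math.isclose(p, rows[r] * cols[c], abs_tol=tol)
               for (r, c), p in table.items())

wind, quality = marginals(joint)
independent = is_independent(joint)
p_high_given_strong = joint[("Strong", "High")] / wind["Strong"]
p_strong_given_high = joint[("Strong", "High")] / quality["High"]

print(independent, round(p_high_given_strong, 2), round(p_strong_given_high, 2))
```

Because the variables are independent here, the conditional probabilities in parts c) and d) equal the corresponding marginals, 0.15 and 0.20.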
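Exercise 4-10: Given a joint density function
$$f(x, y) = \begin{cases} \frac{x^2 + 2y^2}{2500} & -5 < x < 5,\ -5 < y < 5 \\ 0 & \text{elsewhere} \end{cases}$$
what is the probability that 2 < x and 3 < y?
Solution:
$$\int_3^5\!\int_2^5 \frac{x^2 + 2y^2}{2500}\,dx\,dy = \frac{1}{2500}\int_3^5 \left.\left(\frac{x^3}{3} + 2y^2x\right)\right|_2^5 dy = \frac{1}{2500}\left.\left(\frac{117y}{3} + 2y^3\right)\right|_3^5 = \frac{274}{2500} = 0.1096$$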
Exercise 4-11: Given a joint density function
$$f(x, y) = \begin{cases} \frac{x^2 + 2y^2}{2500} & -5 < x < 5,\ -5 < y < 5 \\ 0 & \text{elsewhere} \end{cases}$$
are X and Y statistically independent?
Solution:
$$g(x) = \int_{-5}^{5} \frac{x^2 + 2y^2}{2500}\,dy = \frac{x^2}{250} + \frac{1}{15}, \quad -5 < x < 5$$
$$h(y) = \int_{-5}^{5} \frac{x^2 + 2y^2}{2500}\,dx = \frac{y^2}{125} + \frac{1}{30}, \quad -5 < y < 5$$
$g(x) \cdot h(y) \ne f(x, y)$ ⇒ They are dependent
Week 5– Mathematical Expectation
Expected Value
Let X be a random variable with probability distribution f(x). The mean, or expected value, of X is:
$$\mu = E(X) = \sum_x x f(x)$$
if X is discrete and
$$\mu = E(X) = \int_{-\infty}^{\infty} x f(x)\,dx$$
if X is continuous.
Exercise 5-1: A lot containing 7 components contains 4 good and 3 defective ones. We
take a sample of 3. Find the expected value of number of good components.
Answer: 12/7
Exercise 5-2: Let X be the random variable that denotes the life in hours of a certain
electronic device. The probability density function is
$$f(x) = \begin{cases} \frac{20000}{x^3} & x > 100 \\ 0 & \text{elsewhere} \end{cases}$$
Find the expected life of this type of device.
Answer: 200
Let X be a random variable with probability distribution f(x). The expected value
of g(X) is:
$$E(g(X)) = \sum_x g(x) f(x)$$
if X is discrete and
$$E(g(X)) = \int_{-\infty}^{\infty} g(x) f(x)\,dx$$
if X is continuous.
Exercise 5-3: The number of sales per month has the probability distribution:
x       4      5      6     7     8     9
f(x)    1/12   1/12   1/4   1/4   1/6   1/6
If the salesman is paid a bonus of 2X − 1, find the expected amount of bonus.
Answer: 12.67
Theorem: If a and b are constants, E(aX + b) = aE(X) + b.
Exercise 5-4: Solve the previous problem with a second method.
Variance
Let X be a random variable with probability distribution f(x) and mean µ. The
variance of X is
$$\sigma^2 = E[(X - \mu)^2] = \sum_x (x - \mu)^2 f(x), \quad \text{if X is discrete}$$
$$\sigma^2 = E[(X - \mu)^2] = \int_{-\infty}^{\infty} (x - \mu)^2 f(x)\,dx, \quad \text{if X is continuous}$$
The square root of the variance, σ, is called the standard deviation of X.
Theorem: The variance of a random variable X is $\sigma^2 = E(X^2) - \mu^2$.
Exercise 5-5: Let the random variable X represent the number of typographical errors
on a page. The probability distribution is given as:
x       0      1      2      3
f(x)    0.51   0.38   0.10   0.01
Calculate $\sigma^2$.
Answer: 0.4979
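Both the definition of the variance and the shortcut formula $\sigma^2 = E(X^2) - \mu^2$ are easy to evaluate for a discrete distribution. The sketch below uses the table of Exercise 5-5; the function names are assumptions made for illustration.

```python
def mean(dist):
    """E(X) for a discrete distribution given as {x: f(x)}."""
    return sum(x * p for x, p in dist.items())

def variance_by_definition(dist):
    """E[(X - mu)^2]."""
    mu = mean(dist)
    return sum((x - mu) ** 2 * p for x, p in dist.items())

def variance_by_shortcut(dist):
    """E(X^2) - mu^2."""
    mu = mean(dist)
    return sum(x * x * p for x, p in dist.items()) - mu ** 2

errors = {0: 0.51, 1: 0.38, 2: 0.10, 3: 0.01}   # Exercise 5-5

print(round(mean(errors), 2))                     # 0.61
print(round(variance_by_definition(errors), 4))   # 0.4979
print(round(variance_by_shortcut(errors), 4))     # 0.4979
```

The two variance functions agree up to floating-point rounding, which illustrates the theorem above.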
Exercise 5-6: The weekly demand for a product is a continuous random variable X
having the probability density
$$f(x) = \begin{cases} 2(x - 1) & 1 < x < 2 \\ 0 & \text{elsewhere} \end{cases}$$
Find the mean and variance of X.
Answer: 5/3, 1/18
Exercise 5-7: A random variable X has density function
$$f(x) = \begin{cases} x & 0 < x < 1 \\ 2 - x & 1 \le x < 2 \\ 0 & \text{elsewhere} \end{cases}$$
Find the mean and the variance of X.
Exercise 5-8: A random variable X has density function
$$f(x) = \begin{cases} x & 0 < x < 1 \\ 2 - x & 1 \le x < 2 \\ 0 & \text{elsewhere} \end{cases}$$
Find the expected value of $Y = 3X^2 - 4X$.
Answer: $3 \cdot \frac{7}{6} - 4 \cdot 1 = -\frac{1}{2}$
Chebyshev’s Theorem
Theorem: The probability that any random variable X will take a value within k standard
deviations of the mean is at least $1 - \frac{1}{k^2}$. That is:
$$P(\mu - k\sigma < X < \mu + k\sigma) \ge 1 - \frac{1}{k^2}$$
Exercise 5-9: A random variable X has a mean µ = 10 and a variance $\sigma^2 = 4$. Using
Chebyshev's theorem, find a bound for $P(5 < X < 15)$.
Answer: $P(5 < X < 15) \ge 21/25$
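The bound in Exercise 5-9 comes from writing the interval as µ ± kσ and substituting k into $1 - 1/k^2$. A small sketch of that calculation with exact fractions follows; the function name is an assumption made for illustration.

```python
from fractions import Fraction

def chebyshev_lower_bound(k):
    """Lower bound 1 - 1/k^2 on P(mu - k*sigma < X < mu + k*sigma)."""
    k = Fraction(k)
    return 1 - 1 / k ** 2

mu = 10
sigma = 2                       # square root of the variance 4
k = Fraction(15 - mu, sigma)    # the interval (5, 15) is mu +/- k*sigma with k = 5/2
bound = chebyshev_lower_bound(k)
print(bound)  # 21/25
```

The bound holds for any distribution with this mean and variance, so for a specific density the actual probability is usually larger.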
Exercise 5-10: Compute $P(\mu - 2\sigma < X < \mu + 2\sigma)$ where X has the density function
$$f(x) = \begin{cases} 6x(1 - x) & 0 < x < 1 \\ 0 & \text{elsewhere} \end{cases}$$
and compare with the result given in Chebyshev's theorem.
Exercise 5-11: Find the mean and variance of a random variable X whose probability
distribution is:
x       0      5      10     20
f(x)    0.17   0.33   0.41   0.09
Solution: $\mu = E(X) = 0 \times 0.17 + 5 \times 0.33 + 10 \times 0.41 + 20 \times 0.09 = 7.55$
$\sigma^2 = E(X^2) - \mu^2 = 0 \times 0.17 + 25 \times 0.33 + 100 \times 0.41 + 400 \times 0.09 - 7.55^2 = 28.2475$
Exercise 5-12: The length of time, in seconds, that cars have to wait at a traffic light has the
density function:
$$f(x) = \begin{cases} \frac{1}{5}e^{-x/5} & 0 < x \\ 0 & \text{elsewhere} \end{cases}$$
a) Find $E(X)$
b) Find $E(X^2)$
Solution: Using integration by parts, we can show that, for any nonzero a:
$$\int x e^{ax}\,dx = \frac{x e^{ax}}{a} - \frac{e^{ax}}{a^2}$$
$$\int x^2 e^{ax}\,dx = \frac{x^2 e^{ax}}{a} - \frac{2x e^{ax}}{a^2} + \frac{2e^{ax}}{a^3}$$
a) $E(X) = \int_0^{\infty} \frac{x e^{-x/5}}{5}\,dx = 5$
b) $E(X^2) = \int_0^{\infty} \frac{x^2 e^{-x/5}}{5}\,dx = 50$
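Exercise 5-13: The probability density function of a random variable is:
$$f(x) = \begin{cases} \frac{3}{116}\left(1 + 7x - x^2\right) & 0 < x < 4 \\ 0 & \text{elsewhere} \end{cases}$$
Find $\sigma^2$ (the variance).
Solution:
$$\mu = \int_0^4 \frac{3}{116}\left(x + 7x^2 - x^3\right)dx = 2.4138$$
$$\sigma^2 = E(X^2) - \mu^2 = \int_0^4 \frac{3}{116}\left(x^2 + 7x^3 - x^4\right)dx - \mu^2 = 1.0150$$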
Week 6– Binomial Distributions
Bernoulli Process: In a Bernoulli process, we make trials. The result of each trial is
success or failure. (There are two options) The probability of success (p) remains constant
from trial to trial.
Exercise 6-1: We select a card from a standard deck. We replace the card and shuffle
after each trial. What is the probability that we get 3 hearts after 6 trials?
(If we do this without replacement, it is no longer Bernoulli)
Binomial Distribution: In a Bernoulli process, if the probability of success is p and the
probability of failure is q = 1 − p, the probability of x successes in n trials is given by:
$$b(x; n, p) = \binom{n}{x} p^x q^{n-x}$$
Note that
$$\sum_{x=0}^{n} b(x; n, p) = 1$$
The mean of the binomial distribution is µ = np and the variance is $\sigma^2 = npq$.
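The three facts above (the probabilities sum to 1, the mean is np, the variance is npq) can be confirmed numerically for any particular n and p. The sketch below picks n = 10 and p = 0.3 as example values; the function name binom_pmf is an assumption made for illustration.

```python
from math import comb

def binom_pmf(x, n, p):
    """b(x; n, p): probability of exactly x successes in n Bernoulli trials."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

n, p = 10, 0.3                                    # example values
pmf = [binom_pmf(x, n, p) for x in range(n + 1)]

total = sum(pmf)                                            # should be 1
mean = sum(x * f for x, f in enumerate(pmf))                # should be n*p = 3
var = sum((x - mean) ** 2 * f for x, f in enumerate(pmf))   # should be n*p*q = 2.1
tail = sum(pmf[6:])                                         # P(X >= 6)

print(round(total, 6), round(mean, 6), round(var, 6), round(tail, 4))
```

Tail probabilities such as P(X ≥ 6) are obtained by summing the relevant terms of the list, which is how "at least" and "or more" questions in the exercises can be checked.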
Exercise 6-2: We conjecture 30% of the wells in an area are impure. We randomly
select 10 wells and test. The results show 6 have impurity. What can we say about the
conjecture? (If it were correct, we would see this with 4.7% chance)
Exercise 6-3: Suppose airplane engines operate independently and fail with probability
p = 0.4. Assuming that a plane makes a safe flight if at least one-half of its engines run,
determine whether a 4-engine plane or 2-engine plane has a better chance.
Exercise 6-4: The probability that a patient recovers after a heart operation is 0.9. Find
the probability that,
a) Out of the next 10 patients, 5 or more recover.
b) Out of the next 8 patients, 4 or more recover.
Exercise 6-5: Tests show that only 30% of the cars have correct tire pressure. We test
7 cars. Find the probability that
a) 2 or more have correct pressure
b) 3 to 6 have correct pressure.
Exercise 6-6: According to statistics of finance ministry, one in five cars have unpaid
tax. Suppose we check 10 randomly chosen cars.
a) What is the probability that exactly 4 of them have unpaid tax?
b) What is the probability that 4 or more of them have unpaid tax?
Answer: 0.088, 0.121
Multinomial Distribution
If each trial has more than 2 possible outcomes, we have a multinomial distribution.
If k outcomes result with probabilities $p_1, \ldots, p_k$, after n independent trials
$$f(x_1, \ldots, x_k; p_1, \ldots, p_k; n) = \frac{n!}{x_1! \cdots x_k!}\, p_1^{x_1} \cdots p_k^{x_k}$$
where $\sum x_i = n$ and $\sum p_i = 1$.
Exercise 6-7: At a traffic light, green signal stays for 15 seconds, yellow for 5 seconds
and the red for 40 seconds. We pass through it 5 times. We encounter green light X1
times, yellow X2 times and red X3 times. Find the distribution of X1 , X2 , X3 .
Answer: $\frac{5!}{x_1!\,x_2!\,x_3!}\, 0.25^{x_1}\, 0.083^{x_2}\, 0.67^{x_3}$
Exercise 6-8: In a large classroom, 55% of the students are from CENG, 35% are from
ECE and 10% are from IE departments. We randomly choose 6 students. What is the
probability that 3 are from CENG, 2 are from ECE and 1 is from IE?
Solution: Using the multinomial distribution,
$$p = \frac{6!}{3!\,2!\,1!}\, 0.55^3\, 0.35^2\, 0.10 = 0.1223$$
Exercise 6-9: Jale and Seçil are testing some equipment. Jale estimates that 10% are
defective, Seçil estimates that 15% are defective. They test 30 items and find 4 defective
ones. What is the probability of this outcome
a) Assuming Jale is right?
b) Assuming Seçil is right?
c) Who is right? (Assuming one of them is right)
Solution: a) $\binom{30}{4}\, 0.1^4\, 0.9^{26} = 0.1771$
b) $\binom{30}{4}\, 0.15^4\, 0.85^{26} = 0.2028$
c) Seçil is right with probability:
$$\frac{0.2028}{0.2028 + 0.1771} = 0.53$$
Exercise 6-10: You receive a large shipment of electronic components. It is either
"good", which means 5% are defective, or "bad", which means 15% are defective. You
randomly choose a sample of 20 components and test them. You reject the shipment if
there are 2 or more defectives, accept otherwise.
a) Suppose the shipment is good. What is the probability of rejecting?
b) Suppose the shipment is bad. What is the probability of accepting?
Solution:
a) $1 - \left[0.95^{20} + \binom{20}{1}\, 0.05 \times 0.95^{19}\right] = 1 - 0.7358 = 0.2642$
b) $0.85^{20} + \binom{20}{1}\, 0.15 \times 0.85^{19} = 0.1756$
Exercise 6-11: In a court, there are 9 judges. They make the decision "Guilty" or
"Innocent" independently. Each judge has the same rate of error. They find an innocent
person guilty 20% of the time, and a guilty person innocent 30% of the time. An accused
person is considered guilty if 7 or more judges find him guilty.
a) Suppose you are innocent. What is the probability that the court will find you guilty?
b) Suppose you are guilty. What is the probability that the court will find you innocent?
Solution:
a) $\binom{9}{7}\, 0.2^7 \times 0.8^2 + \binom{9}{8}\, 0.2^8 \times 0.8 + \binom{9}{9}\, 0.2^9 = 0.000314$
b) $1 - \left[\binom{9}{7}\, 0.7^7 \times 0.3^2 + \binom{9}{8}\, 0.7^8 \times 0.3 + \binom{9}{9}\, 0.7^9\right] = 1 - 0.4628 = 0.5372$
Week 7– Hypergeometric and Negative Binomial Dist.
Hypergeometric Distribution
Hypergeometric distribution is based on sampling without replacement.
Exercise 7-1: There are 2 black and 8 white balls in a basket. We randomly choose 3.
What is the probability that all are white?
Answer: $\frac{\binom{2}{0}\binom{8}{3}}{\binom{10}{3}} = 0.467$
In general, there are N items. We consider k of them as successes and N − k as failures.
We randomly choose n items without replacement. What is the probability that there are
x successes?
$$h(x; N, n, k) = \frac{\binom{k}{x}\binom{N-k}{n-x}}{\binom{N}{n}}, \quad \max\{0, n - (N - k)\} \le x \le \min\{n, k\}$$
Exercise 7-2: A lot of 40 components is unacceptable if there are 3 or more defectives.
We test 5 randomly chosen elements and reject the lot if one is defective. What is the
probability that exactly one defective is found assuming there are 3 total defectives?
Answer: 0.3011
Theorem: The mean and variance of the hypergeometric distribution h(x; N, n, k) are
$$\mu = \frac{nk}{N}, \qquad \sigma^2 = \frac{N - n}{N - 1} \cdot n \cdot \frac{k}{N}\left(1 - \frac{k}{N}\right)$$
There is a close relationship between binomial and hypergeometric distributions. If
$n \ll N$, the distinction between sampling with and without replacement becomes negligible.
Exercise 7-3: A factory reports that of the 5000 tires sent to a local distributor, 1000
are slightly blemished. You purchase 10. What is the probability that exactly 3 are
blemished?
Answer: 0.2015 (hypergeometric) ≈ 0.2013 (binomial approximation)
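The closeness of the two distributions when the sample is small compared to the population can be seen by computing both values. The sketch below uses the numbers of Exercise 7-3; the function names are assumptions made for illustration.

```python
from math import comb

def hypergeom_pmf(x, N, n, k):
    """h(x; N, n, k): x successes in a sample of n drawn without replacement."""
    return comb(k, x) * comb(N - k, n - x) / comb(N, n)

def binom_pmf(x, n, p):
    """b(x; n, p): x successes in n independent trials."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

N, k, n, x = 5000, 1000, 10, 3       # Exercise 7-3: 1000 blemished tires among 5000
exact = hypergeom_pmf(x, N, n, k)
approx = binom_pmf(x, n, k / N)      # binomial approximation with p = k/N

print(round(exact, 4), round(approx, 4))  # 0.2015 0.2013
```

Python integers have arbitrary precision, so the large binomial coefficients are computed exactly before the final division.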
Exercise 7-4: There are 500 students in a CENG department. 150 use Linux and the
rest use Windows on their personal computers. We randomly choose 7 students. What is
the probability that 4 of them use Linux? Answer this using
a) Hypergeometric distribution.
b) Binomial distribution approximation.
Answer: 0.09659, 0.09724
Exercise 7-5: There are 600 cars in the parking area. 150 are Turkish and the rest are
foreign made. We randomly choose 12 cars. What is the probability that 6 of them are
Turkish? Answer this using
a) Hypergeometric distribution.
b) Binomial distribution approximation.
Exercise 7-6: A network makes errors in 1500 bits per 100000 bits transmitted. Each
packet consists of 100 bits. If there are 4 or more errors per packet, we request retransmission.
a) Assuming we can detect all errors, what is the probability of a retransmission request?
b) Assuming we can detect at most 6 errors per packet, what is the probability of a retransmission request? What is the probability of accepting a packet with errors?
Solution:
For one bit, the error probability is q = 1500/100000 = 0.015 and the correct arrival probability is p = 1 − q = 0.985.
a) If there are 0, 1, 2 or 3 errors, we do not request a retransmission. If there are 4, 5, 6, . . .
or 100 errors, we do.
$$1 - \left[p^{100} + \binom{100}{1} p^{99} q + \binom{100}{2} p^{98} q^2 + \binom{100}{3} p^{97} q^3\right] = 0.0642$$
b) Now we assume we request a retransmission if there are 4, 5 or 6 errors:
$$\binom{100}{4} p^{96} q^4 + \binom{100}{5} p^{95} q^5 + \binom{100}{6} p^{94} q^6 = 0.0634$$
The probability that there are 7 or more errors is:
0.0642 − 0.0634 = 0.0008
Exercise 7-7: a) Of the 50 cars in the parking lot, 13 are using diesel fuel and 37
gasoline. We randomly choose 10. What is the probability that 5 are using diesel?
b) Of the 500 people working at a hospital, 220 are female and 280 are male. We randomly
choose 10. What is the probability that 5 are female?
c) If you had to solve one of the above problems using an approximation, which one would
you choose? a) or b)? Which approximation would you use? Explain.
Solution: Using hypergeometric distribution,
13 37
5
5
a) p =
= 0.0546
50
10
b) p =
220 280
5 5
= 0.2309
500
10
c) We can use the binomial approximation to the hypergeometric distribution. We should prefer part b), because the population (N = 500) is much larger relative to the sample (n = 10), so sampling without replacement changes the success probability very little and we expect a better approximation.
This approximation gives 0.0664 for part a), which is a relative error of about 22%. It gives 0.2289 for part b), which is a relative error of about 0.8%.
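The comparison in part c) can be checked numerically. The following is a minimal Python sketch, not part of the original notes; the helper names `hypergeom_pmf` and `binom_pmf` are illustrative assumptions, and only the standard library function `math.comb` is used.

```python
from math import comb

def hypergeom_pmf(x, N, K, n):
    """P(exactly x marked items) when n items are drawn without replacement
    from a population of N items, K of which are marked."""
    return comb(K, x) * comb(N - K, n - x) / comb(N, n)

def binom_pmf(x, n, p):
    """P(exactly x successes) in n independent trials with success probability p."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

# Exercise 7-7: part a) has N = 50, K = 13; part b) has N = 500, K = 220.
for label, N, K in [("a", 50, 13), ("b", 500, 220)]:
    exact = hypergeom_pmf(5, N, K, 10)
    approx = binom_pmf(5, 10, K / N)   # binomial approximation with p = K/N
    rel_err = abs(approx - exact) / exact
    print(f"{label}) exact={exact:.4f}  approx={approx:.4f}  rel. error={rel_err:.1%}")
```

The relative errors printed here are computed from unrounded values, so they may differ in the last digit from figures obtained with rounded intermediate results.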
Negative Binomial Distribution
Consider an experiment where the probability of success is fixed, as in the binomial case. We are again interested in k successes in x trials, but this time we require the kth success to occur on the xth trial.
Exercise 7-8: In NBA championship, the team that wins four out of seven games is the
winner. Suppose team A has probability 0.55 of winning a game over team B.
a) What is the probability that team A will win the series in 6 games?
b) What is the probability that team A will win the series?
Answer: 0.1853, 0.6083
If the probabilities of success (p) and failure (q = 1 − p) are fixed, the probability that the kth success occurs on trial x is:
$$b^{*}(x; k, p) = \binom{x-1}{k-1} p^{k} q^{x-k}, \qquad x = k, k+1, \ldots$$
The average number of trials until kth success is:
$$\mu = \frac{k}{p}$$
We can prove this starting with
$$\mu = \sum_{i=k}^{\infty} i \binom{i-1}{k-1} p^{k} (1-p)^{i-k}$$
and using derivatives of geometric series.
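As a numerical companion to the formulas above, here is a small Python sketch (an illustration, not the author's method). The function name `neg_binom_pmf` is an assumption, and the infinite series for the mean is simply truncated at 2000 terms rather than summed in closed form. It reproduces the answers of Exercise 7-8 and checks µ = k/p for the illustrative values k = 3, p = 0.2.

```python
from math import comb

def neg_binom_pmf(x, k, p):
    """b*(x; k, p): probability that the k-th success occurs on trial x."""
    if x < k:
        return 0.0
    return comb(x - 1, k - 1) * p**k * (1 - p)**(x - k)

# Exercise 7-8: team A wins a game with p = 0.55 and needs k = 4 wins.
win_in_6 = neg_binom_pmf(6, 4, 0.55)
win_series = sum(neg_binom_pmf(x, 4, 0.55) for x in range(4, 8))

# Mean number of trials until the k-th success, by truncating the series.
k, p = 3, 0.2
mean_trials = sum(x * neg_binom_pmf(x, k, p) for x in range(k, 2000))

print(round(win_in_6, 4), round(win_series, 4), round(mean_trials, 4))
```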
Exercise 7-9: In a sports tournament, the team that wins 5 out of 9 games advances to the next round. Team A has probability 0.6 of winning any one game against team B. What is the probability that this round ends in exactly 7 games?
Solution: Either team A or team B may win in 7 games. The winner must win the 7th game, so
$$p = \binom{6}{4} 0.6^{5}\, 0.4^{2} + \binom{6}{4} 0.4^{5}\, 0.6^{2} = 0.1866 + 0.0553 = 0.2419$$
Exercise 7-10: Suppose that the probability of male or female birth is 0.5. A couple
wishes to have exactly two daughters, and they will continue to have babies until this
condition is satisfied. What is the probability that the family has 2 sons?
Answer: 0.188
Exercise 7-11: We throw a pair of dice until we get 6-6. What is the expected value of
the number of throws?
Answer: 36
Exercise 7-12: An oil company drills wells. Their probability of success is 0.2. They
will stop at the third success. What is the average number of wells they drill?
Answer: 15
Exercise 7-13: On a Saturday night, Alphan is playing a game on his phone. He wins with probability 0.23. His friends are waiting for him to go out, but Alphan says "I will continue until I win 7 times".
Çağatay says: "We will wait exactly 25 games." Oğuz says: "We will wait exactly 30 games." Alphan is more optimistic, he thinks his friends will wait at most 10 games.
a) What is the probability that Çağatay is right?
b) What is the probability that Oğuz is right?
c) What is the probability that Alphan is right?
d) What is the probability that all are wrong?
Solution:
a) $\binom{24}{6} 0.23^{7}\, 0.77^{18} = 0.0415$
b) $\binom{29}{6} 0.23^{7}\, 0.77^{23} = 0.0396$
c) $\binom{9}{6} 0.23^{7}\, 0.77^{3} + \binom{8}{6} 0.23^{7}\, 0.77^{2} + \binom{7}{6} 0.23^{7}\, 0.77 + \binom{6}{6} 0.23^{7} = 0.0021$
d) 1 − 0.0415 − 0.0396 − 0.0021 = 0.9168
Exercise 7-14: You have started using your father’s car today. On each day, there is a probability of 0.01 that you have an accident. Your father says "You can make a mistake at most twice. At your third mistake, I will take back the car". If you use the car every day, what is the probability that you lose it on day 100?
Solution:
$$\binom{99}{2} \times 0.01^{3} \times 0.99^{97} = 0.0018$$
Exercise 7-15: A biased coin has probability 0.7 of coming up Heads. We start tossing this coin and stop when we obtain 10 Tails. What is the probability that we stop after exactly 20 tosses?
Solution:
$$\binom{19}{10} \times 0.7^{10} \times 0.3^{10} = 0.0154$$
Week 8– Poisson Distribution
Properties of Poisson Process
• The number of outcomes occurring in one time interval is independent of the number
that occurs in any other interval
• The probability that a single outcome will occur in a very short time interval is
proportional to the length of time interval
• The probability that more than one outcome will occur in such a short time interval
is negligible
The probability distribution of the Poisson random variable X is:
$$p(x; \mu) = \frac{e^{-\mu} \mu^{x}}{x!}$$
where µ is the average number of outcomes per unit time.
Exercise 8-1: During an experiment, average number of radioactive particles passing
through a counter in 1 millisecond is 4. What is the probability that 6 particles enter the
counter in any given millisecond?
Answer: 0.1042
Exercise 8-2: The average number of tankers arriving at a port each day is 10. The facilities can handle at most 15 tankers per day. What is the probability that tankers have to be turned away on any given day?
Answer: 0.0487
Theorem: Both the mean and the variance of the Poisson distribution are µ.
Theorem: Let X be a binomial random variable with probability distribution b(x; n, p). When n → ∞, p → 0 and np → µ remains constant,
$$b(x; n, p) \to p(x; \mu)$$
Exercise 8-3: In a factory, the probability of an accident on a given day is 0.005 and
accidents are independent of each other. What is the probability that in any given period
of 400 days
a) There will be one accident?
b) There will be at most three accidents?
Answer: 0.271, 0.857
Exercise 8-4: For a certain type of copper wire, it is known that, on the average, 1.5 flaws
occur per millimeter. Assuming that the number of flaws is a Poisson random variable,
what is the probability that in a certain portion of the wire of length 5 millimeters
a) No flaw occurs ?
b) 10 or more flaws occur?
Solution: λt = 5 × 1.5 = 7.5
a) With x = 0: $\dfrac{e^{-\lambda t} (\lambda t)^{x}}{x!} = e^{-7.5} = 5.5308 \times 10^{-4}$
b) $\displaystyle\sum_{x=10}^{\infty} \frac{e^{-7.5}\, 7.5^{x}}{x!} = 1 - 0.7764 = 0.2236$
Exercise 8-5: Of all the computers in the campus, 2% have Ubuntu installed. We
randomly select 250 and test. What is the probability that we observe Ubuntu in 13 of
them? Answer using
a) Binomial distribution.
b) Poisson distribution.
Solution:
a) $\binom{250}{13} 0.02^{13}\, 0.98^{237} = 1.189 \times 10^{-3}$
b) λ = 250 × 0.02 = 5, so $\dfrac{e^{-5}\, 5^{13}}{13!} = 1.321 \times 10^{-3}$
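To see how close the Poisson approximation is to the exact binomial value in Exercise 8-5, both can be evaluated side by side. This is a minimal sketch with assumed helper names (`binom_pmf`, `poisson_pmf`), using only the Python standard library.

```python
from math import comb, exp, factorial

def binom_pmf(x, n, p):
    """Exact binomial probability b(x; n, p)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def poisson_pmf(x, mu):
    """Poisson probability p(x; mu)."""
    return exp(-mu) * mu**x / factorial(x)

n, p, x = 250, 0.02, 13
exact = binom_pmf(x, n, p)
approx = poisson_pmf(x, n * p)   # mu = np = 5

print(f"binomial = {exact:.3e}, Poisson = {approx:.3e}")
```

The two values agree in order of magnitude but not to three digits, which is expected this far out in the tail (x = 13 against a mean of 5).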
Exercise 8-6: On average, 1 person in 1000 makes a numerical error while preparing an income tax form. If 10000 forms are selected at random and examined, what is the probability that 15 or more contain an error?
Exercise 8-7: The number of customers arriving per hour at an auto service follows a Poisson distribution with mean λ = 7.
a) Find the probability that, at a certain hour, no customers come.
b) Find the probability that, within two hours, at least 10 and at most 20 customers
come.
c) Find the mean number of arrivals during a 2-hour period.
Answer: $9.12 \times 10^{-4}$, 0.8427, 14
Exercise 8-8: Aysun has analyzed several years’ lists and found that Emre hoca fails 15 students per semester on average. But this year he failed 25. So Aysun thinks Emre hoca must have started using different limits, because the probability of such an outcome is very low assuming he is using the old system.
Nesib points out that the probability that exactly 15 students fail is also low. He says Emre hoca is probably using the usual system.
Let the number of failed students be n. Assuming µ = 15, find the probability that
a) n = 25
b) n = 15
c) n ≥ 25
d) 10 ≤ n ≤ 20
e) Who is right, Aysun or Nesib?
Solution: We have to use Poisson distribution, because only the average is given.
a) 0.9938 − 0.9888 = 0.0050
b) 0.5681 − 0.4657 = 0.1024
c) 1 − 0.9888 = 0.0112
d) 0.9170 − 0.0699 = 0.8471
e) Probably Aysun is right, because part c) gives 1% probability for such a result.
Exercise 8-9: The probability that a cell phone rings in any given second is 0.0025. Find
the probability that it rings 4 times or more in an hour, using:
a) Exact method.
b) An approximation.
Solution: a) Using the binomial distribution, we find the probability as:
$$1 - \left[ \binom{3600}{0} 0.0025^{0}\, 0.9975^{3600} + \binom{3600}{1} 0.0025^{1}\, 0.9975^{3599} + \binom{3600}{2} 0.0025^{2}\, 0.9975^{3598} + \binom{3600}{3} 0.0025^{3}\, 0.9975^{3597} \right]$$
= 1 − [0.0001 + 0.0011 + 0.0050 + 0.0149]
= 0.9789
b) Using Poisson distribution, the average per hour is: 0.0025 × 3600 = 9. Using the
table, we find the probability as:
1 − 0.0212 = 0.9788
Exercise 8-10: You are in the real estate business and on average, you sell 17 houses
per month. Find the probability of
a) Good month. (25 or more sales)
b) Normal month. (10 − 24 sales)
c) Bad month. (2 − 9 sales)
d) Terrible month. (0 − 1 sales)
(Include 8 digits for part d)
Solution: Using the Table on Poisson Probability Sums, we obtain:
a) 1 − 0.9594 = 0.0406
b) 0.9594 − 0.0261 = 0.9333
c) 0.0261 − 0.0000 = 0.0261
Using the Poisson formula, we obtain:
d) $e^{-17} \left( \dfrac{17^{0}}{0!} + \dfrac{17^{1}}{1!} \right) = 7.45 \times 10^{-7}$
Exercise 8-11: You work in a warehouse which receives 2 orders per hour on average.
It is open 8 hours per day. If on any given day you receive 24 or more orders, you call it
a difficult day. If you receive 8 − 23 orders, you call it a normal day. If you receive 7 or
less, you call it an easy day. Find the probability of experiencing
a) A difficult day
b) A normal day
c) An easy day
d) No orders. (8 digits)
Solution: Average per day = 2 × 8 = 16. Using the Table on Poisson Probability Sums,
we obtain:
a) 1 − 0.9633 = 0.0367
b) 0.9633 − 0.0100 = 0.9533
c) 0.0100
Using the Poisson formula, we obtain:
d) $e^{-16}\, \dfrac{16^{0}}{0!} = 1.12 \times 10^{-7}$
Week 9– Normal Distribution
Normal distribution is the most important continuous probability distribution in statistics. The density of the normal random variable X, with mean µ and variance σ², is
$$n(x; \mu, \sigma) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{1}{2\sigma^{2}} (x-\mu)^{2}}, \qquad -\infty < x < \infty$$
The curve is symmetric about x = µ, where it attains its maximum. It asymptotically approaches the x-axis as we move away from the center. The total area under the curve is 1. We can prove this using the substitution $z = \dfrac{x-\mu}{\sigma}$.
Areas Under the Normal Curve
To find the probability that x1 < X < x2 , we have to compute
$$P(x_1 < X < x_2) = \frac{1}{\sqrt{2\pi}\,\sigma} \int_{x_1}^{x_2} e^{-\frac{1}{2\sigma^{2}} (x-\mu)^{2}}\, dx$$
This can be transformed into
$$P(z_1 < Z < z_2) = \frac{1}{\sqrt{2\pi}} \int_{z_1}^{z_2} e^{-\frac{1}{2} z^{2}}\, dz$$
where Z is a normal random variable with mean 0 and variance 1. This is called
standard normal distribution.
Using polar coordinates, we can prove that
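$$\int_{-\infty}^{\infty} e^{-a x^{2}}\, dx = \sqrt{\frac{\pi}{a}}$$
Differentiating both sides with respect to a gives
$$\int_{-\infty}^{\infty} x^{2} e^{-a x^{2}}\, dx = \frac{\sqrt{\pi}}{2a\sqrt{a}}$$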
Exercise 9-1: Given a standard normal distribution, find the area under the curve
a) to the right of z = 1.84
b) between z = −1.97 and z = 0.86
Answer: 0.0329, 0.7807
Exercise 9-2: Given a standard normal distribution, find k such that P (Z > k) = 0.3015.
Answer: k = 0.52
Exercise 9-3: A certain type of battery lasts, on average, 3 years with a standard
deviation of 0.5 years. Assuming battery life is normally distributed, find the probability
that a given battery lasts less than 2.3 years.
Answer: 0.0808
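Table lookups such as those in Exercises 9-1 and 9-3 can be cross-checked with the error function from the standard library. The sketch below is illustrative; `phi` and `normal_cdf` are assumed helper names, and the identity Φ(z) = ½[1 + erf(z/√2)] is a standard result that the notes do not state.

```python
from math import erf, sqrt

def phi(z):
    """Standard normal cumulative distribution function P(Z < z)."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def normal_cdf(x, mu, sigma):
    """P(X < x) for a normal variable, via the transformation z = (x - mu) / sigma."""
    return phi((x - mu) / sigma)

right_of_184 = 1 - phi(1.84)            # Exercise 9-1 a)
between = phi(0.86) - phi(-1.97)        # Exercise 9-1 b)
battery = normal_cdf(2.3, 3, 0.5)       # Exercise 9-3

print(round(right_of_184, 4), round(between, 4), round(battery, 4))
```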
Exercise 9-4: The average grade for an exam is 74 and the standard deviation is 7. If
12% of the class get A, what is the lowest possible A and highest possible B? Assume
grades are distributed normally.
Answer: 83, 82
Exercise 9-5: Find the value of k such that the area under the standard normal curve
between −k < z < k is equal to 0.762.
Solution:
P (−k < z < k) = 0.762
P (0 < z < k) = 0.381
P (z < k) = 0.5 + 0.381 = 0.881
Using the table we find k = 1.18
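The reverse lookup (from an area to a z value) can also be done in code. The following sketch assumes Python 3.8 or later, where `statistics.NormalDist` provides `inv_cdf`; the helper name `symmetric_k` is an illustrative assumption.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

def symmetric_k(area):
    """Return k such that P(-k < Z < k) = area, using P(Z < k) = 0.5 + area / 2."""
    return std_normal.inv_cdf(0.5 + area / 2)

k_95 = symmetric_k(0.762)                    # Exercise 9-5
k_92 = std_normal.inv_cdf(1 - 0.3015)        # Exercise 9-2: P(Z > k) = 0.3015

print(round(k_95, 2), round(k_92, 2))
```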
Exercise 9-6: The IQ’s of 600 applicants to a certain college are approximately normally
distributed with µ = 115 and σ = 12. If the college requires an IQ of at least 95, how
many of them will be rejected? Note that IQ’s are rounded to the nearest integer.
Solution:
$$Z = \frac{94.5 - 115}{12} = -1.71$$
P (Z < −1.71) = 0.0436
0.0436 × 600 ≈ 26
Exercise 9-7: The average time for a trip from your home to work is 24 minutes with a
standard deviation of 3.8 minutes. Assume the trip times are normally distributed. You
leave home at 08:35 and you must be at work by 09:00. What is the probability that you
will be late?
Answer: P (z > 0.26) = 1 − 0.6026 = 0.3974
Exercise 9-8: The average life of a small motor is 10 years with a standard deviation of
2 years. The manufacturer replaces free all motors that fail while under guarantee. To
replace only 3%, how long a guarantee should be offered? Assume lifetime of a motor
follows a normal distribution.
Exercise 9-9: Let random variable x have a normal distribution with µ = 710 and
σ = 93.
a) Find a such that P (710 − a < x < 710 + a) = 0.76.
b) Find b such that P (710 < x < b) = 0.36.
c) Find c such that P (x < c) = 0.14.
d) Find the probability that x > 1000.
Solution:
a) $\frac{0.76}{2} + 0.5 = 0.88$, P (z < k) = 0.88 ⇒ k = 1.175, a = 1.175 × 93 = 109.275
b) 0.36 + 0.5 = 0.86, P (z < k) = 0.86 ⇒ k = 1.08, b = 710 + 1.08 × 93 = 810.44
c) z = −1.08, c = 710 − 1.08 × 93 = 609.56
d) $\frac{1000 - 710}{93} = 3.12$, P (z > 3.12) = 1 − 0.9991 = 0.0009
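Exercise 9-10: If the function $f(x) = k e^{-x^{2}/3}$ is a probability distribution, what is k?
Solution: Let $I = \int_{-\infty}^{\infty} e^{-x^{2}/3}\, dx$. Then
$$I^{2} = \int_{-\infty}^{\infty} e^{-x^{2}/3}\, dx \int_{-\infty}^{\infty} e^{-y^{2}/3}\, dy$$
$$= \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} e^{-(x^{2}+y^{2})/3}\, dx\, dy$$
$$= \int_{0}^{2\pi} \int_{0}^{\infty} e^{-r^{2}/3}\, r\, dr\, d\theta$$
$$= 3\pi$$
Therefore $I = \sqrt{3\pi}$ and $k = \dfrac{1}{\sqrt{3\pi}}$.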
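Second Method: We know that the normal distribution
$$\frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{1}{2\sigma^{2}} (x-\mu)^{2}}$$
is a probability distribution. If we choose µ = 0 and 2σ² = 3 we obtain the given function, therefore
$$\sigma = \sqrt{\frac{3}{2}}, \qquad k = \frac{1}{\sqrt{2\pi}\,\sqrt{\frac{3}{2}}} = \frac{1}{\sqrt{3\pi}}$$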
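The value of k in Exercise 9-10 can be sanity-checked by numerical integration. This is a rough sketch under stated assumptions: the `integrate` helper is a plain midpoint rule written for illustration, and the infinite range is cut off at ±30, where the integrand is far below floating-point precision.

```python
from math import exp, pi, sqrt

def integrate(f, a, b, steps=100_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))

# I is the integral of exp(-x^2 / 3) over the real line (truncated to [-30, 30]).
I = integrate(lambda x: exp(-x * x / 3), -30.0, 30.0)
k = 1 / I

print(I, sqrt(3 * pi))       # the two values should agree closely
print(k, 1 / sqrt(3 * pi))
```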
Exercise 9-11: The average height of women is 161 cm with a standard deviation of 6
cm and the average height of men is 173 cm with a standard deviation of 7 cm. A mirror
in a shopping mall has dimensions such that 85% of women (the middle 85%, split equally above and below the average height) can use it comfortably.
What percentage of men can use it comfortably?
(Assume height distribution is normal)
Solution:
$\frac{0.85}{2} + 0.5 = 0.925$
P (−k < z < k) = 0.85 ⇒ k = 1.44
161 + 1.44 × 6 = 169.64
161 − 1.44 × 6 = 152.36
So the mirror was designed for people with heights in the interval [152.36, 169.64]. For men, these correspond to the z values:
$$\frac{152.36 - 173}{7} = -2.95, \qquad \frac{169.64 - 173}{7} = -0.48$$
P (−2.95 < z < −0.48) = 0.3156 − 0.0016 = 0.314 = 31.4%
Exercise 9-12: In a large scale international examination, students in the top 1.5% get
A and the students in the top 3.5% after them get B. We are given that the limits of B
are [509.44, 534.16].
a) What is the average (µ) of this distribution?
b) What is the standard deviation (σ) of this distribution?
(Assume grade distribution is normal)
Solution:
P (z > z1 ) = 1.5% = 0.015 ⇒ z1 = 2.17
P (z > z2 ) = 5% = 0.05 ⇒ z2 = 1.645
$$2.17 = \frac{534.16 - \mu}{\sigma}, \qquad 1.645 = \frac{509.44 - \mu}{\sigma}$$
⇒ µ = 431.98, σ = 47.09
Week 10– Normal Approximation to the Binomial
Theorem: If X is a binomial random variable with mean µ = np and variance σ² = npq, then the limiting form of the distribution of
$$Z = \frac{X - np}{\sqrt{npq}}$$
as n → ∞ is the standard normal distribution n(z; 0, 1).
Exercise 10-1: The probability that a patient recovers from a rare blood disease is 0.4.
If 100 people contract this disease, what is the probability that fewer than 30 survive?
Answer: x = 29.5, z = −2.14, P = 0.0162. Exact result: 0.0148
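The comparison between the normal approximation and the exact binomial result in Exercise 10-1 can be reproduced with a short script. This is a sketch only; `phi`, `binom_cdf` and `normal_approx_cdf` are assumed helper names. Because the code does not round z to two decimals as a table lookup does, its approximate value can differ slightly from 0.0162.

```python
from math import comb, erf, sqrt

def phi(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def binom_cdf(x, n, p):
    """Exact P(X <= x) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(x + 1))

def normal_approx_cdf(x, n, p):
    """Approximate P(X <= x) using Z = (X - np) / sqrt(npq) with a
    continuity correction of +0.5."""
    mu = n * p
    sigma = sqrt(n * p * (1 - p))
    return phi((x + 0.5 - mu) / sigma)

# Exercise 10-1: fewer than 30 survivors means X <= 29, with n = 100, p = 0.4.
approx = normal_approx_cdf(29, 100, 0.4)
exact = binom_cdf(29, 100, 0.4)

print(round(approx, 4), round(exact, 4))
```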
Exercise 10-2: In a multiple choice exam, a student answers 80 questions randomly.
There are 4 answers for each question. What is the probability that the student guesses
between 25-30 (inclusive) of the questions correctly?
Answer: 0.1196. Exact result: 0.1193
Exercise 10-3: A company produces component parts for an engine. Part specifications
suggest that 95% of items meet specifications. The parts are shipped to customers in lots
of 100.
a) What is the probability that more than 2 items in a lot will be defective?
b) What is the probability that more than 10 items in a lot will be defective?
Answer: 0.8749, 0.0059
Exercise 10-4: In a digital communication channel, the probability that a bit is received in error is $10^{-5}$. If 16 million bits are transmitted, what is the probability that more than 150 errors occur?
Answer: 0.7734
Exercise 10-5: Statistics show that on a Saturday night 1 out of every 10 drivers on the
road is drunk. 400 drivers are randomly checked. Let’s call the number of drunk drivers
n. What is the probability that
a) n < 32?
b) 49 < n?
c) 35 ≤ n ≤ 46?
Solution: Using the normal approximation to the binomial, we find
$$\mu = 400 \times 0.1 = 40, \qquad \sigma = \sqrt{400 \times 0.1 \times 0.9} = 6$$
a) x = 31.5, $z = \frac{31.5 - 40}{6} = -1.42$
P (z < −1.42) = 0.0778
b) x = 49.5, $z = \frac{49.5 - 40}{6} = 1.58$
P (1.58 < z) = 1 − 0.9429 = 0.0571
c) x = 34.5, z = −0.92, x = 46.5, z = 1.08
P (−0.92 < z < 1.08) = 0.8599 − 0.1788 = 0.6811
Exercise 10-6: In a shipment of 500 identical products, 30 are defective. We randomly
choose 20. Find the probability that 2 are defectives among the 20
a) Using the exact method
b) Using an approximation.
Exercise 10-7: A coin is tossed 400 times. We obtain n heads. Use the normal curve approximation to find the probability that 185 ≤ n ≤ 210.
Exercise 10-8: Suppose 15% of all cars in Ankara are white. We observe the Eskişehir Road and count passing cars. We observe n white cars in a total number of 400. What is the probability that 50 ≤ n ≤ 70?
Answer: P (−1.47 < z < 1.47) = 0.8584
Exercise 10-9: There are 3000 students in a university and 750 are freshmen. We
randomly choose 10 students. What is the probability that 8 of them are freshmen?
Solve in two different ways. (6 digits after point)
Solution: The hypergeometric distribution gives:
$$\frac{\binom{750}{8}\binom{2250}{2}}{\binom{3000}{10}} = 0.000377$$
The binomial approximation with $p = \frac{750}{3000} = 0.25$ gives
$$\binom{10}{8} 0.25^{8}\, 0.75^{2} = 0.000386$$
Exercise 10-10: We toss a single die 90 times. What is the probability we obtain 20 or
more sixes?
Solution: Using the normal approximation to the binomial:
$$\mu = 90 \cdot \frac{1}{6} = 15, \qquad \sigma = \sqrt{90 \cdot \frac{1}{6} \cdot \frac{5}{6}} = 3.54$$
$$z = \frac{19.5 - 15}{3.54} = 1.27$$
P (z > 1.27) = 1 − 0.8980 = 0.1020
Week 11– Gamma and Exponential Distributions
The gamma function is defined by:
$$\Gamma(\alpha) = \int_{0}^{\infty} x^{\alpha-1} e^{-x}\, dx, \qquad \alpha > 0$$
Using integration by parts, we can show that
Γ(α) = (α − 1)Γ(α − 1)
Γ(1) = 1
Therefore
Γ(n) = (n − 1)! For a positive integer n
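These two properties are easy to confirm numerically with the standard library's `math.gamma`. The snippet below is a quick illustrative check, with α = 4.5 chosen arbitrarily as a non-integer test value.

```python
from math import factorial, gamma, isclose

# Gamma(n) = (n - 1)! for positive integers n.
integer_checks = [isclose(gamma(n), factorial(n - 1)) for n in range(1, 10)]

# Recurrence Gamma(alpha) = (alpha - 1) * Gamma(alpha - 1) for a non-integer alpha.
alpha = 4.5
recurrence_ok = isclose(gamma(alpha), (alpha - 1) * gamma(alpha - 1))

print(all(integer_checks), recurrence_ok)
```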
Gamma Distribution: The continuous random variable X has a gamma distribution with parameters α and β if its density function is given by
$$f(x; \alpha, \beta) = \begin{cases} \dfrac{x^{\alpha-1} e^{-x/\beta}}{\beta^{\alpha}\, \Gamma(\alpha)} & x > 0 \\ 0 & \text{elsewhere} \end{cases}$$
where α > 0, β > 0
Theorem: The mean and variance of the gamma distribution are
$$\mu = \alpha\beta, \qquad \sigma^{2} = \alpha\beta^{2}$$
Exercise 11-1: In a biomedical research study, it was determined that the survival time, in weeks, of an animal subjected to radiation has a gamma distribution with α = 5 and β = 10. What is the probability that an animal survives more than 30 weeks?
Hint: $\displaystyle\int x^{4} e^{-x}\, dx = -e^{-x} \left( x^{4} + 4x^{3} + 12x^{2} + 24x + 24 \right)$
Answer: 0.8153
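When α is a positive integer, repeated integration by parts (as in the hint) gives the closed form P(X > x) = e^{−x/β} Σ_{j=0}^{α−1} (x/β)^j / j!. The sketch below evaluates that sum; the function name `gamma_sf_int` is an assumption, and the formula is only valid for integer α.

```python
from math import exp, factorial

def gamma_sf_int(x, alpha, beta):
    """P(X > x) for a gamma distribution whose shape alpha is a positive integer."""
    t = x / beta
    return exp(-t) * sum(t**j / factorial(j) for j in range(alpha))

# Exercise 11-1: alpha = 5, beta = 10, survival beyond 30 weeks.
survival = gamma_sf_int(30, 5, 10)
print(round(survival, 4))
```

With α = 1 the sum has a single term, and the function reduces to the exponential tail e^{−x/β}.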
Exponential Distribution: The continuous random variable X has an exponential distribution with parameter β if its density function is given by
$$f(x; \beta) = \begin{cases} \dfrac{e^{-x/\beta}}{\beta} & x > 0 \\ 0 & \text{elsewhere} \end{cases}$$
where β > 0
Theorem: The mean and variance of the exponential distribution are
$$\mu = \beta, \qquad \sigma^{2} = \beta^{2}$$
Exercise 11-2: A system contains a component with time to failure T . The random variable T is modeled by an exponential distribution with mean time to failure β = 5. If 5 of these components are installed, what is the probability that at least 2 are functioning at the end of 8 years?
Answer: 0.2667
Exercise 11-3: The length of time for one individual to be served at a cafeteria is a
random variable having an exponential distribution with a mean of 6 minutes. What is
the probability that a person is served in less than 4 minutes on at least 5 of the next 7
days?
Answer: 0.2052
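Exercises 11-2 and 11-3 both combine an exponential probability with a binomial count. A minimal sketch of that two-step calculation follows; `exp_cdf` and `binom_at_least` are assumed helper names. Since no intermediate value is rounded, the last digit may differ slightly from hand calculations.

```python
from math import comb, exp

def exp_cdf(x, beta):
    """P(X < x) for an exponential distribution with mean beta."""
    return 1 - exp(-x / beta)

def binom_at_least(r, n, p):
    """P(at least r successes) in n independent trials with success probability p."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(r, n + 1))

# Exercise 11-2: a component still works after 8 years with probability e^(-8/5).
p_alive = 1 - exp_cdf(8, 5)
at_least_2_of_5 = binom_at_least(2, 5, p_alive)

# Exercise 11-3: service takes less than 4 minutes with probability 1 - e^(-4/6).
p_fast = exp_cdf(4, 6)
at_least_5_of_7 = binom_at_least(5, 7, p_fast)

print(round(at_least_2_of_5, 4), round(at_least_5_of_7, 4))
```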
Exercise 11-4: The lifetime of an electronic component has µ = 40 and σ = 20√2.
Nilay thinks that the distribution is gamma, but Mehmet thinks it is normal. The only
other information they have about the population is that 1.7% of the components have a
lifetime larger than 120.
Who is right and why?
Solution: According to Nilay:
αβ = 40, αβ² = 800 ⇒ α = 2, β = 20
$$P(x > 120) = \frac{1}{20^{2}\, \Gamma(2)} \int_{120}^{\infty} x^{1} e^{-x/20}\, dx = 7e^{-6} = 0.017$$
According to Mehmet:
$$Z = \frac{120 - 40}{20\sqrt{2}} = 2.83$$
P (Z > 2.83) = 1 − 0.9977 = 0.0023
Since 1.7% = 0.017 agrees with the gamma model and not with the normal model, Nilay is right.
Exercise 11-5: The length of time you have to wait at the cafeteria is a random variable
having an exponential distribution with a mean of 120 seconds. If you wait more than
400 seconds, you call it an unlucky day. If you eat at the cafeteria 20 days a month, what
is the probability that you experience 2 or more unlucky days in a month?
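Solution: First we have to find the probability of an unlucky day:
$$\int_{400}^{\infty} \frac{1}{120}\, e^{-x/120}\, dx = \left. -e^{-x/120} \right|_{400}^{\infty} = e^{-400/120} = 0.0357$$
1 − 0.0357 = 0.9643
$$1 - \left[ \binom{20}{0} 0.0357^{0}\, 0.9643^{20} + \binom{20}{1} 0.0357^{1}\, 0.9643^{19} \right] = 0.1588$$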
Exercise 11-6: A random variable X is modeled by a gamma distribution with α = 2, β = 8. Find P (X < 7).
Solution:
$$P(X < 7) = \int_{0}^{7} \frac{x e^{-x/8}}{8^{2}\, \Gamma(2)}\, dx$$
Γ(2) = 1. Using integration by parts with u = x, $dv = e^{-x/8}\, dx$ we obtain:
$$P(X < 7) = \left. -e^{-x/8} \left( \frac{x}{8} + 1 \right) \right|_{0}^{7} = 1 - \frac{15}{8}\, e^{-7/8} = 0.2184$$
Exercise 11-7: A random variable X is modeled by a gamma distribution with α = 2, β = 6. Find P (X < 7).
Solution:
$$P(X < 7) = \int_{0}^{7} \frac{x e^{-x/6}}{6^{2}\, \Gamma(2)}\, dx$$
Γ(2) = 1. Using integration by parts with u = x, $dv = e^{-x/6}\, dx$ we obtain:
$$P(X < 7) = \left. -e^{-x/6} \left( \frac{x}{6} + 1 \right) \right|_{0}^{7} = 1 - \frac{13}{6}\, e^{-7/6} = 0.3253$$
Week 12– Sampling Distributions
A population consists of the totality of the observations with which we are concerned.
A sample is a subset of the population.
Any sampling procedure that produces inferences that consistently overestimate or
underestimate some characteristic of the population is said to be biased.
Any function of the random variables constituting a random sample is called a statistic. The probability distribution of a statistic is called a sampling distribution.
Theorem: (Central Limit Theorem) If $\bar{X}$ is the mean of a random sample of size n taken from a population with mean µ and finite variance σ², then the limiting form of the distribution of
$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$$
as n → ∞ is the standard normal distribution n(z; 0, 1).
In other words, for large n the sampling distribution of $\bar{X}$ will be approximately normal even if the population distribution is not.
Exercise 12-1: An electrical firm manufactures light bulbs that have a lifetime of mean
800 hours and standard deviation of 40 hours. Find the probability that a random sample
of 16 bulbs will have an average lifetime less than 775 hours.
Answer: 0.0062
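Under the Central Limit Theorem, the sample mean is treated as normal with mean µ and standard deviation σ/√n, which makes Exercise 12-1 a one-line computation. The sketch below assumes Python 3.8+ for `statistics.NormalDist`; the function name `sample_mean_cdf` is illustrative.

```python
from math import sqrt
from statistics import NormalDist

def sample_mean_cdf(xbar, mu, sigma, n):
    """Approximate P(sample mean < xbar) using a normal distribution with
    mean mu and standard deviation sigma / sqrt(n)."""
    return NormalDist(mu, sigma / sqrt(n)).cdf(xbar)

# Exercise 12-1: mu = 800, sigma = 40, n = 16, threshold 775 hours (z = -2.5).
p_low_average = sample_mean_cdf(775, 800, 40, 16)
print(round(p_low_average, 4))
```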
Exercise 12-2: An auto part must have a diameter of 5 mm. We know that population
σ = 0.1 mm. We choose 100 parts randomly. The sample average is $\bar{x}$ = 5.027 mm. Can
we say the population mean is 5 mm?
Answer: 0.007
Exercise 12-3: The bus trip to a campus takes on average, 28 minutes with a standard
deviation of 5 minutes. In a week, the bus makes 40 trips. What is the probability that
weekly average is above 30 minutes? Assume the mean is measured to nearest minute.
Answer: P (z > 3.16) = 0.0008
Theorem: If independent samples of size n1 and n2 are drawn at random from two populations with means µ1 and µ2 and variances σ1² and σ2² respectively, then the sampling distribution of the difference of means $\bar{X}_1 - \bar{X}_2$ is approximately normally distributed with mean and variance given by $\mu_1 - \mu_2$ and $\dfrac{\sigma_1^{2}}{n_1} + \dfrac{\sigma_2^{2}}{n_2}$. In other words
$$Z = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^{2}}{n_1} + \dfrac{\sigma_2^{2}}{n_2}}}$$
is approximately a standard normal variable.
Exercise 12-4: We test the strength of steel cables manufactured by companies A and
B. The standard deviations of both are 5 and we test 30 cables from each. The results
are:
$$\bar{x}_A = 49.5, \qquad \bar{x}_B = 45.5, \qquad \bar{x}_A - \bar{x}_B = 4$$
Company B claims the population means are the same. What is the probability of seeing
this result if they are really the same?
Solution:
$$Z = \frac{4 - 0}{\sqrt{\dfrac{25}{30} + \dfrac{25}{30}}} = 3.10$$
P (Z > 3.10) = 1 − 0.999 = 0.001
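The two-sample statistic used in Exercise 12-4 can be wrapped in a small helper. This is an illustrative sketch (the name `two_sample_z` and its argument order are assumptions); it uses unrounded values, so the tail probability comes out marginally below the table-based 0.001.

```python
from math import sqrt
from statistics import NormalDist

def two_sample_z(diff, delta0, sigma1, n1, sigma2, n2):
    """Z for an observed difference of sample means `diff`, when the
    hypothesized difference of population means is `delta0`."""
    return (diff - delta0) / sqrt(sigma1**2 / n1 + sigma2**2 / n2)

# Exercise 12-4: observed difference 4, both standard deviations 5, n = 30 each.
z = two_sample_z(4, 0, 5, 30, 5, 30)
tail = 1 - NormalDist().cdf(z)

print(round(z, 2), round(tail, 4))
```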
Exercise 12-5: The televisions of manufacturer A have a mean lifetime of 6.5 years and
a standard deviation of 0.9 years. Those of manufacturer B have a mean lifetime of 6.0
years and a standard deviation of 0.8 year. We take a random sample of 36 from A and
49 from B. What is the probability that sample from A will have a mean lifetime at least
1 year more than sample of B?
Answer: P (z > 2.65) = 0.0040
Exercise 12-6: A certain machine makes electrical resistors having a mean resistance
of 50 ohms and a standard deviation of 3 ohms. We choose a random sample of size n.
What is the probability that the average resistance of the sample is less than 49.7 ohms,
if the sample size is
a) n = 10?
b) n = 50?
c) n = 250?
Answer: P (z < −0.32) = 0.3745, P (z < −0.71) = 0.2389, P (z < −1.58) = 0.0571
Exercise 12-7: We randomly choose 35 students from school A and 45 students from school B. (There are thousands of students in each school.) We give them a mathematics test and find that the sample averages are 55 and 60. The standard deviations are 18 and 17 respectively.
What is the probability of seeing this result if the schools have the same average?
Solution:
$$Z = \frac{(60 - 55) - 0}{\sqrt{\dfrac{18^{2}}{35} + \dfrac{17^{2}}{45}}} = 1.26$$
P (Z > 1.26) = 1 − 0.8962 = 0.1038
Exercise 12-8: Average lifetime of an electronic component is 87.0 months, with a
standard deviation of 9.0 months. Assume normal distribution.
a) What is the probability that a single component will have a lifetime between 86.5 and
87.5 months?
b) What is the same probability for a sample average if sample size is 100?
Solution:
a) $z_1 = \dfrac{86.5 - 87}{9} = -0.06$, $z_2 = \dfrac{87.5 - 87}{9} = 0.06$
P (−0.06 < z < 0.06) = 0.5239 − 0.4761 = 0.0478
b) $z_3 = \dfrac{86.5 - 87}{9/\sqrt{100}} = -0.56$, $z_4 = \dfrac{87.5 - 87}{9/\sqrt{100}} = 0.56$
P (−0.56 < z < 0.56) = 0.7123 − 0.2877 = 0.4246
Week 13– Confidence Intervals
Suppose we know the variance σ² of a population and we are trying to find the mean. The sample mean is distributed normally around the population mean, so
$$P(-z_{\alpha/2} < Z < z_{\alpha/2}) = 1 - \alpha$$
where
$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$$
and $z_{\alpha/2}$ denotes the z-value such that the area to its right is α/2.
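[Figure: standard normal curve with central area 1 − α between $-z_{\alpha/2}$ and $z_{\alpha/2}$, and area α/2 in each tail]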
We can rewrite this as:
$$P\left( \bar{X} - z_{\alpha/2} \frac{\sigma}{\sqrt{n}} < \mu < \bar{X} + z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \right) = 1 - \alpha$$
This is called the 100(1 − α)% confidence interval for µ.
Exercise 13-1: Average zinc concentration from a sample of 36 measurements is 2.6
grams per milliliter. Find the 95% and 99% confidence intervals for mean zinc concentration in the river. Assume σ = 0.3.
Answer: [2.50, 2.70], [2.47, 2.73] .
Exercise 13-2: A population has σ = 40 and we are trying to determine the mean. How
large a sample do we need if we want to be 95% sure that we are making an error of 15
or less?
Solution: P (Z < k) = 0.975 ⇒ k = 1.96
$$1.96 \times \frac{40}{\sqrt{n}} \leq 15 \quad \Rightarrow \quad n \geq 28$$
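Both calculations in this section, the confidence interval and the required sample size, follow from the same critical value $z_{\alpha/2}$. The sketch below is an illustration with assumed helper names (`z_critical`, `confidence_interval`, `sample_size`); it takes the critical value from `statistics.NormalDist().inv_cdf` (Python 3.8+) instead of a table, so results may differ from table-based answers in the last digit.

```python
from math import ceil, sqrt
from statistics import NormalDist

def z_critical(confidence):
    """z_{alpha/2}: the z value with area (1 - confidence) / 2 to its right."""
    return NormalDist().inv_cdf(0.5 + confidence / 2)

def confidence_interval(xbar, sigma, n, confidence):
    """Interval for the mean when the population sigma is known."""
    half_width = z_critical(confidence) * sigma / sqrt(n)
    return xbar - half_width, xbar + half_width

def sample_size(sigma, error, confidence):
    """Smallest n for which z_{alpha/2} * sigma / sqrt(n) <= error."""
    return ceil((z_critical(confidence) * sigma / error) ** 2)

ci_95 = confidence_interval(2.6, 0.3, 36, 0.95)   # Exercise 13-1
ci_99 = confidence_interval(2.6, 0.3, 36, 0.99)
n_needed = sample_size(40, 15, 0.95)              # Exercise 13-2

print(ci_95, ci_99, n_needed)
```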
Exercise 13-3: A random sample of 130 units has an average of 36 and a standard deviation of 0.7.
a) Find a 90% confidence interval.
b) How large a sample do we need if we want to be 90% sure that sample mean is within
0.05 of the true mean?
Answer: [35.90, 36.10], 531
Exercise 13-4: 200 high school students in a city are randomly chosen and given a
mathematics test. The mean and standard deviation of the sample are 46 and 14.
a) Find a 99% confidence interval for the mean.
b) Find the necessary sample size if we want the 99% confidence interval to have size 1.
Exercise 13-5: A sample of apple juice is tested for arsenic content. The standard
deviation is 1.8 ppb (part per billion), the sample size is 94 and the sample average is 9
ppb. The distribution is normal.
a) Find an 80% confidence interval for the population average.
b) Find a sample size such that the 80% confidence interval will be half the size of what you found in part a).
c) Find a sample size such that the 99.5% confidence interval will be the same size as what you found in part a).
Solution: We will use Area = 0.8/2 + 0.5 = 0.9 ⇒ z = 1.28.
a) $\dfrac{\sigma}{\sqrt{n}}\, z = \dfrac{1.8}{\sqrt{94}} \times 1.28 = 0.2376$, therefore the confidence interval is:
[9 − 0.2376, 9 + 0.2376] = [8.7624, 9.2376]
b) We have to find n such that
$$\frac{1.8}{\sqrt{n}} \times 1.28 = 0.2376/2 = 0.1188$$
n = 376
Or, we can simply multiply 94 by 4: 94 × 4 = 376
c) Area = 0.995/2 + 0.5 = 0.9975 ⇒ z = 2.81.
$$\frac{1.8}{\sqrt{n}} \times 2.81 = 0.2376$$
n = 453
Week 14– Prediction Intervals
Suppose we know the variance σ² of a population but do not know the mean µ. We have a random sample of size n with average $\bar{x}$. We want to predict the value of a single future observation x0.
If we define a new variable $\bar{x} - x_0$, its variance will be
$$\frac{\sigma^{2}}{n} + \sigma^{2}$$
therefore the standard deviation is:
$$\sigma \sqrt{1 + \frac{1}{n}}$$
So the 100(1 − α)% prediction interval is:
$$\bar{X} - z_{\alpha/2}\, \sigma \sqrt{1 + \frac{1}{n}} < x_0 < \bar{X} + z_{\alpha/2}\, \sigma \sqrt{1 + \frac{1}{n}}$$
Exercise 14-1: Let sample size be 100, sample average 290, population standard deviation 32.
a) Construct a 99% confidence interval.
b) Construct a 99% prediction interval.
Answer: [282, 298], [207, 373]
Exercise 14-2: The average weight gain for a sample of 40 mice is 5.6 grams. The
population standard deviation is 1.3.
a) Construct a 90% confidence interval.
b) Construct a 90% prediction interval.
Answer: [5.26, 5.94], [3.43, 7.77]
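The only difference between the confidence interval and the prediction interval is the factor multiplying σ: 1/√n versus √(1 + 1/n). The following sketch makes that explicit; the function names are assumptions, and the critical value comes from `statistics.NormalDist().inv_cdf` (Python 3.8+) instead of a table.

```python
from math import sqrt
from statistics import NormalDist

def z_critical(confidence):
    """z_{alpha/2} for the given confidence level."""
    return NormalDist().inv_cdf(0.5 + confidence / 2)

def confidence_interval(xbar, sigma, n, confidence):
    """Interval for the population mean (sigma known)."""
    half = z_critical(confidence) * sigma / sqrt(n)
    return xbar - half, xbar + half

def prediction_interval(xbar, sigma, n, confidence):
    """Interval for a single future observation x0 (sigma known)."""
    half = z_critical(confidence) * sigma * sqrt(1 + 1 / n)
    return xbar - half, xbar + half

# Exercise 14-1: n = 100, sample average 290, sigma = 32, 99% level.
ci = confidence_interval(290, 32, 100, 0.99)
pi_ = prediction_interval(290, 32, 100, 0.99)

print([round(v) for v in ci], [round(v) for v in pi_])
```

The prediction interval is much wider because it has to cover the variability of one new observation in addition to the uncertainty in the sample mean.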