Lecture #12 - Francis J. DiTraglia

Economics 103 – Statistics for Economists
Francis J. DiTraglia
University of Pennsylvania
Lecture # 12
Continuous RVs II: The Normal RV
Figure : Standard Normal RV (PDF)
2/ 37
0.6
F(x)
0.0
0.0
0.2
0.4
0.2
0.1
f(x)
0.3
0.8
1.0
0.4
Standard Normal Random Variable: N(0, 1)
−4
−2
0
2
4
x
−4
−2
0
2
4
x
Figure : Standard Normal PDF (left) and CDF (Right)
I
Notation: X ∼ N(0, 1)
I
Symmetric, Bell-shaped, E [X ] = 0, Var [X ] = 1
I
Support Set = (−∞, ∞)
3/ 37
http://glimmer.rstudio.com/fditraglia/normal cdf/
4/ 37
0.6
F(x)
0.0
0.0
0.2
0.4
0.2
0.1
f(x)
0.3
0.8
1.0
0.4
Standard Normal Random Variable: N(0, 1)
−4
−2
0
2
4
−4
−2
x
0
2
4
x
I
There is no closed-form expression for the N(0, 1) CDF.
I
For Econ 103, don’t need to know formula for N(0, 1) PDF.
I
You do need to know the R commands. . .
5/ 37
R Commands for the Standard Normal RV
dnorm – Standard Normal PDF
I
Mnemonic: d = density, norm = normal
I
Example: dnorm(0) gives height of N(0, 1) PDF at zero.
pnorm – Standard Normal CDF
I
Mnemonic: p = probability, norm = normal
I
Example: pnorm(1) = P(X ≤ 1) if X ∼ N(0, 1).
rnorm – Simulate Standard Normal Draws
I
Mnemonic: r = random, norm = normal.
I
Example: rnorm(10) makes ten iid N(0, 1) draws.
6/ 37
Φ(x0 ) Denotes the N(0, 1) CDF
You will sometimes encounter the notation Φ(x0 ). It means the
same thing as pnorm(x0 ) but it’s not an R command.
7/ 37
The N(µ, σ 2 ) Random Variable
Idea
Take a linear function of the N(0, 1) RV.
Formal Definition
N(µ, σ 2 ) ≡ µ + σX where X ∼ N(0, 1) and µ, σ are constants.
Properties of N(µ, σ 2 ) RV
I
Parameters: Expected Value = µ, Variance = σ 2
I
Symmetric and bell-shaped.
I
Support Set = (−∞, ∞)
I
N(0, 1) is the special case where µ = 0 and σ 2 = 1.
8/ 37
Expected Value: µ shifts PDF
0.4
all of these have σ = 1
0.2
0.1
0.0
f(x)
0.3
µ = −1
µ=0
µ=1
−4
−2
0
2
4
x
Figure : Blue µ = −1, Black µ = 0, Red µ = 1
9/ 37
Standard Deviation: σ scales PDF
0.8
all of these have µ = 0
0.4
0.2
0.0
f(x)
0.6
σ=2
σ=1
σ = 1/2
−4
−2
0
2
4
x
Figure : Blue σ 2 = 4, Black σ 2 = 1, Red σ 2 = 1/4
10/ 37
Linear Function of Normal RV is a Normal RV
Suppose that X ∼ N(µ, σ 2 ). Then if a and b constants,
a + bX ∼ N(a + bµ, b 2 σ 2 )
Important
I
For any RV X , E [a + bX ] = a + bX and Var (a + bX ) = b 2 X .
I
Key point: linear transformation of normal is still normal!
I
Linear transformation of Binomial is not Binomial!
11/ 37
Example
Suppose X ∼ N(µ, σ 2 ) and let Z = (X − µ)/σ. What is the
distribution of Z ?
(a) N(µ, σ 2 )
(b) N(µ, σ)
(c) N(0, σ 2 )
(d) N(0, σ)
(e) N(0, 1)
12/ 37
Linear Combinations of Multiple Independent Normals
Let X ∼ N(µx , σx2 ) independent of Y ∼ N(µy , σy2 ). Then if a, b, c
are constants:
aX + bY + c ∼ N(aµx + bµy + c, a2 σx2 + b 2 σy2 )
Important
I
Result assumes independence
I
Particular to Normal RV
I
Extends to more than two Normal RVs
13/ 37
Suppose X1 , X2 , ∼ iid N(µ, σ 2 )
¯ = (X1 + X2 )/2. What is the distribution of X
¯?
Let X
(a) N(µ, σ 2 /2)
(b) N(0, 1)
(c) N(µ, σ 2 )
(d) N(µ, 2σ 2 )
(e) N(2µ, 2σ 2 )
14/ 37
Where does the Empirical Rule come from?
Empirical Rule
Approximately 68% of observations within µ ± σ
Approximately 95% of observations within µ ± 2σ
Nearly all observations within µ ± 3σ
15/ 37
0.2
0.1
0.0
f(x)
0.3
0.4
pnorm(1)≈ 0.84
−4
−2
0
2
4
x
16/ 37
0.2
0.1
0.0
f(x)
0.3
0.4
pnorm(1) - pnorm(-1)≈ 0.84 − 0.16
−4
−2
0
2
4
x
17/ 37
0.2
0.1
0.0
f(x)
0.3
0.4
pnorm(1) - pnorm(-1)≈ 0.68
−4
−2
0
2
4
x
18/ 37
0.2
0.1
0.0
f(x)
0.3
0.4
Middle 68% of N(0, 1) ⇒ approx. (−1, 1)
−4
−2
0
2
4
x
19/ 37
Suppose X ∼ N(0, 1)
P(−1 ≤ X ≤ 1) = pnorm(1) - pnorm(-1)
≈ 0.683
P(−2 ≤ X ≤ 2) = pnorm(2) - pnorm(-2)
≈ 0.954
P(−3 ≤ X ≤ 3) = pnorm(3) - pnorm(-3)
≈ 0.997
20/ 37
What if X ∼ N(µ, σ 2 )?
P(X ≤ a) = P(X − µ ≤ a − µ)
= P
a−µ
X −µ
≤
σ
σ
a−µ
= P Z≤
σ
Where Z is a standard normal random variable, i.e. N(0, 1).
21/ 37
Which of these equals P (Z ≤ (a − µ)/σ) if Z ∼ N(0, 1)?
(a) pnorm(a)
(b) 1 − pnorm(a)
(c) pnorm(a)/σ − µ
(d) pnorm a−µ
σ
(e) None of the above.
22/ 37
Probability Above a Threshold: X ∼ N(µ, σ 2 )
P(X ≥ b) = 1 − P(X ≤ b) = 1 − P
X −µ
b−µ
≤
σ
σ
b−µ
= 1−P Z ≤
σ
= 1 − pnorm((b − µ)/σ)
Where Z is a standard normal random variable.
23/ 37
Probability of an Interval: X ∼ N(µ, σ 2 )
a−µ
X −µ
b−µ
≤
≤
σ
σ
σ
a−µ
b−µ
≤Z ≤
σ
σ
P(a ≤ X ≤ b) = P
= P
=
pnorm((b − µ)/σ) − pnorm((a − µ)/σ)
Where Z is a standard normal random variable.
24/ 37
Suppose X ∼ N(µ, σ 2 )
What is P(µ − σ ≤ X ≤ µ + σ)?
X −µ
P(µ − σ ≤ X ≤ µ + σ) = P −1 ≤
≤1
σ
= P (−1 ≤ Z ≤ 1)
= pnorm(1) − pnorm(-1)
≈ 0.68
25/ 37
Percentiles/Quantiles for Continuous RVs
Quantile Function Q(p) is the inverse of CDF F (x0 )
Plug in a probability p, get out the value of x0 such that F (x0 ) = p
Q(p) = F −1 (p)
In other words:
Z
x0
Q(p) = the value of x0 such that
f (x) dx = p
−∞
Inverse exists as long as F (x0 ) is strictly increasing.
26/ 37
Example: Median
The median of a continuous random variable is Q(0.5), i.e. the
value of x0 such that
Z
x0
f (x) dx = 1/2
−∞
27/ 37
What is the median of a standard normal RV?
0.2
0.1
0.0
f(x)
0.3
0.4
By symmetry, Q(0.5) = 0. R command: qnorm()
−4
−2
0
x
2
4
28/ 37
90th Percentile of a Standard Normal
0.2
0.1
0.0
f(x)
0.3
0.4
qnorm(0.9)≈ 1.28
−4
−2
0
x
2
4
29/ 37
Using Quantile Function to find Symmetric Intervals
Suppose X is a standard normal RV. What is the value of c such
0.2
0.1
0.0
f(x)
0.3
0.4
that P(−c ≤ X ≤ c) = 0.5?
−4
−2
0
x
2
4
30/ 37
qnorm(0.75)≈ 0.67
Suppose X is a standard normal RV. What is the value of c such
0.2
0.1
0.0
f(x)
0.3
0.4
that P(−c ≤ X ≤ c) = 0.5?
−4
−2
0
x
2
4
31/ 37
qnorm(0.75)≈ 0.67
Suppose X is a standard normal RV. What is the value of c such
0.2
0.1
0.0
f(x)
0.3
0.4
that P(−c ≤ X ≤ c) = 0.5?
−4
−2
0
x
2
4
32/ 37
pnorm(0.67)-pnorm(-0.67)≈?
Suppose X is a standard normal RV. What is the value of c such
0.2
0.1
0.0
f(x)
0.3
0.4
that P(−c ≤ X ≤ c) = 0.5?
−4
−2
0
x
2
4
33/ 37
pnorm(0.67)-pnorm(-0.67)≈ 0.5
Suppose X is a standard normal RV. What is the value of c such
0.2
0.1
0.0
f(x)
0.3
0.4
that P(−c ≤ X ≤ c) = 0.5?
−4
−2
0
x
2
4
34/ 37
95% Central Interval for Standard Normal
Suppose X is a standard normal random variable. What value of c
ensures that P(−c ≤ X ≤ c) ≈ 0.95?
35/ 37
R Commands for Arbitrary Normal RVs
Let X ∼ N(µ, σ 2 ) . Then we can use R to evaluate the CDF and
Quantile function of X as follows:
CDF F (x)
pnorm(x, mean = µ, sd = σ)
Quantile Function Q(p)
qnorm(p, mean = µ, sd = σ)
Notice that this means you don’t have to transform X to a
standard normal in order to find areas under its pdf using R.
36/ 37
Example from Homework: X ∼ N(0, 16)
One Way:
P(X ≥ 10) = 1 − P(X ≤ 10) = 1 − P(X /4 ≤ 10/4)
= 1 − P(Z ≤ 2.5) = 1 − Φ(2.5) = 1 − pnorm(2.5)
≈ 0.006
An Easier Way:
P(X ≥ 10) = 1 − P(X ≤ 10)
= 1 − pnorm(10, mean = 0, sd = 4)
≈ 0.006
37/ 37