Part II
Review of random variables

Random variable maps sample space to IR
A random variable (rv) is a function that maps the sample space (of an experiment, or observation) to the real numbers:

X : S → IR
e.g., toss two dice,

S = {(1, 1), (1, 2), . . . , (6, 5), (6, 6)}

- e.g., X = total number of spots
- e.g., X = no. of spots on the higher of the 2 dice
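As a small illustration (a Python sketch added to these notes, not part of the original slides; the variable names are mine), both maps can be written down explicitly by enumerating the sample space:

```python
from itertools import product

# Sample space for two dice: all ordered pairs (d1, d2)
S = list(product(range(1, 7), repeat=2))

# Two different random variables on the same sample space:
X_total = {s: s[0] + s[1] for s in S}  # total number of spots
X_max = {s: max(s) for s in S}         # spots on the higher die

print(X_total[(6, 5)], X_max[(6, 5)])  # 11 6
```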
Cumulative distribution function

Write F_X for the (cumulative) distribution function (cdf) of rv X.

Definition: For any x ∈ IR,

F_X(x) = pr(X ≤ x).
Continuous and discrete random variables
Random variable X is continuous if F_X(x) is a continuous function of x.

Random variable Y is discrete if F_Y(y) is a step function (i.e., is piecewise constant).
(draw example graphs)
A random variable can also be of mixed type (i.e., with both discrete and continuous parts to its distribution).

For example,

X = amount of rainfall on a given day

has F_X(x) = 0 for x < 0, a discontinuous step to F_X(0) > 0, and then increases continuously through positive values of x to F_X(∞) = 1.
(draw graph)
Identically distributed random variables
If X and Y are such that

F_X(x) = F_Y(x)   ∀x ∈ IR

then X and Y are said to be identically distributed.
Example
In 10 tosses of a fair coin, let X be the number of heads, and
Y be the number of tails. Then, by symmetry, X and Y have
the same distribution. (But clearly X and Y are different
functions!)
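To make the point concrete, here is a small check (a Python sketch added here, using scipy): X and Y = 10 − X have exactly the same Binomial(10, 1/2) pmf even though they are different functions of the outcome.

```python
from scipy.stats import binom

# X = number of heads, Y = 10 - X = number of tails in 10 fair tosses.
m, p = 10, 0.5
for k in range(m + 1):
    assert abs(binom.pmf(k, m, p) - binom.pmf(m - k, m, p)) < 1e-12

print("identical pmfs, yet X + Y = 10 on every outcome")
```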
Discrete rv: probability mass function
The probability mass function (pmf) of a discrete rv X is

f_X(x) = pr(X = x)   ∀x ∈ IR.

Example

In two tosses of a fair coin, suppose X is the number of heads. Then

f_X(x) = 1/4   (x = 0)
         1/2   (x = 1)
         1/4   (x = 2)
         0     (otherwise)
Relation between pmf and cdf:

F_X(x) = Σ_{t ≤ x} f_X(t).
Example (cont.)

If X is number of heads in 2 tosses of a fair coin,

F_X(x) = 0     (x < 0)
         1/4   (0 ≤ x < 1)
         3/4   (1 ≤ x < 2)
         1     (x ≥ 2)
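The step-function cdf is just the running sum of the pmf. A minimal numpy sketch of this relationship (added here for illustration):

```python
import numpy as np

# pmf of X = number of heads in 2 fair coin tosses
support = np.array([0, 1, 2])
pmf = np.array([0.25, 0.50, 0.25])

cdf = np.cumsum(pmf)  # value of F_X at each support point
print(dict(zip(support.tolist(), cdf.tolist())))  # {0: 0.25, 1: 0.75, 2: 1.0}
```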
Continuous rv: probability density function
For a continuous rv,

pr(X = x) = 0   ∀x ∈ IR.

But we can usefully define the probability density function (pdf) analogously to the pmf, replacing summation by integration. For a continuous rv, the pdf is the function f_X(x) such that

F_X(x) = ∫_{−∞}^{x} f_X(t) dt   ∀x ∈ IR.
(sketch graph of this relationship)
From the Fundamental Theorem of Calculus, then, if f_X(x) is continuous,

f_X(x) = (d/dx) F_X(x).
Example
Suppose that X is uniformly distributed on the interval (a, b). The density (pdf) is

f_X(x) = 1/(b − a)   (a < x < b)
         0           (otherwise)

The corresponding cdf is

F_X(x) = 0                 (x ≤ a)
         (x − a)/(b − a)   (a < x ≤ b)
         1                 (x > b)
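These formulas can be checked against a library implementation (a sketch using scipy, where uniform(a, b) is parameterized by loc = a and scale = b − a; the values of a and b are arbitrary):

```python
from scipy.stats import uniform

a, b = 2.0, 5.0
U = uniform(loc=a, scale=b - a)  # scipy's form of uniform(a, b)

print(U.pdf(3.0))  # 1/(b - a) = 1/3
print(U.cdf(3.0))  # (3 - a)/(b - a) = 1/3
print(U.cdf(6.0))  # 1.0, since 6 > b
```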
A notational shorthand: “∼”
The cdf F_X fully describes the distribution of X, as does the pdf or pmf f_X. It is common to write either

X ∼ F_X(x)   or   X ∼ f_X(x)

or to use a distribution’s name, e.g.,

X ∼ uniform(a, b).

The symbol ‘∼’ means ‘is distributed as’.

Similarly, if X and Y are identically distributed, we may write X ∼ Y.
Characterization of pmf and pdf
An arbitrary function f(x) is a pmf (or pdf) if and only if

1. f(x) ≥ 0   ∀x
2. Σ_x f(x) = 1 (pmf)   or   ∫_{−∞}^{∞} f(x) dx = 1 (pdf)

(For more mathematical precision than this, go to ST213 Mathematics of Random Events.)
A technical note: absolute continuity
There do exist continuous cdf’s F(x) which are such that no density function f exists satisfying

F(x) = ∫_{−∞}^{x} f(t) dt   ∀x.
Such cases are pathological, though, and will not concern us further in this course.

A cdf F(x) for which f(x) does exist is said to be absolutely continuous. All of the important continuous distributions used in statistics are of this kind.
Mixed discrete and continuous
If X has a mixture of discrete and continuous distribution,

F_X(x) = p F_X^(d)(x) + (1 − p) F_X^(c)(x),

with p ∈ (0, 1), and we define

f_X(x) = p f_X^(d)(x) + (1 − p) f_X^(c)(x)

where f_X^(d) is a pmf and f_X^(c) a pdf.

For example, pr(no rain) = p = 0.62, say.
Transformation: functions of a rv
New rv from old:

Y = g(X).

How do F_Y, f_Y relate to F_X, f_X?
cdf
F_Y(y) = pr(Y ≤ y) = pr(g(X) ≤ y).

Suppose that g is (strictly) monotonic (and hence can be inverted). Then either g is increasing, so that

F_Y(y) = pr(X ≤ g^{−1}(y)) = F_X(g^{−1}(y)),

or is decreasing, in which case

F_Y(y) = pr(X ≥ g^{−1}(y)) = pr(−X ≤ −g^{−1}(y)) = F_{−X}(−g^{−1}(y)).
Discrete rv: pmf
f_Y(y) = pr(Y = y) = Σ_{x: g(x)=y} f_X(x).

Example

Discrete uniform distribution on {−1, 0, 1}:

f_X(x) = 1/3   (x ∈ {−1, 0, 1})
         0     (otherwise)

Consider Y = X²:

f_Y(y) = 1/3   (y = 0)
         2/3   (y = 1)
         0     (otherwise)
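The preimage sum translates directly into a few lines of Python (a sketch added for illustration; the dictionary names are mine):

```python
from collections import defaultdict

# pmf of X: discrete uniform on {-1, 0, 1}
pmf_X = {-1: 1/3, 0: 1/3, 1: 1/3}

# f_Y(y) = sum of f_X(x) over {x : g(x) = y}, here with g(x) = x**2
pmf_Y = defaultdict(float)
for x, p in pmf_X.items():
    pmf_Y[x**2] += p

print(dict(pmf_Y))  # {1: 0.666..., 0: 0.333...}
```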
Continuous rv: pdf
We have seen how to get F_Y from F_X, in general. Provided that the derivative exists, then,

f_Y(y) = (d/dy) F_Y(y).

This is the most general approach, which works for any transformation Y = g(X).

In the special but common case that g is both invertible and continuously differentiable, a quick route from f_X to f_Y is

f_Y(y) = f_X(g^{−1}(y)) |(d/dy) g^{−1}(y)|.

(Exercise: prove this.)
Example
X ∼ uniform(−1, 1)

f_X(x) = 1/2   (x ∈ (−1, 1))
         0     (otherwise)

Let Y = X². By symmetry, |X| ∼ uniform(0, 1); and Y = |X|² is a smooth, invertible function of |X|. Hence

f_Y(y) = f_{|X|}(√y) |(d/dy) √y| = 1 × 1/(2√y)   (0 < y < 1).
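A quick Monte Carlo sanity check of the implied cdf F_Y(y) = √y (a numpy sketch added to these notes; the seed and sample size are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=200_000)
y = x**2

# F_Y(y) = sqrt(y) on (0, 1); compare with the empirical cdf
for q in (0.1, 0.25, 0.5, 0.9):
    print(q, np.mean(y <= q), np.sqrt(q))
```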
A special transformation: the probability integral transform (PIT)
Suppose X is continuous, and let Y = F_X(X). Then Y ∼ uniform(0, 1):

F_Y(y) = pr(Y ≤ y)
       = pr(F_X(X) ≤ y)
       = pr(X ≤ F_X^{−1}(y))
       = F_X(F_X^{−1}(y)) = y   (0 < y < 1).

Note: the above proof applies directly when F_X is strictly increasing. Otherwise some care is needed in the definition of F_X^{−1}: see C&B Theorem 2.1.10.
The PIT is useful for various statistical purposes, both theoretical and practical.

A particular application of note is the simulation of an arbitrary random variable X on a computer. For example, to generate (i.e., simulate) values of X from a strictly increasing cdf F_X, we can first generate

U ∼ uniform(0, 1)

and then transform via the inverse PIT to

X = F_X^{−1}(U).

This is not necessarily the most computationally efficient method, but it has the advantage of being very general (and for that reason is very much used in practice).
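A minimal sketch of this recipe in Python, for the exponential distribution with mean λ (so F_X(x) = 1 − e^{−x/λ} and F_X^{−1}(u) = −λ log(1 − u); the names and seed are my own):

```python
import numpy as np

rng = np.random.default_rng(1)
lam = 2.0

u = rng.uniform(size=100_000)   # U ~ uniform(0, 1)
x = -lam * np.log1p(-u)         # X = F_X^{-1}(U) = -lam * log(1 - u)

print(x.mean())  # should be close to E(X) = lam = 2.0
```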
Expectation
Definition
The expected value or mean of any function g(X) is

E[g(X)] = Σ_x g(x) f_X(x)             (discrete)
          ∫_{−∞}^{∞} g(x) f_X(x) dx   (continuous)
This is just the average value of g(X) in repeated sampling.
Example: exponential distribution

f_X(x) = (1/λ) e^{−x/λ}   (x > 0)
         0                (otherwise)

E(X) = ∫_0^∞ (x/λ) e^{−x/λ} dx
     = [−x e^{−x/λ}]_0^∞ + ∫_0^∞ e^{−x/λ} dx
     = λ ∫_0^∞ (1/λ) e^{−x/λ} dx
     = λ.
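A quick numerical confirmation of E(X) = λ (a scipy sketch added here; λ = 3 is arbitrary):

```python
import numpy as np
from scipy.integrate import quad

lam = 3.0
mean, _ = quad(lambda x: x * np.exp(-x / lam) / lam, 0, np.inf)
print(mean)  # ≈ 3.0 = lam
```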
Example: binomial distribution

f_X(x) = C(m, x) p^x (1 − p)^{m−x}   (x = 0, 1, . . . , m)

E(X) = Σ_{x=0}^{m} x C(m, x) p^x (1 − p)^{m−x}
     = Σ_{x=1}^{m} x C(m, x) p^x (1 − p)^{m−x}.

Now note that x C(m, x) = m C(m−1, x−1), so

E(X) = m Σ_{x=1}^{m} C(m−1, x−1) p^x (1 − p)^{m−x}
     = mp Σ_{y=0}^{m−1} C(m−1, y) p^y (1 − p)^{(m−1)−y} = mp.
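The result E(X) = mp can be checked numerically in a couple of lines (a sketch with arbitrary m and p):

```python
from math import comb

m, p = 10, 0.3
EX = sum(x * comb(m, x) * p**x * (1 - p)**(m - x) for x in range(m + 1))
print(EX, m * p)  # both 3.0 (up to floating-point rounding)
```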
Expectation transforms linearly
For any constants a and b, and any function g,

E[a + b g(X)] = a + b E[g(X)]

(exercise: prove this)

So a change of location and/or scale does not fundamentally affect expectation: the expected value transforms in the ‘obvious’ way.
Moments
The nth moment of X is

μ′_n = E(X^n).

The nth central moment of X is

μ_n = E[(X − μ)^n],

where μ = μ′_1 = E(X).
Mean and variance
Mean:   μ = E(X) = μ′_1

Variance:   σ² = var(X) = E[(X − μ)²] = μ_2

These are very commonly used to summarize a distribution.

[Standard deviation: σ = sd(X) = √var(X) is measured in the same units as X.]
Example: exponential distribution
var(X) = ∫_0^∞ (1/λ) (x − λ)² e^{−x/λ} dx
       = ∫_0^∞ (1/λ) (x² − 2xλ + λ²) e^{−x/λ} dx

Integrate the first term by parts as before; the second and third terms can be got directly from knowledge of E(X) and E(1). The result is var(X) = λ² (exercise).
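For a sanity check of var(X) = λ² by direct numerical integration (a sketch; λ = 3 is again arbitrary):

```python
import numpy as np
from scipy.integrate import quad

lam = 3.0
var, _ = quad(lambda x: (x - lam)**2 * np.exp(-x / lam) / lam, 0, np.inf)
print(var, lam**2)  # both ≈ 9.0
```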
Variance of a linear function
For any constants a and b,

var(aX + b) = a² var(X)

(exercise: prove this)
An alternative formula for the variance
A formula useful especially for computation:

var(X) = E(X²) − [E(X)]²

(‘mean square minus squared mean’)

(exercise: prove this)
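A quick empirical illustration of ‘mean square minus squared mean’ (a numpy sketch; the exponential sample is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.exponential(scale=3.0, size=500_000)

print(np.mean(x**2) - np.mean(x)**2)  # E(X^2) - [E(X)]^2
print(np.var(x))                      # same quantity, ≈ lam^2 = 9
```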
Moment generating function (mgf)
The mgf of a random variable X is defined as

M_X(t) = E(e^{tX}),

provided that the expectation exists (i.e., that the corresponding sum or integral converges) for all t in some neighbourhood of 0.
Mgf generates moments
E(X^n) = (d^n/dt^n) M_X(t) |_{t=0}

Proof: e.g., continuous case, assuming that it is valid to differentiate under the integral sign (†: see, e.g., C&B 2.4),

Case n = 1:
(d/dt) M_X(t) = (d/dt) ∫_{−∞}^{∞} e^{tx} f_X(x) dx
              = ∫_{−∞}^{∞} x e^{tx} f_X(x) dx   (†)
              = E(X e^{tX}) = E(X) when t = 0.

The general case n > 1 follows by induction (exercise).
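A symbolic illustration with sympy (a sketch added here, using the exponential mgf M_X(t) = 1/(1 − λt), which is the α = 1, β = λ case of the gamma mgf derived on the next slide):

```python
import sympy as sp

t, lam = sp.symbols('t lam', positive=True)
M = 1 / (1 - lam * t)  # mgf of the exponential distribution with mean lam

print(sp.diff(M, t).subs(t, 0))     # lam        = E(X)
print(sp.diff(M, t, 2).subs(t, 0))  # 2*lam**2   = E(X^2)
```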
Example: gamma distribution

f_X(x) = (1 / (Γ(α) β^α)) x^{α−1} e^{−x/β}   (x > 0)
         0                                   (otherwise)

M_X(t) = (1 / (Γ(α) β^α)) ∫_0^∞ e^{tx} x^{α−1} e^{−x/β} dx
       = (1 / (Γ(α) β^α)) ∫_0^∞ x^{α−1} exp[−x / (β/(1 − βt))] dx
       = 1/(1 − βt)^α   if t < 1/β.

Hence, for example, the mean is

E(X) = M′_X(0) = αβ/(1 − βt)^{α+1} |_{t=0} = αβ

— and similarly for other moments.
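A numerical check of the gamma mgf formula (a sketch with arbitrary α, β, and t < 1/β):

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import gamma as Gamma

alpha, beta, t = 2.5, 1.5, 0.2  # need t < 1/beta

mgf, _ = quad(lambda x: np.exp(t * x) * x**(alpha - 1) * np.exp(-x / beta)
              / (Gamma(alpha) * beta**alpha), 0, np.inf)
print(mgf, (1 - beta * t)**(-alpha))  # agree
```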
Mgf characterizes distribution
If

M_X(t) = M_Y(t)

for all t in some neighbourhood of 0, then

F_X(x) = F_Y(x)

for all x.

Proof: uniqueness of the Laplace transform. Outside the scope of this course.

(Note: the mgf is essentially the Laplace transform of f_X.)
Mgf of a linear transformation
For any constants a and b, if Y = a + bX then

M_Y(t) = e^{at} M_X(bt)

(exercise: prove this)
Mgf of a sum of independent rv’s
If X and Y are independent, and

Z = X + Y,

then M_Z(t) = M_X(t) M_Y(t).

Proof:

E[e^{t(X+Y)}] = E(e^{tX} e^{tY})
             = ∫_{−∞}^{∞} ∫_{−∞}^{∞} e^{tx} e^{ty} f_X(x) f_Y(y) dx dy   (by independence)
             = ∫_{−∞}^{∞} e^{tx} f_X(x) dx · ∫_{−∞}^{∞} e^{ty} f_Y(y) dy
             = M_X(t) M_Y(t).
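A Monte Carlo illustration of M_Z(t) = M_X(t) M_Y(t) for two independent exponentials (a sketch; the scales, t, and seed are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(3)
t = 0.1
x = rng.exponential(scale=2.0, size=500_000)
y = rng.exponential(scale=3.0, size=500_000)  # independent of x

lhs = np.mean(np.exp(t * (x + y)))                     # estimates M_Z(t)
rhs = np.mean(np.exp(t * x)) * np.mean(np.exp(t * y))  # M_X(t) * M_Y(t)
print(lhs, rhs)  # close, up to Monte Carlo error
```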