Part II
Review of random variables

Random variable maps sample space to IR
A random variable (rv) is a function that maps the sample space (of an experiment, or observation) to the real numbers:

X : S → IR
e.g., toss two dice,

S = {(1, 1), (1, 2), . . . , (6, 5), (6, 6)}

- e.g., X = total number of spots
- e.g., X = no. of spots on the higher of the 2 dice
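As a small illustration (a Python sketch added to these notes, not part of the original slides; the variable names are mine), both maps can be written down explicitly by enumerating the sample space:

```python
from itertools import product

# Sample space for two dice: all ordered pairs (d1, d2)
S = list(product(range(1, 7), repeat=2))

# Two different random variables on the same sample space:
X_total = {s: s[0] + s[1] for s in S}  # total number of spots
X_max = {s: max(s) for s in S}         # spots on the higher die

print(X_total[(6, 5)], X_max[(6, 5)])  # 11 6
```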
Cumulative distribution function

Write F_X for the (cumulative) distribution function (cdf) of rv X.

Definition: For any x ∈ IR,

F_X(x) = pr(X ≤ x).
Continuous and discrete random variables
Random variable X is continuous if F_X(x) is a continuous function of x.

Random variable Y is discrete if F_Y(y) is a step function (i.e., is piecewise constant).
(draw example graphs)
A random variable can also be of mixed type (i.e., with both discrete and continuous parts to its distribution).

For example,

X = amount of rainfall on a given day

has F_X(x) = 0 for x < 0, a discontinuous step to F_X(0) > 0, and then increases continuously through positive values of x to F_X(∞) = 1.
(draw graph)
Identically distributed random variables
If X and Y are such that

F_X(x) = F_Y(x)   ∀x ∈ IR

then X and Y are said to be identically distributed.
Example
In 10 tosses of a fair coin, let X be the number of heads, and
Y be the number of tails. Then, by symmetry, X and Y have
the same distribution. (But clearly X and Y are different
functions!)
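To make the point concrete, here is a small check (a Python sketch added here, using scipy): X and Y = 10 − X have exactly the same Binomial(10, 1/2) pmf even though they are different functions of the outcome.

```python
from scipy.stats import binom

# X = number of heads, Y = 10 - X = number of tails in 10 fair tosses.
m, p = 10, 0.5
for k in range(m + 1):
    assert abs(binom.pmf(k, m, p) - binom.pmf(m - k, m, p)) < 1e-12

print("identical pmfs, yet X + Y = 10 on every outcome")
```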
Discrete rv: probability mass function
The probability mass function (pmf) of a discrete rv X is

f_X(x) = pr(X = x)   ∀x ∈ IR.

Example

In two tosses of a fair coin, suppose X is the number of heads. Then

f_X(x) = 1/4   (x = 0)
         1/2   (x = 1)
         1/4   (x = 2)
         0     (otherwise)
Relation between pmf and cdf:

F_X(x) = Σ_{t ≤ x} f_X(t).
Example (cont.)

If X is number of heads in 2 tosses of a fair coin,

F_X(x) = 0     (x < 0)
         1/4   (0 ≤ x < 1)
         3/4   (1 ≤ x < 2)
         1     (x ≥ 2)
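The step-function cdf is just the running sum of the pmf. A minimal numpy sketch of this relationship (added here for illustration):

```python
import numpy as np

# pmf of X = number of heads in 2 fair coin tosses
support = np.array([0, 1, 2])
pmf = np.array([0.25, 0.50, 0.25])

cdf = np.cumsum(pmf)  # value of F_X at each support point
print(dict(zip(support.tolist(), cdf.tolist())))  # {0: 0.25, 1: 0.75, 2: 1.0}
```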
Continuous rv: probability density function
For a continuous rv,

pr(X = x) = 0   ∀x ∈ IR.

But we can usefully define the probability density function (pdf) analogously to the pmf, replacing summation by integration. For a continuous rv, the pdf is the function f_X(x) such that

F_X(x) = ∫_{−∞}^{x} f_X(t) dt   ∀x ∈ IR.
(sketch graph of this relationship)
From the Fundamental Theorem of Calculus, then, if f_X(x) is continuous,

f_X(x) = (d/dx) F_X(x).
Example
Suppose that X is uniformly distributed on the interval (a, b). The density (pdf) is

f_X(x) = 1/(b − a)   (a < x < b)
         0           (otherwise)

The corresponding cdf is

F_X(x) = 0                 (x ≤ a)
         (x − a)/(b − a)   (a < x ≤ b)
         1                 (x > b)
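These formulas can be checked against a library implementation (a sketch using scipy, where uniform(a, b) is parameterized by loc = a and scale = b − a; the values of a and b are arbitrary):

```python
from scipy.stats import uniform

a, b = 2.0, 5.0
U = uniform(loc=a, scale=b - a)  # scipy's form of uniform(a, b)

print(U.pdf(3.0))  # 1/(b - a) = 1/3
print(U.cdf(3.0))  # (3 - a)/(b - a) = 1/3
print(U.cdf(6.0))  # 1.0, since 6 > b
```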
A notational shorthand: “∼”
The cdf F_X fully describes the distribution of X, as does the pdf or pmf f_X. It is common to write either

X ∼ F_X(x)   or   X ∼ f_X(x)

or to use a distribution’s name, e.g.,

X ∼ uniform(a, b).

The symbol ‘∼’ means ‘is distributed as’.

Similarly, if X and Y are identically distributed, we may write X ∼ Y.
Characterization of pmf and pdf
An arbitrary function f(x) is a pmf (or pdf) if and only if

1. f(x) ≥ 0   ∀x
2. Σ_x f(x) = 1 (pmf)   or   ∫_{−∞}^{∞} f(x) dx = 1 (pdf)

(For more mathematical precision than this, go to ST213 Mathematics of Random Events.)
A technical note: absolute continuity
There do exist continuous cdf’s F(x) which are such that no density function f exists satisfying

F(x) = ∫_{−∞}^{x} f(t) dt   ∀x.
Such cases are pathological, though, and will not concern us further in this course.

A cdf F(x) for which f(x) does exist is said to be absolutely continuous. All of the important continuous distributions used in statistics are of this kind.
Mixed discrete and continuous
If X has a mixture of discrete and continuous distribution,

F_X(x) = p F_X^(d)(x) + (1 − p) F_X^(c)(x),

with p ∈ (0, 1), and we define

f_X(x) = p f_X^(d)(x) + (1 − p) f_X^(c)(x)

where f_X^(d) is a pmf and f_X^(c) a pdf.

For example, pr(no rain) = p = 0.62, say.
Transformation: functions of a rv
New rv from old:

Y = g(X).

How do F_Y, f_Y relate to F_X, f_X?
cdf
F_Y(y) = pr(Y ≤ y) = pr(g(X) ≤ y).

Suppose that g is (strictly) monotonic (and hence can be inverted). Then either g is increasing, so that

F_Y(y) = pr(X ≤ g^{−1}(y)) = F_X(g^{−1}(y)),

or is decreasing, in which case

F_Y(y) = pr(X ≥ g^{−1}(y)) = pr(−X ≤ −g^{−1}(y)) = F_{−X}(−g^{−1}(y)).
Discrete rv: pmf
f_Y(y) = pr(Y = y) = Σ_{x: g(x)=y} f_X(x).

Example

Discrete uniform distribution on {−1, 0, 1}:

f_X(x) = 1/3   (x ∈ {−1, 0, 1})
         0     (otherwise)

Consider Y = X²:

f_Y(y) = 1/3   (y = 0)
         2/3   (y = 1)
         0     (otherwise)
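The preimage sum translates directly into a few lines of Python (a sketch added for illustration; the dictionary names are mine):

```python
from collections import defaultdict

# pmf of X: discrete uniform on {-1, 0, 1}
pmf_X = {-1: 1/3, 0: 1/3, 1: 1/3}

# f_Y(y) = sum of f_X(x) over {x : g(x) = y}, here with g(x) = x**2
pmf_Y = defaultdict(float)
for x, p in pmf_X.items():
    pmf_Y[x**2] += p

print(dict(pmf_Y))  # {1: 0.666..., 0: 0.333...}
```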
Continuous rv: pdf
We have seen how to get F_Y from F_X, in general. Provided that the derivative exists, then,

f_Y(y) = (d/dy) F_Y(y).

This is the most general approach, which works for any transformation Y = g(X).

In the special but common case that g is both invertible and continuously differentiable, a quick route from f_X to f_Y is

f_Y(y) = f_X(g^{−1}(y)) |(d/dy) g^{−1}(y)|.

(Exercise: prove this.)
Example
X ∼ uniform(−1, 1)

f_X(x) = 1/2   (x ∈ (−1, 1))
         0     (otherwise)

Let Y = X². By symmetry, |X| ∼ uniform(0, 1); and Y = |X|² is a smooth, invertible function of |X|. Hence

f_Y(y) = f_{|X|}(√y) |(d/dy) √y| = 1 × 1/(2√y)   (0 < y < 1).
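A quick Monte Carlo sanity check of the implied cdf F_Y(y) = √y (a numpy sketch added to these notes; the seed and sample size are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=200_000)
y = x**2

# F_Y(y) = sqrt(y) on (0, 1); compare with the empirical cdf
for q in (0.1, 0.25, 0.5, 0.9):
    print(q, np.mean(y <= q), np.sqrt(q))
```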
A special transformation: the probability integral transform (PIT)
Suppose X is continuous, and let Y = F_X(X). Then Y ∼ uniform(0, 1):

F_Y(y) = pr(Y ≤ y)
       = pr(F_X(X) ≤ y)
       = pr(X ≤ F_X^{−1}(y))
       = F_X(F_X^{−1}(y)) = y   (0 < y < 1).

Note: the above proof applies directly when F_X is strictly increasing. Otherwise some care is needed in the definition of F_X^{−1}: see C&B Theorem 2.1.10.
The PIT is useful for various statistical purposes, both theoretical and practical.

A particular application of note is the simulation of an arbitrary random variable X on a computer. For example, to generate (i.e., simulate) values of X from a strictly increasing cdf F_X, we can first generate

U ∼ uniform(0, 1)

and then transform via the inverse PIT to

X = F_X^{−1}(U).

This is not necessarily the most computationally efficient method, but it has the advantage of being very general (and for that reason is very much used in practice).
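A minimal sketch of this recipe in Python, for the exponential distribution with mean λ (so F_X(x) = 1 − e^{−x/λ} and F_X^{−1}(u) = −λ log(1 − u); the names and seed are my own):

```python
import numpy as np

rng = np.random.default_rng(1)
lam = 2.0

u = rng.uniform(size=100_000)   # U ~ uniform(0, 1)
x = -lam * np.log1p(-u)         # X = F_X^{-1}(U) = -lam * log(1 - u)

print(x.mean())  # should be close to E(X) = lam = 2.0
```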
Expectation
Definition
The expected value or mean of any function g(X) is

E[g(X)] = Σ_x g(x) f_X(x)             (discrete)
          ∫_{−∞}^{∞} g(x) f_X(x) dx   (continuous)
This is just the average value of g(X) in repeated sampling.
Example: exponential distribution

f_X(x) = (1/λ) e^{−x/λ}   (x > 0)
         0                (otherwise)

E(X) = ∫_0^∞ (x/λ) e^{−x/λ} dx
     = [−x e^{−x/λ}]_0^∞ + ∫_0^∞ e^{−x/λ} dx
     = λ ∫_0^∞ (1/λ) e^{−x/λ} dx
     = λ.
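A quick numerical confirmation of E(X) = λ (a scipy sketch added here; λ = 3 is arbitrary):

```python
import numpy as np
from scipy.integrate import quad

lam = 3.0
mean, _ = quad(lambda x: x * np.exp(-x / lam) / lam, 0, np.inf)
print(mean)  # ≈ 3.0 = lam
```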
Example: binomial distribution

f_X(x) = C(m, x) p^x (1 − p)^{m−x}   (x = 0, 1, . . . , m)

E(X) = Σ_{x=0}^{m} x C(m, x) p^x (1 − p)^{m−x}
     = Σ_{x=1}^{m} x C(m, x) p^x (1 − p)^{m−x}.

Now note that x C(m, x) = m C(m−1, x−1), so

E(X) = m Σ_{x=1}^{m} C(m−1, x−1) p^x (1 − p)^{m−x}
     = mp Σ_{y=0}^{m−1} C(m−1, y) p^y (1 − p)^{(m−1)−y} = mp.
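The result E(X) = mp can be checked numerically in a couple of lines (a sketch with arbitrary m and p):

```python
from math import comb

m, p = 10, 0.3
EX = sum(x * comb(m, x) * p**x * (1 - p)**(m - x) for x in range(m + 1))
print(EX, m * p)  # both 3.0 (up to floating-point rounding)
```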
Expectation transforms linearly
For any constants a and b, and any function g,

E[a + b g(X)] = a + b E[g(X)]

(exercise: prove this)

So a change of location and/or scale does not fundamentally affect expectation: the expected value transforms in the ‘obvious’ way.
Moments
The nth moment of X is

μ′_n = E(X^n).

The nth central moment of X is

μ_n = E[(X − μ)^n],

where μ = μ′_1 = E(X).
Mean and variance
Mean:   μ = E(X) = μ′_1

Variance:   σ² = var(X) = E[(X − μ)²] = μ_2

These are very commonly used to summarize a distribution.

[Standard deviation: σ = sd(X) = √var(X) is measured in the same units as X.]
Example: exponential distribution
var(X) = ∫_0^∞ (1/λ) (x − λ)² e^{−x/λ} dx
       = ∫_0^∞ (1/λ) (x² − 2xλ + λ²) e^{−x/λ} dx

Integrate the first term by parts as before; the second and third terms can be got directly from knowledge of E(X) and E(1). The result is var(X) = λ² (exercise).
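For a sanity check of var(X) = λ² by direct numerical integration (a sketch; λ = 3 is again arbitrary):

```python
import numpy as np
from scipy.integrate import quad

lam = 3.0
var, _ = quad(lambda x: (x - lam)**2 * np.exp(-x / lam) / lam, 0, np.inf)
print(var, lam**2)  # both ≈ 9.0
```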
Variance of a linear function
For any constants a and b,

var(aX + b) = a² var(X)

(exercise: prove this)
An alternative formula for the variance
A formula useful especially for computation:

var(X) = E(X²) − [E(X)]²

(‘mean square minus squared mean’)

(exercise: prove this)
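A quick empirical illustration of ‘mean square minus squared mean’ (a numpy sketch; the exponential sample is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.exponential(scale=3.0, size=500_000)

print(np.mean(x**2) - np.mean(x)**2)  # E(X^2) - [E(X)]^2
print(np.var(x))                      # same quantity, ≈ lam^2 = 9
```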
Moment generating function (mgf)
The mgf of a random variable X is defined as

M_X(t) = E(e^{tX}),

provided that the expectation exists (i.e., that the corresponding sum or integral converges) for all t in some neighbourhood of 0.
Mgf generates moments
E(X^n) = (d^n/dt^n) M_X(t) |_{t=0}

Proof: e.g., continuous case, assuming that it is valid to differentiate under the integral sign (†: see, e.g., C&B 2.4),

Case n = 1:
(d/dt) M_X(t) = (d/dt) ∫_{−∞}^{∞} e^{tx} f_X(x) dx
              = ∫_{−∞}^{∞} x e^{tx} f_X(x) dx   (†)
              = E(X e^{tX}) = E(X) when t = 0.

The general case n > 1 follows by induction (exercise).
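A symbolic illustration with sympy (a sketch added here, using the exponential mgf M_X(t) = 1/(1 − λt), which is the α = 1, β = λ case of the gamma mgf derived on the next slide):

```python
import sympy as sp

t, lam = sp.symbols('t lam', positive=True)
M = 1 / (1 - lam * t)  # mgf of the exponential distribution with mean lam

print(sp.diff(M, t).subs(t, 0))     # lam        = E(X)
print(sp.diff(M, t, 2).subs(t, 0))  # 2*lam**2   = E(X^2)
```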
Example: gamma distribution

f_X(x) = (1 / (Γ(α) β^α)) x^{α−1} e^{−x/β}   (x > 0)
         0                                   (otherwise)

M_X(t) = (1 / (Γ(α) β^α)) ∫_0^∞ e^{tx} x^{α−1} e^{−x/β} dx
       = (1 / (Γ(α) β^α)) ∫_0^∞ x^{α−1} exp[−x / (β/(1 − βt))] dx
       = 1/(1 − βt)^α   if t < 1/β.

Hence, for example, the mean is

E(X) = M′_X(0) = αβ/(1 − βt)^{α+1} |_{t=0} = αβ

— and similarly for other moments.
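A numerical check of the gamma mgf formula (a sketch with arbitrary α, β, and t < 1/β):

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import gamma as Gamma

alpha, beta, t = 2.5, 1.5, 0.2  # need t < 1/beta

mgf, _ = quad(lambda x: np.exp(t * x) * x**(alpha - 1) * np.exp(-x / beta)
              / (Gamma(alpha) * beta**alpha), 0, np.inf)
print(mgf, (1 - beta * t)**(-alpha))  # agree
```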
Mgf characterizes distribution
If

M_X(t) = M_Y(t)

for all t in some neighbourhood of 0, then

F_X(x) = F_Y(x)

for all x.

Proof: uniqueness of the Laplace transform. Outside the scope of this course.

(Note: the mgf is essentially the Laplace transform of f_X.)
Mgf of a linear transformation
For any constants a and b, if Y = a + bX then

M_Y(t) = e^{at} M_X(bt)

(exercise: prove this)
Mgf of a sum of independent rv’s
If X and Y are independent, and

Z = X + Y,

then M_Z(t) = M_X(t) M_Y(t).

Proof:

E[e^{t(X+Y)}] = E(e^{tX} e^{tY})
             = ∫_{−∞}^{∞} ∫_{−∞}^{∞} e^{tx} e^{ty} f_X(x) f_Y(y) dx dy   (by independence)
             = ∫_{−∞}^{∞} e^{tx} f_X(x) dx · ∫_{−∞}^{∞} e^{ty} f_Y(y) dy
             = M_X(t) M_Y(t).
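A Monte Carlo illustration of M_Z(t) = M_X(t) M_Y(t) for two independent exponentials (a sketch; the scales, t, and seed are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(3)
t = 0.1
x = rng.exponential(scale=2.0, size=500_000)
y = rng.exponential(scale=3.0, size=500_000)  # independent of x

lhs = np.mean(np.exp(t * (x + y)))                     # estimates M_Z(t)
rhs = np.mean(np.exp(t * x)) * np.mean(np.exp(t * y))  # M_X(t) * M_Y(t)
print(lhs, rhs)  # close, up to Monte Carlo error
```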