Operations Research and Industrial Engineering Ph.D. Qualifying Exam–Statistical Modeling Sample Questions

1. Let $X_1, X_2, \ldots, X_n$ be a random sample from a population with probability density function

$$f(x \mid \alpha, \beta) = \left(\frac{\beta}{2\pi x^{3}}\right)^{1/2} \exp\left(-\frac{\beta (x - \alpha)^{2}}{2\alpha^{2} x}\right), \quad x > 0.$$

Find the maximum likelihood estimators of $\alpha$ and $\beta$.

2. Suppose a random variable $X$ is distributed Bernoulli($p$). A sample of size 4 is collected to test $H_0: p \le 1/3$ vs. $H_1: p > 1/3$ with critical region $C_r = [X > K]$. What is the power function of the test for the critical region using $K = 2$?

3. Given a random sample from a Geometric distribution, find the Fisher information in the sample.

4. Let $X_1, X_2, \ldots, X_n$ be a random sample from a population with pdf

$$f(x \mid \alpha) = \alpha x^{\alpha - 1}, \quad 0 < x < 1, \ \alpha > 0.$$

Is $\prod_{i=1}^{n} X_i$ a complete sufficient statistic for $\alpha$? Why or why not?

5. What is the relationship between the likelihood ratio test and the best test by the Neyman-Pearson theorem if both the null and alternative hypotheses are simple?

6. Let $Y_1$ and $Y_2$ be independent and identically distributed with a uniform distribution over the interval $(\theta, \theta + 1)$. For testing $H_0: \theta = 0$ versus $H_1: \theta > 0$, we have two competing tests:

   • Test 1: Reject $H_0$ if $Y_1 > 0.95$
   • Test 2: Reject $H_0$ if $Y_1 + Y_2 > c$

Find the values of $c$ so that Test 2 has the same type I error as Test 1.

7. The likelihood function $L(y_1, y_2, \ldots, y_n \mid \theta)$ takes on different values depending on the arguments $(y_1, y_2, \ldots, y_n)$. A method for deriving a minimal sufficient statistic developed by Lehmann and Scheffé uses the ratio of the likelihoods evaluated at two points, $(x_1, x_2, \ldots, x_n)$ and $(y_1, y_2, \ldots, y_n)$:

$$\frac{L(x_1, x_2, \ldots, x_n \mid \theta)}{L(y_1, y_2, \ldots, y_n \mid \theta)}$$

Many times it is possible to find a function $g(x_1, x_2, \ldots, x_n)$ such that this ratio is free of the unknown parameter $\theta$ if and only if $g(x_1, x_2, \ldots, x_n) = g(y_1, y_2, \ldots, y_n)$. If such a function $g$ can be found, then $g(Y_1, Y_2, \ldots, Y_n)$ is a minimal sufficient statistic for $\theta$.
Let $Y_1, Y_2, \ldots, Y_n$ be a random sample from a Bernoulli distribution with $p$ unknown.

   (a) Show that

$$\frac{L(x_1, x_2, \ldots, x_n \mid \theta)}{L(y_1, y_2, \ldots, y_n \mid \theta)} = \left(\frac{p}{1 - p}\right)^{\sum x_i - \sum y_i}$$

   (b) Argue that for this ratio to be independent of $p$, we must have

$$\sum x_i = \sum y_i$$

   (c) Using the method of Lehmann and Scheffé, what is a minimal sufficient statistic for $p$? How does this sufficient statistic compare to the sufficient statistic derived using the factorization theorem?

Operations Research and Industrial Engineering Ph.D. Qualifying Exam–Stochastic Processes Sample Questions

1. Let $\{N_1(t), t \ge 0\}$ and $\{N_2(t), t \ge 0\}$ be independent renewal processes, which count events. Set $N(t) = N_1(t) + N_2(t)$ for all $t \ge 0$. For each part you need to either give a short proof or provide a counterexample, i.e., justify your answer for full credit.

   (a) Are the interevent times of $N(\cdot)$ independent?
   (b) Are the interevent times of $N(\cdot)$ identically distributed?
   (c) Is $N(\cdot)$ a renewal process?

2. Suppose that people arrive at a bus stop according to a Poisson process with rate $\lambda$. Buses arrive exactly every $t$ time units, and have unlimited capacity. Let $X_i$ denote the total waiting time for those who board any particular bus $i$.

   (a) Calculate $E[X_i]$.
   (b) Calculate $\mathrm{Var}[X_i]$.
   (c) Could you still derive the quantities above if the customer arrival process were a nonhomogeneous Poisson process? Either way, explain your answer.

3. Consider an irreducible discrete-time Markov chain with transition matrix $P$, and a finite or infinite state space $S$. Suppose $P = P^2$.

   (a) Show that the Markov chain is aperiodic.
   (b) Show that $p_{ij} = p_{jj}$ for all $i, j \in S$.
   (c) Argue that a stationary distribution for the chain must exist, and give the stationary distribution, in terms of the elements of $P$.

4. The Google Markov Chain. Google’s PageRank algorithm works by analyzing what we will call the Google Markov Chain (GMC). The GMC can be formulated as a discrete-time Markov chain.
The state space for the GMC consists of every page on the World Wide Web. From any page $i$ the chain jumps to another page $j$ for which there is an outgoing link from page $i$ to page $j$. So, from page $i$ we look at all outgoing links and uniformly randomly pick an outgoing link to jump to. In other words, $p_{ij}$ is equal to 1 divided by the number of outgoing links on page $i$, if there is a link from page $i$ to page $j$. If there is no link from page $i$ to page $j$, then $p_{ij} = 0$. If page $i$ is a dead-end, i.e., there are no outgoing links, then the next page in the chain is chosen (uniformly) randomly from all pages in the World Wide Web (including page $i$ itself). A web page’s PageRank is based on the long-run percentage of time spent on that page if one crawls the web under the dynamics of the GMC. In each part below, you must explain your answer (in a few sentences) for full credit.

   (a) Is the GMC always irreducible?
   (b) Does the GMC always have a unique stationary (or steady-state) distribution?
   (c) Does the GMC always have a unique limiting distribution?
   (d) Suppose the World Wide Web consists of just 3 pages numbered 1, 2, 3. In the matrix $L$, $\ell_{ij} = 1$ if there is a link from page $i$ to page $j$, otherwise $\ell_{ij} = 0$:

$$L = \begin{pmatrix} 0 & 1 & 1 \\ 1 & 0 & 1 \\ 0 & 0 & 0 \end{pmatrix}.$$

   Give the transition matrix for the GMC associated with link matrix $L$ and calculate the PageRank of each page in the Web by calculating the steady-state distribution of the GMC.

5. The Stochastic Parking Attendant. Suppose cars arrive to a set of parking lots according to a renewal process with interarrival c.d.f. $G(\cdot)$. For simplicity, suppose that $G(\cdot)$ is aperiodic. When each car arrives, the parking attendant flips a biased coin, with probability of heads being $0 < p < 1$. If the coin shows heads the car is directed to lot A. Otherwise, it is directed to lot B. Let the total car arrival process (from the outside) be $\{N(t), t \ge 0\}$ and the arrival processes to each lot $\{N_A(t), t \ge 0\}$ and $\{N_B(t), t \ge 0\}$.
Note that all three processes are counting processes.

   (a) Let $\{M(t), t \ge 0\}$ be the renewal function associated with $N(\cdot)$. Calculate $M_A(t)$, for $t \ge 0$, the renewal function associated with $N_A(\cdot)$.
   (b) Is $\{N_A(t), t \ge 0\}$ always a renewal process? Explain your answer for full credit.
   (c) Are $\{N_A(t), t \ge 0\}$ and $\{N_B(t), t \ge 0\}$ independent processes? If not, provide a counterexample. If so, give an outline of a proof (three or four sentences, for example).
   (d) Let $H(t) = E[N_A(t) - N_B(t)]$, for all $t \ge 0$. Write down a renewal-type equation for $H(t)$.
   (e) Now suppose we wish to calculate $\lim_{t \to \infty} H(t)$. Is the Key Renewal Theorem applicable to this calculation? Explain.
   (f) Calculate $\lim_{t \to \infty} H(t)$. Note that this can be done with or without the Key Renewal Theorem.
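
The construction rule for the Google Markov Chain in Question 4 of the Stochastic Processes questions can be sketched numerically. The Python snippet below is only an illustrative aid for checking hand calculations, not part of the exam. The function names `gmc_transition_matrix` and `stationary_distribution` are hypothetical, and the solver assumes the chain has a unique stationary distribution, which is exactly what parts (a) through (c) ask you to examine.

```python
from fractions import Fraction


def gmc_transition_matrix(links):
    """Build the GMC transition matrix from a 0/1 link matrix.

    Rule from the problem statement: from page i, follow a uniformly chosen
    outgoing link; from a dead-end, jump to a uniformly chosen page
    (including the current page itself).
    """
    n = len(links)
    P = []
    for row in links:
        out_links = sum(row)
        if out_links == 0:
            # Dead-end: every page is equally likely next.
            P.append([Fraction(1, n)] * n)
        else:
            P.append([Fraction(x, out_links) for x in row])
    return P


def stationary_distribution(P):
    """Solve pi P = pi, sum(pi) = 1 exactly by Gauss-Jordan elimination.

    Assumption (not guaranteed in general): the solution is unique.
    If it is not, the pivot search below raises StopIteration.
    """
    n = len(P)
    # Augmented rows encode (P^T - I) pi = 0 ...
    A = [[P[j][i] - (1 if i == j else 0) for j in range(n)] + [Fraction(0)]
         for i in range(n)]
    # ... with the last (redundant) equation replaced by sum(pi) = 1.
    A[n - 1] = [Fraction(1)] * n + [Fraction(1)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if A[r][col] != 0)
        A[col], A[pivot] = A[pivot], A[col]
        A[col] = [v / A[col][col] for v in A[col]]
        for r in range(n):
            if r != col and A[r][col] != 0:
                factor = A[r][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][n] for i in range(n)]


# Link matrix from part (d) of Question 4.
L = [[0, 1, 1],
     [1, 0, 1],
     [0, 0, 0]]
P = gmc_transition_matrix(L)
pi = stationary_distribution(P)
print("Transition matrix:", P)
print("Stationary distribution:", pi)
```

Exact rational arithmetic via `fractions.Fraction` is used here so that the printed values can be compared directly with fractions obtained by hand; a floating-point linear solver would work equally well for larger link matrices.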