Ebola: Nonparametric Survival Analysis Without Life Data
November 16, 2014
• P[Time from case report to death > k]
• P[Death in kth week|Survive for k-1 weeks]
• Extrapolate case reports
• Health-Care Workers
• CFR? Standard deviations
• Caseload Forecasts
• 21 Days Incubation?

Why?
• Speculations: 70% die? Need 262 beds in Guinea by Dec. 1? 1000 new cases per week by end of year? 200,000–250,000 cases by Jan. 20? 21-day incubation?
  – How long to onset? death? CFR and distributions
  – How long to release? Empirical distribution [NEJM]
  – How many cases in treatment each week?
• Forecast case reports and caseloads to justify $$$, equipment, and personnel
• Compare countries? Treatments? Health-Care workers?

Data
• http://www.who.int/csr/disease/ebola/en
  – Cumulative case reports = confirmed + probable + suspected and death counts
  – “The total number of cases is subject to change due to reclassification, retrospective investigation, consolidation of cases and laboratory data, and enhanced surveillance.”
    • Used weekly counts to smooth corrections, unsuccessfully
  – Cumulative death counts
• Days: incubation, onset, hospitalization, release… [http://www.nejm.org/doi/pdf/10.1056/NEJMoa1411100 appendix]

Methods
• Regress weekly case reports on time: linear, logarithmic, and piecewise both
• Nonparametric ccdf (survivor function) estimates of time from case report to death from WHO counts
  – Maximum likelihood assuming nonstationary Poisson case reports [George and Agrawal]
  – Constrained least squares [Harris and Rattner, Gang, George]
    • min Σ |observed weekly deaths − estimated weekly deaths|² [Gang]
    • Subject to one or both of…
      – Σ observed total deaths = Σ estimated total deaths
      – P[Time to death ≤ Now − first case date] ≥ deaths/cases
  – Constrained maximum entropy, −Σ p(t) ln(p(t)) [Tribus]
• Incubation time, Length-of-Stay (LoS in hospital) conditional empirical cdfs, from NEJM article appendix

Weekly Case Reports
• Linear and logarithmic regression of weekly case reports Y on T, days. Heteroskedastic!
  – Notice R² values for alternative models?

Linear, Y = mT + b
Country        m       b         R²     seY
Guinea         3.12    -4.34     0.36   44.66
Liberia        18.97   -144.97   0.35   268.62
Sierra Leone   23.25   -78.85    0.47   184.68

Logarithmic, Y = b·m^T
Country        m        b       R²     seY
Guinea         1.0570   9.89    0.18   1.27
Liberia        1.2884   0.20    0.71   1.70
Sierra Leone   1.1342   23.39   0.56   0.84

Piecewise Regression
• Weekly case report rates = slope
• Rates increase after knot points
  – Not splined, chose knot points for min SSE
  – Slope coefficients = case reports/week!

Country        Before\After   Slope per week   R², Linear   R², Logarithmic
Guinea         7/20/2014      -1.36            0.17         0.36
Guinea         7/27/2014      5                0.17         0.13
Liberia        8/3/2014       3.31             0.58         0.4
Liberia        8/10/2014      8.9              0.01         0.03
Sierra Leone   8/10/2014      8.7              0.33         0.10
Sierra Leone   8/17/2014      33               0.26         0.23

Cumulative Case Reports, Forecasts, and 95% Upper Prediction Limits
• Extrapolations are averages and prediction limits: 20 bootstraps of 36 weeks for Guinea
[Figure: cumulative case reports, 3/16/2014 to 12/11/2014, 0 to 3500. Series: linear extrapolation, log extrapolation, 95% upper limit for linear extrapolation, 95% upper limit for log extrapolation]

“Survivor” Function Estimates: CCDF of Weeks from Case Report to Death
[Figure: survivor function by country (Guinea, Liberia, Sierra Leone). x-axis: weeks from case report to death, 0 to 36; y-axis: 0.0 to 1.0]

Survivor Function Estimates for Health-Care Workers
[Figure: R(t) by country (Guinea, Liberia, Sierra Leone). x-axis: weeks from case report to death, 0 to 36; y-axis: 0 to 1]

Compare Estimates: All (left) vs. Health-Care Workers
[Figure: the two survivor function plots above, side by side]

Actuarial Weekly Death Rates Conditional on Survival
[Figure: weekly death rate by country (Guinea, Liberia, Sierra Leone). x-axis: weeks from case report to death, 0 to 36; y-axis: 0.00 to 0.50]

CFR ~ Deaths/Case Reports
Country        All cases CFR   Standard Deviations   Health-Care Workers CFR
Guinea         62.9%           0.9%                  54%
Liberia        41.4%           9.8%                  41%
Sierra Leone   30.3%           26.1%                 80%

Deaths/Case Reports is Biased Low
• P[Time to death ≤ 32 weeks] > Deaths/Case Reports!
  – because some haven’t died yet
• Empirical standard deviation estimates are from March-April, -May, -June, -July, -August, -September, -October and total cohorts

P[Death time ≤ 2 weeks] and P[LoS ≤ 4 weeks|Survive]
• Death time = Death – Case Report
• LoS = Release – Hospitalization (Length of Stay)

Country        Probability of death in 2 weeks   Std. Dev.   LoS
Guinea         49.2%                             5.6%        100%
Liberia        36.9%                             24.1%       92.5%
Sierra Leone   20.5%                             13.1%       100%

P[Incubation time > 21 days]
• Supplementary Appendix 1, N Engl J Med. DOI: 10.1056/NEJMoa1411100
• Eyeballed data and computed empirical cdf
  – P[inc. time > 21 days|one-day exposure] = 3.5%
  – P[inc. time > 21 days|multi-day exposure] = 10.1%
  – ~534 and ~154 cases respectively
• Why do people think, “If you don’t have symptoms after 21 days, you won’t get it”?

Length-of-Stay: P[LoS > t|Survive]
• New England Journal of Medicine article appendix fit gamma distributions to data
• Mark II eyeballs read hospitalization-to-release data from NEJM graphs. Used empirical cdf of LoS
  – Sample sizes are small
• Use actuarial death rates and empirical cdf of LoS to forecast caseloads
  – Assume case report => caseload in first week after report

Caseloads vs. Required ETU Beds
• WHO Dec. 1, 2014 required ETU beds
  – http://apps.who.int/iris/bitstream/10665/137091/1/roadmapsitrep22Oct2014_eng.pdf?ua=1
• Forecast is from distribution of time from case report to either release or death

Country        Existing ETU beds   WHO Required ETU beds   WHO Ratio %   Forecast Caseloads
Guinea         160                 260                     61%           1034
Liberia        620                 2690                    23%           5535
Sierra Leone   346                 1198                    29%           5494

Caseload Estimates and Forecasts (NOT cumulative!)
[Figure: weekly caseload by country (Guinea, Liberia, Sierra Leone), 3/19/2014 to 12/14/2014; y-axis: 0 to 8,000]

Analyses: append file name to http://pstlarry.home.comcast.net/
• Guinea: EbolaGna.xlsm
• Liberia: EbolaLib.xlsm
• Sierra Leone: EbolaSL.xlsm
• Regression and summary: EbolaSIR.xlsx
• Health-Care workers: EbolaHCW.xlsm

Workbook Spreadsheets
• *.xlsm contain workbook tabs: Data, npmle, nplseSummary, MaxEntropy, Recovery, nplse of subsets
  – Npmle didn’t fit well, partly due to data revisions
  – Nplse and MaxEntropy agreed tolerably
  – VBA convolution for actuarial forecasts
  – EbolaHCW.xlsm contains nplse ccdf estimates
• EbolaSIR.xlsx contains country tabs, GuineaBootstrap, SurvivalAnalysis, CaseLoad, Incubation
  – The country tabs contain regression analyses of case reports and Guinea bootstrap case prediction limits

Conclusions: If survive week 1,…
• Survivors may need care to prevent subsequent death due to secondary causes: liver damage, eyesight, and ???
• Accounting corrections mess up statistics?
  – Liberia reduced death counts
  – Sierra Leone’s recently reported deaths affect estimates: weekly estimates disagree with estimates from original case report and death counts
  – Walter Shewhart’s rule #1: “Preserve all relevant information in data”
• Exponential cumulative case reports (=> Ro > 1) seem close to linear. Let’s hope so.
• Caseloads ~8000 in Liberia and Sierra Leone by end of year. Can’t treat unreported cases.
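
The caseload forecasts come from a convolution of weekly case reports with the chance of still being in treatment. The Python sketch below illustrates that idea only; it is not the VBA in the workbooks. The function names are hypothetical, and it assumes that hospitalization coincides with the case report, that the last survivor-function value approximates the fraction who never die of Ebola, and that patients who will eventually die stay in treatment until death.

```python
from typing import List, Sequence


def actuarial_rates(survivor: Sequence[float]) -> List[float]:
    """Weekly death rate conditional on survival.

    survivor[k] = P[time from case report to death > k weeks], survivor[0] = 1.
    Returns a(k) = (S(k-1) - S(k)) / S(k-1) for k = 1, 2, ...
    """
    rates = []
    for k in range(1, len(survivor)):
        at_risk = survivor[k - 1]
        rates.append((at_risk - survivor[k]) / at_risk if at_risk > 0 else 0.0)
    return rates


def in_treatment_probability(survivor: Sequence[float],
                             los_ccdf: Sequence[float]) -> List[float]:
    """P[neither dead nor released s weeks after the case report].

    los_ccdf[s] = P[LoS > s weeks | survive].
    ASSUMPTION: survivor[-1] stands in for the long-run surviving fraction.
    """
    s_inf = survivor[-1]
    horizon = max(len(survivor), len(los_ccdf))
    probs = []
    for s in range(horizon):
        s_death = survivor[s] if s < len(survivor) else s_inf
        still_in = los_ccdf[s] if s < len(los_ccdf) else 0.0
        # Will die but has not yet, plus will survive but is not yet released.
        probs.append((s_death - s_inf) + s_inf * still_in)
    return probs


def forecast_caseload(weekly_cases: Sequence[float],
                      in_treatment: Sequence[float]) -> List[float]:
    """Convolution: caseload[t] = sum over s of weekly_cases[t-s] * in_treatment[s]."""
    caseload = []
    for t in range(len(weekly_cases)):
        total = 0.0
        for s in range(min(t, len(in_treatment) - 1) + 1):
            total += weekly_cases[t - s] * in_treatment[s]
        caseload.append(total)
    return caseload
```

To use it, pass a country's estimated survivor function and the empirical LoS ccdf to `in_treatment_probability`, then pass the observed plus extrapolated weekly case reports to `forecast_caseload`. With `survivor[0] = 1` and `los_ccdf[0] = 1`, every reported case counts toward the caseload in its first week, matching the assumption stated on the Length-of-Stay slide.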

Questions
• Case confirmations and adjustments cause problems
  – Why do estimates differ? Daily vs. weekly
    • Country vs. country, health-care workers vs. populations, npmle vs. nplse vs. max. entropy
  – Standard deviation estimates could be reduced
  – How to compensate for corrections?
• Use statistics for planning, resource allocation, and service levels? SIR, SEIR?
• Would WHO and Imperial College please share their individual case data?
  – Should countries support WHO and West Africa without data, estimates, and quantifiable uncertainty?

Next Steps
• Forecast prediction limits on caseloads for specified service levels
  – SIR by country or county?
  – Contact [email protected] if you would like more analyses
  – Updated weekly or whenever I get data
• Requested case data from WHO, Imperial College, and CDC to validate estimates and represent dependence
• Adjust data or adjust estimates to compensate for corrections?

References
• George, L. L. and Avinash C. Agrawal, “Estimation of a hidden service distribution of an M/G/∞ system,” Naval Research Logistics, 20: 549–555. doi: 10.1002/nav.3800200314, http://pstlarry.home.comcast.net/MGinfi1.docx
• George, L. L., “Field Reliability Without Life Data,” SPES/QP News, vol. 5, no. 2, Dec. 1999, pp. 13-14, http://www.amstat-online.org/sections/qp/1299newsletter.pdf
• Harris, Carl M., Edward Rattner, and Clifton Sutton, “Forecasting the extent of the HIV/AIDS epidemic,” Socio-Economic Planning Sciences, 1992, vol. 26, issue 3, pp. 149-168
• Ibid., “Estimating and Projecting Regional HIV/AIDS Cases and Costs. 1990-2000: A Case Study,” Interfaces, Vol. 29, No. 5, Sept.-Oct. 1997, pp. 38-53
• Gang Cheng, “The nonparametric least-squares method for estimating monotone functions with interval censored observations,” PhD thesis, University of Iowa, 2012, http://ir.uiowa.edu/etd/2839
• Jewell, Nicholas et al., “Estimation of the Case Fatality Ratio with Competing Risks Data: An Application to Severe Acute Respiratory Syndrome (SARS),” UC Berkeley Div. of Biostat. Working paper series Number 176, 2005
• WHO Ebola Response Team, “Ebola Virus Disease in West Africa – the First 9 Months of the Epidemic and Forward Projections,” N Engl J Med. DOI: 10.1056/NEJMoa1411100, appendix
• Maimuna Majumder, “Mathematical Modeling of the 2014 Ebola Outbreak,” MIT, Sept. 26, 2014, http://maimunamajumder.wordpress.com/2014/09/26/mathematical-modeling-ofthe-2014-ebola-outbreak/