Ebola: Nonparametric Survival Analysis Without Life Data October 23, 2014

• P[Time from case report to death > k weeks]
• P[Death in week k | Survive for k-1 weeks]
• Regression analyses of case reports
• Case Fatality Ratios by country
• Standard deviations
• Caseload forecasts
Why?
• News reports: 70% die, need 262 beds in Guinea
by Dec. 1, 1000 new cases per week by end of
year, 200,000–250,000 cases by Jan. 20, …
– How long to death? CFR and distribution
– How long to release? Empirical distribution [NEJM]
– How many cases in treatment each week?
• Forecast case reports and caseloads
– e.g., Harris and Rattner, Jewell et al., Majumder
• Compare countries? Treatments? SIR? R0?
Data
• http://www.who.int/csr/disease/ebola/en
– Cumulative case reports = confirmed + probable +
suspected and death counts
– “The total number of cases is subject to change due to
reclassification, retrospective investigation, consolidation of
cases and laboratory data, and enhanced surveillance.”
• Used weekly counts to smooth corrections
• Days from hospitalization to release
[http://www.nejm.org/doi/pdf/10.1056/NEJMoa1411100, appendix]
Methods
• Regress case reports on time: linear, logarithmic, and
piecewise linear
• Nonparametric ccdf (survivor function) estimates of time
from case report to death from WHO counts
– Maximum likelihood assuming nonstationary Poisson case
reports [George and Agrawal]
– Constrained least squares [Harris and Rattner, Gang, George]
• min Σ |observed weekly deaths − estimated weekly deaths|² [Gang]
• Subject to one or both of…
– Σ observed total deaths = Σ estimated total deaths
– P[Time to death ≤ Now − first case date] ≥ deaths/cases
– Constrained maximum entropy: max −Σ p(t) ln(p(t)) [Tribus]
• Length-of-Stay (in hospital) empirical cdf conditional on
recovery, from NEJM article appendix
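A minimal sketch of the model behind the least-squares estimate, under stated assumptions: weekly deaths are treated as the convolution of weekly case reports with p(k) = P[death exactly k weeks after the case report], and the survivor function is one minus the running sum of p. The function names, the lag-0 convention, and the counts are illustrative, not the workbook's VBA or WHO data. The constrained minimization itself is not shown, only the objective it would minimize.

```python
def expected_weekly_deaths(new_cases, death_pmf):
    """Convolve weekly case reports with P[death exactly k weeks after report]."""
    weeks = len(new_cases)
    expected = [0.0] * weeks
    for report_week, cases in enumerate(new_cases):
        for lag, prob in enumerate(death_pmf):
            if report_week + lag < weeks:
                expected[report_week + lag] += cases * prob
    return expected


def sum_squared_error(observed_deaths, new_cases, death_pmf):
    """Least-squares objective: sum of |observed - estimated weekly deaths|^2."""
    estimated = expected_weekly_deaths(new_cases, death_pmf)
    return sum((o - e) ** 2 for o, e in zip(observed_deaths, estimated))


def survivor_function(death_pmf):
    """S(k) = P[time from case report to death > k weeks] for k = 0, 1, ..."""
    survivor, remaining = [], 1.0
    for prob in death_pmf:
        remaining -= prob
        survivor.append(remaining)
    return survivor


# Made-up weekly counts, for illustration only (not WHO data).
new_cases = [10, 20, 40, 40]
candidate_pmf = [0.4, 0.2, 0.1]      # lags 0, 1, 2 weeks; the sum 0.7 is the implied CFR
observed_deaths = [4, 11, 20, 26]

estimated_deaths = expected_weekly_deaths(new_cases, candidate_pmf)
objective = sum_squared_error(observed_deaths, new_cases, candidate_pmf)
survivor = survivor_function(candidate_pmf)
print(estimated_deaths, objective, survivor)
```

An estimator would search over nonnegative p(k) summing to at most 1 for the values that minimize this objective, optionally with the constraints listed above.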
Case Reports
• Linear and logarithmic regression of case reports
Y on time T (days)
– Notice R2 values for alternative models?
                Linear: Y = mT + b                Logarithmic: Y = b·m^T
Country         m       b        R2      seY      m        b        R2      seY
Guinea          5.68    -111     0.83    164      1.0120   104.25   0.93    0.21
Liberia         19.09   -1057    0.68    851      1.042    1.355    0.964   0.31
Sierra Leone    22.91   -514     0.86    410      1.0294   78.38    0.98    0.20
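The two model forms can be sketched as follows, assuming ordinary least squares for Y = mT + b and a regression of ln(Y) on T for Y = b·m^T. The spreadsheet's exact fitting routine is not stated on the slide, so this is an illustration on synthetic data, and the function names are hypothetical.

```python
import math


def fit_line(t, y):
    """Ordinary least squares for y = m*t + b; returns (m, b, r_squared)."""
    n = len(t)
    t_mean = sum(t) / n
    y_mean = sum(y) / n
    sxx = sum((ti - t_mean) ** 2 for ti in t)
    sxy = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y))
    m = sxy / sxx
    b = y_mean - m * t_mean
    ss_res = sum((yi - (m * ti + b)) ** 2 for ti, yi in zip(t, y))
    ss_tot = sum((yi - y_mean) ** 2 for yi in y)
    return m, b, 1.0 - ss_res / ss_tot


def fit_exponential(t, y):
    """Fit y = b * m**t by regressing ln(y) on t; R2 is on the log scale."""
    slope, intercept, r_squared = fit_line(t, [math.log(yi) for yi in y])
    return math.exp(slope), math.exp(intercept), r_squared


# Synthetic cumulative counts growing 3% per day (not WHO data).
days = [0, 7, 14, 21, 28]
cases = [100 * 1.03 ** d for d in days]

linear = fit_line(days, cases)
exponential = fit_exponential(days, cases)
print(linear)
print(exponential)
```

When the logarithmic model is fitted this way, its R2 and seY are measured in ln(Y) units, so they are not directly comparable with the linear model's values.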
Piecewise Linear Regression
• Cumulative case reports
• Rate increases after knot points
– Not splined; knot points chosen to minimize SSE
– Linear slope coefficients = case reports/day!
Country         Before\After    Slope per day    R2
Guinea          7/20/2014       2.95             0.97
Guinea          7/27/2014       13.22            0.97
Liberia         7/20/2014       1.11             0.56
Liberia         7/27/2014       56.78            0.98
Sierra Leone    8/3/2014        8.56             0.96
Sierra Leone    8/4/2014        41.26            0.95
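A minimal sketch of an unsplined two-segment fit, assuming each segment gets its own least-squares line and the knot is the split that minimizes the combined SSE. The data are synthetic and the function names are hypothetical; the workbooks may choose knots differently.

```python
def fit_line(t, y):
    """Ordinary least squares for y = slope*t + intercept; returns (slope, intercept, sse)."""
    n = len(t)
    t_mean = sum(t) / n
    y_mean = sum(y) / n
    sxx = sum((ti - t_mean) ** 2 for ti in t)
    sxy = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    sse = sum((yi - (slope * ti + intercept)) ** 2 for ti, yi in zip(t, y))
    return slope, intercept, sse


def best_knot(t, y, min_points=2):
    """Try every split; fit separate lines before and after; keep the smallest total SSE."""
    best = None
    for k in range(min_points, len(t) - min_points + 1):
        before = fit_line(t[:k], y[:k])
        after = fit_line(t[k:], y[k:])
        total_sse = before[2] + after[2]
        if best is None or total_sse < best["sse"]:
            best = {"knot_index": k, "first_day_after": t[k],
                    "slope_before": before[0], "slope_after": after[0],
                    "sse": total_sse}
    return best


# Synthetic cumulative case reports: 3 per day, then 13 per day after a jump.
days = [0, 7, 14, 21, 28, 35, 42, 49, 56, 63]
cumulative = [0, 21, 42, 63, 84, 191, 282, 373, 464, 555]

result = best_knot(days, cumulative)
print(result)
```

Because the two lines are fitted independently, they need not meet at the knot, which is what "not splined" means here. The slopes are in case reports per day.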
Cumulative Case Reports and Forecasts
[Figure: cumulative total case reports (0 to 18,000) versus date, 3/16/2014 to 12/11/2014, for Guinea, Liberia, and Sierra Leone.]
“Survivor” Function Estimates: CCDF of Weeks from Case Report to Death
[Figure: estimated ccdf (0.00 to 1.00) versus weeks from case report to death (0 to 32), for Guinea, Liberia, and Sierra Leone.]
Actuarial Weekly Death Rates Conditional on Survival
[Figure: weekly death rate conditional on survival (0.00 to 0.50) versus weeks from case report to death (0 to 32), for Guinea, Liberia, and Sierra Leone.]
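The actuarial rate P[Death in week k | Survive for k-1 weeks] follows from the survivor function as (S(k-1) − S(k)) / S(k-1), with S(0) = 1. A minimal sketch with made-up survivor values follows; the function name and numbers are illustrative, not estimates from the workbooks.

```python
def actuarial_death_rates(survivor):
    """survivor[i] = P[time to death > i + 1 weeks]; S(0) = 1 is implicit.

    Returns the conditional death rate for weeks 1, 2, ..., len(survivor).
    """
    rates = []
    previous = 1.0
    for current in survivor:
        rates.append((previous - current) / previous if previous > 0 else 0.0)
        previous = current
    return rates


# Made-up survivor function values for weeks 1 to 4.
survivor = [0.6, 0.5, 0.45, 0.45]
rates = actuarial_death_rates(survivor)
cfr_within_horizon = 1.0 - survivor[-1]   # P[death within 4 weeks]
print(rates, cfr_within_horizon)
```

The last value of the survivor function gives the case fatality ratio over the observed horizon as one minus that value.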
CFR = P[Death ≤ Time since first case]
Country         Probability of death                      Std. Dev.
Guinea          62.52%                                    ~1.2%
Liberia         67.61%                                    ~5.6%
Sierra Leone    59.29%                                    ~14.2%
Nigeria         62% (mle), 69% (lse), 43% (max.En.)       na
• Survivor function estimates for Nigeria
disagree: too little time and too few cases and
deaths (August 2014 only)
Deaths/Cases is Biased Low
• P[Time to death ≤ 32 weeks] > Deaths/Case Reports!
– Because some reported cases haven’t died yet
• Empirical standard deviation estimates are
from March-April, March-May, March-June,
March-July, March-August, March-September
and total cohorts
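A small worked example of the bias, assuming deaths follow case reports with a fixed lag distribution and case reports are growing. The counts and probabilities are made up for illustration.

```python
def deaths_to_cases_ratio(new_cases, death_pmf):
    """Cumulative deaths / cumulative case reports at the end of the last week.

    death_pmf[k] = P[death exactly k weeks after the case report]; cases reported
    in the most recent weeks have only had time for their earliest deaths.
    """
    weeks = len(new_cases)
    deaths_so_far = 0.0
    for report_week, cases in enumerate(new_cases):
        lags_observed = weeks - report_week          # lags 0 .. weeks-1-report_week
        deaths_so_far += cases * sum(death_pmf[:lags_observed])
    return deaths_so_far / sum(new_cases)


# Made-up example: cases double weekly; 60% eventually die.
new_cases = [10, 20, 40, 80]
death_pmf = [0.3, 0.2, 0.1]

true_cfr = sum(death_pmf)
naive_ratio = deaths_to_cases_ratio(new_cases, death_pmf)
print(true_cfr, naive_ratio)
```

Here 62 of 150 reported cases have died so far, a ratio of about 41%, although 60% will eventually die. The faster the case reports grow, the larger the gap.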
P[Death time ≤ 4 weeks] and
P[LoS ≤ 4 weeks|Survive]
• Death time = Death – Case Report
• LoS = Release – hospitalization (Length of
Stay)
Country         Probability of death in 4 weeks    Std. Dev.    P[LoS ≤ 4 weeks|Survive]
Guinea          48.1%                              6.5%         100%
Liberia         47.1%                              14.4%        92.5%
Sierra Leone    22.7%                              14.0%        100%
Nigeria         33.33%
Length-of-Stay: P[LoS > t|Survive]
• New England Journal of Medicine article
appendix fit gamma distributions to data
• Mark II eyeballs read hospitalization-to-release
data from NEJM graphs. Used empirical cdf of LoS
– Sample sizes are small
• Use actuarial death rates and empirical cdf of LoS
to forecast caseloads
– Assume case report => case load in first week after
report
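A minimal sketch of a caseload forecast along these lines, under assumptions that are mine rather than the workbook's: a patient who dies k weeks after the report is in care for lags 0 to k, a survivor is in care while LoS exceeds the lag, and LoS is measured from the case report rather than from hospitalization. All numbers are made up.

```python
def empirical_ccdf(stays, horizon):
    """P[LoS > u] for u = 0 .. horizon-1, from observed lengths of stay in weeks."""
    n = len(stays)
    return [sum(1 for stay in stays if stay > u) / n for u in range(horizon)]


def in_care_fraction(death_pmf, los_ccdf):
    """Fraction of reported cases still in care u weeks after the report."""
    cfr = sum(death_pmf)
    return [sum(death_pmf[u:]) + (1.0 - cfr) * los_ccdf[u]
            for u in range(len(los_ccdf))]


def weekly_caseload(new_cases, fraction):
    """Convolve weekly case reports with the in-care fraction."""
    loads = []
    for week in range(len(new_cases)):
        load = 0.0
        for report_week in range(week + 1):
            lag = week - report_week
            if lag < len(fraction):
                load += new_cases[report_week] * fraction[lag]
        loads.append(load)
    return loads


# Made-up inputs, for illustration only.
death_pmf = [0.3, 0.2, 0.1]        # P[death k weeks after report], k = 0, 1, 2
survivor_stays = [1, 2, 2, 3]      # weeks in hospital for four recovered patients
new_cases = [10, 20, 40]

fraction = in_care_fraction(death_pmf, empirical_ccdf(survivor_stays, 4))
caseload = weekly_caseload(new_cases, fraction)
print(fraction, caseload)
```

With these conventions every reported case counts toward the caseload in its first week, matching the assumption above. Unlike cumulative case reports, the caseload can fall when new reports slow.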
Caseloads vs. Required ETU Beds
• WHO Dec. 1, 2014 required ETU beds
– http://apps.who.int/iris/bitstream/10665/137091/1/roadmapsitrep22Oct2014_eng.pdf?ua=1
• Forecast is from distribution of time from case
report to either release or death
Country         Existing ETU beds    WHO Required ETU beds    WHO Ratio %    Forecast Case-loads    Forecast Ratio %
Guinea          160                  260                      61%            838                    19.09%
Liberia         620                  2690                     23%            2,880                  21.53%
Sierra Leone    346                  1198                     29%            2,737                  12.64%
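Both ratio columns appear to be existing ETU beds divided by the need, the WHO requirement in one case and the forecast caseload in the other. That reading is inferred from the numbers, not stated on the slide. A short arithmetic check with a hypothetical helper:

```python
def coverage_percent(existing_beds, needed):
    """Existing ETU beds as a percentage of the beds or caseload to be served."""
    return 100.0 * existing_beds / needed


# (existing beds, WHO required beds, forecast caseload) as shown on this slide.
etu = {
    "Guinea": (160, 260, 838),
    "Liberia": (620, 2690, 2880),
    "Sierra Leone": (346, 1198, 2737),
}

ratios = {}
for country, (existing, who_required, forecast) in etu.items():
    ratios[country] = (coverage_percent(existing, who_required),
                       coverage_percent(existing, forecast))
    print(f"{country}: WHO {ratios[country][0]:.1f}%, forecast {ratios[country][1]:.2f}%")
```

The forecast ratios reproduce the slide's values. With 260 required beds the Guinea WHO ratio computes to about 61.5%; the 61% shown matches the 262 beds quoted from news reports earlier in the talk.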
Caseload Estimates and Forecasts (NOT cumulative)
[Figure: caseloads (0 to 3,500) versus date, 3/19/2014 to 12/14/2014, for Guinea, Liberia, and Sierra Leone.]
Analyses
• Guinea
http://pstlarry.home.comcast.net/EbolaGna.xlsm
• Liberia
http://pstlarry.home.comcast.net/EbolaLib.xlsm
• Sierra Leone
http://pstlarry.home.comcast.net/EbolaSL.xlsm
• Regression and summary
http://pstlarry.home.comcast.net/EbolaSIR.xlsx
Workbook Spreadsheets
• *.xlsm contain workbook tabs {Data, npmle,
nplseSummary, MaxEntropy, Recovery, nplse of
subsets}
– Npmle didn’t fit well, partly due to data revisions
– Nplse and MaxEntropy agreed tolerably
– Macro is VBA convolution for actuarial forecasts
• EbolaSIR.xlsx contains tabs {Guinea, Liberia,
Sierra Leone, Survival Analysis, CaseLoad}
– The country tabs contain regression analyses of case
reports
Conclusions: If survive week 1,…
• Survivors may need care to prevent subsequent death due to
secondary causes: liver damage and ???
• Case fatality ratio estimates in first week: 46.9% in Guinea, 40.5% in
Liberia, ???% in Sierra Leone
– More deaths in 4th or 5th week in Guinea and Sierra Leone, in 8th week
in Liberia. Accounting phenomena?
– More deaths in 12th week in Guinea. Accounting phenomenon?
– Recently reported deaths in Sierra Leone affect the estimates
• Standard deviations of first weekly death rates: 6.1% in Guinea,
17.1% in Liberia, 16.7% in Sierra Leone, based on six monthly
estimates
• Exponential increases in cumulative case reports (=> R0 > 1) seem
to have reverted to linear increases. Let’s hope so.
• Caseloads ~7200 by end of year, based on the WHO case reports
and deaths and NEJM LoS. Can’t treat unreported cases.
Questions
• Case confirmations and adjustments created
problems with case reports and deaths
– Why do survivor function estimates differ?
– Standard deviation estimates could be reduced with
more work, time, and data
• Use case forecasts, caseloads, and standard
errors for planning, allocation, and service levels?
• Would WHO and Imperial College please share
case data?
– Should countries support WHO and West Africa
without data, estimates, and uncertainty?
Next Steps
• Forecast tolerance limits on caseloads for
specified service levels
– Infectious time distribution for those who recover?
– SIR by country or county?
– Contact [email protected] if you would like more
analyses
– Updated weekly or whenever I get data
• Get case data from WHO, Imperial College, and
CDC
References
• George, L. L. and Avinash C. Agrawal, “Estimation of a hidden service distribution of an M/G/∞ system,” Naval Research Logistics, 20: 549–555. doi: 10.1002/nav.3800200314, http://pstlarry.home.comcast.net/MGinfi1.docx
• George, L. L., “Field Reliability Without Life Data,” SPES/QP News, vol. 5, no. 2, Dec. 1999, pp. 13-14, http://www.amstat-online.org/sections/qp/1299newsletter.pdf
• Harris, Carl M., Edward Rattner, and Clifton Sutton, “Forecasting the extent of the HIV/AIDS epidemic,” Socio-Economic Planning Sciences, 1992, vol. 26, issue 3, pp. 149-168
• Ibid. “Estimating and Projecting Regional HIV/AIDS Cases and Costs. 1990-2000: A Case Study,” Interfaces, Vol. 29, No. 5, Sept.-Oct. 1997, pp. 38-53
• Gang Cheng, “The nonparametric least-squares method for estimating monotone functions with interval censored observations,” PhD thesis, University of Iowa, 2012, http://ir.uiowa.edu/etd/2839
• Jewell, Nicholas et al., “Estimation of the Case Fatality Ratio with Competing Risks Data: An Application to Severe Acute Respiratory Syndrome (SARS),” UC Berkeley Div. of Biostat. Working paper series Number 176, 2005
• WHO Ebola Response Team, “Ebola Virus Disease in West Africa—the First 9 Months of the Epidemic and Forward Projections,” N Engl J Med. DOI: 10.1056/NEJMoa1411100, appendix
• Maimuna Majumder, “Mathematical Modeling of the 2014 Ebola Outbreak,” MIT Sept. 26, 2014, http://maimunamajumder.wordpress.com/2014/09/26/mathematical-modeling-ofthe-2014-ebola-outbreak/