The DREAM Rheumatoid Arthritis Responder Challenge:
Motivation, Data, Scoring and Results
LARA MANGRAVITE, SAGE BIONETWORKS
ON BEHALF OF THE RA CHALLENGE ORGANIZING TEAM
Challenge Organizers
Solly Sieberts
Abhi Pratap
Christine Suver
Bruce Hoff
Thea Norman
Venkat Balagurusamy
Stephen Friend
Gustavo Stolovitzky
Funders
Eli Stahl, Mt Sinai
Gaurav Pandey, Mt Sinai
Jing Cui, Brigham and Women’s
Andre Falcao, U Lisbon
Robert Plenge, Merck
Peter Gregersen, Feinstein Institute
Jeff Greenberg, Corrona
Dimitrios Pappas, Corrona
Kaleb Michaud, Arthritis Internet Registry
Generators of Training Dataset
Rheumatoid Arthritis Treatment
~30% of RA patients fail to respond to anti-TNF therapy
-- Predicting nonresponse would assist in precision medicine,
clinical trial design, and development of new therapies
Robert Plenge
Pharmacogenetics of anti-TNF response
n=2,706

Drug                      N      SNP heritability (se)   P-value
All patients              2617   0.18 (0.10)             0.02
etanercept                716    0 (0.34)                0.5
infliximab                857    0.62 (0.29)             0.02
adalimumab                1027   0.36 (0.25)             0.08
infliximab + adalimumab   1899   0.36 (0.13)             0.003

Cui and Stahl et al., PLoS Genetics 2013
Eli Stahl
Rationale
-- Given the sizable estimated heritability, is it possible to
use genetic features to predict treatment response?
-- Polygenic approach: Combined influence of weak effects
-- Population subtypes: Not all individuals react similarly
Does genetic heritability foretell genetic prediction?
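A minimal sketch of what a polygenic predictor computes, assuming per-SNP weights have already been estimated (for example from GWAS effect sizes). The SNP identifiers, the weights, and the function name are hypothetical and are not taken from the Challenge data.

```python
def polygenic_score(dosages, weights):
    """Weighted sum of allele dosages (0, 1, or 2 copies of the effect allele)."""
    # SNPs missing from a patient's genotype contribute nothing in this sketch.
    return sum(w * dosages.get(snp, 0) for snp, w in weights.items())

# Hypothetical per-SNP weights: each effect is weak on its own.
weights = {"snp_1": 0.02, "snp_2": -0.01, "snp_3": 0.03}
patient = {"snp_1": 2, "snp_2": 1, "snp_3": 0}

score = polygenic_score(patient, weights)
```

No single SNP dominates the score; the prediction comes from the sum of many small contributions.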
RA Responder Challenge Design
Discovery (phase I)
-- GWAS of treatment response in RA (n≈2,700 patients)
-- Polygenic SNP predictor of response
-- Refine model: 1) genomic data (e.g., expression profiling),
2) peer insights, etc.
-- Open collaboration on Synapse
-- Submit models
Validation (phase II)
-- GWAS of treatment response in RA (n≈1,100 patients)
-- Score submitted models against responses
-- Publication, peer review
Plenge et al., Nature Genetics 2013
RA Challenge Data
Discovery Dataset (N=2076; combined set from 4 studies)
-- Genotypes: ~2.3 million SNPs
-- Clinical: ~6 traits
-- Response
Test Data (N=723; generated for this challenge)
-- Genotypes: ~2.3 million SNPs
-- Clinical: ~6 traits
RA Challenge:
Build the best possible predictors of anti-TNFa response in RA
Team Phase
Community Phase
TEAM PHASE February - June 2014
Self-aggregate into teams and build the best possible
predictor of response.
COMMUNITY PHASE July - October 2014
Work together across teams to assess the contribution of
genetics to prediction.
RA Responders Challenge
Subchallenge 1: Predict treatment response as measured by change in
disease activity score (DAS28) in response to anti-TNFa therapy.
Scoring: Average rank of Pearson correlation and Spearman
correlation.
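The scoring rule above can be sketched as follows. This is an illustration under assumptions, not the organizers' scoring code: the team names and numbers are made up, and giving tied values their mean rank is a choice of this sketch, since the slides do not say how ties were handled.

```python
from statistics import mean

def average_ranks(values):
    """Rank values ascending (1 = smallest); tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def pearson(x, y):
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

def spearman(x, y):
    # Spearman correlation is the Pearson correlation of the ranks.
    return pearson(average_ranks(x), average_ranks(y))

def score_submissions(observed, submissions):
    """Return {team: mean of its Pearson rank and Spearman rank}; lower is better."""
    teams = list(submissions)
    r_p = [pearson(observed, submissions[t]) for t in teams]
    r_s = [spearman(observed, submissions[t]) for t in teams]
    # Negate so that the highest correlation receives rank 1.
    rank_p = average_ranks([-v for v in r_p])
    rank_s = average_ranks([-v for v in r_s])
    return {t: (p + s) / 2 for t, p, s in zip(teams, rank_p, rank_s)}

# Made-up deltaDAS values and predictions for three hypothetical teams.
observed = [1.2, 0.4, 2.5, 3.1, 0.9]
submissions = {
    "team_a": [1.0, 0.5, 2.0, 3.0, 1.1],
    "team_b": [2.0, 1.9, 1.0, 0.5, 2.2],
    "team_c": [1.5, 1.4, 1.6, 2.9, 0.2],
}
scores = score_submissions(observed, submissions)
```

Averaging ranks rather than raw correlations keeps either metric from dominating when the two are on different scales across teams.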
Subchallenge 2: Identify poor responders to anti-TNFa therapy as
defined by EULAR criteria.
Scoring: Average rank of AUC and PR (areas under the ROC and
precision-recall curves).
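A rough sketch of the two classification metrics and their rank average, assuming labels of 1 for nonresponders and 0 otherwise. The step-wise average precision used here is one common estimate of the area under the precision-recall curve; the slides do not state which estimator the Challenge used, and the teams and scores below are invented.

```python
def auroc(labels, scores):
    """Chance that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def average_precision(labels, scores):
    """Step-wise area under the precision-recall curve (no special tie handling)."""
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits, total = 0, 0.0
    for k, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / k  # precision at the point each positive is recalled
    return total / hits

def rank_desc(values):
    """Rank 1 = largest value; ties are broken by input order in this sketch."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    ranks = [0] * len(values)
    for position, i in enumerate(order, start=1):
        ranks[i] = position
    return ranks

# Made-up labels (1 = nonresponder) and predictions from two hypothetical teams.
labels = [1, 0, 1, 0, 0, 1]
predictions = {
    "team_x": [0.9, 0.8, 0.7, 0.3, 0.2, 0.1],
    "team_y": [0.2, 0.9, 0.3, 0.8, 0.7, 0.1],
}
teams = list(predictions)
auc_rank = rank_desc([auroc(labels, predictions[t]) for t in teams])
pr_rank = rank_desc([average_precision(labels, predictions[t]) for t in teams])
final = {t: (a + p) / 2 for t, a, p in zip(teams, auc_rank, pr_rank)}
```

The precision-recall area is the more sensitive of the two when nonresponders are the minority class, which is one reason to report both.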
Team Phase Results
32 teams
Subchallenge 1: Predicting deltaDAS
Best models: Team Guan Lab
Subchallenge 2: Predicting nonresponders
Best models: Team Guan Lab & Team SBI_Lab
Solly Sieberts
The Community Phase (July – October)
Work in collaboration to determine:
-- Whether genetic information contributes in a
meaningful way to predictions.
-- The best possible predictors of response.
-- Which components of the modeling approaches are
most beneficial for this question.
Community Phase Participants
Community Phase Logistics
-- First part: teams split into groups and shared
knowledge to help inform one another’s efforts.
-- Second part: all teams came together to devise an
analytical plan to explicitly address these questions.
Teams share ideas and then work
individually to address:
-- Do models using genetic features improve
prediction relative to clinical models?
-- What is the contribution of feature selection vs.
modeling algorithm to performance?
-- Does the use of biological priors in feature selection
improve performance relative to random selection?
-- Can a supervised ensemble approach improve upon
individual predictions?
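To make the last question concrete, here is a minimal sketch of one supervised ensemble, least-squares stacking of two base models. It is an illustration only and not the ensemble method used in the Challenge; the function names and the toy predictions are hypothetical.

```python
from statistics import mean

def fit_stack(pred_a, pred_b, target):
    """Least-squares weights (w_a, w_b, intercept) for combining two base models."""
    ma, mb, mt = mean(pred_a), mean(pred_b), mean(target)
    a = [v - ma for v in pred_a]
    b = [v - mb for v in pred_b]
    t = [v - mt for v in target]
    saa = sum(x * x for x in a)
    sbb = sum(x * x for x in b)
    sab = sum(x * y for x, y in zip(a, b))
    sat = sum(x * y for x, y in zip(a, t))
    sbt = sum(x * y for x, y in zip(b, t))
    det = saa * sbb - sab * sab  # zero if the two models are collinear
    w_a = (sat * sbb - sbt * sab) / det
    w_b = (sbt * saa - sat * sab) / det
    return w_a, w_b, mt - w_a * ma - w_b * mb

def predict_stack(weights, pred_a, pred_b):
    w_a, w_b, intercept = weights
    return [w_a * x + w_b * y + intercept for x, y in zip(pred_a, pred_b)]

# Toy base-model predictions; the target is built as a known blend so the
# fitted weights can be checked against it.
model_a = [1.0, 2.0, 3.0, 4.0, 5.0]
model_b = [2.0, 1.0, 4.0, 3.0, 6.0]
target = [0.7 * x + 0.3 * y + 1.0 for x, y in zip(model_a, model_b)]

weights = fit_stack(model_a, model_b, target)
blended = predict_stack(weights, model_a, model_b)
```

In practice the weights would be fit on predictions for samples the base models did not train on, otherwise the ensemble rewards overfit models.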
Subchallenge 1: Predicting deltaDAS
Subchallenge 2: Predicting Nonresponders
Ensemble Modeling by Gaurav Pandey
Conclusions
-- Gaussian Process Regression appears to work best
with this type of problem.
-- SNP selection is more important than algorithm
selection in most cases.
-- Genetic information improves prediction of
nonresponders over use of clinical information alone.
-- The ability to predict response based on clinical features
may be valuable to clinicians in and of itself.
Today’s Speakers: Best Performers from
Independent Team Phase
-- Fan Zhu on behalf of Team Guan Lab
   A generic method for predicting clinical outcomes and drug
   response
-- Javier Garcia-Garcia on behalf of Team SBI_Lab
   Predicting response to arthritis treatments: regression-based
   Gaussian processes on small sets of SNPs