Viper project. - Familias home page

VIPER Project
Euroforgen Presentation
PENE Laurent
Head of Human Identification Division
INPS
● National Institute of Scientific Police
● National agency under the supervision of the General Director of the National Police
● 5 labs across the country (staff ≈ 700)
● Main laboratory located in Lyon
● Multidisciplinary approach
● DNA analysis = more than 80 % of total activity
Identity Card of the VIPER project
● Internal project, but open to external discussion and collaboration
● Period: 2015 – 2017
● Project leader for INPS: Lyon Laboratory
● Project coordinator: Laurent PENE
● People involved: IT staff and scientists
Goals of the VIPER project
● Set up a complete framework for the post-analytical phase of autosomal STR profile use
● Development of statistical training for DNA experts
● Harmonization of good practices for all INPS DNA experts
  ● Huge challenge because we have 40 DNA experts in the institute
Why the VIPER project?
● INPS DNA experts are poorly trained in statistical interpretation
● Lack of confidence in interpreting complex mixtures (3 contributors)
● Some private DNA experts report complex mixtures (4-5 contributors) without statistical weight
● Lack of use of validated software tools to interpret DNA mixtures
INPS Analytical Method in 2015
● Quantitation: Quantifiler or Quantifiler Duo
● Identifiler +: 16 loci (Life Technologies)
  ● France = CODIS country
  ● Full volume (25 µl)
  ● 29 cycles
● Capillary electrophoresis with 3130 XL
● Shift to megaplex kits in 2016
  ● GlobalFiler, PowerPlex Fusion 6C, Investigator 24plex
INPS Reporting Evidence in 2015
● Calculation of Random Match Probability (RMP) for single-source profiles
● Manual deconvolution of mixtures with 2 contributors
● Calculation of LR with DNAMIX (software developed by Bruce Weir and colleagues)
  ● Binary approach
● Mixtures with 3 contributors are interpreted only very rarely
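As a rough illustration of the single-source calculation listed on this slide, the Python sketch below computes a Random Match Probability with the product rule (p² for a homozygous locus, 2pq for a heterozygous locus, multiplied across loci). The allele frequencies and the function names are hypothetical illustration values, not INPS data or the INPS spreadsheet, and no theta correction is applied.

```python
from __future__ import annotations

from math import prod


def genotype_frequency(p: float, q: float | None = None) -> float:
    """Hardy-Weinberg genotype frequency: p^2 if homozygous, 2pq if heterozygous."""
    return p * p if q is None else 2.0 * p * q


def random_match_probability(profile: dict[str, tuple[float, ...]]) -> float:
    """Product rule: multiply the genotype frequencies of all loci.

    `profile` maps a locus name to the frequencies of the observed alleles:
    one frequency for a homozygote, two for a heterozygote.
    """
    return prod(genotype_frequency(*freqs) for freqs in profile.values())


# Hypothetical allele frequencies, for illustration only.
example_profile = {
    "D3S1358": (0.25, 0.20),  # heterozygote
    "vWA": (0.10,),           # homozygote
    "FGA": (0.15, 0.05),      # heterozygote
}

rmp = random_match_probability(example_profile)
print(f"RMP = {rmp:.2e}")
```

A real calculation would use a validated population frequency database and all typed loci.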
Set-up of a NEW Complete Framework
● Validation with GeneMapper ID-X 1.5
● Interpretation with:
  ● GeneMapper ID-X 1.5
  ● LR Mix Studio: semi-continuous model
● Import into the LIMS (Labvantage – Sapphire)
● Improve quality control
● Reporting statistical weight with the Likelihood Index
Validation GeneMapper ID-X 1.5
● Definition
  ● Selection of electrophoresis peaks that correspond to alleles
  ● Classification of genetic profiles
Why are thresholds useful?
● To avoid taking stutter peaks into account in the probabilistic calculation
  – With a semi-continuous model, stutters can be considered as drop-in, but they decrease the LR value
● To determine the allelic association of a major component of a mixture that can be uploaded to national databases
Thresholds used for classification
● Analytical threshold ≈ 50 RFU
● Stutter filters ≈ between 6 and 12 % for back stutter and around 2 % for forward stutter
● Peak height ratio ≈ 60 %
● Ratio between minor peaks and major peaks
● Stochastic threshold ≈ 400 RFU
  ● It is more a landmark than a threshold
  ● Below this value, you enter a risk area
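To make the role of these thresholds concrete, here is a minimal Python sketch that applies them to the peaks of one locus. The 50 RFU, 400 RFU and 60 % values come from the slide; the single 10 % back-stutter ratio is an assumed value inside the 6-12 % range, and the `Peak` structure and function names are hypothetical. In the actual workflow this filtering is done in GeneMapper ID-X, not in custom code.

```python
from dataclasses import dataclass

ANALYTICAL_THRESHOLD = 50    # RFU, from the slide
STOCHASTIC_THRESHOLD = 400   # RFU, from the slide
BACK_STUTTER_RATIO = 0.10    # assumption: one value within the 6-12 % range
PEAK_HEIGHT_RATIO = 0.60     # from the slide


@dataclass
class Peak:
    allele: int   # repeat number (microvariants are ignored in this sketch)
    height: int   # peak height in RFU


def filter_peaks(peaks):
    """Keep peaks above the analytical threshold that are not back stutter.

    A peak is treated as back stutter when it sits one repeat below another
    called peak and its height is at most BACK_STUTTER_RATIO of that parent.
    """
    called = [p for p in peaks if p.height >= ANALYTICAL_THRESHOLD]
    heights = {p.allele: p.height for p in called}
    kept = []
    for p in called:
        parent_height = heights.get(p.allele + 1)
        if parent_height is not None and p.height <= BACK_STUTTER_RATIO * parent_height:
            continue  # filtered as back stutter
        kept.append(p)
    return kept


def in_risk_area(peak):
    """True when the peak is below the stochastic threshold (drop-out is plausible)."""
    return peak.height < STOCHASTIC_THRESHOLD


def is_balanced(a, b):
    """True when the lower peak reaches PEAK_HEIGHT_RATIO of the higher peak."""
    return min(a.height, b.height) / max(a.height, b.height) >= PEAK_HEIGHT_RATIO


locus_peaks = [Peak(14, 90), Peak(15, 1200), Peak(16, 30), Peak(17, 350)]
kept = filter_peaks(locus_peaks)
```

In this example the 30 RFU peak falls below the analytical threshold, the 90 RFU peak is removed as back stutter of allele 15, and allele 17 is kept but lies in the risk area.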
Proposed Classification (1)
● Flat profiles = very low or no analytical signals
● Single-source profile
  ● Full profile
  ● Partial profile
● Major component profile
  ● Even in mixtures with numerous contributors, we try to select the peaks of the major contributor
Proposed Classification (2)
● Mixture profiles
  ● MAD 2
    – Mixtures with 2 contributors that can be subjected to deconvolution
  ● MAC 2 and MAC 3:
    – Mixtures with 2 or 3 contributors that can be used for characterization or comparison
Proposed Classification (3)
● Mixture profiles
  ● Mixed profiles not interpretable
    – Too many contributors: > 3
    – Weak analytical signal
      ● For the 12 smaller loci of Identifiler + (IDE +), more than 20 % of allelic peaks are below the stochastic threshold (400 RFU)
    – This category of profile is not used for the interpretation stage
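The "not interpretable" rule above can be sketched in a few lines of Python. The 400 RFU threshold, the 20 % limit and the maximum of 3 contributors come from the slide; the function names, the input format (a flat list of allelic peak heights for the 12 smaller loci) and the handling of an empty list are assumptions made for illustration.

```python
STOCHASTIC_THRESHOLD = 400   # RFU, from the slide
MAX_WEAK_FRACTION = 0.20     # "more than 20 %" of allelic peaks
MAX_CONTRIBUTORS = 3         # more than 3 contributors is not interpreted


def has_weak_signal(peak_heights):
    """True when more than 20 % of the allelic peaks are below 400 RFU.

    `peak_heights` is assumed to hold the heights (RFU) of the allelic peaks
    at the 12 smaller loci. An empty list is treated as weak (assumption).
    """
    if not peak_heights:
        return True
    below = sum(1 for h in peak_heights if h < STOCHASTIC_THRESHOLD)
    return below / len(peak_heights) > MAX_WEAK_FRACTION


def is_interpretable(n_contributors, peak_heights):
    """Apply both criteria: number of contributors and signal strength."""
    return n_contributors <= MAX_CONTRIBUTORS and not has_weak_signal(peak_heights)


strong = [900, 850, 420, 390, 1200, 600, 700, 800, 650, 500]  # 1 of 10 below 400
weak = [300, 350, 900, 1000]                                   # 2 of 4 below 400
```

The number of contributors is an input here because it is estimated by the expert; the sketch does not try to infer it.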
Benefits of profile classification
● Conditioning and quality control of alleles imported into the LIMS
● Monitoring the quality of the casework workflow
● The classification defines what we can do with the genetic profiles during the interpretation phase
● N.B.: the classification labels are included in the GMP allelic tables
Interpretation of Genetic Profiles
● Definition
  ● Deconvolution: determine a minor and a major contributor from a casework profile, with or without reference profiles
  ● Characterization: determine whether 2 casework profiles come from the same individual
  ● Comparison: determine whether 1 casework profile corresponds to a reference profile
Characterization of casework profiles (1)
● Characterization is important to give intelligence data to the investigators
  ● How many different casework profiles are found in one case?
  ● We would like to associate characterization with a statistical weight
  ● Same kind of challenge with national DNA database matches between casework profiles
Characterization of casework profiles (2)
● Characterization is trickier than comparison from a statistical point of view
  ● Comparison: casework profile (uncertainty: drop-in, drop-out) vs. reference profile (good-quality profile = no uncertainty)
  ● Characterization: casework profile (uncertainty) vs. casework profile (uncertainty)
Interpretation Workflow (1)
● Paperless approach
● Software proposes – human decides
● Depending on the classification label, we choose a software tool
  ● Single-source profiles = OpenOffice spreadsheet to calculate RMP
  ● MAD 2 = GMP ID-X 1.5 for deconvolution
  ● MAC 2 or 3 = LR Mix Studio to calculate LR
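The label-to-tool rule on this slide amounts to a simple lookup, sketched below in Python. The label strings and the `choose_tool` function are hypothetical; only the mapping itself comes from the slide. In keeping with "software proposes, human decides", the function only returns a proposal.

```python
# Mapping taken from the slide; the label spellings are assumptions.
TOOL_BY_LABEL = {
    "SINGLE_SOURCE": "spreadsheet (RMP)",
    "MAD2": "GeneMapper ID-X 1.5 (deconvolution)",
    "MAC2": "LR Mix Studio (LR)",
    "MAC3": "LR Mix Studio (LR)",
}


def choose_tool(label):
    """Propose a software tool for a classification label.

    Returns None when no tool applies (for example, a profile classified
    as not interpretable); the expert makes the final decision.
    """
    key = label.upper().replace(" ", "")
    return TOOL_BY_LABEL.get(key)


proposal = choose_tool("MAC 3")
print(proposal)
```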
Interpretation Workflow (2)
● Automated transfer of allelic values from GMP ID-X to the spreadsheet and LR Mix Studio
● Upload of the LR Mix Studio statistical report to the LIMS
LR Mix Studio Internal Validation (1)
● We are not statisticians:
  ● We choose statistical software with models published in peer-reviewed articles
  ● Checking simple formula calculations by hand
  ● Modification of parameters
    ● E.g.: if you increase theta, the LR must decrease for a true contributor
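The theta check can be reproduced by hand on the simplest case. The Python sketch below uses the standard theta-corrected match probabilities (the Balding-Nichols correction, as in NRC II recommendation 4.10) for a single-source profile at one locus, where LR = 1 / match probability. It is a hand-check under stated assumptions, not the computation performed by LR Mix Studio, and the allele frequencies are hypothetical.

```python
def match_probability(p_a, p_b=None, theta=0.0):
    """Theta-corrected genotype match probability at one locus.

    Homozygote a/a:   [2t + (1-t)p_a][3t + (1-t)p_a] / [(1+t)(1+2t)]
    Heterozygote a/b: 2[t + (1-t)p_a][t + (1-t)p_b] / [(1+t)(1+2t)]
    """
    denom = (1 + theta) * (1 + 2 * theta)
    if p_b is None:
        return (2 * theta + (1 - theta) * p_a) * (3 * theta + (1 - theta) * p_a) / denom
    return 2 * (theta + (1 - theta) * p_a) * (theta + (1 - theta) * p_b) / denom


def single_source_lr(p_a, p_b=None, theta=0.0):
    """LR for a matching single-source profile: 1 / match probability."""
    return 1.0 / match_probability(p_a, p_b, theta)


# Hypothetical heterozygote with allele frequencies 0.10 and 0.20.
thetas = [0.0, 0.01, 0.03]
lrs = [single_source_lr(0.10, 0.20, theta=t) for t in thetas]
for t, lr in zip(thetas, lrs):
    print(f"theta = {t:.2f}  LR = {lr:.2f}")
```

With these frequencies the LR starts at 25 for theta = 0 and falls as theta grows, which is the behaviour the validation expects for a true contributor.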
LR Mix Studio Internal Validation (2)
● Artificial mixed samples with known contributors
● Analysis of closed-case profiles with the new software, with included and excluded contributors
● Simulation functions are very difficult to validate
  ● Not sufficiently documented in articles or in the manual
  ● E.g.: Monte Carlo simulations for the drop-out sensitivity study
Reporting Statistical Interpretation (1)
● Large figures are really confusing for investigators and lawyers
● Order of magnitude is preferable
  ● E.g.: the Richter scale for earthquakes
● Reporting of log(1/RMP) and log(LR)
  ● We would like to call this value the Likelihood Index
  ● This value will be limited to the interval [-15, 15]
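A minimal Python sketch of the proposed Likelihood Index follows. It assumes the logarithm is base 10, which is what "order of magnitude" and the Richter-scale analogy suggest but the slide does not state explicitly; the function name is hypothetical.

```python
import math


def likelihood_index(lr, bound=15.0):
    """Base-10 logarithm of an LR (or of 1/RMP), limited to [-bound, bound]."""
    if lr <= 0:
        raise ValueError("the LR must be strictly positive")
    return max(-bound, min(bound, math.log10(lr)))


index_mixture = likelihood_index(1e12)           # LR of a mixture comparison
index_capped = likelihood_index(1e20)            # very large LR, capped at the bound
index_single_source = likelihood_index(1 / 1e-9) # single source: use 1 / RMP
```

Capping the value keeps reports readable and avoids quoting figures far beyond what the underlying frequency data can support.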
Reporting Statistical Interpretation (2)
● Summary reporting with tables

Profile number | Hp hypothesis    | Hd hypothesis    | Likelihood Index
1              | Suspect + Victim | Unknown + Victim | 12
2              | Suspect          | Unknown          | 15
The Likelihood Index is a statistical value that expresses, on a logarithmic scale, how much more probable the casework genetic profiles are if Hp is true than if Hd is true.
The Likelihood Index lies between -15 and 15.