HUMAN GENOMIC DATA
Considerations for Next-Generation Computational Services
CASC, 31 March 2015
Matthew Trunnell, Broad Institute

1000 Genomes: A Deep Catalog of Human Genetic Variation
The goal of the 1000 Genomes Project is to find most genetic variants that have frequencies of at least 1% in the populations studied.
http://www.1000genomes.org/

The Cancer Genome Atlas (TCGA) is a comprehensive and coordinated effort to accelerate our understanding of the molecular basis of cancer through the application of genome analysis technologies, including large-scale genome sequencing.
http://cancergenome.nih.gov/

Human Microbiome Project (HMP) [has] the mission of generating resources enabling comprehensive characterization of the human microbiota and analysis of its role in human health and disease.
http://nihroadmap.nih.gov/hmp/

Cost to sequence a genome
[Figure: cost axis with ticks at $1,000,000,000, $10,000,000, $100,000, $1,000, and $10; annotated ~100,000x]

Network storage capacity: 21 petabytes
Computational resource: 8000 cores

Considerations for genomic data
• Regulatory issues
• Ethical issues
• Technical issues

The regulatory landscape is complex
• General data privacy laws
  – e.g., EU Data Protection Directive
• Protection of personal health information
  – e.g., HIPAA
• Protection of human subjects in research
  – e.g., the “Common Rule” (45 CFR 46 subpart A)
None of these directly addresses genomic data.

Regulatory Issues | Ethical Issues | Technical Issues

Human genomic data is special
• Your genome encodes a lot of information about you.
  – Physical attributes
  – Risks of disease
  – Ancestry
  – …
• Your genome contains information about your parents.
• Your genome contains information about your offspring.

Genomic data is not generally PHI
Divorced from all of the HIPAA identifiers and other patient information, genomic data is considered to be de-identifiable.

Genomic data is not de-identifiable

Ethical protection of human subjects
• 1947 Nuremberg Code
• 1964 Declaration of Helsinki (revised 1975)
• 1979 Belmont Report
Key principles governing research with human subjects
• Respect for persons
• Beneficence
• Justice

Informed consent
• The consent process allows research subjects to place restrictions on how their data may be used.
• Data use may be limited to
  – Specific diseases
  – Non-commercial use
  – Special populations

Use of genomic data is context-dependent
• Permissions: business rules
  – “Since the trial is still ongoing, I don’t want anyone to see it.”
• Consent: ethical rules
  – “The donor wants her data used only for non-profit cancer research.”

Tracking consent is important

Maintaining genomic privacy may be important

Understanding one genome requires many genomes.

Approaches to maintaining data privacy
Homomorphic encryption
Compute directly on encrypted data: operation(n) == decrypt(operation'(encrypt(n)))
Secure multi-party computation
Perform a joint computation across distributed private data without sharing those data.
http://www.humangenomeprivacy.org/2015/about.html

Files must give way to APIs
At large scale, the file/folder model for managing data on computers becomes ineffective as a human interface, and eventually a hindrance to programmatic access.
The solution: object storage + metadata.
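
The "object storage + metadata" idea can be sketched in a few lines of Python. This is only an illustrative toy, not any particular product's API: the class name ObjectStore, its put/get/find methods, and the metadata keys (sample, assay, consent) are all hypothetical. The point is that objects are addressed by an opaque ID and located by querying metadata, rather than by walking a folder hierarchy.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class ObjectStore:
    """Toy in-memory object store (hypothetical): opaque IDs plus metadata."""
    _blobs: dict = field(default_factory=dict)
    _meta: dict = field(default_factory=dict)

    def put(self, data: bytes, **metadata) -> str:
        # The object ID is derived from the content, not from a folder path.
        object_id = hashlib.sha256(data).hexdigest()
        self._blobs[object_id] = data
        self._meta[object_id] = dict(metadata)
        return object_id

    def get(self, object_id: str) -> bytes:
        # Retrieval is by ID only; there is no directory to navigate.
        return self._blobs[object_id]

    def find(self, **criteria) -> list:
        # Return the IDs of objects whose metadata matches every criterion.
        return sorted(
            oid for oid, meta in self._meta.items()
            if all(meta.get(key) == value for key, value in criteria.items())
        )


# Made-up example records; the metadata keys are assumptions for illustration.
store = ObjectStore()
id_s1 = store.put(b"ACGT", sample="S1", assay="WGS", consent="cancer-research")
id_s2 = store.put(b"TTGA", sample="S2", assay="WGS", consent="general-research")
id_s3 = store.put(b"GGCC", sample="S3", assay="RNA-seq", consent="cancer-research")

# Programmatic access: ask for data by what it is, not where it lives.
wgs_cancer = store.find(assay="WGS", consent="cancer-research")
```

Because consent terms can be stored as metadata alongside each object, the same query interface that locates data could also, in principle, carry the data use restrictions discussed earlier.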

Standards are needed for genomic data
“The mission of the Global Alliance for Genomics and Health is to accelerate progress in human health by helping to establish a common framework of harmonized approaches to enable effective and responsible sharing of genomic and clinical data, and by catalyzing data sharing projects that drive and demonstrate the value of data sharing.”

Considerations for computing services
• Provide human subjects training for IT and research personnel
• Maintain information security controls commensurate with HIPAA
• Invest in data engineering to develop new capabilities
• Develop approaches to access control and auditing that address data use restrictions
• Create data services that transcend filesystems

THANK YOU