Three Critical Ideas for UC Health Sciences Cyber Infrastructure

Joe Hesse - [email protected]
Director of Innovation, UCSF Memory and Aging Center
Technical Lead, UCSF Neuroscience Knowledge Network
HPC Cluster Administrator, UCSF Institute for Human Genetics
3/23/2015

Driving “Cyber” Needs for Health Science

In discovering causes and developing treatments for disease, in promoting health, encouraging prevention, and delivering care, we fundamentally need:

1. To reason, compute, and discover as health professionals, clinical researchers, social scientists, and basic scientists, in any combination of roles, at any time.
2. To harness agile, cost-effective, and easy-to-use computational infrastructure throughout the full lifecycle of our research and clinical activities.
3. To interact with colleagues through ubiquitous and collaborative “data science” environments characterized by rich data and method annotations, secure and audited sharing, and transformational communication methods.

Three Critical Ideas for UC Health Sciences Cyber Infrastructure - Joe Hesse 3/24/15

Idea One: Develop and Deliver Regulatory Compliant Service Layers for Research Cyber Infrastructure

A Fundamental Problem: We practically force clinical researchers to abandon their clinical role (and access to their patients' data) before they can use most research computational infrastructure. Asking medical professionals to truly “de-identify” data to meet compliance standards (with the associated legal liabilities) is impractical. The result is a lot of “don't ask, don't tell” behavior.

A Key Opportunity: Current technology trends (e.g. agile dev-ops, platforms as a service, software-defined everything) and maturing open source tools (often with enterprise options) make it practical to develop and deliver complex security and monitoring for research infrastructure at commodity prices using existing university IT capabilities.

Delivering Regulatory Compliance through Mgmt & Orchestration

HPC Pilot Project FY 2015/16 (funding decision pending)

Designing a new unified management and orchestration layer and providing a single portal for access to three distinct high performance computing clusters (each currently serving a distinct user community).

Opportunity to intentionally design security, monitoring, and auditing layers with regulatory compliance as a target.

There are many benefits, but a key driver is reducing user and administration costs by standardizing the most complex aspects of the environments.

High Performance Computing: Simple Schematic of Layers

Common tools and service layers to support distinct HPC workloads and hardware

Idea Two: Create Continuum of Research Cyber Infrastructure to Support the Complete User Investigatory Experience

Fundamental Barriers / Problems: Most research computational infrastructure is organized around technology stacks that aim to meet cohering subsets of user needs, but that fall short of addressing the complete analytic and discovery lifecycle needs of complex investigations. Users often find it difficult to use idiosyncratically developed technology solutions; find it impossible to navigate between these infrastructural silos; are usually unable to apply their funding in an agile manner across support organizations; and frequently lack knowledge about the most appropriate tools.

Key Opportunity: Extend common, regulatory compliant management and orchestration layers to the full continuum of research cyber technologies.
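
The idea of one common, compliance-oriented layer in front of several distinct compute environments can be illustrated with a minimal Python sketch. Everything named here (`Portal`, `AuditEvent`, the backend names, and the permission table) is hypothetical and chosen only for illustration; it is not the design of the pilot project described above. The point of the sketch is only that authorization and auditing live in one place, regardless of which backend runs the job.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Callable, Dict, List, Set


@dataclass
class AuditEvent:
    """One audited action: who tried what, where, and whether it was allowed."""
    user: str
    backend: str
    action: str
    allowed: bool
    timestamp: str


def make_backend(name: str) -> Callable[[str], str]:
    """Stand-in for a real cluster or workstation service (hypothetical)."""
    def run(job: str) -> str:
        return f"{name} accepted {job}"
    return run


@dataclass
class Portal:
    """Hypothetical single entry point shared by all compute backends."""
    backends: Dict[str, Callable[[str], str]]
    permissions: Dict[str, Set[str]]  # user -> backends the user may access
    audit_log: List[AuditEvent] = field(default_factory=list)

    def submit(self, user: str, backend: str, job: str) -> str:
        allowed = (backend in self.backends
                   and backend in self.permissions.get(user, set()))
        # Record every attempt, including denied ones, before doing any work.
        self.audit_log.append(AuditEvent(
            user=user,
            backend=backend,
            action=f"submit:{job}",
            allowed=allowed,
            timestamp=datetime.now(timezone.utc).isoformat(),
        ))
        if not allowed:
            raise PermissionError(f"{user} may not use {backend}")
        return self.backends[backend](job)


# Illustrative setup: three backends behind one portal (names are made up).
portal = Portal(
    backends={name: make_backend(name)
              for name in ("genetics-hpc", "neuro-hpc", "virtual-workstation")},
    permissions={"alice": {"genetics-hpc", "virtual-workstation"}},
)
print(portal.submit("alice", "genetics-hpc", "align-reads"))
```

Because the audit record is written before the permission check raises, denied requests are logged as well as successful ones, which is the kind of behavior an auditing layer would typically need.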

Vision of Continuum of Research Cyber Infrastructure

Common portal, access, and billing tools streamline the user experience.

For example, using a secure reporting station to 1) query the EMR or research database for correlative variables to 2) drive an exploratory neuroimaging analysis using a large virtual workstation that will 3) become a pipelined HPC or GPU cluster analytic applied retrospectively to 1000s of image studies is not only possible but commonplace.

Idea Three: Use Novel Collaboration and Data Science Environments to Bridge the Huge Data, Method, and Knowledge Divides

Fundamental Problems / Barriers: With the increasing size, complexity, and privacy / ethical concerns of our health sciences / biomedical data, siloed research infrastructure stacks become true islands without any chance of meaningful integration for users. Data portability is extremely difficult and frequently insecure. Reproducibility of results and cleanly annotated analytic provenance of derived data products remain elusive. Simply adopting methods and tools from other labs or collaborators carries enormous startup costs.

Key Opportunity: Support and prioritize development of novel ubiquitous data environments designed for research and collaborative science.

Collaboration and Data Workspace Environments

KBase (www.kbase.us) is the first large-scale bioinformatics system that enables users to upload their own data, analyze it (along with collaborator and public data), build increasingly realistic models, and share and publish their workflows and conclusions. KBase aims to provide a knowledgebase: an integrated environment where knowledge and insights are created and multiplied.
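
The provenance problem raised under Idea Three (cleanly annotated analytic provenance of derived data products) can be made concrete with a small Python sketch. This is an illustration under stated assumptions, not the data model of KBase or any other system named in this deck: `DataObject`, `derive`, `lineage`, and the type and method names are all hypothetical. Each derived object records its type, the method and parameters that produced it, and the content hashes of its inputs, so its history can be traced.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import Dict, List, Sequence, Tuple


@dataclass(frozen=True)
class DataObject:
    """A typed data object annotated with how it was produced (hypothetical)."""
    object_id: str                 # SHA-256 of the content
    data_type: str                 # e.g. "imaging.raw" (made-up type name)
    method: str                    # name of the step that produced the object
    parameters: str                # JSON-encoded parameters of that step
    derived_from: Tuple[str, ...]  # object_ids of the direct inputs


def derive(data_type: str, payload: bytes, parents: Sequence[DataObject],
           method: str, parameters: dict) -> DataObject:
    """Wrap a payload together with its provenance annotations."""
    return DataObject(
        object_id=hashlib.sha256(payload).hexdigest(),
        data_type=data_type,
        method=method,
        parameters=json.dumps(parameters, sort_keys=True),
        derived_from=tuple(p.object_id for p in parents),
    )


def lineage(obj: DataObject, registry: Dict[str, DataObject]) -> List[str]:
    """Return the ids of all ancestors of obj, nearest ancestors first."""
    ancestors: List[str] = []
    stack = list(obj.derived_from)
    while stack:
        parent_id = stack.pop()
        if parent_id in ancestors:
            continue  # already visited through another path
        ancestors.append(parent_id)
        stack.extend(registry[parent_id].derived_from)
    return ancestors


# Illustrative three-step chain; the payloads are placeholders, not real data.
raw = derive("imaging.raw", b"raw scan bytes", [], "upload", {})
mask = derive("imaging.mask", b"mask bytes", [raw], "skull_strip",
              {"threshold": 0.5})
volumes = derive("table.volumes", b"volume table bytes", [mask],
                 "measure_volumes", {"atlas": "example-atlas"})
registry = {o.object_id: o for o in (raw, mask, volumes)}
print(lineage(volumes, registry))
```

Identifying objects by a hash of their content is one possible design choice; it means the same input always maps to the same id, which makes it easier to check whether two analyses really started from the same data.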

Collaboration and Data Workspace Environments

KNECT (inspired by KBase) is a prototype knowledge network environment for precision medicine. KNECT aims to provide a common data workspace with:
• richly typed and annotated data objects,
• tightly integrated, scalable data science cluster and service technologies (e.g. Spark, Docker),
• clinically compliant security and auditing frameworks.

Vision for Complete Data Science Environments

• Collaboration spaces / knowledge networks / open science environments layer on top of a unified continuum of technologies to connect investigators and investigations.
• Ubiquitous hyper-converged data environments underneath the full technology stack enable complete data portability and scientific agility.

Summary of Ideas for Cyber Infrastructure Health Science Priorities

1) Build regulatory compliance at the foundation of research infrastructure.
2) Emphasize a unified user experience across the continuum of research computational tools needed for translational health science discovery and delivery.
3) Prioritize support for novel collaboration and data environments that connect investigators and investigations.

Acknowledgements and Appreciation

Funding Sources and Key Supporters
Dr. Keith Yamamoto, UCSF Vice Chancellor for Research
Dr. Bruce Miller, Director, UCSF Memory and Aging Center
Tau Consortium, www.tauconsortium.com
Dr. Neil Risch, Director, UCSF Institute for Human Genetics

Colleagues and Inspiration
Dr. Kate Rankin, UCSF Memory and Aging
Brad Dispensa, UCSF Institute for Human Genetics
Dr. Adam Arkin, LBNL, UC Berkeley, KBase
Michael Schaffer, Dir. of Tech., UCSF Memory and Aging
Joe Bengfort, UCSF Chief Information Officer

Contact Info
Joe Hesse - [email protected]
Office: 415-502-0590
Mobile: 415-819-1054

UCSF Sandler Neurosciences Center 3/24/15