Big data meets Darwin’s “entangled bank” – Changing perspectives on plant distributions
Robert Peet
US-NVC, SEEK, BIEN, IAVS & EVS

Site data: e.g. climate, soils, topography, etc.
Taxon attribute data: e.g. phylogeny, distribution, life-history, functional attributes, etc.
Occurrence data: attributes of individuals and taxa that occur at a site.

Ecoinformatics vision
Mobilization of biodiversity data
Examples over 15 years
• VegBank and the US-NVC (1997-2015)
• IAVS (2004-2015)
• SEEK (2003-2006)
• BIEN (2008-2015)

Taxonomic database challenge
The well-known problem: Integration of data from different times & places, by multiple investigators using varied taxonomic standards.
The well-known solution: Identifications to taxon concepts that have mapped relationships to related concepts.

[Figure: Distribution across AZ, NM, CO, WY, MT, AB, eBC, wBC, WA, OR. USDA - ITIS: Abies lasiocarpa var. arizonica, Abies lasiocarpa var. lasiocarpa. Flora North America: Abies bifolia, Abies lasiocarpa. Minimal concepts: A, B, C.]

Andropogon virginicus complex
9 elemental units; 17 concepts, 27 scientific names

1993: ESA Vegetation Panel & FGDC Mandate
1997: FGDC Standard, Version 1
1997: ESA Vegetation Panel Data Committee
2000: VegBank (ESA) & BIOTICS (NatureServe)
2008: FGDC Standard, Version 2
2013: MOU with USFS, USGS, ESA, NatureServe
www.vegbank.org

VegBank – the ESA Plot database
• The ESA Vegetation Panel maintains VegBank (www.vegbank.org) as a public vegetation plot archive.
• VegBank is expected to function in a manner analogous to GenBank.
• Primary data will be deposited for reference, novel synthesis, and reanalysis.
• Globally, many plot databases.
Core elements of VegBank
• Project
• Plot
• Plot Observation
• Taxon / Individual Observation
• Taxon Interpretation
• Plot Interpretation
The basis for VegX, the new international XML data exchange standard

Interpretation
Plants
• Tax Interpretation
• Taxon Alt
• Linkage to concept database
Communities
• Class event
• Multiple interpretations

European Vegetation Survey, GIVD, VegX, sPlot, Arctic Vegetation Archive
Wiser, Spencer, De Caceres, Kleikamp, Boyle & Peet. 2011. J. Vegetation Science 22: 598-609.
>2,200,000 plot records in >125 databases
Dengler, Peet, et al. 2011. J. Vegetation Science 22: 582-597.

Science Environment for Ecological Knowledge (SEEK)
Multidisciplinary project to create:
• Scientific-workflow system (Kepler): design, reuse, and execute scientific analyses
• Distributed data network (EcoGrid): environmental, ecological, and systematics data
• KR & Semantic Mediation: discover, integrate, and compose hard-to-relate data and services via ontologies
• Taxonomic concept services: resolve taxon ambiguities
Collaborators (the SEEK team): NCEAS, UNM, SDSC/UCSD, U Kansas, Vermont, Napier, ASU, UNC

Benefits of the Taxonomic Object Service
• Allows integration of ecological datasets
• Allows taxonomists to author new ideas, make new connections
• Allows all researchers to see previous taxonomic opinions
• Provides a stable identification system to reference taxon concepts

• Document and manage taxon concepts from multiple sources
• Document and manage concept relationships from multiple sources
• Input data files as txt, xls, mdb, or TCS-XML
• Export data as txt, mdb, or TCS-XML

Case study: Southeast US
1. Regional floras obsolete and incomplete. Need for an updated atlas of the flora of the Southeast
2. Datasets with inconsistent taxonomic concepts have defied integration
3. 65,000 concept relationships & 2,000,000+ taxon occurrences
http://www.herbarium.unc.edu/seflora/firstviewer.htm

[Figure: occurrence data sources NCU, RAB, USDA, CVS]

Carya carolinae-septentrionalis
According to Radford 1968, USDA PLANTS v 4.0, & Weakley 2008:
• Carya carolinae-septentrionalis
• Carya ovata
According to Stone 1997 in FNA:
• Carya ovata var. australis
• Carya ovata var. ovata

Weakley 2005 – Reference concepts
Radford 1968 – Concepts mapped
NC Heritage Program – Weakley concepts
CVS – Weakley concepts (mostly)
USDA – Kartesz 1999 concepts (mostly)
NCU & NCSC – Nominal concepts only

Most museum collection identifications must be interpreted as nominal concepts! To do otherwise would be to introduce false positives.
Some nominal occurrences might or might not represent the taxon Carya carolinae-septentrionalis.

• Choice of primary authority not available
• Versioning by date not available
• Many collections and observations not digital
• Most collections still identified as nominals

BIEN Working Group (2008–2014)
Ecologists, Informaticians, Plant Taxonomists
Botanical Information and Ecology Network 2012

Principal Investigators
• Brad Boyle, U Arizona
• Richard Condit, STRI
• Steven Dolins, Bradley U
• Brian Enquist, U Arizona
• Robert Peet, U North Carolina
• Mark Schildhauer, NCEAS
• Barbara Thiers, NY Bot Garden

Core Participants
• John Donoghue, U Arizona
• Peter Jorgensen, Missouri Bot Garden
• Nathan Kraft, U Maryland
• Aaron Marcuse-Kubitza, NCEAS
• Brian McGill, U Maine
• Naia Morueta-Holme, Aarhus U, DK
• Martha Narro, iPlant
• Bill Piel, Yale U
• Jim Regetz, NCEAS
• Brody Sandel, Aarhus U, DK
• Irena Simova, Charles U, CZ
• Nick Spencer, Landcare, NZ
• Jens C. Svenning, Aarhus U, DK
• Cyrille Violle, CNRS, FR
• Susan Wiser, Landcare, NZ

Goal: Understand and predict the occurrence and co-occurrence of plant taxa in the New World.
Approach: Collect in one database all the known plant occurrence and co-occurrence data.
>14,000,000 plant occurrence records, representing ~100,000 taxa.
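
The Carya case study above can be made concrete with a small sketch of why a nominal identification (a name with no stated authority) is ambiguous. This is only an illustration, not VegBank or SEEK code. The class and function names are hypothetical, and the single "includes" relationship is an assumption read from the figure: that Carya ovata as circumscribed by Stone 1997 is broad enough to contain both of the Radford 1968 concepts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Concept:
    """A taxon concept: a scientific name as circumscribed by one authority."""
    name: str
    according_to: str


# Concepts named on the Carya slide (only the authorities needed for the example).
CCS_RADFORD = Concept("Carya carolinae-septentrionalis", "Radford 1968")
OVATA_RADFORD = Concept("Carya ovata", "Radford 1968")
OVATA_STONE = Concept("Carya ovata", "Stone 1997")
CONCEPTS = [CCS_RADFORD, OVATA_RADFORD, OVATA_STONE]

# ASSUMPTION (illustrative): the broad Stone 1997 concept includes both
# narrower Radford 1968 concepts.
INCLUDES = {OVATA_STONE: {CCS_RADFORD, OVATA_RADFORD}}


def relation(concept, target):
    """Relate one concept to the target concept."""
    if concept == target:
        return "congruent"
    if target in INCLUDES.get(concept, set()):
        return "broader"  # the record may or may not be the target taxon
    return "disjoint"


def interpret(name, target, according_to=None):
    """Interpret an identification against a target concept.

    With an authority, exactly one concept is meant. Without one (a nominal
    identification), every concept that carries the name is a candidate.
    """
    candidates = [
        c for c in CONCEPTS
        if c.name == name and according_to in (None, c.according_to)
    ]
    if not candidates:
        return "unknown"
    relations = {relation(c, target) for c in candidates}
    return relations.pop() if len(relations) == 1 else "ambiguous"


# A specimen labeled only "Carya ovata" cannot be placed with certainty.
nominal = interpret("Carya ovata", CCS_RADFORD)
# The same name with its authority stated is unambiguous.
with_authority = interpret("Carya ovata", CCS_RADFORD, "Radford 1968")
```

Under these assumptions, the nominal record comes back as "ambiguous" while the record with a stated authority comes back as "disjoint", which is the false-positive risk the slide warns about when nominal identifications are treated as concept-level ones.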
[Diagram: BIEN workflow. Labels: Data Sources; Cyberinfrastructure; Plot and Trait Data; Specimen Data; Exchange schema; Taxonomic and phylogenetic intelligence; Data scrubbing, correcting, data standardization & feedback tools; Database; BIEN 2.0; Data Discovery; Confederated resource; BIEN 3.0; Science; Deliverables]

Use of raw biodiversity observation data will yield seriously erroneous results

[Figure: BIEN Plant Species Richness (Pre-scrubbing)]
http://tnrs.iplantcollaborative.org/

BIEN 2.0 New World Summary (post Geo + Taxonomic scrubbing)

Plot Data
- CTFS
- FIA (conservative)
- Madidi plots
- VegBank
- TEAM
- SALVIAS
- Many others . . .

Herbarium and Observation Data
- GBIF
- MOBOT
- NYBG
- CRIA (Brazil collections)
- Arizona
- UNC, NCS etc.
- REMIB (Mexico)
- Utrecht
- Many others . . .

Specimens = 9,345,197
Species = 92,788
Observations = 12,171,014
Plots = 329,741

Plant Traits
- Numerous literature sources
- BIEN researcher data

Traits = 27
Trait observations = 109,659

bien.nceas.ucsb.edu/bien/
BIEN 2 data and deliverables (geographic range maps & phylogeny) are available for use!

[Figure: BIEN Plant Species Richness from Geographic Range Overlap. Legend: Number of Plant Species]

BIEN 3.0 Schema
A perpetual motion machine?
Individual data sources, or the entire database, can be loaded and re-loaded rapidly, allowing updates as data sources are modified or grow, or new sources are acquired.
Aaron Marcuse-Kubitza, NCEAS
Brad Boyle, UofA
BIEN 3.0 results soon to come!

Ecoinformatics vision
Mobilization of biodiversity data
Examples over 15 years
• VegBank and the US-NVC (1997-2015)
• IAVS (2004-2015)
• SEEK (2003-2006)
• BIEN (2008-2015)

We are pleased to acknowledge the support and cooperation of Gap Analysis Program