Margo Seltzer, Mercè Crosas, Gary King

Integrating Data Citation with Provenance
What is Dataverse?
http://dataverse.org
@dataverseorg
github.com/IQSS/dataverse
Dataverse Community Google Group

Data Citation with Dataverse
•  Integrates with the DataCite service for minting persistent identifiers (DOIs) and registering dataset citation metadata
•  Authors, Year, Dataset Title, DOI, Data Repository, UNF, version
•  Attribution to data authors and distributors
•  Fingerprint (UNF) to verify the dataset, and version to specify what data are being referenced. The UNF does not depend on the data format.

What we propose
•  Describe Provenance with the code used to transform or merge datasets
•  Represent Provenance as a graph, connecting the code with the input and output datasets (using DOIs)
•  Include the Provenance graph in Dataset metadata

Anticipated Collaborations:
•  With DataCite, enhance their metadata schema to support the provenance graph
•  With USENIX, integrate their open access repository with the citation + provenance work from this project

Provenance examples with R Code
[Figure: provenance graphs, each read as input -> code -> output]
•  Repository table, Usage table -> merge(repo, usage, by = "repo") -> Merged table
•  Weekly data csv -> ts(data$data.science, start=2004, frequency = 52) -> weekly -> Weekly plot
•  weekly -> aggregate.ts(weekly, FUN=sum) -> monthly -> Monthly plot
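The merge example can be sketched as a small provenance graph in R. This is a minimal illustration under stated assumptions, not the project's implementation: the DOIs are hypothetical placeholders, the two input tables hold made-up values, and the edge-list layout (columns from, to, relation) is only one possible way to record the graph.

```r
# Hypothetical placeholder DOIs (assumption); real ones would be minted via DataCite.
repo_doi   <- "doi:10.5072/FK2/REPO"
usage_doi  <- "doi:10.5072/FK2/USAGE"
merged_doi <- "doi:10.5072/FK2/MERGED"

# Toy input tables with made-up values, standing in for the
# "Repository table" and "Usage table" on the poster.
repo  <- data.frame(repo = c("a", "b", "c"),
                    language = c("R", "Python", "R"),
                    stringsAsFactors = FALSE)
usage <- data.frame(repo = c("a", "b", "c"),
                    downloads = c(10, 25, 7),
                    stringsAsFactors = FALSE)

# Keep the transformation as text so the same string can be run
# and stored as a node in the provenance graph.
code   <- 'merge(repo, usage, by = "repo")'
merged <- eval(parse(text = code))

# Provenance graph as an edge list: input datasets -> code -> output dataset.
provenance <- data.frame(
  from     = c(repo_doi, usage_doi, code),
  to       = c(code, code, merged_doi),
  relation = c("input", "input", "output"),
  stringsAsFactors = FALSE
)

print(merged)
print(provenance)
```

Storing the code as a string keeps the graph self-describing: each edge names either a dataset DOI or the exact expression that consumed or produced it, which is the information the proposal wants carried in the dataset metadata.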