B2SAFE - EUDAT

B2SAFE
How to replicate your data
Version 2
B2SAFE – A robust, safe & highly
available data replication service…
…for community and
departmental
repositories
Allowing community and
departmental repositories to
easily replicate their research data
Aggregating data from different
communities into large data centres
Introducing economies of scale
Raising new possibilities for
interdisciplinary data-intensive
science.
www.eudat.eu | http://www.eudat.eu/b2safe
B2SAFE Training
Better safe than sorry…
“I want to replicate my collection X to two
data centres and store the collection safely
for 10 years”.
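A request like the one quoted above can be written down as structured data, which makes it easier to check later. The following is only a minimal sketch: the class name, field names, centre names and start date are hypothetical and chosen for illustration. They are not part of B2SAFE.

```python
from dataclasses import dataclass
from datetime import date
from typing import Set, Tuple


@dataclass(frozen=True)
class ReplicationPolicy:
    """Hypothetical record of a replication request (not a B2SAFE type)."""
    collection: str
    target_centres: Tuple[str, ...]
    retention_years: int
    start: date

    def retain_until(self) -> date:
        # Same month and day, shifted by the retention period.
        # Note: a start date of 29 February would need special handling.
        return self.start.replace(year=self.start.year + self.retention_years)

    def is_satisfied_by(self, replica_sites: Set[str]) -> bool:
        # Every requested data centre must hold a replica.
        return set(self.target_centres) <= replica_sites


# "Replicate my collection X to two data centres and store it for 10 years."
policy = ReplicationPolicy(
    collection="X",
    target_centres=("centre-a", "centre-b"),  # assumed names
    retention_years=10,
    start=date(2014, 1, 1),                   # assumed start date
)
```

Calling policy.is_satisfied_by(...) with the set of sites that currently hold a replica shows whether the request is met.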
In today’s rich data-storage ecosystems, large
data centres must offer a robust, safe and
highly available replication service to
allow community and departmental
repositories to replicate their research data
• to guard against data loss in long-term
archiving and preservation,
• to optimize access for users from
different regions, and
• to bring data closer to powerful
computers for compute-intensive
analysis.
Who can benefit?
Small- and medium-sized repositories
which
• do not have the capacity to
store research data and offer
long-term access
• do not have long-term funding
in place for the preservation of
their data
• cannot offer major
computational services on
the stored data for a large
number of users
Researchers – both
data producers and
data consumers.
• Producers know that their data will
be stored safely
• Consumers get access to
optimized services on data
sources of interest to them
• Consumers can use new multidisciplinary,
data-intensive methods, since EUDAT will
collect data from various communities,
creating a cross-disciplinary data
domain which can be exploited
Your data is safe and highly available
The B2SAFE service is based on
the execution of auditable
policy rules and the use of
persistent identifiers (PIDs)
to identify the data objects and
collections
Tomorrow’s corpus of research data will be a
domain of registered data objects and
collections where PIDs identify the data
objects and collections.
Information associated with the PIDs allows
the integrity and authenticity of the data to be
checked.
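As an illustration of how information attached to a PID can support an integrity check, here is a minimal sketch. It assumes the PID record stores a SHA-256 checksum of the data object. The record layout, the function names and the example PID are hypothetical and do not describe the actual B2SAFE implementation.

```python
import hashlib
from typing import Dict


def sha256_of(data: bytes) -> str:
    """Return the SHA-256 checksum of the data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def register(pid_records: Dict[str, dict], pid: str, data: bytes) -> None:
    # Store the checksum of the original object under its PID
    # (assumed record layout, for illustration only).
    pid_records[pid] = {"checksum": sha256_of(data)}


def verify_replica(pid_records: Dict[str, dict], pid: str, replica: bytes) -> bool:
    # A replica is intact if its checksum matches the registered one.
    return pid_records[pid]["checksum"] == sha256_of(replica)


records: Dict[str, dict] = {}
original = b"measurement series 42"
register(records, "hdl:0000/example", original)  # hypothetical PID

intact = verify_replica(records, "hdl:0000/example", original)
corrupted = verify_replica(records, "hdl:0000/example", original + b"!")
```

Here intact is True and corrupted is False: any change to the replica's bytes changes its checksum, so the mismatch is detected.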
Easy access to trusted and authentic
data
• B2SAFE guarantees that
ownership rights of data
remain with the originators
• Service providers replicating
data via community-defined
portals must respect all
access permissions
• Persistent access and
services will be ensured in
the long term
Who can join?
Any community and
departmental data
repositories which have
a repository
infrastructure
The infrastructure must
support PIDs and
metadata describing
the properties and
context of the data
being replicated
Tight integration with the
EUDAT infrastructure
Use iRODS and other
federation technologies
What happens next?
Data from the
Community
repository is
replicated in
other data
centres…
…distributed
across
Europe.
What makes B2SAFE unique?
• Data are stored in the
EUDAT Collaborative
Data Infrastructure (CDI)
with known policies.
Therefore, data are not
stored in an opaque cloud
environment
• EUDAT is building a suite
of additional services
relevant for the “engine
under the hood” of e-science infrastructures
(e.g., EPOS, EMSO,
CLARIN, …)
• Communities do not have
to spend additional effort
& cost on setting up and
maintaining their own “ad
hoc” private cloud
• Data are stored next to
HTC & HPC servers, ideal
for intensive data
processing
EUDAT partners are already using
B2SAFE
B2SAFE – Coming soon in 2014
• Easy to implement
• Easy to use
• Improved accessibility to data
• Highly professional level of
service
• Trusted environment
• Agnostic to the type of data
• Identification of data through
Persistent Identifiers (PIDs)
• Data discovery through
Metadata
Thank you