What is Grid Computing?
Enabling Grids for E-sciencE
Mike Mineter
Training Outreach and Education
National e-Science Centre
[email protected]
www.eu-egee.org
INFSO-RI-508833
You are welcome to re-use these slides. We
ask only that you let us know, by email to
[email protected]
Contents
• Introduction to
– e-Research and e-Science
– Grids
– e-Infrastructure
• Grid concepts
• Grids - Where are we now?
‘e-Science is about global collaboration
in key areas of science, and the next
generation of infrastructure that will
enable it.’
John Taylor
Director General of Research Councils
Office of Science and Technology
Networks + Grids
– Networks connect resources
– Grids enable “virtual computing”
Virtual Observatories
Observations made across entire electromagnetic spectrum
ROSAT ~keV, DSS Optical, 2MASS 2μ, IRAS 25μ, IRAS 100μ, GB 6cm, NVSS 20cm, WENSS 92cm
⇒ e.g. different views of a local galaxy
Need all of them to understand physics fully
Databases are located throughout the world
Peter Clarke
Biomedical Research Informatics
Delivered by Grid Enabled Services
[Diagram labels]
CFG Virtual Organisation
Publicly Curated Data: Ensembl, OMIM, SWISS-PROT, MGI, HUGO, RGD, …
Private data: Glasgow, Edinburgh, Oxford, Leicester, Netherlands, London
VO Authorisation
Information Integrator
DATA HUB
Synteny Grid Service
http://www.brc.dcs.gla.ac.uk/projects/bridges/
DAME: Grid based tools and Infrastructure for Aero-Engine Diagnosis and Prognosis
Engine flight data
London Airport
Airline
office
New York Airport
•“A Significant factor in the success of the Rolls-Royce
campaign to power the Boeing 7E7 with the Trent 1000
was the emphasis on the new aftermarket support service
for the engines provided via DS&S. Boeing personnel
were shown DAME as an example of the new ways of
gathering and processing the large amounts of data that
could be retrieved from an advanced aircraft such as the
7E7, and they were very impressed”, DS&S 2004
Grid
Diagnostics Centre
Maintenance Centre
American data center
European data center
XTO
Companies: Rolls-Royce, DS&S, Cybula
Universities: York, Leeds, Sheffield, Oxford
Engine Model
Case Based Reasoning
Follow-on project: BROADEN
Signal Data Explorer
climateprediction.net and GENIE
• Response of Atlantic circulation to freshwater forcing
• Largest climate model ensemble: >45,000 users, >1,000,000 model years
[Figure labels: 2K, 10K]
UK Grid for Particle Physics
GridPP www.gridpp.ac.uk
ATLAS detectors, 2/3/06
Connecting people: Access Grid
http://www.accessgrid.org/
Cameras
Microphones
What is e-Research?
• Collaborative research that is made possible by the
sharing across the Internet of resources (data,
instruments, computation, people’s expertise...)
– Crosses organisational boundaries
– Often very compute intensive
– Often very data intensive
– Sometimes large-scale collaboration
• Began with focus in the “big sciences” hence
initiatives are often badged as “e-science”
• Relevance of “e-science technologies” to new user
communities (social science, arts, humanities…) led to
the term “e-research”
e-Research and Grids
People with shared goals
Collaborative “virtual computing”
Sharing data, computers, software
Enabled by Grids:
National, regional
International: EGEE grid
Improvised cooperation
Email
File exchange
ssh access to run programs
Enabled by networks:
national, regional and
International: GEANT
Grids: a foundation for e-Research
enabling a whole-system approach
[Diagram labels: Grid; computers; software; sensor nets; instruments; colleagues; Shared data archives]
Diagram derived from Ian Foster’s slide
What is Grid Computing?
• The grid vision is of “Virtual computing” (+ information
services to locate computation, storage resources)
– Compare: The web: “virtual documents” (+ search engine
to locate them)
• MOTIVATION: collaboration through sharing resources
(and expertise) to expand horizons of
– Research
– Commerce – engineering, …
– Public service – health, environment, …
The Grid Metaphor
[Diagram labels]
Mobile Access, Workstation, Visualising
GRID MIDDLEWARE
Supercomputer, PC-Cluster
Data-storage, Sensors, Experiments
Internet, networks
What is e-Infrastructure? – Political view
• A shared resource
– That enables science, research, engineering,
medicine, industry, …
– It will improve UK / European / … productivity
ƒ Lisbon Accord 2000
ƒ E-Science Vision SR2000 – John Taylor
– Commitment by UK government
ƒ Sections 2.23-2.25
– Always there
ƒ c.f. telephones, transport, power, internet
What is e-Infrastructure?
• Grids: permit resource sharing
across administrative domains
• Networks: permit communication across
geographical distance
• Supporting organisations
– Operations for grids, networks
• Resources
– Computers
– Digital libraries
– Research data
– Instruments
• Middleware
– Authentication, Authorisation
– Registries, search engines
– Toolkits, environments
ƒ E.g. for collaboration
[Diagram labels: Collaboration; Grid; Operations, Support and training; Network infrastructure & Resources]
Global Drivers of e-Research
• Digital technology – exponential growth - e.g. bandwidth
• Opportunities for e-Infrastructure to support faster, better,
different research
– Sharing expertise
ƒ Support for cooperation and communication
– Sharing computation services
ƒ E.g. to serve occasional peaks of high demand for computation
(especially trivially parallelisable ones)
– Sharing data
ƒ New sensors and instruments
ƒ Databases
• Based on an infrastructure that requires and enables
multidisciplinary research
ƒ Requires: IT + domain specialists
ƒ Enables: New interdisciplinary research
What is Grid computing?
• The term “Grid” has become popular!
– Sometimes in Industry: “Grids” = clusters
ƒ Motivations: better use of resources; scope for commercial services
– Also used to refer to the harvesting of donated, unused compute
cycles
ƒ (SETI@home, Climateprediction.net)
– These are e-Infrastructure but are not “grids” from the e-Research viewpoint!
Grid concepts
Virtual organisations and grids
• What’s a Virtual Organisation?
– People in different organisations seeking to cooperate and share
resources across their organisational boundaries. E.g. a research
collaboration
• Each grid is an infrastructure enabling one or more “virtual
organisations” to share and access resources
• Key concept: The ability to negotiate resource-sharing
arrangements among a set of participating parties (providers
and consumers) and then to use the resulting resource pool
for some purpose. (Ian Foster)
Typical current grid
• Virtual organisations negotiate with sites to
agree access to resources
• Grid middleware runs on each shared
resource to provide
– Data services
– Computation services
– Single sign-on
• Distributed services (both people and
middleware) enable the grid
[Diagram label: INTERNET]
Grid Middleware
• When using a PC or workstation you
– Login with a username and password (“Authentication”)
– Use rights given to you (“Authorisation”)
– Run jobs
– Manage files: create them, read/write, list directories
• Components are linked by a bus
• Operating system
• One admin. domain

• When using a Grid you
– Login with digital credentials – single sign-on (“Authentication”)
– Use rights given to you (“Authorisation”)
– Run jobs
– Manage files: create them, read/write, list directories
• Services are linked by the Internet
• Middleware
• Many admin. domains
Empowering VO’s
Application
Application toolkits, …
Higher-level grid services (brokering, …)
Basic Grid services: AA, job submission, info, …
Where computer science meets the application communities!
VO-specific developments:
– Portals
– Virtual Research Environments
– Semantics, ontologies
– Workflow
– Registries of VO services
Production grids provide these services.
Example – Biomedical applications
Biomedical community and the Grid, EGEE User Forum, March 1st 2006, I. Magnin
Workflow example
• Taverna in MyGrid http://www.mygrid.org.uk/
• “allows the e-Scientist to describe and enact their experimental
processes in a structured, repeatable and verifiable way”
• GUI
• Workflow language
• enactment engine
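To make the role of an enactment engine concrete, here is a minimal Python sketch. It is not Taverna's workflow language or API: the `enact` function, the step names and the data are all hypothetical, and the only assumption is that a workflow can be described as steps plus the steps each one depends on. The engine then runs every step after its dependencies, which is what makes a run repeatable.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def enact(workflow, steps, inputs):
    """Run each step once all the steps it depends on have finished.

    workflow: dict mapping step name -> set of step names it depends on
    steps:    dict mapping step name -> function(results) -> value
    inputs:   dict of initial values visible to every step
    """
    results = dict(inputs)
    # static_order() yields every step after all of its dependencies.
    order = list(TopologicalSorter(workflow).static_order())
    for name in order:
        results[name] = steps[name](results)
    return order, results


# Hypothetical three-step experiment: fetch a sequence, align it, report.
workflow = {"fetch": set(), "align": {"fetch"}, "report": {"align"}}
steps = {
    "fetch": lambda r: r["sequence_id"].upper(),
    "align": lambda r: f"aligned({r['fetch']})",
    "report": lambda r: f"report on {r['align']}",
}

order, results = enact(workflow, steps, {"sequence_id": "abc123"})
print(order)
print(results["report"])
```

In a real system the steps would be calls to remote grid or web services rather than local functions; the dictionary of dependencies stands in for the workflow language.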
The many scales of grids
Wider collaboration, greater resources:
International grid (EGEE)
National grids (e.g. National Grid Service)
Regional grids (e.g. Brunel/Westminster Grid Alliance)
Campus grids
Desktop
Resources: International instruments, ..; National datacentres, HPC, instruments; Institutes’ data; Condor pools, clusters
Main components
Access service: How users logon to a Grid
Resource Broker (RB): Service that matches the user’s
requirements with the available resources on a Grid
Information System: Characteristics and status of resources
Computing Element (CE): A batch queue on a site’s computers where
the user’s job is executed
Storage Element (SE): provides (large-scale) storage for files
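The matching done by a Resource Broker can be sketched in a few lines of Python. This is an illustration only, not the EGEE middleware: the classes, field names, host names and the ranking rule (most free CPUs first) are assumptions made for the example. The list `information_system` plays the part of the Information System, publishing the characteristics and status of each Computing Element.

```python
from dataclasses import dataclass


@dataclass
class ComputingElement:
    name: str
    free_cpus: int
    max_wall_hours: int
    vos: frozenset  # virtual organisations allowed to use this CE


@dataclass
class Job:
    vo: str
    cpus: int
    wall_hours: int


def match(job, information_system):
    """Return the CEs that satisfy the job, most free CPUs first."""
    candidates = [
        ce for ce in information_system
        if job.vo in ce.vos
        and ce.free_cpus >= job.cpus
        and ce.max_wall_hours >= job.wall_hours
    ]
    return sorted(candidates, key=lambda ce: ce.free_cpus, reverse=True)


# Hypothetical status published by three sites.
information_system = [
    ComputingElement("ce.site-a.example", 8, 24, frozenset({"biomed"})),
    ComputingElement("ce.site-b.example", 64, 48, frozenset({"biomed", "atlas"})),
    ComputingElement("ce.site-c.example", 128, 12, frozenset({"atlas"})),
]

job = Job(vo="biomed", cpus=16, wall_hours=36)
ranked = match(job, information_system)
print([ce.name for ce in ranked])
```

Only `ce.site-b.example` qualifies here: site A has too few free CPUs and site C does not accept the job's VO.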
Who provides the resources?!
Service                | Provider                                  | Note
Access service         | User / institute / VO / grid operations   | Computer with client software
Resource Broker (RB)   | VO / grid operations                      | (No NGS-wide RB exists)
Information System     | Grid operations                           |
Computing Element (CE) | VO / sometimes centralised provision also | Scalability requires that VOs provide resources to match average need
Storage Element (SE)   | ditto                                     | ditto
“VO”: virtual organisation
“Grid operations”: funded effort
Current EGEE grid
[Diagram labels]
User/Grid interface: Input files, Output files
Resource Broker
Information Service
File Replica Catalogue: Datasets info
Logging & Book-keeping
Authorisation & Authentication
Storage Resource
Computing Resource = batch queue
Job Submit Event, Job Query, Job Status, Publish
EGEE is …
• EU-funded project that has established the largest
multi-VO production grid in the world!
• What’s happening now?
http://gridportal.hep.ph.ic.ac.uk/rtm/
• What resources are connected?
http://goc.grid-support.ac.uk/gridsite/monitoring/
Grid security and trust -1
• Providers of resources (computers, databases, ..) need risks to
be controlled: they are asked to trust users they do not know
– They trust a VO
– The VO trusts its members
• Users need
– single sign-on: to be able to logon to a machine that can pass the
user’s identity to other resources
– To trust owners of the resources they are using
• Build middleware on layer providing:
– Authentication: know who wants to use resource
– Authorisation: know what the user is allowed to do
– Security: reduce vulnerability, e.g. from outside the firewall
– Non-repudiation: knowing who did what
• The “Grid Security Infrastructure” middleware is the basis of
(most) production grids
Grid security and trust -2
• Achieved by Certification:
– User’s identity has to be certified by one of the national
Certification Authorities (CAs)
• User
– User joins a VO
– Digital certificate is basis of AA
– Identity passed to resources you use, where it is mapped to a
local account
• Policies express the rights for a Virtual Organization
to use resources
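The chain of checks on this slide can be sketched in Python as follows. This is a toy model, not the Grid Security Infrastructure: real middleware verifies X.509 certificates cryptographically, whereas this sketch only compares strings. All names, distinguished names, VO names and accounts below are invented for the example.

```python
# Hypothetical data held by a resource provider.
ALICE = "/C=UK/O=ExampleCA/CN=Alice Example"
TRUSTED_CAS = {"/C=UK/O=ExampleCA"}                      # CAs the site trusts
VO_MEMBERS = {"biomed": {ALICE}}                         # who belongs to which VO
SITE_POLICY = {"biomed": {"submit_job", "read_file"}}    # rights granted per VO
ACCOUNT_MAP = {ALICE: "biomed001"}                       # identity -> local account


def authorise(subject, issuer, vo, action):
    """Return the local account to run under, or raise PermissionError."""
    if issuer not in TRUSTED_CAS:                    # authentication
        raise PermissionError("certificate not issued by a trusted CA")
    if subject not in VO_MEMBERS.get(vo, set()):     # VO membership
        raise PermissionError("user is not a member of this VO")
    if action not in SITE_POLICY.get(vo, set()):     # authorisation by VO policy
        raise PermissionError("VO policy does not allow this action")
    return ACCOUNT_MAP[subject]                      # mapping to a local account


account = authorise(ALICE, "/C=UK/O=ExampleCA", "biomed", "submit_job")
print(account)
```

The site never needs to know Alice personally: it trusts the CA for her identity and the VO for her membership, and its own policy decides what that VO may do.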
The Role of the Virtual Organisation (VO)
[Diagram labels: Compute Center; VO Service; Compute Center]
slide based on presentation given by Carl Kesselman at GGF Summer School 2004
If “The Grid” vision leads us here…
… then where are we now?
Grid projects - ~ 2003
Many Grid development efforts, all over the world
•NASA Information Power Grid
•DOE Science Grid
•NSF National Virtual Observatory
•NSF GriPhyN
•DOE Particle Physics Data Grid
•NSF TeraGrid
•DOE ASCI Grid
•DOE Earth Systems Grid
•DARPA CoABS Grid
•NEESGrid
•DOH BIRN
•NSF iVDGL

•UK – OGSA-DAI, RealityGrid, GeoDise,
Comb-e-Chem, DiscoveryNet, DAME,
AstroGrid, GridPP, MyGrid, GOLD,
eDiamond, Integrative Biology, …
•Netherlands – VLAM, PolderGrid
•Germany – UNICORE, Grid proposal
•France – Grid funding approved
•Italy – INFN Grid
•Eire – Grid proposals
•Switzerland - Network/Grid proposal
•Hungary – DemoGrid, Grid proposal
•Norway, Sweden - NorduGrid

•DataGrid (CERN, ...)
•EuroGrid (Unicore)
•DataTag (CERN, …)
•Astrophysical Virtual Observatory
•GRIP (Globus/Unicore)
•GRIA (Industrial applications)
•GridLab (Cactus Toolkit)
•CrossGrid (Infrastructure Components)
•EGSO (Solar Physics)
Grids: where are we now?
• Many key concepts identified and known
• Many grid projects have tested, and benefit from, these
• Major efforts now on establishing:
– Production Grids for multiple VO’s
ƒ “Production” = Reliable with commitments to quality of service
• In Europe, EGEE
• In UK, National Grid Service
• In US, Teragrid and OSG
ƒ One stack of middleware that serves many research communities
ƒ Establishing operational procedures and organisation
– Standards (a slow process)
(e.g. Open Grid Forum, http://www.gridforum.org/ )
– Sustainable infrastructure
ƒ Move from research funding to GEANT-like model
ƒ European Grid Infrastructure federating National Grid
Infrastructures
Where are we now? –user’s view
[Diagram labels]
Stages: Research; Pilot projects; Early adopters; Routine production; Unimagined possibilities
Technologies: Networks; Grids; Web
Communities: Sciences, engineering; Arts, Humanities; e-Soc-Sci
Early production grids: International - EGEE
Types of use:
Service-oriented, workflow, “legacy” data
High throughput, new data
Where are we now?!
• Standards are emerging… some near acceptance and some being
discarded
– Standards bodies:
ƒ W3C http://www.w3c.org/
ƒ GGF http://www.ggf.org/
ƒ OASIS http://www.oasis-open.org/home/index.php
ƒ IETF http://www.ietf.org/
– For a (slightly outdated) summary see http://www.innoq.com/soa/ws-standards/poster/
• Production grids are based on de-facto standards at present
– Inevitably!
– GT2 especially
– But locks a grid into one middleware stack unable to benefit from the
diverse developments of new services
National grid initiatives now include…
CroGrid
What are Grids? - Summary
• Grids enable virtual computing across
administrative domains
– Resources share authorisation and authentication
– Resources accessed through abstractions
• Motivations:
– Collaborative research, diagnostics, engineering,
public service, ..
– Resource utilisation and sharing
Infrastructure for research
[Diagram labels: Collaboration; Grid; Operations, Support and training; Network infrastructure & Resource centres]
Further reading
• The Grid Cafe www.gridcafe.org
• Grid Today http://www.gridtoday.com/
• International Science Grid This Week http://www.isgtw.org/
• UK All Hands Meeting http://www.allhands.org.uk/
• National e-Science Centre http://www.nesc.ac.uk
• International School of Grid Computing http://www.issgc.org/
• Open Grid Forum http://www.ggf.org/
• EGEE Conference (next week) http://www.eu-egee.org/egee_events/conference