Globus Presented by: Yayati Kasralikar for CPA 5937

Motivational Example
[Figure: a very large database of cancer images, a high-performance machine, cancer image data mining software, and data preprocessing software are separate resources (R) that exchange cancer images.]
What is Grid?
1. Coordinates resources that are not
subject to centralized control.
2. Uses standard, open, general-purpose
protocols and interfaces.
3. Delivers nontrivial qualities of service.
• Let’s examine some technologies:
– Clusters: centralized control
– P2P systems (e.g. Gnutella): do not use open and standard protocols
– Web: no coordinated use of resources
Why use Grid?
• A biochemist exploits 10,000 computers to
screen 100,000 compounds in an hour.
• 1,000 physicists worldwide pool resources
for peta-op analyses of petabytes of data.
• An insurance company mines data from
partner hospitals for fraud detection.
• An application service provider offloads
excess load to a compute cycle provider
Virtual Organization (VO)
A dynamic set of individuals or institutions sharing
resources for problem solving
[Figure: resources (R) grouped into virtual organizations VO A, VO B, and VO C.]
Grid Characteristics
• Scale and resource selection
– Applications select resources from a very
large collection according to criteria such as
connectivity, cost, security, and reliability
• Heterogeneity at multiple levels
– Heterogeneity ranges from physical devices and system
software to scheduling and usage policies
• Dynamic and unpredictable behavior
– Behavior and performance of shared resources vary
over time
• Multiple administrative domains
– Challenging security problem
Globus Initiative
• Provides basic infrastructure, protocols, services,
APIs, and SDKs for Grid computing.
– Protocols: focus on externals (interactions) rather than
internals (resource characteristics) (e.g. GRIP, IP).
– Services: protocol + behavior (e.g. information service).
– APIs and SDKs: help application developers build
complex applications (e.g. GSS API, JDBC API, JNDI SDK),
improving application robustness and correctness and
lowering development and maintenance costs.
• Globus Toolkit: a community-based, open-architecture,
open-source set of services and software libraries that
supports Grids and Grid applications.
Layered Grid Architecture
• Grid Protocol Architecture (top to bottom): Application, Collective, Resource, Connectivity, Fabric
• Internet Protocol Architecture (top to bottom): Application, Transport, Internet, Link
Connectivity Layer
• Grid Security Infrastructure (GSI)
• Nexus interface
Resource Layer
• Grid Information Services: Grid Resource Information Protocol (GRIP) and Grid Resource Registration Protocol (GRRP)
• Data Transfer: GridFTP
• Resource Management: Grid Resource Access Management (GRAM)
Collective Layer
• Directory Services
• Data Replication Services
• Monitoring Services
• Scheduling and Brokering Services
Application Layer
[Figure: the Application layer, built with Languages & Frameworks, sits on top of the lower layers:
– Collective: Collective APIs and SDKs over Collective Service Protocols
– Resource: Resource APIs and SDKs over Resource Service Protocols
– Connectivity: Connectivity APIs over Connectivity Protocols
– Fabric]
Communication Services
[Figure: communication links from startpoints (SP) to endpoints (EP) among three nodes labelled 0, 1, and 2.]
Nexus communication mechanism
• Grid applications have diverse communication needs.
• IP alone does not meet these needs, while MPI does not
provide a rich enough range of communication abstractions.
• Nexus is built on two abstractions: the communication link and the remote service request (RSR).
– An RSR is a one-sided, asynchronous RPC that transfers data from a startpoint (SP)
to its endpoint(s) (EP) and integrates it into the process(es) containing the EP(s).
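The link and RSR idea above can be illustrated with a toy model. This is only a sketch under stated assumptions: the class names, the `rsr` and `poll` methods, and the `store` handler are hypothetical and are not the Nexus API. Delivery is modelled with an in-memory inbox so the sender never waits for a reply.

```python
from collections import deque


class Endpoint:
    """Receiving end of a communication link (hypothetical toy model)."""

    def __init__(self, name):
        self.name = name
        self.handlers = {}    # handler name -> callable(endpoint, data)
        self.inbox = deque()  # requests that have arrived but not yet run
        self.received = []    # data integrated into the owning process

    def register_handler(self, handler_name, fn):
        self.handlers[handler_name] = fn

    def poll(self):
        """Run the handler for every queued request; return how many ran."""
        count = 0
        while self.inbox:
            handler_name, data = self.inbox.popleft()
            self.handlers[handler_name](self, data)
            count += 1
        return count


class Startpoint:
    """Sending end of a link; it may be bound to several endpoints."""

    def __init__(self):
        self.endpoints = []

    def bind(self, endpoint):
        self.endpoints.append(endpoint)

    def rsr(self, handler_name, data):
        """One-sided request: enqueue at every bound endpoint and return."""
        for endpoint in self.endpoints:
            endpoint.inbox.append((handler_name, data))


def store(endpoint, data):
    """Example handler: integrate the data into the receiving side."""
    endpoint.received.append(data)


ep1, ep2 = Endpoint("ep1"), Endpoint("ep2")
for ep in (ep1, ep2):
    ep.register_handler("store", store)

sp = Startpoint()
sp.bind(ep1)
sp.bind(ep2)
sp.rsr("store", b"hello")      # returns immediately, nothing has run yet
pending_before_poll = len(ep1.inbox)
ep1.poll()
ep2.poll()
```

Binding one startpoint to two endpoints shows why the slide writes "EP(s)": a single request reaches every endpoint on the link.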
Resource Management
Challenging resource management problems:
• site autonomy
– resources are typically owned and operated by different
organizations, in different administrative domains
• heterogeneous substrate
– different sites may use different local resource management
systems
• policy extensibility
– A resource management solution must support the frequent
development of new domain-specific management structures
• co-allocation
– using resources simultaneously at several sites
• online control.
– substantial negotiation can be required to adapt application
requirements to resource availability
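Resource Management Architecture
[Figure: request flow through the resource management architecture]
• The Application submits RSL to a Broker.
• The Broker performs RSL specialization, exchanging queries and information with the Information Service, and produces ground RSL.
• The Co-allocator splits ground RSL into simple ground RSL requests.
• Each simple ground RSL request goes to a GRAM, which sits in front of a local resource manager (LSF, Condor, NQE).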
Resource Specification Language
• Based on the syntax for filter specifications in LDAP.
• An RSL expression is constructed by combining simple
parameter specifications and conditions with the
following operators:
– & : specifies conjunction
– | : specifies disjunction
– + : combines two or more requests
• Resource brokers, co-allocators, and resource
managers can each define a set of parameters.
• Example: I want “5 nodes with at least 256MB
memory, or 10 nodes with at least 64MB, for myprog”
• RSL: &(executable=myprog)(|(&(count=5)(memory>=256))(&(count=10)(memory>=64)))
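A minimal sketch of how the example request could be built and expanded, assuming a nested-tuple representation. The functions `to_rsl` and `alternatives` are hypothetical helpers for illustration, not part of the Globus Toolkit; `alternatives` mimics the kind of expansion a broker might do when turning a disjunction into concrete choices.

```python
from itertools import product

OPERATORS = ("&", "|", "+")


def to_rsl(node):
    """Serialize a nested tuple into RSL-style text.

    A relation is (attribute, relation, value); a compound node is
    (operator, child, child, ...).
    """
    head = node[0]
    if head in OPERATORS:
        parts = []
        for child in node[1:]:
            text = to_rsl(child)
            # Compound children are wrapped in parentheses; relations
            # already carry their own.
            parts.append("(" + text + ")" if child[0] in OPERATORS else text)
        return head + "".join(parts)
    attribute, relation, value = node
    return "({}{}{})".format(attribute, relation, value)


def alternatives(node):
    """Expand & and | into a list of flat alternatives."""
    head = node[0]
    if head == "&":
        expanded = [alternatives(child) for child in node[1:]]
        return [sum(combo, []) for combo in product(*expanded)]
    if head == "|":
        return [alt for child in node[1:] for alt in alternatives(child)]
    if head == "+":
        raise NotImplementedError("multi-requests are not expanded in this sketch")
    return [[node]]


request = (
    "&",
    ("executable", "=", "myprog"),
    ("|",
     ("&", ("count", "=", 5), ("memory", ">=", 256)),
     ("&", ("count", "=", 10), ("memory", ">=", 64))),
)

rsl_text = to_rsl(request)
choices = alternatives(request)
```

Here `choices` holds two flat alternatives, one per branch of the disjunction, each including the shared `executable` relation.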
Local Resource Management
• The Globus Resource Allocation Manager (GRAM)
provides the local component for resource
management.
• GRAM is responsible for:
1. Processing RSL specifications
2. Enabling remote monitoring and management of
jobs
3. Periodically updating the information service
• Two major software components of GRAM:
1. Gatekeeper: creates the Grid service
2. Job Manager Instance (JMI): resource management
and job control
The Hour-Glass principle
• A simple, well-defined interface forms the neck.
• Uniform access to diverse local implementations
and higher-level global services.
Grid Security Characteristics
• Single sign-on
– Users must be able to authenticate just once to
access multiple grid resources.
• Delegation
– Users must be able to endow a program with the
ability to run on their behalf.
• Integration with local security solutions
– Interoperate with various local solutions.
• User-based trust relationships
– Resource providers should not have to interact
with each other to configure the security environment.
Security Policies:
• Grid Environment consists of multiple trust domains.
• Operations confined to a single trust domain are subject to
local security policy only.
• Both local and global subjects exist. For each trust
domain, there exists a partial mapping from global to local subjects.
• Operations between entities located in different trust domains
require mutual authentication.
• An authenticated global subject mapped into a local subject is
assumed to be equivalent to being locally authenticated as
that local subject.
• All access control decisions are made locally on the basis of
the local subject.
• A program or process is allowed to act on behalf of a user and
be delegated a subset of the user's rights.
• Processes running on behalf of the same subject within the
same trust domain may share a single set of credentials.
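The mapping and local-decision policies above can be sketched as follows. This is an illustrative toy, assuming authentication has already succeeded; the `TrustDomain` class, the subject names, and the operation names are all hypothetical.

```python
class TrustDomain:
    """One trust domain with its own local security policy (toy model)."""

    def __init__(self, name, subject_map, local_acl):
        self.name = name
        # Partial mapping: not every global subject has a local identity.
        self.subject_map = dict(subject_map)
        # Local policy: local subject -> operations it may perform.
        self.local_acl = {user: set(ops) for user, ops in local_acl.items()}

    def map_to_local(self, global_subject):
        """Return the local subject, or None if the mapping is undefined."""
        return self.subject_map.get(global_subject)

    def authorize(self, global_subject, operation):
        """Decide locally, on the basis of the mapped local subject only."""
        local_subject = self.map_to_local(global_subject)
        if local_subject is None:
            return False
        return operation in self.local_acl.get(local_subject, set())


# Hypothetical global subject names and local accounts.
ALICE = "/O=Grid/CN=Alice"
BOB = "/O=Grid/CN=Bob"

site_a = TrustDomain("Site A", {ALICE: "alice"},
                     {"alice": {"run_job", "read_file"}})
site_b = TrustDomain("Site B", {ALICE: "asmith", BOB: "bob"},
                     {"asmith": {"read_file"}, "bob": {"run_job"}})
```

The same global subject maps to different local subjects at each site, and each site's answer depends only on its own policy.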
Globus Security Infrastructure
[Figure: a User creates a User Proxy holding Globus credentials. Through GSI, the proxy contacts GRAM at each site, and GRAM starts User Processes. One site uses Kerberos and the other public-key certificates as the local security mechanism.]
Globus Security Scenario
• User: single sign-on via “grid-id” and generation of a proxy credential, or retrieval of a proxy credential from an online repository.
• User Proxy: holds the proxy credential and issues remote process creation requests.
• Site A (Kerberos): the GSI-enabled GRAM server authorizes the request, maps it to a local id, creates the process, and generates credentials. The process on the computer holds a Kerberos ticket, a local id, and a restricted proxy.
• Site B (Unix): the same GSI-enabled GRAM server steps apply. The process on the computer holds a local id and a restricted proxy.
• Processes communicate with each other and issue a remote file access request.
• Site C (Kerberos): the GSI-enabled FTP server in front of the storage system authorizes the request, maps it to a local id, and accesses the file.
Information Services
• Initial Discovery and ongoing monitoring of Resources
• Existing services such as LDAP and UDDI do not address
the dynamic addition and deletion of resources.
• Two fundamental entities in a Grid information service:
– Highly distributed information providers.
– Specialized aggregate directory services.
• Both entities speak two fundamental protocols: GRIP and GRRP.
[Figure: Information Provider Services (P) register with VO-specific Aggregate Directories (D) using GRRP; discovery against the directories and lookup against the providers both use GRIP.]
Information Services - Protocols
Grid Information Protocol (GRIP)
– Used to access information about entities.
– Supports both discovery and enquiry.
– Adopted from the Lightweight Directory Access
Protocol (LDAP), which defines a data model, a query
language, and a wire protocol.
Grid Registration Protocol (GRRP)
– Defines a notification mechanism to push simple
information from one element to another.
– A soft-state protocol, which makes it resilient to failures.
– A GRRP message contains the name of the service, the type
of notification, and a timestamp.
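The soft-state behavior of GRRP can be sketched with a directory that forgets any provider whose registration is not refreshed in time. This is a toy model under stated assumptions: the `AggregateDirectory` class, the `ttl` parameter, and the service names are hypothetical, and timestamps are plain numbers passed in by the caller.

```python
class AggregateDirectory:
    """Keeps registrations only while they are refreshed (soft state)."""

    def __init__(self, ttl):
        self.ttl = ttl      # seconds a registration stays valid
        self.entries = {}   # service name -> (notification type, timestamp)

    def register(self, service, kind, timestamp):
        """Handle one registration message; a repeat acts as a refresh."""
        self.entries[service] = (kind, timestamp)

    def live_services(self, now):
        """Drop expired registrations and return the remaining names."""
        self.entries = {
            service: (kind, stamp)
            for service, (kind, stamp) in self.entries.items()
            if now - stamp <= self.ttl
        }
        return sorted(self.entries)


directory = AggregateDirectory(ttl=30)
directory.register("P1", "register", timestamp=0)
directory.register("P2", "register", timestamp=0)
directory.register("P1", "register", timestamp=20)  # P1 refreshes, P2 goes silent

at_10 = directory.live_services(now=10)
at_40 = directory.live_services(now=40)
```

No explicit "unregister" message is needed: a failed provider simply stops refreshing and disappears from the directory, which is what makes the protocol resilient to failures.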
Hierarchical Discovery
• Each directory uses GRIP and acts as an information provider, forming a network of aggregate directories.
[Figure: hosts in organizations O1 and O2 act as information providers.
– Center 1 Directory (O1) lists Host:hn=R1, Host:hn=R2, Host:hn=R3
– Center 2 Directory (O2) lists Host:hn=R1, Host:hn=R2
– VO Directory lists Host:hn=R1,O=O1; Host:hn=R2,O=O1; Host:hn=R3,O=O1; Host:hn=R1,O=O2; Host:hn=R2,O=O2]
Data Transfer - GridFTP
• High-speed transport protocol that extends
the popular FTP protocol.
• GridFTP functionality:
– GSI support
– Third-party control of data transfer
– Parallel data transfer
– Striped data transfer
– Partial file transfer
– Support for reliable and restartable data transfer
• The implementation consists of two principal
libraries: globus_ftp_control_library and
globus_ftp_client_library
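The idea behind partial and parallel transfer can be sketched without any network code: pick a byte range, split it into blocks, fetch the blocks concurrently, and reassemble them by offset. This is an illustration only. It does not use the GridFTP protocol or the Globus libraries, the "remote file" is an in-memory byte string, and all function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def split_range(offset, length, streams):
    """Divide [offset, offset + length) into at most `streams` blocks."""
    base, extra = divmod(length, streams)
    blocks, start = [], offset
    for i in range(streams):
        size = base + (1 if i < extra else 0)
        if size:
            blocks.append((start, size))
            start += size
    return blocks


def read_block(source, start, size):
    """Stand-in for one data stream fetching one block."""
    return start, source[start:start + size]


def parallel_partial_read(source, offset, length, streams=4):
    """Fetch part of `source` over several concurrent streams."""
    blocks = split_range(offset, length, streams)
    with ThreadPoolExecutor(max_workers=streams) as pool:
        results = list(pool.map(lambda block: read_block(source, *block), blocks))
    # Blocks may complete in any order, so reassemble by starting offset.
    return b"".join(data for _, data in sorted(results))


remote_file = bytes(range(256)) * 4   # 1024 bytes of sample data
chunk = parallel_partial_read(remote_file, offset=100, length=500, streams=4)
```

Tagging each block with its offset is what allows out-of-order arrival, and the same bookkeeping is what a restartable transfer would need to know which ranges are still missing.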
Replica Management Service
[Figure: numbered interactions (1)-(8) among the Application, Metadata Service, Replica Management Service, Replica Selection Service, and Information Services. Labels: attributes of desired data; logical file names; location of 1 or more replicas; sources and destination; performance measurements and predictions; location of selected replicas.]
• Creates new copies of a complete or partial
collection of files
• Registers them in a Replica Catalog
• Allows applications to query the catalog
• Data are organized into files.
– Logical file name vs. physical file name.
• Key Architecture Decisions:
– Separation of Replication and Metadata Information
– Does not enforce Replication Semantics
– Provide Rollback to keep the state consistent in
case of failures
– No distributed locking mechanism
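A minimal sketch of a replica catalog that maps logical file names to physical locations. The class and method names are hypothetical, not the Globus replica catalog API. The hosts and URL prefixes are taken from the catalog example later in the deck; joining the prefix and the file name with "/" is an assumption made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Location:
    """A physical storage location holding some files of a collection."""
    host: str
    protocol: str
    url_constructor: str
    filenames: set = field(default_factory=set)


@dataclass
class LogicalCollection:
    """A named set of logical files plus the locations that hold copies."""
    name: str
    logical_files: set = field(default_factory=set)
    locations: list = field(default_factory=list)

    def register_replica(self, location, filename):
        """Record that `location` now holds a copy of a known logical file."""
        if filename not in self.logical_files:
            raise KeyError(filename)
        location.filenames.add(filename)

    def replicas(self, filename):
        """Return candidate physical names for one logical file name."""
        return [
            loc.url_constructor.rstrip("/") + "/" + filename  # assumed join rule
            for loc in self.locations
            if filename in loc.filenames
        ]


jupiter = Location("jupiter.isi.edu", "gsiftp",
                   "gsiftp://jupiter.isi.edu/nfs/v6/climate",
                   {"Mar 1998", "Jun 1998", "Oct 1998"})
sprite = Location("sprite.llnl.gov", "ftp",
                  "ftp://sprite.llnl.gov/pub/pcmdi",
                  {"Jan 1998", "Dec 1998"})
co2_1998 = LogicalCollection(
    "CO2 measurements 1998",
    logical_files={"Jan 1998", "Mar 1998", "Jun 1998", "Oct 1998", "Dec 1998"},
    locations=[jupiter, sprite],
)

before = co2_1998.replicas("Jan 1998")
co2_1998.register_replica(jupiter, "Jan 1998")
after = co2_1998.replicas("Jan 1998")
```

The catalog only records where copies are. It does not copy data or keep replicas consistent, in line with the decision not to enforce replication semantics.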
Relationships to other technologies
• World Wide Web
– Web technologies mainly support client-server
architectures. They lack features (at least for now) for
rich interaction and single sign-on security.
• ASP and SSP
– Provide outsourced solutions that depend on a
specific customer. They lack dynamic configuration.
• Enterprise computing
– Static arrangements for sharing resources.
• P2P computing
– Getting closer to Grid technology, but provides
specific solutions rather than common protocols.
Other Grid Perspectives
• Grid as a next-generation Internet
• Grid is a source of free cycles
• Grid requires new programming models
• Grid makes high-performance
computers superfluous
Closing Remarks
"We will probably see the spread of 'computer utilities', which, like
present electric and telephone utilities, will service individual homes
and offices across the country." - Len Kleinrock, 1969
We are a little late, but we are ready now!
Extra-1: A Model Architecture for Data Grids
[Figure: the Application sends an attribute specification to the Metadata Catalog and receives a logical collection and logical file name. The Replica Catalog returns multiple locations. Replica Selection, using performance information and predictions from MDS and NWS, returns the selected replica. Data moves over the GridFTP control channel and GridFTP data channel from storage at Replica Locations 1, 2, and 3 (disk cache, tape library, disk array).]
Extra-2: Replica Catalog Structure
Replica Catalog
• Logical Collection: CO2 measurements 1998
– Filenames: Jan 1998, Feb 1998, …
– Location: jupiter.isi.edu
  Filenames: Mar 1998, Jun 1998, Oct 1998
  Protocol: gsiftp
  UrlConstructor: gsiftp://jupiter.isi.edu/nfs/v6/climate
– Location: sprite.llnl.gov
  Filenames: Jan 1998, …, Dec 1998
  Protocol: ftp
  UrlConstructor: ftp://sprite.llnl.gov/pub/pcmdi
– Logical File Parent, with Logical File entries Jan 1998 and Feb 1998 (a Size: 1468762 attribute is shown)
• Logical Collection: CO2 measurements 1999