Storage Virtualization Team 3

DSC861A Emerging Technology
Storage Virtualization
Team 3
Jennifer Brola-Richards
Mohib Fanek
Kathy Larson
Donovan Miles
Vishu Reddy
Fran Trees
Presentation Outline
• Storage Virtualization
– What is storage virtualization, and why storage virtualization?
• Storage Evolution and Fundamental Concepts
– What are the innovations and fundamental concepts associated with storage?
• Storage Virtualization Deep Dive
– What, where, and how of storage virtualization?
• Case Study
• Research Topics in Storage Virtualization
– What are potential topics of research and dissertation?
• Summary and Verbal Quiz
What is storage virtualization?
Storage virtualization is the next frontier in storage advances; it aims to provide a layer of abstraction that reduces complexity.
The Storage Networking Industry Association (SNIA) defines storage virtualization as:
1. The act of abstracting, hiding, or isolating the internal functions of a storage (sub)system or service from applications, host computers, or general network resources, for the purpose of enabling application and network-independent management of storage or data.
2. The application of virtualization to storage services or devices for the purpose of aggregating functions or devices, hiding complexity, or adding new capabilities to lower level storage resources.
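The aggregation idea in the second definition can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `PhysicalDevice` and `VirtualVolume` are hypothetical names invented here, and the simple concatenation mapping is one of many possible ways to hide several devices behind one logical address space.

```python
from dataclasses import dataclass, field


@dataclass
class PhysicalDevice:
    """Hypothetical backing device with a fixed number of blocks."""
    name: str
    num_blocks: int
    blocks: dict = field(default_factory=dict)  # physical block number -> bytes

    def read(self, block):
        return self.blocks.get(block, b"")

    def write(self, block, data):
        self.blocks[block] = data


class VirtualVolume:
    """Presents several devices as one contiguous range of logical blocks."""

    def __init__(self, devices):
        self.devices = list(devices)

    @property
    def num_blocks(self):
        return sum(device.num_blocks for device in self.devices)

    def locate(self, logical_block):
        """Translate a logical block number into (device, physical block)."""
        if logical_block < 0:
            raise IndexError("logical block numbers start at 0")
        offset = logical_block
        for device in self.devices:
            if offset < device.num_blocks:
                return device, offset
            offset -= device.num_blocks
        raise IndexError(f"logical block {logical_block} is out of range")

    def read(self, logical_block):
        device, physical = self.locate(logical_block)
        return device.read(physical)

    def write(self, logical_block, data):
        device, physical = self.locate(logical_block)
        device.write(physical, data)


# The caller sees 150 logical blocks and never names a device.
volume = VirtualVolume([PhysicalDevice("disk_a", 100), PhysicalDevice("disk_b", 50)])
volume.write(120, b"payload")
device, physical = volume.locate(120)
print(device.name, physical)  # disk_b 20
```

The application addresses only logical blocks; which device holds block 120 is hidden behind `locate`, which is the abstraction the definition describes.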
Why storage virtualization?
Storage virtualization aims to provide a layer of abstraction to manage storage and reduce complexity. Key drivers:
• Providing continuous availability despite exponential growth (e.g., Facebook: over 55 billion page views a month and 41 million active users (1)).
• Effectively grouping and managing heterogeneous storage devices and servers (e.g., an estimated 450,000 Google servers (2)).
• Allocating and managing storage in accordance with the Quality of Service (QoS) associated with the data (e.g., Gartner estimates that the average data center doubles its storage every 18 to 24 months).
• Mergers and acquisitions (e.g., Microsoft and Yahoo!).
• Multiple storage software platforms (e.g., IBM, EMC, HP, ...).
(1) Lucas Nealan, php|works, Atlanta, September 13, 2007
(2) Wikipedia
What are the innovations and fundamentals associated with storage?
Client-side storage innovations: a variety of storage devices that are smaller, higher in capacity, and cheaper have helped end users cope with increasing storage requirements.
What are the innovations and fundamentals associated with storage?
Server-side storage innovations: a combination of storage device, storage interface, and storage software innovations has helped enterprises cope with exponential growth in data storage requirements.
Storage devices have evolved from tapes to hard drives to RAID hard drives, increasing capacity and resiliency.
What are the innovations and fundamentals associated with storage?
Storage interfaces have evolved from SCSI to iSCSI, Fibre Channel (FCP), and InfiniBand to interconnect devices and transport data faster.
[Figure: SCSI, iSCSI, FCP, and InfiniBand interfaces]
What are the innovations and fundamentals associated with storage?
Storage access: file-level access takes center stage alongside conventional block-level access.
Block-level access: block addresses are used to read and write data [Read/Write, Block #] on the storage media.
[Figure: Sample conventional block allocation map]
File-level access: files are accessed through semantic instructions (for example, Open and Close). Data inside a file is accessed by byte ranges within the file (for example, the first 10 bytes of a file). GFS (Google File System) is an example of a large-scale distributed file system.
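The difference between the two access styles can be sketched as follows. This is a minimal illustration, not any real file system: `BlockDevice`, `SimpleFile`, and the 4-byte block size are assumptions made up for the example. The file layer translates a byte range into block addresses through a block allocation map.

```python
BLOCK_SIZE = 4  # bytes per block; unrealistically small to keep the example readable


class BlockDevice:
    """Block-level access: callers read and write whole blocks by block number."""

    def __init__(self, num_blocks):
        self.blocks = [bytes(BLOCK_SIZE) for _ in range(num_blocks)]

    def read_block(self, block_no):
        return self.blocks[block_no]

    def write_block(self, block_no, data):
        if len(data) != BLOCK_SIZE:
            raise ValueError("block-level writes must cover exactly one block")
        self.blocks[block_no] = bytes(data)


class SimpleFile:
    """File-level access: byte ranges are mapped to blocks via an allocation map."""

    def __init__(self, device, allocation_map, size):
        self.device = device
        self.allocation_map = allocation_map  # file block index -> device block number
        self.size = size                      # file length in bytes

    def read(self, offset, length):
        end = min(offset + length, self.size)
        out = bytearray()
        pos = offset
        while pos < end:
            index, within = divmod(pos, BLOCK_SIZE)
            block = self.device.read_block(self.allocation_map[index])
            take = min(BLOCK_SIZE - within, end - pos)
            out += block[within:within + take]
            pos += take
        return bytes(out)


device = BlockDevice(num_blocks=8)
device.write_block(5, b"stor")  # block-level: [Write, Block 5]
device.write_block(2, b"age!")  # block-level: [Write, Block 2]

# File-level: the caller asks for a byte range and never sees block numbers.
example_file = SimpleFile(device, allocation_map=[5, 2], size=8)
print(example_file.read(2, 4))  # b'orag'
```

Note that the file's blocks are not contiguous on the device (blocks 5 and 2); the allocation map is what lets the file layer hide that from the caller.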
What are the innovations and fundamentals associated with storage?
Metadata is data about data. In the context of storage, metadata may describe an individual datum or content item, or a collection of data that includes multiple content items.
Examples include file size, who created the file, attributes such as read-only, free block bitmaps, and control data.
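Some of these examples can be observed directly with Python's standard `os.stat` call. The sketch below creates a throwaway file (the name `example.txt` is arbitrary) and reads back its size, owner, and read-only attribute; it shows per-file metadata only, not file-system-wide structures such as free block bitmaps.

```python
import os
import stat
import tempfile

with tempfile.TemporaryDirectory() as directory:
    path = os.path.join(directory, "example.txt")
    with open(path, "wb") as handle:
        handle.write(b"0123456789")  # 10 bytes of file data

    os.chmod(path, stat.S_IRUSR)  # mark the file read-only for its owner

    info = os.stat(path)  # metadata only; the file's contents are not read
    metadata = {
        "size_bytes": info.st_size,
        "owner_uid": info.st_uid,
        "read_only": not (info.st_mode & stat.S_IWUSR),
        "modified": info.st_mtime,
    }
    print(metadata)

    # Restore write permission so the temporary directory can be cleaned up everywhere.
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)
```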
What are the innovations and fundamentals associated with storage?
Storage software has evolved from simple backup and restore to advanced storage networks and storage management software functions.
(A) Simple Direct Attached Storage (DAS)
(B) Storage Area Network (SAN)
(C) Network Attached Storage (NAS)
What are the innovations and fundamentals associated with storage?
SAN and NAS: Key Differences
                            NAS                  SAN
Access Methods              File access          Disk block access
Access Medium               Ethernet             Fibre Channel
Architecture                Decentralized        Centralized
Transport Protocol          Layer over TCP/IP    SCSI/FC and SCSI/IP
Efficiency                  Less                 More
Sharing and Access Control  Good                 Poor
Typical Applications        Web                  Database
Typical Clients             Workstations         Database servers
What and Where can Storage be Virtualized?
SNIA Storage Model: Potential Areas of Virtualization
1. Storage Level Virtualization
2. Host Level Virtualization *
3. File Level Virtualization
4. Block Virtualization
5. Device Virtualization **
6. Network Virtualization
* Host, also known as server
** Device = aggregation of host and network (metadata)
Source: The Storage Networking Tutorials, SNIAVIRT, page 20
http://www.snia.org/education/tutorials/
What and Where can Storage be Virtualized?
Storage Virtualization: Innovations and Trends
1. Storage Device Level Virtualization
2. Host Level Virtualization
– Historical: mainframe
– Recent development example: VMware
3. File Level Virtualization
– Historical: mainframe
– Recent development example: NAS
4. Block Virtualization (sub-technique)
5. Device Virtualization (sub-technique)
6. Network Virtualization (sub-technique)
Historical: RAID level, SCSI interface
Recent development examples: Fibre Channel
Major innovations continue to emerge even in historical areas of storage virtualization.
Symmetrical (also known as in-band) and asymmetrical (also known as out-of-band) virtualization are emerging as key areas of abstraction and virtualization.
How is storage virtualized at the enterprise level?
Currently, storage networks are virtualized using metadata or Storage Volume Controllers (SVC). There are two types of network virtualization:
• Out-of-band: metadata or Storage Volume Controllers are placed outside the path of data flow.
• In-band: metadata or Storage Volume Controllers are placed in the path of data flow.
Source: IBM Redbook, page 8
http://www.redbooks.ibm.com/redbooks/pdfs/sg246210.pdf
How is storage virtualized at the enterprise level?
In-Band Virtualization
1. Metadata or Storage Volume Controllers (SVC) are placed in-band, that is, in the path of data flow.
2. The SVC controls who can access the storage device, how storage can be accessed, how storage is allocated, etc.
3. SVCs are managed through storage management software.
4. The key challenge is the potential for I/O bottlenecks.
Source: IBM Redbook, page 10
http://www.redbooks.ibm.com/redbooks/pdfs/sg246210.pdf
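A toy model may help show why in-band placement raises the bottleneck concern. The classes below are hypothetical and are not the IBM SVC interface; the sketch only assumes that an in-band controller enforces access, owns the virtual-to-physical mapping, and relays every byte between host and storage pool.

```python
class StoragePool:
    """Hypothetical back-end storage reachable only through the controller."""

    def __init__(self):
        self.blocks = {}

    def read(self, block):
        return self.blocks.get(block, b"")

    def write(self, block, data):
        self.blocks[block] = data


class InBandController:
    """Hypothetical in-band controller: requests and data both pass through it."""

    def __init__(self, pool, allowed_hosts):
        self.pool = pool
        self.allowed_hosts = set(allowed_hosts)
        self.volume_map = {}    # (volume, virtual block) -> physical block
        self.next_free = 0
        self.bytes_relayed = 0  # all data traffic is counted here

    def _check(self, host):
        if host not in self.allowed_hosts:
            raise PermissionError(f"{host} may not access this storage")

    def write(self, host, volume, block, data):
        self._check(host)
        key = (volume, block)
        if key not in self.volume_map:  # allocate on first write
            self.volume_map[key] = self.next_free
            self.next_free += 1
        self.bytes_relayed += len(data)
        self.pool.write(self.volume_map[key], data)

    def read(self, host, volume, block):
        self._check(host)
        data = self.pool.read(self.volume_map[(volume, block)])
        self.bytes_relayed += len(data)
        return data


controller = InBandController(StoragePool(), allowed_hosts={"host1"})
controller.write("host1", "vol0", 0, b"abcd")
print(controller.read("host1", "vol0", 0), controller.bytes_relayed)  # b'abcd' 8
```

Because `bytes_relayed` grows with every read and write from every host, the controller's throughput bounds the throughput of the whole storage network in this model.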
How is storage virtualized at the enterprise level?
Out-of-Band Network Virtualization
1. Metadata or Storage Volume Controllers (SVC) are placed out-of-band, that is, outside the path of data flow.
2. The SVC controls who can access the storage device, how storage can be accessed, how storage is allocated, etc.
3. The storage pool sends metadata to the SVC.
4. The host sends metadata to the SVC.
Source: IBM Redbook, page 12
http://www.redbooks.ibm.com/redbooks/pdfs/sg246210.pdf
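The out-of-band arrangement can be sketched the same way. Again the class names are hypothetical and the protocol is simplified: the assumption is that the controller exchanges only metadata with the host and the storage pool, while the host moves data to and from the pool directly.

```python
class StoragePool:
    """Hypothetical storage pool that hosts can reach directly."""

    def __init__(self, num_blocks):
        self.num_blocks = num_blocks
        self.blocks = {}

    def free_block_report(self):
        """Metadata the pool sends to the controller: the blocks it offers."""
        return list(range(self.num_blocks))

    def read(self, block):
        return self.blocks.get(block, b"")

    def write(self, block, data):
        self.blocks[block] = data


class OutOfBandController:
    """Hypothetical out-of-band controller: handles metadata, never file data."""

    def __init__(self, allowed_hosts):
        self.allowed_hosts = set(allowed_hosts)
        self.free_blocks = []
        self.volume_map = {}  # (volume, virtual block) -> physical block
        self.metadata_requests = 0

    def register_pool(self, pool):
        self.free_blocks.extend(pool.free_block_report())

    def resolve(self, host, volume, block):
        """Answer a host's metadata request with a physical block number."""
        if host not in self.allowed_hosts:
            raise PermissionError(f"{host} may not access this storage")
        self.metadata_requests += 1
        key = (volume, block)
        if key not in self.volume_map:
            if not self.free_blocks:
                raise RuntimeError("storage pool is full")
            self.volume_map[key] = self.free_blocks.pop(0)
        return self.volume_map[key]


class Host:
    def __init__(self, name, controller, pool):
        self.name, self.controller, self.pool = name, controller, pool

    def write(self, volume, block, data):
        physical = self.controller.resolve(self.name, volume, block)  # control path
        self.pool.write(physical, data)                               # data path

    def read(self, volume, block):
        physical = self.controller.resolve(self.name, volume, block)  # control path
        return self.pool.read(physical)                               # data path


pool = StoragePool(num_blocks=4)
controller = OutOfBandController(allowed_hosts={"host1"})
controller.register_pool(pool)
host = Host("host1", controller, pool)
host.write("vol0", 7, b"abcd")
print(host.read("vol0", 7), controller.metadata_requests)  # b'abcd' 2
```

In this model the controller sees two small metadata requests and no data bytes, which is the trade-off against the in-band design: less load on the controller, at the cost of hosts needing to know how to ask it for mappings.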
How is storage virtualized at the enterprise level?
Virtualization Implementation Example
[Figure: High-level diagram of typical primary/secondary site data replication with storage virtualization (D. Miles, 06/09/07).
Primary site: environments PROD, DEV, QA, SIT; applications App1, App2.
Secondary site: environment Prod; applications App1, App2.
Components shown: blade, pSeries, and xSeries servers on Ethernet; blade SAN fabric; SAN Fabric A and SAN Fabric B with directors; virtualization engines; Type 1 SAN storage (52 TB); Type 2 SAN storage (40 TB, and 26 TB each); libraries with LTO3 drives; two Cisco 6509 switches; 3Com devices; DWDM network appliances; monitors; VPN comm-links for remote support; management VLANs (QA/DEV, storage, library, director: 950; PROD blades and blade fabric: 955).]
Case Study
The Study
1. Shows that commingling data and metadata on a single logical device means there is no way to achieve different service level objectives for data and metadata in the same file system without moving file-system-specific knowledge into the logical disk layers.
2. Shows that the standard assumptions underlying the organization of data and metadata in file systems are no longer valid in virtualized storage environments and hence fail to realize the full benefits of storage virtualization.
3. Proposes a different file system organization of data and metadata designed to exploit the power of virtualized storage.
Case Study
Service level requirements within a single file system
• Organization A: needs no encryption
– Performs a nightly indexing operation on file systems.
– All directory information and file access times must be read to determine the “changed” state of data.
– File metadata has no security requirement.
• Organization B: needs encryption
– Stores medical records.
– Security requirements for file data are extremely high.
– Business requirement that all file data be encrypted at rest.
In the Unix Fast File System (FFS), a logical disk is divided into collections of blocks called cylinder groups, each of which stores both file data blocks and file metadata.
Case Study
Results
• Clean logical separation between data and metadata.
• Allows the file system to use virtualization features and achieve different SLOs.
• Redesign changes:
– Code change.
– Packing the relocated cylinder group headers in the first few metadata cylinder groups ensures that each header is located at a fixed, predictable offset from the front of the block device.
– A user-configurable block address before which no data is stored and after which no metadata is stored.
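The value of a single boundary address can be illustrated with a small sketch. It is not the study's implementation: `PolicyDisk`, the boundary value of 100, and the XOR `toy_cipher` are all assumptions made for the example (XOR is a stand-in and offers no real security). The point is that once metadata lives below the boundary and data above it, the logical disk layer can apply encryption by block address alone, with no file-system knowledge.

```python
def toy_cipher(data, key=0x5A):
    """Stand-in for real encryption (XOR is NOT secure); applying it twice restores the input."""
    return bytes(byte ^ key for byte in data)


class PolicyDisk:
    """Hypothetical logical disk that encrypts only blocks at or above a boundary."""

    def __init__(self, boundary):
        self.boundary = boundary  # metadata below, file data at or above
        self.raw = {}             # what is actually stored at rest
        self.cipher_ops = 0       # how often encryption work was needed

    def is_data_block(self, block):
        return block >= self.boundary

    def write(self, block, payload):
        if self.is_data_block(block):
            payload = toy_cipher(payload)
            self.cipher_ops += 1
        self.raw[block] = payload

    def read(self, block):
        stored = self.raw[block]
        if self.is_data_block(block):
            self.cipher_ops += 1
            return toy_cipher(stored)
        return stored


disk = PolicyDisk(boundary=100)
disk.write(3, b"inode: size=10")    # metadata block, stored in the clear
disk.write(250, b"patient record")  # data block, encrypted at rest

disk.read(3)            # a metadata-only operation such as a lookup
print(disk.cipher_ops)  # 1
```

Only the data write incurred cipher work; the metadata read did not, which is the kind of saving a metadata-heavy workload would see under data-only encryption.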
Case Study
• 5-7% gains on the new file system layout.
• 31-44% gains for the file lookup and file delete benchmarks, which result in little or no file data I/O; here the advantage of data-only encryption becomes obvious.
Future Work
• Differing SLOs for granular metadata
• Completely separate fixed and dynamic metadata
• Separate file data from user-defined file attribute data
What are potential topics of research and dissertation?
Sample Research Topics in Storage Virtualization
• Bayesian analysis for resource management
• Bayesian analysis for diagnostics
• Trusted domains for security
• Storage virtualization and metadata standards
• Algorithm advances for block, device, and other component virtualization techniques
Summary and Verbal Quiz
Storage Basics
1. What type of storage is found in your workstation?
2. What type of storage systems may be found in a large
enterprise?
3. How is data accessed from storage?
4. Network Attached Storage (NAS) is well suited for what type
of applications?
5. Storage Area Network (SAN) is well suited for what type of
applications?
Storage Virtualization
1. What is Storage Virtualization?
2. Where and What can be virtualized in storage?
3. How is storage virtualized at a network level?
4. How is storage virtualization currently implemented?
5. What are the potential research topics in storage virtualization?
Annotated References
1. Faibish, S., Fridella, S., Bixby, P., and Gupta, U., “Storage Virtualization using a Block-device File System,” ACM SIGOPS Operating Systems Review, Volume 42, Issue 1, January 2008. Publisher: ACM.
2. The Storage Networking Tutorials, SNIAVIRT. http://www.snia.org/education/tutorials/
3. http://en.wikipedia.org/wiki/Metadata
4. http://www.redbooks.ibm.com/redbooks/pdfs/sg246210.pdf
5. Nealan, L., php|works, Atlanta, September 13, 2007. http://sizzo.org/wp/wp-content/uploads/2007/09/facebook_performance_caching.pdf
6. http://en.wikipedia.org/wiki/Google_platform