How to Manage Petabytes of Data in an

How to Manage Petabytes of Data in an
Environmentally Friendly way using
Less Power, Less Cooling, Less Cost
David Honey
Storage Solutions Architect
Agenda
• Where does energy come from and go to?
• Mass storage device power use; disk, tape, optical
• Multi tier vs disk only
• HSM is green
• What SGI is doing to help customers reach sustainability
• Q&A
Where does electricity come from ?
! 43 billion kWh of electricity generated
in NZ in 2006
Source: World Nuclear Association
Australian electricity sources are very different
! In 2006 power stations produced 255 billion kilowatt hours (TWh) of electricity, 65%
more than the 1990 level and growing at 3.3% pa.
! 18 TWh is used by power stations, leaving 237 TWh net production. 17 TWh is lost or
used in transmission and 9-10 more in energy sector consumption, leaving 210 TWh
for final consumption (or 187 TWh apart from aluminium exports)
! Much of the energy exported is used for generating electricity overseas; three times
as much black coal is exported as is used
§
25% of electricity produced is lost in
generation and transmission
§
90% energy sources are fossil fuels
Source: World Nuclear Association
Power: Energy used per unit of time
Power
Energy
Energy
(Watts)
(Wh p.a.)
(J p.a.)
2.8K
24.5M
88.3G
Hubble telescope
Space shuttle launch
1920P Altix
Synchrotron
CERN LHC
33T
330K
2.9G
10.4T
12M
105G
380T
120M
1T
3.8P
§
Because the Data Centre is always on it consumes a lot of energy
§
Conventional research uses energy too!
http://lhc-machine-outreach.web.cern.ch/lhc-machine-outreach/faq/lhc-energy-consumption.htm
Where does energy go?
Average of 12 Data Centres in NY&CA: Co-Location, Hosting, Government,
Financial Institution, Telecom, Scientific Computing, Data Storage
UPS
8%
Lighting
4%
Other
11%
HVAC air
movers
8%
HVAC
23%
Servers &
Storage
46%
Source: Lawrence Berkeley National Laboratory, http://hightech.lbl.gov/benchmarking-dc.html
There’s a Real Problem
• Energy consumption for US data centers and servers will nearly
double by 2011… requiring an additional 10 power plants (EPA
Study)
• Power consumption in data centers doubled from 2001 to 2005
(Lawrence Berkeley / Stanford study)
• Half the data centers will not have sufficient power for expansion by
2008 (Gartner survey ’06)
• In the highest level of redundancy and reliability data centers, for
every kilowatt used for processing, $22,000 is spent on power and
cooling infrastructure (Uptime Institute)
• Through 2009, energy costs will emerge as the second highest
operating cost in 70% of data centers worldwide (Gartner survey
’06)
Concerns No Longer Focused on Servers
• Servers currently account for about 60% of data
center power… storage will soon take that place
(Glass House ’07)
• eWeek.com
– Average number of storage terabytes maintained by Fortune
1000 organizations in 2004: 138
– Average number of storage terabytes maintained by Fortune
1000 organizations in October 2006: 600
Disk drive power dissipation
Hitachi Deskstar 7K1000
idle
seek
Seagate Barracuda ES 750
WD Raptor WD1500ADFD
Seagate Barracuda ES.2
WD Caviar SE16 WD7500AAKS
WD Caviar GP WD10EACS
Maxtor Atlas 15K II
Seagate Cheetah 15K.4
Fujitsu MAU3147
Hitachi Ultrastar 15K147
Seagate Cheetah 15K.5
0
5
10
15
Watts
§ SATA drives use 8 – 10W
§ SCSI drive use 12 – 20W
§ SAS drives use ~1W more than SCSI
§ FC drives use ~2W more than SCSI
Source: StorageReview.com
20
25
Disk drive startup power consumption
WD Caviar SE16 WD7500AAKS
Seagate Barracuda ES750
WD Raptor WD 1500ADFD
WD Caviar GP WD10EACS
Seagate Barracuda ES.2
Hitachi Deskstar 7K1000
12V rail
5V rail
Maxtor Atlas 15K II
Seagate Cheetah 15K.4
Seagate Cheetah 15K.5
Hitachi Ultrastar 15K147
Fujitsu MAU3147
0
5
10
15
20
Watts
§
SATA similar to SCSI
§
Drives use ~1.5x (scsi) – 2x (SATA) peak power to spin up
Source: StorageReview.com
25
30
35
Energy required for 20TB of disk
Type
Power
(Watts)
Drives
Energy
p.a.
(KWh)
15000
SAS
8.1
274
19,357
70
146
15000
SCSI
19.1
137
22,936
83
Hitachi 10K300
300
10000
SCSI
15.8
67
9,257
33
Seagate Cheetah NS
400
10000
SAS
11.1
50
4,883
18
Seagate Barracuda ES
750
7200
SATA
12.5
27
2,927
11
Hitachi 7K1000
1000
7200
SATA
12.9
20
2,269
8
WD GP WD10EACS
1000
5400
SATA
6.8
20
1,199
4.3
Seagate Barracuda PR
1500
7200
SATA
10.9
14
1,274
4.5
Capacity
(GB)
Speed
(rpm)
Seagate Savvio 15K.1
73
Seagate Cheetah 15K.4
§
Power based on 80% seeks 20% idle
§
1TB drives use ~10x less power per TB than 146GB SCSI drives
§
1TB drives use ~2x less power per TB than 400GB SCSI drives
§
SCSI/SAS/FC has 2x IOPs and less latency
§
New drives from Seagate and Western Digital use 50% less power
Joules
(GJ)
Tape drives
SAIT / LTO4
TS1130
T10000B
!
!
!
!
!
!
!
!
!
!
!
!
Capacity cart
! 1000GB
! 120MB/s native
! 16sec load time
! 48sec data access
! 63W (N/A)
800GB
45 / 120MB/s native
19sec load time
57sec file access time
30W (active)
5W (idle)
1000GB
160MB/s native
13sec load time
49sec data access
46W (active)
17W (idle)
! No electricity to store data
! Very high data transfer rates per device – sequential access
! Tape Libraries 10TB - 8PB+
! 30 year media life
Sources: Manufacturers data sheets; Clipper Group
Sport cart
75GB
12sec
Optical Storage
Holographic (3D)
Ultra Dense Optical (UDO) Blu-Ray / DVD
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
300GB
20MB/s data transfer
5 sec load time
50 year archive life
Blue laser
10,000,000 reads
WORM
80W
60GB (UDO2)
12MB/s data transfer
6.5sec media access time
50 year archive life
Blue laser
WORM
310W (19TB jbox)
20W (drive)
BD-R, BD-RE, DVD-R, DVD-RW
! 50GB / 5GB per disc
! 9MB/s data transfer
! 4.5sec media access time
! Blue laser
! Re-write 10,000 times
! 140W (jukebox)
! 20W (drive)
! Like tape, optical media requires no electricity to store data
! Data transfer rates are low compared to tape and disk
! Largest jukebox capacity 30TB
Sources: http://blogs.zdnet.com/storage/?p=313
http://www.incom.eu/jvc_mc-blx.html?L=1, http://www.kintronics.com/jukebox.html
Case Studies, tiered storage
Small
Medium
50TB
250TB
Large
1PB
50TB Storage Solution Power Comparison
Model
SGI IS220
+ SL T50
Watts
Type
Drives
510
265
775
SAS
LTO4
30
2
10
40
50
64
51
2.3x
Disk only
SGI IS4000
Capacity (TB)
1,793
! 8+2 RAID6
! 400GB FC drives for HSM
! 1TB SATA drives disk only
SATA
vs
250TB Storage Solution Power Comparison
Model
Watts
Type
Drives
Capacity (TB)
SGI IS4000
STK SL500
1,655
420
2,079
FC
LTO4
64
6
20
232
252
320
256
4.6x
Disk only
SGI IS6700
9,600
! 8+2 RAID6
! 400GB FC drives for HSM
! 1TB SATA drives disk only
SATA
vs
Source:http://www.sun.com/storagetek/tape_storage/tape_libraries/sl500/calc/calculator.jsp
1PB Storage Solution Power Comparison
Model
Watts
Type
Drives
Capacity (TB)
SGI IS4600
SL T950
4,092
966
5,058
FC
LTO4
128
12
41
1,154
1,195
1,200
960
7x
Disk only
SGI IS15000
35,800
! 8+2 RAID6
! 400GB FC drives for HSM
! 1TB SATA drives disk only
Source: Manufacturers data sheets
SATA
vs
Results Summary
Multi tier
Disk only
Greener
Small
17W/TB
35W/TB
2.3x
Medium
7.6W/TB
37.5W/TB
4.6x
Large
4.4W/TB
37W/TB
7x
! In each case multi tier RAIDs used SAS or FC disks to handle tape
and network IO
! A fundamental difference in service terms is the higher latency
involved in retrieving data from disk
HSM should be first choice for long term storage
• Disk and tape have a place in the data center
• With HSM, a relatively small volume of high
performance disk satisfy user requests with
good bandwidth and IOPS characteristics
• The bulk of data resides on tape when it’s
not active
• HSM provides the ‘smarts’ to move data to
the appropriate storage medium in response
to usage
• It is easy being green WRT storage – and
capital investment is lower than a ‘disk only’
solution
Customer Feedback
SGI’s Data Migration Facility (DMF) has been a key tool for storage management for CSIRO High
Performance Computing since 1991.
“It gives users a productivity boost, by allowing them to access vast quantities of data seamlessly, without
having to manage off-line media (CDs, DVDs, etc.) themselves. It allows users to have almost instant
access to large amounts of scratch disc space. It provides the foundation for a data-intensive computing
platform, by allowing users to easily work with large quantities of data. Lastly, it future-proofs their storage
holdings, by removing the work of moving data through generations of storage media away from the users
and onto the systems management, where it can be done in bulk automatically.
“CSIRO has situated its Data Store based on the DMF platform on four different sites and four different
hosts since 1991, and has in each case completed the move between platforms and sites in a weekend.
“Users can still access their data holdings in the same way as they did in 1991, while additional services
such as WWW and SRB access have been added.”
Dr. Robert C. Bell, Chief Technical Officer
CSIRO High Performance Scientific Computing
Traditional Multi Tier Storage – no HSM
• 2nd disk tier user to reduce backup window
• Users have access to 10TB
• Full time operators reqd to administer
backups and manually archive
10TB
online
= 5 years of fulls
+ 12 months of fulls
+ 13 days of incrementals
+ special backups
+ archives
190TB
offline
Multi tier with HSM
• Users now have access to 180TB
• 1 part time operator to load new tapes, monitor backups and
HSM exceptions
• Reduced TCO
180TB
10TB
nearline
offline
The backup problem
Accesses to file 123.dat
a 100GB file
Acquire
&
Convert
123.dat is backed up every week even
after it’s no longer interesting because to
do otherwise would require resources.
• 10.4TB of IO is performed to write
123.dat backups per year
Week10
Analyse & Interpret
• 1.3TB of media is used to store
123.dat and its copies most of which
will never be used
Week5
Retire
Week3 full backup
Week15
Week2 full backup
Week1 full backup
Day1 incremental backup
Time
Week20
Week25
Week30
Week35
The backup solution – HSM
Backups are very much shorter because only
the inode (512B) is required for recovery.
•0.6TB of IO is performed to write the
HSM copies of for 123.dat
•0.3TB of media is used to store 123.dat
and its copies
•53KB of IO per year is performed to
ensure 123.dat is recoverable – a 94%
reduction
Accesses to file 123.dat
a 100GB file
Week10
Acquire
&
Format
Analyse & Interpret
Week5
Week3 full backup
Week15
Day 180;
tier1 copy
automatically
removed
Week2 full backup
Week1 full
backup
Day1; copy to tier2
Day1
incremental
backup
Retire
Day 203;
tier1 copy
automatically
reinstated
from tier2
Day 233;
tier1 copy
automatically
removed
Week20
Week25
Week30
Week35
Time
Without HSM, Backups Consume More Energy
30W per CPU core
2W per LAN switch port
20W per TB
120W per socket
10W per DIMM
2W per SAN switch port
2W per TB
SGI Contribution to Energy Initiatives
tc99.ashraetcs.org/
www.thegreengrid.org/
www.climatesaverscomputing.org/
www.80plus.org
www.epa.gov
www.energystar.gov
www.eere.energy.gov/femp
www.spec.org/power_ssj2008/
www.green500.org
Green500 list
Green500
Rank
21
22
32
42
42
46
51
75
78
85
89
97
159
211
227
321
352
Site
Computer
Total Exploration Production
SGI Altix ICE 8200EX, Xeon quad core 3.0 GHz
NASA/Ames Research Center/NAS
SGI Altix ICE 8200EX, Xeon QC 3.0/2.66 GHz
Grand Equipement National de Calcul Intensif - Centre Informatique SGI Altix ICE 8200EX, Xeon quad core 3.0 GHz
HLRN at Universitaet Hannover / RRZN
SGI Altix ICE 8200EX, Xeon quad core 3.0 GHz
HLRN at ZIB/Konrad Zuse-Zentrum fuer Informationstechnik
SGI Altix ICE 8200EX, Xeon quad core 3.0 GHz
Irish Centre for High-End Computing
SGI Altix ICE 8200EX, Xeon quad core 2.80 GHz
University of Arizona
SGI Altix ICE 8200, Xeon quad core 2.83 GHz
SGI
SGI Altix ICE 8200, 3.0 GHz
New Mexico Computing Applications Center (NMCAC)
SGI Altix ICE 8200, Xeon quad core 3.0 GHz
Naval Research Laboratory (NRL)
SGI Altix ICE 8200, 3.0 GHz
Idaho National Laboratory
SGI Altix ICE 8200, 2.66 GHz
University of Minnesota
SGI Altix XE 1300 Cluster Solutions, 2.66GHZ, Inf
Roshydromet
SGI Altix ICE 8200, Xeon quad core 2.83 GHz
Japan Marine Science and Technology
Altix 4700 1.6 GHz
Wright-Patterson Air Force Base/DoD ASC
Altix 4700 1.6 GHz
Leibniz Rechenzentrum
Altix 4700 1.6 GHz
NASA/Ames Research Center/NAS
SGI Altix 1.5/1.6/1.66 GHz, Voltaire Infiniband
Mflops per
Watt
240.0
233.0
211.1
206.6
206.6
204.5
201.0
157.3
154.6
148.9
144.4
138.0
108.7
72.4
66.2
57.1
54.2
Power
(kW)
442.0
2090.0
608.2
129.2
129.2
122.8
67.1
123.1
861.6
97.5
123.1
125.4
125.8
201.6
777.3
990.2
1228.5
TOP500
Rank
17
3
14
107
108
118
463
190
12
406
247
268
451
397
49
44
39
Source: green500_200811.xls
•
November 2008 List
– 12 of “greenest” 100 are SGI systems
– SGI systems rank 55.4% better than list average
(17 SGI systems average 153.2 Mf/W vs. 98.6 Mf/W overall)
– The SGI ICE at Total is the 3rd most energy efficient x86based machine among the world's 30 fastest
supercomputers measured running Linpack
Building Water-Cooled Racks
(4) Individual Coils
Condensate Drain Pan
Target Heat Rejection
95% water / 05% air
Chilled-Water Supply
45°F to 60°F (7.2°C to 15.6°C)
14.4 gpm (3.3 m3/hr) Max.
Branch Feed to
Individual Coil
3/4” (1.91 cm) Coupling
Swivel Coupling to
Supply Hose
Building Award Winning Power Supplies
•
•
SGI awarded 80plus silver and
bronze certification for Altix systems
shipping since January 2007.
2837W Power Supply exceeds
ClimateSaver specifications for 1
and 2U servers.
Planned
Building More Efficient Future HPC Systems
•
•
•
•
•
•
SGI ‘Molecule’ concept
computer
10x core density, ½ the power
usage
Commodity components
(mobile device processor)
20x memory bw of a single
rack x86 cluster
Intel Atom N330 based; 8W
TDP compare 120 W for
XEON Harpertown
‘Kelvin’ liquid cooling system
Delivering Green Consulting Services
! Data Center Assessment
! Dense Systems Assessment
! Energy Efficiency Assessment
! Power Distribution Assessment
! Storage Assessment
Projected Electricity Use, All Scenarios 2007 to 2011
Source: U.S. EPA Report to Congress on Server and Data Center Energy Efficiency, Aug 07
The problem with disk
90
Enterprise storage capacity growth
Enterprise HDD capacity growth
80
(% Growth)
70
60
50
40
30
20
10
0
1998
1999
2000
2001
2002
2003
2004
2005
2006
2007
2008
2009
2010
2011
! Drive capacity growing at diminishing rate ~30%
! Enterprise capacity increasing ~ constant 50%
! Drive capacity growth exceeds IOPs growth
! Forces tending to increase drive counts (and power consumption)
Source: IDC 2008
2012
Power dissipation per unit area
Source: Uptime Institute
Source: Uptime Institute
DMF Saving the planet
•
Automated tape libraries use a fraction of the power
of RAID storage
•
SATA disk is greener than SCIS/SAS/FC but isn’t
appropriate for all workloads
•
It is better for the planet for users to manage
metadata better (w. MediaFlux) and wait a little
longer for data
•
“It’s cheaper to push everything off to tape than
have someone look at it”
•
Centralised storage services improve data
protection and data access to wider audiences
•
“The delete key is the greenest on the keyboard!”
•
Every Watt saved by ITS reduces green house gas
emissions
Q&A
Source: NZ Herald
Thank You!
© 2008 Silicon Graphics, Inc. All rights reserved. Silicon Graphics, SGI, SGI Altix, the SGI logo and the SGI cube are registered trademarks and
Innovation for Results is a trademark of Silicon Graphics, Inc., in the United States and/or other countries worldwide. All other trademarks
mentioned herein are the property of their respective owners. (06/08)