Data Storage Technologies

STUDY GUIDE
Data Storage Technologies
Ramūnas MARKAUSKAS
Vilnius University
2012
Cycle: 1st level
Study program: Information Technologies
Course unit code: ITDST
Awarding institution: Department of Computer Science II, Faculty of Mathematics and Informatics
Preparation of the study guide was supported by
the project „Increasing Internationality in Study Programs of the Department of
Computer Science II“, project number VP1–2.2–ŠMM-07-K-02-070, funded by The
European Social Fund Agency and the Government of Lithuania.
© Ramūnas Markauskas, 2012
© Vilnius University, 2012
Contents
Abstract
Assessment strategy
Content of the course
Literature
Exam test
Project cases
Problem solving
Laboratory work “Installation and Configuration of Virtual Data Storage (VDS)”
Study Guide: Data Storage Technologies
Abstract
This course is intended for students who wish or need to understand the different types of storage
technologies, their architectures and technological trends. The information provided in
this course is essential for IT system administrators and is key knowledge for data storage
administrators, but it is also useful in everyday life when dealing with personal storage.
The course is delivered by means of inclusive lectures, problem solving in class and individually,
individual analysis of literature, presentations, project work, case analysis, data interpretation and
consulting.
Assessment strategy
Assessment of the course consists of:
1. Exam test for a maximum of 4 points (5 open questions, 0.4 points each, and 10 multiple-choice
questions, 0.2 points each). Assessment criteria are based on the correctness of the
answer. Deadline: during the exam session;
2. Project presentation / defence for a maximum of 3 points. The project is done in groups of two
students. Assessment criteria are based on logical reasoning and conformity to technical
requirements (up to 80%); level of presentation and oratory (up to 10%); style of presentation
(up to 10%). Deadline: by the 14th lecture;
3. Class / homework presentation / defence for a maximum of 3 points (5 problems, 0.6 points
each). Assessment criteria are based on one’s ability to explain the logic of the solution (up to
40%) and the connection between variables and parameters of technical equipment (up to
40%); correct arithmetic (up to 20%). Deadline: by the 9th lecture;
4. Additionally, to earn an extra 3 points, one can present / defend the laboratory work
“Installation and Configuration of Virtual Data Storage”. Assessment criteria are based on
the fact of installation of the virtual data storage (explanation up to 10%), configuration of
Back-End components (fact up to 15%, explanation up to 15%), configuration of Front-End
components and access (fact up to 20%, explanation up to 20%), demonstration of access
(up to 20%). Deadline: by the end of the course.
Content of the course
1. Introduction to Storage technologies [1: chapter 1]
a. Information Storage
b. Evolution of storage technology and architecture
c. Key challenges
d. Information Lifecycle
2. Storage system environment [1: chapter 2]
a. Components of storage system
b. Disk drive components
c. Disk drive performance
d. Fundamental laws for drive performance
e. Logical components of the host
3. Data protection (RAID) [1: chapter 3]
a. Implementation of RAID
b. RAID array components
c. RAID levels
d. RAID comparison
e. RAID impact on disk performance
4. Intelligent storage systems [1: chapter 4]
a. Components
b. Intelligent storage array
5. DAS and SCSI [1: chapter 5]
a. Types of DAS
b. DAS benefits and limits
c. Disk drive interfaces
d. Introduction to parallel SCSI
e. SCSI command model
6. NAS [1: chapter 7]
a. General purpose servers versus NAS devices
b. Benefits of NAS
c. NAS File I/O
d. Components of NAS
e. NAS implementations
f. NAS file sharing protocols (NFS, CIFS)
g. NAS I/O operations
h. NAS performance and availability
7. SAN [1: chapter 6]
a. Overview of Fibre Channel
b. SAN and its evolution
c. Components of SAN
d. FC connectivity
e. FC ports
f. FC architecture
g. Zoning
h. FC login types
i. FC topologies
8. IP SAN [1: chapter 8]
a. Components of iSCSI
b. iSCSI host connectivity
c. iSCSI protocol stack
d. iSCSI names and sessions
e. iSCSI error handling and security
f. FCIP
9. CAS [1: chapter 9]
a. Fixed content and archives
b. Types of archives
c. Benefits of CAS
d. CAS architecture
e. Object storage and retrieval in CAS
f. CAS examples
10. Storage, server virtualization, real world examples [1: chapter 10]
a. Forms of virtualization
b. SNIA virtualization taxonomy
c. Storage virtualization configurations
d. Virtualization challenges
e. Types of storage virtualization
11. Business continuity, Backup and recovery [1: chapter 11, 12]
a. Backup purpose
b. Backup considerations
c. Granularity
d. Recovery considerations
e. Backup methods and process
f. Backup and restore operations
g. Topologies
h. Backup in NAS
i. Backup technologies
12. Local and remote replication [1: chapter 13, 14]
13. Storage security and management [1: chapter 15, 16]
a. Storage security framework
b. Risk triad
c. Storage security domains
d. Security implementations
e. Monitoring infrastructure
f. Management activities
g. Management challenges
Literature
Required
[1] EMC Education Services, Information Storage and Management: Storing, Managing, and
Protecting Digital Information. John Wiley & Sons, 2009.
Optional
EMC Education Services, Information Storage and Management: Storing, Managing, and
Protecting Digital Information in Classic, Virtualized, and Cloud Environments, 2nd ed. John Wiley
& Sons, 2012.
G. Schulz, Cloud and Virtual Data Storage Networking. Taylor & Francis, 2011.
M. Gupta, Storage Area Network Fundamentals. Cisco Press, 2002.
Exam test
Example of an open question
How many milliseconds does it take for a 5,400 RPM HDD to complete one full rotation? Please give the
full solution. (The answer may be found in lecture #2)
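The arithmetic behind this kind of question can be sketched as follows. This is only an illustration: the function name is hypothetical and not taken from the lectures. A drive spinning at N RPM completes N revolutions in 60,000 ms, so one revolution takes 60,000 / N ms.

```python
def rotation_time_ms(rpm: float) -> float:
    """Time for one full platter revolution, in milliseconds."""
    if rpm <= 0:
        raise ValueError("rpm must be positive")
    return 60_000.0 / rpm  # one minute is 60,000 ms


print(f"{rotation_time_ms(5400):.2f} ms")  # prints "11.11 ms"
```

The average rotational latency usually quoted for a drive is half of this value, since on average the requested sector is half a revolution away from the head.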
Example of a multiple-choice question
What is the minimum number of controllers for a storage array of type AA (active-active)? (The answer
may be found in lecture #4)
[A] 1
[B] 2
[C] 3
[D] 4
Project cases
Each group of two students should choose one project case and inform the lecturer by e-mail
about the group members and their choice. The same case may be chosen by no more than two
groups, and the solutions of those groups should be entirely different. Detailed case interpretation
will be done during laboratory work. Projects should be presented / defended in the form of a
presentation. The presentation should consist of:
1. Introduction: presentation of the current situation;
2. Architectural solution: presentation of the proposed system’s architecture and the requirements for
the environment;
3. Proposal: presentation of real-world hardware with specifications and prices to fulfill the
requirements.
Extensive use of diagrams and graphics is desirable.
1st case
Select and present data storage equipment for an MS Exchange 2010 environment with the following
parameters / requirements:
1. Number of mailboxes: 4000;
2. Average mailbox size: 2 GB;
3. Expected growth in the number of mailboxes: 1% per year;
4. Expected growth in mailbox size: 50% per year;
5. Share of intensively used mailboxes: 30%;
6. RPO: 24 hours;
7. RTO: 1 hour;
8. Period: 3 years.
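When sizing for a case like this one, a first step is to project the raw mailbox capacity at the end of the planning period. The sketch below assumes that both growth rates compound annually, which the case description does not state explicitly; the function name is illustrative.

```python
def projected_capacity_gb(mailboxes: int, avg_size_gb: float,
                          count_growth: float, size_growth: float,
                          years: int) -> float:
    """Total mailbox capacity after `years`, assuming compound annual growth."""
    final_count = mailboxes * (1 + count_growth) ** years
    final_size = avg_size_gb * (1 + size_growth) ** years
    return final_count * final_size


# Parameters of the 1st case: 4000 mailboxes of 2 GB,
# +1% mailboxes and +50% mailbox size per year, over 3 years.
total_gb = projected_capacity_gb(4000, 2.0, 0.01, 0.50, 3)
print(f"Projected capacity: {total_gb / 1000:.1f} TB")  # about 27.8 TB
```

This figure covers mailbox data only. RAID overhead, hot spares, backup copies needed to meet the RPO / RTO, and any database overhead of the mail server come on top of it.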
2nd case
Select and present data storage equipment for an e-mail server of your choice with the following
parameters / requirements:
1. Number of mailboxes: 500;
2. Average mailbox size: 1 GB;
3. Expected growth in the number of mailboxes: 5% per year;
4. Expected growth in mailbox size: 120% per year;
5. Share of intensively used mailboxes: 80%;
6. RPO: 5 min;
7. RTO: 0 min;
8. Period: 5 years.
3rd case
Select and present data storage equipment for an application with the following parameters /
requirements:
1. IOPS: 450,000;
2. Block size: 4 KB;
3. Distribution of read / write operations: 80 / 20;
4. RPO: 0 hours;
5. RTO: 0 hours;
6. Records should be kept online for at least 3 years and may be destroyed after 10 years;
7. Period: 10 years.
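The IOPS and block-size figures together imply a sustained throughput, which is worth estimating before choosing interfaces and controllers. The sketch below assumes 1 MB = 1024 KB and that the read / write split applies proportionally to throughput; the function name is illustrative.

```python
def throughput_mb_s(iops: int, block_kb: float) -> float:
    """Sustained throughput in MB/s implied by an IOPS rate and a block size."""
    return iops * block_kb / 1024  # assumption: 1 MB = 1024 KB


total = throughput_mb_s(450_000, 4)          # figures from the 3rd case
reads, writes = total * 0.80, total * 0.20   # 80 / 20 read / write split
print(f"total {total:.1f} MB/s, reads {reads:.1f} MB/s, writes {writes:.1f} MB/s")
```

The same calculation applies to the other IOPS-based cases by substituting their block sizes and read / write ratios.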
4th case
Current equipment: one AA type SAN device with nine 15,000 RPM disks, connected over FCP and
configured in RAID 5 mode (two RAID groups with four disks in each; one disk used as Hot Spare).
Requirements:
1. IOPS: 250,000;
2. Block size: 4 MB;
3. Distribution of read / write operations: 40 / 60;
4. RPO: 0 hours;
5. RTO: 0 hours;
6. Records should be kept online for at least 3 years and may be destroyed after 10 years;
7. Period: 10 years.
5th case
Select and present data storage equipment for an application with the following parameters /
requirements:
1. IOPS: 50,000;
2. Block size: 16 KB;
3. Distribution of read / write operations: 90 / 10;
4. Database size: 20 TB;
5. Expected growth of database: 15% per year;
6. RPO: 0 hours;
7. RTO: 24 hours;
8. Records should be kept online for at least 5 years and may be destroyed after 25 years.
6th case
Current equipment: one AA type SAN device with 128 GB SSDs connected over FCP and configured
in RAID 5 mode (two RAID groups with six disks in each; one disk used as Hot Spare).
Requirements:
1. IOPS: 100,000;
2. Block size: 1 MB;
3. Distribution of read / write operations: 90 / 10;
4. Database size: 600 GB;
5. Expected growth of database: 100% per year;
6. RPO: 0 hours;
7. RTO: 24 hours;
8. Data distribution by creation date:
a. Recent year – 80.0%;
b. 1 to 3 years – 15.0%;
c. 3 to 10 years – 5.0%;
d. Over 10 years – 0.0%.
9. Records should be kept online for at least 3 years and may be destroyed after 15 years.
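For cases that start from existing equipment, it helps to compare the usable capacity of the current RAID groups with the projected database size. The sketch below assumes that 128 GB is the capacity of each individual SSD and that growth compounds annually; neither is stated explicitly in the case. The function name is illustrative.

```python
def raid_usable_gb(level: int, disks: int, disk_gb: float) -> float:
    """Usable capacity of one RAID group (hot spares not included)."""
    if level == 1:
        return disks * disk_gb / 2      # mirroring: half of the raw capacity
    if level == 5:
        return (disks - 1) * disk_gb    # one disk's worth of parity
    if level == 6:
        return (disks - 2) * disk_gb    # two disks' worth of parity
    raise ValueError(f"unsupported RAID level: {level}")


# 6th case as described: two RAID 5 groups of six 128 GB SSDs each.
usable = 2 * raid_usable_gb(5, 6, 128)
print(usable)  # 1280 (GB), before file-system and formatting overhead

# Database size at the end of each year, doubling from 600 GB.
size, year = 600.0, 0
while size <= usable:
    year += 1
    size *= 2
print(year)  # 2: by the end of year 2 the database no longer fits
```

Under these assumptions the current equipment runs out of capacity well before the retention period ends, so the proposal has to address capacity as well as IOPS.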
7th case
Current equipment: one AA type SAN device with 256 GB SSDs connected over FCP and configured
in RAID 1 mode (six RAID groups with two disks in each; one disk used as Hot Spare).
Requirements:
1. IOPS: 150,000;
2. Block size: 1 MB;
3. Distribution of read / write operations: 85 / 15;
4. Database size: 1 TB;
5. Expected growth of database: 120% per year;
6. RPO: 0 hours;
7. RTO: 24 hours;
8. Data distribution by creation date:
a. Recent year – 60.0%;
b. 1 to 3 years – 25.0%;
c. 3 to 10 years – 10.0%;
d. Over 10 years – 5.0%.
9. Records should be kept online for at least 5 years and may be destroyed after 20 years.
Problem solving
Each problem should be solved in a written form and defended.
1st problem
An application operates with 64 KB blocks and has 1,000 intensive users, who generate 2 IOPS
each, and 2,000 regular users, who generate 1 IOPS each. Each intensive user uses up to 400
GB and each regular user up to 75 GB of storage space. The ratio of read to write operations is
2:1. Management processes generate an additional 20% of IOPS. Calculate the IOPS
requirements for RAID levels 1, 5 and 6.
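One common way to approach this kind of calculation is to multiply host writes by a RAID write penalty. The sketch below assumes the commonly cited penalties (2 for RAID 1, 4 for RAID 5, 6 for RAID 6) and assumes that the management overhead has the same read / write mix as the user load; the method presented in the lectures may differ. Names are illustrative.

```python
# Assumption: commonly cited back-end I/Os caused by one host write.
WRITE_PENALTY = {1: 2, 5: 4, 6: 6}


def backend_iops(host_iops: float, read_fraction: float, raid_level: int) -> float:
    """Disk-side IOPS needed to serve a given host-side load."""
    reads = host_iops * read_fraction
    writes = host_iops * (1 - read_fraction)
    return reads + writes * WRITE_PENALTY[raid_level]


# Inputs taken from the 1st problem.
user_iops = 1000 * 2 + 2000 * 1     # intensive users + regular users
host_iops = user_iops * 1.20        # plus 20% for management processes
for level in (1, 5, 6):
    print(f"RAID {level}: {backend_iops(host_iops, 2 / 3, level):.0f} IOPS")
```

A written solution should still explain where each penalty comes from, for example that a small RAID 5 write requires reading and rewriting both the data and the parity.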
2nd problem
The manufacturer gives the following parameters for a hard disk drive: rotation speed – 15,000 RPM;
external data transfer rate – 3 Gbps; internal data transfer rate – 120 MBps; average seek time – 3
ms; capacity – 2 TB. Calculate the IOPS capability of this disk if 64 KB data blocks are used.
3rd problem
The manufacturer gives the following parameters for a solid state drive: external data transfer rate –
6 Gbps; internal data transfer rate – 400 MBps; average seek time – 0.1 ms; capacity – 256 GB.
Calculate the IOPS capability of this disk if 4 KB data blocks are used.
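A simple service-time model for the 2nd and 3rd problems treats one I/O as seek time plus average rotational latency plus the time to transfer one block at the internal rate. The sketch below makes several assumptions: 1 MB is taken as 1000 KB, the SSD has no rotational latency and its quoted seek time is treated as access latency, and no utilization factor is applied (course materials may apply one). Names are illustrative.

```python
from typing import Optional


def disk_iops(seek_ms: float, block_kb: float, transfer_mb_s: float,
              rpm: Optional[float] = None) -> float:
    """Estimate IOPS as 1 / (seek + rotational latency + transfer time)."""
    # Average rotational latency is half a revolution; zero when rpm is not given.
    latency_ms = 0.5 * 60_000 / rpm if rpm else 0.0
    # KB divided by MB/s yields milliseconds when 1 MB = 1000 KB.
    transfer_ms = block_kb / transfer_mb_s
    return 1000 / (seek_ms + latency_ms + transfer_ms)


hdd = disk_iops(seek_ms=3, block_kb=64, transfer_mb_s=120, rpm=15_000)
ssd = disk_iops(seek_ms=0.1, block_kb=4, transfer_mb_s=400)
print(f"HDD: ~{hdd:.0f} IOPS, SSD: ~{ssd:.0f} IOPS")
```

The external transfer rate and the capacity do not enter this model; a full solution should state why they can be neglected or show how they would change the result.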
4th problem
Calculate how many disks from the 2nd problem will be needed to fulfill the requirements of the
application from the 1st problem for each of the RAID types mentioned.
5th problem
Calculate how many disks from the 3rd problem will be needed to fulfill the requirements of the
application from the 1st problem for each of the RAID types mentioned.
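For the 4th and 5th problems, the disk count is typically driven by whichever requirement is larger: performance or capacity. The sketch below uses placeholder inputs that are not the solutions, and it ignores RAID capacity overhead and minimum RAID group sizes, which a complete answer should take into account. Names are illustrative.

```python
import math


def disks_needed(required_iops: float, disk_iops: float,
                 required_gb: float, disk_gb: float) -> int:
    """Disks needed to satisfy both the IOPS and the capacity requirement."""
    for_performance = math.ceil(required_iops / disk_iops)
    for_capacity = math.ceil(required_gb / disk_gb)
    return max(for_performance, for_capacity)


# Placeholder inputs for illustration only; substitute your own results.
print(disks_needed(required_iops=10_000, disk_iops=150,
                   required_gb=50_000, disk_gb=2_000))  # prints 67
```

In this placeholder example performance dictates the count (67 disks against 25 for capacity); with other inputs the capacity requirement can dominate instead.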
Laboratory work “Installation and Configuration of Virtual Data Storage (VDS)”
Goal
• To get to know the technological principles of data storage, the configuration options and the
protocols in use;
• To try out the process of data storage installation / configuration in practice by using VDS.
Work flow
1. Get to know the candidates for VDS installation (FreeNAS, NAS4Free, …);
2. Decide which one you will be working with;
3. Prepare a virtual machine, in an environment of your own, that fulfills the requirements of your
chosen VDS;
4. Create 3 to 5 additional virtual disks of the smallest possible size for information storage;
5. Install the virtual machine and complete the basic configuration tasks (user accounts, network
configuration, …);
6. Create a RAID array of the selected level;
7. Configure the VDS to be accessed over CIFS and iSCSI;
8. Access the VDS over CIFS and iSCSI;
9. Simulate a disaster by removing one disk from the RAID array and comment on the situation.
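To help with commenting on the last step, the sketch below illustrates why an array with single parity (for example RAID 5) keeps serving data after one disk is removed. It is a simplified model with made-up strip contents, not the implementation used by any particular VDS: the parity strip is the byte-wise XOR of the data strips, so any one missing strip equals the XOR of the remaining ones.

```python
from functools import reduce


def xor_blocks(blocks: list[bytes]) -> bytes:
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


data = [b"DISK", b"DATA", b"TEST"]   # three data strips of one stripe (made up)
parity = xor_blocks(data)            # parity strip stored on a fourth disk

# "Remove" the second disk and rebuild its strip from the survivors.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)
print(rebuilt == data[1])  # prints True
```

With a mirrored level such as RAID 1 the surviving copy is read directly, while a striped array without redundancy (RAID 0) would lose its data in the same experiment.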