Secondary Storage Management

Himanshu Gupta
Outline
•  Memory Hierarchy
•  Disk, and Disk Access Time
•  Storing (Relational) Data on Disks
•  Pointer Swizzling
•  Variable-Length or Large Records/Fields
•  Deletions and Insertions of Records
Memory Hierarchy
•  Tertiary Storage (PBs; seconds to minutes): non-volatile
•  Secondary Storage/Disks (TBs; 10 msec): non-volatile
•  Between disk and main memory, data moves in blocks (virtual memory; file system)
•  Main Memory (GBs; 10-100 nsec): volatile
•  Cache (1 MB; 1-5 nsec): volatile
Disk Controllers
[Figure: components labeled P, M, and C, connected to Secondary Storage.]
Secondary Storage – Disks
…
Terms: Head, Platter/Surface, Cylinder, Track, Sector, Gap (to delineate sectors)
Top View
Disk Access Time
[Figure: a request "I want block X" must end with block X in memory; how long does that take?]
Access time for a random block =
Seek Time +
Rotational Delay +
Transfer Time +
Other (contention, CPU time; negligible in comparison)
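The sum above can be turned into a small calculator. This is only a sketch: the function name, the parameters, and the drive numbers are illustrative assumptions, not figures from these slides. It uses the standard estimate that the average rotational delay is half a revolution, and it ignores the "Other" term.

```python
def random_access_ms(avg_seek_ms, rpm, block_bytes, transfer_bytes_per_ms):
    """Estimate the time (in ms) to read one randomly placed block."""
    rotation_ms = 60_000 / rpm               # time for one full revolution
    rotational_delay_ms = rotation_ms / 2    # on average, half a revolution
    transfer_ms = block_bytes / transfer_bytes_per_ms
    return avg_seek_ms + rotational_delay_ms + transfer_ms


# Hypothetical drive: 5 ms average seek, 7200 RPM, 4 KB blocks, 100 KB per ms.
t = random_access_ms(avg_seek_ms=5.0, rpm=7200,
                     block_bytes=4096, transfer_bytes_per_ms=102_400)
print(f"{t:.2f} ms")
```

With these assumed numbers the seek and rotational terms dominate, and the transfer term contributes only a few hundredths of a millisecond.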
Disk Access Time
[Figure: block X is now in memory; the next request is for the block that follows it.]
Access time for the next block =
Transfer Time + Negligible
The negligible part covers:
- skip gap
- switch track
- once in a while, next cylinder
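To see how much the two access patterns differ, the sketch below compares reading n blocks at random positions with reading n consecutive blocks. The function name and the timing values are assumptions chosen for illustration. Following the formula above, every sequential block after the first is charged only the transfer time.

```python
def read_cost_ms(n_blocks, seek_ms, rotational_delay_ms, transfer_ms, sequential):
    """Estimated time (ms) to read n_blocks, either randomly placed or consecutive."""
    if n_blocks == 0:
        return 0.0
    first = seek_ms + rotational_delay_ms + transfer_ms
    if sequential:
        # Only the first block pays seek + rotation; the rest pay transfer time.
        return first + (n_blocks - 1) * transfer_ms
    # Every random block pays the full cost.
    return n_blocks * first


# Assumed timings: 5 ms seek, 4 ms rotational delay, 0.05 ms transfer per block.
random_ms = read_cost_ms(1000, 5.0, 4.0, 0.05, sequential=False)
sequential_ms = read_cost_ms(1000, 5.0, 4.0, 0.05, sequential=True)
print(random_ms, sequential_ms)
```

Under these assumptions the random pattern takes about 9 seconds and the sequential one about 59 ms, a gap of more than two orders of magnitude.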
Rule of Thumb
•  Random I/O: Expensive
•  Sequential I/O: Much less expensive
•  Writing: Similar cost as reading
•  Update:
(a) Read Block
(b) Modify in Memory
(c) Write Block
Accelerating Access to Disk Storage
•  I/O cost is dominant (compared to main-memory computation)
Techniques to reduce I/O cost:
•  Organize data by cylinders
•  Using multiple disks (in parallel)
•  Elevator algorithm (to serve data requests)
•  Pre-fetching, Large-Scale Buffering
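As an illustration of the elevator algorithm, the sketch below orders a fixed set of pending cylinder requests: the head keeps moving in its current direction, serving requests as it passes them, and reverses only when nothing is left ahead. The function names and the request list are hypothetical, and a real scheduler would also handle requests that arrive while the head is moving.

```python
def elevator_order(head, requests, moving_up=True):
    """Return the order in which pending cylinder requests are served."""
    if moving_up:
        ahead = sorted(c for c in requests if c >= head)
        behind = sorted((c for c in requests if c < head), reverse=True)
    else:
        ahead = sorted((c for c in requests if c <= head), reverse=True)
        behind = sorted(c for c in requests if c > head)
    return ahead + behind      # finish the current sweep, then reverse


def head_travel(head, order):
    """Total number of cylinders the head crosses when serving `order`."""
    total = 0
    for cylinder in order:
        total += abs(cylinder - head)
        head = cylinder
    return total


pending = [10, 95, 52, 40, 70]
order = elevator_order(50, pending)
print(order, head_travel(50, order), head_travel(50, pending))
```

For this request list the elevator order moves the head across 130 cylinders, compared with 210 when the requests are served in arrival order.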
Storing Relations on Disks
•  To store: Attributes, records, relations
•  Physical Data Hierarchy
Ø Attrs à Records à Blocks à Relations
•  Block is the unit of I/O transfer.
•  Database relations map to one or more
blocks. Himanshu Gupta
Storage–13
Fixed-Length Records
•  Records with fixed-length fields:
Ø  [Record header, <attr1>, <attr2>, … ]
Ø  Record header = (schema, length, timestamp). The schema (type and bytes for each attribute) lets us access the attributes; it can be substituted by attribute offsets/pointers.
•  Packing records into a block:
Ø  [Block header, <rec1>, <rec2>, … ]
Ø  Block header = (pointers to other blocks, schema, record offsets, timestamps, other info)
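A minimal sketch of this layout, using Python's `struct` module. The schema (an integer id, a 16-byte name, a float salary), the 4096-byte block size, and a block header that holds only a record count are all assumptions made for the example; the record header is omitted to keep it short.

```python
import struct

# Hypothetical schema: id (4-byte int), name (16 bytes), salary (8-byte float).
RECORD = struct.Struct("<i16sd")   # "<" means no padding: 4 + 16 + 8 = 28 bytes
HEADER = struct.Struct("<H")       # assumed block header: record count only
BLOCK_SIZE = 4096                  # assumed block size


def pack_block(records):
    """Pack (id, name, salary) tuples into one fixed-size block."""
    capacity = (BLOCK_SIZE - HEADER.size) // RECORD.size
    if len(records) > capacity:
        raise ValueError("too many records for one block")
    body = b"".join(RECORD.pack(rid, name.encode("ascii"), salary)
                    for rid, name, salary in records)
    return (HEADER.pack(len(records)) + body).ljust(BLOCK_SIZE, b"\x00")


def read_record(block, i):
    """Fixed length means record i sits at a computable offset."""
    (count,) = HEADER.unpack_from(block, 0)
    if not 0 <= i < count:
        raise IndexError(i)
    rid, name, salary = RECORD.unpack_from(block, HEADER.size + i * RECORD.size)
    return rid, name.rstrip(b"\x00").decode("ascii"), salary


block = pack_block([(1, "ann", 10.5), (2, "bob", 20.0)])
print(len(block), read_record(block, 1))
```

Because every record has the same size, no per-record offsets are needed here; the block header would need them once records vary in length.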
Record Addressing
•  How does one refer to records?
Ø  Physical/Direct Addressing
Ø  Logical/Indirect Addressing
Ø  Mixed/Structured Approach
Purely Physical Address
Record Address/ID = (Block ID, Offset in block)
Block ID = (Device ID, Cylinder #, Track/Surface #, Block #)
Logical/Indirect Addressing
E.g., Record ID is an arbitrary bit string.
[Figure: map table with columns Rec ID and Physical addr.]
Advantages: Records are mobile; deletion doesn’t leave dangling pointers.
Structured/Mixed Approach
Record Address: (Physical Block Address, Record-ID in block)
[Figure: a block consisting of a header, free space, and records R1, R2, R3, R4.]
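One way to realize this mixed scheme is a block whose header maps each record-ID to an offset inside the block. In the sketch below the class and method names are hypothetical and the header is kept outside the byte array for simplicity. The point it shows is that a record's address, (block address, record-ID), stays valid even when the record moves within the block.

```python
class Block:
    """A block whose header maps record-ID -> (offset, length)."""

    def __init__(self, size=64):
        self.data = bytearray(size)
        self.slots = []            # header: index is the record-ID in the block
        self.free_end = size       # records are placed from the end backwards

    def insert(self, payload):
        if len(payload) > self.free_end:
            raise ValueError("not enough free space in block")
        self.free_end -= len(payload)
        self.data[self.free_end:self.free_end + len(payload)] = payload
        self.slots.append((self.free_end, len(payload)))
        return len(self.slots) - 1             # record-ID within this block

    def get(self, record_id):
        offset, length = self.slots[record_id]
        return bytes(self.data[offset:offset + length])

    def update(self, record_id, payload):
        """Replace a record; others may move, but their record-IDs do not change."""
        records = [self.get(i) for i in range(len(self.slots))]
        records[record_id] = payload
        self.slots, self.free_end = [], len(self.data)
        for rec in records:                    # re-pack in record-ID order
            self.insert(rec)


b = Block()
r1 = b.insert(b"R1")
r2 = b.insert(b"R2")
old_offset = b.slots[r2][0]
b.update(r1, b"R1-longer")
print(b.get(r2), old_offset, b.slots[r2][0])
```

After the update, R2 sits at a different offset, yet the same record-ID still retrieves it, so pointers held outside the block need no change.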
Pointer Swizzling
•  Consider a block B read into memory. What happens to the pointers to B? Note that these pointers are disk-addresses.
•  Option 1: Create a translation table in memory that maps disk-addresses to memory-addresses (for blocks currently in memory).
•  Option 2: Change the disk-addresses to memory-addresses (in other blocks in memory that contain pointers to B). This is called pointer swizzling (the “translation table” is still needed).
Pointer Swizzling: Issues
•  Automatic swizzling: Whenever a block is read, update the table and swizzle all relevant pointers.
•  Swizzling on demand: Swizzle a pointer only when it is first followed (the table is still updated when a block is read).
•  Unswizzling: Needed when blocks are written back to disk.
•  Pinned blocks (due to the recovery system or the swizzling mechanism): To unpin a block, we first need to unswizzle the pointers to it.
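Swizzling on demand can be sketched as follows. All names here (`Ptr`, `BufferPool`, `follow`) are hypothetical, and the "disk" is just a dictionary. A real system overwrites the disk address in place with a memory address; this sketch keeps both fields side by side so the two states are easy to see.

```python
class Ptr:
    """A pointer field: holds a disk address until it is swizzled."""

    def __init__(self, disk_addr):
        self.disk_addr = disk_addr
        self.mem_ref = None            # set once the pointer is swizzled


class BufferPool:
    def __init__(self, disk):
        self.disk = disk               # assumed: dict of disk address -> block
        self.table = {}                # translation table for blocks in memory

    def read_block(self, disk_addr):
        if disk_addr not in self.table:            # table updated on every read
            self.table[disk_addr] = self.disk[disk_addr]
        return self.table[disk_addr]

    def follow(self, ptr):
        if ptr.mem_ref is None:                    # first try: swizzle now
            ptr.mem_ref = self.read_block(ptr.disk_addr)
        return ptr.mem_ref                         # later tries skip the table

    def unswizzle(self, ptr):
        ptr.mem_ref = None             # only the disk address is written back


disk = {7: ["contents of block 7"]}
pool = BufferPool(disk)
p = Ptr(7)
first = pool.follow(p)
swizzled = p.mem_ref is not None
pool.unswizzle(p)
print(first, swizzled, p.mem_ref, p.disk_addr)
```

Automatic swizzling would instead walk every pointer in a block as soon as the block is read, which costs more up front but avoids the check in `follow`.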
Variable-Length/Format Fields/Records
•  Keep appropriate information in the record header:
Ø  Pointers to attribute positions. Works for variable-length fields.
Ø  Number/size of elements. Works for attributes that are lists of values.
•  “Tagged fields”: Indicate type, length, etc. at the start of the attribute. Works for variable-format records.
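The first option, pointers to attribute positions in the record header, might look like the sketch below. The exact header layout (a 2-byte field count followed by one 2-byte end offset per field) is an assumption made for the example, as are the function names.

```python
import struct


def encode_record(fields):
    """Encode byte-string fields as: [count, end offset of each field, data...]."""
    header_size = 2 + 2 * len(fields)
    ends, pos = [], header_size
    for field in fields:
        pos += len(field)
        ends.append(pos)               # where this field ends within the record
    header = struct.pack(f"<H{len(fields)}H", len(fields), *ends)
    return header + b"".join(fields)


def get_field(record, i):
    """Fetch field i directly, without scanning the fields before it."""
    (count,) = struct.unpack_from("<H", record, 0)
    ends = struct.unpack_from(f"<{count}H", record, 2)
    start = 2 + 2 * count if i == 0 else ends[i - 1]
    return record[start:ends[i]]


rec = encode_record([b"Ann", b"", b"New York"])
print(get_field(rec, 0), get_field(rec, 1), get_field(rec, 2))
```

Storing end offsets means an empty field costs no data bytes, and any field can be located with two header lookups.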
Large Records
•  Some records may not fit in a single block
•  Need to be “spanned” across blocks
Unspanned: records must be within one block
[Figure: records R1, R2, R3, R4, R5, ... stored in block 1, block 2, ..., with no record crossing a block boundary.]
Spanned:
[Figure: block 1 holds R1, R2, and R3(a); block 2 holds R3(b), R4, R5, R6, and R7(a); ...]
With spanned records:
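[Figure: the spanned layout again, with R3 and R7 each split into parts (a) and (b) across consecutive blocks.]
First part, e.g., R3(a): need indication of partial record, and a pointer to the rest
Second part, e.g., R3(b): need indication of continuation (+ from where?)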
•  Spanned essential if record size > block size
•  Unspanned is simpler, but may waste space
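A sketch of spanned packing follows. Each stored fragment carries two flags corresponding to the indications mentioned above: whether it continues an earlier fragment, and whether more of the record follows in the next block. The function name and tuple layout are hypothetical, and the space taken by the flags themselves is ignored.

```python
def pack_spanned(records, block_size):
    """Pack byte-string records into blocks, splitting records at block boundaries.

    Each fragment is (record_no, is_continuation, has_more, data).
    """
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    blocks, current, free = [], [], block_size
    for rec_no, rec in enumerate(records):
        start = 0
        while True:
            if free == 0:                      # current block is full
                blocks.append(current)
                current, free = [], block_size
            piece = rec[start:start + free]
            end = start + len(piece)
            current.append((rec_no, start > 0, end < len(rec), piece))
            free -= len(piece)
            start = end
            if start >= len(rec):
                break
    if current:
        blocks.append(current)
    return blocks


blocks = pack_spanned([b"aaaa", b"bbbb", b"cccccc"], block_size=10)
for number, block in enumerate(blocks, start=1):
    print(number, block)
```

Here the third record is split: its first two bytes fill block 1 with `has_more` set, and the remaining four bytes open block 2 with `is_continuation` set. An unspanned packer would instead leave the last two bytes of block 1 unused.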
Very Large Fields (BLOBs)
•  Some fields may be very large (e.g., videos or pictures). It may make sense to store them:
Ø  Separate from the rest of the record.
Ø  Across disks, for efficient parallelized access.
Ø  Indexed, so that any portion can be retrieved (e.g., minutes 45-50 of the movie).
Other Storage Options
•  Split Records: Rather than storing rows, we can store columns!
Ø  Will require keeping the “ID” with each attribute.
Ø  Could be helpful if we need some statistics for an attribute.
•  Ordering the records:
Ø  Sorted by some attribute value.
Ø  Physical or logical (linked list)
Record Deletion
[Figure: a block from which record Rx is deleted.]
(a) Immediately reclaim space
(b) Dangling pointers? Next.
Concern with deletions
Dangling pointers
[Figure: a pointer to the deleted record R1 now leads to "?".]
Solutions
1. Do not worry
2. Use Tombstone
Tombstones for Physical IDs
E.g., Leave MARK in old location
[Figure: a block after a deletion. The space holding the MARK is never re-used; the rest of the deleted record's space can be re-used.]
Tombstones for Logical IDs
E.g., Leave MARK in the map
[Figure: map with columns ID and LOC; the LOC entry for ID 7788 holds the MARK.]
Never reuse ID 7788 nor space in map...
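A small sketch of tombstones in a logical-ID map. The class name, the sentinel object used as the MARK, and the location tuples are assumptions for illustration. Deleted IDs keep their map entry, and new IDs are drawn from a counter that only grows, so an old pointer can never silently reach a different record.

```python
TOMBSTONE = object()       # the MARK left in the map for a deleted record


class RecordMap:
    def __init__(self):
        self.map = {}      # record ID -> physical location, or TOMBSTONE
        self.next_id = 1

    def insert(self, location):
        rid = self.next_id             # IDs only grow: a deleted ID is never reused
        self.next_id += 1
        self.map[rid] = location
        return rid

    def delete(self, rid):
        self.map[rid] = TOMBSTONE      # keep the entry; only the location is dropped

    def lookup(self, rid):
        location = self.map[rid]
        return None if location is TOMBSTONE else location


m = RecordMap()
a = m.insert(("block 3", 0))
m.delete(a)
b = m.insert(("block 3", 0))           # the freed space may be reused
print(m.lookup(a), m.lookup(b), a != b)
```

The cost is that the map never shrinks: every deleted record leaves one entry behind.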
Insertions
•  If records NOT in sequence (easy case)
Ø  Insert at end of file or in deleted slot
Ø  If records are variable size, not as easy...
•  If records in sequence
Ø  If free space close by, not too bad...
Ø  Or use overflow idea...
[Figure: free space.]
Next
Given a key, how do we find a record quickly?