Case study
DATABASE MANAGEMENT
SYSTEMS
Oracle Database 11g Release 2 (11.2) – www.oracle.com
MySQL 5.5 – www.mysql.com
Oracle 11g
Oracle 11g
• Oracle Database 11g Release 2 (11.2) –
Documentation Library - www.oracle.com
• [Oracle 1] Oracle® Database Concepts
11g Release 2 (11.2)
– Part Number E16508-05, October 2010
• [Oracle 2] Oracle® Database Backup
and Recovery User’s Guide
– Part Number E10642-04, October 2010
Outline
• 1. Overview of Oracle DBMS
• 2. Disk Storage, File Structures and Hashing
• 3. Indexing Structures
• 4. Query Processing
• 5. Transaction Processing
• 6. Concurrency Control
• 7. Recovery Techniques
Outline
• 1. Overview of Oracle DBMS ([Oracle 1], Part I.1)
• 2. Disk Storage, File Structures and Hashing
([Oracle 1], Part I.2, Part IV)
• 3. Indexing Structures ([Oracle 1], Part I.3)
• 4. Query Processing ([Oracle 1], Part II)
• 5. Transaction Processing ([Oracle 1], Part III.10)
• 6. Concurrency Control ([Oracle 1], Part III.9)
• 7. Recovery Techniques ([Oracle 1], Part VI.18,
Part V.13, [Oracle 2])
MySQL 5.5
MySQL V5
• Released in October 2005
– http://www.mysql.com
• Open source relational DBMS (Community
Edition)
• Database back-end of choice for Web
publishing
• Modular architecture for a choice of a storage
engine that performs best for user’s needs
• Support for the standard SQL (SQL:2003)
implementation and application programming
interfaces
MySQL Server Architecture
MySQL V5
• Several MySQL storage engines
– InnoDB for transactional systems
– MyISAM for analytical systems including data warehouses
and data marts
– Memory, formerly known as Heap, for high-performance
applications
– NDB, the Cluster Storage engine, for high availability and
scalability
– Archive for efficient storage of large data volumes
– Federated, providing for local access to remote data
tables
– Merge, also known as MRG_MyISAM, which collects
identical MyISAM tables for unified access
MySQL V5
MySQL V5
MySQL V5
• More support
– ACID (atomic, consistent, isolated, and durable)
transactions,
– Distributed transaction processing conforming to the
X/Open XA specification,
– Triggers and stored procedures, and views
– Indexing
– MySQL's internal metadata schema based on relational
standards
– Spatial extensions (R-tree indexes) with specifications
from Open Geospatial Consortium (OGC)
– Table partitioning
– Scalability from small databases to terabyte databases
MySQL 5.5 – InnoDB 1.1 Storage Engine
• [MySQL 1] InnoDB 1.1 for MySQL 5.5
User’s Guide, Oracle, 2011
• [MySQL 2] MySQL 5.5 Reference
Manual, Oracle, 2011
– 13.6 & 13.7, pp. 1108 - 1232
MySQL 5.5 – InnoDB 1.1 Storage Engine
• The default MySQL storage engine
– Its design follows the ACID model, with transactions featuring commit,
rollback, and crash-recovery capabilities to protect user data
– Row-level locking and Oracle-style consistent reads increase multiuser concurrency and performance.
– InnoDB tables arrange your data on disk to optimize common queries
based on primary keys. Each InnoDB table has a primary key index
called the clustered index that organizes the data to minimize I/O for
primary key lookups.
– To maintain data integrity, InnoDB also supports FOREIGN KEY
referential-integrity constraints.
– The InnoDB storage engine maintains its own buffer pool for caching
data and indexes in main memory. InnoDB stores its tables and
indexes in a tablespace, which may consist of several files (or raw
disk partitions).
MySQL 5.5 – InnoDB 1.1 Storage Engine
• Data storage and structure
– By default, all InnoDB tables and indexes are
stored in the system tablespace that consists
of data files.
– As an alternative, you can store each InnoDB
table and its indexes in its own file. This
feature is called “multiple tablespaces”
because each table that is created when this
setting is in effect has its own tablespace.
MySQL 5.5 – InnoDB 1.1 Storage Engine
• Data storage and structure
– The physical row structure for an InnoDB table depends
on the row format specified when the table was created.
– InnoDB uses the COMPACT format by default, but the
REDUNDANT format is available to retain compatibility
with older versions of MySQL.
– To check the row format of an InnoDB table, use SHOW
TABLE STATUS.
– The compact row format decreases row storage space
by about 20% at the cost of increasing CPU use for some
operations.
– The COMPACT & REDUNDANT formats (MySQL 2, pp.
1170)
MySQL 5.5 – InnoDB 1.1 Storage Engine
• Data storage and structure
– Tablespace  data files (segments)  extents  pages
– Each tablespace consists of database pages with a default size of
16KB.
– The pages are grouped into extents of size 1MB (64 consecutive
pages).
– The “files” inside a tablespace are called segments in InnoDB.
– When a segment grows inside the tablespace, InnoDB allocates the
first 32 pages to it individually. After that, InnoDB starts to allocate
whole extents to the segment. InnoDB can add up to 4 extents at a
time to a large segment to ensure good sequentiality of data.
– Two segments are allocated for each index in InnoDB. One is for
nonleaf nodes of the B-tree, the other is for the leaf nodes. Keeping
the leaf nodes contiguous on disk enables better sequential I/O
operations, because these leaf nodes contain the actual table data.
MySQL 5.5 – InnoDB 1.1 Storage Engine
• Indexing: clustered index & secondary indexes
• Clustered index
– Every InnoDB table has a special index called the clustered index
where the data for the rows is stored. Typically, the clustered index is
synonymous with the primary key.
– If you define a PRIMARY KEY on your table, InnoDB uses it as the
clustered index.
– If you do not define a PRIMARY KEY for your table, MySQL locates
the first UNIQUE index where all the key columns are NOT NULL and
InnoDB uses it as the clustered index.
– If the table has no PRIMARY KEY or suitable UNIQUE index, InnoDB
internally generates a hidden clustered index on a synthetic column
containing row ID values. The rows are ordered by the ID that InnoDB
assigns to the rows in such a table. The row ID is a 6-byte field that
increases monotonically as new rows are inserted. Thus, the rows
ordered by the row ID are physically in insertion order.
MySQL 5.5 – InnoDB 1.1 Storage Engine
• Secondary indexes
– All indexes other than the clustered index are
known as secondary indexes.
– In InnoDB, each record in a secondary index
contains the primary key columns for the row,
as well as the columns specified for the
secondary index.
– InnoDB uses this primary key value to search
for the row in the clustered index.
MySQL 5.5 – InnoDB 1.1 Storage Engine
• Physical structure of an InnoDB index
– All InnoDB indexes are B-trees where the index records
are stored in the leaf pages of the tree.
– The default size of an index page is 16KB.
– When new records are inserted, InnoDB tries to leave
1/16 of the page free for future insertions and updates
of the index records.
– If index records are inserted in a sequential order
(ascending or descending), the resulting index pages
are about 15/16 full.
– If records are inserted in a random order, the pages are
from 1/2 to 15/16 full.
– If the fill factor of an index page drops below 1/2,
InnoDB tries to contract the index tree to free the page.
MySQL 5.5 – InnoDB 1.1 Storage Engine
• Query processing and optimization
(MySQL 2, Chapter 7, pp. 568-667)
MySQL 5.5 – InnoDB 1.1 Storage Engine
• Transaction management
– By default, each client that connects to the MySQL
server begins with autocommit mode enabled, which
automatically commits every SQL statement as you
execute it.
– To use multiple-statement transactions, you can
switch autocommit off with the SQL statement SET
autocommit = 0 and end each transaction with either
COMMIT or ROLLBACK.
– If you want to leave autocommit on, you can begin
your transactions within START TRANSACTION and
end them with COMMIT or ROLLBACK.
MySQL 5.5 – InnoDB 1.1 Storage Engine
MySQL 5.5 – InnoDB 1.1 Storage Engine
• Concurrency control (MySQL 2, Sections 13.6.9 & 13.6.10, pp. 1159
– 1168)
– In the InnoDB transaction model, the goal is to combine the best
properties of a multi-versioning database with traditional twoขphase
locking. InnoDB does locking on the row level and runs queries as
nonlocking consistent reads by default, in the style of Oracle.
– The lock information in InnoDB is stored so space-efficiently that lock
escalation is not needed: Typically, several users are permitted to lock
every row in InnoDB tables, or any random subset of the rows, without
causing InnoDB memory exhaustion.
– In terms of the SQL:1992 transaction isolation levels, the default InnoDB
level is REPEATABLE READ. InnoDB offers all four transaction isolation
levels described by the SQL standard: READ UNCOMMITTED, READ
COMMITTED, REPEATABLE READ, and SERIALIZABLE.
– A user can change the isolation level for a single session or for all
subsequent connections with the SET TRANSACTION statement.
– To set the server's default isolation level for all connections, use the -transaction-isolation option on the command line or in an option file.
MySQL 5.5 – InnoDB 1.1 Storage Engine
• Concurrency control (MySQL 2, Sections
13.6.9 & 13.6.10, pp. 1159 – 1168)
– In row-level locking, InnoDB normally uses
next-key locking.
• That means that besides index records, InnoDB
can also lock the “gap” preceding an index record
to block insertions by other sessions in the gap
immediately before the index record.
• A next-key lock refers to a lock that locks an index
record and the gap before it.
• A gap lock refers to a lock that locks only the gap
before some index record.
MySQL 5.5 – InnoDB 1.1 Storage Engine
• Concurrency control (MySQL 2, Sections 13.6.9 & 13.6.10, pp. 1159
– 1168)
– InnoDB implements standard row-level locking where there are two
types of locks:
• A shared (S) lock permits a transaction to read a row.
• An exclusive (X) lock permits a transaction to update or delete a row.
– Additionally, InnoDB supports multiple granularity locking which permits
coexistence of record locks and locks on entire tables.
– To make locking at multiple granularity levels practical, additional types
of locks called intention locks are used.
• Intention locks are table locks in InnoDB.
• The idea behind intention locks is for a transaction to indicate which type of
lock (shared or exclusive) it will require later for a row in that table.
– There are two types of intention locks used in InnoDB (assume that
transaction T has requested a lock of the indicated type on table t):
• Intention shared (IS): Transaction T intends to set S locks on individual rows
in table t.
• Intention exclusive (IX): Transaction T intends to set X locks on those rows.
MySQL 5.5 – InnoDB 1.1 Storage Engine
• Concurrency control (MySQL 2, Sections 13.6.9
& 13.6.10, pp. 1159 – 1168)
– The intention locking protocol is as follows:
• Before a transaction can acquire an S lock on a row in table
t, it must first acquire an IS or stronger lock on t.
• Before a transaction can acquire an X lock on a row, it must
first acquire an IX lock on t.
– These rules can be conveniently summarized by
means of the following lock type compatibility matrix.
– A lock is granted to a requesting transaction if it is
compatible with existing locks, but not if it conflicts
with existing locks. A transaction waits until the
conflicting existing lock is released. If a lock request
conflicts with an existing lock and cannot be granted
because it would cause deadlock, an error occurs.
MySQL 5.5 – InnoDB 1.1 Storage Engine
• Concurrency control (MySQL 2, Sections 13.6.9
& 13.6.10, pp. 1159 – 1168)
– The lock type compatibility matrix
X
IX
S
IS
X
Conflict
Conflict
Conflict
Conflict
IX
Conflict
Compatible
Conflict
Compatible
S
Conflict
Conflict
Compatible
Compatible
IS
Conflict
Compatible
Compatible
Compatible
MySQL 5.5 – InnoDB 1.1 Storage Engine
• Concurrency control (MySQL 2, Sections 13.6.9
& 13.6.10, pp. 1159 – 1168)
– Implicit transaction commit and rollback
– Deadlock detection and rollback
MySQL 5.5 – InnoDB 1.1 Storage Engine
• Concurrency control (MySQL 2, Sections 13.6.9
& 13.6.10, pp. 1159 – 1168)
– InnoDB Multi-Versioning
MySQL 5.5 – InnoDB 1.1 Storage Engine
• Backup and recovery management
– If the value of innodb_flush_log_at_trx_commit is 0, the
log buffer is written out to the log file once per second
and the flush to disk operation is performed on the log
file, but nothing is done at a transaction commit.
– When the value is 1 (the default), the log buffer is
written out to the log file at each transaction commit
and the flush to disk operation is performed on the log
file.
– When the value is 2, the log buffer is written out to the
file at each commit, but the flush to disk operation is not
performed on it. However, the flushing on the log file
takes place once per second also when the value is 2.
Note that the once-per-second flushing is not 100%
guaranteed to happen every second, due to process
scheduling issues.
MySQL 5.5 – InnoDB 1.1 Storage Engine
• Backup and recovery management
(MySQL 2, Section 13.6.7, pp. 1156 1159)
Other DBMSs
• Several commercial DBMSs
– IBM DB2
– Informix
– Microsoft Access
– Microsoft SQL Server
– InterSystem Cache
– O2
– Gemstone
• Several open source DBMSs
– PostgreSQL