Ch.8 - Faculty Website Listing

Chapter 8: Top-down Relational
Database Design: NORMALIZATION
Database System Concepts, 6th Ed.
©Silberschatz, Korth and Sudarshan
See www.db-book.com for conditions on re-use
What can happen when we combine
relations/tables?
Suppose we combine the tables
instructor and department
This creates redundancy (repetition of
information):
Database System Concepts - 6th Edition
8.2
©Silberschatz, Korth and Sudarshan
What can happen when we combine
relations/tables?
Another problem: UPDATE
When any of the redundant info is changed, the changes
hav eto be applied to multiple tuples!
Database System Concepts - 6th Edition
8.3
©Silberschatz, Korth and Sudarshan
What can happen when we combine
relations/tables?
Another problem: INSERTION
When a new department is created, there are no
instructors associated with it yet → need to use NULL
values!
Database System Concepts - 6th Edition
8.4
©Silberschatz, Korth and Sudarshan
Is there any reason to combine
relations/tables?
Any query that involves a natural join
between department and instructor will
execute faster on the combined table!
This is generally preferred in data mining.
Database System Concepts - 6th Edition
8.5
©Silberschatz, Korth and Sudarshan
The “top-down” approach
In this chapter, we look at the problem in the
opposite direction:
Starting with a large table that contains many
columns and much redundant information, how can
we split (decompose) it into tables with fewer
columns and less redundancy?
Database System Concepts - 6th Edition
8.6
©Silberschatz, Korth and Sudarshan
Top-down: Decomposition
Suppose we start with the table inst_dept. How would we
get the idea to decompose it into instructor and
department?
 Naïve approach: spot redundancies in data … but it
doesn’t work for two reasons!
Database System Concepts - 6th Edition
8.7
©Silberschatz, Korth and Sudarshan
Spotting redundancies in data
Problem 1: It’s costly
 Real-life DBs can have large amount of data (hundreds
of columns, hundreds of millions of rows)
 Spotting redundancies requires consideration of
combinations of elements from a set that is already
large →
 → Combinatorial explosion (N-squared and worse)
Database System Concepts - 6th Edition
8.8
©Silberschatz, Korth and Sudarshan
Spotting redundancies in data
Problem 2: From data alone, it’s impossible to decide
whether a pattern discovered is coincidence or not
 Is it the case that departments always reside in one
building and have a unique budget?
Database System Concepts - 6th Edition
8.9
©Silberschatz, Korth and Sudarshan
Solution:
Examine not the data itself (a.k.a. syntax), but the meaning
of the data, a.k.a. the semantics!
The designer must be allowed to specify rules of the
enterprise, a.k.a. functional dependencies, e.g.
dept_name → building, budget
Database System Concepts - 6th Edition
8.10
©Silberschatz, Korth and Sudarshan
dept_name  building, budget
What does it mean?
“If several rows have the same value for dept_name,
then they also have the same values for building and
budget.”
or
“If there were a schema (dept_name, building, budget),
then dept_name would be a candidate key.”
Database System Concepts - 6th Edition
8.11
©Silberschatz, Korth and Sudarshan
dept_name  building, budget
“If there were a schema (dept_name, building, budget),
then dept_name would be a candidate key.”
Since in our table inst_dept dept_name is not a candidate key,
the building and budget of a department may have to be repeated
along with dept_name.

This indicates the need to decompose inst_dept.
Database System Concepts - 6th Edition
8.12
©Silberschatz, Korth and Sudarshan
dept_name  building, budget
“If there were a schema (dept_name, building, budget),
then dept_name would be a candidate key.”
This example also shows how functional dependencies (FD) are
different from keys: a FD captures a rule that is in general more
granular than a key.
A key is a FD, but a FD is not always a key!
Database System Concepts - 6th Edition
8.13
©Silberschatz, Korth and Sudarshan
Not all decompositions are good!
Suppose we decompose
employee(ID, name, street, city, salary)
into
employee1 (ID, name)
employee2 (name, street, city, salary)
Problem: we cannot reconstruct the original employee relation!
Database System Concepts - 6th Edition
8.14
©Silberschatz, Korth and Sudarshan
A lossy decomposition
Database System Concepts - 6th Edition
8.15
©Silberschatz, Korth and Sudarshan
But there are also lossless decompositions!
 Technically it’s called a lossless-join decomposition
 Decomposition of R = (A, B, C)
R1 = (A, B) R2 = (B, C)
A
B C
A B
B
C


1
2


1
2
A
B
A
B
A,B(r)
r
A (r)
B (r)
Database System Concepts - 6th Edition
1
2
A
B C


1
2
B,C(r)
A
B
8.16
©Silberschatz, Korth and Sudarshan
How to avoid lossy decompositions?
Database System Concepts - 6th Edition
8.17
©Silberschatz, Korth and Sudarshan
Goal — Devise a theory for the following
 Decide whether a particular relation R is in “good” form.
 When the relation R is not in “good” form, decompose
it into a set of relations {R1, R2, ..., Rn} such that

each relation is in good form

the decomposition is a lossless-join decomposition
 Our theory is based on dependencies:

functional dependencies

multivalued dependencies
 The process outlined above is called NORMALIZATION
Database System Concepts - 6th Edition
8.18
©Silberschatz, Korth and Sudarshan
8.2 First Normal Form (1NF)
 A domain is atomic if its elements are treated by the
DBMS as indivisible units.
 Examples of non-atomic domains:
Names
IDs
that can be broken up into parts (e.g. CS401)
Phone
Any
with first +middle + last
numbers
composite attributes!
 A relational schema R is in first normal form (1NF) if
the domains of all attributes of R are atomic.
 For now, we assume all relations to be in 1NF (but see
Ch.22: Object-Based Databases)
Database System Concepts - 6th Edition
8.19
©Silberschatz, Korth and Sudarshan
1NF
Atomicity is actually a property of how the elements of the
domain are used!
Example: Students are given roll numbers which are
strings of the form CS0012 or EE1127

Strings would normally be considered indivisible …

… but if the first two characters are extracted to find
the dept., the domain of roll numbers is not atomic.

Doing so is a bad idea: leads to encoding of
information in the app. program rather than in the DB.
Why
is this bad?
What
should the DB designer do in this case?
Database System Concepts - 6th Edition
8.20
©Silberschatz, Korth and Sudarshan
8.3 Functional Dependencies (FD)
 FDs are constraints on the set of legal relations.
 Require that the value for a certain set of attributes
determine uniquely the value for another set of
attributes.
 A FD is a generalization of the concept of key: A key
requires that the value for a certain set of attributes
determine uniquely the value for all remaining
attributes.
Database System Concepts - 6th Edition
8.21
©Silberschatz, Korth and Sudarshan
Functional Dependencies
 Let R be a relation schema, and a, b two sets of attributes
  R and   R
 The functional dependency

holds on R if and only if for any legal relations r(R), whenever
any two tuples t1 and t2 of r agree on the attributes , they also
agree on the attributes . That is,
t1[] = t2 []  t1[ ] = t2 [ ]
 Example: Consider r(A,B ) with the following instance of r.
1
1
3
4
5
7
 On this instance, A  B does NOT hold, but B  A does hold.
Database System Concepts - 6th Edition
8.22
©Silberschatz, Korth and Sudarshan
QUIZ: Functional Dependencies
Decide if the following FDs hold or not:
 AB
 BA
 {A, C}  D
 {A, B, C}  D
Database System Concepts - 6th Edition
8.23
©Silberschatz, Korth and Sudarshan
FD vs. key
 K is a superkey for relation schema R if and only if K  R
 K is a candidate key for R if and only if

K  R, and

for no   K,   R
 FDs allow us to express constraints that cannot be expressed
using (super)keys. Consider the schema:
inst_dept (ID, name, salary, dept_name, building, budget )
We expect these FDs to hold:
dept_name building
ID  building
but would not expect the following FD to hold:
dept_name  salary
Database System Concepts - 6th Edition
8.24
©Silberschatz, Korth and Sudarshan
QUIZ: FD vs. key
Decide if the following are candidate keys of not:
 A
 B
 {A, C}
 {A, B, C}
 {A, B, C, D}
 D
Database System Concepts - 6th Edition
8.25
©Silberschatz, Korth and Sudarshan
Uses for FDs
 Test relations to see if they are legal under a given set of
FDs.

If a relation r is legal under a set F of FDs, we say
that r satisfies F.
 Specify constraints on the set of legal relations
We
say that F holds on R if all legal relations on R
satisfy the set of FDs F.
 Note: A specific instance of a relation schema may
satisfy a FD even if the FD does not hold on all legal
instances.

Example: a specific instance of instructor may, by
chance, satisfy
name  ID.
Database System Concepts - 6th Edition
8.26
©Silberschatz, Korth and Sudarshan
Trivial FD
A functional dependency is trivial if it is satisfied by all
instances of a relation

Example:

ID, name  ID

name  name
In general,  
Database System Concepts - 6th Edition
 is trivial if   
8.27
©Silberschatz, Korth and Sudarshan
QUIZ: Trivial FDs
Give 4 examples of trivial FDs in this relation.
Database System Concepts - 6th Edition
8.28
©Silberschatz, Korth and Sudarshan
The Holy Grail:
Closure of a set of FDs
 Given a set F of FDs, there are certain other
FDs that are logically implied by F.

For example: If A  B and B  C, then we can
infer that A  C
 The set of all FDs logically implied by F is the
closure of F.
 We denote the closure of F by F+.
 F+ is a superset of F.
Database System Concepts - 6th Edition
8.29
©Silberschatz, Korth and Sudarshan
8.3.2 Boyce-Codd Normal Form
A relation schema R is in BCNF with respect to a set F of
FDs if for all FDs in F+ of the form
 
(where   R and   R), at least one of the following is
true:
    is trivial (i.e.,   )
  is a superkey for R
Database System Concepts - 6th Edition
8.30
©Silberschatz, Korth and Sudarshan
QUIZ: BCNF
at least one of the following holds:
    is trivial (i.e.,   )
  is a superkey for R
Is this schema in BCNF?
instr_dept (ID, name, salary, dept_name, building, budget )
Database System Concepts - 6th Edition
8.31
©Silberschatz, Korth and Sudarshan
QUIZ: BCNF
at least one of the following holds:
    is trivial (i.e.,   )
  is a superkey for R
Is this schema in BCNF?
instr_dept (ID, name, salary, dept_name, building, budget )
No, because
dept_name building, budget
holds, but dept_name is not a superkey (Why?)
Database System Concepts - 6th Edition
8.32
©Silberschatz, Korth and Sudarshan
Extra-credit QUIZ: BCNF
EOL1/3
Database System Concepts - 6th Edition
8.33
©Silberschatz, Korth and Sudarshan
Quiz: What is the difference between keys and
functional dependencies (FD)?
Database System Concepts - 6th Edition
8.34
©Silberschatz, Korth and Sudarshan
Quiz: What is the difference between keys and
functional dependencies (FD)?
A key is a FD, but a FD is not always a key!
Database System Concepts - 6th Edition
8.35
©Silberschatz, Korth and Sudarshan
BCNF Decomposition
Suppose we have a schema R and a non-trivial
dependency  causes a violation of BCNF.
We decompose R into:
• (U  )
•( R - (  -  ) )
Database System Concepts - 6th Edition
8.36
©Silberschatz, Korth and Sudarshan
Decomposing a Schema into BCNF
We decompose R into:
• (U  )
•( R - (  -  ) )
In our example:

= dept_name

= building, budget
and inst_dept is replaced by

(U  ) = ( dept_name, building, budget )
(
R - (  -  ) ) = ( ID, name, salary, dept_name )
Database System Concepts - 6th Edition
8.37
©Silberschatz, Korth and Sudarshan
QUIZ 1: BCNF
We decompose R into:
• (U  )
•( R - (  -  ) )
Take  = {A, B, C, D}   = {C, D, E, F}, and the
entire relation is R = {A,B,C,D,E,F,G,H}
What is the decomposition?
Database System Concepts - 6th Edition
8.38
©Silberschatz, Korth and Sudarshan
QUIZ 2: BCNF
We decompose R into:
• (U  )
•( R - (  -  ) )
Take  = {A, B}   = {E, F}, and the entire relation
is R = {A,B,C,D,E,F,G,H}
What is the decomposition?
Database System Concepts - 6th Edition
8.39
©Silberschatz, Korth and Sudarshan
QUIZ 3: BCNF
Is this relation in BCNF?
Hint: Rename the attributes A, B, C, ….
Database System Concepts - 6th Edition
8.40
©Silberschatz, Korth and Sudarshan
QUIZ 3: BCNF
A: Not BCNF, b/c both FDs are violations!
Decompose it to BCNF!
Database System Concepts - 6th Edition
8.41
©Silberschatz, Korth and Sudarshan
QUIZ 3: BCNF
Solution:
Database System Concepts - 6th Edition
8.42
©Silberschatz, Korth and Sudarshan
Dependency Preservation
Constraints, including FDs, are costly to check in
practice unless they pertain to only one relation.
If it is sufficient to test only those dependencies on each
individual relation of a decomposition in order to
ensure that all functional dependencies hold, then
that decomposition is dependency preserving.
Database System Concepts - 6th Edition
8.43
©Silberschatz, Korth and Sudarshan
BCNF and Dependency Preservation
ER model of a bank: A
customer can have
more than 1 personal
banker, but at most
one at any given
branch.
A ternary relationshipset is needed:
Database System Concepts - 6th Edition
8.44
©Silberschatz, Korth and Sudarshan
BCNF and Dependency Preservation
Implementation:
R = cust_banker_branch = (customer_id, employee_id,
branch_name, type)
FDs: FD1: employee_id  branch_name
FD2: (customer_id, branch_name)  (employee_id, type)
Is cust_banker_branch in BCNF?
Database System Concepts - 6th Edition
8.45
©Silberschatz, Korth and Sudarshan
BCNF and Dependency Preservation
Implementation:
R = cust_banker_branch = (customer_id, employee_id,
branch_name, type)
FDs: FD1: employee_id  branch_name
FD2: (customer_id, branch_name)  (employee_id, type)
Apply the decomposition algorithm!
Database System Concepts - 6th Edition
8.46
©Silberschatz, Korth and Sudarshan
BCNF and Dependency Preservation
Implementation:
R = cust_banker_branch = (customer_id, employee_id,
branch_name, type)
FDs: FD1: employee_id  branch_name
FD2: (customer_id, branch_name)  (employee_id, type)
Decomposition:
R1 = (employee_id, branch_name)
R2 = (customer_id, employee_id, type)
Problem: FD2 is now “spread” across
two relations!
Database System Concepts - 6th Edition
8.47
©Silberschatz, Korth and Sudarshan
BCNF and Dependency Preservation
Conclusion:
BCNF is not dependency preserving (in
general)
Because it is not always possible to achieve both
BCNF and dependency preservation, we consider a
weaker normal form …
Database System Concepts - 6th Edition
8.48
©Silberschatz, Korth and Sudarshan
Third Normal Form = 3NF
A relation schema R is in third normal form (3NF) if for all:
   in F+
at least one of the following holds:

   is trivial (i.e.,   )

 is a superkey for R

Each attribute A in  –  is contained in a candidate key for R.
(NOTE: each attribute may be in a different candidate key)
 If a relation is in BCNF it is in 3NF (since in BCNF one of the first two
conditions above must hold).
 Third condition is a minimal relaxation of BCNF to ensure dependency
preservation.
Database System Concepts - 6th Edition
8.49
©Silberschatz, Korth and Sudarshan
SKIP all other 3NF theory!
The only facts about 3NF we cover are
those on the previous slide!
Database System Concepts - 6th Edition
8.50
©Silberschatz, Korth and Sudarshan
Whatever happened with 2NF?
In a nutshell, it forbids attributes to depend on parts of keys.
It is not of practical use anymore.
See Second normal form - Wikipedia, the free encyclopedia
for more details.
Database System Concepts - 6th Edition
8.51
©Silberschatz, Korth and Sudarshan
Review of Normal Forms
Database System Concepts - 6th Edition
8.52
©Silberschatz, Korth and Sudarshan
Updated list of Normalization Goals
Let R be a relation scheme with a set F of FDs:
 Decide whether R is in “good” form.
 If R is not in “good” form, decompose it into a set of relation schemes
{R1, R2, ..., Rn} such that :

each relation scheme is in good form

the decomposition is a lossless-join decomposition

Preferably, the decomposition should be dependency preserving.
Database System Concepts - 6th Edition
8.53
©Silberschatz, Korth and Sudarshan
8.4 Functional-Dependency Theory
This is the formal theory that tells us which
functional dependencies are implied logically
by a given set of functional dependencies.
Database System Concepts - 6th Edition
8.54
©Silberschatz, Korth and Sudarshan
Remember: Closure of a Set of FDs
 Given a set F set of FDs, there are certain other
FDs that are logically implied by F.

E.g. transitivity: If A  B and B  C, then
also A  C
 The set of all functional dependencies logically
implied by F is the closure of F.
 We denote the closure of F by
Database System Concepts - 6th Edition
8.55
F+.
©Silberschatz, Korth and Sudarshan
Armstrong’s Axioms
We can find F+, the closure of F, by repeatedly applying
Armstrong’s Axioms:

if   , then   
(reflexivity)

if   , then     
(augmentation)

if   , and   , then    (transitivity)
These rules are

sound (They generate only FDs that actually hold)

complete (They generate all FDs that hold).
Database System Concepts - 6th Edition
8.56
©Silberschatz, Korth and Sudarshan
QUIZ: Armstrong’s Axioms
Write Armstrong’s Axioms:

(reflexivity)

(augmentation)

(transitivity)
Database System Concepts - 6th Edition
8.57
©Silberschatz, Korth and Sudarshan
QUIZ: Armstrong’s Axioms
Write Armstrong’s Axioms:

if   , then   
(reflexivity)

if   , then     
(augmentation)

if   , and   , then   
(transitivity)
Database System Concepts - 6th Edition
8.58
©Silberschatz, Korth and Sudarshan
Examples of use of A’s Axioms
Given the following relation:
and the set of FDs
R = (A, B, C, G, H, I)
F=
{ AB
AC
CG  H
CG  I
B  H}
Some other members of the closure F+ are:

AH


AG  I


by transitivity from A  B and B  H
by augmenting A  C with G, to get AG  CG
and then transitivity with CG  I
CG  HI

by augmenting CG  I to infer CG  CGI,
and augmenting of CG  H to infer CGI  HI,
and then transitivity
Database System Concepts - 6th Edition
8.59
©Silberschatz, Korth and Sudarshan
Your turn!

if   , then   
(reflexivity)

if   , then     
(augmentation)

if   , and   , then   
(transitivity)
Prove that


if and only if   
Double implication:
L-to-R and R-to-L!
EOL 2/3
Database System Concepts - 6th Edition
8.60
©Silberschatz, Korth and Sudarshan
QUIZ: BCNF and 3NF
Consider the following relation:
What non-trivial FDs exist?
Source:
Database http://en.wikipedia.org/wiki/Boyce-Codd_normal_form
System Concepts - 6th Edition
8.61
©Silberschatz, Korth and Sudarshan
QUIZ: BCNF and 3NF
Let’s simplify the notation:
AB → C and C → B.
F1: Person, Shop Type → Nearest Shop
F2: Nearest Shop → Shop Type
Is the relation in BCNF?
Source:
Database http://en.wikipedia.org/wiki/Boyce-Codd_normal_form
System Concepts - 6th Edition
8.62
©Silberschatz, Korth and Sudarshan
QUIZ: BCNF and 3NF
Let’s simplify the notation:
AB → C and C → B.
No, b/c C → B is a violation: C is not superkey.
Is the relation in 3NF?
Source:
Database http://en.wikipedia.org/wiki/Boyce-Codd_normal_form
System Concepts - 6th Edition
8.63
©Silberschatz, Korth and Sudarshan
QUIZ: BCNF and 3NF
Let’s simplify the notation:
AB → C and C → B.
No, b/c C → B is a violation: C is not superkey.
Remember: 3NF has the following condition in addition to BCNF:
Each attribute A in  –  is contained in a candidate key for R.
(NOTE: each attribute may be in a different candidate key)
Source:
Database http://en.wikipedia.org/wiki/Boyce-Codd_normal_form
System Concepts - 6th Edition
8.64
©Silberschatz, Korth and Sudarshan
QUIZ: BCNF and 3NF
Let’s simplify the notation:
AB → C and C → B.
B is part of the candidate key AB.
This shows that C → B is not a 3NF violation, so the relation is
in 3NF!
Do the BCNF decomposition
Source:
Database http://en.wikipedia.org/wiki/Boyce-Codd_normal_form
System Concepts - 6th Edition
8.65
©Silberschatz, Korth and Sudarshan
QUIZ: BCNF and 3NF
Let’s simplify the notation:
AB → C and C → B.
R1 = {B, C}
R2 = {A, C}
Is the decomposition above dependency-preserving?
Source:
Database http://en.wikipedia.org/wiki/Boyce-Codd_normal_form
System Concepts - 6th Edition
8.66
©Silberschatz, Korth and Sudarshan
More FD theorems, a.k.a. rules or
results
Exercise
8.4
Exercise
8.26
Exercise
8.5
Database System Concepts - 6th Edition
8.67
©Silberschatz, Korth and Sudarshan
Naïve Algorithm for Computing F+
Apply as many axioms and theorems to derive new FDs!
Can you find 3 more FDs in this manner?
Do you see a problem with this approach?
Database System Concepts - 6th Edition
8.68
©Silberschatz, Korth and Sudarshan
Extra-credit QUIZ
Database System Concepts - 6th Edition
8.69
©Silberschatz, Korth and Sudarshan
Algorithm for Computing F+
To compute the closure F+ of a set of FDs F:
Assign F+ = F
repeat
for each FD f in F+
apply reflexivity and augmentation rules on f
add the resulting FDs to F +
for each pair of FDs f1and f2 in F +
if f1 and f2 can be combined using transitivity,
add the resulting FD to F +
until F + does not change any further
Database System Concepts - 6th Edition
8.70
©Silberschatz, Korth and Sudarshan
QUIZ: Algorithm for Computing F+
To compute the closure F+ of a set of FDs F:
Assign F+ = F
repeat
for each FD f in F+
apply reflexivity and augmentation rules on f
add the resulting FDs to F +
for each pair of FDs f1and f2 in F +
if f1 and f2 can be combined using transitivity,
add the resulting FD to F +
until F + does not change any further
Apply the algorithm to R = {A, B, C}, with AB → C
Database System Concepts - 6th Edition
8.71
©Silberschatz, Korth and Sudarshan
Closure of Attribute Sets
Since computing the entire closure F+ is in general a formidable
task, we set ourselves first a more modest goal:
 Given a set of attributes , define the closure of  under F
(denoted by +) as the set of attributes that are functionally
determined by  under F

Algorithm to compute +, the closure of  under F
result := ;
while (changes to result) do
for each    in F do
begin
if   result then result := result  
end
Database System Concepts - 6th Edition
8.72
©Silberschatz, Korth and Sudarshan
Example of Attribute Set Closure
 R = (A, B, C, G, H, I)
 F = {A  B
AC
CG  H
CG  I
B  H}
 (AG)+
1. result = AG
2. result = ABCG
(A  C and A  B)
3. result = ABCGH (CG  H and CG  AGBC)
4. result = ABCGHI (CG  I and CG  AGBCH)
Stop (Why?)
 Is AG a candidate key?
Is AG a super key?
1. Does AG  R? Yes, b/c (AG)+ = R.
2. Is any subset of AG a superkey?
1.
1.
2.
Does A  R? No, b/c (A)+ ≠ R
Does G  R? No, b/c (G)+ ≠ R
Database System Concepts - 6th Edition
8.73
©Silberschatz, Korth and Sudarshan
Uses of Attribute Closure Algorithm
 Testing for superkey:

To test if  is a superkey, we compute +, and
check if + contains all attributes of R
 Testing if certain FDs hold:

To check if    holds (is in F+), just check if  
+
 Another algorithm for computing closure F+ of F:

For each   R, find the closure +
for
each S  +, we output the FD   S
Still very expensive, but at least
we have a more systematic way
of doing it!
Database System Concepts - 6th Edition
8.74
©Silberschatz, Korth and Sudarshan
Uses of Attribute Closure Algorithm
 Testing for superkey:

To test if  is a superkey, we compute +, and
check if + contains all attributes of R
 Testing if a certain FDs holds:

To check if    holds (is in F+), just check if  
+
 Another algorithm for computing closure F+ of F:

For each   R, find the closure +
for
each S  +, we output the FD   S
Database System Concepts - 6th Edition
8.75
©Silberschatz, Korth and Sudarshan
QUIZ: Uses of Attribute Closure Alg.
 Testing for superkey:

To test if  is a superkey, we compute +, and
check if + contains all attributes of R
R = (A, B, C, D)
F = {A  BC, B  C, A  B, AB  C, BC → D}
Is AD a superkey?
Is AD a candidate key?
Database System Concepts - 6th Edition
8.76
©Silberschatz, Korth and Sudarshan
QUIZ: Uses of Attribute Closure Alg.
 Testing if a certain FDs holds:

To check if    holds (is in F+), just check if  
+
R = (A, B, C, D)
F = {A  BC, B  C, A  B, AB  C, BC → D}
Does AC → D hold ?
Database System Concepts - 6th Edition
8.77
©Silberschatz, Korth and Sudarshan
Why is closure important for BCNF?
 To check if a non-trivial dependency  causes a violation of BCNF
1. compute + (the attribute closure of ), and
2. verify that it includes all attributes of R, that is, it is a superkey of R.
 Simplified test: To check if a relation schema R is in BCNF, it suffices
to check only the dependencies in the given set F for violation of BCNF,
rather than checking all dependencies in F+.
 If none of the dependencies in F causes a violation of BCNF, then
none of the dependencies in F+ will cause a violation of BCNF
either.
 However, simplified test using only F is incorrect when testing a
relation in a decomposition of R
 Consider R = (A, B, C, D, E), with F = { A  B, BC  D}
 Decompose R into R1 = (A, B) and R2 = (A, C, D, E)
 Neither of the dependencies in F contain only attributes from
(A,C,D,E) so we might be mislead into thinking R2 satisfies
BCNF.
 In fact, dependency AC  D in F+ shows R2 is not in BCNF.
Database System Concepts - 6th Edition
8.78
©Silberschatz, Korth and Sudarshan
SKIP: Canonical Cover
and everything until …
Database System Concepts - 6th Edition
8.79
©Silberschatz, Korth and Sudarshan
8.8 Overall DB Design Process
We have assumed that the schema R is given, but how
does R appear in practice?
 R can be generated when converting E-R diagram to a
set of tables.
 R can be a single relation containing all attributes that are
of interest (called universal relation).

Normalization then breaks R into smaller relations.
 R can be the result of some ad hoc design of relations,
which we then test/convert to normal form.
Database System Concepts - 6th Edition
8.80
©Silberschatz, Korth and Sudarshan
ER Model and Normalization
When an E-R diagram is carefully designed, identifying all entities
correctly, the tables generated from the E-R diagram should not need
further normalization.
However, in a real (imperfect) design, there can be:
 FDs from non-key attributes of an entity set to other attributes of the
same entity set, e.g.:

employee entity with attributes including department_name and
building, and the FD department_name building

Good design would have made department a separate entity
 FDs from non-key attributes of a relationship set to other …

It’s possible, but rare, since most relationships are binary.
Database System Concepts - 6th Edition
8.81
©Silberschatz, Korth and Sudarshan
Denormalization for Performance
 May want to use non-normalized schema for performance
 For example, displaying prereqs along with course_id, and title requires
join of course with prereq
 Alternative 1: Use denormalized relation containing attributes of course
as well as prereq with all above attributes

faster lookup

extra space and extra execution time for updates

extra coding work for programmer and possibility of error in extra code
 Alternative 2: use a materialized view defined as
course

prereq
Benefits and drawbacks same as above, except no extra coding work
for programmer and avoids possible errors
Database System Concepts - 6th Edition
8.82
©Silberschatz, Korth and Sudarshan
Other Design Issues
Some aspects of DB design are not caught by normalization.
Examples of bad DB design, to be avoided: Instead of
earnings (company_id, year, amount ), use
 Separate tables: earnings_2004, earnings_2005, earnings_2006, etc. All
these tables are in BCNF, but:

querying across years is difficult

a new table needs to be created each year
 One table, but with a separate column for each year:
company_year (company_id, earnings_2004, earnings_2005, earnings_2006)
It’s also in BCNF, but also makes querying across years difficult and requires new
attribute each year.

Is an example of a crosstab, where values for one attribute become
column names

Used in spreadsheets, and other data analysis tools
Database System Concepts - 6th Edition
8.83
©Silberschatz, Korth and Sudarshan
SKIP
8.9 Modeling Temporal Data
Database System Concepts - 6th Edition
8.84
©Silberschatz, Korth and Sudarshan
Homework for Ch.8
 8.4, 8.5
 8.6 (Derive only 6 new FDs, using the
closure algorithm in Fig. 8.7)
 8.26
 8.29 except (d) – 3NF decomposition
Database System Concepts - 6th Edition
8.85
©Silberschatz, Korth and Sudarshan
The next slides are a collection of the
algorithms we need to know from this chapter
Database System Concepts - 6th Edition
8.86
©Silberschatz, Korth and Sudarshan
Algorithm for F+
Assign F+ = F
repeat
for each FD f in F+
apply reflexivity and augmentation rules on f
add the resulting FDs to F +
for each pair of FDs f1and f2 in F +
if f1 and f2 can be combined using transitivity,
add the resulting FD to F +
until F + does not change any further
Database System Concepts - 6th Edition
8.87
©Silberschatz, Korth and Sudarshan
Algorithm for +
result := ;
while (changes to result) do
for each    in F do
begin
if   result then result := result  
end
Database System Concepts - 6th Edition
8.88
©Silberschatz, Korth and Sudarshan
Decomposing a Schema into BCNF
Suppose we have a schema R and a non-trivial dependency
 causes a violation of BCNF.
We decompose R into:
(U  )
(R-(-))
•
•
Database System Concepts - 6th Edition
8.89
©Silberschatz, Korth and Sudarshan