Slides for EDBT/ICDT keynote on Joins and Convex Geometry

Links between Convex Geometry
and Join Processing
Christopher Ré
Stanford University
“Query processing is not rocket science…
When you flunk out of query processing, we
make you go build rockets.”
– Anonymous (J. Hamilton or D. DeWitt)
Warning:
This is (mostly) a theory talk…
… but we (and others) are building
database engines with these ideas.
Motivation: Joins!
Databases are about three things:
Worst-case Efficiency,
Parallel Efficiency, and
Beyond Worst-case
Efficiency
Joins Since System R
R(A,B), S(A,C), T(B,C)
Join(R,S,T) = { (a,b,c) : (a,b) in R, (a,c) in S, (b,c) in T}
Join(R,S,T) = Join(Join(R,S),T)
System R searches through pairwise joins
For 40+ years, major commercial database
use System-R style optimizer.
Nugget: “DBs have been asymptotically
suboptimal for the last 4 decades…”
6
Example Queries
Data: R(A,B), S(A,C),
and T(B,C)
Today: graph where
edges colored R,S,T.
R
v
Q1 = Join(R,S)
S
T
Nodes are
data values.
“triples of nodes on a path of
length 2 that goes via R then S”
Q2 = Join(R,S,T) “triples of nodes that form R-S-T
triangles”
Background: Triangles
[Alon 80, Loomis-Whitney 49]
Data: R(A,B), S(A,C), and T(B,C))
Let Q be Join(R,S,T) = “R-S-T triangles”
If R,S,T contain ≤ N tuples,
how big can |Q| be?
R(A,B), S(A,C)
|Join(R,S)| ≤ N2
R covers B,
S covers C
T(B,C))
|Join(R,S,T)| ≤ N2
Correct asymptotic:|Q| in Q( N3/2 )
Can we compute Q in time O(N3/2)?
8
Pairwise Joins are Suboptimal
R(A,B), S(B,C),T(A,C)
R=[N] x {1}
S ={1} x [N]
T = {1} x [N]
N
1
Data in
R and S
Data
N
|R|=|S|=|T|=N
[N]={1,…,N}
JOIN(R,S) = [N]x {1} x [N]
|Join(R,S)| = N2
DB is toast!
Panic! A simple modification:
any pairwise join plan takes W(N2)
9
Heavy
versus Light
Relax! [Itai & Rodeh 78, Alon et. al 97]
It’s cute, let’s see it.
“The heavy-light technique”
10
Sketch: Heavy v. Light Nodes
Goal: Time O(N3/2) – ignoring log factors.
Call a node heavy if it has more than N1/2 neighbors.
Let H be set of heavy nodes.
(case 1) If v in H, check whether each edge e in E
forms a triangle with v.
v
e
x
y
v1
v2
..
…
2 probes:
each O(1) time.
N Edges
Case I: In total most 2 N|H| probes
Since |H| ≤ 2N1/2 then total time O(N3/2)
11
Case 2.
(case 2) If v not in H, for each pair of edges check.
v
x
y
v1
v2
..
…
N Edges
Case II: Each light node explores d(v)2 where d(v) is
the degree of node v.
å
vÎV -H
d(v)d(v) £ N
å
d(v) £ 2N
3/2
vÎV-H
Union is linear, so we’re done.
12
How do we generalize to joins?
13
Fractional Hypergraph Covers
Given a hypergraph H=(V,E) a fractional edge
cover is x : E R such that x ≥ 0 and
for each v in V we have Se : v in e x(e) ≥ 1
Ex: R(A,B),S(B,C),T(A,C).
x(R,S,T)=(1,0,1) … or…
x(R,S,T)=(0.5, 0.5, 0.5)
x(R) + x(T) ≥ 1 // cover for A
x(R) + x(S) ≥ 1 // cover for B
x(S) + x(T) ≥ 1 // cover for C
We think of a query as hypergraph to cover.
R
S
T
14
Size bounds [GM05, AGM08]
Fix a query Q=(V,E).
Let N be a tuple of |E| positive integers.
Define S(Q, N) be the maximum size
of Q subject to|Re|≤ Ne
Thm [Atserias, Grohe, Marx FOCS08]: Given
any hypergraph cover x for (V,E) then
S(Q,N) ≤ Pe in E |Re|x(e)
Triangle: |R|=|S|=|T| ≤ N,
x(R)=x(S)=x(T)=0.5 N1.5
15
One more example.
R(A,B,C),S(A,B,D),T(A,C,D),U(B,C,D)
x(R) = x(S) = x(T) = x(U) = 1/3
Output size is O(N4/3)
Known since
Loomis-Whitney (1940s Geometers!)
16
AGM’s result.
Atserias, Grohe, and Marx (AGM) allow one to
write a linear program that tightly bounds the
output size of any join query.
Proof using Han/Shearer’s lemma
(non constructive)
Open: Compute the output in
upper bound time?
We would call this worst-case optimal
17
Ngo, Porat, Ré, and Rudra (PODS 2012)
1st algorithm for joins with
optimal worst-case runtime
(experts: optimal data complexity)
We show AGM’s fractional cover inequality is
equivalent to the Bollabás-Thomason
inequality from geometry.
Algorithmic Idea: LP is a guide to decide
“heavy” v. “light” of previous example.
18
Implemented & described at ICDT14!
Todd Veidhulzen. “Leapfrog Triejoin: A Simple,
Worst-Case Optimal Join Algorithm”
[ICDT14, Best Newcomer Award!]
Tidbit: Faster on cyclic queries than other
commercial DB optimizers… without resorting
to specialized graph processing!
19
Much simpler proofs!
Even simpler
than heavy vs.
light!
Hung
Ngo
Atri
Rudra
Skew strikes back:
New Developments in the Theory of
Join Algorithms. (SIGMOD Record 13 )
20
3 Vignettes of Related Ideas
21
(1) Heavy
versus Light
Faster Detection: Alon-Yuster-Zwick, one can
check if a graph contains a 2k-length cycle in
O(N2-1/k) using heavy vs. light argument.
Systems: Heavy vs. light used in parallel
database systems (e.g., Teradata).
22
(2) Map Reduce Joins: Afrati & Ullman
EDBT 2010
Q1 = R(A,B),S(B,C),T(A,C)
Q2 = R(A,B),S(B,C)
Mappers send data to
B
reducer via hash function.
A
Goal: Given p reducers, minimize communication
by picking “how large” each attribute’s share is.
Afrati & Ullman. Solve Constrained Mathematical Program.
Lower bound portion uses Covers!
Optimal: Recast as a (fractional) cover problem!
[Koutris, Suciu, and Beame PODS14]
23
(3) Tighter Runtime Guarantees
Measure: longest
runtime in each strata
Databases
with N tuples.
(Traditional Worst-case)
Databases
with N+1 tuples.
Yannakakis’s seminal algorithm has a
stronger guarantee for a-acyclic queries.
Output Size
Databases
with N tuples.
Databases
with N+1 tuples.
0.5N 0.2N
O( N + OUT )
General: O(N + Nw* + OUT)
where w* is the fractional
hypertreewidth
(NPRR+Y’s+Treewidth)
24
Begs a question: What is the
tightest guarantee that
one can hope for?
25
Pathology of Worst-case Analysis for Joins
Worst case: answer is huge
Usually, the answer is smaller than the database—not larger!
Worst case: one reads the entire input.
Rarely, if ever, does this happen on large databases…
Worst case: insensitive how data are stored.
Data often stored in an index or sorted, and this changes landscape
My not-so-secret goal:
Join theory even closer to practice.
26
We all want to go “beyond worst-case”
Tim Roughgarden
(Stanford) has great
notes on this.
One of the1st beyond worstcase analysis was by a DB
theoretician: Ron Fagin.
Instance Optimality
27
Measuring complexity
Databases
with N tuples.
Databases
with N+1 tuples.
Worst-case analysis
(Traditional CS)
Let T(A,D) be # of steps that
algorithm A takes on database D
Let W(A,N) = supD T(A,D) s.t. D has N tuples
Measure W(A,N) growth with N, asymptotically.
28
Notions of complexity
Databases
with N tuples.
Databases
with N+1 tuples.
Instance
Optimality
Let T(A,D) be # of steps that
algorithm A takes on database D
Algorithm Opt is instance optimal if there exists
constant c such that T(Opt,D) ≤ c T(A,D)
for A in a class of algorithms and any D.
Essentially singleton boxes, much stronger…
29
So how do we pick a class of algorithms?
30
What do join algorithms do?
Famous algorithms: Hash, Sort-merge,
index-nested, block-nested loop, Grace,
PRISM, double pipelined.…
Observation: Algorithms are generic.
Do not depend on data values
but may use data order & equality.
E.g., use an index to skip many consecutive values.
Call these comparison-based algorithms.
31
A Nugget: “A little sorting changes
the complexity landscape a lot.”
32
Warm up: Intersection [Huang & Lin 1972]
Given two sorted lists R and S of length N
R[1] < … < R[N] and S[1] < … < S[N].
Suppose R[j] = 2j and S[j] = 2j+1 for j=1…N
At position i, no idea what comes next.
Ping-pong back and forth—W(N) time
33
Warm up: Intersection [Huang & Lin 1972]
Given two sorted lists R and S of length N
R[1] < … < R[N] and S[1] < … < S[N].
Suppose R[i] = i and S[i] = N/2 + i for i=1…N/2
Skip to R[N] in O(log N) time!
34
Message: Difference in the certification
Certify each
alternation
R[N] < S[1]
is enough
running time is Log(N) v. N
Same: Input and output size are the same!
Different: Work to certify output is empty
35
Goal: Algorithms that run in
time proportional to the size of
a smallest certificate.
36
Generalizing Certificates to Joins
Assume: a global attribute order A1…An.
All relations are stored consistently with this order
(… we can remove this …)
To define a certificate, need to describe:
1. How are data stored? (search trees), and
2. How are certificates encoded? (arguments)
37
Think of the data in a trie
A relation
R(A2,A4,A5)
A2
A4
A5
1
2
4
1
2
7
1
3
5
7
4
2
10
4
1
Index
indicates the
tuples order.
R[3
]
A2
R[1,2]
R[2,1]
A4
A5
R[1,1,2] R[1,2,1]
R[3,1,1
]
38
NB: search trees capture
hash tables, B+trees, tries,
up to a log N factor.
Compare elements in
this dictionary order…
39
Any algorithm must certify its output
An argument is a set of propositional
statements of the following forms.
1. R[i] < S[j]
2. R[i] = S[j]
3. R[i] = R[j]
Here i,j are tuples of indexes
as illustrated in previous slide.
Certificate is an argument, cert, such that any
instance that satisfies cert has the same output
(up to isomorphism).
40
Certificate Complexity
Goal: run in time O( (|cert| + Z) log N) where
cert denotes a smallest certificate,
Z is the size of the output, and
N is the size of the data. NB: Input Size under log
O hides constants depending on Q.
Runtimes of the above are essentially
instance optimal for comparison based.
Ron Fagin says “log-instance optimal”
41
Comments about Certificates
1. A comparison-based algorithm (all available
join algorithms) takes at least |cert| steps.
2. N ≥ |cert| where N is the input size (strictly finer
notion of complexity)
3. Certificates provide an instance-dependent
measure of complexity. (conditioning)
42
Minesweeper Algorithm (MS)
“Removing the haystack to find the needles”
43
Minesweeper: Key operation
Deduce: no output tuple can be in an interval
Consider: Q = R(A,B),S(A)
A
B
2
4
3
2
3
5
Index for R
in (A,B) order
([-∞, 1],*)
(=2,[-∞,3])
(=3, [3,4] ) No output tuple has
Big Conceptual Change:
Find best way to rule out all
tuples… not to find tuples.
A = 3 and B in [3,4]
A
4
Index for S
44
Picture the Output Space as a Grid
B
values
R(A,B),S(A)
([-∞,4], *)
([-∞, 1],*)
(=2,[-∞,4])
(=3, [3,4] )
A values
A
B
A
2
4
4
3
2
3
5
R(A,B)
S(A)
Goal: Run
proportional to
smallest cover.
45
The algorithm
Idea: Reuse information as much as possible.
1. Pick an uncovered point, t.
2. Find all possible ways to
cover t with “gaps”.
3. Insert gaps in to a data
structure.
Repeat until all points covered.
t
Hard part: Data
structure to find t,
efficiently.
46
Nugget: “The boundary for efficiency
has changed from the worst case.”
47
The Boundary
View a query as a hypergraph.
R(A,B),S(A,B,C),T(B,C,D)
Want: Acyclic-like properties but closed under edge
removal. a-acyclic does not have this property.
Turns out, b-acyclic [Fagin83] is the right notion
48
Certificate Dichotomy [PODS14]
Theorem [NNRR14, Certificate Dichotomy]
Given query Q
(1) if Q is b-acyclic, then there is some order of
attributes such that MS takes O(|cert| + Z) on all
instances.
(1) Assuming the 3SUM conjecture, for any bcyclic query there is some family of instances
where any algorithm runs in time W( |cert|4/3 + Z)
49
O hides log N factor
Further Results
1. No polynomial time bound in |cert| for aacyclic queries (Exponential time hypothesis)
a-acyclic is the worst-case boundary, this changes
complexity landscape.
2. Q, treewidth w, MS runs in O(|cert|w+1 + Z) time
3. Fractional results for triangle, O(|cert|3/2 + Z).
50
Some PRELIMINARY Empirical Results
7 graph datasets (100M+ edges)
Dung Nguyen, Hung Q. Ngo & LogicBlox.
Single threaded & multicore version inside LB.
(1) Real certificates can be 1000x smaller than N
(2) On small # of b-acyclic queries in LB,
up to 230x faster and at most 5% slower than LFTJ
Take away: minesweeper might be useful.
Real evaluation underway with LB’s help! 51
Future Directions in Joins
Compressed Representations.
 Column-type storage [Abadi VLDB05]
 Morphing and Bit Weaving [Patel et. al. VLDB06 VLDB2012].
 Factorised DBs using covers [Olteanu et al. VLDB2012-2013].
Compression & bit-level operations for modern hardware?
Deeper understanding of data properties.
 Bounded degree data [A. Durand & E. Grandjean ToCL97]
 Bounded expansion [Segoufin ICDT13, Kazana&Segoufin
PODS13]
 Bounded treewidth data [Gottlob et al. AAAI06, Aarnborg 85,
Grohe ICDT99]
Personally obsessed!
One join algorithm to rule them all? Stay tuned…
52
Convex Geometry and Joins
Tool: Fractional covers to bound volumes
1. worst-case optimal Join
algorithm via Fractional Covers
2. Other Uses of Fractional Covers
3. Beyond Worst-case for Joins
using Covers and Geometry.
Conclusion
Joins are awesome:
Theory and Practice can argue!
Geometry is key to these
new algorithms.
Beyond worst-case analysis may be
an opportunity for the DB community.
54