Projected slides on SC and linearizability

Sequential Consistency and Linearizability (Or, Reasoning About Concurrent Objects)
Acknowledgement: Slides partially adopted from the companion slides for the book "The Art of Multiprocessor Programming" by Maurice Herlihy and Nir Shavit

What We'll Cover Today
Chapter 3 of:
Digital copy can be obtained via WUSTL library: http://catalog.wustl.edu/search/

Concurrent Computation
[Figure labels: memory, object, object.]

Objectivism
What does it mean for a concurrent object to be correct?
•  How do we specify its behavior?
•  How do we implement one?
•  How do we tell if the implementation is correct?
Both sequential consistency and linearizability are correctness conditions for concurrent objects.

Sequential Consistency [1]
•  Typically used as a memory consistency model, which specifies in what order the memory operations may appear to execute.
•  You can think of a memory location as a concurrent object.

Sequential Consistency [1]
"[T]he result of any execution is the same as if the operations of all the processors* were executed in some sequential order, and the operations of each individual processor appear in this sequence in the order specified by its program." (Leslie Lamport, 1979)
•  The instruction sequences of concurrently executing threads are interleaved to form a global linear order of all instructions.
•  The sequence of instructions from a given thread is defined by the thread's program.
•  Implicit: each memory location is coherent according to the global linear order.
* I will be using processor and thread interchangeably.

What SC Buys Us
r1, r2: registers.
Relating to the dag:
[Figure: dag. Node 1 (x ← 0, y ← 0) forks into two branches: node 2 (x ← 1) followed by node 3 (r1 ← y), and node 4 (y ← 1) followed by node 5 (r2 ← x). Both branches join at node 6 (r1 = ??, r2 = ??).]
Under SC, the final result we get from executing this dag is as if the execution followed some topological sort of the dag.

r1   r2   possible?      example order
1    1    yes            (2,4,3,5)
0    1    yes            (2,3,4,5)
1    0    yes            (4,5,2,3)
0    0    not under SC

SC precludes executions that don't conform to program order.

Peterson's Algorithm
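public void lock() {
    // i: id of the calling thread (0 or 1); j: id of the other thread
    flag[i] = true;                      // I am interested
    victim = i;                          // you go first
    while (flag[j] && victim == i) {}    // spin while the other thread has priority
}
public void unlock() {
    flag[i] = false;                     // I am no longer interested
}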
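
The lock above relies on sequentially consistent reads and writes of flag and victim. As a minimal sketch of how that assumption could be met in Java, the version below stores both in atomic variables, whose get() and set() have volatile semantics. The class names PetersonLock and PetersonDemo, the explicit thread-id parameter, and the shared counter are illustrative assumptions, not part of the slides.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical driver: two threads increment a plain counter under the lock. */
final class PetersonDemo {
    static int counter = 0; // deliberately a plain field, protected only by the lock

    static int run(final int itersPerThread) {
        counter = 0;
        final PetersonLock lock = new PetersonLock();
        Thread[] threads = new Thread[2];
        for (int id = 0; id < 2; id++) {
            final int me = id;
            threads[id] = new Thread(() -> {
                for (int k = 0; k < itersPerThread; k++) {
                    lock.lock(me);
                    try { counter++; } finally { lock.unlock(me); }
                }
            });
            threads[id].start();
        }
        try {
            for (Thread t : threads) t.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return counter;
    }

    public static void main(String[] args) {
        System.out.println(run(100_000));
    }
}

/** Two-thread Peterson lock; thread ids must be 0 and 1 (an assumption of this sketch). */
final class PetersonLock {
    // get()/set() on these atomics behave like volatile accesses, so the
    // Java memory model does not reorder them the way it may reorder plain fields.
    private final AtomicBoolean[] flag = { new AtomicBoolean(), new AtomicBoolean() };
    private final AtomicInteger victim = new AtomicInteger();

    void lock(int i) {
        int j = 1 - i;
        flag[i].set(true);   // I am interested
        victim.set(i);       // you go first
        while (flag[j].get() && victim.get() == i) {
            Thread.yield();  // spin until the other thread leaves or defers
        }
    }

    void unlock(int i) {
        flag[i].set(false);  // I am no longer interested
    }
}
```

With plain (non-volatile) fields in place of the atomics, the compiler or hardware would be free to reorder the accesses, and the mutual exclusion argument would no longer apply.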
Crux of Peterson Proof Needs SC
(1) writeB(flag[B]=true) → writeB(victim=B)
(3) writeB(victim=B) → writeA(victim=A)
(2) writeA(victim=A) → readA(flag[B]) → readA(victim)
A read flag[B] == true and victim == A, so it could not have entered the CS (QED)
The proof assumes execution follows program order, and that each memory location is coherent.

Linearizability [2]
•  Originally developed to reason about behaviors of concurrent objects.
•  Stronger than sequential consistency (we will say more about that later).

Example: Circular FIFO Queue
public class FIFOQueue<T> {
    int head = 0, tail = 0;
    T[] items = (T[]) new Object[capacity];
    public void enq(T x) {
        if (tail - head == capacity)
            throw new FullException();
        items[tail % capacity] = x;
        tail++;
    }
    public T deq() {
        if (tail == head)
            throw new EmptyException();
        T item = items[head % capacity];
        head++;
        return item;
    }
}
[Figure: circular array with slots 0, 1, 2, ..., capacity-1, head and tail pointers, and stored items y and z.]

Now Consider the Following Scenario
•  The FIFO queue is accessed by two threads concurrently.
•  But,
   –  One thread calls enq only
   –  The other calls deq only

Wait-free 2-Thread Queue
[Figure: circular array with slots 0 to 7, head and tail pointers, holding x and y. One thread calls deq() while the other calls enq(z).]

Wait-free 2-Thread Queue
[Figure: same array. deq() reads result = x while enq(z) writes queue[tail] = z.]

Wait-free 2-Thread Queue
[Figure: same array, now holding y and z. deq() performs head++ and enq(z) updates tail.]

Wait-Free Concurrent 2-Thread Queue
(This slide repeats the FIFOQueue code from "Example: Circular FIFO Queue", next to the circular-array figure with head and tail pointers.)
Claim: the same implementation can be
used as a wait-free two-thread queue, and the queue will behave correctly.

What is a Concurrent Queue?
•  Need a way to specify a concurrent queue object
•  Need a way to prove that an algorithm implements the object's specification
•  Let's talk about object specifications …

Sequential Objects
•  Each object has a state
   –  Usually given by a set of fields
   –  Queue example: sequence of items
•  Each object has a set of methods
   –  Only way to manipulate state
   –  Queue example: enq and deq methods

Sequential Specifications
•  If (precondition)
   –  the object is in such-and-such a state
   –  before you call the method,
•  Then (postcondition)
   –  the method will return a particular value
   –  or throw a particular exception.
•  and (postcondition, cont'd)
   –  the object will be in some other state
   –  when the method returns.

Pre and PostConditions for Deq
•  Precondition:
   –  Queue is non-empty
•  Postcondition:
   –  Returns first item in queue
•  Postcondition:
   –  Removes first item in queue

Pre and PostConditions for Deq
•  Precondition:
   –  Queue is empty
•  Postcondition:
   –  Throws Empty exception
•  Postcondition:
   –  Queue state unchanged

Why Sequential Specifications Totally Rock
•  Interactions among methods captured by side-effects on object state
   –  State meaningful between method calls
•  Documentation size linear in number of methods
   –  Each method described in isolation
•  Can add new methods
   –  Without changing descriptions of old methods

Sequential vs Concurrent
•  Sequential
   –  Methods take time? Who knew?
   –  Objects need meaningful states only between method calls.
•  Concurrent
   –  Method call is not an event
   –  Method call is an interval.
   –  Because method calls overlap, object might never be between method calls.

Sequential vs Concurrent
•  Sequential:
   –  Each method can be described in isolation.
   –  Can add new methods without affecting older methods
•  Concurrent:
   –  Must characterize all possible interactions with concurrent calls
      •  What if two enq() calls overlap?
      •  Two deq() calls? enq() and deq()? …
   –  Everything can potentially interact with everything else

The Big Question
•  What does it mean for a concurrent object to be correct?
   –  What is a concurrent FIFO queue?
   –  FIFO means strict temporal order
   –  Concurrent means ambiguous temporal order

Intuitively…
public T deq() throws EmptyException {
    lock.lock();
    try {
        if (tail == head)
            throw new EmptyException();
        T x = items[head % items.length];
        head++;
        return x;
    } finally {
        lock.unlock();    // all queue modifications are mutually exclusive
    }
}
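
The slide shows only deq(). A minimal sketch of the whole lock-based queue, assuming enq() is protected by the same lock, could look as follows. The class name LockBasedQueue, the use of ReentrantLock, and the two unchecked exception classes are assumptions made so that the example is self-contained.

```java
import java.util.concurrent.locks.ReentrantLock;

/** Bounded FIFO queue in which every method body is one critical section. */
final class LockBasedQueue<T> {
    private final ReentrantLock lock = new ReentrantLock();
    private final T[] items;
    private int head = 0, tail = 0;

    @SuppressWarnings("unchecked")
    LockBasedQueue(int capacity) {
        items = (T[]) new Object[capacity];
    }

    void enq(T x) {
        lock.lock();
        try {
            if (tail - head == items.length)
                throw new FullException();
            items[tail % items.length] = x;
            tail++;
        } finally {
            lock.unlock();
        }
    }

    T deq() {
        lock.lock();
        try {
            if (tail == head)
                throw new EmptyException();
            T x = items[head % items.length];
            head++;
            return x;
        } finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) {
        LockBasedQueue<String> q = new LockBasedQueue<>(2);
        q.enq("x");
        q.enq("y");
        System.out.println(q.deq() + " " + q.deq());
    }
}

/** Hypothetical exception types; the slides use these names without defining them. */
final class FullException extends RuntimeException {}
final class EmptyException extends RuntimeException {}
```

Because each call holds the lock for its whole body, calls take effect one at a time, in the order in which the lock is acquired.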
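Intuitively
Let's capture the idea of describing the concurrent via the sequential.
[Figure: timeline of two threads calling q.deq and q.enq. Each call is bracketed by lock() and unlock(), so the enq and deq critical sections follow one another in time. Behavior is "sequential".]
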
Linearizability [2]
•  Each method should
   –  "take effect" instantaneously (linearization point)
   –  between invocation (making a method call) and response (returning from a method call) events
•  If we can look at the trace from a parallel execution and pick a linearization point for every method invocation such that the history "makes sense" (i.e., matches sequential behavior), then the history is linearizable.
•  A linearizable object: one all of whose possible executions are linearizable.

Example (three slides)
[Figure: timeline with calls q.enq(x), q.enq(y), q.deq(y), q.deq(x).]

Example
[Figure: timeline with calls q.enq(x), q.deq(y), q.enq(y).]

Example (two slides)
[Figure: timeline with calls q.enq(x), q.deq(y), q.enq(y), q.deq(x).]

Read/Write Register Example (two slides)
[Figure: timeline with calls write(0), read(1), write(1), write(2), read(0). Callout: write(1) already happened.]

Read/Write Register Example (two slides)
[Figure: timeline with calls write(0), read(1), write(1), write(2), read(1). Callout: write(1) already happened.]

Read/Write Register Example (two slides)
[Figure: timeline with calls write(0), write(2), write(1), read(1).]

Talking About Executions
•  Why?
   –  Can't we specify the linearization point of each operation without describing an execution?
•  Not always
   –  In some cases, the linearization point depends on the execution

Linearizability, Formally
Split method calls into two events:
•  Invocation
   –  method name & args
   –  q.enq(x)
•  Response
   –  result or exception
   –  q.enq(x) returns void
   –  q.deq() returns x
   –  q.deq() throws empty

Invocation Notation
A q.enq(x)
(A: thread, q: object, enq: method, x: arguments)

Response Notation
A q: void
(A: thread, q: object, void: result)

Response Notation
A q: empty()
(A: thread, q: object, empty(): exception)

Definition
•  Invocation & response match if thread names agree and object names agree
   A q.enq(3)
   A q:void
   A matching invocation and response form a method call.

History: Describing an Execution
H = A q.enq(3)
    A q:void
    A q.enq(5)
    B p.enq(4)
    B p:void
    B q.deq()
    B q:3
A history is a sequence of invocations and responses.

Object Projection
H|q = A q.enq(3)
      A q:void
      A q.enq(5)
      B q.deq()
      B q:3
Picking out entries performed on the target object.

Thread Projection
H|B = B p.enq(4)
      B p:void
      B q.deq()
      B q:3
Picking out entries performed by a given thread.

History: Describing an Execution
An invocation is pending if it has no matching response. In H above, A q.enq(5) is pending.
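
The definitions of history, object projection, thread projection, and pending invocation can be sketched in code. The Event and Histories classes below are hypothetical helpers written for this illustration; the data is the seven-event history H from the slides.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class Histories {
    /** H|object: the events performed on the given object. */
    static List<Event> byObject(List<Event> h, String object) {
        List<Event> out = new ArrayList<>();
        for (Event e : h) if (e.object.equals(object)) out.add(e);
        return out;
    }

    /** H|thread: the events performed by the given thread. */
    static List<Event> byThread(List<Event> h, String thread) {
        List<Event> out = new ArrayList<>();
        for (Event e : h) if (e.thread.equals(thread)) out.add(e);
        return out;
    }

    /** Invocations whose thread has no following response on the same object. */
    static List<Event> pending(List<Event> h) {
        List<Event> out = new ArrayList<>();
        for (int i = 0; i < h.size(); i++) {
            Event e = h.get(i);
            if (!e.invocation) continue;
            boolean matched = false;
            for (int k = i + 1; k < h.size(); k++) {
                Event f = h.get(k);
                if (f.thread.equals(e.thread)) {   // next event by the same thread
                    matched = !f.invocation && f.object.equals(e.object);
                    break;
                }
            }
            if (!matched) out.add(e);
        }
        return out;
    }

    /** The history H shown on the slides. */
    static List<Event> slideHistory() {
        return Arrays.asList(
            Event.inv("A", "q", "enq(3)"), Event.res("A", "q", "void"),
            Event.inv("A", "q", "enq(5)"),
            Event.inv("B", "p", "enq(4)"), Event.res("B", "p", "void"),
            Event.inv("B", "q", "deq()"), Event.res("B", "q", "3"));
    }

    public static void main(String[] args) {
        List<Event> h = slideHistory();
        System.out.println("H|q = " + byObject(h, "q"));
        System.out.println("H|B = " + byThread(h, "B"));
        System.out.println("pending = " + pending(h));
    }
}

/** One event: an invocation such as "A q.enq(3)" or a response such as "A q:void". */
final class Event {
    final String thread, object, label;
    final boolean invocation;

    private Event(String thread, String object, String label, boolean invocation) {
        this.thread = thread;
        this.object = object;
        this.label = label;
        this.invocation = invocation;
    }

    static Event inv(String thread, String object, String call) {
        return new Event(thread, object, call, true);
    }

    static Event res(String thread, String object, String result) {
        return new Event(thread, object, result, false);
    }

    @Override public String toString() {
        return thread + " " + object + (invocation ? "." : ":") + label;
    }
}
```

The matching rule in pending() assumes a well-formed history, where the event that follows a thread's invocation in that thread's projection is its response.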
Well-Formed Histories
H = A q.enq(3)
    A q:void
    B p.enq(4)
    A q.enq(5)
    B p:void
    B q.deq()
    B q:3

H|A = A q.enq(3)
      A q:void
      A q.enq(5)

H|B = B p.enq(4)
      B p:void
      B q.deq()
      B q:3

Each per-thread projection consists of matching invocations and responses, except possibly for a final pending invocation.

Equivalent Histories
H = A q.enq(3)
    A q:void
    A q.enq(5)
    B p.enq(4)
    B p:void
    B q.deq()
    B q:3

G = A q.enq(3)
    B p.enq(4)
    A q:void
    B p:void
    B q.deq()
    B q:3
    A q.enq(5)

Threads see the same thing in both:
H|A = G|A
H|B = G|B
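
As a small sketch of this definition, the check below treats two histories as equivalent when every thread's projection is identical in both. Events are encoded as plain strings whose first token is the thread name; that encoding and the class name EquivalentHistories are assumptions of the example, while H and G are the histories from the slide.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

final class EquivalentHistories {
    static final List<String> H = Arrays.asList(
        "A q.enq(3)", "A q:void", "A q.enq(5)",
        "B p.enq(4)", "B p:void", "B q.deq()", "B q:3");

    static final List<String> G = Arrays.asList(
        "A q.enq(3)", "B p.enq(4)", "A q:void", "B p:void",
        "B q.deq()", "B q:3", "A q.enq(5)");

    /** history|thread: the events whose first token is the thread name. */
    static List<String> project(List<String> history, String thread) {
        List<String> out = new ArrayList<>();
        for (String e : history) if (e.startsWith(thread + " ")) out.add(e);
        return out;
    }

    /** True if h|t equals g|t for every thread t that appears in either history. */
    static boolean equivalent(List<String> h, List<String> g) {
        Set<String> threads = new TreeSet<>();
        for (String e : h) threads.add(e.substring(0, e.indexOf(' ')));
        for (String e : g) threads.add(e.substring(0, e.indexOf(' ')));
        for (String t : threads)
            if (!project(h, t).equals(project(g, t))) return false;
        return true;
    }

    public static void main(String[] args) {
        System.out.println("H|A = " + project(H, "A"));
        System.out.println("G|A = " + project(G, "A"));
        System.out.println("equivalent: " + equivalent(H, G));
    }
}
```

Equivalence ignores how events of different threads are interleaved; it only compares what each thread observes.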
Sequential Specifications
•  A sequential specification is some way of telling whether a single-thread, single-object history is legal
•  For example:
   –  Pre- and post-conditions
   –  But plenty of other techniques exist …

Legal Histories
•  A sequential (multi-object) history H is legal if
   –  For every object x
   –  H|x follows the sequential specification for x

Precedence Ordering
A q.enq(3)
B p.enq(4)
B p:void
A q:void
B q.deq()
B q:3
A method call precedes another if its response event precedes the other's invocation event. Here A's q.enq(3) precedes B's q.deq().

A q.enq(3)
B p.enq(4)
B p:void
B q.deq()
A q:void
B q:3
Otherwise they overlap. Here A's q.enq(3) overlaps B's q.deq().

Notation
•  Given
   –  History H
   –  method executions m0 and m1 in H
•  We say m0 →H m1 if
   –  m0 precedes m1 in H
•  Relation m0 →H m1 is a
   –  Partial order
   –  Total order if H is sequential
[Figure: intervals m0 and m1 on a timeline.]

Linearizability [2]
•  History H is linearizable if it can be extended to G by
   –  Appending zero or more responses to pending invocations
   –  Discarding other pending invocations
•  So that G is equivalent to
   –  a legal sequential history S
   –  where →G ⊂ →S
Annotations on the three parts of the definition:
•  Extending H to G: fix the invocations that already took effect and discard the rest that haven't.
•  Equivalent to S: captures the fact that the input/output of the methods must be the same as in a sequential execution.
•  →G ⊂ →S: the total order in S must respect the partial order ("real-time" ordering) in the original history.

Ensuring →G ⊂ →S
→G = {a→c, b→c}
→S = {a→b, a→c, b→c}
[Figure: timeline in which a and b overlap and both precede c.]
Example
H = A q.enq(3)
    B q.enq(4)
    B q:void
    B q.deq()
    B q:4
    B q.enq(6)
[Figure: timeline with calls A q.enq(3), B q.enq(4), B q.deq(4), B q.enq(6).]

Complete this pending invocation (A q.enq(3)) by appending A q:void:
    A q.enq(3)
    B q.enq(4)
    B q:void
    B q.deq()
    B q:4
    B q.enq(6)
    A q:void

Discard this pending invocation (B q.enq(6)):
G = A q.enq(3)
    B q.enq(4)
    B q:void
    B q.deq()
    B q:4
    A q:void

Equivalent sequential history:
S = B q.enq(4)
    B q:void
    A q.enq(3)
    A q:void
    B q.deq()
    B q:4
[Figure: timeline with calls A q.enq(3), B q.enq(4), B q.deq(4).]
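
For small histories, the search for S can be done by brute force. The sketch below assumes the history has already been extended to G, with pending invocations completed or discarded by hand, and that each call is given by the positions of its invocation and response events in G. It models only enq and successful deq on a FIFO queue; the class and method names are illustrative.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.List;

final class LinearizabilityCheck {
    /** A completed call: enq(value), or deq() returning value, with event positions in G. */
    static final class Call {
        final boolean isEnq;
        final int value, inv, res;
        Call(boolean isEnq, int value, int inv, int res) {
            this.isEnq = isEnq;
            this.value = value;
            this.inv = inv;
            this.res = res;
        }
    }

    /** True if some legal sequential order of the calls respects real-time precedence. */
    static boolean linearizable(List<Call> calls) {
        return search(calls, new boolean[calls.size()], new ArrayDeque<Integer>(), 0);
    }

    private static boolean search(List<Call> calls, boolean[] used,
                                  ArrayDeque<Integer> queue, int placed) {
        if (placed == calls.size()) return true;
        for (int i = 0; i < calls.size(); i++) {
            if (used[i] || !minimal(calls, used, i)) continue;
            Call c = calls.get(i);
            ArrayDeque<Integer> next = new ArrayDeque<>(queue); // copy keeps backtracking simple
            if (c.isEnq) next.addLast(c.value);
            else if (next.isEmpty() || next.pollFirst() != c.value) continue; // illegal deq
            used[i] = true;
            boolean ok = search(calls, used, next, placed + 1);
            used[i] = false;
            if (ok) return true;
        }
        return false;
    }

    /** Call i may go next only if no unplaced call precedes it in real time. */
    private static boolean minimal(List<Call> calls, boolean[] used, int i) {
        for (int k = 0; k < calls.size(); k++)
            if (k != i && !used[k] && calls.get(k).res < calls.get(i).inv) return false;
        return true;
    }

    /** G from the slide: A's enq(3) spans the whole history, B's calls run in sequence. */
    static List<Call> slideExample() {
        return Arrays.asList(
            new Call(true, 3, 0, 6),    // A q.enq(3), response appended at the end
            new Call(true, 4, 1, 2),    // B q.enq(4)
            new Call(false, 4, 3, 4));  // B q.deq() returning 4
    }

    public static void main(String[] args) {
        System.out.println(linearizable(slideExample()));
    }
}
```

The search is exponential in the number of calls, so it is only meant for hand-sized histories such as the one on the slide.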
Composability Theorem
•  History H is linearizable if and only if
   –  For every object x
   –  H|x is linearizable
•  Why Does Composability Matter?
   –  Modularity
   –  Can prove linearizability of objects in isolation
   –  Can compose independently-implemented objects

Reasoning About Linearizability: Wait-Free Concurrent 2-Thread Queue
public class FIFOQueue<T> {
    int head = 0, tail = 0;
    T[] items = (T[]) new Object[capacity];
    public void enq(T x) {
        if (tail - head == capacity)
            throw new FullException();
        items[tail % capacity] = x;
        tail++;
    }
    public T deq() {
        if (tail == head)
            throw new EmptyException();
        T item = items[head % capacity];
        head++;
        return item;
    }
}
Slide callout next to the end of deq(): Linearization point.

General Strategy
•  Identify one atomic step where method "happens"
   –  Critical section
   –  Machine instruction
•  Not necessarily a single linearization point
   –  Might need to define several different ones for a given method

Linearizability: Summary
•  Powerful specification tool for shared objects
•  Allows us to capture the notion of objects being "atomic"
•  Don't leave home without it

Alternative: Sequential Consistency
•  History H is sequentially consistent if it can be extended to G by
   –  Appending zero or more responses to pending invocations
   –  Discarding other pending invocations
•  So that G is equivalent to
   –  a legal sequential history S
   –  (the linearizability condition →G ⊂ →S is dropped)
Allows reordering operations by different threads to establish a global sequential order.
G is equivalent to S, which implies that S preserves program order within a thread.

SC is Weaker than Linearizability (two slides)
[Figure: timeline with calls q.enq(x), q.deq(y), q.enq(y).]

Theorem
Sequential Consistency is not composable

FIFO Queue Example
[Figure, repeated on each of the following slides: timeline with calls p.enq(x), q.enq(x), q.enq(y), p.deq(y), p.enq(y), q.deq(x).]
•  H|p Sequentially Consistent
•  H|q Sequentially Consistent
•  Ordering imposed by p
•  Ordering imposed by q
•  Ordering imposed by both
•  Combining orders
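
The theorem can be checked by brute force on this example. The extracted figure does not show which thread performs which call, so the sketch below assumes that thread A performs p.enq(x), q.enq(x), p.deq(y) in that order and that thread B performs q.enq(y), p.enq(y), q.deq(x). Under that assumption it enumerates every interleaving that preserves both program orders and tests whether any of them is a legal FIFO history. All class and method names are illustrative.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class ScComposability {
    /** One completed call on queue obj: enq(val), or deq() returning val. */
    static final class Op {
        final String obj, val;
        final boolean isEnq;
        Op(String obj, boolean isEnq, String val) {
            this.obj = obj;
            this.isEnq = isEnq;
            this.val = val;
        }
    }

    static Op enq(String obj, String v) { return new Op(obj, true, v); }
    static Op deq(String obj, String v) { return new Op(obj, false, v); }

    // Assumed assignment of the six calls to two threads, in program order.
    static final List<Op> A = Arrays.asList(enq("p", "x"), enq("q", "x"), deq("p", "y"));
    static final List<Op> B = Arrays.asList(enq("q", "y"), enq("p", "y"), deq("q", "x"));

    /** The calls of one thread that touch the given object. */
    static List<Op> project(List<Op> ops, String obj) {
        List<Op> out = new ArrayList<>();
        for (Op op : ops) if (op.obj.equals(obj)) out.add(op);
        return out;
    }

    /** True if some interleaving that keeps both program orders is legal. */
    static boolean sequentiallyConsistent(List<Op> a, List<Op> b) {
        return search(a, 0, b, 0, new ArrayList<Op>());
    }

    private static boolean search(List<Op> a, int i, List<Op> b, int j, List<Op> order) {
        if (i == a.size() && j == b.size()) return legal(order);
        if (i < a.size()) {
            order.add(a.get(i));
            boolean ok = search(a, i + 1, b, j, order);
            order.remove(order.size() - 1);
            if (ok) return true;
        }
        if (j < b.size()) {
            order.add(b.get(j));
            boolean ok = search(a, i, b, j + 1, order);
            order.remove(order.size() - 1);
            if (ok) return true;
        }
        return false;
    }

    /** Replays the order against one FIFO queue per object. */
    private static boolean legal(List<Op> order) {
        Map<String, ArrayDeque<String>> queues = new HashMap<>();
        for (Op op : order) {
            ArrayDeque<String> q = queues.computeIfAbsent(op.obj, k -> new ArrayDeque<>());
            if (op.isEnq) q.addLast(op.val);
            else if (!op.val.equals(q.pollFirst())) return false; // empty or wrong item
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println("H|p SC: " + sequentiallyConsistent(project(A, "p"), project(B, "p")));
        System.out.println("H|q SC: " + sequentiallyConsistent(project(A, "q"), project(B, "q")));
        System.out.println("H   SC: " + sequentiallyConsistent(A, B));
    }
}
```

Under the assumed assignment, each projection has a legal interleaving, but the two orderings cannot be combined: p needs p.enq(y) before p.enq(x), q needs q.enq(x) before q.enq(y), and the program orders close the cycle.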

Hardware Memory Consistency
•  No modern-day processor implements sequential consistency.
•  Hardware actively reorders instructions.
•  Compilers may reorder instructions, too.
•  Why?
•  Because most of the performance is derived from a single thread's unsynchronized execution of code.

Memory Barriers (Fences)
•  A memory barrier (or memory fence) is a hardware action that enforces an ordering constraint between the instructions before and after the fence.
•  A memory barrier can be issued explicitly as an instruction (e.g., in x86: mfence)
•  The typical cost of a memory fence is comparable to that of an L2-cache access.

Memory Consistency in High-Level Language
•  In Java, can ask compiler to keep a variable up-to-date by declaring it volatile:
   –  Adds a memory barrier after each store
   –  Inhibits reordering, removing from loops, & other "compiler optimizations"
•  The C++11 standard offers atomic variables that come with a set of different memory ordering constraints.

Summary: Real-World
•  Hardware weaker than sequential consistency
•  Can get sequential consistency at a price
•  Linearizability better fit for high-level software

References
[1] L. Lamport. How to make a multiprocessor computer that correctly executes multiprocess programs. IEEE Transactions on Computers, C-28(9):690, September 1979.
[2] M. Herlihy, J. M. Wing. Linearizability: a correctness condition for concurrent objects. ACM Transactions on Programming Languages and Systems (TOPLAS), 12(3):463–492, 1990.

This work is licensed under a Creative Commons Attribution-ShareAlike 2.5 License.
•  You are free:
–  to Share — to copy, distribute and transmit the work
–  to Remix — to adapt the work
•  Under the following conditions:
–  Attribution. You must attribute the work to “The Art of
Multiprocessor Programming” (but not in any way that suggests that
the authors endorse you or your use of the work).
–  Share Alike. If you alter, transform, or build upon this work, you may
distribute the resulting work only under the same, similar or a
compatible license.
•  For any reuse or distribution, you must make clear to others the license
terms of this work. The best way to do this is with a link to
–  http://creativecommons.org/licenses/by-sa/3.0/.
•  Any of the above conditions can be waived if you get permission from
the copyright holder.
•  Nothing in this license impairs or restricts the author's moral rights.