CPSC 668 Distributed Algorithms and Systems
Fall 2006
Prof. Jennifer Welch

CPSC 668 Set 19: Asynchronous Solvability

Problems Solvable in Failure-Prone Asynchronous Systems
• Although consensus is not solvable in failure-prone asynchronous systems (neither message passing nor read/write shared memory), there are some interesting problems that are solvable:
– set consensus (a weakening of consensus)
– approximate agreement (a weakening of consensus)
– renaming ("opposite" of consensus)
– k-exclusion (fault-tolerant variant of mutex)

Model Assumptions
• asynchronous
• shared memory with read/write registers
• at most f crash failures of procs
• results can be translated to message passing if f < n/2 (cf. Chapter 10)
• there may be a few asides into message passing

Set Consensus Motivation
• By judiciously weakening the definition of the consensus problem, we can overcome the asynchronous impossibility
• We've already seen a weakening of consensus:
– weaker termination condition for randomized algorithms
• How about weakening the agreement condition?
• One weakening is to allow more than one decision value:
– allow a set of decisions

Set Consensus Definition
Termination: Eventually, each nonfaulty processor decides.
k-Agreement: The number of different values decided on by nonfaulty processors is at most k.
Validity: Every nonfaulty processor decides on a value that is the input of some processor.

Set Consensus Algorithm
• Uses a shared atomic snapshot object X
– can be implemented with read/write registers
• update your segment of X with your input
• repeatedly scan X until there are at least n - f nonempty segments
• decide on the minimum value appearing in any segment

Correctness of Set Consensus Algorithm
• Termination: at most f procs crash, so at least n - f procs eventually update their segments, and every nonfaulty proc's scan eventually sees at least n - f nonempty segments.
• Validity: every decision is some proc's input
• Why does k-agreement hold?
– We'll show it does as long as k > f.
– Sanity check: When k = 1, we have standard consensus. As long as there is less than 1 failure (that is, none), we can solve the problem.

k-Set Agreement Condition
• Let S be the set of min values in the final scan of each nonfaulty proc; these are the nonfaulty decisions.
• Suppose in contradiction |S| > f + 1.
• Let v be the largest value in S, the decision of p_i.
• Each of the at least f + 1 smaller values in S is the input of a different proc, and none of them appears in p_i's final scan (otherwise p_i would have decided on a value smaller than v).
• So p_i's final scan misses at least f + 1 values and sees fewer than n - f nonempty segments, contradicting the code.

Set Consensus Lower Bound
Theorem: There is no algorithm for solving k-set consensus in the presence of f failures, if f ≥ k.
• Straightforward extensions of the consensus impossibility result fail; even proving the existence of an initial bivalent configuration is quite involved.
• The original proof of set-consensus impossibility used concepts from algebraic topology
• The textbook's proof uses more elementary machinery, but is still rather involved

Approximate Agreement Motivation
• An alternative way to weaken the agreement condition for consensus:
• Require that the decisions be close to each other, but not necessarily equal
• Seems appropriate for continuous-valued problems (as opposed to discrete)

Approximate Agreement Definition
Termination: Eventually, each nonfaulty processor decides.
ε-Agreement: All nonfaulty decisions are within ε of each other.
Validity: Every nonfaulty decision is within the range of the input values.
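
Before moving on to approximate agreement algorithms, the set consensus algorithm from the earlier slides can be made concrete with a small simulation. This is only a sketch under simplifying assumptions that are not in the slides: the atomic snapshot object is modeled as a Python list accessed one step at a time, the scheduler is a seeded random choice, and crashed procs are modeled as never taking any step. The names set_consensus, crashed, and seed are hypothetical.

```python
import random

def set_consensus(inputs, f, crashed=frozenset(), seed=0):
    """Simulate one execution; return {proc index: decision} for nonfaulty procs.

    Assumptions (not from the slides): the snapshot object is a plain list,
    steps are interleaved by a seeded random scheduler, and every crashed
    proc crashes before taking its first step.
    """
    rng = random.Random(seed)
    n = len(inputs)
    assert len(crashed) <= f < n
    X = [None] * n                      # one segment per proc; None means empty
    written = set()                     # procs that have done their update
    decisions = {}
    live = [i for i in range(n) if i not in crashed]
    while len(decisions) < len(live):
        i = rng.choice([p for p in live if p not in decisions])
        if i not in written:
            X[i] = inputs[i]            # update own segment with own input
            written.add(i)
        else:
            view = [v for v in X if v is not None]   # atomic scan
            if len(view) >= n - f:      # enough nonempty segments: decide
                decisions[i] = min(view)
    return decisions

if __name__ == "__main__":
    print(set_consensus([5, 3, 9, 1, 7], f=2, crashed=frozenset({3})))
```

With f = 2, any run of this sketch should produce at most f + 1 = 3 distinct decisions, each of which is some proc's input, matching the k-agreement argument for k > f.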

Approximate Agreement Algorithm
• Assume procs know the range from which input values are drawn:
– let D be the length of this range
• up to n - 1 procs can fail
• algorithm is structured as a series of "asynchronous rounds":
– exchange values via a snapshot object, one per round
– compute midpoint for next round
• continue until the spread of values is within ε, which requires about log2(D/ε) rounds

Approximate Agreement Algorithm
Initially local variable v = p_i's input
Initially local variable r = 1
1. update p_i's segment of ASO[r] to be v
2. let scan be the set of values obtained by scanning ASO[r]
3. v := midpoint(scan)
4. if r = ⌈log2(D/ε)⌉ + 1 then decide v and terminate
5. else r++ and go back to step 1

Analysis of Algorithm
Definitions w.r.t. a particular execution:
• M = ⌈log2(D/ε)⌉ + 1
• U_0 = set of input values
• U_r = set of all values ever written to ASO[r]

Helpful Lemma
Lemma (16.8): Consider any round r < M. Let u be the first value written to ASO[r]. Then all values written to ASO[r+1] (the elements of U_{r+1}) lie in the interval
[(min(U_r) + u)/2, (max(U_r) + u)/2],
where min(U_r) ≤ (min(U_r) + u)/2 ≤ u ≤ (max(U_r) + u)/2 ≤ max(U_r).

Implications of Lemma
• The range of values written to the ASO object for round r + 1 is contained within the range of values written to the ASO object for round r.
– range(U_{r+1}) ⊆ range(U_r)
• The spread (max - min) of values written to the ASO object for round r + 1 is at most half the spread of values written to the ASO object for round r.
– spread(U_{r+1}) ≤ spread(U_r)/2

Correctness of Algorithm
• Termination: Each proc executes M asynchronous rounds.
• Validity: The range at each round is contained in the range at the previous round.
• ε-Agreement: spread(U_M) ≤ spread(U_1)/2^(M-1) ≤ spread(U_0)/2^(M-1) ≤ D/2^(M-1) ≤ ε, since U_1 contains only input values and M - 1 ≥ log2(D/ε); every decision is a midpoint of values in U_M.

Handling Unknown Input Range
• Range might not be known.
• Actual range in an execution might be much smaller than the maximum possible range.
• First idea: have a preprocessing phase in which procs try to determine the input range
– but asynchrony and possible failures make this approach problematic

Handling Unknown Input Range
• Use just one atomic snapshot object
• Dynamically recalculate how many rounds are needed as more inputs are revealed
• Skip over rounds to try to catch up to the maximum observed round
• Only consider values associated with the maximum observed round
• Still use midpoint

Unknown Input Range Algorithm
shared atomic snapshot object A; initially all segments are ⊥

update_i(A, [x, 1, x]), where x is p_i's input
repeat
    scan A
    let S be the spread of all inputs in non-⊥ segments
    if S = 0 then maxRound := 0
    else maxRound := log2(S/ε)
    let rmax be the largest round in non-⊥ segments
    let values be the set of candidates in segments with round number rmax
    update p_i's segment in A with [x, rmax + 1, midpoint(values)]
until rmax ≥ maxRound
decide midpoint(values)

Analysis of Unknown Input Range Algorithm
Definitions w.r.t. a particular execution:
• U_0 = set of all input values
• U_r = set of all values ever written to A with round number r
• M = largest r such that U_r is not empty
With these changes, the correctness proof is similar to that for the known input range algorithm.

Key Differences in Proof
• Why does termination hold?
– a proc's local maxRound variable can only increase if another proc wakes up and increases the spread of the observable inputs. This can happen at most n - 1 times.
• Why does ε-agreement hold?
– If p_i's input is observed by p_j the last time p_j computes its maxRound, the same argument as before applies.
– Otherwise, when p_i wakes up, it ignores its own input and uses values from maxRound or later.

Renaming
• Procs start with unique names from a large domain
• Procs should pick new names that are still distinct but that are from a smaller domain
• Motivation: Suppose original names are serial numbers (many digits), but we'd like the procs to do some kind of time slicing based on their ids

Renaming Problem Definition
Termination: Eventually every nonfaulty proc p_i decides on a new name y_i.
Uniqueness: If p_i and p_j are distinct nonfaulty procs, then y_i ≠ y_j.
We are interested in anonymous algorithms: procs don't have access to their indices, just to their original names. Code depends only on your original name.

Performance of Renaming Algorithm
• New names should be drawn from {1,2,…,M}.
• We would like M to be as small as possible.
• Uniqueness implies M must be at least n.
• Due to the possibility of failures, M will actually be larger than n.

Renaming Results
• Algorithm for wait-free case (f = n - 1) with M = n + f = 2n - 1.
• Algorithm for general f with M = n + f.
• Lower bound that M must be at least n + 1, for wait-free case.
– Proof similar to impossibility of wait-free consensus
• Stronger lower bound that M must be at least n + f, if f is the number of failures
– Proof uses algebraic topology and is related to the lower bound for set consensus

Wait-Free Renaming Algorithm
Shared atomic snapshot object A; initially all segments are ⊥

s := 1 // suggestion for my new name
while true do
    update p_i's segment of A to be [x, s], where x is p_i's original name
    scan A
    if s is also someone else's suggestion then
        let r be the rank of x among the original names in non-⊥ segments
        let s be the r-th smallest positive integer not currently suggested by another proc
    else
        decide on s for new name and terminate

Analysis of Renaming Algorithm
Uniqueness: Suppose in contradiction p_i and p_j choose the same new name, s. Assume without loss of generality that p_i's last scan before deciding s precedes p_j's last scan before deciding s. The order of events is then:
1. p_i's last update before deciding: suggests s
2. p_i's last scan before deciding s
3. p_j's last scan before deciding s
p_j's scan sees s as p_i's suggestion, so p_j doesn't decide s, a contradiction.

Analysis of Renaming Algorithm
• New name space is {1,…,2n - 1}.
• Why?
• the rank of a proc p_i's original name is at most n (the largest one)
• the worst case is when each of the n - 1 other procs has suggested a different new name for itself, say {1,…,n - 1}.
• Then p_i suggests n + n - 1 = 2n - 1.

Analysis of Renaming Algorithm
Termination: Suppose in contradiction some set T of nonfaulty procs never decide in some execution.
• Consider the suffix α of the execution in which
– each proc in T has already done at least one update and
– only procs in T take steps (others have either already crashed or decided).

Analysis of Renaming Algorithm
• Let F be the set of new names that are free (not suggested at the beginning of α by any proc not in T); the trying procs need to choose new names from this set.
• Let z_1, z_2, … be the names in F in order.
• By the definition of α, no proc wakes up during α and reveals an additional original name, so all procs in T are working with the same set of original names during α.
• Let p_i be the proc whose original name has the smallest rank (among this set of original names). Let r be this rank.

Analysis of Renaming Algorithm
• Eventually procs other than p_i stop suggesting z_r as a new name:
– After α starts, every scan indicates a set of free names that is no larger than F.
– Every trying proc other than p_i has a larger rank and thus continually suggests a new name for itself that is larger than z_r, once it does its first scan in α.

Analysis of Renaming Algorithm
• Eventually p_i does suggest z_r as its new name:
– By choice of z_r as the r-th smallest free new name, and the fact that eventually the other trying procs stop suggesting z_1 through z_r, eventually p_i sees z_r as the r-th smallest free name.
• No other proc suggests z_r from then on, so p_i decides on z_r. This contradicts the assumption that p_i is trying (i.e., stuck).
• So termination holds.

General Renaming
• Suppose we know that at most f procs will fail, where f is not necessarily n - 1.
• We can use the wait-free algorithm, but it is wasteful in the size of the new name space, 2n - 1, if f < n - 1.
• We can do better (if f < n - 1) with a slightly different algorithm:
– keep track in the snapshot object of whether you have decided
– an undecided proc suggests a new name only if its original name is among the f + 1 lowest names of procs that have not yet decided.

k-Exclusion Problem
• A fault-tolerant version of mutual exclusion.
• Processors can fail by crashing, even in the critical section (in which case they stay there forever).
• Allow up to k processors to be in the critical section simultaneously.
• If fewer than k processors fail, then any nonfaulty processor that wishes to enter the critical section eventually does so.

k-Exclusion Algorithm
cf. paper by Afek et al. [5].

k-Assignment Problem
• A specialization of k-Exclusion to include:
• Uniqueness: Each proc in the critical section has a variable called slot, which is an integer between 1 and m. If p_i and p_j are in the critical section concurrently, then they have different slots.
• Models the situation when there is a pool of identical resources, each of which must be used exclusively:
– k is the number of procs that can be in the pool concurrently
– m is the number of resources
– To handle failures, m should be larger than k

k-Assignment Algorithm Schema
k-assignment entry section:
    k-exclusion entry section
    renaming using m = 2k - 1 names
k-assignment exit section:
    k-exclusion exit section

k-Assignment Algorithm Schema
k-assignment entry section:
    k-exclusion entry section
    request-name for long-lived renaming using m = 2k - 1 names
k-assignment exit section:
    release-name for long-lived renaming using m = 2k - 1 names
    k-exclusion exit section
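
Both schemas rely on renaming into a space of 2k - 1 names. As a rough illustration of why that many names suffice, here is a sketch of the one-shot wait-free renaming loop from the earlier slides, run among k participants. It is a simulation under stated assumptions, not the long-lived request-name/release-name object: the snapshot object is a Python list, the interleaving comes from a seeded random scheduler, no proc crashes, and the function name rename and its parameters are hypothetical.

```python
import itertools
import random

def rename(names, seed=0):
    """Simulate one-shot wait-free renaming; return {proc index: new name}.

    Assumptions (not from the slides): the snapshot object is a plain list,
    steps are interleaved by a seeded random scheduler, and no proc crashes.
    """
    rng = random.Random(seed)
    n = len(names)
    A = [None] * n                    # segment i holds (original name, suggestion)
    s = [1] * n                       # each proc's current suggestion
    need_update = [True] * n          # next step is an update (else a scan)
    decided = {}
    while len(decided) < n:
        i = rng.choice([p for p in range(n) if p not in decided])
        if need_update[i]:
            A[i] = (names[i], s[i])   # update own segment
            need_update[i] = False
            continue
        view = [seg for seg in A if seg is not None]            # atomic scan
        others = {A[j][1] for j in range(n) if j != i and A[j] is not None}
        if s[i] in others:
            # rank of own original name among names seen in the scan
            r = sorted(x for x, _ in view).index(names[i]) + 1
            # r-th smallest positive integer not suggested by another proc
            free = (m for m in itertools.count(1) if m not in others)
            s[i] = next(itertools.islice(free, r - 1, None))
            need_update[i] = True
        else:
            decided[i] = s[i]         # suggestion is unique in this scan: decide
    return decided

if __name__ == "__main__":
    print(rename([9071, 42, 350018, 77]))   # k = 4 participants, names in 1..7
```

Since each participant's rank is at most k and at most k - 1 other suggestions can be in its way, every chosen name in this sketch falls in {1,…,2k - 1}, which matches the m = 2k - 1 used in the schemas.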