ARIADNE: Agnostic Reconfiguration In A Disconnected Network Environment
Konstantinos Aisopos, Andrew DeOrio, Li-Shiuan Peh, and Valeria Bertacco
PACT 2011
Presented by: Helen Arefayne Hagos and Yajing Chen
University of Michigan

Slide 2 — Motivation
- Transistor counts keep growing (150M, then 800M, now 1B transistors per chip), so permanent failures in the on-chip network become increasingly likely over a chip's lifetime.

Slide 3 — Solution
- Reconfiguration: adapt the routing function around the failed components.

Slide 4 — Existing solutions

  Scope     Targeted faults       Scheme     Reliability  Performance  Area
  Off-chip  Bounded in number     —          Limited      N/A          N/A
  Off-chip  Pattern constrained   —          Limited      N/A          N/A
  On-chip   Not constrained       Immunet    High         Good         Excess
  On-chip   Not constrained       —          High         Bad          High
  On-chip   Not constrained       Vicis      Limited      Bad          Low

Slide 5 — Proposed solution
- ARIADNE: an Agnostic Reconfiguration algorithm In A Disconnected Network Environment.

Slide 6 — Routing Algorithm
- [Figure: example mesh with nodes 0–6 and faulty links.]

Slide 7 — Routing Algorithm (cont'd): 1st broadcast
- Node 1 broadcasts as the root; every other node records in routing-table entry RT[1] the output port(s) on which the broadcast arrived (e.g., RT[1]=E at one node, RT[1]=W at another, RT[1]=S,W and RT[1]=N,E at nodes reached over two ports).
- Each traversed link is labeled "up" (toward the root) or "down" (away from it). Conflict resolution: a link that could be labeled either way is labeled "up".
- Example routing table after the 1st broadcast (1 marks a usable output port):

  Dest  N  S  E  W
  1     0  0  1  0
  2–7   0  0  0  0

Slide 8 — Routing Algorithm (cont'd): 2nd broadcast
- Node 2 broadcasts next, filling the RT[2] entries (e.g., RT[2]=W, RT[2]=E, RT[2]=N, RT[2]=N,E) and adding further up/down labels.
- A single broadcast takes N cycles, so the entire reconfiguration takes O(N²) cycles.

Slide 9 — Routing Algorithm (cont'd): 5th broadcast
- Node 5 broadcasts, filling the RT[5] entries (e.g., RT[5]=W,S with the S option crossed out, RT[5]=S, RT[5]=W).
- The up*/down* turn restriction makes the path from node 0 to node 5 illegal, since it would require a forbidden down→up turn.

Slide 10 — Synchronization
- Atomic broadcasts
- Atomic synchronization

Slide 11 — Router Architecture
- [Figure: baseline router datapath with the ARIADNE additions highlighted.]

Slide 12 — Experiment Setup
- ARIADNE is implemented in the Wisconsin Multifacet GEMS simulator as part of the GARNET network model.
- Compared against two state-of-the-art routing algorithms: Vicis and Immunet.

  Network topology:     8x8 2D mesh
  Processors:           in-order SPARC cores
  Memory controllers:   4
  Coherence:            MOESI protocol
  Channel width:        64 bits
  Router architecture:  5-stage pipeline
  Router ports, VCs:    5 ports, 2 VCs (private)
  Router buffers/port:  5 flits per VC
  L1 cache:             private, unified, 32KB/node, 2-way, 3-cycle latency
  L2 cache:             shared, distributed, 1MB/node, 16-way, 15-cycle latency
  Synthetic traffic:    Uniform Random (UR), Transpose (TR)
  Packet length:        5 flits
  Benchmark suite:      PARSEC
  Simulation time:      100K cycles
  Simulation warmup:    10K cycles

Slide 13 — Performance Evaluation: Synthetic Traffic
- Average latency: the delay experienced by a packet from source to destination.
- Throughput: the rate of packets delivered per cycle.
- [Figure: latency vs. injection rate, showing zero-load latency and saturation throughput.]

Slide 14 — Performance Evaluation: PARSEC
- [Figures: average latency at 50 faults; latency over a variable number of faults.]

Slide 15 — Reliability Evaluation
- Reliability here means the ability to maximize the connectivity of a faulty network.
- [Figures: connectivity under PARSEC and synthetic traffic; probability of deadlock.]

Slide 16 — Reconfiguration Evaluation
- During reconfiguration, all packets experience an additional delay of N² cycles (N = 64, i.e., 4096 cycles).
- [Figures: average latency over time; reconfiguration duration.]

Slide 17 — Hardware Overhead
- Implemented in Verilog HDL. Baseline 5-stage router: 2.708 mm²; router with ARIADNE: 2.761 mm².
- Slightly more overhead than Vicis, but about one third of Immunet's overhead.

  Scheme   Wires  Routing tables                                   Overhead (mm²)  % overhead
  Ariadne  1      Adaptive                                         0.053           1.97%
  Vicis    1      Deterministic                                    0.040           1.48%
  Immunet  4      Adaptive, deterministic, and a small safe table  0.162           5.98%

Slide 18 — Summary
- Deadlock avoidance:
  - Normal operation: up*/down* routing.
  - Reconfiguration: eject and re-inject in-flight packets; cost: one additional dedicated flit buffer per NIC.
- On-chip: a fully "distributed" solution, though it requires a global clock to support atomicity.
- Protecting ARIADNE itself: Triple Modular Redundancy (TMR).
- Partitioned network: packet delivery is guaranteed within each partition, but packets cannot cross partitions.

Slide 19 — Conclusion
- High performance and deadlock-free routing by
utilizing up*/down* routing.
- Performance gain: 40% to 140% at 50 faults during normal operation.
- High reliability: if a path between two nodes exists, at least one deadlock-free path is guaranteed.
- A fully distributed solution with simple hardware and low complexity; nodes coordinate to explore the surviving topology.
- 1.97% area overhead.

Slide 20 — Discussion
- Are the two benchmark types (synthetic traffic and PARSEC) representative enough?
- Does the packet size affect the results?
- Does the topology influence the results?
- How does the approach scale?
- What is the power consumption during reconfiguration?
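The broadcast-based table construction walked through on the routing-algorithm slides can be sketched as a small simulation: every surviving node broadcasts in turn, and each node records the port on which the flood first reached it as its route back toward that root. This is a hedged sketch, not the authors' hardware: the mesh encoding, tie-breaking by flood order, and the `reconfigure` name are my assumptions, and a real ARIADNE pass runs one broadcast per N-cycle epoch in router logic rather than in software.

```python
from collections import deque

# Directions in a 2D mesh; a node reached via direction d stores the
# opposite port as its route back toward the broadcast root.
DIRS = {"N": (0, 1), "S": (0, -1), "E": (1, 0), "W": (-1, 0)}
OPP = {"N": "S", "S": "N", "E": "W", "W": "E"}

def reconfigure(width, height, dead_links):
    """One ARIADNE-style reconfiguration pass (simulation sketch):
    every node broadcasts in turn, and each other node records the
    port on which the flood first arrived as its route toward the root."""
    nodes = [(x, y) for x in range(width) for y in range(height)]
    rt = {n: {} for n in nodes}              # rt[node][root] = port toward root
    for root in nodes:                       # N broadcasts in total ...
        frontier = deque([root])
        seen = {root}
        while frontier:                      # ... each flooding hop by hop (~N cycles)
            u = frontier.popleft()
            for d, (dx, dy) in DIRS.items():
                v = (u[0] + dx, u[1] + dy)
                if v not in rt or v in seen:
                    continue                 # off-mesh, or already reached
                if frozenset((u, v)) in dead_links:
                    continue                 # faulty link: the flood routes around it
                rt[v][root] = OPP[d]         # first arrival wins this table entry
                seen.add(v)
                frontier.append(v)
    return rt
```

The N broadcasts of roughly N cycles each reproduce the O(N²) reconfiguration time quoted on the slides; killing a link (e.g., the one between two adjacent corner nodes) makes the affected entries reroute around it.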
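The up*/down* rule behind the deck's deadlock-freedom claim (and behind the illegal node 0 → node 5 path in the 5th-broadcast example) reduces to a one-pass legality check: a legal path is some "up" hops followed by some "down" hops, and a down→up turn is forbidden because it could close a cycle in the channel-dependency graph. A minimal sketch, assuming link labels are supplied by the caller; the function name and the toy labeling in the example are mine, not from the paper.

```python
def path_is_legal(path, is_up_link):
    """up*/down* check: once a path has taken a 'down' hop, it may
    never take an 'up' hop again (down->up turns are forbidden)."""
    gone_down = False
    for src, dst in zip(path, path[1:]):
        if is_up_link(src, dst):
            if gone_down:
                return False     # down -> up turn: illegal path
        else:
            gone_down = True     # entered the 'down' phase
    return True

# Toy labeling for illustration: treat node 0 as the root, so a hop
# toward a smaller node id counts as 'up'.
up = lambda src, dst: dst < src
```

Restricting routes this way is what lets ARIADNE stay deadlock-free with adaptive tables during normal operation, while the eject-and-re-inject mechanism from the summary slide covers the window when tables are being rewritten.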