Dragonfly Topology and Routing
27 Slides463.87 KB
Dragonfly Topology and Routing
Outline Background Motivation Topology description Routing – Minimal Routing – Valiant Routing – UGAL/G Adaptive Routing – Indirect Adaptive Routing Credit Round Trip Reservation Piggyback Progressive – Performance Comparison
Background As memory and processor performance increases, interconnect networks are becoming critical Topology of an interconnect network affects the performance and cost of the network A good interconnect network, exploits emerging technologies
Motivation Increasing router pin bandwidth – High-radix routers Development of active optical cables – Longer links with less cost per unit distance Using above technology advancements, we can build networks with higher performance. How?
Motivation Reduced network diameter and latency
Motivation Problem 1: Number of ports in each router is limited (64, 128, ) – We want much higher radices (8K – 1M nodes) Problem 2: Long global links between groups are expensive and dominate network cost – We should minimize number of global channels traversed by an average packet
Motivation Solution: use group of networks connected to a sub-network as a virtual high-radix router – All minimal routes traverse at most only one global link – Length of global links are increased to reduce the cost
Dragonfly Topology K radix of each router p a h - 1 K’ virtual router radix a(p h) N ap(ah 1) [Kim et al. ISCA08]
Topology Description Three-level architecture: – Router, Group, System Arbitrary networks can be used for inter-group and intra-group networks K’ K – Very high radix virtual routers – Enables very low global diameter ( 1) To balance channel load on load balanced traffic: – a 2p 2h
Topology Variations [Kim et al. ISCA08]
Minimal Routing Step 1 : If Gs Gd and Rs does not have a connection to Gd, route within Gs from Rs to Ra, a router that has a global channel to Gd. Step 2 : If Gs Gd, traverse the global channel from Ra to reach router Rb in Gd. Step 3 : If Rb Rd, route within Gd from Rb to Rd.
Minimal Routing
Minimal Routing Good for uniform traffic – All links are used evenly Link saturation happens on adversarial traffic – Global ADV – Local ADV Load balancing mechanism needed to distribute traffic
Valiant Randomized Routing Step 1 : If Gs Gi and Rs does not have a connection to Gi, route within Gs from Rs to Ra, a router that has a global channel to Gi. Step 2 : If Gs Gi traverse the global channel from Ra to reach router Rx in Gi. Step 3 : If Gi Gd and Rx does not have a connection to Gd, route within Gi from Rx to Ry, a router that has a global channel to Gd. Step 4 : If Gi Gd, traverse the global channel from Ry to router Rb in Gd. Step 5 : If Rb Rd, route within Gd from Rb to Rd.
Valiant Routing
Valiant Routing Balances use of global links Increases path length by at least one global link Performs poorly on benign traffic Maximum throughput can be 50%
UGAL-G/L Adaptive Routing Choose between MIN and VAL on a packet by packet basis to load balance the network Path with minimum delay is selected: – Queue length – Hop count UGAL-L uses local queue info at the current router node UGAL-G uses queue info for all global channels in Gs
UGAL Adaptive Routing Measuring path queue length is unrealistic (UGAL-G) Use local queue length to approximate path queue length Local queues only sense congestion on a global channel via backpressure over the local channel – Requires stiff backpressure
Adaptive Routing [Jiang et al. ISCA09]
Indirect Adaptive Routing Improve routing decision through remote congestion information Four methods: – Credit Round Trip – Reservation – Piggyback – Progressive
Credit Round Trip [Jiang et al. ISCA09]
Credit Round Trip Delay the return of local credits to the congested router Creates the illusion of stiffer backpressure MIN GC VAL GC Congestion Drawbacks: – Remote Congestion is still sensed through local queue – Info is not up to date Credits Delayed Credits Source Router [Jiang et al. ISCA09] 22
Reservation Reserve bandwidth on minimal global channel If successful send the packet minimally If not, route non-minimally Drawbacks: – Needs buffer at source router to hold waiting packets – Packet latency increased by round-trip time of RES flit – RES flits can create significant load on source group MIN GC VAL GC Congestion RES Failed RES Flit Source Router [Jiang et al. ISCA09]
Piggyback Broadcast link state info of GCs to adjacent routers Each router maintains the most recent link state information for every GCs in its group. routing decision is made using both global state information and the local queue depth congestion level of each GC is compressed into a single bit (SGC) Drawbacks: –Consumes extra bandwidth –Congestion information not up to date due to broadcast delay MIN GC VAL GC Congestion GC Free GC Busy Source Router [Jiang et al. ISCA09]
Progressive Re-evaluate the decision to route minimally at each hop in the source group Non-minimal routing decisions are final The packet is routed minimally until congestion encountered. Then it routes non-minimally Drawbacks: – Adds extra hops – Needs an additional virtual channel to avoid deadlocks MIN GC VAL GC Congestion Source Router [Jiang et al. ISCA09]
Steady State Traffic: Uniform Random 300 Packet Latency (Simulation cycles) 280 260 Piggyback Credit Round Trip Progressive Reservation Minimal 240 220 200 180 160 140 120 100 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 Throughput (Flit Injection Rate) [Jiang et al. ISCA09] 26
Steady State Traffic: Worst Case Packet Latency (Simulation cycles) 450 400 350 Piggyback Credit Round Trip Progressive Reservation Valiant’s 300 250 200 150 100 0 0.1 0.2 0.3 Throughput (Flit Injection Rate) 0.4 0.5 [Jiang et al. ISCA09] 27