Contention-based nonminimal adaptive routing in high-radix networks
Document type: Conference report
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Rights access: Open Access
European Commission's project: ROMOL - Riding on Moore's Law (EC-FP7-321253)
Adaptive routing is an efficient congestion avoidance mechanism for modern datacenter and HPC networks. Congestion detection traditionally relies on the occupancy of the router queues. However, this approach can hinder performance: with small buffers the measurements are coarse-grained, and with large buffers routing can oscillate. We introduce an alternative mechanism, labelled Contention-Based Adaptive Routing. Our mechanism adapts routing based on an estimation of “network contention”, the simultaneity of traffic flows contending for a network port. It employs a set of counters which track the demand for each output port. This exploits path diversity thanks to earlier detection of adversarial traffic patterns, and decouples buffer size and queue occupancy from contention detection. We evaluate our mechanism in a Dragonfly network. Our evaluations show that this mechanism achieves optimal latency under uniform traffic and latency similar to the best previous routing mechanisms under adversarial patterns, with immediate adaptation to traffic pattern changes.
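The counter idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `ContentionRouter`, the `threshold` parameter, and the rule for picking a nonminimal port are all assumptions made for illustration. The only element taken from the abstract is that each output port has a counter tracking demand, independent of buffer occupancy.

```python
class ContentionRouter:
    """Hypothetical router model: one demand counter per output port."""

    def __init__(self, num_ports, threshold):
        # threshold is an assumed tunable, not a value from the paper
        self.threshold = threshold
        self.counters = [0] * num_ports

    def packet_arrives(self, minimal_port):
        # A packet that wants this output port raises its demand counter.
        self.counters[minimal_port] += 1

    def packet_leaves(self, minimal_port):
        # The demand is released once the packet leaves the router.
        self.counters[minimal_port] -= 1

    def select_port(self, minimal_port, nonminimal_ports):
        # Route minimally while demand for the minimal port is below the
        # threshold; otherwise divert to the least-demanded alternative
        # (assumed policy; ties resolve to the first candidate listed).
        if self.counters[minimal_port] < self.threshold or not nonminimal_ports:
            return minimal_port
        return min(nonminimal_ports, key=lambda p: self.counters[p])


router = ContentionRouter(num_ports=4, threshold=3)
for _ in range(3):
    router.packet_arrives(0)           # three packets contend for port 0
diverted = router.select_port(0, [1, 2])
router.packet_leaves(0)                # one contender departs
restored = router.select_port(0, [1, 2])
```

Note that the decision depends only on how many packets want a port, not on how full any queue is, which is the decoupling from buffer size the abstract refers to.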
Citation: Fuentes, P., Vallejo, E., García, M., Beivide, R., Rodríguez, G., Minkenberg, C., Valero, M. Contention-based nonminimal adaptive routing in high-radix networks. In: IEEE International Parallel and Distributed Processing Symposium. "2015 IEEE 29th International Parallel and Distributed Processing Symposium: 25-29 May 2015, Hyderabad, India: proceedings". Hyderabad: Institute of Electrical and Electronics Engineers (IEEE), 2015, p. 103-112.