1Pipe: Scalable Total Order Communication in Data Center Networks
1Pipe is a causally and totally ordered communication primitive for scattering groups of messages over a data center network. With in-network computation using Barefoot or Arista switches, 1Pipe achieves scalability and high performance with low CPU and network overheads. Published in SIGCOMM’21.
This paper proposes 1Pipe, a novel communication abstraction that enables different receivers to process messages from senders in a consistent total order. More precisely, 1Pipe provides both unicast and scattering (i.e., a group of messages to different destinations) in a causally and totally ordered manner. 1Pipe provides a best effort service that delivers each message at most once, as well as a reliable service that guarantees delivery and provides restricted atomic delivery for each scattering. 1Pipe can simplify and accelerate many distributed applications, e.g., transactional key-value stores, log replication, and distributed data structures.
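To make the ordering guarantee concrete, here is a minimal Python sketch of what a scattering and a consistent total order look like from the receivers' point of view. It is an illustration only, not 1Pipe's API: the names `scatter` and `deliver`, the tuple layout, and the use of a `(timestamp, sender)` pair as the ordering key are all assumptions made for this example.

```python
from collections import defaultdict

def scatter(timestamp, sender, messages):
    """Hypothetical helper: build one scattering, i.e. a group of messages
    to different destinations that share one position in the total order.
    `messages` maps destination name -> payload."""
    return [(timestamp, sender, dst, payload) for dst, payload in messages.items()]

def deliver(all_messages):
    """Hypothetical helper: each receiver processes its own messages
    sorted by the assumed ordering key (timestamp, sender)."""
    inbox = defaultdict(list)
    for ts, sender, dst, payload in sorted(all_messages):
        inbox[dst].append((ts, sender, payload))
    return dict(inbox)

# Two senders each scatter to the same two receivers.
s1 = scatter(10, "A", {"R1": "a1", "R2": "a2"})
s2 = scatter(12, "B", {"R1": "b1", "R2": "b2"})

# Messages may arrive in any order; here B's scattering arrives first.
inbox = deliver(s2 + s1)

# Both receivers see the scatterings in the same relative order.
order = {dst: [(ts, snd) for ts, snd, _ in msgs] for dst, msgs in inbox.items()}
print(order["R1"] == order["R2"])  # True
```

The point of the sketch is the last line: because every receiver sorts by the same key, any two receivers agree on the relative order of the scatterings they both receive, regardless of arrival order.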
We propose a scalable and efficient method to implement 1Pipe inside data centers. To achieve total order delivery in a scalable manner, 1Pipe separates the bookkeeping of order information from message forwarding, and distributes the work to each switch and host. 1Pipe aggregates order information using in-network computation at switches. This forms the “control plane” of the system. On the “data plane”, 1Pipe forwards messages in the network as usual and reorders them at the receiver based on the order information.
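The split between control plane and data plane can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not the actual implementation: it assumes the order information is a timestamp carried by each message plus a "barrier" value meaning that no message with a smaller timestamp will arrive later, and that a switch aggregates barriers by taking the minimum over its input links. The names `Message`, `switch_barrier`, and `Receiver` are hypothetical.

```python
import heapq
from dataclasses import dataclass, field

@dataclass(order=True)
class Message:
    timestamp: int                       # assumed ordering key, set by the sender
    sender: str                          # tie-breaker for equal timestamps
    payload: str = field(compare=False)  # not part of the ordering

def switch_barrier(input_link_barriers):
    """Control plane (assumed rule): the barrier a switch forwards is the
    minimum of the barriers seen on its input links."""
    return min(input_link_barriers)

class Receiver:
    """Data plane: messages arrive in any order and are buffered; they are
    delivered in (timestamp, sender) order once the barrier passes them."""

    def __init__(self):
        self._buffer = []    # min-heap of pending messages
        self.delivered = []  # messages handed to the application, in order

    def on_message(self, msg):
        heapq.heappush(self._buffer, msg)

    def on_barrier(self, barrier):
        # Deliver every buffered message strictly below the barrier.
        while self._buffer and self._buffer[0].timestamp < barrier:
            self.delivered.append(heapq.heappop(self._buffer))

r = Receiver()
r.on_message(Message(5, "B", "y"))   # arrives first, but is ordered second
r.on_message(Message(3, "A", "x"))
r.on_message(Message(9, "A", "z"))   # above the barrier, stays buffered
r.on_barrier(switch_barrier([7, 10]))
print([m.payload for m in r.delivered])  # ['x', 'y']
```

In this model the switches never buffer or reorder messages; they only compute a minimum per barrier update, which is why the bookkeeping can be spread across switches and hosts while message forwarding stays unchanged.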
Evaluation on a 32-server testbed shows that 1Pipe achieves scalable throughput (80M msg/s per host) and low latency (10 μs) with little CPU and network overhead. In a transactional key-value store, TPC-C, remote data structures, and replication, 1Pipe achieves linearly scalable throughput and low latency, outperforming traditional designs by 2x to 20x.
- Bojie Li, Gefei Zuo, Wei Bai, and Lintao Zhang. 1Pipe: Scalable Total Order Communication in Data Center Networks. SIGCOMM’21. [Paper PDF]
- Gefei Zuo. Near-Optimal Total Order Message Scattering in Data Center Networks. Second Place, SOSP’17 Student Research Competition (SRC) Undergraduate Category. [PDF] [Poster]
- Aug. 2021, SIGCOMM’21 talk by Bojie Li. [Slides with audio (25 min)] [Slides with audio (12 min)] [Talk Transcription]
- June 2021, APNet’21 SIGCOMM/NSDI talk by Bojie Li. [Slides] [Video] [Video backup link]
- Oct. 2017, SOSP’17 Student Research Competition (SRC) by Gefei Zuo. [PDF] [Poster]
- Oct. 2017, Second Place, SOSP’17 Student Research Competition (SRC) Undergraduate Category by Gefei Zuo.
- Bojie Li, 4th-year Ph.D. student at MSRA and USTC (now a Senior Engineer at Huawei)
- Gefei Zuo, 4th-year undergraduate at USTC (now a Ph.D. student at the University of Michigan)
- Dr. Wei Bai, Associate Researcher at Microsoft Research Asia (now a Researcher at Microsoft Research Redmond)
- Dr. Lintao Zhang, Principal Researcher at Microsoft Research Asia (now with BaseBit Technologies)