Testing New Networking Protocols
March 22, 2017 | MIT
The transmission control protocol, or TCP, which manages traffic on the internet, was first proposed in 1974. Some version of TCP still regulates data transfer in most major data centers, the huge warehouses of servers maintained by popular websites.
That’s not because TCP is perfect or because computer scientists have had trouble coming up with possible alternatives; it’s because those alternatives are too hard to test. The routers in data center networks have their traffic management protocols hardwired into them. Testing a new protocol means replacing the existing network hardware with either reconfigurable chips, which are labor-intensive to program, or software-controlled routers, which are so slow that they render large-scale testing impractical.
At the Usenix Symposium on Networked Systems Design and Implementation later this month, researchers from MIT’s Computer Science and Artificial Intelligence Laboratory will present a system for testing new traffic management protocols that requires no alteration to network hardware but still works at realistic speeds — 20 times as fast as networks of software-controlled routers.
The system maintains a compact, efficient computational model of a network running the new protocol, with virtual data packets that bounce around among virtual routers. On the basis of the model, it schedules transmissions on the real network to produce the same traffic patterns. Researchers could thus run real web applications on the network servers and get an accurate sense of how the new protocol would affect their performance.
“The way it works is, when an endpoint wants to send a [data] packet, it first sends a request to this centralized emulator,” says Amy Ousterhout, a graduate student in electrical engineering and computer science (EECS) and first author on the new paper. “The emulator emulates in software the scheme that you want to experiment with in your network. Then it tells the endpoint when to send the packet so that it will arrive at its destination as though it had traversed a network running the programmed scheme.”
Ousterhout is joined on the paper by her advisor, Hari Balakrishnan, the Fujitsu Professor in Electrical Engineering and Computer Science; Jonathan Perry, a graduate student in EECS; and Petr Lapukhov of Facebook.
Traffic control
Each packet of data sent over a computer network has two parts: the header and the payload. The payload contains the data the recipient is interested in — image data, audio data, text data, and so on. The header contains the sender’s address, the recipient’s address, and other information that routers and end users can use to manage transmissions.
When multiple packets reach a router at the same time, they’re put into a queue and processed sequentially. With TCP, if the queue gets too long, subsequent packets are simply dropped; they never reach their recipients. When a sending computer realizes that its packets are being dropped, it cuts its transmission rate in half, then slowly ratchets it back up.
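The behavior described above can be sketched in a few lines of Python. This is a simplified illustration, not real TCP: the names DropTailQueue and next_rate, the fixed increment, and the rate floor are assumptions made for the example.

```python
from collections import deque


class DropTailQueue:
    """Fixed-capacity FIFO queue: a packet that arrives when it is full is dropped."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.packets = deque()

    def enqueue(self, packet):
        """Return True if the packet was queued, False if it was dropped."""
        if len(self.packets) >= self.capacity:
            return False
        self.packets.append(packet)
        return True

    def dequeue(self):
        """Remove and return the oldest packet, or None if the queue is empty."""
        return self.packets.popleft() if self.packets else None


def next_rate(rate, loss_detected, increase=1.0, floor=1.0):
    """Halve the sending rate after a loss; otherwise raise it by a small step.

    The step size and the floor are illustrative values, not TCP constants.
    """
    if loss_detected:
        return max(floor, rate / 2)
    return rate + increase
```

A sender that calls next_rate once per round trip produces the familiar sawtooth pattern: a slow climb, a sharp cut when the queue overflows, then another climb.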
A better protocol might enable a router to flip bits in packet headers to let end users know that the network is congested, so they can throttle back transmission rates before packets get dropped. Or it might assign different types of packets different priorities, and keep the transmission rates up as long as the high-priority traffic is still getting through. These are the types of strategies that computer scientists are interested in testing out on real networks.
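As a rough sketch of those two ideas, the hypothetical router below sets a congestion bit in a packet's header once its queue reaches a threshold, and serves packets in priority order rather than arrival order. The class and field names, the threshold, and the convention that a lower number means higher priority are all assumptions made for illustration.

```python
import heapq
from dataclasses import dataclass
from itertools import count


@dataclass
class Header:
    src: str
    dst: str
    priority: int = 1               # assumption: lower number = more urgent
    congestion_marked: bool = False


class MarkingPriorityRouter:
    """Toy router that marks headers when busy and dequeues by priority."""

    def __init__(self, mark_threshold):
        self.mark_threshold = mark_threshold
        self._heap = []
        self._arrival = count()     # tie-breaker keeps FIFO order within a priority

    def enqueue(self, header):
        # Flip the congestion bit instead of dropping the packet.
        if len(self._heap) >= self.mark_threshold:
            header.congestion_marked = True
        heapq.heappush(self._heap, (header.priority, next(self._arrival), header))

    def dequeue(self):
        """Return the most urgent queued header, or None if the queue is empty."""
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[2]
```

An endpoint that sees the congestion bit set on a delivered packet can slow down before any packet is lost, which is the early warning that plain drop-based signaling lacks.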
Speedy simulation
With the MIT researchers’ new system, called Flexplane, the emulator, which models a network running the new protocol, uses only packets’ header data, reducing its computational burden. In fact, it doesn’t necessarily use all the header data — just the fields that are relevant to implementing the new protocol.
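To illustrate why this keeps the emulator's workload small, the sketch below builds a request that copies only selected header fields and never touches the payload. The Packet type, the abstract_request function, and the field names are hypothetical; they are not taken from Flexplane itself.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    header: dict    # e.g. source, destination, priority
    payload: bytes  # application data; the emulator never needs it


def abstract_request(packet, relevant_fields):
    """Copy only the header fields the emulated scheme needs."""
    return {name: packet.header[name] for name in relevant_fields}


pkt = Packet(
    header={"src": "10.0.0.1", "dst": "10.0.0.2", "priority": 0, "ttl": 64},
    payload=b"x" * 1000,
)
request = abstract_request(pkt, ("src", "dst", "priority"))
```

Here the request carries three small fields, while the 1,000-byte payload and the unused header field stay on the sending server.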
When a server on the real network wants to transmit data, it sends a request to the emulator, which sends a dummy packet over a virtual network governed by the new protocol. When the dummy packet reaches its destination, the emulator tells the real server that it can go ahead and send its real packet.
If, while passing through the virtual network, a dummy packet has some of its header bits flipped, the real server flips the corresponding bits in the real packet before sending it. If a clogged router on the virtual network drops a dummy packet, the corresponding real packet is never sent. And if, on the virtual network, a higher-priority dummy packet reaches a router after a lower-priority packet but jumps ahead of it in the queue, then on the real network, the higher-priority packet is sent first.
The servers on the network thus see the same packets in the same sequence that they would if the real routers were running the new protocol. There’s a slight delay between the first request issued by the first server and the first transmission instruction issued by the emulator. But thereafter, the servers issue packets at normal network speeds.
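The request-and-release cycle described above can be sketched as a toy emulator with a single virtual router. This is a minimal illustration under stated assumptions, not the researchers' implementation: the names ToyEmulator and AbstractPacket, the one-packet-per-step service rate, and the queue thresholds are all invented for the example.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class AbstractPacket:
    packet_id: int
    src: str
    dst: str
    marked: bool = False  # stands in for a header bit flipped by a virtual router


class ToyEmulator:
    """One virtual router that forwards one dummy packet per time step."""

    def __init__(self, capacity, mark_threshold):
        self.capacity = capacity
        self.mark_threshold = mark_threshold
        self.queue = deque()

    def request(self, pkt):
        """An endpoint asks to send. Returns a drop verdict if the virtual queue is full."""
        if len(self.queue) >= self.capacity:
            return ("drop", pkt.packet_id)   # the real packet is never sent
        if len(self.queue) >= self.mark_threshold:
            pkt.marked = True                # the real packet gets the same bit flipped
        self.queue.append(pkt)
        return None

    def step(self):
        """Advance one time step; return a send instruction, or None if idle."""
        if not self.queue:
            return None
        pkt = self.queue.popleft()
        return ("send", pkt.packet_id, pkt.marked)
```

In this sketch an endpoint would hold each real packet until it receives the matching send instruction, apply any mark to the real header, and discard the packet on a drop verdict.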
The ability to use real servers running real web applications offers a significant advantage over another popular technique for testing new network management schemes: software simulation, which generally uses statistical patterns to characterize the applications’ behavior in a computationally efficient manner.
“Being able to try real workloads is critical for testing the practical impact of a network design and to diagnose problems for these designs,” says Minlan Yu, an associate professor of computer science at Yale University. “This is because many problems happen at the interactions between applications and the network stack” — the set of networking protocols loaded on each server — “which are hard to understand by simply simulating the traffic.”
“Flexplane takes an interesting approach of sending abstract packets through the emulated data-plane resource management solutions and then feeding back the modified real packets to the real network,” Yu adds. “This is a smart idea that achieves both high link speed and programmability. I hope we can build up a community using the Flexplane test bed for testing new resource management solutions.”