Modern datacenters and backbone links carry thousands of applications and millions of connections, all competing for high bandwidth and low latency. Utilizing network resources well while isolating traffic often requires mechanisms at end hosts that shape, rate limit, pace, prioritize, or delay some packets over others. These mechanisms involve one or more classifiers, which match packets and move them into different queues based on policy; a shaping algorithm attached to each queue, which delays, drops, or remarks packets as necessary; and one or more scheduling algorithms, which consume packets from the queues and are generally responsible for fairness and prioritization across them. Scaling these architectures at end hosts to thousands of traffic classes while maintaining performance and isolation is challenging, especially in Cloud environments or when trying to offload functionality to hardware.
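To make the conventional architecture described above concrete, the following is a minimal sketch of a classifier, per-queue shapers, and a scheduler. It is an illustration under stated assumptions, not the implementation of any particular system: the token-bucket shaper, the round-robin scheduler, and all names (`Packet`, `ShapedQueue`, `Pipeline`, the `policy` mapping) are hypothetical choices made for this example.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Packet:
    flow: str   # hypothetical flow identifier used by the classifier
    size: int   # packet size in bytes


@dataclass
class ShapedQueue:
    """One traffic class: a FIFO queue guarded by a token bucket (assumed shaper)."""
    rate: float           # token refill rate, bytes per second
    burst: float          # bucket depth, bytes
    tokens: float = 0.0   # bucket starts empty
    last: float = 0.0     # time of the last refill, seconds
    packets: deque = field(default_factory=deque)

    def refill(self, now: float) -> None:
        # Add tokens for the elapsed time, capped at the bucket depth.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now

    def pop_if_conformant(self, now: float):
        # Release the head packet only if enough tokens have accumulated;
        # otherwise the packet stays queued (it is delayed, not dropped).
        self.refill(now)
        if self.packets and self.packets[0].size <= self.tokens:
            self.tokens -= self.packets[0].size
            return self.packets.popleft()
        return None


class Pipeline:
    def __init__(self, policy: dict, queues: dict):
        self.policy = policy   # flow id -> traffic class name
        self.queues = queues   # traffic class name -> ShapedQueue

    def classify(self, pkt: Packet) -> None:
        # Classifier: match the packet against policy and enqueue it.
        self.queues[self.policy[pkt.flow]].packets.append(pkt)

    def schedule(self, now: float) -> list:
        # Scheduler: one round-robin pass, at most one packet per queue.
        sent = []
        for queue in self.queues.values():
            pkt = queue.pop_if_conformant(now)
            if pkt is not None:
                sent.append(pkt)
        return sent
```

In this sketch every scheduling pass visits every queue, so the work per pass grows with the number of traffic classes, which is one way to see why such designs become costly at thousands of classes.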
In this paper we present Carousel, a rate limiter that scales to thousands of policies and millions of flows, does not require multiple queues or the cooperation of VMs or OS drivers in classifying packets, and is easily implemented on modern hardware and software. Production experience at a Cloud service provider demonstrates that existing traffic shapers consume 10% of a server’s overall CPU resources, while Carousel can accomplish the same work at negligible cost. Carousel's rate conformance is within 0.5% of the target rate, compared with 6% for existing systems.