We design and build the world’s largest, fastest, most reliable data-center and WAN networks, to enable compute and storage not available anywhere else.
Our team brings together experts in networking, distributed systems, kernel and systems programming, and algorithms to create the networks that power Google. Our networks are among the world’s largest and fastest, and we design them to be reliable, cheap, and easy to evolve. We often use new technologies unavailable outside Google.
We exemplify Google’s Hybrid Approach to Research: we deploy real-world systems at global scale. Many members of our team have extensive research experience, we publish papers in conferences such as SIGCOMM, NSDI, SOSP, and OSDI, and we work closely with interns and faculty from leading universities.
Every Google product relies on the technologies we develop. Our networks support complex, highly available, planetary-scale distributed systems with billions of users. We constantly evolve our networks to meet the requirements of, and create opportunities for, new and better Google products, especially the rapidly growing Google Cloud.
Congestion control, network measurement, and traffic management
All networks are subject to congestion; we want to operate ours at high utilization levels (to reduce costs) while meeting strict performance objectives. We’re inventing new congestion avoidance protocols, and scaling up our global-scale, near-real-time, automated traffic engineering system. We’re building new techniques to measure our networks, accurately and at scale, to drive our evaluation of congestion-control techniques, and as real-time input to automated traffic management.
Data-center network design
We continue to innovate in designs for scalable, fast, cheap, reliable, and evolvable data-center networks. When necessary, we design our own hardware, and innovate in network topology and routing protocols. We’re exploring automatic techniques to optimize network designs.
We’re working towards increasingly automated network management systems, enabling us to rapidly repair and modify our networks with little or no downtime. We’re using techniques such as formal modeling of network topologies and highly-available distributed systems, while working closely with Google’s network operators to implement automated workflows.
Programmable packet processing
To match the continuing increases in storage and networking hardware speed, we are developing new communication APIs and mechanisms for low-latency and CPU-efficient communication. We want our network switches and endpoints to implement novel packet-processing functions without compromising on cost or performance. These functions include load balancing, virtualization, access control, reliable transport, and packet-level event monitoring. We’re exploring a variety of hardware and software techniques for fast, flexible, safe packet processing, including onload, offload, RDMA, P4, and more.
We use software-defined networking extensively in both data-center networks and WANs. We collaborated on and popularized early work on OpenFlow, and continue to raise the level of abstraction for silicon-agnostic switching. We are developing SDN controller platforms that can handle Google’s needs for scale and reliability, and a set of SDN applications for routing, traffic management, and other functions.
High-velocity development and testing
To introduce our network innovations into production as rapidly as possible, without compromising availability, we test our designs and implementations early, often, and extensively. We are developing advanced software validation techniques, we embrace automation in all aspects of testing and qualification, and we build powerful infrastructure for testing, debugging, and root-cause analysis, in both physical and emulated testbeds.
We’ve developed one of the world’s largest, most cost-effective wide-area networks, and we continue to find ways to increase its scale and reliability, while extracting the best possible performance from expensive WAN hardware and fiber links. We’re employing Google-designed hardware, SDN controllers, and global-scale automated traffic engineering to address these challenges.