Ubiq: A Scalable and Fault-tolerant Log Processing Infrastructure


Most of today’s Internet applications are data-centric and generate vast amounts of data (typically, in the form of event logs) that needs to be processed and analyzed for detailed reporting, enhancing user experience and increasing monetization. In this paper, we describe the architecture of Ubiq, a geographically distributed framework for processing continuously growing log files in real time with high scalability, high availability and low latency. The Ubiq framework fully tolerates infrastructure degradation and datacenter-level outages without any manual intervention. It also guarantees exactly-once semantics for application pipelines to process logs in the form of event bundles. Ubiq has been in production for Google’s advertising system for many years and has served as a critical log processing framework for hundreds of pipelines. Our production deployment demonstrates linear scalability with machine resources, extremely high availability even with underlying infrastructure failures, and an end-to-end latency of under a minute.