We present the implementation of a large-scale latency estimation system based on GNP and incorporated into the Google content delivery network. Our implementation does not rely on active participation of Web clients, and carefully controls the overhead incurred by latency measurements using a scalable centralized scheduler. It also requires only a small number of CDN modifications, which makes it attractive for any CDN interested in large-scale latency estimation.
We investigate the issue of coordinate stability over time and show that coordinates drift away from their initial values with time, so that 25% of node coordinates become inaccurate by more than 33 milliseconds after one week. However, daily recomputations make 75% of the coordinates stay within 6 milliseconds of their initial values. Furthermore, we demonstrate that using coordinates to decide on client-to-replica redirection leads to selecting replicas closest in term of measured latency in 86% of all cases. In another 10% of all cases, clients are redirected to replicas offering latencies that are at most two times longer than optimal. Finally, collecting a huge volume of latency data and using clustering techniques enable us to estimate latencies between globally distributed Internet hosts that have not participated in our measurements at all. The results are sufficiently promising that Google may offer a public interface to the latency estimates in the future.