Optimal Distributed Submodular Optimization via Sketching


As an important special case of submodular optimization problems, coverage problems are a central problem in optimization with a wide range of applications in data mining and machine learning. As we need to handle larger and larger data sets, there is a clear need to develop distributed solutions to these problems. While several results have been developed for distributed coverage maximizations, all the existing method have notable limitations, e.g., they all achieve either suboptimal approximation guarantees or suboptimal space and memory complexities. Moreover, most previous results for submodular maximization either explicitly or implicitly assume that one has a value oracle access to the submodular function. Such a value oracle for coverage functions has the following form: given a subfamily of (input) subsets, determine the size of the union of the subsets in this subfamily.