MohammadHossein Bateni

MohammadHossein Bateni is a staff research scientist at Google, where he is a member of the NYC Algorithms and Optimization Team. He obtained his Ph.D. and M.A. in Computer Science from Princeton University in 2011 and 2008, respectively, after finishing his undergraduate studies with a B.Sc. in Computer Engineering at Sharif University of Technology in 2006. Hossein is broadly interested in combinatorics and combinatorial optimization. His research focuses on approximation algorithms, distributed computing, and analysis of game-theoretic models.
Authored Publications
    Representative Selection (RS) is the problem of finding a small subset of exemplars from a dataset that is representative of the dataset. In this paper, we study RS for attributed graphs, and focus on finding representative nodes that optimize the accuracy of a model trained on the selected representatives. Theoretically, we establish a new hardness result for RS (in the absence of a graph structure) by proving that a particular, highly practical variant of it (RS for Learning) is hard to approximate in polynomial time within any reasonable factor, which implies a significant potential gap between the optimum solution of widely-used surrogate functions and the actual accuracy of the model. We then study the setting where a (homophilous) graph structure is available, or can be constructed, between the data points. We show that with an appropriate modeling approach, the presence of such a structure can turn a hard RS (for learning) problem into one that can be effectively solved. To this end, we develop RS-GNN, a representation learning-based RS model based on Graph Neural Networks. Empirically, we demonstrate the effectiveness of RS-GNN on problems with predefined graph structures as well as problems with graphs induced from node feature similarities, by showing that RS-GNN achieves significant improvements over established baselines on a suite of eight benchmarks.
    Sequential Attention for Feature Selection
    Taisuke Yasuda
    Proceedings of the 11th International Conference on Learning Representations (2023)
    Feature selection is the problem of selecting a subset of features for a machine learning model that maximizes model quality subject to a budget constraint. For neural networks, prior methods, including those based on L1 regularization, attention, and other techniques, typically select the entire feature subset in one evaluation round, ignoring the residual value of features during selection, i.e., the marginal contribution of a feature given that other features have already been selected. We propose a feature selection algorithm called Sequential Attention that achieves state-of-the-art empirical results for neural networks. This algorithm is based on an efficient one-pass implementation of greedy forward selection and uses attention weights at each step as a proxy for feature importance. We give theoretical insights into our algorithm for linear regression by showing that an adaptation to this setting is equivalent to the classical Orthogonal Matching Pursuit (OMP) algorithm, and thus inherits all of its provable guarantees. Our theoretical and empirical analyses offer new explanations towards the effectiveness of attention and its connections to overparameterization, which may be of independent interest.
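    The abstract notes that, for linear regression, Sequential Attention reduces to classical Orthogonal Matching Pursuit. A minimal numpy sketch of that OMP analogue follows; the toy data, budget k, and correlation-based importance proxy are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def omp_select(X, y, k):
    """Greedy forward selection (OMP): pick the feature most correlated with
    the current residual, refit on the selected set, repeat. Per the abstract,
    Sequential Attention reduces to this in the linear-regression setting."""
    selected = []
    residual = y.astype(float).copy()
    for _ in range(k):
        scores = np.abs(X.T @ residual)        # importance proxy per feature
        scores[selected] = -np.inf             # never re-pick a chosen feature
        selected.append(int(np.argmax(scores)))
        Xs = X[:, selected]
        coef, *_ = np.linalg.lstsq(Xs, y, rcond=None)  # refit on selected set
        residual = y - Xs @ coef
    return selected

# toy check: 3 informative features hidden among 20 (synthetic, assumed data)
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 20))
y = X[:, [2, 7, 11]] @ np.array([3.0, -2.0, 1.5]) + 0.1 * rng.normal(size=200)
print(omp_select(X, y, k=3))                   # expect features 2, 7, 11
```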
    Metric clustering is a fundamental primitive in machine learning with several applications for mining massive data sets. An important example of metric clustering is the k-center problem. While this problem has been extensively studied in distributed settings, all previous algorithms require Ω(k) space per machine and Ω(nk) total work. In this paper, we develop the first highly scalable approximation algorithm for k-center clustering requiring o(k) space per machine with o(nk) total work. In particular, our algorithm needs Õ(n^ε) space per machine and Õ(n^{1+ε}) total work, and computes an O(log log log n)-approximation of the problem by selecting (1+o(1))k centers in O(log log n) rounds. This is achieved by introducing core-sets of truly sublinear size.
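    For context, the classical single-machine baseline for k-center is Gonzalez's farthest-point greedy 2-approximation, sketched below on synthetic data; the paper's contribution replaces this sequential scan with sublinear-size core-sets, which this sketch does not attempt.

```python
import numpy as np

def greedy_k_center(points, k, seed=0):
    """Gonzalez's farthest-point greedy, a 2-approximation for k-center:
    repeatedly add the point farthest from the centers chosen so far."""
    rng = np.random.default_rng(seed)
    centers = [int(rng.integers(len(points)))]
    dist = np.linalg.norm(points - points[centers[0]], axis=1)
    for _ in range(k - 1):
        nxt = int(np.argmax(dist))             # current farthest point
        centers.append(nxt)
        dist = np.minimum(dist, np.linalg.norm(points - points[nxt], axis=1))
    return centers, float(dist.max())          # centers and clustering radius

pts = np.random.default_rng(1).normal(size=(1000, 2))   # synthetic data
print(greedy_k_center(pts, k=10))
```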
    Cache-aware load balancing of data center applications
    Aaron Schild
    Ray Yang
    Richard Zhuang
    Proceedings of the VLDB Endowment, vol. 12 (2019), pp. 709-723
    Our deployment of cache-aware load balancing in the Google web search backend reduced cache misses by ~0.5x, contributing to a double-digit percentage increase in the throughput of our serving clusters by relieving a bottleneck. This innovation has benefited all production workloads since 2015, serving billions of queries daily. A load balancer forwards each query to one of several identical serving replicas. The replica pulls each term's postings list into RAM from flash, either locally or over the network. Flash bandwidth is a critical bottleneck, motivating an application-directed RAM cache on each replica. Sending the same term reliably to the same replica would increase the chance it hits the cache, and avoid polluting the other replicas' caches. However, most queries contain multiple terms and we have to send the whole query to one replica, so it is not possible to achieve a perfect partitioning of terms to replicas. We solve this via a voting scheme, whereby the load balancer conducts a weighted vote by the terms in each query, and sends the query to the winning replica. We develop a multi-stage scalable algorithm to learn these weights. We first construct a large-scale term-query graph from logs and apply a distributed balanced graph partitioning algorithm to cluster each term to a preferred replica. This yields a good but simplistic initial voting table, which we then iteratively refine via cache simulation to capture feedback effects.
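    A minimal sketch of the voting scheme described above; the table contents and weights are hypothetical, whereas in production they come from the graph-partitioning and cache-simulation pipeline the abstract describes.

```python
from collections import defaultdict

def route_query(terms, voting_table, default_replica=0):
    """Weighted vote over a query's terms: each term votes for its preferred
    replica with a learned weight, and the query goes to the winner."""
    tally = defaultdict(float)
    for term in terms:
        if term in voting_table:
            replica, weight = voting_table[term]
            tally[replica] += weight
    return max(tally, key=tally.get) if tally else default_replica

# hypothetical voting table: term -> (preferred replica, weight)
table = {"foo": (0, 2.0), "bar": (1, 0.5), "baz": (0, 1.0)}
print(route_query(["foo", "bar"], table))      # replica 0 wins, 2.0 vs 0.5
```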
    Beating Approximation Factor 2 for Minimum k-way Cut in Planar and Minor-free Graphs
    Alireza Farhadi
    MohammadTaghi Hajiaghayi
    Proceedings of the Thirtieth Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), SIAM (2019), pp. 1055-1068
    The k-cut problem asks, given a connected graph G and a positive integer k, to find a minimum-weight set of edges whose removal splits G into k connected components. We give the first polynomial-time algorithm with approximation factor 2−ε (for some constant ε>0) for the k-cut problem in planar and minor-free graphs. Applying more complex techniques, we further improve our method and give a polynomial-time approximation scheme for the k-cut problem in both planar and minor-free graphs. Despite persistent effort, to the best of our knowledge, this is the first improvement over the standard approximation factor of 2 for the k-cut problem in any major class of graphs.
    Balanced partitioning is often a crucial first step in solving large-scale graph optimization problems, e.g., in some cases, a big graph can be chopped into pieces that fit on one machine to be processed independently before stitching the results together, leading to certain suboptimality from the interaction among different pieces. In other cases, links between different parts may show up in the running time and/or network communications cost, hence the desire to have small cut size. We study a distributed balanced-partitioning problem where the goal is to partition the vertices of a given graph into k pieces so as to minimize the total cut size. Our algorithm is composed of a few steps that are easily implementable in distributed computation frameworks such as MapReduce. The algorithm first embeds nodes of the graph onto a line, and then processes nodes in a distributed manner guided by the linear embedding order. We examine various ways to find the first embedding, e.g., via a hierarchical clustering or Hilbert curves. Then we apply four different techniques including local swaps, minimum cuts on the boundaries of partitions, as well as contraction and dynamic programming. As our empirical study, we compare the above techniques with each other, and also to previous work in distributed graph algorithms, e.g., a label-propagation method [UB13], FENNEL [TGRV14] and Spinner [MLS14]. We report our results both on a private map graph and several public social networks, and show that our results beat previous distributed algorithms: For instance, compared to the label-propagation algorithm [UB13], we report an improvement of 15-25% in the cut value. We also observe that our algorithms admit scalable distributed implementation for any number of partitions. Finally, we explain three applications of this work at Google.
      • Balanced partitioning is used to route multi-term queries to different replicas in the Google Search backend in a way that reduces the cache miss rates by ≈0.5%, which leads to a double-digit gain in throughput of production clusters [AAB+19].
      • Applied to the Google Maps Driving Directions, balanced partitioning minimizes the number of cross-shard queries with the goal of saving in CPU usage. This system achieves load balancing by dividing the world graph into several “shards.” Live experiments demonstrate an ≈40% drop in the number of cross-shard queries when compared to a standard geography-based method.
      • In a job scheduling problem for our data centers, we use balanced partitioning to evenly distribute the work while minimizing the amount of communication across geographically distant servers. In fact, the hierarchical nature of our solution goes well with the layering of data center servers, where certain machines are closer to each other and have faster links to one another.
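    A toy sketch of two of the stages above: cutting a given linear embedding into k balanced contiguous pieces, then one pass of the local-swaps refinement. The graph, the embedding order, and the brute-force swap loop are illustrative assumptions; the paper's versions run distributively.

```python
def partition_by_embedding(order, k):
    """Cut a linear embedding (node ids in embedded order) into k balanced
    contiguous pieces -- the starting assignment the pipeline then refines."""
    n = len(order)
    return {v: min(i * k // n, k - 1) for i, v in enumerate(order)}

def local_swap_pass(assign, edges):
    """One toy pass of local swaps: exchange a pair of nodes in different
    parts whenever that strictly reduces the cut size. Balance is preserved
    because each swap trades one node for one node."""
    def cut():
        return sum(1 for u, v in edges if assign[u] != assign[v])
    best = cut()
    nodes = list(assign)
    for u in nodes:
        for v in nodes:
            if assign[u] != assign[v]:
                assign[u], assign[v] = assign[v], assign[u]  # tentative swap
                c = cut()
                if c < best:
                    best = c                                  # keep the swap
                else:
                    assign[u], assign[v] = assign[v], assign[u]  # undo it
    return assign, best

order = [0, 1, 2, 3, 4, 5]                     # pretend linear embedding
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (0, 5)]  # a 6-cycle
print(local_swap_pass(partition_by_embedding(order, 2), edges))
```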
    In modern machine learning tasks, the presence of categorical features with extremely large vocabularies is a reality. This becomes a bottleneck when using an ML model, which generally grows at least linearly with the vocabulary size, affecting the memory, training and inference costs, as well as overfitting risk. In this work, we seek to compress the vocabulary by maximizing the mutual information between the compressed categorical feature and the target binary labels. We note the relationship of this problem to that of quantization in a discrete memoryless channel, where there exists a quadratic-time algorithm to solve the problem. Unfortunately, such an algorithm does not scale to data sets with massive vocabularies and, in this paper, we develop a distributed quasi-linear O(n log n) algorithm with provable approximation guarantees. We first observe that although entropy is a submodular function, this is not the case for mutual information between a categorical feature and label. To tackle this problem, we define a set function over a different space, which still contains the optimal solution, and prove this function is submodular. We also provide a query oracle to the submodular function that runs in amortized logarithmic time, and is easy to compute in a distributed fashion. Combining these results with a greedy algorithm allows us to achieve a (1-1/e)-approximation in quasi-linear time. Finally, we compare our proposed algorithm to several existing approaches using the large-scale Criteo learning task and demonstrate better performance in retaining mutual information and also verify the learning performance of the compressed vocabulary.
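    The final step above is the classic greedy algorithm for monotone submodular maximization, which yields the (1-1/e) bound. A lazy-greedy sketch over a generic value oracle follows; the toy coverage oracle is a stand-in assumption for the paper's amortized-logarithmic mutual-information oracle.

```python
import heapq

def lazy_greedy(ground, f, k):
    """Classic lazy greedy for monotone submodular maximization under a
    cardinality constraint -- the algorithm behind the (1 - 1/e) guarantee.
    `f` is the value oracle over subsets of `ground`."""
    chosen, value = [], 0.0
    heap = [(-f({e}), e) for e in ground]      # upper bounds on marginal gains
    heapq.heapify(heap)
    while heap and len(chosen) < k:
        _, e = heapq.heappop(heap)
        gain = f(set(chosen) | {e}) - value    # re-evaluate the stale bound
        if not heap or gain >= -heap[0][0]:    # still the best: take it
            chosen.append(e)
            value += gain
        else:
            heapq.heappush(heap, (-gain, e))   # otherwise push back, updated
    return chosen

# toy value oracle: coverage of a tiny set system (illustrative only)
sets = {"a": {1, 2}, "b": {2, 3, 4}, "c": {4, 5}}
cover = lambda S: float(len(set().union(*(sets[x] for x in S)))) if S else 0.0
print(lazy_greedy(list(sets), cover, 2))       # e.g. ['b', 'a']
```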
    Coresets Meet EDCS: Algorithms for Matching and Vertex Cover on Massive Graphs
    Aaron Bernstein
    Cliff Stein
    Sepehr Assadi
    Proceedings of the Thirtieth Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), SIAM (2019), pp. 1616-1635
    Maximum matching and minimum vertex cover are among the most fundamental graph optimization problems. Recently, randomized composable coresets were introduced as an effective technique for solving these problems in various models of computation on massive graphs. In this technique, one partitions the edges of an input graph randomly into multiple pieces, compresses each piece into a smaller subgraph, namely a coreset, and solves the problem on the union of these coresets to find the final solution. By designing small-size randomized composable coresets, one can obtain efficient algorithms, in a black-box way, in multiple computational models including streaming, distributed communication, and the massively parallel computation (MPC) model. We develop randomized composable coresets of size Õ(n) that, for any constant ε > 0, give a (3/2 + ε)-approximation to matching and a (3 + ε)-approximation to vertex cover. Our coresets improve upon the previously best approximation ratio of O(1) for matching and O(log n) for vertex cover. Most notably, our result for matching goes beyond a 2-approximation, which is a natural barrier for maximum matching in many models of computation. Our coresets lead to improved algorithms for the simultaneous communication model with randomly partitioned input, the streaming model when the input arrives in a random order, and the MPC model with Õ(n√n) memory per machine and only two MPC rounds. Furthermore, inspired by the recent work of Czumaj et al. (arXiv 2017), we study algorithms for matching and vertex cover in the MPC model with only Õ(n) memory per machine. Building on our coreset constructions, we develop parallel algorithms that give an O(1)-approximation to both matching and vertex cover in only O(log log n) MPC rounds and Õ(n) memory per machine. We further improve the approximation ratio of our matching algorithm to (1 + ε) for any constant ε > 0. Our results settle multiple open questions posed by Czumaj et al. A key technical ingredient of our paper is a novel application of edge degree constrained subgraphs (EDCS) that were previously introduced in the context of maintaining matchings in dynamic graphs. At the heart of our proofs are new structural properties of EDCS that identify these subgraphs as sparse certificates for large matchings and small vertex covers which are quite robust to sampling and composition.
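    A minimal sketch of the randomized composable coreset pattern the abstract describes: random edge partition, per-piece compression, then solving on the union. Greedy maximal matching is used here as a stand-in compressor; the paper's coresets are EDCS subgraphs with much stronger guarantees.

```python
import random

def greedy_matching(edges):
    """Maximal matching in one greedy pass (a 2-approximation)."""
    matched, M = set(), []
    for u, v in edges:
        if u not in matched and v not in matched:
            M.append((u, v))
            matched.update((u, v))
    return M

def coreset_matching(edges, pieces=4, seed=0):
    """Randomized composable coreset pattern: randomly partition the edges,
    compress each piece (per machine), solve on the union of compressions."""
    random.seed(seed)
    parts = [[] for _ in range(pieces)]
    for e in edges:
        parts[random.randrange(pieces)].append(e)        # random partition
    union = [e for p in parts for e in greedy_matching(p)]  # per-piece coresets
    return greedy_matching(union)                        # final solve on union

edges = [(i, i + 1) for i in range(0, 100, 2)]           # 50 disjoint edges
print(len(coreset_matching(edges)))                      # recovers all 50
```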
    Optimal Distributed Submodular Optimization via Sketching
    Hossein Esfandiari
    Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (2018), pp. 1138-1147
    As an important special case of submodular optimization, coverage problems are central in optimization with a wide range of applications in data mining and machine learning. As we need to handle larger and larger data sets, there is a clear need to develop distributed solutions to these problems. While several results have been developed for distributed coverage maximization, all the existing methods have notable limitations, e.g., they achieve either suboptimal approximation guarantees or suboptimal space and memory complexities. Moreover, most previous results for submodular maximization either explicitly or implicitly assume that one has value-oracle access to the submodular function. Such a value oracle for coverage functions has the following form: given a subfamily of (input) subsets, determine the size of the union of the subsets in this subfamily.
    Fast Algorithms for Knapsack via Convolution and Prediction
    MohammadTaghi Hajiaghayi
    Saeed Seddighin
    Cliff Stein
    Proceedings of the 50th Annual ACM Symposium on the Theory of Computing (STOC) (2018), pp. 1269-1282
    The knapsack problem is a fundamental problem in combinatorial optimization. It has been studied extensively from theoretical as well as practical perspectives, as it is one of the most well-known NP-hard problems. The goal is to pack a knapsack of size t with the maximum value from a collection of n items with given sizes and values. Recent evidence suggests that the classic O(nt) dynamic-programming solution for the knapsack problem might be the fastest in the worst case. In fact, solving the knapsack problem was shown to be equivalent to the (min,+) convolution problem (Cygan et al., ICALP 2017), which is thought to be facing a quadratic-time barrier. This hardness is in contrast to the more famous (+,·) convolution (generally known as polynomial multiplication), which has an O(n log n)-time solution via the Fast Fourier Transform. Our main results are algorithms with near-linear running times for the knapsack problem when either the values or the sizes of items are small integers. More specifically, if item sizes are integers bounded by s_max, the running time of our algorithm is Õ((n + t)·s_max). If the item values are integers bounded by v_max, our algorithm runs in time Õ(n + t·v_max). The best previously known running times were O(nt), O(n²·s_max), and O(n·s_max·v_max) (Pisinger, J. of Alg., 1999). At the core of our algorithms lies the prediction technique: roughly speaking, this new technique enables us to compute the convolution of two vectors in time O(n·e_max) when an approximation of the solution within an additive error of e_max is available. Our results also have implications for algorithms for several other problems, including tree sparsity, tree separability, and the unbounded knapsack problem, in the case when some of the relevant numerical input values are bounded.
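    Knapsack profit-by-capacity vectors combine under (max,+) convolution, which is why the abstract ties the problem to convolution hardness. A naive quadratic sketch of that primitive follows; the paper's prediction technique speeds it up when an additive-error estimate of the answer is available.

```python
def maxplus_convolve(a, b):
    """(max,+) convolution: c[t] = max over j of a[j] + b[t - j]. Knapsack
    profit-by-capacity vectors for two item groups combine under exactly
    this operation. Naive quadratic version, for illustration only."""
    c = [float("-inf")] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            c[i + j] = max(c[i + j], ai + bj)
    return c

# index = capacity used, entry = best profit, for two hypothetical item groups
print(maxplus_convolve([0, 5, 5], [0, 0, 7]))            # [0, 5, 7, 12, 12]
```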
    A study of compact reserve pricing languages
    Hossein Esfandiari
    Saeed Seddighin
    Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence (2017), pp. 363-368
    Motivated by ad auctions, we study the multiplicative reserve price system (MRPS). MRPS is a compact way to set reserve prices on several auctions simultaneously. In this paper, we first consider the problem of finding the best way of assigning reserve prices in this system. We show that this problem is NP-hard. Next, by characterizing the properties of an optimum solution, we design an approximation algorithm for this problem. Interestingly, our experiments show that our algorithm achieves 90–98% of the optimum solution that sets the reserve price of each auction independently (i.e., the optimum set of reserve prices). We show that in the adversarial setting we lose at most a logarithmic factor compared to the optimum set of reserve prices. To show the tightness of our results in the adversarial setting, we show that there is no compact pricing system (i.e., a pricing system that uses O(n^{1−ε}) bits to set n reserve prices) that loses less than a logarithmic factor compared to the optimum set of reserve prices in the worst case. Notice that, interestingly, this hardness result holds for any pricing system and not only MRPS.
    Affinity Clustering: Hierarchical Clustering at Scale
    Soheil Behnezhad
    Mahsa Derakhshan
    MohammadTaghi Hajiaghayi
    Raimondas Kiveris
    NIPS 2017, pp. 6867-6877
    Graph clustering is a fundamental task in many data-mining and machine-learning pipelines. In particular, identifying a good hierarchical clustering structure is at the same time a fundamental and challenging problem for several applications. In many applications, the amount of data to analyze is increasing at an astonishing rate each day. Hence there is a need for new solutions to efficiently compute effective hierarchical clusterings on such huge data. In this paper, we propose algorithms to address this problem. First, we analyze minimum-spanning-tree-based clustering algorithms and their corresponding hierarchical clusterings. In particular, we consider classic single-linkage clustering based on Kruskal's algorithm and a variation of Borůvka's algorithm that we call affinity clustering, and prove new interesting properties of these clusterings via the concept of certificates. Then we present new algorithms in the MapReduce model and their efficient real-world implementations via Distributed Hash Tables (DHTs). Our MapReduce algorithms indeed improve upon the previous MapReduce algorithms for finding a minimum spanning tree in graphs as well. Finally, we show experimentally that our algorithms are scalable for huge data and competitive with state-of-the-art algorithms. In particular, we show that affinity clustering is in practice superior to several state-of-the-art clustering algorithms.
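    A sequential sketch of the Borůvka-style rounds behind affinity clustering: every cluster hooks onto its minimum-weight outgoing edge, and the touched clusters merge. The union-find bookkeeping and toy graph are illustrative assumptions; the paper runs each round in MapReduce over DHTs.

```python
def affinity_clustering(nodes, edges, rounds=1):
    """Boruvka-style affinity clustering: in each round, every cluster hooks
    onto its minimum-weight outgoing edge and the endpoints' clusters merge.
    Each additional round coarsens the clustering, yielding a hierarchy."""
    parent = {v: v for v in nodes}

    def find(v):                               # union-find root, path halving
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for _ in range(rounds):
        best = {}                              # cluster -> cheapest way out
        for w, u, v in edges:
            ru, rv = find(u), find(v)
            if ru != rv:
                for a, b in ((ru, rv), (rv, ru)):
                    if a not in best or w < best[a][0]:
                        best[a] = (w, b)
        if not best:
            break
        for a, (_, b) in best.items():         # merge along the chosen edges
            parent[find(a)] = find(b)
    return {v: find(v) for v in nodes}

edges = [(1, 0, 1), (1, 1, 2), (9, 2, 3), (1, 3, 4)]     # (weight, u, v)
print(affinity_clustering(range(5), edges))   # two clusters: {0,1,2}, {3,4}
```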
    Almost Optimal Streaming Algorithms for Coverage Problems
    Hossein Esfandiari
    29th ACM Symposium on Parallelism in Algorithms and Architectures (2017)
    Maximum coverage and minimum set cover problems -- collectively called coverage problems -- have been studied extensively in streaming models. However, previous research not only achieves sub-optimal approximation factors and space complexities, but also studies a restricted set-arrival model that makes an explicit or implicit assumption of oracle access to the sets, ignoring the complexity of reading and storing a whole set at once. In this paper, we address the above shortcomings and present algorithms with improved approximation factors and improved space complexity, and prove that our results are almost tight. Moreover, unlike most previous work, our results hold in a more general edge-arrival model. More specifically, we present (almost) optimal approximation algorithms for maximum coverage and minimum set cover in the streaming model with an (almost) optimal space complexity of Õ(n); i.e., the space is independent of the size of the sets or the size of the ground set of elements. These results not only improve over the best known algorithms for the set-arrival model, but are also the first such algorithms for the more powerful edge-arrival model. In order to achieve the above results, we introduce a new general sketching technique for coverage functions: this sketching scheme can be applied to convert an α-approximation algorithm for a coverage problem into a (1−ε)α-approximation algorithm for the same problem in streaming or RAM models. We show the significance of our sketching technique by ruling out the possibility of solving coverage problems via accessing (as a black box) a (1±ε)-approximate oracle (e.g., a sketch function) that estimates the coverage function on any subfamily of the sets.
    A PTAS for Planar Group Steiner Tree via Bootstrapping Approximation
    Daniel Marx
    Erik Demaine
    MohammadTaghi Hajiaghayi
    Proceedings of the 48th Annual ACM SIGACT Symposium on Theory of Computing, STOC 2016, ACM, pp. 570-583
    We present the first polynomial-time approximation scheme (PTAS), i.e., a (1 + ε)-approximation algorithm for any constant ε > 0, for the planar group Steiner tree problem (in which each group lies on the boundary of a face). This result improves on the best previous approximation factor of O(log n (log log n)^{O(1)}). We achieve this result via a novel and powerful technique called bootstrapping approximation, which allows one to bootstrap from a superconstant approximation factor (even superpolynomial in the input size) all the way down to a PTAS. This is in contrast with the popular existing approach for planar PTASs of constructing light-weight spanners in one iteration, which notably requires a constant-factor approximate solution to start from. Bootstrapping approximation removes the main barrier for designing PTASs for problems which have no known constant-factor approximation (even on planar graphs), and thus can be used to obtain PTASs for several difficult-to-approximate problems. Our second major contribution, required for the planar group Steiner tree PTAS, is a spanner construction, which reduces the graph to have total weight within a factor of the optimal solution while approximately preserving the optimal solution. This is particularly challenging because group Steiner tree requires deciding which terminal in each group to connect by the tree, making it much harder than recent previous approaches to construct spanners for planar TSP by Klein (FOCS'05 & SICOMP'08), subset TSP by Klein (STOC'06), Steiner tree by Borradaile, Klein, and Mathieu (SODA'07 & TALG'09), and Steiner forest by Bateni, Hajiaghayi, and Marx (STOC'10 & JACM'11) (and its improvement to an efficient PTAS by Eisenstat, Klein, and Mathieu (SODA'12)). The spanner construction algorithm first constructs a “pre-spanner” in several steps. We divide the groups in a solution into minimal and non-minimal groups according to whether the optimal solution includes one or multiple vertices of the group. Next we make sure every vertex in a minimal group reached by the optimal solution is in the pre-spanner. By adding more to the pre-spanner, we guarantee that the vertices of (non-minimal) groups reached by the optimal solution but not in the spanner, called tips, exist for only a constant number of non-minimal groups. Next we make sure each such tip of a non-minimal group is first weakly and then, via yet another involved step, strongly isolated. We are then able to handle strongly isolated tips of one group via another structural result about optimum solutions. Finally, we invoke the known spanner construction for Steiner tree as a black box on top of our already-constructed pre-spanner to construct an actual spanner. We iterate all the above a polynomial number of times via the bootstrapping approximation technique to obtain the desired PTAS for planar group Steiner tree. Our PTAS for planar group Steiner tree implies the first PTAS for geometric Euclidean group Steiner tree with obstacles, as well as a (2 + ε)-approximation algorithm for group TSP with obstacles, improving over the best previous constant-factor approximation algorithms. By contrast, we show that planar group Steiner forest, a slight generalization of planar group Steiner tree, is APX-hard on planar graphs of treewidth 3, even if the groups are pairwise disjoint and every group is a vertex or an edge.
    Coverage problems are central in optimization and have a wide range of applications in data mining and machine learning. While several distributed algorithms have been developed for coverage problems, the existing methods suffer from several limitations, e.g., they all achieve either suboptimal approximation guarantees or suboptimal space and memory complexities. In addition, previous algorithms developed for submodular maximization assume oracle access, and ignore the computational complexity of communicating large subsets or computing the size of the union of subsets in a subfamily. In this paper, we develop an improved distributed algorithm for the k-cover and the set cover with λ outliers problems, with almost optimal approximation guarantees, almost optimal memory complexity, and linear communication complexity, running in only four rounds of computation. Finally, we perform an extensive empirical study of our algorithms on a number of publicly available real data sets, and show that using sketches 30 to 600 times smaller than the input, one can solve the coverage maximization problem with quality very close to that of the state-of-the-art single-machine algorithm.
    Distributed Balanced Partitioning via Linear Embedding
    Ninth ACM International Conference on Web Search and Data Mining (WSDM), ACM (2016), pp. 387-396
    Balanced partitioning is often a crucial first step in solving large-scale graph optimization problems: in some cases, a big graph is chopped into pieces that fit on one machine to be processed independently before stitching the results together, leading to certain suboptimality from the interaction among different pieces. In other cases, links between different parts may show up in the running time and/or network communications cost, hence the desire to have small cut size. We study a distributed balanced partitioning problem where the goal is to partition the vertices of a given graph into k pieces, minimizing the total cut size. Our algorithm is composed of a few steps that are easily implementable in distributed computation frameworks, e.g., MapReduce. The algorithm first embeds nodes of the graph onto a line, and then processes nodes in a distributed manner guided by the linear embedding order. We examine various ways to find the first embedding, e.g., via a hierarchical clustering or Hilbert curves. Then we apply four different techniques including local swaps, minimum cuts on partition boundaries, as well as contraction and dynamic programming. Our empirical study compares the above techniques with each other, and to previous work in distributed algorithms, e.g., a label propagation method [34], FENNEL [32] and Spinner [23]. We report our results both on a private map graph and several public social networks, and show that our results beat previous distributed algorithms: we notice, e.g., a 15-25% reduction in cut size over [34]. We also observe that our algorithms allow for scalable distributed implementation for any number of partitions. Finally, we apply our techniques to the Google Maps Driving Directions to minimize the number of multi-shard queries with the goal of saving in CPU usage. During live experiments, we observe an ≈40% drop in the number of multi-shard queries when comparing our method with a standard geography-based method.
    We consider the setting where a seller must allocate a collection of goods to budgeted buyers, as exemplified by online advertising systems where platforms decide which impressions to serve to various advertisers. Such resource allocation problems are challenging for two reasons: (a) the seller must strike a balance between optimizing her own revenues and guaranteeing fairness to her (repeat) buyers and (b) the problem is inherently dynamic due to the uncertain, time-varying supply of goods available with the seller. We propose a stochastic approximation scheme akin to a dynamic market equilibrium. Our scheme relies on frequent re-solves of an Eisenberg-Gale convex program, and does not require the seller to have any knowledge about how goods arrival processes evolve over time. The scheme fully extracts buyer budgets (thus maximizing seller revenues), while at the same time provides a 0.47 approximation of the proportionally fair allocation of goods achievable in the offline case, as long as the supply of goods comes from a wide family of (possibly non-stationary) Gaussian processes. We then extend our results to a more general family of metrics called α-fairness. Finally, we deal with a multi-objective problem where the seller is concerned with both the proportional fairness and efficiency of the allocation, and propose a hybrid algorithm which achieves a 0.27 bi-criteria guarantee against fairness and efficiency.
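    A minimal sketch of one re-solve of the Eisenberg-Gale convex program the scheme relies on, assuming linear buyer utilities and using cvxpy (a solver choice of this sketch, not prescribed by the paper). The budgets, values, and supply are made-up numbers.

```python
import cvxpy as cp
import numpy as np

# Eisenberg-Gale program: maximize budget-weighted log utilities subject to
# supply; the paper's scheme re-solves such a program as new supply arrives.
budgets = np.array([2.0, 1.0])                 # hypothetical buyer budgets
values = np.array([[1.0, 0.2],                 # values[i, j]: buyer i's value
                   [0.3, 1.0]])                # for one unit of good j
supply = np.array([1.0, 1.0])                  # current supply of each good

x = cp.Variable((2, 2), nonneg=True)           # x[i, j]: good j given to buyer i
utility = cp.sum(cp.multiply(values, x), axis=1)
objective = cp.Maximize(cp.sum(cp.multiply(budgets, cp.log(utility))))
problem = cp.Problem(objective, [cp.sum(x, axis=0) <= supply])
problem.solve()
print(np.round(x.value, 3))                    # market-equilibrium allocation
```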
    Revenue Maximization for Selling Multiple Correlated Items
    Sina Dehghani
    MohammadTaghi Hajiaghayi
    Saeed Seddighin
    23rd Annual European Symposium on Algorithms (ESA), Springer-Verlag (2015)
    We study the problem of selling n items to a single buyer with an additive valuation function. We consider the valuations of the items to be correlated, i.e., the desirabilities of the buyer for the items are not drawn independently. Ideally, the goal is to design a mechanism that maximizes the revenue. However, it has been shown that a revenue-optimal mechanism might be very complicated and, as a result, inapplicable to real-world auctions. Therefore, our focus is on designing a simple mechanism that achieves a constant fraction of the optimal revenue. Babaioff et al. (FOCS'14) propose a simple mechanism that achieves a constant fraction of the optimal revenue for the independent setting with a single additive buyer. However, they leave the following problem as an open question: "Is there a simple, approximately optimal mechanism for a single additive buyer whose value for n items is sampled from a common base-value distribution?" Babaioff et al. show that a constant fraction of the optimal revenue can be achieved by either selling the items separately or as a whole bundle in the independent setting. We show a similar result for the correlated setting when the desirabilities of the buyer are drawn from a common base-value distribution. It is worth mentioning that the core decomposition lemma, which is the heart of the proofs of efficiency for such mechanisms, does not hold for correlated settings. Therefore we propose a modified version of this lemma which is applicable to correlated settings as well. Although we apply this technique to show the proposed mechanism can guarantee a constant fraction of the optimal revenue under a very weak correlation, this method alone cannot directly show the efficiency of the mechanism under stronger correlations. Therefore, via a combinatorial approach, we reduce the problem to an auction with a weak correlation to which the core decomposition technique is applicable. In addition, we introduce a generalized model of correlation for items and show the proposed mechanism achieves an O(log k) approximation factor of the optimal revenue in that setting.
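    A sketch of the simple mechanism family in question: take the better of selling the items separately (SRev) and selling the grand bundle (BRev), here evaluated empirically on sampled valuations. The base-value sampling model and posted-price rule below are illustrative assumptions.

```python
import numpy as np

def best_posted_price_revenue(samples):
    """Best posted price against empirical valuations: try each observed
    value as the price and keep the revenue-maximizing one."""
    s = np.sort(samples)
    n = len(s)
    return float((s * (n - np.arange(n)) / n).max())  # price * Pr[value >= price]

def srev_vs_brev(valuations):
    """The simple mechanism: the better of selling each item separately
    (SRev) and selling the grand bundle at one price (BRev)."""
    srev = sum(best_posted_price_revenue(valuations[:, j])
               for j in range(valuations.shape[1]))
    brev = best_posted_price_revenue(valuations.sum(axis=1))
    return max(srev, brev), srev, brev

# toy common base-value correlation: value = shared base + idiosyncratic part
rng = np.random.default_rng(1)
base = rng.exponential(1.0, size=(10000, 1))
vals = base + rng.exponential(0.5, size=(10000, 3))
print(srev_vs_brev(vals))
```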
    Network Cournot Competition
    Melika Abolhasani
    MohammadTaghi Hajiaghayi
    Hamid Mahini
    Anshul Sawant
    WINE, The 10th Conference on Web and Internet Economics (2014)
    A major challenge faced by marketers attempting to optimize their advertising campaigns is dealing with budget constraints. The problem is even harder in the face of multidimensional budget constraints, particularly given the many decision variables involved and the interplay among the decision variables through such constraints. Concise bidding strategies help advertisers deal with this challenge by introducing fewer variables to act on. In this paper, we study the problem of finding optimal concise bidding strategies for advertising campaigns with multiple budget constraints. Given bid landscapes, i.e., the predicted value (e.g., number of clicks) and cost per click for any bid, that are typically provided by ad-serving systems, we optimize the value given the budget constraints. In particular, we consider bidding strategies that consist of no more than k different bids for all keywords. For constant k, we provide a PTAS to optimize the profit, whereas for arbitrary k we show how a constant-factor approximation can be obtained via a combination of solution enumeration and dependent LP-rounding techniques. Finally, we evaluate the performance of our algorithms on real datasets in two regimes with 1- and 3-dimensional budget constraints. In the former case, where uniform bidding has a provable performance guarantee, our algorithm beats the state of the art by an increase of 1% to 6% in the expected number of clicks. This is achieved by only two or three clusters, in contrast with the single cluster permitted in uniform bidding. With only three dimensions in the budget constraint (one for total consumption, and another two for enforcing minimal diversity), the gap between the performance of our algorithm and an enhanced version of uniform bidding grows to an average of 5% to 6% (sometimes as high as 9%). Although the details of the experiments for the multidimensional budget constraint are deferred to the full version of the paper, we report some highlights from the results.
    Secretary Problems and Online Auctions
    Encyclopedia of Algorithms, Springer (2014), pp. 1-4
    Distributed Balanced Clustering via Mapping Coresets
    Aditya Bhaskara
    NIPS, Neural Information Processing Systems Foundation (2014)
    Large-scale clustering of data points in metric spaces is an important problem in mining big data sets. For many applications, we face explicit or implicit size constraints for each cluster, which leads to the problem of clustering under capacity constraints, or the "balanced clustering" problem. Although the balanced clustering problem has been widely studied, developing a theoretically sound distributed algorithm remains an open problem. In the present paper we develop a general framework based on "mapping coresets" to tackle this issue. For a wide range of clustering objective functions such as k-center, k-median, and k-means, our techniques give distributed algorithms for balanced clustering that match the best known single-machine approximation ratios.
    Multiplicative Bidding in Online Advertising
    Sam Chiu-wai Wong
    ACM Conference on Economics and Computation (EC) (2014)
    In this paper, we initiate the study of the multiplicative bidding language adopted by major Internet search companies. In multiplicative bidding, the effective bid on a particular search auction is the product of a base bid and bid adjustments that are dependent on features of the search (for example, the geographic location of the user, or the platform on which the search is conducted). We consider the task faced by the advertiser when setting these bid adjustments, and establish a foundational optimization problem that captures the core difficulty of bidding under this language. We give matching algorithmic and approximation-hardness results for this problem; these results are against an information-theoretic bound, and thus have implications for the power of the multiplicative bidding language itself. Inspired by empirical studies of search engine price data, we then codify the relevant restrictions of the problem, and give further algorithmic and hardness results. Our main technical contribution is an O(log n)-approximation for the case of multiplicative prices and monotone values. We also provide empirical validations of our problem restrictions, and test our algorithms on real data against natural benchmarks. Our experiments show that they compare favorably with the baseline.
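    The bidding language itself is easy to state in code: the effective bid is the base bid times the product of per-feature adjustments. A tiny sketch with hypothetical feature names and numbers:

```python
from functools import reduce

def effective_bid(base_bid, adjustments, features):
    """Multiplicative bidding: effective bid = base bid times the product of
    the adjustments for the features present in this auction."""
    return base_bid * reduce(lambda p, f: p * adjustments.get(f, 1.0),
                             features, 1.0)

adjustments = {"geo=US": 1.3, "device=mobile": 0.8}      # hypothetical values
print(effective_bid(1.0, adjustments, ["geo=US", "device=mobile"]))  # ~1.04
```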
    Submodular secretary problems with extensions
    MohammadTaghi Hajiaghayi
    ACM Transactions on Algorithms, vol. 9 (4) (2013)
    Online auctions are the essence of many modern markets, particularly networked markets, in which information about goods, agents, and outcomes is revealed over a period of time, and the agents must make irrevocable decisions without knowing future information. Optimal stopping theory, especially the classic secretary problem, is a powerful tool for analyzing such online scenarios, which generally require optimizing an objective function over the input. The secretary problem and its generalization, the multiple-choice secretary problem, have been studied thoroughly in the literature. In this article, we consider a very general setting of the latter problem called the submodular secretary problem, in which the goal is to select k secretaries so as to maximize the expectation of a (not necessarily monotone) submodular function which defines the efficiency of the selected secretarial group based on their overlapping skills. We present the first constant-competitive algorithm for this case. In a more general setting, in which the selected secretaries should also form an independent (feasible) set in each of ℓ given matroids, we obtain an O(ℓ log² r)-competitive algorithm generalizing several previous results, where r is the maximum rank of the matroids. Another generalization is to consider ℓ knapsack constraints (i.e., a knapsack constraint assigns a nonnegative cost to each secretary, and requires that the total cost of all the secretaries employed be no more than a budget value) instead of the matroid constraints, for which we present an O(ℓ)-competitive algorithm. In sharp contrast, we show that for the more general setting of the subadditive secretary problem, there is no õ(√n)-competitive algorithm, and thus submodular functions are the most general functions to consider for constant-competitiveness in our setting. We complement this result by giving a matching O(√n)-competitive algorithm for the subadditive case. At the end, we consider some special cases of our general setting as well.
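    For reference, the classic stopping rule that the submodular variant generalizes: observe a 1/e fraction of candidates, then accept the first one beating everything seen so far. A self-contained sketch with a simulated success-rate check (the data is synthetic; 1/e is the known optimal success probability):

```python
import math
import random

def classic_secretary(values):
    """The 1/e rule: observe the first n/e candidates, then accept the first
    candidate who beats everything seen so far (hiring is irrevocable)."""
    n = len(values)
    cutoff = max(1, int(n / math.e))
    best_seen = max(values[:cutoff])
    for v in values[cutoff:]:
        if v > best_seen:
            return v
    return values[-1]                          # forced to take the last one

random.seed(42)
trials = [random.sample(range(10**6), 100) for _ in range(2000)]
hits = sum(classic_secretary(t) == max(t) for t in trials)
print(hits / len(trials))                      # about 0.37, i.e. roughly 1/e
```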
    We consider two natural generalizations of the Asymmetric Traveling Salesman problem: the k-Stroll and the k-Tour problems. The input to the k-Stroll problem is a directed n-vertex graph with nonnegative edge lengths, an integer k, as well as two special vertices s and t. The goal is to find a minimum-length s-t walk containing at least k distinct vertices (including the endpoints s, t). The k-Tour problem can be viewed as a special case of k-Stroll, where s = t. That is, the walk is required to be a tour, containing some pre-specified vertex s. When k = n, the k-Stroll problem becomes equivalent to Asymmetric Traveling Salesman Path, and k-Tour to Asymmetric Traveling Salesman. Our main result is a polylogarithmic approximation algorithm for the k-Stroll problem. Prior to our work, only bicriteria (O(log² k), 3)-approximation algorithms were known, producing walks whose length is bounded by 3·OPT, while the number of vertices visited is Ω(k/log² k). We also show a simple O(log² n / log log n)-approximation algorithm for the k-Tour problem. The best previously known approximation algorithms achieved a min(O(log³ k), O(log² n · log k / log log n)) approximation in polynomial time, and an O(log² k) approximation in quasipolynomial time.
    Moss and Rabani [12] study constrained node-weighted Steiner tree problems with two independent weight values associated with each node, namely, cost and prize (or penalty). They give an O(log n)-approximation algorithm for the prize-collecting node-weighted Steiner tree problem (PCST).
    Revenue Maximization with Nonexcludable Goods
    Nima Haghpanah
    Internet and Network Economics - 9th International Workshop, WINE 2013, Springer
    A polynomial-time approximation scheme for planar multiway cut
    MohammadTaghi Hajiaghayi
    Philip Klein
    Claire Mathieu
    Proceedings of the 23rd Annual ACM-SIAM Symposium on Discrete Algorithms (SODA) (2012)
    Euclidean Prize-Collecting Steiner Forest
    MohammadTaghi Hajiaghayi
    Algorithmica, vol. 62 (2012), pp. 906-929
    Scheduling to Minimize Staleness and Stretch in Real-Time Data Warehouses
    Lukasz Golab
    MohammadTaghi Hajiaghayi
    Howard Karloff
    Theory of Computing Systems, vol. 49(4) (2011), pp. 757-780
    Prize-collecting Steiner Problems on Planar Graphs
    Chandra Chekuri
    Alina Ene
    MohammadTaghi Hajiaghayi
    Daniel Marx
    Proceedings of the 22nd Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), SIAM, Philadelphia, PA (2011), pp. 1028-1049
    Approximation Schemes for Steiner Forest on Planar Graphs and Graphs of Bounded Treewidth
    MohammadTaghi Hajiaghayi
    Daniel Marx
    Journal of the ACM, vol. 58(5) (2011), pp. 21
    Improved Approximation Algorithms for Prize-Collecting Steiner Tree and TSP
    MohammadTaghi Hajiaghayi
    Howard Karloff
    SIAM Journal on Computing, vol. 40(2) (2011), pp. 309-332
    Towards an efficient algorithmic framework for pricing cellular data service
    MohammadTaghi Hajiaghayi
    Sina Jafarpour
    Dan Pei
    Proceedings of the 30th Annual Conference of the IEEE Communications Society (INFOCOM), IEEE Computer Society, Los Alamitos, CA (2011), pp. 581-585