Edith Cohen
I have been a Research Scientist at Google (Mountain View, CA) since 2015. I am also a visiting full professor at the School of Computer Science at Tel Aviv University in Israel. I attended Tel Aviv University for my undergraduate studies in math, physics, and computer science, graduating in 1985, and continued to an M.Sc. in computer science in 1986, supervised by Michael Tarsi. I then headed to Stanford, CA, where I earned a Ph.D. in Computer Science, advised by Andrew Goldberg and Nimrod Megiddo, graduating in 1991.
From 1991 to 2012, I was a member of the research staff, first at the legendary AT&T Bell Laboratories and, after the 1997 split, at AT&T Labs Research. In 1997 I also visited the Computer Science Division at UC Berkeley. From 2012 to 2014 I was a principal researcher at Microsoft Research (Silicon Valley).
My research interests include algorithm design, data mining, machine learning, optimization, networks, and more. I prefer a principled approach to modeling and design, and I am always looking to learn and expand my horizons. Over the span of my research career, I have developed models and scalable algorithms in a wide range of areas, including query processing and optimization, data summarization, content search and delivery, caching, prefetching, routing, streaming and distributed computation, fundamental graph algorithms, and scalable mining of massive graphs. My work enables leveraging much larger data sets than was previously possible. More details on my work can be found on my personal web page.
Authored Publications
Abstract:
Classical streaming algorithms operate under the (not always reasonable) assumption that the input stream is fixed in advance. Recently, there has been growing interest in designing robust streaming algorithms that provide provable guarantees even when the input stream is chosen adaptively as the execution progresses. We propose a new framework for robust streaming that combines techniques from two recently suggested frameworks, by Hassidim et al. [NeurIPS 2020] and by Woodruff and Zhou [FOCS 2021]. These frameworks rely on very different ideas, each with its own strengths and weaknesses. We combine the two into a single hybrid framework that obtains the "best of both worlds", thereby resolving a question left open by Woodruff and Zhou.
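For intuition, here is a minimal Python sketch (all names are illustrative) of the simpler independent-copies device that underlies this line of work, not the paper's hybrid framework: each adaptive query is answered from a copy whose randomness the adversary has never observed, so past answers cannot bias it.

```python
import numpy as np

class AMSF2:
    """Tiny AMS sketch for the second frequency moment F2 = sum_i f_i^2."""
    def __init__(self, n, reps=64, seed=None):
        rng = np.random.default_rng(seed)
        self.signs = rng.choice([-1.0, 1.0], size=(reps, n))
        self.z = np.zeros(reps)

    def update(self, i, delta=1.0):
        self.z += delta * self.signs[:, i]

    def estimate(self):
        return float(np.mean(self.z ** 2))

def robust_answers(n, stream, query_times):
    """Answer each adaptive query from a fresh, so-far-unused sketch copy:
    the stream may depend on past answers, but never on the randomness of
    the copy that produces the next answer."""
    copies = [AMSF2(n, seed=s) for s in range(len(query_times))]
    answers, qi = [], 0
    for t, item in enumerate(stream):
        for c in copies[qi:]:          # retired copies stop updating
            c.update(item)
        while qi < len(query_times) and query_times[qi] == t:
            answers.append(copies[qi].estimate())
            qi += 1                    # retire this copy after one answer
    return answers
```

The frameworks discussed above are precisely about answering far more adaptive queries than the wasteful one-copy-per-query budget used in this toy.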
Abstract:
CountSketch and Feature Hashing (the "hashing trick") are popular randomized dimensionality reduction methods that support recovery of $\ell_2$-heavy hitters (keys $i$ where $v_i^2 > \epsilon \|\boldsymbol{v}\|_2^2$) and approximate inner products. When the inputs are {\em not adaptive} (do not depend on prior outputs), classic estimators applied to a sketch of size $O(\ell/\epsilon)$ are accurate for a number of queries that is exponential in $\ell$. When inputs are adaptive, however, an adversarial input can be constructed after $O(\ell)$ queries with the classic estimator, and the best known robust estimator only supports $\tilde{O}(\ell^2)$ queries. In this work we show that this quadratic dependence is in a sense inherent: we design an attack that after $O(\ell^2)$ queries produces an adversarial input vector whose sketch is highly biased. Our attack uses "natural" non-adaptive inputs (only the final adversarial input is chosen adaptively) and applies universally with any correct estimator, including one that is unknown to the attacker. This exposes an inherent vulnerability of this fundamental method.
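To convey the flavor of such attacks, here is a toy bias-accumulation attack against a single CountSketch row with a non-median estimator (illustrative only; the paper's attack is more involved and defeats any correct estimator for the full sketch):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in: a single CountSketch row (a full sketch medians many rows).
n, b = 1000, 32
h = rng.integers(0, b, size=n)        # bucket hash per coordinate
s = rng.choice([-1.0, 1.0], size=n)   # sign hash per coordinate

def estimate(v, i):
    """Single-row CountSketch estimate of v[i]: s_i * (signed bucket sum)."""
    buckets = np.zeros(b)
    np.add.at(buckets, h, s * v)
    return s[i] * buckets[h[i]]

# Bias-accumulation attack: query random vectors and keep those whose
# estimate of coordinate 0 overshoots its true value. Each kept vector
# correlates positively with the row's hidden collision noise, so their sum
# is an adversarial input whose estimate at coordinate 0 is far too large.
target, adv = 0, np.zeros(n)
for _ in range(2000):
    v = rng.standard_normal(n)
    if estimate(v, target) > v[target]:
        adv += v

print("true value:", adv[target], "  estimate:", estimate(adv, target))
```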
Abstract:
The problem of learning threshold functions is a fundamental one in machine learning. Classical learning theory implies sample complexity of $O(\xi^{-1} \log(1/\beta))$ (for generalization error $\xi$ with confidence $1-\beta$). The private version of the problem, however, is more challenging, and in particular, the sample complexity must depend on the size $|X|$ of the domain. Progress on quantifying this dependence, via lower and upper bounds, was made in a line of works over the past decade. In this paper, we finally close the gap for approximate-DP and provide a nearly tight upper bound of $\widetilde{O}(\log^* |X|)$, which matches a lower bound by Alon et al. (which applies even with improper learning) and improves over a prior upper bound of $\widetilde{O}((\log^* |X|)^{1.5})$ by Kaplan et al. We also provide matching upper and lower bounds of $\tilde{\Theta}(2^{\log^*|X|})$ for the additive error of private quasi-concave optimization (a related and more general problem). Our improvement is achieved via the novel Reorder-Slice-Compute paradigm for private data analysis which we believe will have further applications.
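For intuition on how slowly the $\log^* |X|$ bound grows, here is the iterated logarithm (a standard definition, not code from the paper):

```python
import math

def log_star(x, base=2.0):
    """Iterated logarithm: how many times log can be applied before x <= 1."""
    n = 0
    while x > 1:
        x = math.log(x, base)
        n += 1
    return n

print(log_star(2.0 ** 1000))  # 5: the bound stays tiny even for huge domains
```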
Abstract:
Composition theorems are general and powerful tools that facilitate privacy accounting across multiple data accesses from per-access privacy bounds. However, they often result in weaker bounds compared with end-to-end analysis. Two popular tools that mitigate this are the exponential mechanism (or report noisy max) and the sparse vector technique. They were generalized in a couple of recent private selection/test frameworks, including the works by Liu and Talwar (STOC 2019) and by Papernot and Steinke (ICLR 2022).
In this work, we first present an alternative framework for private selection and testing with a simpler privacy proof and an equally good utility guarantee. Second, we observe that the private selection frameworks (both the previous ones and ours) can be applied to improve the accuracy/confidence trade-off for many fundamental privacy-preserving data-analysis tasks, including query releasing, top-k selection, and stable selection.
Finally, for online settings, we apply the private testing to design a mechanism for adaptive query releasing, which improves the sample complexity dependence on the confidence parameter for the celebrated private multiplicative weights algorithm of Hardt and Rothblum (FOCS 2010).
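A minimal sketch of one classical tool named above: report noisy max with Gumbel noise, which is equivalent to the exponential mechanism for private selection (a standard mechanism, not the paper's new framework):

```python
import numpy as np

def report_noisy_max(scores, eps, sensitivity=1.0, seed=None):
    """Select an approximately best candidate under eps-DP by adding Gumbel
    noise and taking the argmax; this is equivalent to sampling from the
    exponential mechanism over the given utility scores."""
    rng = np.random.default_rng(seed)
    noise = rng.gumbel(scale=2.0 * sensitivity / eps, size=len(scores))
    return int(np.argmax(np.asarray(scores, dtype=float) + noise))
```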
Abstract:
\texttt{CountSketch} is a popular dimensionality reduction technique that maps vectors to a lower dimension using a randomized set of linear measurements. The sketch has the property that the $\ell_2$-heavy hitters of a vector (entries with $v_i^2 \geq \frac{1}{k}\|\boldsymbol{v}\|^2_2$) can be recovered from its sketch. We study the robustness of the sketch in adaptive settings, such as online optimization, where input vectors may depend on the output from prior inputs. We show that the classic estimator can be attacked with a number of queries of the order of the sketch size and propose a robust estimator (for a slightly modified sketch) that allows for a quadratic number of queries. We improve robustness by a factor of $\sqrt{k}$ (for $k$ heavy hitters) over prior approaches.
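For reference, a compact (non-robust) CountSketch with the classic median-over-rows estimator described above; sizes and seeds are illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)

def countsketch(v, rows, b):
    """Sketch v into a rows-by-b table with independent (bucket, sign) hashes."""
    n = len(v)
    H = rng.integers(0, b, size=(rows, n))
    S = rng.choice([-1.0, 1.0], size=(rows, n))
    T = np.zeros((rows, b))
    for r in range(rows):
        np.add.at(T[r], H[r], S[r] * v)
    return T, H, S

def classic_estimate(T, H, S, i):
    """The classic estimator: median over rows of the signed bucket value."""
    return float(np.median([S[r, i] * T[r, H[r, i]] for r in range(T.shape[0])]))

v = np.zeros(10_000)
v[7] = 100.0                            # one heavy hitter
v += 0.1 * rng.standard_normal(10_000)  # noise tail
T, H, S = countsketch(v, rows=5, b=256)
print(v[7], classic_estimate(T, H, S, 7))
```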
Abstract:
Differentially private algorithms for common metric aggregation tasks, such as clustering or averaging, often have limited practicality due to their complexity or to the large number of data points required for accurate results.
We propose a simple and practical tool $\mathsf{FriendlyCore}$ that takes a set of points $\mathcal{D}$ from an unrestricted (pseudo) metric space as input. When $\mathcal{D}$ has effective diameter $r$, $\mathsf{FriendlyCore}$ returns a ``stable'' subset $\mathcal{C} \subseteq \mathcal{D}$ that includes all points, except possibly a few outliers, and is {\em guaranteed} to have diameter $r$. $\mathsf{FriendlyCore}$ can be used to preprocess the input before privately aggregating it, potentially simplifying the aggregation or boosting its accuracy. Surprisingly, $\mathsf{FriendlyCore}$ is light-weight, with no dependence on the dimension. We empirically demonstrate its advantages in boosting the accuracy of mean estimation and clustering tasks such as $k$-means and $k$-GMM, outperforming tailored methods.
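A toy illustration of the stable-core idea: drop points that have few close neighbors. This conveys flavor only; the actual $\mathsf{FriendlyCore}$ filter is a carefully calibrated randomized procedure with privacy guarantees, and the function below is hypothetical:

```python
import numpy as np

def friendly_core_toy(points, r, seed=0):
    """Keep each point with probability that grows with the fraction of other
    points within distance r of it, so far-away outliers tend to be dropped.
    Toy flavor only; the paper's filter is calibrated for privacy."""
    rng = np.random.default_rng(seed)
    D = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    near = (D <= r).mean(axis=1)                  # fraction of close neighbors
    keep = rng.random(len(points)) < np.clip(2 * near - 1, 0, 1)
    return points[keep]
```

A private mean of the returned core can then be computed with noise scaled to the core's diameter $r$ rather than to the whole domain.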
Abstract:
Common datasets have the form of elements with keys (e.g., transactions and products), and the goal is to perform analytics on the aggregated form of key and frequency pairs. A weighted sample of keys by (a function of) frequency is a highly versatile summary that provides a sparse set of representative keys and supports approximate evaluations of query statistics. We propose private weighted sampling (PWS): a method that sanitizes a weighted sample so as to ensure element-level differential privacy, while retaining its utility to the maximum extent possible. PWS maximizes the reporting probabilities of keys and the estimation quality of a broad family of statistics. PWS improves over the state of the art even for the well-studied special case of private histograms, when no sampling is performed. We empirically observe significant performance gains: a 20%-300% increase in key reporting for common Zipfian frequency distributions, and accurate estimation at 2x-8x lower frequencies. PWS is applied as post-processing of a non-private sample, without requiring the original data. Therefore, it can be a seamless addition to existing implementations, such as those optimized for distributed or streamed data. We believe that due to its practicality and performance, PWS may become a method of choice in applications where privacy is desired.
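For the private-histogram special case mentioned above, a standard stability-based histogram makes a useful reference point (a known baseline, not PWS itself; threshold constants vary across references):

```python
import numpy as np

def stable_histogram(counts, eps, delta, seed=0):
    """Standard stability-based (eps, delta)-DP histogram over element-level
    neighbors: add Laplace noise to each observed key's count and release
    only counts above a threshold, so rare keys are almost never reported."""
    rng = np.random.default_rng(seed)
    tau = 1.0 + 2.0 * np.log(2.0 / delta) / eps   # threshold; constants vary
    out = {}
    for key, c in counts.items():
        noisy = c + rng.laplace(scale=2.0 / eps)
        if noisy > tau:
            out[key] = noisy
    return out
```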
Abstract:
Clustering is a fundamental problem in data analysis. In differentially private clustering, the goal is to identify k cluster centers without disclosing information on individual data points. Despite significant research progress, the problem has so far resisted practical solutions. In this work we aim to provide simple, implementable differentially private clustering algorithms that provide utility when the data is "easy," e.g., when there exists a significant separation between the clusters.
We propose a framework that allows us to apply non-private clustering algorithms to the easy instances and privately combine the results.
We are able to get improved sample complexity bounds in some cases of Gaussian mixtures and k-means. We complement our theoretical analysis with an empirical evaluation on synthetic data.
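One private aggregation primitive that such a framework can call once clusters are identified non-privately is a Gaussian-mechanism mean; a minimal sketch assuming all points lie in a ball of radius r:

```python
import numpy as np

def private_mean(X, r, eps, delta, seed=0):
    """(eps, delta)-DP mean via the Gaussian mechanism, assuming every row of
    X lies in a ball of radius r (so the mean's L2 sensitivity is 2r/n)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    sigma = (2 * r / n) * np.sqrt(2 * np.log(1.25 / delta)) / eps
    return X.mean(axis=0) + rng.normal(scale=sigma, size=d)
```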
Abstract:
Optimization of a machine learning model is typically carried out by performing stochastic gradient updates on epochs that consist of randomly ordered training examples. This practice means that each fraction of an epoch comprises an independent random sample of the training data that may not preserve informative structure present in the full data. We hypothesize that training can be more effective, allowing each epoch to provide some of the benefits of multiple epochs, with more principled, ``self-similar'' arrangements.
Our case study is matrix factorization, commonly used to learn metric embeddings of entities such as videos or words from example associations. We construct arrangements that preserve the weighted Jaccard similarities of rows and columns and experimentally observe that our arrangements yield training acceleration of 3\%-30\% on synthetic and recommendation datasets. Principled arrangements of training examples emerge as a novel and potentially powerful performance knob for SGD that merits further exploration.
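For context, a bare-bones SGD matrix-factorization loop; the paper's performance knob is the order in which the `triples` list is arranged within each epoch (all names and parameters below are illustrative):

```python
import numpy as np

def mf_sgd(triples, n_rows, n_cols, dim=16, lr=0.05, epochs=5, seed=0):
    """Plain SGD matrix factorization over (row, col, value) training triples.
    The ordering of `triples` within each epoch is exactly the knob studied:
    random order is the baseline; self-similar arrangements are the proposal."""
    rng = np.random.default_rng(seed)
    U = 0.1 * rng.standard_normal((n_rows, dim))
    V = 0.1 * rng.standard_normal((n_cols, dim))
    for _ in range(epochs):
        for i, j, x in triples:               # arrangement is applied here
            err = x - U[i] @ V[j]
            U[i], V[j] = U[i] + lr * err * V[j], V[j] + lr * err * U[i]
    return U, V
```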
Abstract:
Graph-based semi-supervised learning (SSL) algorithms predict labels for all nodes based on the provided labels of a small set of seed nodes. Classic methods capture the graph structure through some underlying diffusion process that propagates through the graph edges. Spectral diffusion, which includes personalized PageRank and label propagation, can be formulated as repeated weighted averaging of the label vectors of adjacent nodes. Social diffusion propagates through shortest paths. Common to these diffusions is their {\em linearity}, which does not distinguish between the contributions of a few ``strong'' relations and many ``weak'' relations.
Recently, non-linear methods such as node embeddings and graph convolutional networks (GCN) demonstrated a large gain in quality for SSL tasks. These methods introduce multiple components and vary greatly in how the graph structure, seed label information, and other features are used.
We aim here to study the contribution of non-linearity, as an isolated ingredient, to the performance gain. To do so, we place classic linear graph diffusions in a self-training framework. Surprisingly, we observe that the resulting {\em bootstrapped diffusions} not only significantly improve over the respective non-bootstrapped baselines but also outperform state-of-the-art non-linear methods. Moreover, since the self-training wrapper retains the scalability of the base method, we obtain both higher quality and better scalability.
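A compact sketch of a linear diffusion (label propagation as repeated weighted averaging) wrapped in self-training, in the spirit of the bootstrapped diffusions studied here; the confidence rule and schedule are simplified:

```python
import numpy as np

def label_propagation(A, seeds, labels, num_classes, iters=50, alpha=0.9):
    """Linear diffusion: repeatedly average neighbors' label vectors,
    anchored to the seed labels. A: dense adjacency matrix; labels: int
    array over all nodes, of which only the entries at seeds are used."""
    n = A.shape[0]
    P = A / A.sum(axis=1, keepdims=True).clip(min=1)   # row-stochastic
    Y = np.zeros((n, num_classes))
    Y[seeds, labels[seeds]] = 1.0
    F = Y.copy()
    for _ in range(iters):
        F = alpha * (P @ F) + (1 - alpha) * Y
    return F

def bootstrapped_diffusion(A, seeds, labels, num_classes,
                           rounds=3, per_round=10):
    """Self-training wrapper: promote the most confident predictions to new
    seeds and re-run the base linear diffusion."""
    seeds, labels = list(seeds), labels.copy()
    for _ in range(rounds):
        F = label_propagation(A, np.array(seeds), labels, num_classes)
        conf = F.max(axis=1)
        conf[np.array(seeds)] = -1.0          # never re-pick existing seeds
        new = np.argsort(-conf)[:per_round]
        labels[new] = F[new].argmax(axis=1)
        seeds.extend(int(i) for i in new)
    return label_propagation(A, np.array(seeds), labels, num_classes)
```

Note that the wrapper only calls the base diffusion as a black box, which is why it retains the base method's scalability.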
Abstract:
Clustering of data points is a fundamental tool in data analysis. We consider points $X$ in a relaxed metric space, where the triangle inequality holds within a constant factor. A clustering of $X$ is a partition of $X$ defined by a set of points $Q$ ({\em centroids}), according to the closest centroid. The {\em cost} of clustering $X$ by $Q$ is $V(Q)=\sum_{x\in X} d_{xQ}$. This formulation generalizes classic $k$-means clustering, which uses squared distances. Two basic tasks, parametrized by $k \geq 1$, are {\em cost estimation}, which returns (approximate) $V(Q)$ for queries $Q$ such that $|Q|=k$, and {\em clustering}, which returns an (approximate) minimizer of $V(Q)$ of size $|Q|=k$. When the data set $X$ is very large, we seek efficient constructions of small samples that can act as surrogates for performing these tasks. Existing constructions that provide quality guarantees, however, are either worst-case, and unable to benefit from structure of real data sets, or make explicit strong assumptions on the structure. We show here how to avoid both these pitfalls using adaptive designs.
The core of our design is the novel {\em one2all} probabilities, computed for a set $M$ of centroids and $\alpha \geq 1$: the clustering cost of {\em each} $Q$ with cost $V(Q) \geq V(M)/\alpha$ can be estimated well from a sample of size $O(\alpha |M|\epsilon^{-2})$. For cost estimation, we apply one2all with a bicriteria approximate $M$, while adaptively balancing $|M|$ and $\alpha$ to optimize sample size per quality. For clustering, we present a wrapper that adaptively applies a base clustering algorithm to a sample $S$, using the smallest sample that provides the desired statistical guarantees on quality. We demonstrate experimentally the huge gains of using our adaptive methods instead of worst-case ones.
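To convey the flavor of estimating clustering costs from a weighted sample, here is a toy Horvitz-Thompson estimator with pps-style inclusion probabilities derived from distances to a reference set $M$; the paper's one2all probabilities are more refined:

```python
import numpy as np

def clustering_cost(X, Q):
    """V(Q) = sum over x in X of the distance to the nearest centroid in Q."""
    return sum(min(np.linalg.norm(x - q) for q in Q) for x in X)

def sampled_cost_estimate(X, M, Q, k, seed=0):
    """Unbiased Horvitz-Thompson estimate of V(Q) from a Poisson pps sample
    whose inclusion probabilities are proportional to distances to a
    reference centroid set M (toy flavor of one2all, not the paper's
    exact probabilities)."""
    rng = np.random.default_rng(seed)
    w = np.array([min(np.linalg.norm(x - m) for m in M) for x in X]) + 1e-9
    p = np.minimum(1.0, k * w / w.sum())
    picked = np.flatnonzero(rng.random(len(X)) < p)
    return sum(min(np.linalg.norm(X[i] - q) for q in Q) / p[i] for i in picked)
```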
HyperLogLog Hyper Extended: Sketches for Concave Sublinear Frequency Statistics
KDD (2017) (to appear)
Abstract:
One of the most common statistics computed over data elements is the number of distinct keys. A thread of research pioneered by Flajolet and Martin three decades ago culminated in the design of optimal approximate counting sketches, which have size that is double logarithmic in the number of distinct keys and provide estimates with a small relative error. Moreover, the sketches are composable, and thus suitable for streamed, parallel, or distributed computation.
We consider here all statistics of the frequency distribution of keys, where the contribution of a key to the aggregate is concave and grows (sub)linearly with its frequency. These fundamental aggregations are very common in the analysis of text, graphs, and logs, and include logarithms, low frequency moments, and capping statistics.
We design composable sketches of double-logarithmic size for all concave sublinear statistics. Our design combines theoretical optimality and practical simplicity. In a nutshell, we specify tailored mapping functions of data elements to output elements so that our target statistics on the data elements are approximated by the (max-)distinct statistics of the output elements, which can be approximated using off-the-shelf sketches. Our key insight is relating these target statistics to the {\em complement Laplace} transform of the input frequencies.
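To illustrate the reduction from a frequency statistic to distinct counting, consider the classic soft-cap trick: map each occurrence of key x to output key (x, r) with r uniform in [T]; the expected number of distinct outputs for a key with frequency f is T(1-(1-1/T)^f), a smooth proxy for min(f, T). The toy below counts the output set exactly; an off-the-shelf distinct-count sketch such as HyperLogLog would replace it in a composable implementation:

```python
import random

def soft_cap_distinct(stream, T, seed=0):
    """Map each element occurrence of key x to output key (x, r), with r
    uniform in [T], and count distinct outputs. For a key with frequency f
    the expected count is T*(1 - (1 - 1/T)**f), a smooth proxy for min(f, T).
    The exact set here stands in for a distinct-count sketch."""
    rng = random.Random(seed)
    out = set()
    for x in stream:
        out.add((x, rng.randrange(T)))
    return len(out)

# Example: a key appearing 100 times with T=5 contributes about 5, not 100.
print(soft_cap_distinct(["a"] * 100 + ["b"], T=5))
```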
Greedy Maximization Framework for Graph-based Influence Functions
HotWeb, IEEE (2016)
Abstract:
The study of graph-based submodular maximization problems was initiated in a seminal work of Kempe, Kleinberg, and Tardos (2003): An {\em influence} function of subsets of nodes is defined by the graph structure and the aim is to find subsets of seed nodes with (approximately) optimal tradeoff of size and influence. Applications include viral marketing, monitoring, and active learning of node labels. This powerful formulation was studied for (generalized) {\em coverage} functions, where the influence of a seed set on a node is the maximum utility of a seed item to the node, and for pairwise {\em utility} based on reachability, distances, or reverse ranks.
We define a rich class of influence functions which unifies and extends previous work beyond coverage functions and specific utility functions. We present a meta-algorithm for approximate greedy maximization with strong approximation quality guarantees and worst-case near-linear computation for all functions in our class. Our meta-algorithm generalizes a recent design by Cohen et al. (2014) that was specific to distance-based coverage functions.
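The baseline such meta-algorithms accelerate is the classic greedy for monotone submodular maximization, which achieves a (1 - 1/e) approximation; the influence oracle below is assumed to be given:

```python
def greedy_max(candidates, influence, k):
    """Classic greedy for a monotone submodular influence oracle: repeatedly
    add the candidate with the largest marginal gain. Each oracle call is
    what sketch-based designs make cheap."""
    S, best_val = set(), 0.0
    for _ in range(k):
        gains = [(influence(S | {c}), c) for c in candidates if c not in S]
        if not gains:
            break
        val, c = max(gains, key=lambda t: t[0])
        if val <= best_val:              # no candidate improves the value
            break
        S.add(c)
        best_val = val
    return S
```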
Abstract:
Distances in a network capture relations between nodes and are the basis of centrality, similarity, and influence measures. Often, however, the relevance of a node $u$ to a node $v$ is more precisely measured not by the magnitude of the distance, but by the number of nodes that are closer to $v$ than $u$. That is, by the {\em rank} of $u$ in an ordering of nodes by increasing distance from $v$.
We identify and address fundamental challenges in rank-based graph mining. We first consider single-source computation of reverse ranks and design a ``Dijkstra-like'' algorithm which computes nodes in order of increasing approximate reverse rank while only traversing edges adjacent to returned nodes. We then define {\em reverse-rank influence}, which naturally extends reverse nearest neighbors influence [Korn and Muthukrishnan 2000] and builds on well-studied distance-based influence. We present near-linear algorithms for greedy approximate reverse-rank influence maximization. The design relies on our single-source algorithm. Our algorithms utilize near-linear preprocessing of the network to compute all-distance sketches. As a contribution of independent interest, we present a novel algorithm for computing these sketches, which have many other applications, on multi-core architectures.
We complement our algorithms by establishing the hardness of computing {\em exact} reverse-ranks for a single source and {\em exact} reverse-rank influence. This implies that when using near-linear algorithms, the small relative errors we obtain are the best we can currently hope for.
Finally, we conduct an experimental evaluation on graphs with tens of millions of edges, demonstrating both scalability and accuracy.
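As a baseline definition, the reverse rank of u with respect to v can be computed by a full traversal from v (here BFS on an unweighted graph); the hardness results above indicate that exact answers essentially require this kind of work, which is what motivates the approximate sketch-based algorithms:

```python
from collections import deque

def bfs_distances(adj, source):
    """Unweighted single-source distances via BFS; adj: node -> neighbors."""
    dist = {source: 0}
    q = deque([source])
    while q:
        u = q.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                q.append(w)
    return dist

def reverse_rank(adj, u, v):
    """Rank of u among nodes ordered by increasing distance from v, i.e.,
    how many nodes are strictly closer to v than u is (ties not counted)."""
    dist = bfs_distances(adj, v)
    du = dist.get(u, float("inf"))
    return sum(1 for d in dist.values() if d < du)
```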
Multi-Objective Weighted Sampling
HotWeb 2015 (to appear)
Abstract:
Key-value data sets of the form $\{(x,w_x)\}$ where $w_x >0$ are prevalent. Common queries over such data are {\em segment $f$-statistics} $Q(f,H) = \sum_{x\in H}f(w_x)$, specified for a segment $H$ of the keys and a function $f$. Different choices of $f$ correspond to count, sum, moments, cap, and threshold statistics.
When the data set is large, we can compute a smaller sample from which we can quickly estimate statistics. A weighted sample of keys taken with respect to $f(w_x)$ provides estimates with statistically guaranteed quality for $f$-statistics. Such a sample $S^{(f)}$ can be used to estimate $g$-statistics for $g\not=f$, but quality degrades with the disparity between $g$ and $f$. In this paper we address applications that require quality estimates for a set $F$ of different functions. A naive solution is to compute and work with a different sample $S^{(f)}$ for each $f\in F$. Instead, this can be achieved more effectively and seamlessly using a single {\em multi-objective} sample $S^{(F)}$ of a much smaller size.
We review multi-objective sampling schemes and place them in our context of estimating $f$-statistics. We show that a multi-objective sample for $F$ provides quality estimates for any $f$ that is a positive linear combination of functions from $F$. We then establish a surprising and powerful result when the target set $M$ is {\em all} monotone non-decreasing functions, noting that $M$ includes most natural statistics. We provide efficient multi-objective sampling algorithms for $M$ and show that a sample size of $k \ln n$ (where $n$ is the number of active keys) provides the same estimation quality, for any $f\in M$, as a dedicated weighted sample of size $k$ for $f$.
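A minimal sketch of the multi-objective idea under Poisson pps sampling (illustrative; the paper's schemes use bottom-k and shared-seed machinery): include each key with the maximum of its per-objective inclusion probabilities and keep that probability for inverse-probability estimation.

```python
import numpy as np

def mo_sample(weights, fns, k, seed=0):
    """Multi-objective pps sample: include key x with the maximum of its
    per-objective inclusion probabilities p_f(x) = min(1, k*f(w_x)/sum f(w)),
    and retain that probability for Horvitz-Thompson estimation."""
    rng = np.random.default_rng(seed)
    p = np.zeros(len(weights))
    for f in fns:
        fw = np.array([f(w) for w in weights], dtype=float)
        p = np.maximum(p, np.minimum(1.0, k * fw / fw.sum()))
    keep = rng.random(len(weights)) < p
    return [(i, p[i]) for i in np.flatnonzero(keep)]

def estimate_statistic(sample, weights, f, segment):
    """Estimate Q(f, H) = sum_{x in H} f(w_x) by sum of f(w_i)/p_i over
    sampled keys i that fall in the segment H."""
    return sum(f(weights[i]) / pi for i, pi in sample if i in segment)
```

The max-of-probabilities union is what makes a single sample serve every objective in $F$ at once, at a size far below the sum of the dedicated per-objective samples.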
Stream Sampling for Frequency Cap Statistics
CoRR, vol. abs/1502.05955 (2015)
Reverse Ranking by Graph Structure: Model and Scalable Algorithms
Average Distance Queries through Weighted Samples in Graphs and Metric Spaces: High Scalability with Tight Statistical Guarantees
All-Distances Sketches, Revisited: HIP Estimators for Massive Graphs Analysis
IEEE Trans. Knowl. Data Eng., vol. 27 (2015), pp. 2320-2334
Multi-Objective Weighted Sampling
CoRR, vol. abs/1509.07445 (2015)
Stream Sampling for Frequency Cap Statistics
KDD (2015), pp. 159-168
Average Distance Queries through Weighted Samples in Graphs and Metric Spaces: High Scalability with Tight Statistical Guarantees
All-Distances Sketches
Encyclopedia of Algorithms (2015)
Min-Hash Sketches
Encyclopedia of Algorithms (2015)
Coordinated Sampling
Encyclopedia of Algorithms (2015)
Probe scheduling for efficient detection of silent failures
Haim Kaplan
Yishay Mansour
Yoav Tzur
Perform. Eval., vol. 79 (2014), pp. 73-89
Author retrospective for search and replication in unstructured peer-to-peer networks
Variance Competitiveness for Monotone Estimation: Tightening the Bounds
CoRR, vol. abs/1406.6490 (2014)
Estimation for monotone sampling: competitiveness and customization
PODC (2014), pp. 124-133
Computing Classic Closeness Centrality, at Scale
Sketch-based Influence Maximization and Computation: Scaling up with Guarantees
Timed Influence: Computation and Maximization
Algorithms and estimators for summarization of unaggregated data streams
Nick G. Duffield
Haim Kaplan
Carsten Lund
Mikkel Thorup
J. Comput. Syst. Sci., vol. 80 (2014), pp. 1214-1244
Sketch-based Influence Maximization and Computation: Scaling up with Guarantees
Computing classic closeness centrality, at scale
All-distances sketches, revisited: HIP estimators for massive graphs analysis
PODS (2014), pp. 88-99
Distance queries from sampled data: accurate and efficient
KDD (2014), pp. 681-690
Probe Scheduling for Efficient Detection of Silent Failures
Haim Kaplan
Yishay Mansour
Yoav Tzur
CoRR, vol. abs/1302.0792 (2013)
What You Can Do with Coordinated Samples
On the Tradeoff between Stability and Fit
Scalable Neighborhood Sketching and Distance Distribution Estimation in Graph Datasets: Revisited, Unified, and Improved
CoRR, vol. abs/1306.3284 (2013)
A Labeling Approach to Incremental Cycle Detection
Scheduling Subset Tests: One-Time, Continuous, and How They Relate
Scalable similarity estimation in social networks: closeness, node labels, and random edge lengths
Daniel Delling
Fabian Fuchs
Andrew V. Goldberg
Moisés Goldszmidt
Renato F. Werneck
COSN (2013), pp. 131-142
Don't let the negatives bring you down: sampling from streams of signed updates
What you can do with Coordinated Samples
A Case for Customizing Estimators: Coordinated Samples
Envy-Free Makespan Approximation
Michal Feldman
Amos Fiat
Haim Kaplan
Svetlana Olonetsky
SIAM J. Comput., vol. 41 (2012), pp. 12-25
How to Estimate Change from Samples
Structure-Aware Sampling: Flexible and Accurate Summarization
Truth, Envy, and Truthful Market Clearing Bundle Pricing
Get the Most out of Your Sample: Optimal Unbiased Estimators using Partial Information
Structure-aware sampling on data streams
Efficient Stream Sampling for Variance-Optimal Estimation of Subset Sums
Nick G. Duffield
Haim Kaplan
Carsten Lund
Mikkel Thorup
SIAM J. Comput., vol. 40 (2011), pp. 1402-1431
Envy-free makespan approximation: extended abstract
Labeling Dynamic XML Trees
On the Interplay between Incentive Compatibility and Envy Freeness
Truth and Envy in Capacitated Allocation Games
Leveraging Discarded Samples for Tighter Estimation of Multiple-Set Aggregates
Coordinated Weighted Sampling for Estimating Aggregates Over Multiple Weight Assignments
Composable, Scalable, and Accurate Weight Summarization of Unaggregated Data Sets
Nick G. Duffield
Haim Kaplan
Carsten Lund
Mikkel Thorup
PVLDB, vol. 2 (2009), pp. 431-442
Envy-Free Makespan Approximation
Coordinated Weighted Sampling for Estimating Aggregates Over Multiple Weight Assignments
Decay Models
Encyclopedia of Database Systems (2009), pp. 757-761
Stream sampling for variance-optimal estimation of subset sums
Leveraging discarded samples for tighter estimation of multiple-set aggregates
Sketch-Based Estimation of Subpopulation-Weight
Tighter estimation using bottom k sketches
Confident estimation for multistage measurement sampling and aggregation
Variance optimal sampling based estimation of subset sums
Estimating Aggregates over Multiple Sets
Processing top-k queries from samples
Algorithms and estimators for accurate summarization of internet traffic
Nick G. Duffield
Haim Kaplan
Carsten Lund
Mikkel Thorup
Internet Measurement Conference (2007), pp. 265-278
Sketching unaggregated data streams for subpopulation-size queries
Bottom-k sketches: better and more efficient estimation of aggregates
Associative search in peer to peer networks: Harnessing latent semantics
Spatially-decaying aggregation over a network
Summarizing data using bottom-k sketches
A short walk in the Blogistan
Maintaining time-decaying stream aggregates
Making routing robust to changing traffic demands: algorithms and evaluation
Processing top k queries from samples
Packet classification in large ISPs: design and evaluation of decision tree classifiers
Performance aspects of distributed caches using TTL-based consistency
Balanced-Replication Algorithms for Distribution Trees
Coping with network failures: routing strategies for optimal demand oblivious restoration
Optimal oblivious routing in polynomial time
Yossi Azar
Amos Fiat
Haim Kaplan
Harald Räcke
J. Comput. Syst. Sci., vol. 69 (2004), pp. 383-394
Spatially-decaying aggregation over a network: model and algorithms
Guest Editors' foreword
Efficient estimation algorithms for neighborhood variance and other moments
Making intra-domain routing robust to changing and uncertain traffic demands: understanding fundamental tradeoffs
A case for associative peer to peer overlays
Optimal oblivious routing in polynomial time
Maintaining time-decaying stream aggregates
Reachability and Distance Queries via 2-Hop Labels
Proactive caching of DNS records: addressing a performance bottleneck
Associative Search in Peer to Peer Networks: Harnessing Latent Semantics
Efficient sequences of trials
Predicting and bypassing end-to-end Internet service degradations
Anat Bremler-Barr
Haim Kaplan
Yishay Mansour
IEEE Journal on Selected Areas in Communications, vol. 21 (2003), pp. 961-978
Connection caching: model and algorithms
Predicting and bypassing end-to-end internet service degradations
Anat Bremler-Barr
Haim Kaplan
Yishay Mansour
Internet Measurement Workshop (2002), pp. 307-320
Restoration by path concatenation: fast recovery of MPLS paths
Yehuda Afek
Anat Bremler-Barr
Haim Kaplan
Michael Merritt
Distributed Computing, vol. 15 (2002), pp. 273-283
Caching Documents with Variable Sizes and Fetching Costs: An LP-Based Approach
Search and replication in unstructured peer-to-peer networks
Replication strategies in unstructured peer-to-peer networks
Labeling Dynamic XML Trees
Balanced-Replication Algorithms for Distribution Trees
Reachability and distance queries via 2-hop labels
Exploiting Regularities in Web Traffic Patterns for Cache Replacement
Refreshment policies for Web content caches
Prefetching the means for document transfer: a new approach for reducing Web latency
Competitive Analysis of the LRFU Paging Algorithm
Search and replication in unstructured peer-to-peer networks
All-Pairs Small-Stretch Paths
Refreshment Policies for Web Content Caches
Proactive Caching of DNS Records: Addressing a Performance Bottleneck
Restoration by path concatenation: fast recovery of MPLS paths
Aging through cascaded caches: performance issues in the distribution of web content
Performance Aspects of Distributed Caches Using TTL-Based Consistency
Restoration path concatenation: fast recovery of MPLS paths
Anat Bremler-Barr
Yehuda Afek
Haim Kaplan
Michael Merritt
SIGMETRICS/Performance (2001), pp. 316-317
Competitive Analysis of the LRFU Paging Algorithm
The Age Penalty and Its Effect on Cache Performance
Finding Interesting Associations without Support Pruning
Mayur Datar
Shinji Fujiwara
Aristides Gionis
Piotr Indyk
Rajeev Motwani
Jeffrey D. Ullman
Cheng Yang
IEEE Trans. Knowl. Data Eng., vol. 13 (2001), pp. 64-78
Finding Interesting Associations without Support Pruning
Mayur Datar
Shinji Fujiwara
Aristides Gionis
Piotr Indyk
Rajeev Motwani
Jeffrey D. Ullman
Cheng Yang
ICDE (2000), pp. 489-499
Prefetching the Means for Document Transfer: A New Approach for Reducing Web Latency
Connection caching under various models of communication
Polylog-time and near-linear work approximation scheme for undirected shortest paths
J. ACM, vol. 47 (2000), pp. 132-166
Approximating Matrix Multiplication for Pattern Recognition Tasks
Connection Caching
LP-based Analysis of Greedy-dual-size
Managing TCP Connections Under Persistent HTTP
Exploiting Regularities in Web Traffic Patterns for Cache Replacement
Efficient Algorithms for Predicting Requests to Web Servers
Improving End-to-End Performance of the Web Using Server Volumes and Proxy Filters
Structure Prediction and Computation of Sparse Matrix Products
J. Comb. Optim., vol. 2 (1998), pp. 307-332
Fast Algorithms for Constructing t-Spanners and Paths with Stretch t
SIAM J. Comput., vol. 28 (1998), pp. 210-236
Evaluating Server-Assisted Cache Replacement in the Web
Using Selective Path-Doubling for Parallel Shortest-Path Computations
J. Algorithms, vol. 22 (1997), pp. 30-56
All-Pairs Small-Stretch Paths
Size-Estimation Framework with Applications to Transitive Closure and Reachability
J. Comput. Syst. Sci., vol. 55 (1997), pp. 441-453
Learning Noisy Perceptrons by a Perceptron in Polynomial Time
FOCS (1997), pp. 514-523
Approximating Matrix Multiplication for Pattern Recognition Tasks
On Optimizing Multiplications of Sparse Matrices
IPCO (1996), pp. 219-233
Efficient Parallel Shortest-Paths in Digraphs with a Separator Decomposition
J. Algorithms, vol. 21 (1996), pp. 331-357
Approximate Max-Flow on Small Depth Networks
SIAM J. Comput., vol. 24 (1995), pp. 579-597
Improvise: Interactive Multimedia Process Visualization Environment
Polylog-time and near-linear work approximation scheme for undirected shortest paths
STOC (1994), pp. 16-26
Improved Algorithms for Linear Inequalities With Two Variables per Inequality
New algorithms for generalized network flows
Estimating the Size of the Transitive Closure in Linear Time
FOCS (1994), pp. 190-200
Algorithms and Complexity Analysis for Some Flow Problems
Using Selective Path-Doubling for Parallel Shortest-Path Computations
ISTCS (1993), pp. 78-87
Efficient Parallel Shortest-Paths in Digraphs with a Separator Decomposition
SPAA (1993), pp. 57-67
Fast algorithms for constructing t-spanners and paths with stretch t
FOCS (1993), pp. 648-658
Strongly Polynomial-Time and NC Algorithms for Detecting Cycles in Periodic Graphs
Approximate Max Flow on Small Depth Networks
FOCS (1992), pp. 648-658
New Algorithms for Generalized Network Flows
NP-Completeness of graph decomposition problems
Improved Algorithms for Linear Inequalities with Two Variables per Inequality (Extended Abstract)
Algorithms and Complexity Analysis for Some Flow Problems
Strongly Polynomial-Time and NC Algorithms for Detecting Cycles in Dynamic Graphs (Preliminary Version)