Source author record

Yossi Azar

Yossi Azar appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher

Unclaimed source record

Data Structures and Algorithms; Computer Science and Game Theory; Discrete Mathematics; Distributed, Parallel, and Cluster Computing

Catalog footprint

What is connected

16 works

4 topics

4 close collaborators

preprint, 2021, arXiv

Hierarchical Clustering via Sketches and Hierarchical Correlation Clustering

Recently, Hierarchical Clustering (HC) has been considered through the lens of optimization. In particular, two maximization objectives have been defined. Moseley and Wang defined the Revenue objective to handle similarity information given by a weighted graph on the data points (w.l.o.g., $[0,1]$ weights), while Cohen-Addad et al. defined the Dissimilarity objective to handle dissimilarity information. In this paper, we prove structural lemmas for both objectives allowing us to convert any HC tree to a tree with a constant number of internal nodes while incurring an arbitrarily small loss in each objective. Although the best-known approximations are 0.585 and 0.667 respectively, using our lemmas we obtain approximations arbitrarily close to 1, if not all weights are small (i.e., there exist constants $ε, δ$ such that the fraction of weights smaller than $δ$ is at most $1 - ε$); such instances encompass many metric-based similarity instances, thereby improving upon prior work. Finally, we introduce Hierarchical Correlation Clustering (HCC) to handle instances that contain similarity and dissimilarity information simultaneously. For HCC, we provide an approximation of 0.4767 and for complementary similarity/dissimilarity weights (analogous to $+/-$ correlation clustering), we again present nearly-optimal approximations.

preprint, 2020, arXiv

Beyond Tree Embeddings -- a Deterministic Framework for Network Design with Deadlines or Delay

We consider network design problems with deadline or delay. All previous results for these models are based on randomized embedding of the graph into a tree (HST) and then solving the problem on this tree. We show that this is not necessary. In particular, we design a deterministic framework for these problems which is not based on embedding. This enables us to provide deterministic $\text{poly-log}(n)$-competitive algorithms for Steiner tree, generalized Steiner tree, node weighted Steiner tree, (non-uniform) facility location and directed Steiner tree with deadlines or with delay (where $n$ is the number of nodes). Our deterministic algorithms also give improved guarantees over some previous randomized results. In addition, we show a lower bound of $\text{poly-log}(n)$ for some of these problems, which implies that our framework is optimal up to the power of the poly-log. Our algorithms and techniques differ significantly from those in all previous considerations of these problems.

preprint, 2020, arXiv

Hierarchical Clustering: a 0.585 Revenue Approximation

Hierarchical Clustering trees are widely accepted as a useful form of clustering data and have been adopted in fields including phylogenetics, image analysis, bioinformatics and more. Recently, Dasgupta (STOC '16) initiated the analysis of these types of algorithms through the lens of approximation. Later, the dual problem was considered by Moseley and Wang (NIPS '17), who dubbed it the Revenue goal function. In this problem, given a nonnegative weight $w_{ij}$ for each pair $i,j \in [n]=\{1,2, \ldots ,n\}$, the objective is to find a tree $T$ whose set of leaves is $[n]$ that maximizes the function $\sum_{i<j \in [n]} w_{ij} (n -|T_{ij}|)$, where $|T_{ij}|$ is the number of leaves in the subtree rooted at the least common ancestor of $i$ and $j$. In our work we consider the revenue goal function and prove the following results. First, we prove the existence of a bisection (i.e., a tree of depth 2 in which the root has two children, each being a parent of $n/2$ leaves) which approximates the general optimal tree solution up to a factor of $\frac{1}{2}$ (which is tight). Second, we apply this result in order to prove a $\frac{2}{3}p$ approximation for the general revenue problem, where $p$ is defined as the approximation ratio of the Max-Uncut Bisection problem. Since $p$ is known to be at least 0.8776 (Wu et al., 2015, Austrin et al., 2016), we get a 0.585 approximation algorithm for the revenue problem. This improves a sequence of earlier results which culminated in a 0.4246-approximation guarantee (Ahmadian et al., 2019).
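The Revenue objective defined in this abstract can be evaluated directly on a small tree. The following is a minimal sketch, not the paper's algorithm: it assumes leaves are labeled 0..n-1 (rather than 1..n), that a tree is given as nested Python tuples, and that weights live in a dict keyed by ordered pairs. The names `leaves`, `revenue`, and `w` are illustrative choices.

```python
from itertools import combinations


def leaves(tree):
    """Return the leaves below a nested-tuple tree (a leaf is a plain int)."""
    if isinstance(tree, int):
        return [tree]
    return [leaf for child in tree for leaf in leaves(child)]


def revenue(tree, w, n=None):
    """Sum of w[i, j] * (n - |T_ij|) over all pairs i < j.

    |T_ij| is the number of leaves under the least common ancestor of i and j.
    Missing pairs in `w` are treated as weight 0 (an assumption of this sketch).
    """
    if n is None:
        n = len(leaves(tree))
    if isinstance(tree, int):
        return 0.0
    child_leaves = [leaves(child) for child in tree]
    size = sum(len(cl) for cl in child_leaves)
    total = 0.0
    # Two leaves in different children of this node have their LCA here.
    for a, b in combinations(child_leaves, 2):
        for i in a:
            for j in b:
                total += w.get((min(i, j), max(i, j)), 0.0) * (n - size)
    # Pairs inside a single child are handled recursively.
    for child in tree:
        total += revenue(child, w, n)
    return total


w = {(0, 1): 1.0, (2, 3): 1.0, (0, 2): 0.5}
good_tree = ((0, 1), (2, 3))   # keeps the heavy pairs together low in the tree
bad_tree = ((0, 2), (1, 3))    # separates the heavy pairs at the root
print(revenue(good_tree, w), revenue(bad_tree, w))

# The stated guarantee is (2/3) * p with p >= 0.8776.
print(round((2 / 3) * 0.8776, 3))
```

Heavy pairs that are merged low in the tree contribute a large factor $n - |T_{ij}|$, while pairs separated only at the root contribute nothing, which is why the first tree scores higher here.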

preprint, 2020, arXiv

Set Cover with Delay -- Clairvoyance is not Required

In most online problems with delay, clairvoyance (i.e., knowing the future delay of a request upon its arrival) is required for polylogarithmic competitiveness. In this paper, we show that this is not the case for set cover with delay (SCD) -- specifically, we present the first non-clairvoyant algorithm, which is $O(\log n \log m)$-competitive, where $n$ is the number of elements and $m$ is the number of sets. This matches the best known result for the classic online set cover (a special case of non-clairvoyant SCD). Moreover, clairvoyance does not allow for significant improvement: we present lower bounds of $Ω(\sqrt{\log n})$ and $Ω(\sqrt{\log m})$ for SCD that apply even in the clairvoyant case. In addition, the competitiveness of our algorithm does not depend on the number of requests. Such a guarantee in terms of the size of the universe alone was not previously known even for the clairvoyant case: the only previously known algorithm (due to Carrasco et al.) is clairvoyant, with competitiveness that grows with the number of requests. For the special case of vertex cover with delay, we show a simpler, deterministic algorithm which is $3$-competitive (and also non-clairvoyant).

preprint, 2016, arXiv

Online Lower Bounds via Duality

In this paper, we exploit linear programming duality in the online setting (i.e., where input arrives on the fly) from the unique perspective of designing lower bounds on the competitive ratio. In particular, we provide a general technique for obtaining online deterministic and randomized lower bounds (i.e., hardness results) on the competitive ratio for a wide variety of problems. We show the usefulness of our approach by providing new, tight lower bounds for three diverse online problems. The three problems we show tight lower bounds for are the Vector Bin Packing problem, Ad-auctions (and various online matching problems), and the Capital Investment problem. Our methods are sufficiently general that they can also be used to reconstruct existing lower bounds. Our techniques are in stark contrast to previous works, which exploit linear programming duality to obtain positive results, often via the useful primal-dual scheme. We design a general recipe with the opposite aim of obtaining negative results via duality. The general idea behind our approach is to construct a primal linear program based on a collection of input sequences, where the objective function corresponds to optimizing the competitive ratio. We then obtain the corresponding dual linear program and provide a feasible solution, where the objective function yields a lower bound on the competitive ratio. Online lower bounds are often achieved by adapting the input sequence according to an online algorithm's behavior and doing an appropriate ad hoc case analysis. Using our unifying techniques, we simultaneously combine these cases into one linear program and achieve online lower bounds via a more robust analysis. We are confident that our framework can be successfully applied to produce many more lower bounds for a wide array of online problems.

preprint, 2016, arXiv

Polylogarithmic Bounds on the Competitiveness of Min-cost (Bipartite) Perfect Matching with Delays

We consider the problem of online Min-cost Perfect Matching with Delays (MPMD) recently introduced by Emek et al. (STOC 2016). This problem is defined on an underlying $n$-point metric space. An adversary presents real-time requests online at points of the metric space, and the algorithm is required to match them, possibly after keeping them waiting for some time. The cost incurred is the sum of the distances between matched pairs of points (the connection cost), and the sum of the waiting times of the requests (the delay cost). We present an algorithm with a competitive ratio of $O(\log n)$, which improves the upper bound of $O(\log^2 n + \log Δ)$ of Emek et al. by removing the dependence on $Δ$, the aspect ratio of the metric space (which can be unbounded as a function of $n$). The core of our algorithm is a deterministic algorithm for MPMD on metrics induced by edge-weighted trees of height $h$, whose cost is guaranteed to be at most $O(1)$ times the connection cost plus $O(h)$ times the delay cost of every feasible solution. The reduction from MPMD on arbitrary metrics to MPMD on trees is achieved using the result on embedding $n$-point metric spaces into distributions over weighted hierarchically separated trees of height $O(\log n)$, with distortion $O(\log n)$. We also prove a lower bound of $Ω(\sqrt{\log n})$ on the competitive ratio of any randomized algorithm. This is the first lower bound which increases with $n$, and is attained on the metric of $n$ equally spaced points on a line. The problem of Min-cost Bipartite Perfect Matching with Delays (MBPMD) is the same as MPMD except that every request is either positive or negative, and requests can be matched only if they have opposite polarity. We prove an upper bound of $O(\log n)$ and a lower bound of $Ω(\log^{1/3} n)$ on the competitive ratio of MBPMD with a more involved analysis.
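The MPMD cost described in this abstract (connection cost plus delay cost) can be made concrete with a small evaluator. This is only a sketch of the cost model, not of the paper's algorithm. The function name `mpmd_cost`, the request encoding, and the line metric in the example are assumptions made for illustration.

```python
def mpmd_cost(arrival, matching, dist):
    """Return (connection_cost, delay_cost) for a given matching.

    arrival:  dict mapping request id -> (point, arrival_time)
    matching: list of (request_a, request_b, match_time) triples
    dist:     function (point, point) -> distance in the metric space
    """
    connection = 0.0
    delay = 0.0
    for a, b, t in matching:
        (pa, ta), (pb, tb) = arrival[a], arrival[b]
        if t < max(ta, tb):
            raise ValueError("a pair cannot be matched before both requests arrive")
        connection += dist(pa, pb)          # distance between the matched points
        delay += (t - ta) + (t - tb)        # waiting time of both requests
    return connection, delay


# Hypothetical instance on the real line with two locations, 0 and 3.
arrival = {"r0": (0, 0.0), "r1": (3, 1.0), "r2": (0, 2.0), "r3": (3, 2.0)}
line = lambda x, y: abs(x - y)

eager = [("r0", "r1", 1.0), ("r2", "r3", 2.0)]    # match as soon as possible
patient = [("r0", "r2", 2.0), ("r1", "r3", 2.0)]  # wait for co-located requests

print(mpmd_cost(arrival, eager, line))
print(mpmd_cost(arrival, patient, line))
```

In this toy instance the eager matching pays connection cost 6 and delay 1, while the patient one pays connection cost 0 and delay 3, which illustrates the trade-off an online algorithm has to balance.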

preprint, 2016, arXiv

When should an expert make a prediction?

We consider a setting where, at a known future time, a certain continuous random variable will be realized. There is a public prediction that gradually converges to its realized value, and an expert that has access to a more accurate prediction. Our goal is to study when the expert should reveal his information, assuming that his reward is based on a logarithmic market scoring rule (i.e., his reward is proportional to the gain in log-likelihood of the realized value). Our contributions are: (1) we characterize the expert's optimal policy and show that it is threshold based, (2) we analyze the expert's asymptotic expected optimal reward and show a tight connection to the Law of the Iterated Logarithm, and (3) we give an efficient dynamic programming algorithm to compute the optimal policy.

preprint, 2015, arXiv

Liquid Price of Anarchy

Incorporating budget constraints into the analysis of auctions has become increasingly important, as they model practical settings more accurately. The social welfare function, which is the standard measure of efficiency in auctions, is inadequate for settings with budgets, since there may be a large disconnect between the value a bidder derives from obtaining an item and what can be liquidated from her. The Liquid Welfare objective function has been suggested as a natural alternative for settings with budgets. Simple auctions, like simultaneous item auctions, are evaluated by their performance at equilibrium using the Price of Anarchy (PoA) measure -- the ratio of the objective function value of the optimal outcome to that of the worst equilibrium. Accordingly, we evaluate the performance of simultaneous item auctions in budgeted settings by the Liquid Price of Anarchy (LPoA) measure -- the ratio of the optimal Liquid Welfare to the Liquid Welfare obtained in the worst equilibrium. Our main result is that the LPoA for mixed Nash equilibria is bounded by a constant when bidders are additive and items can be divided into sufficiently many discrete parts. Our proofs are robust, and can be extended to achieve similar bounds for simultaneous second price auctions as well as Bayesian Nash equilibria. For pure Nash equilibria, we establish tight bounds on the LPoA for the larger class of fractionally-subadditive valuations. To derive our results, we develop a new technique in which some bidders deviate (surprisingly) toward a non-optimal solution. In particular, this technique does not fit into the smoothness framework.
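The abstract does not spell out the Liquid Welfare formula. The sketch below assumes the standard definition from the budgeted-auction literature, in which each bidder contributes the minimum of her value for her allocation and her budget. The function names and the numbers are illustrative, and the "equilibrium" outcome in the example is simply a hypothetical outcome, not one computed from an auction.

```python
def social_welfare(values):
    """Sum of the values bidders derive from their allocations."""
    return sum(values)


def liquid_welfare(values, budgets):
    """Assumed standard definition: sum over bidders of min(value, budget)."""
    return sum(min(v, b) for v, b in zip(values, budgets))


def lpoa_ratio(optimal_values, equilibrium_values, budgets):
    """Ratio of optimal liquid welfare to liquid welfare of a given outcome."""
    return liquid_welfare(optimal_values, budgets) / liquid_welfare(
        equilibrium_values, budgets
    )


budgets = [3.0, 5.0]
# Bidder 0 values her bundle at 10 but can only pay 3; bidder 1 values hers at 4.
print(social_welfare([10.0, 4.0]), liquid_welfare([10.0, 4.0], budgets))

# Hypothetical comparison of an optimal outcome against some worse outcome.
print(lpoa_ratio([3.0, 5.0], [10.0, 1.0], budgets))
```

The first line of output shows the disconnect the abstract mentions: a bidder with a high value but a small budget inflates social welfare far more than liquid welfare.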

preprint, 2015, arXiv

Truthful Online Scheduling with Commitments

We study online mechanisms for preemptive scheduling with deadlines, with the goal of maximizing the total value of completed jobs. This problem is fundamental to deadline-aware cloud scheduling, but there are strong lower bounds even for the algorithmic problem without incentive constraints. However, these lower bounds can be circumvented under the natural assumption of deadline slackness, i.e., that there is a guaranteed lower bound $s > 1$ on the ratio between the length of the time window in which a job can be executed and the job's size. In this paper, we construct a truthful scheduling mechanism with a constant competitive ratio, given slackness $s > 1$. Furthermore, we show that if $s$ is large enough then we can construct a mechanism that also satisfies a commitment property: it can be determined whether or not a job will finish, and the requisite payment if so, well in advance of each job's deadline. This is notable because, in practice, users with strict deadlines may find it unacceptable to discover only very close to their deadline that their job has been rejected.

preprint, 2015, arXiv

TSP with Time Windows and Service Time

We consider TSP with time windows and service time. In this problem we receive a sequence of requests for a service at nodes in a metric space and a time window for each request. The goal of the online algorithm is to maximize the number of requests served during their time window. The time to traverse an edge is the distance between the incident nodes of that edge. Serving a request requires unit time. We characterize the competitive ratio for each metric space separately. The competitive ratio depends on the relation between the minimum laxity (the minimum length of a time window) and the diameter of the metric space. Specifically, whether a constant-competitive algorithm exists depends on whether the laxity is larger or smaller than the diameter. In addition, we characterize the rate of convergence of the competitive ratio to $1$ as the laxity increases. Specifically, we provide matching lower and upper bounds depending on the ratio between the laxity and the TSP of the metric space (the minimum distance to traverse all nodes). An application of our result improves the lower bound for colored packets with transition cost and matches the upper bound. In proving our lower bounds we use an interesting non-standard embedding with some special properties. This embedding may be of independent interest.

preprint, 2014, arXiv

Online Covering with Convex Objectives and Applications

We give an algorithmic framework for minimizing general convex objectives (that are differentiable and monotone non-decreasing) over a set of covering constraints that arrive online. This substantially extends previous work on online covering for linear objectives (Alon et al., STOC 2003) and online covering with offline packing constraints (Azar et al., SODA 2013). To the best of our knowledge, this is the first result in online optimization for generic non-linear objectives; special cases of such objectives have previously been considered, particularly for energy minimization. As a specific problem in this genre, we consider the unrelated machine scheduling problem with startup costs and arbitrary $\ell_p$ norms on machine loads (including the surprisingly non-trivial $\ell_1$ norm representing total machine load). This problem was studied earlier for the makespan norm in both the offline (Khuller et al., SODA 2010; Li and Khuller, SODA 2011) and online settings (Azar et al., SODA 2013). We adapt the two-phase approach of obtaining a fractional solution and then rounding it online (used successfully for many linear objectives) to the non-linear objective. The fractional algorithm uses ideas from our general framework that we described above (but does not fit the framework exactly because of non-positive entries in the constraint matrix). The rounding algorithm uses ideas from offline rounding of LPs with non-linear objectives (Azar and Epstein, STOC 2005; Kumar et al., FOCS 2005). Our competitive ratio is tight up to a logarithmic factor. Finally, for the important special case of total load ($\ell_1$ norm), we give a different rounding algorithm that obtains a better competitive ratio than the generic rounding algorithm for $\ell_p$ norms. We show that this competitive ratio is asymptotically tight.

preprint, 2013, arXiv

Colored Packets with Deadlines and Metric Space Transition Cost

We consider scheduling of colored packets with transition costs which form a general metric space. We design a $1 - O(\sqrt{MST(G) / L})$-competitive algorithm. Our main result is a hardness result of $1 - Ω(\sqrt{MST(G) / L})$ which matches the competitive ratio of the algorithm for each metric space separately. In particular, we improve the hardness result of Azar et al. (2009) for uniform metric spaces. We also extend our result to weighted directed graphs which obey the triangle inequality and show a $1 - O(\sqrt{TSP(G) / L})$-competitive algorithm and a nearly-matching hardness result. In proving our hardness results we use an interesting non-standard embedding.

preprint, 2012, arXiv

Efficient Submodular Function Maximization under Linear Packing Constraints

We study the problem of maximizing a monotone submodular set function subject to linear packing constraints. An instance of this problem consists of a matrix $A \in [0,1]^{m \times n}$, a vector $b \in [1,\infty)^m$, and a monotone submodular set function $f: 2^{[n]} \rightarrow \mathbb{R}_+$. The objective is to find a set $S$ that maximizes $f(S)$ subject to $A x_{S} \leq b$, where $x_S$ stands for the characteristic vector of the set $S$. A well-studied special case of this problem is when $f$ is linear. This special case captures the class of packing integer programs. Our main contribution is an efficient combinatorial algorithm that achieves an approximation ratio of $Ω(1 / m^{1/W})$, where $W = \min\{b_i / A_{ij} : A_{ij} > 0\}$ is the width of the packing constraints. This result matches the best known performance guarantee for the linear case. One immediate corollary of this result is that the algorithm under consideration achieves constant factor approximation when the number of constraints is constant or when the width of the constraints is sufficiently large. This motivates us to study the large width setting, trying to determine its exact approximability. We develop an algorithm that has an approximation ratio of $(1 - ε)(1 - 1/e)$ when $W = Ω(\ln m / ε^2)$. This result essentially matches the theoretical lower bound of $1 - 1/e$. We also study the special setting in which the matrix $A$ is binary and $k$-column sparse. A $k$-column sparse matrix has at most $k$ non-zero entries in each of its columns. We design a fast combinatorial algorithm that achieves an approximation ratio of $Ω(1 / (Wk^{1/W}))$, that is, its performance guarantee only depends on the sparsity and width parameters.
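The width $W$ and the feasibility condition $A x_S \leq b$ from this abstract are simple to compute for a small instance. The following sketch only illustrates those two definitions and does not implement the approximation algorithms. The function names and the example matrix are assumptions made for illustration, and columns are indexed from 0.

```python
def width(A, b):
    """W = min{ b[i] / A[i][j] : A[i][j] > 0 } for a packing matrix A."""
    ratios = [b[i] / a for i, row in enumerate(A) for a in row if a > 0]
    if not ratios:
        raise ValueError("A has no positive entry, so the width is undefined")
    return min(ratios)


def is_feasible(A, b, S):
    """Check A x_S <= b, where x_S is the characteristic vector of the set S."""
    return all(sum(row[j] for j in S) <= b_i for row, b_i in zip(A, b))


# Hypothetical instance with m = 2 constraints and n = 3 elements.
A = [
    [0.5, 0.25, 1.0],
    [0.5, 0.0, 0.25],
]
b = [1.5, 4.0]

print(width(A, b))
print(is_feasible(A, b, {0, 2}), is_feasible(A, b, {0, 1, 2}))
```

Here the tightest ratio comes from the entry $A_{13} = 1.0$ against $b_1 = 1.5$, so the width is 1.5. A larger width means each single element uses only a small fraction of any constraint, which is the regime in which the abstract reports better guarantees.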

preprint, 2012, arXiv

Online Load Balancing on Unrelated Machines with Startup Costs

Motivated by applications in energy-efficient scheduling in data centers, Khuller, Li, and Saha introduced the machine activation problem as a generalization of the classical optimization problems of set cover and load balancing on unrelated machines. In this problem, a set of $n$ jobs have to be distributed among a set of $m$ (unrelated) machines, given the processing time of each job on each machine, where each machine has a startup cost. The goal is to produce a schedule of minimum total startup cost subject to a constraint $\mathbf{L}$ on its makespan. While Khuller et al. considered the offline version of this problem, a typical scenario in scheduling is one where jobs arrive online and have to be assigned to a machine immediately on arrival. We give an $(O(\log (mn)\log m), O(\log m))$-competitive randomized online algorithm for this problem, i.e., the schedule produced by our algorithm has a makespan of $O(\mathbf{L} \log m)$ with high probability, and a total expected startup cost of $O(\log (mn)\log m)$ times that of an optimal offline schedule with makespan $\mathbf{L}$. The competitive ratios of our algorithm are (almost) optimal. Our algorithms use the online primal dual framework introduced by Alon et al. for the online set cover problem, and subsequently developed further by Buchbinder, Naor, and co-authors. To the best of our knowledge, all previous applications of this framework have been to linear programs (LPs) with either packing or covering constraints. One novelty of our application is that we use this framework for a mixed LP that has both covering and packing constraints. We hope that the algorithmic techniques developed in this paper to simultaneously handle packing and covering constraints will be useful for solving other online optimization problems as well.

preprint, 2010, arXiv

Optimal whitespace synchronization strategies

The whitespace-discovery problem describes two parties, Alice and Bob, trying to establish a communication channel over one of a given large segment of whitespace channels. Subsets of the channels are occupied in each of the local environments surrounding Alice and Bob, as well as in the global environment between them (Eve). In the absence of a common clock for the two parties, the goal is to devise time-invariant (stationary) strategies minimizing the synchronization time. This emerged from recent applications in discovery of wireless devices. We model the problem as follows. There are $N$ channels, each of which is open (unoccupied) with probability $p_1, p_2, q$ independently for Alice, Bob and Eve respectively. Further assume that $N \gg 1/(p_1 p_2 q)$ to allow for sufficiently many open channels. Both Alice and Bob can detect which channels are locally open, and in every time-slot each of them chooses one such channel for an attempted sync. One aims for strategies that, with high probability over the environments, guarantee a shortest possible expected sync time depending only on the $p_i$'s and $q$. Here we provide a stationary strategy for Alice and Bob with a guaranteed expected sync time of $O(1 / (p_1 p_2 q^2))$ given that each party also has knowledge of $p_1, p_2, q$. When the parties are oblivious of these probabilities, analogous strategies incur a cost of a poly-log factor, i.e., $\tilde{O}(1 / (p_1 p_2 q^2))$. Furthermore, this performance guarantee is essentially optimal as we show that any stationary strategies of Alice and Bob have an expected sync time of at least $Ω(1/(p_1 p_2 q^2))$.

preprint, 2010, arXiv

Ranking with Submodular Valuations

We study the problem of ranking with submodular valuations. An instance of this problem consists of a ground set $[m]$, and a collection of $n$ monotone submodular set functions $f^1, \ldots, f^n$, where each $f^i: 2^{[m]} \to \mathbb{R}_+$. An additional ingredient of the input is a weight vector $w \in \mathbb{R}_+^n$. The objective is to find a linear ordering of the ground set elements that minimizes the weighted cover time of the functions. The cover time of a function is the minimal number of elements in the prefix of the linear ordering that form a set whose corresponding function value is greater than a unit threshold value. Our main contribution is an $O(\ln(1 / ε))$-approximation algorithm for the problem, where $ε$ is the smallest non-zero marginal value that any function may gain from some element. Our algorithm orders the elements using an adaptive residual updates scheme, which may be of independent interest. We also prove that the problem is $Ω(\ln(1 / ε))$-hard to approximate, unless P = NP. This implies that the outcome of our algorithm is optimal up to constant factors.
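The weighted cover time that this abstract minimizes can be evaluated for any fixed ordering. The sketch below only computes the objective and is not the adaptive residual updates algorithm. It assumes a function counts as covered once its value reaches the threshold 1 (the abstract's wording on strictness is not resolved here), and the two example functions and all names are hypothetical.

```python
def cover_time(order, f, threshold=1.0):
    """Smallest k such that f(first k elements of order) reaches the threshold.

    Returns None if the function is never covered by the ordering.
    Assumption of this sketch: 'covered' means f(prefix) >= threshold.
    """
    prefix = set()
    for k, element in enumerate(order, start=1):
        prefix.add(element)
        if f(frozenset(prefix)) >= threshold:
            return k
    return None


def weighted_cover_time(order, functions, weights):
    """Sum over functions of weight times cover time under the given ordering."""
    total = 0.0
    for f, w in zip(functions, weights):
        t = cover_time(order, f)
        if t is None:
            raise ValueError("some function is not covered by the full ordering")
        total += w * t
    return total


# Two monotone submodular functions on the ground set {0, 1, 2}.
f1 = lambda S: min(1.0, len(S & {0, 1}) / 2)   # needs both 0 and 1
f2 = lambda S: 1.0 if 2 in S else 0.0          # needs element 2 only
weights = [1.0, 3.0]

print(weighted_cover_time([0, 1, 2], [f1, f2], weights))
print(weighted_cover_time([2, 0, 1], [f1, f2], weights))
```

Placing element 2 first covers the heavily weighted function immediately, so the second ordering has the lower objective value in this toy instance.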
