Source author record

Benjamin Moseley

Benjamin Moseley appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Data Structures and Algorithms Machine Learning Databases Performance Artificial Intelligence Computer Science and Game Theory Discrete Mathematics Distributed, Parallel, and Cluster Computing Information Theory math.CO math.IT Networking and Internet Architecture

Catalog footprint

What is connected

21works

12topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2022arXiv

Configuration Balancing for Stochastic Requests

The configuration balancing problem with stochastic requests generalizes many well-studied resource allocation problems such as load balancing and virtual circuit routing. In it, we have $m$ resources and $n$ requests. Each request has multiple possible configurations, each of which increases the load of each resource by some amount. The goal is to select one configuration for each request to minimize the makespan: the load of the most-loaded resource. In our work, we focus on a stochastic setting, where we only know the distribution for how each configuration increases the resource loads, learning the realized value only after a configuration is chosen. We develop both offline and online algorithms for configuration balancing with stochastic requests. When the requests are known offline, we give a non-adaptive policy for configuration balancing with stochastic requests that $O(\frac{\log m}{\log \log m})$-approximates the optimal adaptive policy. In particular, this closes the adaptivity gap for this problem as there is an asymptotically matching lower bound even for the very special case of load balancing on identical machines. When requests arrive online in a list, we give a non-adaptive policy that is $O(\log m)$ competitive. Again, this result is asymptotically tight due to information-theoretic lower bounds for very special cases (e.g., for load balancing on unrelated machines). Finally, we show how to leverage adaptivity in the special case of load balancing on related machines to obtain a constant-factor approximation offline and an $O(\log \log m)$-approximation online. A crucial technical ingredient in all of our results is a new structural characterization of the optimal adaptive policy that allows us to limit the correlations between its decisions.

preprint2022arXiv

Minimizing Completion Times for Stochastic Jobs via Batched Free Times

We study the classic problem of minimizing the expected total completion time of jobs on $m$ identical machines in the setting where the sizes of the jobs are stochastic. Specifically, the size of each job is a random variable whose distribution is known to the algorithm, but whose realization is revealed only after the job is scheduled. While minimizing the total completion time is easy in the deterministic setting, the stochastic problem has long been notorious: all known algorithms have approximation ratios that either depend on the variances, or depend linearly on the number of machines. We give an $\widetilde{O}(\sqrt{m})$-approximation for stochastic jobs which have Bernoulli processing times. This is the first approximation for this problem that is both independent of the variance in the job sizes, and is sublinear in the number of machines $m$. Our algorithm is based on a novel reduction from minimizing the total completion time to a natural makespan-like objective, which we call the weighted free time. We hope this free time objective will be useful in further improvements to this problem, as well as other stochastic scheduling problems.

preprint2022arXiv

On the Impossibility of Decomposing Binary Matroids

We show that there exist $k$-colorable matroids that are not $(b,c)$-decomposable when $b$ and $c$ are constants. A matroid is $(b,c)$-decomposable, if its ground set of elements can be partitioned into sets $X_1, X_2, \ldots, X_l$ with the following two properties. Each set $X_i$ has size at most $ck$. Moreover, for all sets $Y$ such that $|Y \cap X_i| \leq 1$ it is the case that $Y$ is $b$-colorable. A $(b,c)$-decomposition is a strict generalization of a partition decomposition and, thus, our result refutes a conjecture from arXiv:1911.10485v2 .

preprint2020arXiv

A Relational Gradient Descent Algorithm For Support Vector Machine Training

We consider gradient descent like algorithms for Support Vector Machine (SVM) training when the data is in relational form. The gradient of the SVM objective can not be efficiently computed by known techniques as it suffers from the ``subtraction problem''. We first show that the subtraction problem can not be surmounted by showing that computing any constant approximation of the gradient of the SVM objective function is $\#P$-hard, even for acyclic joins. We, however, circumvent the subtraction problem by restricting our attention to stable instances, which intuitively are instances where a nearly optimal solution remains nearly optimal if the points are perturbed slightly. We give an efficient algorithm that computes a ``pseudo-gradient'' that guarantees convergence for stable instances at a rate comparable to that achieved by using the actual gradient. We believe that our results suggest that this sort of stability the analysis would likely yield useful insight in the context of designing algorithms on relational data for other learning problems in which the subtraction problem arises.

preprint2020arXiv

An Objective for Hierarchical Clustering in Euclidean Space and its Connection to Bisecting K-means

This paper explores hierarchical clustering in the case where pairs of points have dissimilarity scores (e.g. distances) as a part of the input. The recently introduced objective for points with dissimilarity scores results in every tree being a 1/2 approximation if the distances form a metric. This shows the objective does not make a significant distinction between a good and poor hierarchical clustering in metric spaces. Motivated by this, the paper develops a new global objective for hierarchical clustering in Euclidean space. The objective captures the criterion that has motivated the use of divisive clustering algorithms: that when a split happens, points in the same cluster should be more similar than points in different clusters. Moreover, this objective gives reasonable results on ground-truth inputs for hierarchical clustering. The paper builds a theoretical connection between this objective and the bisecting k-means algorithm. This paper proves that the optimal 2-means solution results in a constant approximation for the objective. This is the first paper to show the bisecting k-means algorithm optimizes a natural global objective over the entire tree.

preprint2020arXiv

Approximate Aggregate Queries Under Additive Inequalities

We consider the problem of evaluating certain types of functional aggregation queries on relational data subject to additive inequalities. Such aggregation queries, with a smallish number of additive inequalities, arise naturally/commonly in many applications, particularly in learning applications. We give a relatively complete categorization of the computational complexity of such problems. We first show that the problem is NP-hard, even in the case of one additive inequality. Thus we turn to approximating the query. Our main result is an efficient algorithm for approximating, with arbitrarily small relative error, many natural aggregation queries with one additive inequality. We give examples of natural queries that can be efficiently solved using this algorithm. In contrast, we show that the situation with two additive inequalities is quite different, by showing that it is NP-hard to evaluate simple aggregation queries, with two additive inequalities, with any bounded relative error.

preprint2020arXiv

Dynamic Weighted Fairness with Minimal Disruptions

In this paper, we consider the following dynamic fair allocation problem: Given a sequence of job arrivals and departures, the goal is to maintain an approximately fair allocation of the resource against a target fair allocation policy, while minimizing the total number of disruptions, which is the number of times the allocation of any job is changed. We consider a rich class of fair allocation policies that significantly generalize those considered in previous work. We first consider the models where jobs only arrive, or jobs only depart. We present tight upper and lower bounds for the number of disruptions required to maintain a constant approximate fair allocation every time step. In particular, for the canonical case where jobs have weights and the resource allocation is proportional to the job's weight, we show that maintaining a constant approximate fair allocation requires $Θ(\log^* n)$ disruptions per job, almost matching the bounds in prior work for the unit weight case. For the more general setting where the allocation policy only decreases the allocation to a job when new jobs arrive, we show that maintaining a constant approximate fair allocation requires $Θ(\log n)$ disruptions per job. We then consider the model where jobs can both arrive and depart. We first show strong lower bounds on the number of disruptions required to maintain constant approximate fairness for arbitrary instances. In contrast we then show that there there is an algorithm that can maintain constant approximate fairness with $O(1)$ expected disruptions per job if the weights of the jobs are independent of the jobs arrival and departure order. We finally show how our results can be extended to the setting with multiple resources.

preprint2020arXiv

Fair Hierarchical Clustering

As machine learning has become more prevalent, researchers have begun to recognize the necessity of ensuring machine learning systems are fair. Recently, there has been an interest in defining a notion of fairness that mitigates over-representation in traditional clustering. In this paper we extend this notion to hierarchical clustering, where the goal is to recursively partition the data to optimize a specific objective. For various natural objectives, we obtain simple, efficient algorithms to find a provably good fair hierarchical clustering. Empirically, we show that our algorithms can find a fair hierarchical clustering, with only a negligible loss in the objective.

preprint2020arXiv

Fast Noise Removal for $k$-Means Clustering

This paper considers $k$-means clustering in the presence of noise. It is known that $k$-means clustering is highly sensitive to noise, and thus noise should be removed to obtain a quality solution. A popular formulation of this problem is called $k$-means clustering with outliers. The goal of $k$-means clustering with outliers is to discard up to a specified number $z$ of points as noise/outliers and then find a $k$-means solution on the remaining data. The problem has received significant attention, yet current algorithms with theoretical guarantees suffer from either high running time or inherent loss in the solution quality. The main contribution of this paper is two-fold. Firstly, we develop a simple greedy algorithm that has provably strong worst case guarantees. The greedy algorithm adds a simple preprocessing step to remove noise, which can be combined with any $k$-means clustering algorithm. This algorithm gives the first pseudo-approximation-preserving reduction from $k$-means with outliers to $k$-means without outliers. Secondly, we show how to construct a coreset of size $O(k \log n)$. When combined with our greedy algorithm, we obtain a scalable, near linear time algorithm. The theoretical contributions are verified experimentally by demonstrating that the algorithm quickly removes noise and obtains a high-quality clustering.

preprint2020arXiv

Functional Aggregate Queries with Additive Inequalities

Motivated by fundamental applications in databases and relational machine learning, we formulate and study the problem of answering functional aggregate queries (FAQ) in which some of the input factors are defined by a collection of additive inequalities between variables. We refer to these queries as FAQ-AI for short. To answer FAQ-AI in the Boolean semiring, we define relaxed tree decompositions and relaxed submodular and fractional hypertree width parameters. We show that an extension of the InsideOut algorithm using Chazelle's geometric data structure for solving the semigroup range search problem can answer Boolean FAQ-AI in time given by these new width parameters. This new algorithm achieves lower complexity than known solutions for FAQ-AI. It also recovers some known results in database query answering. Our second contribution is a relaxation of the set of polymatroids that gives rise to the counting version of the submodular width, denoted by #subw. This new width is sandwiched between the submodular and the fractional hypertree widths. Any FAQ and FAQ-AI over one semiring can be answered in time proportional to #subw and respectively to the relaxed version of #subw. We present three applications of our FAQ-AI framework to relational machine learning: k-means clustering, training linear support vector machines, and training models using non-polynomial loss. These optimization problems can be solved over a database asymptotically faster than computing the join of the database relations.

preprint2020arXiv

Greed Works -- Online Algorithms For Unrelated Machine Stochastic Scheduling

This paper establishes performance guarantees for online algorithms that schedule stochastic, nonpreemptive jobs on unrelated machines to minimize the expected total weighted completion time. Prior work on unrelated machine scheduling with stochastic jobs was restricted to the offline case, and required linear or convex programming relaxations for the assignment of jobs to machines. The algorithms introduced in this paper are purely combinatorial. The performance bounds are of the same order of magnitude as those of earlier work, and depend linearly on an upper bound on the squared coefficient of variation of the jobs' processing times. Specifically for deterministic processing times, without and with release times, the competitive ratios are 4 and 7.216, respectively. As to the technical contribution, the paper shows how dual fitting techniques can be used for stochastic and nonpreemptive scheduling problems.

preprint2020arXiv

Optimal Resource Allocation for Elastic and Inelastic Jobs

Modern data centers are tasked with processing heterogeneous workloads consisting of various classes of jobs. These classes differ in their arrival rates, size distributions, and job parallelizability. With respect to paralellizability, some jobs are elastic, meaning they can parallelize linearly across many servers. Other jobs are inelastic, meaning they can only run on a single server. Although job classes can differ drastically, they are typically forced to share a single cluster. When sharing a cluster among heterogeneous jobs, one must decide how to allocate servers to each job at every moment in time. In this paper, we design and analyze allocation policies which aim to minimize the mean response time across jobs, where a job's response time is the time from when it arrives until it completes. We model this problem in a stochastic setting where each job may be elastic or inelastic. Job sizes are drawn from exponential distributions, but are unknown to the system. We show that, in the common case where elastic jobs are larger on average than inelastic jobs, the optimal allocation policy is Inelastic-First, giving inelastic jobs preemptive priority over elastic jobs. We obtain this result by introducing a novel sample path argument. We also show that there exist cases where Elastic-First (giving priority to elastic jobs) performs better than Inelastic-First. We then provide the first analysis of mean response time under both Elastic-First and Inelastic-First by leveraging recent techniques for solving high-dimensional Markov chains.

preprint2020arXiv

Scheduling for Weighted Flow and Completion Times in Reconfigurable Networks

New optical technologies offer the ability to reconfigure network topologies dynamically, rather than setting them once and for all. This is true in both optical wide area networks (optical WANs) and in datacenters, despite the many differences between these two settings. Because of these new technologies, there has been a surge of both practical and theoretical research on algorithms to take advantage of them. In particular, Jia et al. [INFOCOM '17] designed online scheduling algorithms for dynamically reconfigurable topologies for both the makespan and sum of completion times objectives. In this paper, we work in the same setting but study an objective that is more meaningful in an online setting: the sum of flow times. The flow time of a job is the total amount of time that it spends in the system, which may be considerably smaller than its completion time if it is released late. We provide competitive algorithms for the online setting with speed augmentation, and also give a lower bound proving that speed augmentation is in fact necessary. As a side effect of our techniques, we also improve and generalize the results of Jia et al. on completion times by giving an $O(1)$-competitive algorithm for arbitrary sizes and release times even when nodes have different degree bounds, and moreover allow for the weighted sum of completion times (or flow times).

preprint2020arXiv

Structural Iterative Rounding for Generalized $k$-Median Problems

This paper considers approximation algorithms for generalized $k$-median problems. This class of problems can be informally described as $k$-median with a constant number of extra constraints, and includes $k$-median with outliers, and knapsack median. Our first contribution is a pseudo-approximation algorithm for generalized $k$-median that outputs a $6.387$-approximate solution, with a constant number of fractional variables. The algorithm builds on the iterative rounding framework introduced by Krishnaswamy, Li, and Sandeep for $k$-median with outliers. The main technical innovation is allowing richer constraint sets in the iterative rounding and taking advantage of the structure of the resulting extreme points. Using our pseudo-approximation algorithm, we give improved approximation algorithms for $k$-median with outliers and knapsack median. This involves combining our pseudo-approximation with pre- and post-processing steps to round a constant number of fractional variables at a small increase in cost. Our algorithms achieve approximation ratios $6.994 + ε$ and $6.387 + ε$ for $k$-median with outliers and knapsack median, respectively. These improve on the best-known approximation ratio $7.081 + ε$ for both problems \cite{DBLP:conf/stoc/KrishnaswamyLS18}.

preprint2020arXiv

Symmetric Linear Programming Formulations for Minimum Cut with Applications to TSP

We introduce multiple symmetric LP relaxations for minimum cut problems. The relaxations give optimal and approximate solutions when the input is a Hamiltonian cycle. We show that this leads to one of two interesting results. In one case, these LPs always give optimal and near optimal solutions, and then they would be the smallest known symmetric LPs for the problems considered. Otherwise, these LP formulations give strictly better LP relaxations for the traveling salesperson problem than the subtour relaxation. We have the smallest known LP formulation that is a 9/8-approximation or better for min-cut. In addition, the LP relaxation of min-cut investigated in this paper has interesting constraints; the LP contains only a single typical min-cut constraint and all other constraints are typically only used for max-cut relaxations.

preprint2013arXiv

Bargaining for Revenue Shares on Tree Trading Networks

We study trade networks with a tree structure, where a seller with a single indivisible good is connected to buyers, each with some value for the good, via a unique path of intermediaries. Agents in the tree make multiplicative revenue share offers to their parent nodes, who choose the best offer and offer part of it to their parent, and so on; the winning path is determined by who finally makes the highest offer to the seller. In this paper, we investigate how these revenue shares might be set via a natural bargaining process between agents on the tree, specifically, egalitarian bargaining between endpoints of each edge in the tree. We investigate the fixed point of this system of bargaining equations and prove various desirable for this solution concept, including (i) existence, (ii) uniqueness, (iii) efficiency, (iv) membership in the core, (v) strict monotonicity, (vi) polynomial-time computability to any given accuracy. Finally, we present numerical evidence that asynchronous dynamics with randomly ordered updates always converges to the fixed point, indicating that the fixed point shares might arise from decentralized bargaining amongst agents on the trade network.

preprint2013arXiv

The Complexity of Scheduling for p-norms of Flow and Stretch

We consider computing optimal k-norm preemptive schedules of jobs that arrive over time. In particular, we show that computing the optimal k-norm of flow schedule, is strongly NP-hard for k in (0, 1) and integers k in (1, infinity). Further we show that computing the optimal k-norm of stretch schedule, is strongly NP-hard for k in (0, 1) and integers k in (1, infinity).

preprint2012arXiv

Scalable K-Means++

Over half a century old and showing no signs of aging, k-means remains one of the most popular data processing algorithms. As is well-known, a proper initialization of k-means is crucial for obtaining a good final solution. The recently proposed k-means++ initialization algorithm achieves this, obtaining an initial set of centers that is provably close to the optimum solution. A major downside of the k-means++ is its inherent sequential nature, which limits its applicability to massive data: one must make k passes over the data to find a good initial set of centers. In this work we show how to drastically reduce the number of passes needed to obtain, in parallel, a good initialization. This is unlike prevailing efforts on parallelizing k-means that have mostly focused on the post-initialization phases of k-means. We prove that our proposed initialization algorithm k-means|| obtains a nearly optimal solution after a logarithmic number of passes, and then show that in practice a constant number of passes suffices. Experimental evaluation on real-world large-scale data demonstrates that k-means|| outperforms k-means++ in both sequential and parallel settings.

preprint2011arXiv

Fast Clustering using MapReduce

Clustering problems have numerous applications and are becoming more challenging as the size of the data increases. In this paper, we consider designing clustering algorithms that can be used in MapReduce, the most popular programming environment for processing large datasets. We focus on the practical and popular clustering problems, $k$-center and $k$-median. We develop fast clustering algorithms with constant factor approximation guarantees. From a theoretical perspective, we give the first analysis that shows several clustering algorithms are in $\mathcal{MRC}^0$, a theoretical MapReduce class introduced by Karloff et al. \cite{KarloffSV10}. Our algorithms use sampling to decrease the data size and they run a time consuming clustering algorithm such as local search or Lloyd's algorithm on the resulting data set. Our algorithms have sufficient flexibility to be used in practice since they run in a constant number of MapReduce rounds. We complement these results by performing experiments using our algorithms. We compare the empirical performance of our algorithms to several sequential and parallel algorithms for the $k$-median problem. The experiments show that our algorithms' solutions are similar to or better than the other algorithms' solutions. Furthermore, on data sets that are sufficiently large, our algorithms are faster than the other parallel algorithms that we tested.

preprint2010arXiv

Online Scheduling on Identical Machines using SRPT

Due to its optimality on a single machine for the problem of minimizing average flow time, Shortest-Remaining-Processing-Time (\srpt) appears to be the most natural algorithm to consider for the problem of minimizing average flow time on multiple identical machines. It is known that $\srpt$ achieves the best possible competitive ratio on multiple machines up to a constant factor. Using resource augmentation, $\srpt$ is known to achieve total flow time at most that of the optimal solution when given machines of speed $2- \frac{1}{m}$. Further, it is known that $\srpt$'s competitive ratio improves as the speed increases; $\srpt$ is $s$-speed $\frac{1}{s}$-competitive when $s \geq 2- \frac{1}{m}$. However, a gap has persisted in our understanding of $\srpt$. Before this work, the performance of $\srpt$ was not known when $\srpt$ is given $(1+\eps)$-speed when $0 < \eps < 1-\frac{1}{m}$, even though it has been thought that $\srpt$ is $(1+\eps)$-speed $O(1)$-competitive for over a decade. Resolving this question was suggested in Open Problem 2.9 from the survey "Online Scheduling" by Pruhs, Sgall, and Torng \cite{PruhsST}, and we answer the question in this paper. We show that $\srpt$ is \emph{scalable} on $m$ identical machines. That is, we show $\srpt$ is $(1+\eps)$-speed $O(\frac{1}{\eps})$-competitive for $\eps >0$. We complement this by showing that $\srpt$ is $(1+\eps)$-speed $O(\frac{1}{\eps^2})$-competitive for the objective of minimizing the $\ell_k$-norms of flow time on $m$ identical machines. Both of our results rely on new potential functions that capture the structure of \srpt. Our results, combined with previous work, show that $\srpt$ is the best possible online algorithm in essentially every aspect when migration is permissible.

preprint2010arXiv

Scheduling to Minimize Energy and Flow Time in Broadcast Scheduling

In this paper we initiate the study of minimizing power consumption in the broadcast scheduling model. In this setting there is a wireless transmitter. Over time requests arrive at the transmitter for pages of information. Multiple requests may be for the same page. When a page is transmitted, all requests for that page receive the transmission simulteneously. The speed the transmitter sends data at can be dynamically scaled to conserve energy. We consider the problem of minimizing flow time plus energy, the most popular scheduling metric considered in the standard scheduling model when the scheduler is energy aware. We will assume that the power consumed is modeled by an arbitrary convex function. For this problem there is a $Ω(n)$ lower bound. Due to the lower bound, we consider the resource augmentation model of Gupta \etal \cite{GuptaKP10}. Using resource augmentation, we give a scalable algorithm. Our result also gives a scalable non-clairvoyant algorithm for minimizing weighted flow time plus energy in the standard scheduling model.

Benjamin Moseley

What is connected

Connect this record

See the researcher in context

Building this map preview

21 published item(s)

Configuration Balancing for Stochastic Requests

Minimizing Completion Times for Stochastic Jobs via Batched Free Times

On the Impossibility of Decomposing Binary Matroids

A Relational Gradient Descent Algorithm For Support Vector Machine Training

An Objective for Hierarchical Clustering in Euclidean Space and its Connection to Bisecting K-means

Approximate Aggregate Queries Under Additive Inequalities

Dynamic Weighted Fairness with Minimal Disruptions

Fair Hierarchical Clustering

Fast Noise Removal for $k$-Means Clustering

Functional Aggregate Queries with Additive Inequalities

Greed Works -- Online Algorithms For Unrelated Machine Stochastic Scheduling

Optimal Resource Allocation for Elastic and Inelastic Jobs

Scheduling for Weighted Flow and Completion Times in Reconfigurable Networks

Structural Iterative Rounding for Generalized $k$-Median Problems

Symmetric Linear Programming Formulations for Minimum Cut with Applications to TSP

Bargaining for Revenue Shares on Tree Trading Networks

The Complexity of Scheduling for p-norms of Flow and Stretch

Scalable K-Means++

Fast Clustering using MapReduce

Online Scheduling on Identical Machines using SRPT

Scheduling to Minimize Energy and Flow Time in Broadcast Scheduling