Researcher profile

Maria-Florina Balcan

Maria-Florina Balcan contributes to research discovery and scholarly infrastructure.

Researcher; affiliation not imported; open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging; Verification L1; unclaimed author
14 works
0 followers
6 topics
4 close collaborators

Published work

14 published items

preprint, 2022, arXiv

Improved Sample Complexity Bounds for Branch-and-Cut

Branch-and-cut is the most widely used algorithm for solving integer programs, employed by commercial solvers like CPLEX and Gurobi. Branch-and-cut has a wide variety of tunable parameters that have a huge impact on the size of the search tree that it builds, but are challenging to tune by hand. An increasingly popular approach is to use machine learning to tune these parameters: using a training set of integer programs from the application domain at hand, the goal is to find a configuration with strong predicted performance on future, unseen integer programs from the same domain. If the training set is too small, a configuration may have good performance over the training set but poor performance on future integer programs. In this paper, we prove sample complexity guarantees for this procedure, which bound how large the training set should be to ensure that for any configuration, its average performance over the training set is close to its expected future performance. Our guarantees apply to parameters that control the most important aspects of branch-and-cut: node selection, branching constraint selection, and cutting plane selection, and are sharper and more general than those found in prior research.

preprint, 2022, arXiv

Meta-Learning Adversarial Bandits

We study online learning with bandit feedback across multiple tasks, with the goal of improving average performance across tasks if they are similar according to some natural task-similarity measure. As the first to target the adversarial setting, we design a unified meta-algorithm that yields setting-specific guarantees for two important cases: multi-armed bandits (MAB) and bandit linear optimization (BLO). For MAB, the meta-algorithm tunes the initialization, step-size, and entropy parameter of the Tsallis-entropy generalization of the well-known Exp3 method, with the task-averaged regret provably improving if the entropy of the distribution over estimated optima-in-hindsight is small. For BLO, we learn the initialization, step-size, and boundary-offset of online mirror descent (OMD) with self-concordant barrier regularizers, showing that task-averaged regret varies directly with a measure induced by these functions on the interior of the action space. Our adaptive guarantees rely on proving that unregularized follow-the-leader combined with multiplicative weights is enough to online learn a non-smooth and non-convex sequence of affine functions of Bregman divergences that upper-bound the regret of OMD.
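For orientation, here is a minimal sketch of the textbook Exp3 update that the abstract's method generalizes. It is not the paper's Tsallis-entropy variant and does no meta-learning: the initialization is uniform and the step size is fixed. Both are the quantities the abstract says the meta-algorithm tunes. The class and method names are illustrative assumptions.

```python
import math
import random


class Exp3:
    """Textbook Exp3 for losses in [0, 1] (not the Tsallis-entropy variant)."""

    def __init__(self, n_arms, step_size, rng=None):
        self.n_arms = n_arms
        self.step_size = step_size  # fixed here; the paper learns it across tasks
        self.rng = rng or random.Random()
        self.cum_loss_est = [0.0] * n_arms  # uniform initialization (assumption)

    def probabilities(self):
        # Exponential weights on the cumulative loss estimates, shifted by
        # the minimum so the exponentials stay numerically stable.
        low = min(self.cum_loss_est)
        weights = [math.exp(-self.step_size * (est - low)) for est in self.cum_loss_est]
        total = sum(weights)
        return [w / total for w in weights]

    def select_arm(self):
        # Sample one arm from the current distribution.
        return self.rng.choices(range(self.n_arms), weights=self.probabilities())[0]

    def update(self, arm, loss):
        # Bandit feedback: only the pulled arm's loss is observed, so it is
        # divided by the pull probability to keep the estimate unbiased.
        p = self.probabilities()[arm]
        self.cum_loss_est[arm] += loss / p
```

A round consists of `arm = learner.select_arm()`, observing that arm's loss, then `learner.update(arm, loss)`.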

preprint, 2022, arXiv

Robustly-reliable learners under poisoning attacks

Data poisoning attacks, in which an adversary corrupts a training set with the goal of inducing specific desired mistakes, have raised substantial concern: even just the possibility of such an attack can make a user no longer trust the results of a learning system. In this work, we show how to achieve strong robustness guarantees in the face of such attacks across multiple axes. We provide robustly-reliable predictions, in which the predicted label is guaranteed to be correct so long as the adversary has not exceeded a given corruption budget, even in the presence of instance targeted attacks, where the adversary knows the test example in advance and aims to cause a specific failure on that example. Our guarantees are substantially stronger than those in prior approaches, which were only able to provide certificates that the prediction of the learning algorithm does not change, as opposed to certifying that the prediction is correct, as we are able to achieve in our work. Remarkably, we provide a complete characterization of learnability in this setting, in particular, nearly-tight matching upper and lower bounds on the region that can be certified, as well as efficient algorithms for computing this region given an ERM oracle. Moreover, for the case of linear separators over logconcave distributions, we provide efficient truly polynomial time algorithms (i.e., non-oracle algorithms) for such robustly-reliable predictions. We also extend these results to the active setting where the algorithm adaptively asks for labels of specific informative examples, and the difficulty is that the adversary might even be adaptive to this interaction, as well as to the agnostic learning setting where there is no perfect classifier even over the uncorrupted data.

preprint, 2022, arXiv

Structural Analysis of Branch-and-Cut and the Learnability of Gomory Mixed Integer Cuts

The incorporation of cutting planes within the branch-and-bound algorithm, known as branch-and-cut, forms the backbone of modern integer programming solvers. These solvers are the foremost method for solving discrete optimization problems and thus have a vast array of applications in machine learning, operations research, and many other fields. Choosing cutting planes effectively is a major research topic in the theory and practice of integer programming. We conduct a novel structural analysis of branch-and-cut that pins down how every step of the algorithm is affected by changes in the parameters defining the cutting planes added to the input integer program. Our main application of this analysis is to derive sample complexity guarantees for using machine learning to determine which cutting planes to apply during branch-and-cut. These guarantees apply to infinite families of cutting planes, such as the family of Gomory mixed integer cuts, which are responsible for the main breakthrough speedups of integer programming solvers. We exploit geometric and combinatorial structure of branch-and-cut in our analysis, which provides a key missing piece for the recent generalization theory of branch-and-cut.

preprint, 2021, arXiv

Scalable and Provably Accurate Algorithms for Differentially Private Distributed Decision Tree Learning

This paper introduces the first provably accurate algorithms for differentially private, top-down decision tree learning in the distributed setting (Balcan et al., 2012). We propose DP-TopDown, a general privacy preserving decision tree learning algorithm, and present two distributed implementations. Our first method NoisyCounts naturally extends the single machine algorithm by using the Laplace mechanism. Our second method LocalRNM significantly reduces communication and added noise by performing local optimization at each data holder. We provide the first utility guarantees for differentially private top-down decision tree learning in both the single machine and distributed settings. These guarantees show that the error of the privately-learned decision tree quickly goes to zero provided that the dataset is sufficiently large. Our extensive experiments on real datasets illustrate the trade-offs of privacy, accuracy and generalization when learning private decision trees in the distributed setting.
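The Laplace mechanism mentioned above can be sketched as follows. This shows only the generic noisy-count primitive, not the paper's NoisyCounts or DP-TopDown procedures. The function names and the fixed `label_domain` argument are assumptions for illustration. The sketch assumes neighboring datasets differ by adding or removing one record, so each count has sensitivity 1.

```python
import random


def laplace_noise(scale, rng):
    """Draw one Laplace(0, scale) sample.

    The difference of two independent exponentials with mean `scale`
    follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_counts(labels, label_domain, epsilon, rng):
    """Return a count per label in `label_domain`, each with Laplace(1/epsilon) noise.

    Counting over a fixed, public label domain (an assumption of this
    sketch) avoids revealing which labels happen to be present.
    """
    counts = {label: 0 for label in label_domain}
    for label in labels:
        counts[label] += 1
    return {label: c + laplace_noise(1.0 / epsilon, rng) for label, c in counts.items()}
```

Smaller values of `epsilon` give stronger privacy and noisier counts. This is the privacy and accuracy trade-off the abstract refers to.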

preprint, 2020, arXiv

Learning piecewise Lipschitz functions in changing environments

Optimization in the presence of sharp (non-Lipschitz), unpredictable (w.r.t. time and amount) changes is a challenging and largely unexplored problem of great significance. We consider the class of piecewise Lipschitz functions, which is the most general online setting considered in the literature for the problem, and arises naturally in various combinatorial algorithm selection problems where utility functions can have sharp discontinuities. The usual performance metric of $\mathit{static}$ regret minimizes the gap between the payoff accumulated and that of the best fixed point for the entire duration, and thus fails to capture changing environments. Shifting regret is a useful alternative, which allows for up to $s$ environment shifts. In this work we provide an $O(\sqrt{sdT\log T}+sT^{1-β})$ regret bound for $β$-dispersed functions, where $β$ roughly quantifies the rate at which discontinuities appear in the utility functions in expectation (typically $β\ge1/2$ in problems of practical interest). We also present a lower bound tight up to sub-logarithmic factors. We further obtain improved bounds when selecting from a small pool of experts. We empirically demonstrate a key application of our algorithms to online clustering problems on popular benchmarks.
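As a worked instance of the stated bound, take the boundary case $β = 1/2$, which the abstract describes as typical. The substitution below is arithmetic on the abstract's formula only, not an additional result from the paper.

```latex
O\!\left(\sqrt{sdT\log T} + sT^{1-\beta}\right)\Big|_{\beta = 1/2}
  = O\!\left(\sqrt{sdT\log T} + s\sqrt{T}\right)
```

Both terms are sublinear in $T$ for a fixed number of shifts $s$. The second term dominates the first only when $s > d\log T$.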

preprint, 2012, arXiv

Active Property Testing

One of the motivations for property testing of boolean functions is the idea that testing can serve as a preprocessing step before learning. However, in most machine learning applications, it is not possible to request labels for fictitious examples constructed by the algorithm. Instead, the dominant query paradigm in applied machine learning, called active learning, is one where the algorithm may query for labels, but only on points in a given polynomial-sized (unlabeled) sample, drawn from some underlying distribution D. In this work, we bring this well-studied model in learning to the domain of testing. We show that for a number of important properties, testing can still yield substantial benefits in this setting. This includes testing unions of intervals, testing linear separators, and testing various assumptions used in semi-supervised learning. In addition to these specific results, we also develop a general notion of the testing dimension of a given property with respect to a given distribution. We show this dimension characterizes (up to constant factors) the intrinsic number of label requests needed to test that property. We develop such notions for both the active and passive testing models. We then use these dimensions to prove a number of lower bounds, including for linear separators and the class of dictator functions. Our results show that testing can be a powerful tool in realistic models for learning, and further that active testing exhibits an interesting and rich structure. Our work in addition brings together tools from a range of areas including U-statistics, noise-sensitivity, self-correction, and spectral analysis of random matrices, and develops new tools that may be of independent interest.

preprint, 2012, arXiv

Distributed Learning, Communication Complexity and Privacy

We consider the problem of PAC-learning from distributed data and analyze fundamental communication complexity questions involved. We provide general upper and lower bounds on the amount of communication needed to learn well, showing that in addition to VC-dimension and covering number, quantities such as the teaching-dimension and mistake-bound of a class play an important role. We also present tight results for a number of common concept classes including conjunctions, parity functions, and decision lists. For linear separators, we show that for non-concentrated distributions, we can use a version of the Perceptron algorithm to learn with much less communication than the number of updates given by the usual margin bound. We also show how boosting can be performed in a generic manner in the distributed setting to achieve communication with only logarithmic dependence on 1/epsilon for any concept class, and demonstrate how recent work on agnostic learning from class-conditional queries can be used to achieve low communication in agnostic settings as well. We additionally present an analysis of privacy, considering both differential privacy and a notion of distributional privacy that is especially appealing in this context.

preprint, 2012, arXiv

Finding Endogenously Formed Communities

A central problem in e-commerce is determining overlapping communities among individuals or objects in the absence of external identification or tagging. We address this problem by introducing a framework that captures the notion of communities or clusters determined by the relative affinities among their members. To this end we define what we call an affinity system, which is a set of elements, each with a vector characterizing its preference for all other elements in the set. We define a natural notion of (potentially overlapping) communities in an affinity system, in which the members of a given community collectively prefer each other to anyone else outside the community. Thus these communities are endogenously formed in the affinity system and are "self-determined" or "self-certified" by its members. We provide a tight polynomial bound on the number of self-determined communities as a function of the robustness of the community. We present a polynomial-time algorithm for enumerating these communities. Moreover, we obtain a local algorithm with a strong stochastic performance guarantee that can find a community in time nearly linear in the size of the community. Social networks fit particularly naturally within the affinity system framework -- if we can appropriately extract the affinities from the relatively sparse yet rich information from social networks, our analysis then yields a set of efficient algorithms for enumerating self-determined communities in social networks. In the context of social networks we also connect our analysis with results about $(α,β)$-clusters introduced by Mishra, Schreiber, Stanton, and Tarjan [msst]. In contrast with the polynomial bound we prove on the number of communities in the affinity system model, we show that there exists a family of networks with a superpolynomial number of $(α,β)$-clusters.

preprint, 2012, arXiv

Nash Equilibria in Perturbation Resilient Games

Motivated by the fact that in many game-theoretic settings, the game analyzed is only an approximation to the game being played, in this work we analyze equilibrium computation for the broad and natural class of bimatrix games that are stable to perturbations. We specifically focus on games with the property that small changes in the payoff matrices do not cause the Nash equilibria of the game to fluctuate wildly. For such games we show how one can compute approximate Nash equilibria more efficiently than the general result of Lipton et al. [LMM03], by an amount that depends on the degree of stability of the game and that reduces to their bound in the worst case. Furthermore, we show that for stable games the approximate equilibria found will be close in variation distance to true equilibria, and moreover this holds even if we are given as input only a perturbation of the actual underlying stable game. For uniformly-stable games, where the equilibria fluctuate at most quasi-linearly in the extent of the perturbation, we get a particularly dramatic improvement. Here, we achieve a fully quasi-polynomial-time approximation scheme: that is, we can find $1/\mathrm{poly}(n)$-approximate equilibria in quasi-polynomial time. This is in marked contrast to the general class of bimatrix games for which finding such approximate equilibria is PPAD-hard. In particular, under the (widely believed) assumption that PPAD is not contained in quasi-polynomial time, our results imply that such uniformly stable games are inherently easier for computation of approximate equilibria than general bimatrix games.

preprint, 2011, arXiv

Efficient Clustering with Limited Distance Information

Given a point set S and an unknown metric d on S, we study the problem of efficiently partitioning S into k clusters while querying few distances between the points. In our model we assume that we have access to one versus all queries that given a point s in S return the distances between s and all other points. We show that given a natural assumption about the structure of the instance, we can efficiently find an accurate clustering using only O(k) distance queries. Our algorithm uses an active selection strategy to choose a small set of points that we call landmarks, and considers only the distances between landmarks and other points to produce a clustering. We use our algorithm to cluster proteins by sequence similarity. This setting nicely fits our model because we can use a fast sequence database search program to query a sequence against an entire dataset. We conduct an empirical study that shows that even though we query a small fraction of the distances between the points, we produce clusterings that are close to a desired clustering given by manual classification.
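The query model described above can be illustrated with a simple farthest-first heuristic. This is a swapped-in stand-in, not the paper's algorithm, and it carries none of the paper's guarantees. The names `landmark_clustering` and `one_vs_all` are hypothetical. The sketch only shows that k landmarks can be chosen and every point assigned using exactly k one-versus-all queries.

```python
def landmark_clustering(n, k, one_vs_all):
    """Pick k landmarks farthest-first, then assign points to the nearest landmark.

    `one_vs_all(i)` is a hypothetical query returning a list of the
    distances from point i to all n points. It is called exactly k times.
    """
    landmarks = [0]  # arbitrary first landmark (an assumption of this sketch)
    dist_rows = [one_vs_all(0)]
    # min_dist[j] is the distance from point j to its closest landmark so far.
    min_dist = list(dist_rows[0])
    while len(landmarks) < k:
        # The next landmark is the point farthest from all current landmarks.
        nxt = max(range(n), key=lambda j: min_dist[j])
        landmarks.append(nxt)
        row = one_vs_all(nxt)
        dist_rows.append(row)
        min_dist = [min(a, b) for a, b in zip(min_dist, row)]
    # Assign each point using only the landmark-to-point distances.
    labels = [min(range(k), key=lambda c: dist_rows[c][j]) for j in range(n)]
    return landmarks, labels
```

In the protein setting of the abstract, `one_vs_all` would correspond to one sequence database search against the whole dataset.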

preprint, 2011, arXiv

Near Optimality in Covering and Packing Games by Exposing Global Information

Covering and packing problems can be modeled as games to encapsulate interesting social and engineering settings. These games have a high Price of Anarchy in their natural formulation. However, existing research applicable to specific instances of these games has only been able to prove fast convergence to arbitrary equilibria. This paper studies general classes of covering and packing games with learning dynamics models that incorporate a central authority who broadcasts weak, socially beneficial signals to agents that otherwise only use local information in their decision-making. Rather than illustrating convergence to an arbitrary equilibrium that may have very high social cost, we show that these systems quickly achieve near-optimal performance. In particular, we show that in the public service advertising model, reaching a small constant fraction of the agents is enough to bring the system to a state within a log n factor of optimal in a broad class of set cover and set packing games or a constant factor of optimal in the special cases of vertex cover and maximum independent set, circumventing social inefficiency of bad local equilibria that could arise without a central authority. We extend these results to the learn-then-decide model, in which agents use any of a broad class of learning algorithms to decide in a given round whether to behave according to locally optimal behavior or the behavior prescribed by the broadcast signal. The new techniques we use for analyzing these games could be of broader interest for analyzing more general classic optimization problems in a distributed fashion.

preprint, 2011, arXiv

Robust Interactive Learning

In this paper we propose and study a generalization of the standard active-learning model where a more general type of query, class conditional query, is allowed. Such queries have been quite useful in applications, but have been lacking theoretical understanding. In this work, we characterize the power of such queries under two well-known noise models. We give nearly tight upper and lower bounds on the number of queries needed to learn both for the general agnostic setting and for the bounded noise model. We further show that our methods can be made adaptive to the (unknown) noise rate, with only negligible loss in query complexity.

preprint, 2010, arXiv

Sequential item pricing for unlimited supply

We investigate the extent to which price updates can increase the revenue of a seller with little prior information on demand. We study prior-free revenue maximization for a seller with unlimited supply of n item types facing m myopic buyers present for k < log n days. For the static (k = 1) case, Balcan et al. [2] show that one random item price (the same on each item) yields revenue within a Θ(log m + log n) factor of optimum and this factor is tight. We define the hereditary maximizers property of buyer valuations (satisfied by any multi-unit or gross substitutes valuation) that is sufficient for a significant improvement of the approximation factor in the dynamic (k > 1) setting. Our main result is a non-increasing, randomized schedule of k equal item prices with expected revenue within a O((log m + log n) / k) factor of optimum for private valuations with hereditary maximizers. This factor is almost tight: we show that any pricing scheme over k days has a revenue approximation factor of at least (log m + log n) / (3k). We obtain analogous matching lower and upper bounds of Θ((log n) / k) if all valuations have the same maximum. We expect our upper bound technique to be of broader interest; for example, it can significantly improve the result of Akhlaghpour et al. [1]. We also initiate the study of revenue maximization given allocative externalities (i.e. influences) between buyers with combinatorial valuations. We provide a rather general model of positive influence of others' ownership of items on a buyer's valuation. For affine, submodular externalities and valuations with hereditary maximizers we present an influence-and-exploit (Hartline et al. [13]) marketing strategy based on our algorithm for private valuations. This strategy preserves our approximation factor, despite an affine increase (due to externalities) in the optimum revenue.