Source author record

Tung Mai

Tung Mai appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, unclaimed source record

Computer Science and Game Theory; Machine Learning; Discrete Mathematics; Data Structures and Algorithms; Databases; math.CO; math.ST; Methodology; Statistics Theory

Catalog footprint

What is connected

12 works

9 topics

4 close collaborators

preprint, 2026, arXiv

Robust Stable Matchings: Dealing with Changes in Preferences

We study stable matchings that are robust to preference changes in the two-sided stable matching setting of Gale and Shapley [GS62]. Given two instances $A$ and $B$ on the same set of agents, a matching is said to be robust if it is stable under both instances. This notion captures desirable robustness properties in matching markets where preferences may evolve, be misreported, or be subject to uncertainty. While the classical theory of stable matchings reveals rich lattice, algorithmic, and polyhedral structure for a single instance, it is unclear which of these properties persist when stability is required across multiple instances. Our work initiates a systematic study of the structural and computational behavior of robust stable matchings under increasingly general models of preference changes. We analyze robustness under a hierarchy of perturbation models: 1. a single upward shift in one agent's preference list, 2. an arbitrary permutation change by a single agent, and 3. arbitrary preference changes by multiple agents on both sides. For each regime, we characterize when: 1. the set of robust stable matchings forms a sublattice, 2. the lattice of robust stable matchings admits a succinct Birkhoff partial order enabling efficient enumeration, 3. worker-optimal and firm-optimal robust stable matchings can be computed efficiently, and 4. the robust stable matching polytope is integral (by studying its LP formulation). We provide explicit counterexamples demonstrating where these structural and geometric properties break down, and complement these results with XP-time algorithms running in $O(n^k)$ time, parameterized by $k$, the number of agents whose preferences change. Our results precisely delineate the boundary between tractable and intractable cases for robust stable matchings.
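The robustness definition in this abstract (a matching that is stable under both instances $A$ and $B$) can be illustrated with a minimal sketch. It only checks a given matching by brute force and is not the paper's algorithm; the function names, the dictionary encoding of preference lists, and the assumption of complete lists with a perfect matching are all hypothetical choices made here for illustration.

```python
from typing import Dict, List, Tuple

Prefs = Dict[str, List[str]]          # agent -> preference list, best first
Instance = Tuple[Prefs, Prefs]        # (worker preferences, firm preferences)


def is_stable(matching: Dict[str, str], worker_prefs: Prefs, firm_prefs: Prefs) -> bool:
    """Return True if `matching` (worker -> firm) has no blocking pair.

    Assumes a perfect matching and complete preference lists.
    """
    partner_of_firm = {firm: worker for worker, firm in matching.items()}
    for worker, prefs in worker_prefs.items():
        for firm in prefs:
            if firm == matching[worker]:
                break  # every later firm is worse for this worker
            ranking = firm_prefs[firm]
            # The worker prefers `firm`; the pair blocks if `firm` also
            # prefers this worker to its current partner.
            if ranking.index(worker) < ranking.index(partner_of_firm[firm]):
                return False
    return True


def is_robust(matching: Dict[str, str], a: Instance, b: Instance) -> bool:
    """A matching is robust if it is stable under both instances."""
    return is_stable(matching, *a) and is_stable(matching, *b)


# Instance A, and instance B in which the single agent w2 permutes its list.
firms = {"f1": ["w1", "w2"], "f2": ["w1", "w2"]}
instance_a: Instance = ({"w1": ["f1", "f2"], "w2": ["f1", "f2"]}, firms)
instance_b: Instance = ({"w1": ["f1", "f2"], "w2": ["f2", "f1"]}, firms)

print(is_robust({"w1": "f1", "w2": "f2"}, instance_a, instance_b))
print(is_robust({"w1": "f2", "w2": "f1"}, instance_a, instance_b))
```

The check costs one pass over all worker-firm pairs per instance, so it is only a verifier; finding or enumerating robust matchings is the subject of the paper itself.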

preprint, 2022, arXiv

A Structural and Algorithmic Study of Stable Matching Lattices of "Nearby" Instances, with Applications

Recently MV18 identified and initiated work on the new problem of understanding structural relationships between the lattices of solutions of two "nearby" instances of stable matching. They also gave an application of their work to finding a robust stable matching. However, the types of changes they allowed in going from instance $A$ to $B$ were very restricted, namely any one agent executes an upward shift. In this paper, we allow any one agent to permute its preference list arbitrarily. Let $M_A$ and $M_B$ be the sets of stable matchings of the resulting pair of instances $A$ and $B$, and let $\mathcal{L}_A$ and $\mathcal{L}_B$ be the corresponding lattices of stable matchings. We prove that the matchings in $M_A \cap M_B$ form a sublattice of both $\mathcal{L}_A$ and $\mathcal{L}_B$ and those in $M_A \setminus M_B$ form a join semi-sublattice of $\mathcal{L}_A$. These properties enable us to obtain a polynomial time algorithm for not only finding a stable matching in $M_A \cap M_B$, but also for obtaining the partial order, as promised by Birkhoff's Representation Theorem, thereby enabling us to generate all matchings in this sublattice. Our algorithm also helps solve a version of the robust stable matching problem. We discuss another potential application, namely obtaining new insights into the incentive compatibility properties of the Gale-Shapley Deferred Acceptance Algorithm.

preprint, 2022, arXiv

Electra: Conditional Generative Model based Predicate-Aware Query Approximation

The goal of Approximate Query Processing (AQP) is to provide very fast but "accurate enough" results for costly aggregate queries, thereby improving user experience in interactive exploration of large datasets. Recently proposed Machine-Learning based AQP techniques can provide very low latency as query execution only involves model inference as compared to traditional query processing on database clusters. However, with an increase in the number of filtering predicates (WHERE clauses), the approximation error significantly increases for these methods. Analysts often use queries with a large number of predicates for insights discovery. Thus, maintaining low approximation error is important to prevent analysts from drawing misleading conclusions. In this paper, we propose ELECTRA, a predicate-aware AQP system that can answer analytics-style queries with a large number of predicates with much smaller approximation errors. ELECTRA uses a conditional generative model that learns the conditional distribution of the data and at runtime generates a small (~1000 rows) but representative sample, on which the query is executed to compute the approximate result. Our evaluations with four different baselines on three real-world datasets show that ELECTRA provides lower AQP error for a large number of predicates compared to baselines.

preprint, 2022, arXiv

Online Balanced Experimental Design

We consider the experimental design problem in an online environment, an important practical task for reducing the variance of estimates in randomized experiments which allows for greater precision, and in turn, improved decision making. In this work, we present algorithms that build on recent advances in online discrepancy minimization which accommodate both arbitrary treatment probabilities and multiple treatments. The proposed algorithms are computationally efficient, minimize covariate imbalance, and include randomization which enables robustness to misspecification. We provide worst case bounds on the expected mean squared error of the causal estimate and show that the proposed estimator is no worse than an implicit ridge regression, which are within a logarithmic factor of the best known results for offline experimental design. We conclude with a detailed simulation study showing favorable results relative to complete randomization as well as to offline methods for experimental design with time complexities exceeding our algorithm.

preprint, 2021, arXiv

Asymptotics of Ridge Regression in Convolutional Models

Understanding generalization and estimation error of estimators for simple models such as linear and generalized linear models has attracted a lot of attention recently. This is in part due to an interesting observation made in the machine learning community that highly over-parameterized neural networks achieve zero training error, and yet they are able to generalize well over the test samples. This phenomenon is captured by the so-called double descent curve, where the generalization error starts decreasing again after the interpolation threshold. A series of recent works tried to explain this phenomenon for simple models. In this work, we analyze the asymptotics of estimation error in ridge estimators for convolutional linear models. These convolutional inverse problems, also known as deconvolution, naturally arise in different fields such as seismology, imaging, and acoustics among others. Our results hold for a large class of input distributions that include i.i.d. features as a special case. We derive exact formulae for estimation error of ridge estimators that hold in a certain high-dimensional regime. We show the double descent phenomenon in our experiments for convolutional models and show that our theoretical results match the experiments.

preprint, 2021, arXiv

Fundamental Tradeoffs in Distributionally Adversarial Training

Adversarial training is among the most effective techniques to improve the robustness of models against adversarial perturbations. However, the full effect of this approach on models is not well understood. For example, while adversarial training can reduce the adversarial risk (prediction error against an adversary), it sometimes increases standard risk (generalization error when there is no adversary). Even more, such behavior is impacted by various elements of the learning problem, including the size and quality of training data, specific forms of adversarial perturbations in the input, model overparameterization, and the adversary's power, among others. In this paper, we focus on the distribution-perturbing adversary framework wherein the adversary can change the test distribution within a neighborhood of the training data distribution. The neighborhood is defined via Wasserstein distance between distributions and the radius of the neighborhood is a measure of the adversary's manipulative power. We study the tradeoff between standard risk and adversarial risk and derive the Pareto-optimal tradeoff, achievable over specific classes of models, in the infinite data limit with features dimension kept fixed. We consider three learning settings: 1) Regression with the class of linear models; 2) Binary classification under the Gaussian mixtures data model, with the class of linear classifiers; 3) Regression with the class of random features model (which can be equivalently represented as a two-layer neural network with random first-layer weights). We show that a tradeoff between standard and adversarial risk is manifested in all three settings. We further characterize the Pareto-optimal tradeoff curves and discuss how a variety of factors, such as features correlation, the adversary's power or the width of the two-layer neural network, would affect this tradeoff.

preprint, 2021, arXiv

Machine Unlearning via Algorithmic Stability

We study the problem of machine unlearning and identify a notion of algorithmic stability, Total Variation (TV) stability, which, we argue, is suitable for the goal of exact unlearning. For convex risk minimization problems, we design TV-stable algorithms based on noisy Stochastic Gradient Descent (SGD). Our key contribution is the design of corresponding efficient unlearning algorithms, which are based on constructing a (maximal) coupling of Markov chains for the noisy SGD procedure. To understand the trade-offs between accuracy and unlearning efficiency, we give upper and lower bounds on excess empirical and population risk of TV-stable algorithms for convex risk minimization. Our techniques generalize to arbitrary non-convex functions, and our algorithms are differentially private as well.

preprint, 2021, arXiv

Online Discrepancy Minimization via Persistent Self-Balancing Walks

We study the online discrepancy minimization problem for vectors in $\mathbb{R}^d$ in the oblivious setting where an adversary is allowed to fix the vectors $x_1, x_2, \ldots, x_n$ in arbitrary order ahead of time. We give an algorithm that maintains $O(\sqrt{\log(nd/\delta)})$ discrepancy with probability $1-\delta$, matching the lower bound given in [Bansal et al. 2020] up to an $O(\sqrt{\log \log n})$ factor in the high-probability regime. We also provide results for the weighted and multi-color versions of the problem.

preprint, 2020, arXiv

Stability-Preserving, Time-Efficient Mechanisms for School Choice in Two Rounds

We address the following dynamic version of the school choice question: a city, named City, admits students in two temporally-separated rounds, denoted $\mathcal{R}_1$ and $\mathcal{R}_2$. In round $\mathcal{R}_1$, the capacity of each school is fixed and mechanism $\mathcal{M}_1$ finds a student-optimal stable matching. In round $\mathcal{R}_2$, certain parameters change, e.g., new students move into the City or the City is happy to allocate extra seats to specific schools. We study a number of settings of this kind and give polynomial time algorithms for obtaining a stable matching for the new situations. It is well established that switching the school of a student midway, unsynchronized with her classmates, can cause traumatic effects. This fact guides us to two types of results, the first simply disallows any re-allocations in round $\mathcal{R}_2$, and the second asks for a stable matching that minimizes the number of re-allocations. For the latter, we prove that the stable matchings which minimize the number of re-allocations form a sublattice of the lattice of stable matchings. Observations about incentive compatibility are woven into these results. We also give a third type of results, namely proofs of NP-hardness for a mechanism for round $\mathcal{R}_2$ under certain settings.

preprint, 2016, arXiv

A Performance-Based Scheme for Pricing Resources in the Cloud

With the rapid growth of the cloud computing marketplace, the issue of pricing resources in the cloud has been the subject of much study in recent years. In this paper, we identify and study a new issue: how to price resources in the cloud so that the customer's risk is minimized, while at the same time ensuring that the provider accrues his fair share. We do this by correlating the revenue stream of the customer to the prices charged by the provider. We show that our mechanism is incentive compatible in that it is in the best interest of the customer to provide his true revenue as a function of the resources rented. We next add another restriction to the price function, i.e., that it be linear. This removes the distortion that creeps in when the customer has to pay more money for fewer resources. Our algorithms for both the schemes mentioned above are efficient.

preprint, 2016, arXiv

Convex Program Duality, Fisher Markets, and Nash Social Welfare

We study Fisher markets and the problem of maximizing the Nash social welfare (NSW), and show several closely related new results. In particular, we obtain:
-- A new integer program for the NSW maximization problem whose fractional relaxation has a bounded integrality gap. In contrast, the natural integer program has an unbounded integrality gap.
-- An improved, and tight, factor 2 analysis of the algorithm of [7]; in turn showing that the integrality gap of the above relaxation is at most 2. The approximation factor shown by [7] was $2e^{1/e} \approx 2.89$.
-- A lower bound of $e^{1/e}\approx 1.44$ on the integrality gap of this relaxation.
-- New convex programs for natural generalizations of linear Fisher markets and proofs that these markets admit rational equilibria.

These results were obtained by establishing connections between previously known disparate results, and they help uncover their mathematical underpinnings. We show a formal connection between the convex programs of Eisenberg and Gale and that of Shmyrev, namely that their duals are equivalent up to a change of variables. Both programs capture equilibria of linear Fisher markets. By adding suitable constraints to Shmyrev's program, we obtain a convex program that captures equilibria of the spending-restricted market model defined by [7] in the context of the NSW maximization problem. Further, adding certain integral constraints to this program we get the integer program for the NSW mentioned above.

The basic tool we use is convex programming duality. In the special case of convex programs with linear constraints (but convex objectives), we show a particularly simple way of obtaining dual programs, putting it almost at par with linear program duality. This simple way of finding duals has been used subsequently for many other applications.
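For orientation, the Eisenberg-Gale convex program referred to in this abstract is commonly written as below for a linear Fisher market. The notation is an assumption introduced here and does not come from the catalog record: $m_i$ is buyer $i$'s budget, $u_{ij}$ is buyer $i$'s utility per unit of good $j$, and $x_{ij}$ is the fraction of good $j$ allocated to buyer $i$, with each good normalized to unit supply.

$$
\begin{aligned}
\max_{x,\,u} \quad & \sum_i m_i \log u_i \\
\text{s.t.} \quad & u_i = \sum_j u_{ij}\, x_{ij} \quad \text{for every buyer } i, \\
& \sum_i x_{ij} \le 1 \quad \text{for every good } j, \\
& x_{ij} \ge 0 \quad \text{for all } i, j.
\end{aligned}
$$

The objective is concave and all constraints are linear, which is the special class of convex programs the abstract singles out for its simple duality rules.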

preprint, 2016, arXiv

New Convex Programs for Fisher's Market Model and its Generalizations

We present the following results pertaining to Fisher's market model:
- We give two natural generalizations of Fisher's market model: in model $M_1$, sellers can declare an upper bound on the money they wish to earn (and take back their unsold good), and in model $M_2$, buyers can declare an upper bound on the amount of utility they wish to derive (and take back the unused part of their money).
- We derive convex programs for the linear case of these two models by generalizing a convex program due to Shmyrev and the Eisenberg-Gale program, respectively.
- We generalize the Arrow-Hurwicz theorem to the linear case of these two models, hence deriving alternate convex programs.
- For the special class of convex programs having convex objective functions and linear constraints, we derive a simple set of rules for constructing the dual program (as simple as obtaining the dual of an LP). Using these rules we show a formal relationship between the two seemingly different convex programs for linear Fisher markets, due to Eisenberg-Gale and Shmyrev; the duals of these are the same, up to a change of variables.
