Researcher profile

MohammadTaghi Hajiaghayi

MohammadTaghi Hajiaghayi contributes to research discovery and scholarly infrastructure.

Researcher, affiliation not imported, open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
16 works
0 followers
9 topics
4 close collaborators


Published work

16 published items

preprint, 2026, arXiv

Decision Tree Learning on Product Spaces

Decision tree learning has long been a central topic in theoretical computer science, driven by its practical importance. A fundamental and widely used method for decision tree construction is the top-down greedy heuristic, which recursively splits on the most influential variable. Despite its empirical success, theoretical analysis of this heuristic has been limited. A recent breakthrough by Blanc et al. (ITCS, 2020) provided the first rigorous theoretical guarantees for the greedy approach, but only under the uniform distribution. We extend this analysis to the more general and practically relevant setting of arbitrary product distributions. Our main result shows that for any function $f$ computable by an optimal decision tree of size $s$, maximum depth $D_{\text{opt}}$, and average depth $Δ_{\text{opt}}$, the greedy heuristic constructs an $ε$-approximating tree whose size grows at most with $\exp\bigl(Δ_{\text{opt}} D_{\text{opt}} \log(e/ε)\bigr)$. In the special case where the optimal tree is a full binary tree, this bound improves upon the bound of Blanc et al. and holds under a strictly broader class of distributions. Moreover, we present an algorithm based on the top-down greedy heuristic that is entirely parameter-free -- it requires no prior knowledge of the optimal tree's size or depth -- offering a practical advantage over Blanc et al.'s method.
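The top-down greedy heuristic described above can be illustrated with a toy sketch. This is not the paper's algorithm: it enumerates the whole cube by brute force, it assumes the influence of variable $i$ is measured as $E[\mathrm{Var}_{x_i} f]$ under the product distribution with biases `p`, and the names `greedy_tree`, `influence` and `evaluate` are hypothetical.

```python
from itertools import product

def weight(x, p, skip):
    """Product-distribution mass of assignment x, ignoring the indices in `skip`."""
    w = 1.0
    for j, (bit, pj) in enumerate(zip(x, p)):
        if j not in skip:
            w *= pj if bit else 1.0 - pj
    return w

def completions(n, fixed):
    """Yield every full assignment that agrees with the partial assignment `fixed`."""
    free = [j for j in range(n) if j not in fixed]
    for bits in product((0, 1), repeat=len(free)):
        x = [0] * n
        for j, b in fixed.items():
            x[j] = b
        for j, b in zip(free, bits):
            x[j] = b
        yield tuple(x)

def influence(f, n, p, fixed, i):
    """E[Var_{x_i} f] on the subcube given by `fixed` (assumed influence notion)."""
    total = 0.0
    for x in completions(n, {**fixed, i: 0}):
        y = x[:i] + (1,) + x[i + 1:]  # same point with x_i flipped to 1
        if f(x) != f(y):
            total += weight(x, p, skip=set(fixed) | {i})
    return p[i] * (1.0 - p[i]) * total

def greedy_tree(f, n, p, fixed=None, depth=None):
    """Recursively split on the most influential free variable."""
    fixed = {} if fixed is None else fixed
    depth = n if depth is None else depth
    free = [j for j in range(n) if j not in fixed]
    scores = {j: influence(f, n, p, fixed, j) for j in free}
    best = max(scores, key=scores.get, default=None)
    if best is None or scores[best] == 0.0 or depth == 0:
        # Leaf: label with the weighted-majority value of f on this subcube.
        mass = {0: 0.0, 1: 0.0}
        for x in completions(n, fixed):
            mass[f(x)] += weight(x, p, skip=set(fixed))
        return 1 if mass[1] > mass[0] else 0
    return (best,
            greedy_tree(f, n, p, {**fixed, best: 0}, depth - 1),
            greedy_tree(f, n, p, {**fixed, best: 1}, depth - 1))

def evaluate(tree, x):
    """Follow the tree (var, low_subtree, high_subtree) down to a 0/1 leaf."""
    while isinstance(tree, tuple):
        var, lo, hi = tree
        tree = hi if x[var] else lo
    return tree
```

For example, with `f = lambda x: x[0] & x[1]` on three variables, the sketch never splits on the irrelevant third variable. The `depth` argument is a crude stand-in for a size budget and is also an assumption of this sketch.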

preprint, 2026, arXiv

Matroid Algorithms Under Size-Sensitive Independence Oracles

The standard oracle model for matroid algorithms assumes that each independence query can be answered in constant time, regardless of the size of the queried set. While this abstraction has underpinned much of the theoretical progress in matroid optimization, it masks the true computational effort required by these algorithms. In particular, for natural and widely studied classes such as graphic matroids, even a single independence query can require work linear in the size of the set, making the constant-time assumption implausible. We address this gap by introducing a size-sensitive cost model where the cost of a query $Q$ scales with $|Q|$. Nearly linear-time oracle implementations exist for broad families of matroids, and this refined abstraction therefore captures the true cost of query evaluation while allowing for a more faithful comparison between general matroids and their natural special cases. Within this framework we study three fundamental algorithmic tasks: finding a basis of a matroid, approximating its rank, and approximating its partition size. We establish tight results, proving nearly matching upper and lower bounds that show the optimal query cost is (up to logarithmic factors) quadratic in the size of the matroid. On the algorithmic side, our upper bounds are realized by explicit procedures that construct the desired solution. On the complexity side, our lower bounds are unconditional and already hold even for weaker distinguishing formulations of the problems. Finally, for matroids with maximum circuit size at most $c$, we show that the quadratic barrier can be broken, providing an algorithm that calculates the maximum-weight basis with expected query cost $\mathcal{O}(n^{2-1/c} \log n)$.
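To make the size-sensitive cost model concrete, here is a minimal sketch for a graphic matroid. It assumes a query $Q$ is charged exactly $|Q|$ (the abstract only says the cost scales with $|Q|$), and it uses the textbook greedy basis construction as a baseline, not the paper's algorithms. The names `GraphicOracle` and `greedy_basis` are hypothetical.

```python
class GraphicOracle:
    """Independence oracle for a graphic matroid: an edge set is independent iff it is a forest."""

    def __init__(self, num_vertices):
        self.n = num_vertices
        self.cost = 0  # total size-sensitive cost charged so far

    def independent(self, edges):
        self.cost += len(edges)  # assumption: charge |Q| per query instead of 1
        parent = list(range(self.n))

        def find(v):
            # Union-find lookup with path halving.
            while parent[v] != v:
                parent[v] = parent[parent[v]]
                v = parent[v]
            return v

        for u, v in edges:
            ru, rv = find(u), find(v)
            if ru == rv:
                return False  # this edge closes a cycle
            parent[ru] = rv
        return True


def greedy_basis(ground_set, oracle):
    """Textbook greedy: keep an element if the current basis stays independent."""
    basis = []
    for e in ground_set:
        if oracle.independent(basis + [e]):
            basis.append(e)
    return basis
```

On a ground set of $n$ elements with rank $r$, this baseline issues $n$ queries of size up to $r + 1$, so its size-sensitive cost can be on the order of $n \cdot r$ even though its query count is only $n$.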

preprint, 2026, arXiv

Networked Information Aggregation for Binary Classification

We study networked binary classification on a directed acyclic graph (DAG) where each agent observes only a subset of the feature columns of a shared dataset. Agents act sequentially along the DAG: each receives prediction columns from its parents (if any), augments its local features with these columns, fits a logistic predictor by minimizing binary cross-entropy (BCE), and forwards its prediction column to its outgoing neighbors. We ask whether this sequential distributed training procedure achieves information aggregation, meaning that some agent attains small excess loss compared to the best logistic predictor trained with access to all feature columns. This question was studied for linear regression under squared loss by Kearns, Roth, and Ryu (SODA 2026). Extending their guarantees to classification is nontrivial because their analysis relies on quadratic structure that does not directly transfer to BCE with a logistic link. We analyze the resulting sequential logit-passing protocol and prove: (i) an excess loss upper bound of $O(M/\sqrt{D})$ on depth-$D$ paths under the condition that every contiguous subsequence of $M$ agents collectively observes all features, and (ii) a close lower bound showing instances with excess loss of at least $Ω(k/D)$, where $k$ is the dimension of the feature space. Together, these results identify network depth as a fundamental bottleneck for information aggregation in networked logistic regression.
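The sequential protocol on a path can be sketched in plain Python as follows. Several details are assumptions made for illustration and are not taken from the paper: each agent minimizes BCE only approximately, by gradient descent with step halving; each agent starts from its parent's predictor (weight 1 on the received logit column); and the names `bce`, `fit_logistic` and `run_path` are hypothetical.

```python
import math

def bce(w, X, y):
    """Mean binary cross-entropy of the logistic predictor with weights w."""
    total = 0.0
    for row, label in zip(X, y):
        z = sum(wi * xi for wi, xi in zip(w, row))
        # log(1 + exp(z)) - label * z, written in a numerically stable form
        total += max(z, 0.0) + math.log1p(math.exp(-abs(z))) - label * z
    return total / len(y)

def fit_logistic(X, y, w0, steps=200):
    """Gradient descent with step halving; the loss never rises above bce(w0)."""
    w, lr = list(w0), 1.0
    loss = bce(w, X, y)
    for _ in range(steps):
        grad = [0.0] * len(w)
        for row, label in zip(X, y):
            z = sum(wi * xi for wi, xi in zip(w, row))
            err = 0.5 * (1.0 + math.tanh(z / 2.0)) - label  # sigmoid(z) - label
            for j, xj in enumerate(row):
                grad[j] += err * xj / len(y)
        cand = [wi - lr * g for wi, g in zip(w, grad)]
        cand_loss = bce(cand, X, y)
        if cand_loss <= loss:
            w, loss = cand, cand_loss
        else:
            lr /= 2.0  # reject the step and try a smaller one next time
    return w, loss

def run_path(columns_per_agent, data, y):
    """Agents on a path: each fits on its own columns plus its parent's logit column."""
    parent_logit, losses = None, []
    for cols in columns_per_agent:
        X = [[1.0] + [row[c] for c in cols] for row in data]  # bias + local features
        w0 = [0.0] * len(X[0])
        if parent_logit is not None:
            X = [x + [z] for x, z in zip(X, parent_logit)]
            w0.append(1.0)  # warm start: reproduce the parent's predictor exactly
        w, loss = fit_logistic(X, y, w0)
        parent_logit = [sum(wi * xi for wi, xi in zip(w, x)) for x in X]
        losses.append(loss)
    return losses
```

Because of the warm start, the training loss in this sketch is non-increasing along the path. How quickly it approaches the loss of the best predictor over all features is the question the abstract's bounds address.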

preprint, 2022, arXiv

Adaptive Massively Parallel Algorithms for Cut Problems

We study the Weighted Min Cut problem in the Adaptive Massively Parallel Computation (AMPC) model. In 2019, Behnezhad et al. [3] introduced the AMPC model as an extension of the Massively Parallel Computation (MPC) model. In the past decade, research on highly scalable algorithms has had significant impact on many massive systems. The MPC model, introduced in 2010 by Karloff et al. [16], which is an abstraction of famous practical frameworks such as MapReduce, Hadoop, Flume, and Spark, has been at the forefront of this research. While great strides have been taken to create highly efficient MPC algorithms for a range of problems, recent progress has been limited by the 1-vs-2 Cycle Conjecture [20], which postulates that the simple problem of distinguishing between one and two cycles requires $Ω(\log n)$ MPC rounds. In the AMPC model, each machine has adaptive read access to a distributed hash table even when communication is restricted (i.e., in the middle of a round). While remaining practical [4], this gives algorithms the power to bypass limitations like the 1-vs-2 Cycle Conjecture. We give the first sublogarithmic AMPC algorithm, requiring $O(\log\log n)$ rounds, for $(2+ε)$-approximate weighted Min Cut. Our algorithm is inspired by the divide and conquer approach of Ghaffari and Nowicki [11], which solves the $(2+ε)$-approximate weighted Min Cut problem in $O(\log n\log\log n)$ rounds of MPC using the classic result of Karger and Stein [15]. Our work is fully-scalable in the sense that the local memory of each machine is $O(n^ε)$ for any constant $0 < ε< 1$. There are no $o(\log n)$-round MPC algorithms for Min Cut in this memory regime assuming the 1-vs-2 Cycle Conjecture holds. The exponential speedup in AMPC is the result of decoupling the different layers of the divide and conquer algorithm and solving all layers in $O(1)$ rounds.

preprint, 2022, arXiv

Generalized Stochastic Matching

In this paper, we generalize the recently studied Stochastic Matching problem to more accurately model a significant medical process, kidney exchange, and several other applications. The Stochastic Matching problem studied so far is as follows: given a graph $G = (V, E)$, each edge is included in the realized subgraph of $G$ mutually independently with probability $p_e$, and the goal is to find a degree-bounded subgraph $Q$ of $G$ whose expected maximum matching approximates the expected maximum matching of the realized subgraph. This model does not account for vertex dropouts, which arise in several applications, e.g., in kidney exchange when donors or patients opt out of the exchange process, and in online freelancing and online dating when profiles are found to be fake. We therefore study a generalized model of Stochastic Matching in which vertices and edges are both realized independently with probabilities $p_v$ and $p_e$, respectively, which fits important applications more accurately than the previously studied model. We present the first algorithms and analysis for this generalization of the Stochastic Matching model and prove that they achieve good approximation ratios. In particular, we show that the approximation factor of a natural algorithm for this problem is at least $0.6568$ in unweighted graphs, and $1/2 + ε$ in weighted graphs for some constant $ε > 0$. We further improve our result for unweighted graphs to $2/3$ using edge degree constrained subgraphs (EDCS).
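The benchmark in this model, the expected maximum matching of the realized subgraph, can be computed exactly on tiny instances by enumerating every realization. The sketch below assumes a single vertex probability `p_v` and a single edge probability `p_e` shared by all vertices and edges, and the function names are hypothetical. It is exponential in the graph size and only illustrates the model, not the paper's algorithms.

```python
from itertools import product

def max_matching_size(edges):
    """Brute-force maximum matching size; only sensible for tiny edge lists."""
    if not edges:
        return 0
    (u, v), rest = edges[0], edges[1:]
    skip = max_matching_size(rest)
    compatible = [(a, b) for a, b in rest if a not in (u, v) and b not in (u, v)]
    return max(skip, 1 + max_matching_size(compatible))

def expected_max_matching(vertices, edges, p_v, p_e):
    """Exact expectation over independent vertex and edge realizations."""
    total = 0.0
    for v_keep in product((False, True), repeat=len(vertices)):
        prob_v = 1.0
        for keep in v_keep:
            prob_v *= p_v if keep else 1.0 - p_v
        if prob_v == 0.0:
            continue
        alive = {v for v, keep in zip(vertices, v_keep) if keep}
        for e_keep in product((False, True), repeat=len(edges)):
            prob = prob_v
            for keep in e_keep:
                prob *= p_e if keep else 1.0 - p_e
            # An edge is realized only if it and both of its endpoints survive.
            realized = [e for e, keep in zip(edges, e_keep)
                        if keep and e[0] in alive and e[1] in alive]
            total += prob * max_matching_size(realized)
    return total
```

For a single edge with `p_v = p_e = 0.5`, the edge is usable only when both endpoints and the edge itself are realized, which gives an expectation of $0.5^3 = 0.125$.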

preprint, 2022, arXiv

Improved Communication Complexity of Fault-Tolerant Consensus

Consensus is one of the most thoroughly studied problems in distributed computing, yet there are still complexity gaps that have not been bridged for decades. In particular, in the classical message-passing setting with process crashes, since the seminal works of Bar-Joseph and Ben-Or [1998] and Aspnes and Waarts [1996, 1998] in the previous century, there is still a fundamental unresolved question about the communication complexity of fast randomized Consensus against a (strong) adaptive adversary crashing processes arbitrarily online. The best known upper bound on the number of communication bits is $Θ(\frac{n^{3/2}}{\sqrt{\log{n}}})$ per process, while the best lower bound is $Ω(1)$. This is in contrast to randomized Consensus against a (weak) oblivious adversary, for which time-almost-optimal algorithms guarantee amortized $O(1)$ communication bits per process [GK-SODA-10]. We design an algorithm against an adaptive adversary that reduces the communication gap by a nearly linear factor to $O(\sqrt{n}\cdot\text{polylog } n)$ bits per process, while keeping almost-optimal (up to a factor of $O(\log^3 n)$) time complexity $O(\sqrt{n}\cdot\log^{5/2} n)$. More surprisingly, we show that this complexity can indeed be lowered further, but at the expense of increasing time complexity, i.e., there is a trade-off between communication complexity and time complexity. More specifically, our main Consensus algorithm allows reducing the communication complexity per process to any value from $\text{polylog } n$ to $O(\sqrt{n}\cdot\text{polylog } n)$, as long as Time $\times$ Communication $= O(n\cdot \text{polylog } n)$. Similarly, reducing time complexity requires more random bits per process, i.e., Time $\times$ Randomness $= O(n\cdot \text{polylog } n)$.

preprint, 2021, arXiv

Improved Hierarchical Clustering on Massive Datasets with Broad Guarantees

Hierarchical clustering is a stronger extension of one of today's most influential unsupervised learning methods: clustering. The goal of this method is to create a hierarchy of clusters, thus constructing cluster evolutionary history and simultaneously finding clusterings at all resolutions. We propose four traits of interest for hierarchical clustering algorithms: (1) empirical performance, (2) theoretical guarantees, (3) cluster balance, and (4) scalability. While a number of algorithms are designed to achieve one to two of these traits at a time, there exist none that achieve all four. Inspired by Bateni et al.'s scalable and empirically successful Affinity Clustering [NeurIPS 2017], we introduce Affinity Clustering's successor, Matching Affinity Clustering. Like its predecessor, Matching Affinity Clustering maintains strong empirical performance and uses Massively Parallel Communication as its distributed model. Designed to maintain provably balanced clusters, we show that our algorithm achieves good, constant factor approximations for Moseley and Wang's revenue and Cohen-Addad et al.'s value. We show Affinity Clustering cannot approximate either function. Along the way, we also introduce an efficient $k$-sized maximum matching algorithm in the MPC model.

preprint, 2020, arXiv

Almost Envy-freeness, Envy-rank, and Nash Social Welfare Matchings

Envy-free up to one good (EF1) and envy-free up to any good (EFX) are two well-known extensions of envy-freeness for the case of indivisible items. It is shown that EF1 can always be guaranteed for agents with subadditive valuations. In sharp contrast, it is unknown whether or not an EFX allocation always exists, even for four agents and additive valuations. In addition, the best approximation guarantee for EFX is $(ϕ-1) \simeq 0.61$ by Amanatidis et al. In order to find a middle ground to bridge this gap, in this paper we suggest another fairness criterion, namely envy-freeness up to a random good or EFR, which is weaker than EFX, yet stronger than EF1. For this notion, we provide a polynomial-time $0.73$-approximation allocation algorithm. For our algorithm, we introduce Nash Social Welfare Matching, which makes a connection between Nash Social Welfare and envy-freeness. We believe Nash Social Welfare Matching will find applications in future work.

preprint, 2020, arXiv

Approximating LCS in Linear Time: Beating the $\sqrt{n}$ Barrier

Longest common subsequence (LCS) is one of the most fundamental problems in combinatorial optimization. Apart from its theoretical importance, LCS has numerous applications in bioinformatics, revision control systems, and data comparison programs. Although a simple dynamic program computes LCS in quadratic time, it has recently been proven that the problem admits a conditional lower bound and may not be solvable in truly subquadratic time. In addition, LCS is notoriously hard with respect to approximation algorithms. Apart from a trivial sampling technique that obtains an $n^{x}$-approximate solution in time $O(n^{2-2x})$, nothing else is known for LCS. This is in sharp contrast to its dual problem, edit distance, for which several linear time solutions have been obtained in the past two decades.
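For reference, the simple quadratic-time dynamic program mentioned above can be written as follows. This is the standard textbook recurrence, not the paper's approximation algorithm, and the function name `lcs_length` is an illustrative choice.

```python
def lcs_length(a, b):
    """Length of a longest common subsequence of a and b in O(len(a) * len(b)) time."""
    # prev[j] holds the LCS length of the processed prefix of a and b[:j].
    prev = [0] * (len(b) + 1)
    for ch in a:
        cur = [0]
        for j, other in enumerate(b, start=1):
            if ch == other:
                cur.append(prev[j - 1] + 1)  # extend a common subsequence
            else:
                cur.append(max(prev[j], cur[j - 1]))  # drop a character from a or b
        prev = cur
    return prev[-1]
```

Keeping only two rows brings the memory down to $O(n)$, but the running time remains quadratic, which is the cost the approximation results aim to avoid.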

preprint, 2020, arXiv

Asymmetric Streaming Algorithms for Edit Distance and LCS

The edit distance (ED) and longest common subsequence (LCS) are two fundamental problems which quantify how similar two strings are to one another. In this paper, we consider these problems in the asymmetric streaming model introduced by Andoni et al. (FOCS'10) and Saks and Seshadhri (SODA'13). In this model we have random access to one string and streaming access to the other string. Our main contribution is a constant factor approximation algorithm for ED with memory $\tilde O(n^δ)$ for any constant $δ > 0$. In addition to this, we present an upper bound of $\tilde O_ε(\sqrt{n})$ on the memory needed to approximate ED or LCS within a factor $1+ε$. All our algorithms are deterministic and run in a single pass. For approximating ED within a constant factor, we discover yet another application of the triangle inequality, this time in the context of streaming algorithms. The triangle inequality has previously been used to obtain subquadratic time approximation algorithms for ED. Our technique is novel and elegantly utilizes the triangle inequality to save memory at the expense of an exponential increase in the runtime.

preprint, 2020, arXiv

Inverse Feature Learning: Feature learning based on Representation Learning of Error

This paper proposes inverse feature learning as a novel supervised feature learning technique that learns a set of high-level features for classification based on an error representation approach. The key contribution of this method is to learn the representation of error as high-level features, while current representation learning methods interpret error by loss functions which are obtained as a function of differences between the true labels and the predicted ones. One advantage of such a learning method is that the learned features for each class are independent of the learned features for other classes; therefore, this method can learn simultaneously, meaning that it can learn new classes without retraining. Error representation learning can also help with generalization and reduce the chance of over-fitting by adding a set of impactful features to the original data set, which capture the relationships between each instance and different classes through an error generation and analysis process. This method can be particularly effective in data sets where the instances of each class have diverse feature representations, or in data sets with imbalanced classes. The experimental results show that the proposed method results in significantly better performance compared to the state-of-the-art classification techniques for several popular data sets. We hope this paper can open a new path to utilize the proposed perspective of error representation learning in different feature learning domains.

preprint, 2020, arXiv

Stochastic Matching with Few Queries: $(1-\varepsilon)$ Approximation

Suppose that we are given an arbitrary graph $G=(V, E)$ and know that each edge in $E$ is going to be realized independently with some probability $p$. The goal in the stochastic matching problem is to pick a sparse subgraph $Q$ of $G$ such that the realized edges in $Q$, in expectation, include a matching that is approximately as large as the maximum matching among the realized edges of $G$. The maximum degree of $Q$ can depend on $p$, but not on the size of $G$. This problem has been subject to extensive studies over the years and the approximation factor has been improved from $0.5$ to $0.5001$ to $0.6568$ and eventually to $2/3$. In this work, we analyze a natural sampling-based algorithm and show that it can obtain all the way up to $(1-ε)$ approximation, for any constant $ε > 0$. A key component of our analysis, possibly of independent interest, is an algorithm that constructs a matching on a stochastic graph which, among other important properties, guarantees that each vertex is matched independently of the vertices that are sufficiently far away. This allows us to bypass a previously known barrier towards achieving $(1-ε)$ approximation based on the existence of dense Ruzsa-Szemerédi graphs.

preprint, 2013, arXiv

A Game-Theoretic Model Motivated by the DARPA Network Challenge

In this paper we propose a game-theoretic model to analyze events similar to the 2009 DARPA Network Challenge, which was organized by the Defense Advanced Research Projects Agency (DARPA) for exploring the roles that the Internet and social networks play in incentivizing wide-area collaborations. The challenge was to form a group that would be the first to find the locations of ten moored weather balloons across the United States. We consider a model in which $N$ people (who can form groups) are located in some topology with a fixed coverage volume around each person's geographical location. We consider various topologies where the players can be located, such as the $d$-dimensional Euclidean space and the vertices of a graph. A balloon is placed in the space and a group wins if it is the first one to report the location of the balloon. A larger team has a higher probability of finding the balloon, but we assume that the prize money is divided equally among the team members. Hence there is a competing tension to keep teams as small as possible. Risk aversion is the reluctance of a person to accept a bargain with an uncertain payoff rather than another bargain with a more certain, but possibly lower, expected payoff. In our model we consider the isoelastic utility function derived from the Arrow-Pratt measure of relative risk aversion. The main aim is to analyze the structures of the groups in Nash equilibria for our model. For the $d$-dimensional Euclidean space ($d\geq 1$) and the class of bounded degree regular graphs we show that in any Nash equilibrium the richest group (having maximum expected utility per person) covers a constant fraction of the total volume.
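As a reminder of what an isoelastic utility looks like, here is a minimal sketch of one common textbook normalization, $u(c) = (c^{1-η} - 1)/(1-η)$ for $η \neq 1$ and $u(c) = \ln c$ for $η = 1$, where $η$ is the constant coefficient of relative risk aversion. The paper may use a different normalization, and the function name is hypothetical.

```python
import math

def isoelastic_utility(c, eta):
    """Isoelastic (constant relative risk aversion) utility of a payoff c > 0."""
    if c <= 0:
        raise ValueError("payoff must be positive")
    if eta == 1.0:
        return math.log(c)  # limiting case as eta approaches 1
    return (c ** (1.0 - eta) - 1.0) / (1.0 - eta)
```

With `eta = 0` the utility is linear in the payoff (risk neutral), and larger `eta` makes the function more concave, so a smaller share of the prize costs a risk-averse player less utility than it would a risk-neutral one.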

preprint, 2012, arXiv

LP Rounding for k-Centers with Non-uniform Hard Capacities

In this paper we consider a generalization of the classical k-center problem with capacities. Our goal is to select k centers in a graph, and assign each node to a nearby center, so that we respect the capacity constraints on centers. The objective is to minimize the maximum distance a node has to travel to get to its assigned center. This problem is NP-hard even when centers have no capacity restrictions, in which case optimal factor-2 approximation algorithms are known. With capacities, when all centers have identical capacities, a 6-approximation is known, with no better lower bounds than for the infinite capacity version. While many generalizations and variations of this problem have been studied extensively, no progress was made on the capacitated version for a general capacity function. We develop the first constant factor approximation algorithm for this problem. Our algorithm uses an LP rounding approach and works for the case of non-uniform hard capacities, when multiple copies of a node may not be chosen; it can be extended to the case when there is a hard bound on the number of copies of a node that may be selected. In addition, we establish a lower bound on the integrality gap of 7 (5) for non-uniform (uniform) hard capacities. We also prove that if there is a $(3-ε)$-factor approximation for this problem then P=NP. Finally, for non-uniform soft capacities we present a much simpler 11-approximation algorithm, which we see as further evidence that hard capacities are much harder to deal with.
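For the uncapacitated case, one well-known factor-2 algorithm is farthest-first traversal (due to Gonzalez). The sketch below shows that baseline only. It ignores capacities entirely and is unrelated to the paper's LP rounding. The function name and the choice of the first point as the initial center are illustrative assumptions.

```python
def farthest_first(points, k, dist):
    """Pick k centers greedily; returns (center indices, covering radius)."""
    centers = [0]  # assumption: start from the first point (any start works)
    nearest = [dist(points[0], q) for q in points]  # distance to closest chosen center
    while len(centers) < k:
        # The next center is the point currently farthest from all chosen centers.
        nxt = max(range(len(points)), key=nearest.__getitem__)
        centers.append(nxt)
        nearest = [min(d, dist(points[nxt], q)) for d, q in zip(nearest, points)]
    return centers, max(nearest)
```

On the line points `[0, 1, 2, 10, 11, 12]` with `k = 2`, the sketch picks the points at 0 and 12 and covers everything within radius 2, while the optimal centers at 1 and 11 achieve radius 1, consistent with the factor-2 guarantee.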

preprint, 2011, arXiv

Parameterized Complexity of Problems in Coalitional Resource Games

Coalition formation is a key topic in multi-agent systems. Coalitions enable agents to achieve goals that they may not have been able to achieve on their own. Previous work has shown problems in coalitional games to be computationally hard. Wooldridge and Dunne (Artificial Intelligence 2006) studied the classical computational complexity of several natural decision problems in Coalitional Resource Games (CRG), games in which each agent is endowed with a set of resources and coalitions can bring about a set of goals if they are collectively endowed with the necessary amount of resources. The input of coalitional resource games bundles together several elements, e.g., the agent set Ag, the goal set G, the resource set R, etc. Shrot, Aumann and Kraus (AAMAS 2009) examine coalition formation problems in the CRG model using the theory of Parameterized Complexity. Their refined analysis shows that not all parts of the input act equally: some instances of the problem are indeed tractable while others still remain intractable. We answer an important question left open by Shrot, Aumann and Kraus by showing that the SC Problem (checking whether a Coalition is Successful) is W[1]-hard when parameterized by the size of the coalition. Then, via a single theme of reduction from SC, we are able to show that various problems related to resources, resource bounds and resource conflicts introduced by Wooldridge et al. are: (1) W[1]-hard or co-W[1]-hard when parameterized by the size of the coalition; (2) para-NP-hard or co-para-NP-hard when parameterized by |R|; (3) FPT when parameterized by either |G| or |Ag|+|R|.

preprint, 2010, arXiv

Prize-collecting Network Design on Planar Graphs

In this paper, we reduce Prize-Collecting Steiner TSP (PCTSP), Prize-Collecting Stroll (PCS), Prize-Collecting Steiner Tree (PCST), Prize-Collecting Steiner Forest (PCSF) and more generally Submodular Prize-Collecting Steiner Forest (SPCSF) on planar graphs (and more generally bounded-genus graphs) to the same problems on graphs of bounded treewidth. More precisely, we show any $α$-approximation algorithm for these problems on graphs of bounded treewidth gives an $(α+ ε)$-approximation algorithm for these problems on planar graphs (and more generally bounded-genus graphs), for any constant $ε > 0$. Since PCS, PCTSP, and PCST can be solved exactly on graphs of bounded treewidth using dynamic programming, we obtain PTASs for these problems on planar graphs and bounded-genus graphs. In contrast, we show PCSF is APX-hard to approximate on series-parallel graphs, which are planar graphs of treewidth at most 2. This result is interesting on its own because it gives the first provable hardness separation between prize-collecting and non-prize-collecting (regular) versions of the problems: regular Steiner Forest is known to be polynomially solvable on series-parallel graphs and admits a PTAS on graphs of bounded treewidth. An analogous hardness result can be shown for Euclidean PCSF. This ends the common belief that prize-collecting variants should not add any new hardness to the problems.