Researcher profile

Anand Louis

Anand Louis contributes to research discovery and scholarly infrastructure.

Published work

9 published items

preprint, 2022, arXiv

$λ_\infty$ & Maximum Variance Embedding: Measuring and Optimizing Connectivity of A Graph Metric

Bobkov, Houdré, and the last author [2000] introduced a Poincaré-type functional parameter, $λ_\infty$, of a graph and related it to connectivity of the graph via Cheeger-type inequalities. A work by the second author, Raghavendra, and Vempala [2013] related the complexity of $λ_\infty$ to the so-called small-set expansion (SSE) problem and further set forth the desiderata for NP-hardness of this optimization problem. We confirm the conjecture that computing $λ_\infty$ is NP-hard for weighted trees. Beyond measuring connectivity, in many applications we want to optimize it. This, via convex duality, leads to a problem in machine learning known as the Maximum Variance Embedding (MVE). The output is a function from vertices to a low dim Euclidean space, subject to bounds on Euclidean distances between neighbors. The objective is to maximize output variance. Special cases of MVE into $n$ and $1$ dims lead to absolute algebraic connectivity [1990] and spread constant [1998], which measure connectivity of the graph and its Cartesian $n$-power, respectively. MVE has other applications in measuring diffusion speed and robustness of networks, clustering, and dimension reduction. We show that computing MVE in tree-width dims is NP-hard, while only one additional dim beyond the width of a given tree-decomposition makes the problem in P. We show that MVE of a tree in 2 dims defines a non-convex yet benign optimization landscape, i.e., local=global optima. We further develop a linear time combinatorial algorithm for this case. Finally, we show that approximate Maximum Variance Embedding is tractable in significantly lower dims. For trees and general graphs, for which Maximum Variance Embedding cannot be solved in less than $2$ and $Ω(n)$ dims, we provide $1+\varepsilon$ approximation algorithms for embedding into $1$ and $O(\log n /\varepsilon^2)$ dims, respectively.

preprint, 2022, arXiv

Approximating CSPs with Outliers

Constraint satisfaction problems (CSPs) are ubiquitous in theoretical computer science. We study the problem of StrongCSPs, i.e. instances where a large induced sub-instance has a satisfying assignment. More formally, given a CSP instance $Ψ(V, E, [k], \{Π_{ij}\}_{(i,j) \in E})$ consisting of a set of vertices $V$, a set of edges $E$, alphabet $[k]$, a constraint $Π_{ij} \subset [k] \times [k]$ for each $(i,j) \in E$, the goal of this problem is to compute the largest subset $S \subseteq V$ such that the instance induced on $S$ has an assignment that satisfies all the constraints. In this paper, we study approximation algorithms for Unique Games and related problems under the StrongCSP framework when the underlying constraint graph satisfies mild expansion properties. In particular, we show that given a Strong Unique Games instance whose optimal solution $S^*$ is supported on a regular low threshold rank graph, there exists an algorithm that runs in time exponential in the threshold rank, and recovers a large satisfiable sub-instance whose size is independent of the label set size and maximum degree of the graph. Our algorithm combines the techniques of Barak-Raghavendra-Steurer (FOCS'11), Guruswami-Sinop (FOCS'11) with several new ideas and runs in time exponential in the threshold rank of the optimal set. A key component of our algorithm is a new threshold rank based spectral decomposition, which is used to compute a "large" induced subgraph of "small" threshold rank; our techniques build on the work of Oveis Gharan and Rezaei (SODA'17) and could be of independent interest.
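To make the StrongCSP objective concrete, here is a minimal brute-force sketch of the problem statement itself. It is not the paper's algorithm: it enumerates subsets and assignments in exponential time and is only meant for tiny instances. The function names and the dictionary encoding of the constraints $Π_{ij}$ are assumptions made for illustration.

```python
from itertools import combinations, product


def satisfiable(subset, edges, constraints, k):
    """Return True if the sub-instance induced on `subset` has an
    assignment from {0, ..., k-1} satisfying every induced constraint."""
    subset = list(subset)
    members = set(subset)
    induced = [(i, j) for (i, j) in edges if i in members and j in members]
    for labels in product(range(k), repeat=len(subset)):
        assign = dict(zip(subset, labels))
        if all((assign[i], assign[j]) in constraints[(i, j)] for (i, j) in induced):
            return True
    return False


def largest_satisfiable_subset(vertices, edges, constraints, k):
    """Try subsets from largest to smallest; return the first satisfiable one."""
    for size in range(len(vertices), -1, -1):
        for subset in combinations(vertices, size):
            if satisfiable(subset, edges, constraints, k):
                return set(subset)
    return set()


# Hypothetical instance: a triangle with "labels must differ" constraints, k = 2.
# The triangle is an odd cycle, so one vertex has to be dropped.
NEQ = {(0, 1), (1, 0)}
triangle_edges = [(0, 1), (1, 2), (0, 2)]
triangle_constraints = {e: NEQ for e in triangle_edges}
best = largest_satisfiable_subset([0, 1, 2], triangle_edges, triangle_constraints, 2)
```

With $k = 2$ and inequality constraints, the vertices left out of the returned set form an odd cycle transversal, which is why the triangle loses exactly one vertex.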

preprint, 2022, arXiv

Exact recovery algorithm for Planted Bipartite Graph in Semi-random Graphs

The problem of finding the largest induced balanced bipartite subgraph in a given graph is NP-hard. This problem is closely related to the problem of finding the smallest Odd Cycle Transversal. In this work, we consider the following model of instances: starting with a set of vertices $V$, a set $S \subseteq V$ of $k$ vertices is chosen and an arbitrary $d$-regular bipartite graph is added on it; edges between pairs of vertices in $S \times (V \setminus S)$ and $(V \setminus S) \times (V \setminus S)$ are added with probability $p$. Since for $d=0$, the problem reduces to recovering a planted independent set, we don't expect efficient algorithms for $k=o(\sqrt{n})$. This problem is a generalization of the planted balanced biclique problem where the bipartite graph induced on $S$ is a complete bipartite graph; [Lev18] gave an algorithm for recovering $S$ in this problem when $k=Ω(\sqrt{n})$. Our main result is an efficient algorithm that recovers (w.h.p.) the planted bipartite graph when $k=Ω_p(\sqrt{n \log n})$ for a large range of parameters. Our results also hold for a natural semi-random model of instances, which involve the presence of a monotone adversary. Our proof shows that a natural SDP relaxation for the problem is integral by constructing an appropriate solution to its dual formulation. Our main technical contribution is a new approach for constructing the dual solution where we calibrate the eigenvectors of the adjacency matrix to be the eigenvectors of the dual matrix. We believe that this approach may have applications to other recovery problems in semi-random models as well. When $k=Ω(\sqrt{n})$, we give an algorithm for recovering $S$ whose running time is exponential in the number of small eigenvalues in the graph induced on $S$; this algorithm is based on subspace enumeration techniques due to the works of [KT07,ABS10,Kol11].

preprint, 2022, arXiv

Socially Fair Center-based and Linear Subspace Clustering

Center-based clustering (e.g., $k$-means, $k$-medians) and clustering using linear subspaces are two of the most popular techniques to partition real-world data into smaller clusters. However, when the data consists of sensitive demographic groups, significantly different clustering cost per point for different sensitive groups can lead to fairness-related harms (e.g., different quality-of-service). The goal of socially fair clustering is to minimize the maximum cost of clustering per point over all groups. In this work, we propose a unified framework to solve socially fair center-based clustering and linear subspace clustering, and give practical, efficient approximation algorithms for these problems. We do extensive experiments to show that on multiple benchmark datasets our algorithms either closely match or outperform state-of-the-art baselines.
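The socially fair objective described above can be sketched as a small evaluation function. This only computes the objective for given centers; it is not the paper's approximation algorithm. The squared Euclidean ($k$-means-style) cost, the function name, and the toy data are assumptions made for illustration.

```python
def socially_fair_cost(points, groups, centers):
    """Maximum over groups of the average per-point clustering cost.

    Assumption: cost of a point is its squared Euclidean distance to the
    nearest center (k-means style). `groups[i]` is the group of `points[i]`.
    """
    totals, counts = {}, {}
    for point, group in zip(points, groups):
        nearest = min(
            sum((a - b) ** 2 for a, b in zip(point, center)) for center in centers
        )
        totals[group] = totals.get(group, 0.0) + nearest
        counts[group] = counts.get(group, 0) + 1
    return max(totals[g] / counts[g] for g in totals)


# Hypothetical data: group "A" sits near the origin, group "B" is far away.
points = [(0.0, 0.0), (2.0, 0.0), (10.0, 0.0)]
groups = ["A", "A", "B"]

# A center placed to suit group A leaves group B with a high per-point cost.
cost_near_a = socially_fair_cost(points, groups, [(1.0, 0.0)])
# Adding a second center at B's point removes that disparity.
cost_two_centers = socially_fair_cost(points, groups, [(1.0, 0.0), (10.0, 0.0)])
```

A plain $k$-means objective would average over all points, so a small group with high cost can be hidden by a large group with low cost; taking the maximum of per-group averages is what exposes it.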

preprint, 2021, arXiv

Group Fairness for Knapsack Problems

We study the knapsack problem with group fairness constraints. The input of the problem consists of a knapsack of bounded capacity and a set of items; each item belongs to a particular category and has an associated weight and value. The goal of this problem is to select a subset of items such that all categories are fairly represented, the total weight of the selected items does not exceed the capacity of the knapsack, and the total value is maximized. We study fairness parameters such as the bounds on the total value of items from each category, the total weight of items from each category, and the total number of items from each category. We give approximation algorithms for these problems. These fairness notions could also be extended to the min-knapsack problem. The fair knapsack problems encompass various important problems, such as participatory budgeting, fair budget allocation, and advertising.
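As an illustration of one of the fairness notions above (bounds on the number of items chosen from each category), the following is a minimal brute-force sketch. It is exponential in the number of items and is not one of the paper's approximation algorithms; the function name, the tuple encoding of items, and the example numbers are assumptions.

```python
from itertools import combinations


def fair_knapsack(items, capacity, bounds):
    """Exhaustively solve a tiny knapsack with per-category count bounds.

    items:  list of (category, weight, value) tuples
    bounds: {category: (min_count, max_count)} on the number of chosen items
    Returns (best_value, chosen_indices), or (None, None) if infeasible.
    """
    best_value, best_set = None, None
    for r in range(len(items) + 1):
        for chosen in combinations(range(len(items)), r):
            if sum(items[i][1] for i in chosen) > capacity:
                continue  # violates the knapsack capacity
            counts = {category: 0 for category in bounds}
            for i in chosen:
                counts[items[i][0]] = counts.get(items[i][0], 0) + 1
            if any(not (lo <= counts[c] <= hi) for c, (lo, hi) in bounds.items()):
                continue  # violates a group fairness bound
            value = sum(items[i][2] for i in chosen)
            if best_value is None or value > best_value:
                best_value, best_set = value, chosen
    return best_value, best_set


# Hypothetical instance with two categories.
items = [("a", 2, 10), ("a", 3, 14), ("b", 4, 7), ("b", 1, 2)]
fair = fair_knapsack(items, 6, {"a": (1, 1), "b": (1, 2)})
unconstrained = fair_knapsack(items, 6, {})
```

In this toy instance the fairness bound forces exactly one item from category "a", so the fair optimum is lower than the unconstrained optimum.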

preprint, 2021, arXiv

On the Problem of Underranking in Group-Fair Ranking

Search and recommendation systems, such as search engines, recruiting tools, online marketplaces, news, and social media, output ranked lists of content, products, and sometimes, people. Credit ratings, standardized tests, risk assessments output only a score, but are also used implicitly for ranking. Bias in such ranking systems, especially among the top ranks, can worsen social and economic inequalities, polarize opinions, and reinforce stereotypes. On the other hand, a bias correction for minority groups can cause more harm if perceived as favoring group-fair outcomes over meritocracy. In this paper, we formulate the problem of underranking in group-fair rankings, which was not addressed in previous work. Most group-fair ranking algorithms post-process a given ranking and output a group-fair ranking. We define underranking based on how close the group-fair rank of each item is to its original rank, and prove a lower bound on the trade-off achievable for simultaneous underranking and group fairness in ranking. We give a fair ranking algorithm that takes any given ranking and outputs another ranking with simultaneous underranking and group fairness guarantees comparable to the lower bound we prove. Our algorithm works with group fairness constraints for any number of groups. Our experimental results confirm the theoretical trade-off between underranking and group fairness, and also show that our algorithm achieves the best of both when compared to the state-of-the-art baselines.

preprint, 2020, arXiv

Approximation Algorithms and Hardness for Strong Unique Games

The UNIQUE GAMES problem is a central problem in algorithms and complexity theory. Given an instance of UNIQUE GAMES, the STRONG UNIQUE GAMES problem asks to find the largest subset of vertices, such that the UNIQUE GAMES instance induced on them is completely satisfiable. In this paper, we give new algorithmic and hardness results for STRONG UNIQUE GAMES. Given an instance with label set size $k$ where a set of $(1 - ε)$-fraction of the vertices induce an instance that is completely satisfiable, our first algorithm produces a set of $1 - \widetilde{O}({k^2}) ε\sqrt{\log n}$ fraction of the vertices such that the UNIQUE GAMES induced on them is completely satisfiable. In the same setting, our second algorithm produces a set of $1 - \widetilde{O}({k^2}) \sqrt{ε\log d}$ (here $d$ is the largest vertex degree of the graph) fraction of the vertices such that the UNIQUE GAMES induced on them is completely satisfiable. The technical core of our results is a new connection between STRONG UNIQUE GAMES and Small-Set-Vertex-Expansion in graphs. Complementing this, assuming the Unique Games Conjecture, we prove that it is NP-hard to compute a set of size larger than $1 - Ω( \sqrt{ε\log k \log d})$ for which all the constraints induced on this set are satisfied. Given an undirected graph $G(V,E)$ the ODD CYCLE TRANSVERSAL problem asks to delete the least fraction of vertices to make the induced graph on the remaining vertices bipartite. As a corollary to our main algorithmic results, we obtain an algorithm that outputs a set $S$ such that the graph induced on $V \setminus S$ is bipartite, and $|S|/n \leq O(\sqrt{ε\log d})$ (here $d$ is the largest vertex degree and $ε$ is the optimal fraction of vertices that need to be deleted). Assuming the Unique Games Conjecture, we prove a matching (up to constant factors) hardness.

preprint, 2020, arXiv

Robust Identifiability in Linear Structural Equation Models of Causal Inference

In this work, we consider the problem of robust parameter estimation from observational data in the context of linear structural equation models (LSEMs). LSEMs are a popular and well-studied class of models for inferring causality in the natural and social sciences. One of the main problems related to LSEMs is to recover the model parameters from the observational data. Under various conditions on LSEMs and the model parameters the prior work provides efficient algorithms to recover the parameters. However, these results are often about generic identifiability. In practice, generic identifiability is not sufficient and we need robust identifiability: small changes in the observational data should not affect the parameters by a huge amount. Robust identifiability has received far less attention and remains poorly understood. Sankararaman et al. (2019) recently provided a set of sufficient conditions on parameters under which robust identifiability is feasible. However, a limitation of their work is that their results only apply to a small sub-class of LSEMs, called "bow-free paths." In this work, we significantly extend their work along multiple dimensions. First, for a large and well-studied class of LSEMs, namely "bow free" models, we provide a sufficient condition on model parameters under which robust identifiability holds, thereby removing the restriction of paths required by prior work. We then show that this sufficient condition holds with high probability which implies that for a large set of parameters robust identifiability holds and that for such parameters, existing algorithms already achieve robust identifiability. Finally, we validate our results on both simulated and real-world datasets.

preprint, 2020, arXiv

Stability of Linear Structural Equation Models of Causal Inference

We consider the numerical stability of the parameter recovery problem in Linear Structural Equation Model (LSEM) of causal inference. A long line of work starting from Wright (1920) has focused on understanding which sub-classes of LSEM allow for efficient parameter recovery. Despite decades of study, this question is not yet fully resolved. The goal of this paper is complementary to this line of work; we want to understand the stability of the recovery problem in the cases when efficient recovery is possible. Numerical stability of Pearl's notion of causality was first studied in Schulman and Srivastava (2016) using the concept of condition number where they provide ill-conditioned examples. In this work, we provide a condition number analysis for the LSEM. First we prove that under a sufficient condition, for a certain sub-class of LSEM that are bow-free (Brito and Pearl (2002)), the parameter recovery is stable. We further prove that randomly chosen input parameters for this family satisfy the condition with a substantial probability. Hence for this family, on a large subset of parameter space, recovery is numerically stable. Next we construct an example of LSEM on four vertices with unbounded condition number. We then corroborate our theoretical findings via simulations as well as real-world experiments for a sociology application. Finally, we provide a general heuristic for estimating the condition number of any LSEM instance.