Source author record

Goutham Rajendran

Goutham Rajendran appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, unclaimed source record

Topics: Computational Complexity; Machine Learning; Computation and Language; Data Structures and Algorithms; Discrete Mathematics; eess.AS; math.CO; math.PR; math.ST; Sound; Statistics Theory

Catalog footprint

4 works, 11 topics, 4 close collaborators

Preprint, 2023, arXiv

Concentration of polynomial random matrices via Efron-Stein inequalities

Analyzing concentration of large random matrices is a common task in a wide variety of fields. Given independent random variables, many tools are available to analyze random matrices whose entries are linear in the variables, e.g., the matrix-Bernstein inequality. However, in many applications, we need to analyze random matrices whose entries are polynomials in the variables. These arise naturally in the analysis of spectral algorithms, e.g., Hopkins et al. [STOC 2016], Moitra-Wein [STOC 2019]; and in lower bounds for semidefinite programs based on the Sum of Squares hierarchy, e.g., Barak et al. [FOCS 2016], Jones et al. [FOCS 2021]. In this work, we present a general framework to obtain such bounds, based on the matrix Efron-Stein inequalities developed by Paulin-Mackey-Tropp [Annals of Probability 2016]. The Efron-Stein inequality bounds the norm of a random matrix by the norm of another simpler (but still random) matrix, which we view as arising by "differentiating" the starting matrix. By recursively differentiating, our framework reduces the main task to analyzing far simpler matrices. For Rademacher variables, these simpler matrices are in fact deterministic and hence, analyzing them is far easier. For general non-Rademacher variables, the task reduces to scalar concentration, which is much easier. Moreover, in the setting of polynomial matrices, our results generalize the work of Paulin-Mackey-Tropp. Using our basic framework, we recover known bounds in the literature for simple "tensor networks" and "dense graph matrices". Using our general framework, we derive bounds for "sparse graph matrices", which were obtained only recently by Jones et al. [FOCS 2021] using a nontrivial application of the trace power method, and which were a core component of their work. We expect our framework to be helpful for other applications involving concentration phenomena for nonlinear random matrices.
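To make the "differentiating" idea more concrete, the sketch below checks the classical scalar Efron-Stein inequality, Var f(X) <= (1/2) * sum_i E[(f(X) - f(X^(i)))^2], where X^(i) resamples the i-th coordinate independently. This is only an illustration of the scalar inequality for a polynomial in Rademacher variables, not the matrix version of Paulin-Mackey-Tropp or the framework of the paper. The function name `efron_stein_check` and the example polynomial are assumptions chosen for this example.

```python
from itertools import product


def efron_stein_check(f, n):
    """Return (variance, efron_stein_bound) for f on n Rademacher variables.

    Both quantities are computed exactly by enumerating all 2**n sign
    patterns, so this is only practical for small n.
    """
    points = list(product((-1, 1), repeat=n))
    mean = sum(f(x) for x in points) / len(points)
    variance = sum((f(x) - mean) ** 2 for x in points) / len(points)

    bound = 0.0
    for i in range(n):
        total = 0.0
        for x in points:
            # Resample coordinate i with an independent Rademacher value.
            for new_value in (-1, 1):
                y = x[:i] + (new_value,) + x[i + 1:]
                total += (f(x) - f(y)) ** 2
        # Average over x and over the two resampled values, then halve.
        bound += 0.5 * total / (2 * len(points))
    return variance, bound


# Hypothetical example: a degree-2 polynomial in three Rademacher variables.
def poly(x):
    return x[0] * x[1] + x[1] * x[2]


variance, bound = efron_stein_check(poly, 3)
print(variance, bound)
```

For a linear function such as `lambda x: x[0] + x[1] + x[2]` the two quantities coincide, while for the degree-2 example the bound is strictly larger than the variance.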

Preprint, 2022, arXiv

Analyzing Robustness of End-to-End Neural Models for Automatic Speech Recognition

We investigate robustness properties of pre-trained neural models for automatic speech recognition. Real-life data in machine learning is usually very noisy and almost never clean, which can be attributed to various factors depending on the domain, e.g., outliers, random noise and adversarial noise. Therefore, the models we develop for various tasks should be robust to such kinds of noisy data, which led to the thriving field of robust machine learning. We consider this important issue in the setting of automatic speech recognition. With the increasing popularity of pre-trained models, it is important to analyze and understand the robustness of such models to noise. In this work, we perform a robustness analysis of the pre-trained neural models wav2vec2, HuBERT and DistilHuBERT on the LibriSpeech and TIMIT datasets. We use different kinds of noising mechanisms and measure the model performances as quantified by the inference time and the standard Word Error Rate metric. We also do an in-depth layer-wise analysis of the wav2vec2 model when injecting noise in between layers, enabling us to predict at a high level what each layer learns. Finally, for this model, we visualize the propagation of errors across the layers and compare how it behaves on clean versus noisy data. Our experiments conform to the predictions of Pasad et al. [2021] and also raise interesting directions for future work.
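The Word Error Rate metric mentioned above is conventionally defined as the word-level edit distance (substitutions, deletions and insertions) between a reference transcript and a model hypothesis, divided by the number of reference words. The following is a minimal sketch of that standard definition. The function name `word_error_rate` and the whitespace tokenization are assumptions for illustration and are not taken from the paper's evaluation code.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


print(word_error_rate("the cat sat", "the dog sat"))
```

Because insertions count as errors, the value can exceed 1 when the hypothesis is much longer than the reference.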

Preprint, 2022, arXiv

Combinatorial Optimization via the Sum of Squares Hierarchy

We study the Sum of Squares (SoS) Hierarchy with a view towards combinatorial optimization. We survey the use of the SoS hierarchy to obtain approximation algorithms on graphs using their spectral properties. We present a simplified proof of the result of Feige and Krauthgamer on the performance of the hierarchy for the Maximum Clique problem on random graphs. We also present a result of Guruswami and Sinop that shows how to obtain approximation algorithms for the Minimum Bisection problem on low threshold-rank graphs. We study inapproximability results for the SoS hierarchy for general constraint satisfaction problems and problems involving graph densities such as the Densest $k$-subgraph problem. We improve the existing inapproximability results for general constraint satisfaction problems in the case of large arity, using stronger probabilistic analyses of expansion of random instances. We examine connections between constraint satisfaction problems and density problems on graphs. Using them, we obtain new inapproximability results for the hierarchy for the Densest $k$-subhypergraph problem and the Minimum $p$-Union problem, which are proven via reductions. We also illustrate the relatively new idea of pseudocalibration to construct integrality gaps for the SoS hierarchy for Maximum Clique and Max $K$-CSP. The application to Max $K$-CSP that we present is known in the community but has not been presented before in the literature, to the best of our knowledge.

Preprint, 2020, arXiv

Sum-of-Squares Lower Bounds for Sherrington-Kirkpatrick via Planted Affine Planes

The Sum-of-Squares (SoS) hierarchy is a semi-definite programming meta-algorithm that captures state-of-the-art polynomial time guarantees for many optimization problems such as Max-$k$-CSPs and Tensor PCA. On the flip side, a SoS lower bound provides evidence of hardness, which is particularly relevant to average-case problems for which NP-hardness may not be available. In this paper, we consider the following average case problem, which we call the Planted Affine Planes (PAP) problem: Given $m$ random vectors $d_1,\ldots,d_m$ in $\mathbb{R}^n$, can we prove that there is no vector $v \in \mathbb{R}^n$ such that for all $u \in [m]$, $\langle v, d_u\rangle^2 = 1$? In other words, can we prove that $m$ random vectors are not all contained in two parallel hyperplanes at equal distance from the origin? We prove that for $m \leq n^{3/2-\epsilon}$, with high probability, degree-$n^{\Omega(\epsilon)}$ SoS fails to refute the existence of such a vector $v$. When the vectors $d_1,\ldots,d_m$ are chosen from the multivariate normal distribution, the PAP problem is equivalent to the problem of proving that a random $n$-dimensional subspace of $\mathbb{R}^m$ does not contain a boolean vector. As shown by Mohanty-Raghavendra-Xu [STOC 2020], a lower bound for this problem implies a lower bound for the problem of certifying energy upper bounds on the Sherrington-Kirkpatrick Hamiltonian, and so our lower bound implies a degree-$n^{\Omega(\epsilon)}$ SoS lower bound for the certification version of the Sherrington-Kirkpatrick problem.
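The PAP constraint can be read as a simple feasibility check: a candidate $v$ is a solution exactly when every $d_u$ has inner product $+1$ or $-1$ with it. The sketch below only verifies a given candidate against a given list of vectors. It does not implement SoS or any refutation procedure, and the helper names (`inner`, `satisfies_pap`), the tolerance, and the small hand-built instance are assumptions made for illustration.

```python
def inner(v, d):
    """Standard inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(v, d))


def satisfies_pap(v, vectors, tol=1e-9):
    """Check <v, d_u>^2 = 1 (up to tol) for every vector d_u."""
    return all(abs(inner(v, d) ** 2 - 1.0) <= tol for d in vectors)


# Hypothetical instance: every vector has first coordinate +1 or -1,
# so all of them lie on the two hyperplanes x_1 = 1 and x_1 = -1.
vectors = [[1.0, 0.3], [-1.0, 2.0], [1.0, -0.7]]

print(satisfies_pap([1.0, 0.0], vectors))  # candidate normal to those planes
print(satisfies_pap([0.0, 1.0], vectors))  # a different candidate direction
```

Checking one candidate is easy. The question studied in the abstract is whether the absence of any valid $v$ can be certified for random inputs.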