Source author record

Stefan Tiegel

Stefan Tiegel appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher | Unclaimed source record

Machine Learning, Computational Complexity, Data Structures and Algorithms, math.PR, math.ST, Statistics Theory

Catalog footprint

What is connected

4 works

6 topics

4 close collaborators

preprint | 2026 | arXiv

Rigorous Implications of the Low-Degree Heuristic

Over the past decade, the low-degree heuristic has been used to estimate the algorithmic thresholds for a wide range of average-case planted vs. null distinguishing problems. Such results rely on the hypothesis that if the low-degree moments of the planted and null distributions are sufficiently close, then no efficient (noise-tolerant) algorithm can distinguish between them. This hypothesis is appealing due to the simplicity of calculating the low-degree likelihood ratio (LDLR), a quantity that measures the similarity between low-degree moments. However, despite sustained interest in the area, it remains unclear whether low-degree indistinguishability actually rules out any interesting class of algorithms. In this work, we initiate the study and develop technical tools for translating LDLR upper bounds to rigorous lower bounds against concrete algorithms. As a consequence, we prove: for any permutation-invariant distribution $\mathsf{P}$,

1. If $\mathsf{P}$ is over $\{0,1\}^n$ and is low-degree indistinguishable from $U = \mathrm{Unif}(\{0,1\}^n)$, then a noisy version of $\mathsf{P}$ is statistically indistinguishable from $U$.
2. If $\mathsf{P}$ is over $\mathbb{R}^n$ and is low-degree indistinguishable from the standard Gaussian ${N}(0, 1)^n$, then no statistic based on symmetric polynomials of degree at most $O(\log n/\log \log n)$ can distinguish a noisy version of $\mathsf{P}$ from ${N}(0, 1)^n$.
3. If $\mathsf{P}$ is over $\mathbb{R}^{n\times n}$ and is low-degree indistinguishable from ${N}(0,1)^{n\times n}$, then no constant-sized subgraph statistic can distinguish a noisy version of $\mathsf{P}$ from ${N}(0, 1)^{n\times n}$.
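
For orientation, the LDLR mentioned in this abstract is commonly defined in the literature as follows. This is a sketch of the standard textbook formulation, not necessarily the exact normalization used in the preprint; the symbol $\mathsf{Q}$ for the null distribution (for example $U$ or ${N}(0,1)^n$) and the degree bound $D$ are notation assumed here.

$$
L = \frac{d\mathsf{P}}{d\mathsf{Q}}, \qquad
\|L^{\le D}\| \;=\; \sup_{f \,:\, \deg f \le D,\ f \ne 0} \frac{\mathbb{E}_{x \sim \mathsf{P}}[f(x)]}{\sqrt{\mathbb{E}_{x \sim \mathsf{Q}}[f(x)^2]}}
$$

Here $L^{\le D}$ is the orthogonal projection of $L$ onto polynomials of degree at most $D$ in $L^2(\mathsf{Q})$. The heuristic typically treats $\mathsf{P}$ and $\mathsf{Q}$ as low-degree indistinguishable when this norm stays bounded (or tends to 1) as $n$ grows.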

preprint | 2022 | arXiv

Fast algorithm for overcomplete order-3 tensor decomposition

We develop the first fast spectral algorithm to decompose a random third-order tensor over $\mathbb{R}^d$ of rank up to $O(d^{3/2}/\text{polylog}(d))$. Our algorithm only involves simple linear algebra operations and can recover all components in time $O(d^{6.05})$ under the current matrix multiplication time. Prior to this work, comparable guarantees could only be achieved via sum-of-squares [Ma, Shi, Steurer 2016]. In contrast, fast algorithms [Hopkins, Schramm, Shi, Steurer 2016] could only decompose tensors of rank at most $O(d^{4/3}/\text{polylog}(d))$. Our algorithmic result rests on two key ingredients: (1) a clean lifting of the third-order tensor to a sixth-order tensor, which can be expressed in the language of tensor networks; and (2) a careful decomposition of the tensor network into a sequence of rectangular matrix multiplications, which allows us to have a fast implementation of the algorithm.

preprint | 2022 | arXiv

Optimal SQ Lower Bounds for Learning Halfspaces with Massart Noise

We give tight statistical query (SQ) lower bounds for learning halfspaces in the presence of Massart noise. In particular, suppose that all labels are corrupted with probability at most $η$. We show that for arbitrary $η\in [0,1/2]$ every SQ algorithm achieving misclassification error better than $η$ requires queries of superpolynomial accuracy or at least a superpolynomial number of queries. Further, this continues to hold even if the information-theoretically optimal error $\mathrm{OPT}$ is as small as $\exp\left(-\log^c(d)\right)$, where $d$ is the dimension and $0 < c < 1$ is an arbitrary absolute constant, and an overwhelming fraction of examples are noiseless. Our lower bound matches known polynomial-time algorithms, which are also implementable in the SQ framework. Previously, such lower bounds only ruled out algorithms achieving error $\mathrm{OPT} + ε$ or error better than $Ω(η)$ or, if $η$ is close to $1/2$, error $η- o_η(1)$, where the term $o_η(1)$ is constant in $d$ but goes to 0 as $η$ approaches $1/2$. As a consequence, we also show that achieving misclassification error better than $1/2$ in the $(A,α)$-Tsybakov model is SQ-hard for $A$ constant and $α$ bounded away from 1.
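
To make the noise model concrete, here is a minimal Python sketch of drawing labelled examples from a halfspace under Massart noise, where each label is flipped independently with a per-example probability of at most $η$. The function names, the Gaussian marginal, and the `flip_prob` callback are illustrative assumptions, not taken from the preprint, whose hard instances are constructed differently.

```python
import random


def sample_massart(n_samples, w, eta, flip_prob=None, seed=0):
    """Sample (x, noisy_label, clean_label) triples for the halfspace sign(<w, x>).

    Assumptions for illustration only: x has i.i.d. standard Gaussian
    coordinates, and flip_prob(x) is a hypothetical per-example flip rate.
    The rate is clipped to [0, eta], which is the Massart condition.
    """
    rng = random.Random(seed)
    if flip_prob is None:
        # Default: flip every label with probability exactly eta.
        flip_prob = lambda x: eta
    data = []
    for _ in range(n_samples):
        x = [rng.gauss(0.0, 1.0) for _ in w]
        clean = 1 if sum(wi * xi for wi, xi in zip(w, x)) >= 0 else -1
        p = min(max(flip_prob(x), 0.0), eta)  # enforce eta(x) <= eta
        noisy = -clean if rng.random() < p else clean
        data.append((x, noisy, clean))
    return data


def empirical_noise_rate(data):
    """Fraction of examples whose observed label differs from the clean label."""
    return sum(1 for _, noisy, clean in data if noisy != clean) / len(data)
```

For example, `empirical_noise_rate(sample_massart(2000, [1.0, 0.0, 0.0], 0.2))` should come out close to 0.2, and passing a `flip_prob` that returns values above `eta` still yields a rate of at most roughly `eta` because of the clipping.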

preprint | 2021 | arXiv

SoS Degree Reduction with Applications to Clustering and Robust Moment Estimation

We develop a general framework to significantly reduce the degree of sum-of-squares proofs by introducing new variables. To illustrate the power of this framework, we use it to speed up previous algorithms based on sum-of-squares for two important estimation problems, clustering and robust moment estimation. The resulting algorithms offer the same statistical guarantees as the previous best algorithms but have significantly faster running times. Roughly speaking, given a sample of $n$ points in dimension $d$, our algorithms can exploit order-$\ell$ moments in time $d^{O(\ell)}\cdot n^{O(1)}$, whereas a naive implementation requires time $(d\cdot n)^{O(\ell)}$. Since for the aforementioned applications, the typical sample size is $d^{Θ(\ell)}$, our framework improves running times from $d^{O(\ell^2)}$ to $d^{O(\ell)}$.
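
The running-time comparison in the last sentence can be checked by direct substitution. The following is a sketch that writes the sample size as $n = d^{c\ell}$ for some constant $c > 0$; the constant $c$ is notation assumed here to stand in for the $Θ(\ell)$ exponent.

$$
(d \cdot n)^{O(\ell)} = \left(d^{\,1 + c\ell}\right)^{O(\ell)} = d^{O(\ell^2)},
\qquad
d^{O(\ell)} \cdot n^{O(1)} = d^{O(\ell)} \cdot d^{\,c\ell \cdot O(1)} = d^{O(\ell)}.
$$

The saving comes from the sample size entering the running time only polynomially rather than being raised to the power $O(\ell)$.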