Source author record

Matthew Eichhorn

Matthew Eichhorn appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Machine Learning math.ST Methodology Statistics Theory

Catalog footprint

What is connected

3works

4topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2026arXiv

Low-order outcomes and clustered designs: combining design and analysis for causal inference under network interference

Variance reduction for causal inference in the presence of network interference is often achieved through either outcome modeling, typically analyzed under unit-randomized Bernoulli designs, or clustered experimental designs, typically analyzed without strong parametric assumptions. In this work, we study the intersection of these two approaches and make the following threefold contributions. First, we present an estimator of the total treatment effect (or global average treatment effect) in low-order outcome models when the data are collected under general experimental designs, generalizing previous results for Bernoulli designs. We refer to this estimator as the pseudoinverse estimator and give bounds on its bias and variance in terms of properties of the experimental design. Second, we evaluate these bounds for the case of Bernoulli graph cluster randomized (GCR) designs. Its variance scales like the smaller of the variance obtained by the estimator derived under a low-order assumption, and the variance obtained from cluster randomization, showing that combining these variance reduction strategies is preferable to using either individually. When the order of the potential outcomes model is correctly specified, our estimator is always unbiased, and under a misspecified model, we upper bound the bias by the closeness of the ground truth model to a low-order model. Third, we give empirical evidence that our variance bounds can be used to select a good clustering that minimizes the worst-case variance under a cluster randomized design from a set of candidate clusterings. Across a range of graphs and clustering algorithms, our method consistently selects clusterings that perform well on a range of response models, suggesting the practical use of our bounds.

preprint2021arXiv

Kaleidoscope: An Efficient, Learnable Representation For All Structured Linear Maps

Modern neural network architectures use structured linear transformations, such as low-rank matrices, sparse matrices, permutations, and the Fourier transform, to improve inference speed and reduce memory usage compared to general linear maps. However, choosing which of the myriad structured transformations to use (and its associated parameterization) is a laborious task that requires trading off speed, space, and accuracy. We consider a different approach: we introduce a family of matrices called kaleidoscope matrices (K-matrices) that provably capture any structured matrix with near-optimal space (parameter) and time (arithmetic operation) complexity. We empirically validate that K-matrices can be automatically learned within end-to-end pipelines to replace hand-crafted procedures, in order to improve model quality. For example, replacing channel shuffles in ShuffleNet improves classification accuracy on ImageNet by up to 5%. K-matrices can also simplify hand-engineered pipelines -- we replace filter bank feature computation in speech data preprocessing with a learnable kaleidoscope layer, resulting in only 0.4% loss in accuracy on the TIMIT speech recognition task. In addition, K-matrices can capture latent structure in models: for a challenging permuted image classification task, a K-matrix based representation of permutations is able to learn the right latent structure and improves accuracy of a downstream convolutional model by over 9%. We provide a practically efficient implementation of our approach, and use K-matrices in a Transformer network to attain 36% faster end-to-end inference speed on a language translation task.

preprint2020arXiv

Learning Fast Algorithms for Linear Transforms Using Butterfly Factorizations

Fast linear transforms are ubiquitous in machine learning, including the discrete Fourier transform, discrete cosine transform, and other structured transformations such as convolutions. All of these transforms can be represented by dense matrix-vector multiplication, yet each has a specialized and highly efficient (subquadratic) algorithm. We ask to what extent hand-crafting these algorithms and implementations is necessary, what structural priors they encode, and how much knowledge is required to automatically learn a fast algorithm for a provided structured transform. Motivated by a characterization of fast matrix-vector multiplication as products of sparse matrices, we introduce a parameterization of divide-and-conquer methods that is capable of representing a large class of transforms. This generic formulation can automatically learn an efficient algorithm for many important transforms; for example, it recovers the $O(N \log N)$ Cooley-Tukey FFT algorithm to machine precision, for dimensions $N$ up to $1024$. Furthermore, our method can be incorporated as a lightweight replacement of generic matrices in machine learning pipelines to learn efficient and compressible transformations. On a standard task of compressing a single hidden-layer network, our method exceeds the classification accuracy of unconstrained matrices on CIFAR-10 by 3.9 points -- the first time a structured approach has done so -- with 4X faster inference speed and 40X fewer parameters.