Source author record

Paul Hand

Paul Hand appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.


math.OC, Information Theory, math.IT, Computer Vision, Machine Learning, math.NA, Artificial Intelligence, eess.IV, math.CO, math.PR

Catalog footprint

What is connected

15 works

10 topics

4 close collaborators


preprint, 2022, arXiv

Analysis of Catastrophic Forgetting for Random Orthogonal Transformation Tasks in the Overparameterized Regime

Overparameterization is known to permit strong generalization performance in neural networks. In this work, we provide an initial theoretical analysis of its effect on catastrophic forgetting in a continual learning setup. We show experimentally that in permuted MNIST image classification tasks, the generalization performance of multilayer perceptrons trained by vanilla stochastic gradient descent can be improved by overparameterization, and the extent of the performance increase achieved by overparameterization is comparable to that of state-of-the-art continual learning algorithms. We provide a theoretical explanation of this effect by studying a qualitatively similar two-task linear regression problem, where each task is related by a random orthogonal transformation. We show that when a model is trained on the two tasks in sequence without any additional regularization, the risk gain on the first task is small if the model is sufficiently overparameterized.
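The two-task linear regression setting in this abstract can be illustrated with a toy numerical sketch. It is not the paper's experiment: the Gaussian data, the dimensions, the choice to share labels across the two tasks, and the helper name `forgetting` are all assumptions made for illustration. The sketch relies on the standard fact that gradient descent on a consistent least-squares problem converges to the interpolating solution closest to its starting point, so each training phase is replaced by a pseudoinverse step.

```python
import numpy as np

def forgetting(d, n=20, seed=0):
    """Toy two-task regression; returns the increase in task-1 training error.

    Assumed setup (hypothetical, for illustration only): task 2 uses the
    task-1 inputs transformed by a random orthogonal matrix Q, with the
    same labels.
    """
    rng = np.random.default_rng(seed)
    X1 = rng.standard_normal((n, d))               # task-1 inputs, one per row
    w_star = rng.standard_normal(d) / np.sqrt(d)   # ground-truth weights
    y = X1 @ w_star                                # labels shared by both tasks

    Q, _ = np.linalg.qr(rng.standard_normal((d, d)))  # random orthogonal matrix
    X2 = X1 @ Q.T                                  # row i of X2 is Q @ x_i

    # Phase 1: gradient descent from zero on task 1 converges to the
    # minimum-norm interpolator.
    w1 = np.linalg.pinv(X1) @ y
    # Phase 2: gradient descent from w1 on task 2 converges to the
    # task-2 interpolator closest to w1.
    w2 = w1 + np.linalg.pinv(X2) @ (y - X2 @ w1)

    err_before = np.mean((X1 @ w1 - y) ** 2)
    err_after = np.mean((X1 @ w2 - y) ** 2)
    return err_after - err_before, w1, w2, X1, X2, y
```

Calling `forgetting(40)` and `forgetting(1000)` with the same `n` compares the increase in task-1 training error at two levels of overparameterization.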

preprint, 2022, arXiv

Regularized Training of Intermediate Layers for Generative Models for Inverse Problems

Generative Adversarial Networks (GANs) have been shown to be powerful and flexible priors when solving inverse problems. One challenge of using them is overcoming representation error, the fundamental limitation of the network in representing any particular signal. Recently, multiple proposed inversion algorithms reduce representation error by optimizing over intermediate layer representations. These methods are typically applied to generative models that were trained without regard to the downstream inversion algorithm. In our work, we introduce a principle that if a generative model is intended for inversion using an algorithm based on optimization of intermediate layers, it should be trained in a way that regularizes those intermediate layers. We instantiate this principle for two notable recent inversion algorithms: Intermediate Layer Optimization and the Multi-Code GAN prior. For both of these inversion algorithms, we introduce a new regularized GAN training algorithm and demonstrate that the learned generative model results in lower reconstruction errors across a wide range of undersampling ratios when solving compressed sensing, inpainting, and super-resolution problems.

preprint, 2021, arXiv

Generator Surgery for Compressed Sensing

Image recovery from compressive measurements requires a signal prior for the images being reconstructed. Recent work has explored the use of deep generative models with low latent dimension as signal priors for such problems. However, their recovery performance is limited by high representation error. We introduce a method for achieving low representation error using generators as signal priors. Using a pre-trained generator, we remove one or more initial blocks at test time and optimize over the new, higher-dimensional latent space to recover a target image. Experiments demonstrate significantly improved reconstruction quality for a variety of network architectures. This approach also works well for out-of-training-distribution images and is competitive with other state-of-the-art methods. Our experiments show that test-time architectural modifications can greatly improve the recovery quality of generator signal priors for compressed sensing.

preprint, 2020, arXiv

Compressive Phase Retrieval: Optimal Sample Complexity with Deep Generative Priors

Advances in compressive sensing provided reconstruction algorithms of sparse signals from linear measurements with optimal sample complexity, but natural extensions of this methodology to nonlinear inverse problems have been met with potentially fundamental sample complexity bottlenecks. In particular, tractable algorithms for compressive phase retrieval with sparsity priors have not been able to achieve optimal sample complexity. This has created an open problem in compressive phase retrieval: under generic, phaseless linear measurements, are there tractable reconstruction algorithms that succeed with optimal sample complexity? Meanwhile, progress in machine learning has led to the development of new data-driven signal priors in the form of generative models, which can outperform sparsity priors with significantly fewer measurements. In this work, we resolve the open problem in compressive phase retrieval and demonstrate that generative priors can lead to a fundamental advance by permitting optimal sample complexity by a tractable algorithm in this challenging nonlinear inverse problem. We additionally provide empirics showing that exploiting generative priors in phase retrieval can significantly outperform sparsity priors. These results provide support for generative priors as a new paradigm for signal recovery in a variety of contexts, both empirically and theoretically. The strengths of this paradigm are that (1) generative priors can represent some classes of natural signals more concisely than sparsity priors, (2) generative priors allow for direct optimization over the natural signal manifold, which is intractable under sparsity priors, and (3) the resulting non-convex optimization problems with generative priors can admit benign optimization landscapes at optimal sample complexity, perhaps surprisingly, even in cases of nonlinear measurements.

preprint, 2020, arXiv

Global Convergence of Sobolev Training for Overparameterized Neural Networks

Sobolev loss is used when training a network to approximate the values and derivatives of a target function at a prescribed set of input points. Recent works have demonstrated its successful application in various tasks such as distillation and synthetic gradient prediction. In this work, we prove that an overparameterized two-layer ReLU neural network trained on the Sobolev loss with gradient flow from random initialization can fit any given function values and any given directional derivatives, under a separation condition on the input data.
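As a rough illustration of what a Sobolev loss measures, the sketch below evaluates one for a two-layer ReLU network on function values and directional derivatives. The `1/sqrt(m)` output scaling, the fixed outer weights `a`, the equal weighting of the two error terms, and all function names are assumptions for this example; the abstract does not specify them.

```python
import numpy as np

def relu_net(W, a, X):
    """Two-layer ReLU network f(x) = a^T relu(W x) / sqrt(m), one output per row of X."""
    pre = X @ W.T                        # pre-activations, shape (n, m)
    return np.maximum(pre, 0.0) @ a / np.sqrt(len(a))

def directional_derivative(W, a, X, V):
    """Derivative of f at each x_i along direction v_i (rows of V)."""
    pre = X @ W.T
    active = (pre > 0).astype(float)     # ReLU derivative: 1 where the unit is on
    return (active * (V @ W.T)) @ a / np.sqrt(len(a))

def sobolev_loss(W, a, X, V, y, g):
    """Squared error on values y plus squared error on directional derivatives g."""
    value_err = relu_net(W, a, X) - y
    deriv_err = directional_derivative(W, a, X, V) - g
    return 0.5 * np.sum(value_err ** 2) + 0.5 * np.sum(deriv_err ** 2)
```

The loss is zero exactly when the network matches both the prescribed values and the prescribed directional derivatives at every input point, which is the fitting property the abstract refers to.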

preprint, 2019, arXiv

Bilinear Compressed Sensing under known Signs via Convex Programming

We consider the bilinear inverse problem of recovering two vectors, $\boldsymbol{x} \in\mathbb{R}^L$ and $\boldsymbol{w} \in\mathbb{R}^L$, from their entrywise product. We consider the case where $\boldsymbol{x}$ and $\boldsymbol{w}$ have known signs and are sparse with respect to known dictionaries of size $K$ and $N$, respectively. Here, $K$ and $N$ may be larger than, smaller than, or equal to $L$. We introduce $\ell_1$-BranchHull, which is a convex program posed in the natural parameter space and does not require an approximate solution or initialization in order to be stated or solved. Under the assumptions that $\boldsymbol{x}$ and $\boldsymbol{w}$ satisfy a comparable-effective-sparsity condition and are $S_1$- and $S_2$-sparse with respect to a random dictionary, we present a recovery guarantee in a noisy case. We show that $\ell_1$-BranchHull is robust to small dense noise with high probability if the number of measurements satisfies $L\geq\Omega\left((S_1+S_2)\log^{2}(K+N)\right)$. Numerical experiments show that the scaling constant in the theorem is not too large. We also introduce variants of $\ell_1$-BranchHull for the purposes of tolerating noise and outliers, and for the purpose of recovering piecewise constant signals. We provide an ADMM implementation of these variants and show they can extract piecewise constant behavior from real images.

preprint, 2016, arXiv

An Elementary Proof of Convex Phase Retrieval in the Natural Parameter Space via the Linear Program PhaseMax

The phase retrieval problem has garnered significant attention since the development of the PhaseLift algorithm, which is a convex program that operates in a lifted space of matrices. Because of the substantial computational cost due to lifting, many approaches to phase retrieval have been developed, including non-convex optimization algorithms which operate in the natural parameter space, such as Wirtinger Flow. Very recently, a convex formulation called PhaseMax has been discovered, and it has been proven to achieve phase retrieval via linear programming in the natural parameter space under optimal sample complexity. The current proofs of PhaseMax rely on statistical learning theory or geometric probability theory. Here, we present a short and elementary proof that PhaseMax exactly recovers real-valued vectors from random measurements under optimal sample complexity. Our proof only relies on standard probabilistic concentration and covering arguments, yielding a simpler and more direct proof than those that require statistical learning theory, geometric probability or the highly technical arguments for Wirtinger Flow-like approaches.
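To make the "linear programming in the natural parameter space" idea concrete, here is a minimal sketch of PhaseMax in its commonly stated real-valued form: maximize the inner product with an anchor vector subject to the magnitude constraints. The abstract does not write the program out, so this form, the function name `phasemax`, the use of SciPy's `linprog`, and the way the anchor is generated (a noisy copy of the true signal) are assumptions for illustration.

```python
import numpy as np
from scipy.optimize import linprog

def phasemax(A, b, anchor):
    """Maximize <anchor, x> subject to |<a_i, x>| <= b_i for every row a_i of A."""
    # linprog minimizes, so the objective is negated. Each magnitude
    # constraint becomes two linear inequalities: A x <= b and -A x <= b.
    res = linprog(
        c=-anchor,
        A_ub=np.vstack([A, -A]),
        b_ub=np.concatenate([b, b]),
        bounds=(None, None),  # free variables; linprog defaults to x >= 0
    )
    if not res.success:
        raise RuntimeError(res.message)
    return res.x

# Hypothetical test problem: Gaussian measurements of a real vector x0.
rng = np.random.default_rng(0)
n, m = 10, 100
x0 = rng.standard_normal(n)
A = rng.standard_normal((m, n))
b = np.abs(A @ x0)                            # phaseless (magnitude) measurements
anchor = x0 + 0.3 * rng.standard_normal(n)    # assumed rough estimate of x0
x_hat = phasemax(A, b, anchor)
```

Because `x0` itself satisfies every constraint, the optimizer can only do at least as well as `x0` on the objective; the recovery results discussed in the abstract concern when the maximizer coincides with `x0`.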

preprint, 2016, arXiv

Corruption Robust Phase Retrieval via Linear Programming

We consider the problem of phase retrieval from corrupted magnitude observations. In particular, we show that a fixed $x_0 \in \mathbb{R}^n$ can be recovered exactly from corrupted magnitude measurements $|\langle a_i, x_0 \rangle | + \eta_i, \quad i = 1,2,\ldots,m$ with high probability for $m = O(n)$, where $a_i \in \mathbb{R}^n$ are i.i.d. standard Gaussian and $\eta \in \mathbb{R}^m$ has fixed sparse support and is otherwise arbitrary, by using a version of the PhaseMax algorithm augmented with slack variables subject to a penalty. This linear programming formulation, which we call RobustPhaseMax, operates in the natural parameter space, and our proofs rely on a direct analysis of the optimality conditions using concentration inequalities.

preprint, 2016, arXiv

ShapeFit and ShapeKick for Robust, Scalable Structure from Motion

We introduce a new method for location recovery from pairwise directions that leverages an efficient convex program that comes with exact recovery guarantees, even in the presence of adversarial outliers. When pairwise directions represent scaled relative positions between pairs of views (estimated, for instance, with epipolar geometry), our method can be used for location recovery, that is, the determination of relative pose up to a single unknown scale. For this task, our method yields performance comparable to the state of the art with an order of magnitude speed-up. Our proposed numerical framework is flexible in that it accommodates other approaches to location recovery and can be used to speed up other methods. These properties are demonstrated by extensively testing against state-of-the-art methods for location recovery on 13 large, irregular collections of images of real scenes in addition to simulated data with ground truth.

preprint, 2015, arXiv

Exact simultaneous recovery of locations and structure from known orientations and corrupted point correspondences

Let $t_1,\ldots,t_{n_l} \in \mathbb{R}^d$ and $p_1,\ldots,p_{n_s} \in \mathbb{R}^d$ and consider the bipartite location recovery problem: given a subset of pairwise direction observations $\{(t_i - p_j) / \|t_i - p_j\|_2\}_{(i,j) \in [n_l] \times [n_s]}$, where a constant fraction of these observations are arbitrarily corrupted, find $\{t_i\}_{i \in [n_l]}$ and $\{p_j\}_{j \in [n_s]}$ up to a global translation and scale. We study the recently introduced ShapeFit algorithm as a method for solving this bipartite location recovery problem. In this case, ShapeFit consists of a simple convex program over $d(n_l + n_s)$ real variables. We prove that this program recovers a set of $n_l+n_s$ i.i.d. Gaussian locations exactly and with high probability if the observations are given by a bipartite Erdős-Rényi graph, $d$ is large enough, and provided that at most a constant fraction of observations involving any particular location are adversarially corrupted. This recovery theorem is based on a set of deterministic conditions that we prove are sufficient for exact recovery. Finally, we propose a modified pipeline for the Structure from Motion problem, based on this bipartite location recovery problem.

preprint, 2015, arXiv

PhaseLift is robust to a constant fraction of arbitrary errors

Consider the task of recovering an unknown $n$-vector from phaseless linear measurements. This task is the phase retrieval problem. Through the technique of lifting, this nonconvex problem may be convexified into a semidefinite rank-one matrix recovery problem, known as PhaseLift. Under a linear number of exact Gaussian measurements, PhaseLift recovers the unknown vector exactly with high probability. Under noisy measurements, the solution to a variant of PhaseLift has error proportional to the $\ell_1$ norm of the noise. In the present paper, we study the robustness of this variant of PhaseLift to a case with noise and gross, arbitrary corruptions. We prove that PhaseLift can tolerate a small, fixed fraction of gross errors, even in the highly underdetermined regime where there are only $O(n)$ measurements. The lifted phase retrieval problem can be viewed as a rank-one robust Principal Component Analysis (PCA) problem under generic rank-one measurements. From this perspective, the proposed convex program is simpler than the semidefinite version of the sparse-plus-low-rank formulation standard in the robust PCA literature. Specifically, the rank penalization through a trace term is unnecessary, and the resulting optimization program has no parameters that need to be chosen. The present work also achieves the information-theoretically optimal scaling of $O(n)$ measurements without the additional logarithmic factors that appear in existing general robust PCA results.

preprint, 2015, arXiv

ShapeFit: Exact location recovery from corrupted pairwise directions

Let $t_1,\ldots,t_n \in \mathbb{R}^d$ and consider the location recovery problem: given a subset of pairwise direction observations $\{(t_i - t_j) / \|t_i - t_j\|_2\}_{i<j \in [n] \times [n]}$, where a constant fraction of these observations are arbitrarily corrupted, find $\{t_i\}_{i=1}^n$ up to a global translation and scale. We propose a novel algorithm for the location recovery problem, which consists of a simple convex program over $dn$ real variables. We prove that this program recovers a set of $n$ i.i.d. Gaussian locations exactly and with high probability if the observations are given by an Erdős-Rényi graph, $d$ is large enough, and provided that at most a constant fraction of observations involving any particular location are adversarially corrupted. We also prove that the program exactly recovers Gaussian locations for $d=3$ if the fraction of corrupted observations at each location is, up to poly-logarithmic factors, at most a constant. Both of these recovery theorems are based on a set of deterministic conditions that we prove are sufficient for exact recovery.
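For orientation, the "simple convex program over $dn$ real variables" is usually written in the following form. The abstract does not spell it out, so this display is a sketch under assumptions: $E$ denotes the set of observed pairs, $\gamma_{ij}$ the (possibly corrupted) observed unit direction for the pair $ij$, and $P_{\gamma_{ij}^\perp}$ the orthogonal projection onto the complement of $\gamma_{ij}$; none of these symbols appear in the abstract.

```latex
\min_{t_1,\ldots,t_n \in \mathbb{R}^d}
  \sum_{ij \in E} \bigl\| P_{\gamma_{ij}^\perp}(t_i - t_j) \bigr\|_2
\quad \text{subject to} \quad
  \sum_{ij \in E} \langle t_i - t_j, \gamma_{ij} \rangle = 1,
  \qquad \sum_{i=1}^{n} t_i = 0 .
```

The objective penalizes the component of each displacement $t_i - t_j$ that is orthogonal to its observed direction, while the two linear constraints fix the global scale and translation that the observations cannot determine.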

preprint, 2014, arXiv

Conditions for Existence of Dual Certificates in Rank-One Semidefinite Problems

Several signal recovery tasks can be relaxed into semidefinite programs with rank-one minimizers. A common technique for proving these programs succeed is to construct a dual certificate. Unfortunately, dual certificates may not exist under some formulations of semidefinite programs. In order to put problems into a form where dual certificate arguments are possible, it is important to develop conditions under which the certificates exist. In this paper, we provide an example where dual certificates do not exist. We then present a completeness condition under which they are guaranteed to exist. For programs that do not satisfy the completeness condition, we present a completion process which produces an equivalent program that does satisfy the condition. The important message of this paper is that dual certificates may not exist for semidefinite programs that involve orthogonal measurements with respect to positive-semidefinite matrices. Such measurements can interact with the positive-semidefinite constraint in a way that implies additional linear measurements. If these additional measurements are not included in the problem formulation, then dual certificates may fail to exist. As an illustration, we present a semidefinite relaxation for the task of finding the sparsest element in a subspace. One formulation of this program does not admit dual certificates. The completion process produces an equivalent formulation which does admit dual certificates.

preprint, 2014, arXiv

Scaling Law for Recovering the Sparsest Element in a Subspace

We address the problem of recovering a sparse $n$-vector within a given subspace. This problem is a subtask of some approaches to dictionary learning and sparse principal component analysis. Hence, if we can prove scaling laws for recovery of sparse vectors, it will be easier to derive and prove recovery results in these applications. In this paper, we present a scaling law for recovering the sparse vector from a subspace that is spanned by the sparse vector and $k$ random vectors. We prove that the sparse vector will be the output of one of $n$ linear programs with high probability if its support size $s$ satisfies $s \lesssim n/\sqrt{k \log n}$. The scaling law still holds when the desired vector is approximately sparse. To get a single estimate for the sparse vector from the $n$ linear programs, we must select which output is the sparsest. This selection process can be based on any proxy for sparsity, and the specific proxy has the potential to improve or worsen the scaling law. If sparsity is interpreted in an $\ell_1/\ell_\infty$ sense, then the scaling law cannot be better than $s \lesssim n/\sqrt{k}$. Computer simulations show that selecting the sparsest output in the $\ell_1/\ell_2$ or thresholded-$\ell_0$ senses can lead to a larger parameter range for successful recovery than that given by the $\ell_1/\ell_\infty$ sense.
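One natural reading of the "$n$ linear programs" is an $\ell_1$ minimization over the subspace with one coordinate pinned at a time. The display below is a sketch of that reading, not a formulation quoted from the abstract; the symbol $W$ for the given subspace and the normalization $z_i = 1$ are assumptions.

```latex
\text{for each } i = 1, \ldots, n: \qquad
z^{(i)} \in \operatorname*{arg\,min}_{z \in \mathbb{R}^n} \; \|z\|_1
\quad \text{subject to} \quad z \in W, \quad z_i = 1 .
```

Each problem is a linear program because the $\ell_1$ norm can be expressed with auxiliary variables and linear inequalities, and membership in $W$ is a set of linear equalities. The selection step described in the abstract then picks, among $z^{(1)}, \ldots, z^{(n)}$, the output that is sparsest under the chosen proxy.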

preprint, 2013, arXiv

Stable optimizationless recovery from phaseless linear measurements

We address the problem of recovering an $n$-vector from $m$ linear measurements lacking sign or phase information. We show that lifting and semidefinite relaxation suffice by themselves for stable recovery in the setting of $m = O(n \log n)$ random sensing vectors, with high probability. The recovery method is optimizationless in the sense that trace minimization in the PhaseLift procedure is unnecessary. That is, PhaseLift reduces to a feasibility problem. The optimizationless perspective allows for a Douglas-Rachford numerical algorithm that is unavailable for PhaseLift. This method exhibits linear convergence with a favorable convergence rate and without any parameter tuning.
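The contrast between PhaseLift and its optimizationless variant can be written side by side. This is a sketch in the noise-free case, with assumed notation: $a_i$ are the sensing vectors, $x_0$ is the unknown vector, $b_i = |\langle a_i, x_0 \rangle|^2$ are the phaseless measurements, and $X$ is the lifted matrix variable standing in for $x_0 x_0^*$.

```latex
\text{PhaseLift:} \qquad
\min_{X} \; \operatorname{tr}(X)
\quad \text{subject to} \quad a_i^* X a_i = b_i, \;\; i = 1, \ldots, m, \qquad X \succeq 0 ;
\\[6pt]
\text{optimizationless variant:} \qquad
\text{find } X \succeq 0
\quad \text{such that} \quad a_i^* X a_i = b_i, \;\; i = 1, \ldots, m .
```

Dropping the trace objective leaves the intersection of an affine set with the positive-semidefinite cone, which is the kind of two-set feasibility problem that projection-based schemes such as Douglas-Rachford are designed for.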
