Source author record

Clarice Poon

Clarice Poon appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

math.NA · Information Theory · math.IT · math.OC · Computer Vision · Machine Learning

Catalog footprint

What is connected

13 works

6 topics

4 close collaborators


preprint · 2026 · arXiv

On the global convergence of gradient descent for wide shallow models with bounded nonlinearities

A surprising phenomenon in the training of neural networks is the ability of gradient descent to find global minimizers of the training loss despite its non-convexity. Following earlier works, we investigate this behavior for wide shallow networks. Existing results essentially cover the case of ReLU activations and the case of sigmoid activations with scalar output weights. We study a large class of models that includes multi-head attention layers and two-layer sigmoid networks with vector output weights. Building upon [Chizat and Bach, 2018], we prove that all non-global minimizers of the training loss are unstable under gradient descent dynamics. Thus, when the initial distribution of the parameters has full support (which includes the popular Gaussian case), and in the limit of many hidden neurons or attention heads, continuous-time gradient descent can only converge to global minimizers. Establishing the instability of non-global minimizers corresponds to the construction of an "escaping active set" -- we complete the proof of [Chizat and Bach, 2018] to construct this set for models with bounded nonlinearities and scalar output weights. We also extend this construction to new cases for models with vector output weights. Finally, we show the well-posedness and the stability with respect to discretization of the mean field training dynamic for sub-Gaussian initializations.

preprint · 2020 · arXiv

Geometry of First-Order Methods and Adaptive Acceleration

First-order operator splitting methods are ubiquitous across many fields of science and engineering, such as inverse problems, signal/image processing, statistics, data science and machine learning, to name a few. In this paper, we study a geometric property of first-order methods when applied to non-smooth optimization problems. With the tool of "partial smoothness", we design a framework to analyze the trajectory of the fixed-point sequence generated by first-order methods and show that locally, the fixed-point sequence settles onto a regular trajectory such as a straight line or a spiral. Based on this finding, we discuss the limitations of the currently widely used "inertial acceleration" technique, and propose a trajectory-following adaptive acceleration algorithm. Global convergence is established for the proposed acceleration scheme based on the perturbation of fixed-point iteration. Locally, we first build connections between the acceleration scheme and the well-studied "vector extrapolation technique" in the field of numerical analysis, and then discuss local acceleration guarantees of the proposed acceleration scheme. Moreover, our result provides a geometric interpretation of these vector extrapolation techniques. Numerical experiments on various first-order methods are provided to demonstrate the advantage of the proposed adaptive acceleration scheme.

preprint · 2020 · arXiv

The geometry of off-the-grid compressed sensing

This paper presents a sharp geometric analysis of the recovery performance of sparse regularization. More specifically, we analyze the BLASSO method which estimates a sparse measure (sum of Dirac masses) from randomized sub-sampled measurements. This is a "continuous", often called off-the-grid, extension of the compressed sensing problem, where the $\ell^1$ norm is replaced by the total variation of measures. This extension is appealing from a numerical perspective because it avoids discretizing the space on a grid. But more importantly, it makes explicit the geometry of the problem since the positions of the Diracs can now freely move over the parameter space. On a methodological level, our contribution is to propose the Fisher geodesic distance on this parameter space as the canonical metric to analyze super-resolution in a way which is invariant to reparameterization of this space. Switching to the Fisher metric allows us to take into account measurement operators which are not translation invariant, which is crucial for applications such as Laplace inversion in imaging, Gaussian mixtures estimation and training of multilayer perceptrons with one hidden layer. On a theoretical level, our main contribution shows that if the Fisher distance between spikes is larger than a Rayleigh separation constant, then the BLASSO recovers in a stable way a stream of Diracs, provided that the number of measurements is proportional (up to log factors) to the number of Diracs. We measure the stability using an optimal transport distance constructed on top of the Fisher geodesic distance. Our result is (up to log factors) sharp and does not require any randomness assumption on the amplitudes of the underlying measure. Our proof technique relies on an infinite-dimensional extension of the so-called "golfing scheme" which operates over the space of measures and is of general interest.

preprint · 2019 · arXiv

On instabilities of deep learning in image reconstruction - Does AI come at a cost?

Deep learning, due to its unprecedented success in tasks such as image classification, has emerged as a new tool in image reconstruction with potential to change the field. In this paper we demonstrate a crucial phenomenon: deep learning typically yields unstable methods for image reconstruction. The instabilities usually occur in several forms: (1) tiny, almost undetectable perturbations, both in the image and sampling domain, may result in severe artefacts in the reconstruction, (2) a small structural change, for example a tumour, may not be captured in the reconstructed image and (3) (a counterintuitive type of instability) more samples may yield poorer performance. Our new stability test with algorithms and easy-to-use software detects the instability phenomena. The test is aimed at researchers to test their networks for instabilities and for government agencies, such as the Food and Drug Administration (FDA), to secure safe use of deep learning methods.

preprint · 2016 · arXiv

A practical guide to the recovery of wavelet coefficients from Fourier measurements

In a series of recent papers (Adcock, Hansen and Poon, 2013, Appl. Comput. Harm. Anal. 45(5):3132-3167), (Adcock, Gataric and Hansen, 2014, SIAM J. Imaging Sci. 7(3):1690-1723) and (Adcock, Hansen, Kutyniok and Ma, 2015, SIAM J. Math. Anal. 47(2):1196-1233), it was shown that one can optimally recover the wavelet coefficients of an unknown compactly supported function from pointwise evaluations of its Fourier transform via the method of generalized sampling. While these papers focused on the optimality of generalized sampling in terms of its stability and error bounds, the current paper explains how this optimal method can be implemented to yield a computationally efficient algorithm. In particular, we show that generalized sampling has a computational complexity of $\mathcal{O}(M(N)\log N)$ when recovering the first $N$ boundary-corrected wavelet coefficients of an unknown compactly supported function from $M(N)$ Fourier samples. Therefore, due to the linear correspondences between the number of samples $M$ and number of coefficients $N$ shown previously, generalized sampling offers a computationally optimal way of recovering wavelet coefficients from Fourier data.

preprint · 2016 · arXiv

Geometric properties of solutions to the total variation denoising problem

This article studies the denoising performance of total variation (TV) image regularization. More precisely, we study geometrical properties of the solution to the so-called Rudin-Osher-Fatemi total variation denoising method. The first contribution of this paper is a precise mathematical definition of the "extended support" (associated to the noise-free image) of TV denoising. It is intuitively the region which is unstable and will suffer from the staircasing effect. We highlight in several practical cases, such as the indicator of convex sets, that this region can be determined explicitly. Our second and main contribution is a proof that the TV denoising method indeed restores an image which is exactly constant outside a small tube surrounding the extended support. The radius of this tube shrinks toward zero as the noise level vanishes, and we are able to determine, in some cases, an upper bound on the convergence rate. For indicators of so-called "calibrable" sets (such as disks or properly eroded squares), this extended support matches the edges, so that discontinuities produced by TV denoising cluster tightly around the edges. In contrast, for indicators of more general shapes or for complicated images, this extended support can be larger. Besides these main results, our paper also proves several intermediate results about fine properties of TV regularization, in particular for indicators of calibrable and convex sets, which are of independent interest.
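As a rough illustration of the behaviour described in this abstract, the sketch below solves a 1D discrete analogue of the Rudin-Osher-Fatemi problem, $\min_u \tfrac12\|u-f\|^2 + \lambda\|Du\|_1$. It is a toy under stated assumptions, not the paper's setting: the paper studies the continuous 2D problem, while the solver used here (projected gradient on the dual), the names `adjoint_diff` and `tv_denoise_1d`, and the parameter values are all hypothetical choices made for illustration.

```python
import numpy as np

def adjoint_diff(p):
    """Apply D^T, the adjoint of the forward difference D u = np.diff(u)."""
    return np.concatenate(([0.0], p)) - np.concatenate((p, [0.0]))

def tv_denoise_1d(f, lam, n_iter=2000):
    """Minimise 0.5*||u - f||^2 + lam*||D u||_1 by projected gradient on the dual."""
    f = np.asarray(f, dtype=float)
    p = np.zeros(f.size - 1)   # one dual variable per finite difference
    tau = 0.25                 # step size 1/4, admissible because ||D D^T|| <= 4
    for _ in range(n_iter):
        u = f - adjoint_diff(p)                       # primal iterate recovered from p
        p = np.clip(p + tau * np.diff(u), -lam, lam)  # gradient step, then projection onto |p_i| <= lam
    return f - adjoint_diff(p)

# Noise-free step signal: an "indicator" with a single edge (illustrative data).
f = np.array([0.0] * 5 + [1.0] * 5)
u = tv_denoise_1d(f, lam=1.0)
print(np.round(u, 3))
```

For this step signal the minimiser is 0.2 on the first half and 0.8 on the second: the contrast is reduced, but the only jump stays exactly at the original edge and the solution is constant elsewhere, which is the 1D counterpart of discontinuities clustering at the edges.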

preprint · 2016 · arXiv

On Cartesian line sampling with anisotropic total variation regularization

This paper considers the use of the anisotropic total variation seminorm to recover a two dimensional vector $x\in \mathbb{C}^{N\times N}$ from its partial Fourier coefficients, sampled along Cartesian lines. We prove that if $(x_{k,j} - x_{k-1,j})_{k,j}$ has at most $s_1$ nonzero coefficients in each column and $(x_{k,j} - x_{k,j-1})_{k,j}$ has at most $s_2$ nonzero coefficients in each row, then, up to multiplication by $\log$ factors, one can exactly recover $x$ by sampling along $s_1$ horizontal lines of its Fourier coefficients and along $s_2$ vertical lines of its Fourier coefficients. Finally, unlike standard compressed sensing estimates, the $\log$ factors involved are dependent on the separation distance between the nonzero entries in each row/column of the gradient of $x$ and not on $N^2$, the ambient dimension of $x$.
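The quantities in this abstract can be made concrete with a small sketch. It assumes non-periodic finite differences and a real-valued test image, and the helper names (`anisotropic_tv`, `gradient_sparsity`, `cartesian_line_mask`) are hypothetical; the paper's exact boundary conventions and sampling scheme may differ.

```python
import numpy as np

def anisotropic_tv(x):
    """Anisotropic TV seminorm: sum of absolute vertical and horizontal differences."""
    dv = np.diff(x, axis=0)  # x[k, j] - x[k-1, j]
    dh = np.diff(x, axis=1)  # x[k, j] - x[k, j-1]
    return float(np.abs(dv).sum() + np.abs(dh).sum())

def gradient_sparsity(x, tol=1e-12):
    """Return (s1, s2): max nonzero vertical differences per column,
    and max nonzero horizontal differences per row."""
    dv = np.diff(x, axis=0)
    dh = np.diff(x, axis=1)
    s1 = int((np.abs(dv) > tol).sum(axis=0).max())
    s2 = int((np.abs(dh) > tol).sum(axis=1).max())
    return s1, s2

def cartesian_line_mask(n, rows, cols):
    """Boolean mask selecting full rows and full columns of an n-by-n DFT grid."""
    mask = np.zeros((n, n), dtype=bool)
    mask[list(rows), :] = True
    mask[:, list(cols)] = True
    return mask

# Illustrative piecewise-constant image: a 3-by-4 rectangle on an 8-by-8 grid.
x = np.zeros((8, 8))
x[2:5, 3:7] = 1.0

tv = anisotropic_tv(x)
s1, s2 = gradient_sparsity(x)
mask = cartesian_line_mask(8, rows=[0, 1], cols=[0])
samples = np.fft.fft2(x)[mask]  # Fourier coefficients observed along the chosen lines
```

Here the rectangle gives two nonzero differences in each affected column and row, so `s1 = s2 = 2`, and the mask keeps two full rows and one full column of the 2D DFT.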

preprint · 2015 · arXiv

On the role of total variation in compressed sensing

This paper considers the problem of recovering a one- or two-dimensional discrete signal which is approximately sparse in its discrete gradient from an incomplete subset of its discrete Fourier coefficients which have been corrupted with noise. We prove that in order to obtain a reconstruction which is robust to noise and stable to inexact gradient sparsity of order $s$ with high probability, it suffices to draw $\mathcal{O}(s \log N)$ of the available Fourier coefficients uniformly at random. However, we also show that if one draws $\mathcal{O}(s \log N)$ samples in accordance with a particular distribution which concentrates on the low Fourier frequencies, then the stability bounds which can be guaranteed are optimal up to $\log$ factors. Finally, we prove that in the one-dimensional case, where the underlying signal is gradient sparse and its sparsity pattern satisfies a minimum separation condition, to guarantee exact recovery with high probability it suffices, for some $M<N$, to draw $\mathcal{O}(s\log M\log s)$ samples uniformly at random from the Fourier coefficients whose frequencies are no greater than $M$.

preprint · 2015 · arXiv

Structure dependent sampling in compressed sensing: theoretical guarantees for tight frames

Many of the applications of compressed sensing have been based on variable density sampling, where certain sections of the sampling coefficients are sampled more densely. Furthermore, it has been observed that these sampling schemes are dependent not only on sparsity but also on the sparsity structure of the underlying signal. This paper extends the result of (Adcock, Hansen, Poon and Roman, arXiv:1302.0561, 2013) to the case where the sparsifying system forms a tight frame. By dividing the sampling coefficients into levels, our main result will describe how the amount of subsampling in each level is determined by the local coherences between the sampling and sparsifying operators and the localized level sparsities -- the sparsity in each level under the sparsifying operator.

preprint · 2014 · arXiv

Breaking the coherence barrier: A new theory for compressed sensing

This paper provides an extension of compressed sensing which bridges a substantial gap between existing theory and its current use in real-world applications. It introduces a mathematical framework that generalizes the three standard pillars of compressed sensing - namely, sparsity, incoherence and uniform random subsampling - to three new concepts: asymptotic sparsity, asymptotic incoherence and multilevel random sampling. The new theorems show that compressed sensing is also possible, and reveal several advantages, under these substantially relaxed conditions. The importance of this is threefold. First, inverse problems to which compressed sensing is currently applied are typically coherent. The new theory provides the first comprehensive mathematical explanation for a range of empirical usages of compressed sensing in real-world applications, such as medical imaging, microscopy, spectroscopy and others. Second, in showing that compressed sensing does not require incoherence, but instead that asymptotic incoherence is sufficient, the new theory offers markedly greater flexibility in the design of sensing mechanisms. Third, by using asymptotic incoherence and multilevel sampling to exploit not just sparsity, but also structure, i.e. asymptotic sparsity, the new theory shows that substantially improved reconstructions can be obtained from fewer measurements.
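A minimal sketch of what a multilevel random sampling pattern might look like: the index range is split into consecutive levels and a fixed number of indices is drawn uniformly at random, without replacement, inside each level. The function name `multilevel_sample`, the level boundaries and the per-level counts are illustrative assumptions; the abstract does not specify them.

```python
import numpy as np

def multilevel_sample(boundaries, counts, seed=0):
    """Draw counts[k] distinct indices uniformly from [boundaries[k], boundaries[k+1])."""
    rng = np.random.default_rng(seed)
    picks = []
    for lo, hi, m in zip(boundaries[:-1], boundaries[1:], counts):
        # Uniform sampling without replacement inside one level.
        picks.append(lo + rng.choice(hi - lo, size=m, replace=False))
    return np.sort(np.concatenate(picks))

# Hypothetical scheme on 128 indices: the lowest level is fully sampled,
# higher levels are sampled more and more sparsely.
boundaries = [0, 8, 32, 128]
counts = [8, 12, 16]
idx = multilevel_sample(boundaries, counts)
print(len(idx), "of", boundaries[-1], "indices sampled")
```

Uniform random subsampling is the special case of a single level; using several levels lets the sampling density vary across the index range.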

preprint · 2014 · arXiv

On the role of total variation in compressed sensing - structure dependence

This paper considers the use of total variation regularization in the recovery of approximately gradient sparse signals from their noisy discrete Fourier samples in the context of compressed sensing. It has been observed over the last decade that a reconstruction which is robust to noise and stable to inexact sparsity can be achieved when we observe a highly incomplete subset of the Fourier samples for which the samples have been drawn in a random manner. Furthermore, in order to minimize the cardinality of the set of Fourier samples, the sampling set needs to be drawn in a non-uniform manner and the use of randomness is far more complex than the notion of uniform random sampling often considered in the theoretical results of compressed sensing. The purpose of this paper is to derive recovery guarantees in the case where the sampling set is drawn in a non-uniform random manner. We will show how the sampling set is dependent on the sparsity structure of the underlying signal.

preprint · 2013 · arXiv

Beyond consistent reconstructions: optimality and sharp bounds for generalized sampling, and application to the uniform resampling problem

Generalized sampling is a recently developed linear framework for sampling and reconstruction in separable Hilbert spaces. It allows one to recover any element in any finite-dimensional subspace given finitely many of its samples with respect to an arbitrary frame. Unlike more common approaches for this problem, such as the consistent reconstruction technique of Eldar et al., it leads to completely stable numerical methods possessing both guaranteed stability and accuracy. The purpose of this paper is twofold. First, we give a complete and formal analysis of generalized sampling, the main result of which is the derivation of new, sharp bounds for the accuracy and stability of this approach. Such bounds improve those given previously, and result in a necessary and sufficient condition, the stable sampling rate, which guarantees a priori a good reconstruction. Second, we address the topic of optimality. Under some assumptions, we show that generalized sampling is an optimal, stable reconstruction. Correspondingly, whenever these assumptions hold, the stable sampling rate is a universal quantity. In the final part of the paper we illustrate our results by applying generalized sampling to the so-called uniform resampling problem.

preprint · 2013 · arXiv

On optimal wavelet reconstructions from Fourier samples: linearity and universality of the stable sampling rate

In this paper we study the problem of computing wavelet coefficients of compactly supported functions from their Fourier samples. For this, we use the recently introduced framework of generalized sampling. Our first result demonstrates that using generalized sampling one obtains a stable and accurate reconstruction, provided the number of Fourier samples grows linearly in the number of wavelet coefficients recovered. For the class of Daubechies wavelets we derive the exact constant of proportionality. Our second result concerns the optimality of generalized sampling for this problem. Under some mild assumptions we show that generalized sampling cannot be outperformed in terms of approximation quality by more than a constant factor. Moreover, for the class of so-called perfect methods, any attempt to lower the sampling ratio below a certain critical threshold necessarily results in exponential ill-conditioning. Thus generalized sampling provides a nearly-optimal solution to this problem.
