Source author record

Ernesto De Vito

Ernesto De Vito appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, unclaimed source record

math.FA Machine Learning math.GR math.RT math.SP

Catalog footprint

What is connected

14 works

5 topics

4 close collaborators

preprint, 2022, arXiv

Efficient Hyperparameter Tuning for Large Scale Kernel Ridge Regression

Kernel methods provide a principled approach to nonparametric learning. While their basic implementations scale poorly to large problems, recent advances showed that approximate solvers can efficiently handle massive datasets. A shortcoming of these solutions is that hyperparameter tuning is not taken care of, and left for the user to perform. Hyperparameters are crucial in practice and the lack of automated tuning greatly hinders efficiency and usability. In this paper, we work to fill in this gap focusing on kernel ridge regression based on the Nyström approximation. After reviewing and contrasting a number of hyperparameter tuning strategies, we propose a complexity regularization criterion based on a data dependent penalty, and discuss its efficient optimization. Then, we proceed to a careful and extensive empirical evaluation highlighting strengths and weaknesses of the different tuning strategies. Our analysis shows the benefit of the proposed approach, that we hence incorporate in a library for large scale kernel methods to derive adaptively tuned solutions.
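As a rough illustration of the setting this abstract describes, the following is a minimal sketch of Nyström kernel ridge regression with fixed hyperparameters. It is not the paper's tuning criterion or its library. The Gaussian kernel, the uniformly sampled centers, the toy data, and the function names (`gaussian_kernel`, `nystrom_krr_fit`, `nystrom_krr_predict`) are all assumptions made for this example. The kernel width `sigma` and the penalty `lam` are the kind of hyperparameters the abstract says must be tuned.

```python
import numpy as np

def gaussian_kernel(A, B, sigma):
    """Gaussian kernel matrix between the rows of A and the rows of B."""
    sq = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma**2))

def nystrom_krr_fit(X, y, centers, sigma, lam):
    """Solve (K_nm^T K_nm + n*lam*K_mm) alpha = K_nm^T y for m coefficients."""
    n = X.shape[0]
    K_nm = gaussian_kernel(X, centers, sigma)
    K_mm = gaussian_kernel(centers, centers, sigma)
    A = K_nm.T @ K_nm + n * lam * K_mm
    # lstsq tolerates the ill-conditioning typical of Gaussian kernel matrices
    return np.linalg.lstsq(A, K_nm.T @ y, rcond=None)[0]

def nystrom_krr_predict(X_new, centers, alpha, sigma):
    """Evaluate the fitted function at new points."""
    return gaussian_kernel(X_new, centers, sigma) @ alpha

# Toy data (hypothetical): noisy samples of sin on [-3, 3]
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(500, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(500)

# m = 50 centers sampled uniformly from the training set
centers = X[rng.choice(500, size=50, replace=False)]
alpha = nystrom_krr_fit(X, y, centers, sigma=1.0, lam=1e-4)

X_test = np.linspace(-3, 3, 200)[:, None]
pred = nystrom_krr_predict(X_test, centers, alpha, sigma=1.0)
test_mse = np.mean((pred - np.sin(X_test[:, 0]))**2)
print(f"test MSE: {test_mse:.4f}")
```

Only an m-by-m system is solved, which is why the approximation scales to large n; the quality of the fit still depends on the fixed choices of `sigma` and `lam` made here by hand.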

preprint, 2022, arXiv

Mean Nyström Embeddings for Adaptive Compressive Learning

Compressive learning is an approach to efficient large scale learning based on sketching an entire dataset to a single mean embedding (the sketch), i.e. a vector of generalized moments. The learning task is then approximately solved as an inverse problem using an adapted parametric model. Previous works in this context have focused on sketches obtained by averaging random features, which, while universal, can be poorly adapted to the problem at hand. In this paper, we propose and study the idea of performing sketching based on data-dependent Nyström approximation. From a theoretical perspective we prove that the excess risk can be controlled under a geometric assumption relating the parametric model used to learn from the sketch and the covariance operator associated to the task at hand. Empirically, we show for k-means clustering and Gaussian modeling that for a fixed sketch size, Nyström sketches indeed outperform those built with random features.
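To make the notion of a sketch concrete, here is a minimal example, under assumptions of my own, of averaging Nyström features into a single mean embedding. The feature map `K(x, landmarks) @ K_mm^{-1/2}`, the Gaussian kernel, the landmark sampling, and the names `gaussian_kernel` and `nystrom_features` are illustrative choices, not the construction analyzed in the paper. The inverse-problem step that learns from the sketch is not shown.

```python
import numpy as np

def gaussian_kernel(A, B, sigma):
    """Gaussian kernel matrix between the rows of A and the rows of B."""
    sq = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma**2))

def nystrom_features(X, landmarks, sigma):
    """Map each row x to K(x, landmarks) @ K_mm^{-1/2} (pseudo-inverse root)."""
    K_mm = gaussian_kernel(landmarks, landmarks, sigma)
    w, V = np.linalg.eigh(K_mm)
    keep = w > 1e-10 * w.max()          # drop numerically null directions
    W = V[:, keep] / np.sqrt(w[keep])   # columns span K_mm^{-1/2}
    return gaussian_kernel(X, landmarks, sigma) @ W

# Toy data (hypothetical): two Gaussian blobs in the plane
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2.0, 0.5, size=(500, 2)),
               rng.normal(+2.0, 0.5, size=(500, 2))])
sigma = 1.0
landmarks = X[rng.choice(len(X), size=20, replace=False)]

# The sketch: one vector summarizing the whole dataset
sketch = nystrom_features(X, landmarks, sigma).mean(axis=0)

# Sketches are linear in the data, so equal-sized chunks can be merged by averaging
s1 = nystrom_features(X[:500], landmarks, sigma).mean(axis=0)
s2 = nystrom_features(X[500:], landmarks, sigma).mean(axis=0)
merged = 0.5 * (s1 + s2)
print(sketch.shape, np.allclose(sketch, merged))
```

Because the landmarks are drawn from the data, the features adapt to where the data lives, which is the contrast with data-independent random features drawn in the abstract.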

preprint, 2022, arXiv

Multiclass learning with margin: exponential rates with no bias-variance trade-off

We study the behavior of error bounds for multiclass classification under suitable margin conditions. For a wide variety of methods we prove that the classification error under a hard-margin condition decreases exponentially fast without any bias-variance trade-off. Different convergence rates can be obtained in correspondence of different margin assumptions. With a self-contained and instructive analysis we are able to generalize known results from the binary to the multiclass setting.

preprint, 2021, arXiv

Construction and Monte Carlo estimation of wavelet frames generated by a reproducing kernel

We introduce a construction of multiscale tight frames on general domains. The frame elements are obtained by spectral filtering of the integral operator associated with a reproducing kernel. Our construction extends classical wavelets as well as generalized wavelets on both continuous and discrete non-Euclidean structures such as Riemannian manifolds and weighted graphs. Moreover, it allows one to study the relation between continuous and discrete frames in a random sampling regime, where discrete frames can be seen as Monte Carlo estimates of the continuous ones. Pairing spectral regularization with learning theory, we show that a sample frame tends to its population counterpart, and derive explicit finite-sample rates on spaces of Sobolev and Besov regularity. Our results prove the stability of frames constructed on empirical data, in the sense that all stochastic discretizations have the same underlying limit regardless of the set of initial training samples.

preprint, 2020, arXiv

Radon Transform: Dual Pairs and Irreducible Representations

We illustrate the general point of view developed in [SIAM J. Math. Anal., 51(6), 4356-4381] that can be described as a variation of Helgason's theory of dual $G$-homogeneous pairs $(X,Ξ)$ and which allows us to prove intertwining properties and inversion formulae of many existing Radon transforms. Here we analyze in detail one of the important aspects in the theory of dual pairs, namely the injectivity of the map label-to-manifold $ξ\to\hatξ$ and we prove that it is a necessary condition for the irreducibility of the quasi-regular representation of $G$ on $L^2(Ξ)$. We further explain how the theory in [SIAM J. Math. Anal., 51(6), 4356-4381] applies to the classical Radon and X-ray transforms in $\mathbb R^3$.

preprint, 2019, arXiv

Monte Carlo wavelets: a randomized approach to frame discretization

In this paper we propose and study a family of continuous wavelets on general domains, and a corresponding stochastic discretization that we call Monte Carlo wavelets. First, using tools from the theory of reproducing kernel Hilbert spaces and associated integral operators, we define a family of continuous wavelets by spectral calculus. Then, we propose a stochastic discretization based on Monte Carlo estimates of integral operators. Using concentration of measure results, we establish the convergence of such a discretization and derive convergence rates under natural regularity assumptions.

preprint, 2015, arXiv

Different faces of the shearlet group

Recently, shearlet groups have received much attention in connection with shearlet transforms applied for orientation sensitive image analysis and restoration. The square integrable representations of the shearlet groups provide not only the basis for the shearlet transforms but also for a very natural definition of scales of smoothness spaces, called shearlet coorbit spaces. The aim of this paper is twofold: first we discover isomorphisms between shearlet groups and other well-known groups, namely extended Heisenberg groups and subgroups of the symplectic group. Interestingly, the connected shearlet group with positive dilations has an isomorphic copy in the symplectic group, while this is not true for the full shearlet group with all nonzero dilations. Indeed we prove the general result that there exists, up to adjoint action of the symplectic group, only one embedding of the extended Heisenberg algebra into the Lie algebra of the symplectic group. Having understood the various group isomorphisms it is natural to ask for the relations between coorbit spaces of isomorphic groups with equivalent representations. These connections are examined in the second part of the paper. We describe how isomorphic groups with equivalent representations lead to isomorphic coorbit spaces. In particular we apply this result to square integrable representations of the connected shearlet groups and metaplectic representations of subgroups of the symplectic group. This implies the definition of metaplectic coorbit spaces. Besides the usual full and connected shearlet groups we also deal with Toeplitz shearlet groups.

preprint, 2015, arXiv

Reproducing subgroups of $Sp(2,\mathbb{R})$. Part I: algebraic classification

We classify the connected Lie subgroups of the symplectic group $Sp(2,\mathbb{R})$ whose elements are matrices in block lower triangular form. The classification is up to conjugation within $Sp(2,\mathbb{R})$. Their study is motivated by the need for a unified approach to continuous 2D signal analyses, such as those provided by wavelets and shearlets.

preprint, 2014, arXiv

Coorbit spaces with voice in a Fréchet space

We set up a new general coorbit space theory for reproducing representations of a locally compact second countable group $G$ that are not necessarily irreducible nor integrable. Our basic assumption is that the kernel associated with the voice transform belongs to a Fréchet space $\mathcal T$ of functions on $G$, which generalizes the classical choice $\mathcal T=L_w^1(G)$. Our basic example is $\mathcal T=\bigcap_{p\in(1,+\infty)} L^p(G)$, or weighted versions of it. By means of this choice it is possible to treat, for instance, Paley-Wiener spaces and coorbit spaces related to Shannon wavelets and Schrödingerlets.

preprint, 2014, arXiv

Geometric classification of semidirect products in the maximal parabolic subgroup of $\operatorname{Sp}(2,\mathbb{R})$

We classify up to conjugation by $\operatorname{GL}(2,\mathbb{R})$ (more precisely, block diagonal symplectic matrices) all the semidirect products inside the maximal parabolic of $\operatorname{Sp}(2,\mathbb{R})$ by means of an essentially geometric argument. This classification has already been established without geometry, under a stricter notion of equivalence, namely conjugation by arbitrary symplectic matrices. The present approach might be useful in higher dimensions and provides some insight.

preprint, 2014, arXiv

Learning Sets with Separating Kernels

We consider the problem of learning a set from random samples. We show how relevant geometric and topological properties of a set can be studied analytically using concepts from the theory of reproducing kernel Hilbert spaces. A new kind of reproducing kernel, that we call separating kernel, plays a crucial role in our study and is analyzed in detail. We prove a new analytic characterization of the support of a distribution, that naturally leads to a family of provably consistent regularized learning algorithms and we discuss the stability of these methods with respect to random sampling. Numerical experiments show that the approach is competitive with, and often better than, other state-of-the-art techniques.

preprint, 2012, arXiv

Reproducing subgroups of Sp(2,R). Part II: admissible vectors

In part I we introduced the class ${\mathcal E}_2$ of Lie subgroups of $Sp(2,\mathbb{R})$ and obtained a classification up to conjugation (Theorem 1.1). Here, we determine for which of these groups the restriction of the metaplectic representation gives rise to a reproducing formula. In all the positive cases we characterize the admissible vectors with a generalized Calderón equation. They include products of 1D-wavelets, directional wavelets, shearlets, and many new examples.

preprint, 2011, arXiv

A mock metaplectic representation

We obtain necessary and sufficient conditions for the admissible vectors of a new unitary non-irreducible representation $U$. The group $G$ is an arbitrary semidirect product whose normal factor $A$ is abelian and whose homogeneous factor $H$ is a locally compact second countable group acting on a Riemannian manifold $M$. The key ingredient in the construction of $U$ is a $C^1$ intertwining map between the actions of $H$ on the dual group $\hat A$ and on $M$. The representation $U$ generalizes the restriction of the metaplectic representation to triangular subgroups of $Sp(d,\mathbb{R})$, whence the name "mock metaplectic". For simplicity, we content ourselves with the case where $A=\mathbb{R}^n$ and $M=\mathbb{R}^d$. The main technical point is the decomposition of $U$ as a direct integral of its irreducible components. This theory is motivated by some recent developments in signal analysis, notably shearlets. Many related examples are discussed.

preprint, 2011, arXiv

An extension of Mercer theorem to vector-valued measurable kernels

We extend the classical Mercer theorem to reproducing kernel Hilbert spaces whose elements are functions from a measurable space $X$ into $\mathbb C^n$. Given a finite measure $μ$ on $X$, we represent the reproducing kernel $K$ as a convergent series in terms of the eigenfunctions of a suitable compact operator depending on $K$ and $μ$. Our result holds under the mild assumption that $K$ is measurable and the associated Hilbert space is separable. Furthermore, we show that $X$ has a natural second countable topology with respect to which the eigenfunctions are continuous and the series representing $K$ uniformly converges to $K$ on any compact subset of $X\times X$, provided that the support of $μ$ is $X$.
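The series representation in this abstract has a simple finite analog that can be checked numerically. The sketch below covers only the scalar-valued case with an empirical (uniform) measure on n sample points, not the vector-valued measurable setting of the paper. The Gaussian kernel, the sample points, and the name `mercer_partial_sum` are assumptions made for illustration. Here the integral operator reduces to the matrix K/n, and the kernel values are recovered as a sum of eigenvalues times products of eigenfunctions.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 40
x = np.sort(rng.uniform(0.0, 1.0, n))

# Kernel matrix of a Gaussian kernel on the sample points (illustrative choice)
K = np.exp(-(x[:, None] - x[None, :])**2 / (2.0 * 0.2**2))

# Integral operator w.r.t. the empirical measure:
# (L f)(x_j) = (1/n) * sum_k K(x_j, x_k) f(x_k), i.e. the matrix K / n
eigvals, U = np.linalg.eigh(K / n)
eigvals, U = eigvals[::-1], U[:, ::-1]   # sort in decreasing order

# Eigenfunctions normalized in L2 of the empirical measure: (1/n) sum phi_i^2 = 1
Phi = np.sqrt(n) * U

def mercer_partial_sum(r):
    """Sum over the first r terms of eigval_i * phi_i(x_j) * phi_i(x_k)."""
    return (Phi[:, :r] * eigvals[:r]) @ Phi[:, :r].T

# Sup-norm error of the truncated series for increasing truncation levels
errors = [np.max(np.abs(K - mercer_partial_sum(r))) for r in (1, 5, 10, n)]
print(errors)
```

With all n terms the series reproduces K up to rounding, and the truncation error shrinks as more terms are kept, mirroring on a finite set the uniform convergence stated in the abstract.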
