Source author record

Thomas Oberlin

Thomas Oberlin appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.


Topics: Sound, eess.AS, eess.IV, Computer Vision, Machine Learning, astro-ph.IM, cond-mat.mtrl-sci, eess.SP, math.NA, Methodology, Numerical Analysis

Catalog footprint

What is connected

10 works

11 topics

4 close collaborators


Preprint, 2023, arXiv

Algorithms for audio inpainting based on probabilistic nonnegative matrix factorization

Audio inpainting, i.e., the task of restoring missing or occluded audio signal samples, usually relies on sparse representations or autoregressive modeling. In this paper, we propose to structure the spectrogram with nonnegative matrix factorization (NMF) in a probabilistic framework. First, we treat the missing samples as latent variables, and derive two expectation-maximization algorithms for estimating the parameters of the model, depending on whether we formulate the problem in the time- or time-frequency domain. Then, we treat the missing samples as parameters, and we address this novel problem by deriving an alternating minimization scheme. We assess the potential of these algorithms for the task of restoring short- to middle-length gaps in music signals. Experiments reveal great convergence properties of the proposed methods, as well as competitive performance when compared to state-of-the-art audio inpainting techniques.
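To illustrate the general idea of letting an NMF model fill in missing entries, here is a minimal sketch of masked NMF with multiplicative updates for a Euclidean cost. It is a swapped-in, simpler technique, not the expectation-maximization or alternating minimization algorithms of the paper. The function names, the binary mask `M` (1 = observed, 0 = missing), and the toy matrix are assumptions made for illustration.

```python
import random

def matmul(A, B):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def masked_cost(V, M, W, H):
    """Squared error between V and W @ H, counted on observed entries only."""
    WH = matmul(W, H)
    return sum(M[i][j] * (V[i][j] - WH[i][j]) ** 2
               for i in range(len(V)) for j in range(len(V[0])))

def masked_nmf(V, M, rank, n_iter=200, seed=0, eps=1e-12):
    """Hypothetical helper: fit V ~ W @ H using only entries where M == 1."""
    rng = random.Random(seed)
    F, N = len(V), len(V[0])
    W = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(F)]
    H = [[rng.random() + 0.1 for _ in range(N)] for _ in range(rank)]
    MV = [[M[i][j] * V[i][j] for j in range(N)] for i in range(F)]
    for _ in range(n_iter):
        # Update W with H fixed (masked multiplicative update).
        WH = matmul(W, H)
        MWH = [[M[i][j] * WH[i][j] for j in range(N)] for i in range(F)]
        Ht = transpose(H)
        num, den = matmul(MV, Ht), matmul(MWH, Ht)
        W = [[W[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)]
             for i in range(F)]
        # Update H with the new W fixed.
        WH = matmul(W, H)
        MWH = [[M[i][j] * WH[i][j] for j in range(N)] for i in range(F)]
        Wt = transpose(W)
        num, den = matmul(Wt, MV), matmul(Wt, MWH)
        H = [[H[k][j] * num[k][j] / (den[k][j] + eps) for j in range(N)]
             for k in range(rank)]
    return W, H

# Toy rank-1 nonnegative matrix with one entry treated as missing.
V = [[1.0, 2.0, 4.0], [2.0, 4.0, 8.0], [3.0, 6.0, 12.0]]
M = [[1, 1, 1], [1, 0, 1], [1, 1, 1]]
W, H = masked_nmf(V, M, rank=1)
estimate_of_missing = matmul(W, H)[1][1]  # model value at the masked position
```

The masked position is never used during fitting, so the value the factorization predicts there acts as the inpainted estimate.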

Preprint, 2022, arXiv

Learning the Proximity Operator in Unfolded ADMM for Phase Retrieval

This paper considers the phase retrieval (PR) problem, which aims to reconstruct a signal from phaseless measurements such as magnitude or power spectrograms. PR is generally handled as a minimization problem involving a quadratic loss. Recent works have considered alternative discrepancy measures, such as the Bregman divergences, but it is still challenging to tailor the optimal loss for a given setting. In this paper we propose a novel strategy to automatically learn the optimal metric for PR. We unfold a recently introduced ADMM algorithm into a neural network, and we emphasize that the information about the loss used to formulate the PR problem is conveyed by the proximity operator involved in the ADMM updates. Therefore, we replace this proximity operator with trainable activation functions: learning these in a supervised setting is then equivalent to learning an optimal metric for PR. Experiments conducted with speech signals show that our approach outperforms the baseline ADMM, using a light and interpretable neural architecture.

Preprint, 2021, arXiv

Phase recovery with Bregman divergences for audio source separation

Time-frequency audio source separation is usually achieved by estimating the short-time Fourier transform (STFT) magnitude of each source, and then applying a phase recovery algorithm to retrieve time-domain signals. In particular, the multiple input spectrogram inversion (MISI) algorithm has shown good performance in several recent works. This algorithm minimizes a quadratic reconstruction error between magnitude spectrograms. However, this loss does not properly account for some perceptual properties of audio, and alternative discrepancy measures such as beta-divergences have been preferred in many settings. In this paper, we propose to reformulate phase recovery in audio source separation as a minimization problem involving Bregman divergences. To optimize the resulting objective, we derive a projected gradient descent algorithm. Experiments conducted on a speech enhancement task show that this approach outperforms MISI for several alternative losses, which highlights their relevance for audio source separation applications.

Preprint, 2021, arXiv

Phase retrieval with Bregman divergences and application to audio signal recovery

Phase retrieval (PR) aims to recover a signal from the magnitudes of a set of inner products. This problem arises in many audio signal processing applications which operate on a short-time Fourier transform magnitude or power spectrogram, and discard the phase information. Recovering the missing phase from the resulting modified spectrogram is indeed necessary in order to synthesize time-domain signals. PR is commonly addressed by considering a minimization problem involving a quadratic loss function. In this paper, we adopt a different standpoint. Indeed, the quadratic loss does not properly account for some perceptual properties of audio, and alternative discrepancy measures such as beta-divergences have been preferred in many settings. Therefore, we formulate PR as a new minimization problem involving Bregman divergences. Since these divergences are not symmetric with respect to their two input arguments in general, they lead to two different formulations of the problem. To optimize the resulting objective, we derive two algorithms based on accelerated gradient descent and the alternating direction method of multipliers. Experiments conducted on audio signal recovery from spectrograms that are either exact or estimated from noisy observations highlight the potential of our proposed methods for audio restoration. In particular, leveraging some of these Bregman divergences induces better performance than the quadratic loss when performing PR from spectrograms under very noisy conditions.
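The abstract above notes that these divergences are generally not symmetric in their two arguments. A small sketch of the beta-divergence family, which belongs to the Bregman divergences, makes this concrete. The function names are assumptions for illustration, and the code only evaluates the divergence; it does not implement either of the paper's algorithms.

```python
import math

def beta_divergence(x, y, beta):
    """Beta-divergence d_beta(x | y) between two strictly positive scalars."""
    if x <= 0 or y <= 0:
        raise ValueError("x and y must be strictly positive")
    if beta == 1:
        # Generalized Kullback-Leibler divergence.
        return x * math.log(x / y) - x + y
    if beta == 0:
        # Itakura-Saito divergence.
        return x / y - math.log(x / y) - 1
    # General case; beta = 2 gives half the squared error (x - y)^2 / 2.
    return (x ** beta + (beta - 1) * y ** beta
            - beta * x * y ** (beta - 1)) / (beta * (beta - 1))

def total_divergence(xs, ys, beta):
    """Sum of element-wise divergences, e.g. over spectrogram bins."""
    return sum(beta_divergence(x, y, beta) for x, y in zip(xs, ys))

quadratic = beta_divergence(3.0, 1.0, 2)   # (3 - 1)^2 / 2 = 2.0
kl_forward = beta_divergence(1.0, 2.0, 1)  # d(1 | 2)
kl_reverse = beta_divergence(2.0, 1.0, 1)  # d(2 | 1), differs from d(1 | 2)
```

Swapping the arguments leaves the quadratic case unchanged but changes the Kullback-Leibler and Itakura-Saito values, which is why the order of the arguments yields two distinct problem formulations.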

Preprint, 2021, arXiv

Regularization via deep generative models: an analysis point of view

This paper proposes a new way of regularizing an inverse problem in imaging (e.g., deblurring or inpainting) by means of a deep generative neural network. Compared to end-to-end models, such approaches seem particularly interesting since the same network can be used for many different problems and experimental conditions, as soon as the generative model is suited to the data. Previous works proposed to use a synthesis framework, where the estimation is performed on the latent vector, the solution being obtained afterwards via the decoder. Instead, we propose an analysis formulation where we directly optimize the image itself and penalize the latent vector. We illustrate the interest of such a formulation by running experiments of inpainting, deblurring and super-resolution. In many cases our technique achieves a clear improvement of the performance and seems to be more robust, in particular with respect to initialization.

Preprint, 2020, arXiv

Fast reconstruction of atomic-scale STEM-EELS images from sparse sampling

This paper discusses the reconstruction of partially sampled spectrum-images to accelerate the acquisition in scanning transmission electron microscopy (STEM). The problem of image reconstruction has been widely considered in the literature for many imaging modalities, but only a few attempts have handled 3D data such as spectral images acquired by STEM electron energy loss spectroscopy (EELS). Besides, among the methods proposed in the microscopy literature, some are fast but inaccurate, while others provide accurate reconstruction at the price of a high computational burden. Thus none of the proposed reconstruction methods fulfills our expectations in terms of accuracy and computational complexity. In this paper, we propose a fast and accurate reconstruction method suited for atomic-scale EELS. This method is compared to popular solutions such as beta process factor analysis (BPFA), which is used for the first time on STEM-EELS images. Experiments are conducted on both real and synthetic data.

Preprint, 2020, arXiv

Ordinal Non-negative Matrix Factorization for Recommendation

We introduce a new non-negative matrix factorization (NMF) method for ordinal data, called OrdNMF. Ordinal data are categorical data which exhibit a natural ordering between the categories. In particular, they can be found in recommender systems, either with explicit data (such as ratings) or implicit data (such as quantized play counts). OrdNMF is a probabilistic latent factor model that generalizes Bernoulli-Poisson factorization (BePoF) and Poisson factorization (PF) applied to binarized data. Contrary to these methods, OrdNMF circumvents binarization and can exploit a more informative representation of the data. We design an efficient variational algorithm based on a suitable model augmentation and related to variational PF. In particular, our algorithm preserves the scalability of PF and can be applied to huge sparse datasets. We report recommendation experiments on explicit and implicit datasets, and show that OrdNMF outperforms BePoF and PF applied to binarized data.

Preprint, 2020, arXiv

Simulated JWST datasets for multispectral and hyperspectral image fusion

This paper aims to provide a comprehensive framework to generate an astrophysical scene and to simulate realistic hyperspectral and multispectral data acquired by two JWST instruments, namely NIRCam Imager and NIRSpec IFU. We want to show that this simulation framework can be used to assess the benefits of fusing these images to recover an image of high spatial and spectral resolution. To do so, we create a synthetic scene associated with a canonical infrared source, the Orion Bar. This scene combines pre-existing modelled spectra provided by the JWST Early Release Science Program 1288 and real high-resolution spatial maps from the Hubble Space Telescope and ALMA. We develop forward models, including the corresponding noise, for the two JWST instruments based on their technical designs and physical features. JWST observations are then simulated by applying the forward models to the aforementioned synthetic scene. We test a dedicated fusion algorithm we developed on these simulated observations. We show that the fusion process reconstructs the high spatio-spectral resolution scene with good accuracy in most areas, and we identify some limitations of the method to be tackled in future work. The synthetic scene and observations presented in the paper are made publicly available and can be used, for instance, to evaluate instrument models (aboard the JWST or on the ground), pipelines, or more sophisticated algorithms dedicated to JWST data analysis. Besides, fusion methods such as the one presented in this paper are shown to be promising tools to fully exploit the unprecedented capabilities of the JWST.

Preprint, 2015, arXiv

Bayesian Structured Sparsity Priors for EEG Source Localization Technical Report

This report introduces a new hierarchical Bayesian model for the EEG source localization problem. This model promotes structured sparsity to search for focal brain activity. This sparsity is obtained via a multivariate Bernoulli-Laplacian prior assigned to the brain activity, approximating an $\ell_{2,0}$ pseudo-norm regularization in a Bayesian framework. A partially collapsed Gibbs sampler is used to draw samples asymptotically distributed according to the posterior associated with the proposed Bayesian model. The generated samples are used to estimate the brain activity and the model hyperparameters jointly in an unsupervised framework. Two different kinds of Metropolis-Hastings moves are introduced to accelerate the convergence of the Gibbs sampler. The first move is based on multiple dipole shifts within each MCMC chain whereas the second one exploits proposals associated with different MCMC chains. We use both synthetic and real data to compare the performance of the proposed method with the weighted $\ell_{2,1}$ mixed norm regularization and a method based on a multiple sparse prior, showing that our algorithm presents advantages in several scenarios.
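The two mixed norms named in the abstract can be illustrated with a short sketch. It assumes a matrix whose rows are sources (dipoles) and whose columns are time samples; that layout, the function names, and the toy matrix are assumptions for illustration, and the code only evaluates the penalties, not the Bayesian model or the sampler.

```python
import math

def row_norms(X):
    """Euclidean (l2) norm of each row of a matrix given as a list of rows."""
    return [math.sqrt(sum(v * v for v in row)) for row in X]

def l21_norm(X):
    """l_{2,1} mixed norm: sum of the l2 norms of the rows."""
    return sum(row_norms(X))

def l20_pseudo_norm(X, tol=0.0):
    """l_{2,0} pseudo-norm: number of rows whose l2 norm exceeds tol."""
    return sum(1 for n in row_norms(X) if n > tol)

# Three hypothetical sources over two time samples; the second is inactive.
X = [[3.0, 4.0],
     [0.0, 0.0],
     [0.0, 5.0]]

print(l21_norm(X))         # 10.0
print(l20_pseudo_norm(X))  # 2
```

The $\ell_{2,0}$ count depends only on which rows are active, whereas the $\ell_{2,1}$ value also grows with their amplitudes, which is the usual reason the latter is described as a convex surrogate of the former.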

Preprint, 2012, arXiv

The Monogenic Synchrosqueezed Wavelet Transform: A tool for the Decomposition/Demodulation of AM-FM images

The synchrosqueezing method aims at decomposing 1D functions as superpositions of a small number of "Intrinsic Modes", supposed to be well separated both in time and frequency. Based on the unidimensional wavelet transform and its reconstruction properties, the synchrosqueezing transform provides a powerful representation of multicomponent signals in the time-frequency plane, together with a reconstruction of each mode. In this paper, a bidimensional version of the synchrosqueezing transform is defined, by considering a well-adapted extension of the concept of analytic signal to images: the monogenic signal. The natural bidimensional counterpart of the notion of Intrinsic Mode is then the concept of "Intrinsic Monogenic Mode" that we define. Thereafter, we investigate the properties of its associated Monogenic Wavelet Decomposition. This leads to a natural bivariate extension of the Synchrosqueezed Wavelet Transform, for decomposing and processing multicomponent images. Numerical tests validate the effectiveness of the method for different examples.
