Source author record

Mads Græsbøll Christensen

Mads Græsbøll Christensen appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, unclaimed source record

eess.AS Sound eess.SP Cryptography and Security

Catalog footprint

What is connected

6 works

4 topics

4 close collaborators

Preprint, 2022, arXiv

A Bayesian Permutation training deep representation learning method for speech enhancement with variational autoencoder

Recently, the variational autoencoder (VAE), a deep representation learning (DRL) model, has been used to perform speech enhancement (SE). However, to the best of our knowledge, current VAE-based SE methods only apply the VAE to model the speech signal, while noise is modeled using the traditional non-negative matrix factorization (NMF) model. One of the most important reasons for using NMF is that these VAE-based methods cannot disentangle the speech and noise latent variables from the observed signal. Based on Bayesian theory, this paper derives a novel variational lower bound for the VAE, which ensures that the VAE can be trained in a supervised manner and can disentangle speech and noise latent variables from the observed signal. This means that the proposed method can apply the VAE to model both speech and noise signals, which is fundamentally different from previous VAE-based SE works. More specifically, the proposed DRL method can learn to impose speech and noise signal priors on different sets of latent variables for SE. The experimental results show that the proposed method can not only disentangle speech and noise latent variables from the observed signal but also obtain a higher scale-invariant signal-to-distortion ratio and speech quality score than a comparable deep neural network (DNN)-based SE method.
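The abstract above reports results in terms of the scale-invariant signal-to-distortion ratio (SI-SDR). As a rough illustration of what that metric measures, here is a minimal sketch of the commonly used definition, in plain Python. The function name `si_sdr` and the list-based signal representation are assumptions made for this example and are not taken from the paper.

```python
import math


def si_sdr(reference, estimate):
    """Scale-invariant signal-to-distortion ratio in dB (common definition).

    Both arguments are equal-length sequences of real-valued samples.
    The name and interface are illustrative assumptions.
    """
    if len(reference) != len(estimate):
        raise ValueError("reference and estimate must have the same length")

    ref_energy = sum(r * r for r in reference)
    if ref_energy == 0:
        raise ValueError("reference signal must not be all zeros")

    # Optimal scaling of the reference that best explains the estimate.
    alpha = sum(r * e for r, e in zip(reference, estimate)) / ref_energy

    # Split the estimate into a scaled-target part and a residual part.
    target = [alpha * r for r in reference]
    residual = [e - t for e, t in zip(estimate, target)]

    target_energy = sum(t * t for t in target)
    residual_energy = sum(n * n for n in residual)

    if residual_energy == 0:
        return math.inf
    if target_energy == 0:
        return -math.inf
    return 10.0 * math.log10(target_energy / residual_energy)


if __name__ == "__main__":
    clean = [1.0, -1.0, 1.0, -1.0]
    noisy = [s + 0.1 for s in clean]
    print(round(si_sdr(clean, noisy), 3))
```

Because the reference is rescaled before the residual is computed, multiplying the estimate by any non-zero gain leaves the score unchanged, which is why the metric is called scale-invariant.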

Preprint, 2022, arXiv

A deep representation learning speech enhancement method using $β$-VAE

In previous work, we proposed a variational autoencoder (VAE)-based Bayesian permutation training speech enhancement (SE) method (PVAE), which indicated that the SE performance of the traditional deep neural network (DNN)-based method could be improved by deep representation learning (DRL). Building on that work, in this paper we propose to use $β$-VAE to further improve PVAE's representation learning ability. More specifically, our $β$-VAE can improve PVAE's capacity to disentangle different latent variables from the observed signal without the trade-off between disentanglement and signal reconstruction. This trade-off is widespread in previous $β$-VAE algorithms. Unlike previous $β$-VAE algorithms, the proposed $β$-VAE strategy can also be used to optimize the DNN's structure. This means that the proposed method can not only improve PVAE's SE performance but also reduce the number of PVAE training parameters. The experimental results show that the proposed method can acquire better speech and noise latent representations than PVAE. It also obtains a higher scale-invariant signal-to-distortion ratio, speech quality, and speech intelligibility.

Preprint, 2022, arXiv

Space Alternating Variational Estimation Based Sparse Bayesian Learning for Complex-value Sparse Signal Recovery Using Adaptive Laplace Priors

Due to its self-regularizing nature and its ability to quantify uncertainty, the Bayesian approach has achieved excellent recovery performance across a wide range of sparse signal recovery applications. However, most existing methods are based on the real-value signal model, with the complex-value signal model rarely considered. Motivated by the adaptive least absolute shrinkage and selection operator (LASSO) and the sparse Bayesian learning (SBL) framework, a hierarchical model with adaptive Laplace priors is proposed in this paper for recovery of complex sparse signals. Moreover, the space alternating approach is integrated into the algorithm to reduce the computational complexity of the proposed method. In experiments, the proposed algorithm is studied for complex Gaussian random dictionaries and different types of complex signals. These experiments show that the proposed algorithm offers better recovery performance for different types of complex signals than state-of-the-art methods.

Preprint, 2020, arXiv

A Speech Enhancement Algorithm based on Non-negative Hidden Markov Model and Kullback-Leibler Divergence

In this paper, we propose a novel supervised single-channel speech enhancement method combining the Kullback-Leibler divergence-based non-negative matrix factorization (NMF) and hidden Markov model (NMF-HMM). With the application of HMM, the temporal dynamics information of speech signals can be taken into account. In the training stage, the sum of Poisson, leading to the KL divergence measure, is used as the observation model for each state of HMM. This ensures that a computationally efficient multiplicative update can be used for the parameter update of the proposed model. In the online enhancement stage, we propose a novel minimum mean-square error (MMSE) estimator for the proposed NMF-HMM. This estimator can be implemented using parallel computing, reducing the processing time. The performance of the proposed algorithm is verified by objective measures. The experimental results show that the proposed strategy achieves better speech enhancement performance than state-of-the-art speech enhancement methods. More specifically, compared with the traditional NMF-based speech enhancement methods, our proposed algorithm achieves a 5% improvement in short-time objective intelligibility (STOI) and a 0.18 improvement in perceptual evaluation of speech quality (PESQ).
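The multiplicative update mentioned in the abstract above can be illustrated with the classic Lee-Seung updates for plain KL-divergence NMF. This is only a sketch of that standard building block, not the paper's NMF-HMM training procedure. The helper names (`matmul`, `kl_divergence`, `kl_nmf`) and the nested-list matrix representation are assumptions made for this example.

```python
import math
import random

EPS = 1e-12  # guards against division by zero


def matmul(a, b):
    """Product of two matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def kl_divergence(v, wh):
    """Generalized KL divergence D(V || WH) for non-negative matrices."""
    total = 0.0
    for v_row, wh_row in zip(v, wh):
        for x, y in zip(v_row, wh_row):
            y = max(y, EPS)
            total += (x * math.log(x / y) if x > 0 else 0.0) - x + y
    return total


def ratio_matrix(v, w, h):
    """Element-wise V / (W H), the quantity shared by both updates."""
    wh = matmul(w, h)
    return [[v[i][j] / max(wh[i][j], EPS) for j in range(len(v[0]))]
            for i in range(len(v))]


def kl_nmf(v, rank, iterations=100, seed=0):
    """Factor V ~ W H with the standard KL multiplicative updates."""
    rng = random.Random(seed)
    n_rows, n_cols = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n_rows)]
    h = [[rng.random() + 0.1 for _ in range(n_cols)] for _ in range(rank)]

    for _ in range(iterations):
        # H <- H * (W^T (V / WH)) / (W^T 1)
        ratio = ratio_matrix(v, w, h)
        for k in range(rank):
            col_sum = max(sum(w[i][k] for i in range(n_rows)), EPS)
            for j in range(n_cols):
                num = sum(w[i][k] * ratio[i][j] for i in range(n_rows))
                h[k][j] *= num / col_sum

        # W <- W * ((V / WH) H^T) / (1 H^T)
        ratio = ratio_matrix(v, w, h)
        for k in range(rank):
            row_sum = max(sum(h[k][j] for j in range(n_cols)), EPS)
            for i in range(n_rows):
                num = sum(ratio[i][j] * h[k][j] for j in range(n_cols))
                w[i][k] *= num / row_sum
    return w, h


if __name__ == "__main__":
    v = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0]]  # exactly rank 1
    w, h = kl_nmf(v, rank=1, iterations=10)
    print(kl_divergence(v, matmul(w, h)))
```

Each update only multiplies the current factors by non-negative ratios, so non-negativity is preserved without any projection step, which is what makes this family of updates cheap to compute.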

Preprint, 2020, arXiv

Privacy-Preserving Distributed Processing: Metrics, Bounds, and Algorithms

Privacy-preserving distributed processing has recently attracted considerable attention. It aims to design solutions for conducting signal processing tasks over networks in a decentralized fashion without violating privacy. Many algorithms can be adopted to solve this problem, such as differential privacy, secure multiparty computation, and the recently proposed distributed optimization-based subspace perturbation. However, how these algorithms relate to each other has not yet been fully explored. In this paper, we therefore first propose information-theoretic metrics based on mutual information. Using the proposed metrics, we are able to compare and relate a number of existing well-known algorithms. We then derive a lower bound on individual privacy that gives insight into the nature of the problem. To validate the above claims, we investigate a concrete example and compare a number of state-of-the-art approaches in terms of different aspects such as output utility, individual privacy, and algorithm robustness against the number of corrupted parties, using not only theoretical analysis but also numerical validation. Finally, we discuss and provide principles for designing appropriate algorithms for different applications.

Preprint, 2020, arXiv

Signal-Adaptive and Perceptually Optimized Sound Zones with Variable Span Trade-Off Filters

Creating sound zones has been an active research field since the idea was first proposed. So far, most sound zone control methods rely on either an optimization of physical metrics such as acoustic contrast and signal distortion or a mode decomposition of the desired sound field. By using these types of methods, approximately 15 dB of acoustic contrast between the reproduced sound field in the target zone and its leakage to other zone(s) has been reported in practical set-ups, but this is typically not high enough to satisfy the people inside the zones. In this paper, we propose a sound zone control method shaping the leakage errors so that they are as inaudible as possible for a given acoustic contrast. The shaping of the leakage errors is performed by taking the time-varying input signal characteristics and the human auditory system into account when the loudspeaker control filters are calculated. We show how this shaping can be performed using variable span trade-off filters, and we show theoretically how these filters can be used for trading signal distortion in the target zone for acoustic contrast. The proposed method is evaluated based on physical metrics such as acoustic contrast and perceptual metrics such as STOI. The computational complexity and processing time of the proposed method for different system set-ups are also investigated. Lastly, the results of a MUSHRA listening test are reported. The test results show that the proposed method provides more than 20% perceptual improvement compared to existing sound zone control methods.
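The acoustic contrast figure quoted in the abstract above is usually understood as the ratio, in dB, of the mean squared sound pressure in the target (bright) zone to that in the other (dark) zone. A minimal sketch of that common definition follows. The function name `acoustic_contrast_db` and the use of per-microphone pressure lists are assumptions for this example, not details from the paper.

```python
import math


def acoustic_contrast_db(bright_pressures, dark_pressures):
    """Acoustic contrast in dB between a bright zone and a dark zone.

    Each argument is a non-empty sequence of (real or complex) sound
    pressures sampled at the control points of one zone. The interface
    is an illustrative assumption.
    """
    if not bright_pressures or not dark_pressures:
        raise ValueError("both zones need at least one control point")

    # Mean squared pressure magnitude in each zone.
    bright = sum(abs(p) ** 2 for p in bright_pressures) / len(bright_pressures)
    dark = sum(abs(p) ** 2 for p in dark_pressures) / len(dark_pressures)

    if dark == 0:
        return math.inf
    if bright == 0:
        return -math.inf
    return 10.0 * math.log10(bright / dark)


if __name__ == "__main__":
    # Leakage amplitude chosen so the contrast is about 15 dB.
    leak = 10 ** (-15 / 20)
    print(round(acoustic_contrast_db([1.0, 1.0], [leak, leak]), 2))
```

A contrast of about 15 dB corresponds to the leaked pressure amplitude being roughly 0.18 times the amplitude in the target zone.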