Source author record

Changchun Bao

Changchun Bao appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, unclaimed source record

eess.AS Sound

Catalog footprint

What is connected

4 works

2 topics

4 close collaborators


Preprint, 2022, arXiv

A Neural Vocoder Based Packet Loss Concealment Algorithm

Packet loss seriously degrades the quality of service in Voice over IP (VoIP) scenarios. In this paper, we investigate online receiver-based packet loss concealment, which is more portable and more widely applicable. To preserve speech naturalness, rather than directly processing time-domain waveforms or separately reconstructing amplitudes and phases in the frequency domain, a flow-based neural vocoder is adopted to generate the substitute waveform for a lost packet from a Mel-spectrogram, which is predicted from the preceding content by a well-designed neural predictor. Furthermore, a waveform similarity-based smoothing post-process is introduced to mitigate speech discontinuities and avoid artifacts. The experimental results show the outstanding performance of the proposed method.
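The abstract does not describe the smoothing step in detail. As a rough illustration only, the sketch below shows one common way a waveform similarity-based join can work: search for the shift of the generated waveform that best matches the tail of the received audio, then cross-fade over the overlap. The function names (`best_offset`, `smooth_join`), the normalized cross-correlation measure and the linear fade are all assumptions, not the authors' method.

```python
import math

def best_offset(prev_tail, candidate, max_shift):
    """Return the shift into `candidate` whose leading segment best matches
    `prev_tail` under normalized cross-correlation (assumed similarity measure)."""
    n = len(prev_tail)
    best, best_score = 0, -float("inf")
    for shift in range(max_shift + 1):
        seg = candidate[shift:shift + n]
        if len(seg) < n:
            break  # not enough samples left to compare
        num = sum(a * b for a, b in zip(prev_tail, seg))
        den = math.sqrt(sum(a * a for a in prev_tail) * sum(b * b for b in seg))
        score = num / den if den > 0 else 0.0
        if score > best_score:
            best, best_score = shift, score
    return best

def smooth_join(prev, candidate, overlap, max_shift):
    """Align `candidate` to the last `overlap` samples of `prev`, then cross-fade."""
    tail = prev[-overlap:]
    aligned = candidate[best_offset(tail, candidate, max_shift):]
    faded = []
    for i in range(overlap):
        w = (i + 1) / (overlap + 1)  # linear fade-in weight for the substitute
        faded.append((1.0 - w) * tail[i] + w * aligned[i])
    return prev[:-overlap] + faded + aligned[overlap:]
```

Aligning before cross-fading keeps the two waveforms in phase over the overlap, which is what avoids the cancellation artifacts a blind cross-fade can produce.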

Preprint, 2022, arXiv

An Effective Dereverberation Algorithm by Fusing MVDR and MCLP

Reverberation degrades the experience of human-machine interaction, and many dereverberation methods have emerged to address this problem. In existing dereverberation methods based on multichannel linear prediction (MCLP), updating the parameters of the Kalman filter remains challenging, in particular the accurate estimation of the power spectral density (PSD) of the target speech. In this paper, a minimum variance distortionless response (MVDR) beamformer and MCLP are fused for dereverberation, and the PSD of the target speech used by the Kalman filter in the MCLP is modified. To construct the MVDR beamformer, the PSD of the late reverberation and the PSD of the noise are estimated simultaneously by a blocking-based PSD estimator. The PSD of the target speech used by the Kalman filter is then obtained by subtracting the PSD of the late reverberation and the PSD of the noise from the PSD of the observed signal. Compared with the reference methods, the proposed method shows outstanding performance.
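The subtraction described above can be sketched per frequency bin as follows. This is a minimal illustration, not the paper's estimator: the function name `target_psd` is hypothetical, and the small positive floor is an added assumption that keeps the result usable as a variance when the estimates over-subtract.

```python
def target_psd(observed, late_reverb, noise, floor=1e-10):
    """Per-bin target-speech PSD: observed minus late reverberation minus noise.

    `floor` is an assumed safeguard (not stated in the abstract) that prevents
    negative or zero PSD values caused by estimation errors.
    """
    if not (len(observed) == len(late_reverb) == len(noise)):
        raise ValueError("all PSD vectors must have the same number of bins")
    return [max(o - r - n, floor) for o, r, n in zip(observed, late_reverb, noise)]

# Three bins; the last one over-subtracts and is clamped to the floor.
psd = target_psd([1.0, 0.5, 0.2], [0.25, 0.25, 0.25], [0.25, 0.125, 0.125])
```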

Preprint, 2022, arXiv

Embedding Recurrent Layers with Dual-Path Strategy in a Variant of Convolutional Network for Speaker-Independent Speech Separation

Speaker-independent speech separation has achieved remarkable performance in recent years with the development of deep neural networks (DNNs). Various network architectures, from the traditional convolutional neural network (CNN) and recurrent neural network (RNN) to the more advanced transformer, have been elaborately designed to improve separation performance. However, state-of-the-art models usually suffer from several computation-related flaws, such as large model size, high memory consumption and high computational complexity. To balance performance against computational efficiency and to further explore the modeling ability of traditional network structures, we combine RNNs with a newly proposed variant of the convolutional network to tackle the speech separation problem. By embedding two RNNs into the basic block of this variant with the help of a dual-path strategy, the proposed network can effectively learn both local information and global dependencies. In addition, a four-stage structure enables the separation procedure to be performed gradually at finer and finer scales as the feature dimension increases. Experimental results on various datasets demonstrate the effectiveness of the proposed method and show that a good trade-off between separation performance and computational efficiency is achieved.
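To make the dual-path idea concrete, the sketch below splits a sequence into chunks, runs one pass within each chunk (local information) and a second pass across chunks at the same position (global dependency). It is only a toy under stated assumptions: chunks are non-overlapping and zero-padded, the names `segment` and `dual_path` are hypothetical, and a cumulative sum stands in for the two RNNs that the paper embeds in its convolutional block.

```python
def segment(seq, chunk_len):
    """Split `seq` into non-overlapping chunks, zero-padding the last one."""
    padded = list(seq) + [0.0] * (-len(seq) % chunk_len)
    return [padded[i:i + chunk_len] for i in range(0, len(padded), chunk_len)]

def dual_path(seq, chunk_len, intra_fn, inter_fn):
    """Apply `intra_fn` within each chunk, then `inter_fn` across chunks."""
    chunks = segment(seq, chunk_len)
    # Intra-chunk path: each chunk is modeled on its own (local information).
    chunks = [intra_fn(c) for c in chunks]
    # Inter-chunk path: the same position is modeled across all chunks
    # (global dependency), so the pass runs over the transposed layout.
    columns = [inter_fn(list(col)) for col in zip(*chunks)]
    chunks = [list(row) for row in zip(*columns)]
    flat = [x for c in chunks for x in c]
    return flat[:len(seq)]  # drop the padding again

def cumulative_sum(xs):
    """Stand-in for a unidirectional recurrent layer (assumption)."""
    out, total = [], 0
    for x in xs:
        total += x
        out.append(total)
    return out

result = dual_path([1, 2, 3, 4, 5, 6], 3, cumulative_sum, cumulative_sum)
```

Because each pass only sees a sequence of roughly the chunk length or the chunk count, neither recurrent pass has to span the full input, which is the usual motivation for the dual-path layout.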

Preprint, 2022, arXiv

Multi-source wideband DOA estimation method by frequency focusing and error weighting

In this paper, a new multi-source wideband direction of arrival (MSW-DOA) estimation method is proposed for signals with non-uniform distribution, using sub-arrays of a uniform linear array. Unlike conventional methods, based on the free far-field model, the proposed method makes two main contributions. First, sub-array decomposition is adopted to improve the accuracy of MSW-DOA estimation by minimizing the weighted error. Second, the frequency focusing procedure is optimized according to the presence probability of sound sources, reducing the influence of sub-bands with a low signal-to-noise ratio (SNR). Simulation results show that the proposed method effectively improves the performance of wideband DOA estimation when multiple sound sources are present.
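The abstract mentions a free far-field model and a uniform linear array. As background only, the sketch below computes the standard far-field steering vector for such an array at one frequency; it does not implement the paper's sub-array decomposition or frequency focusing. The function name `steering_vector`, the sign convention, the angle measured from broadside and the default sound speed of 343 m/s are assumptions.

```python
import cmath
import math

def steering_vector(num_sensors, spacing_m, freq_hz, doa_deg, speed=343.0):
    """Far-field steering vector of a uniform linear array at one frequency.

    The angle is measured from broadside, so sensor m sees the plane wave
    delayed by m * spacing * sin(theta) / speed relative to sensor 0.
    """
    delay = spacing_m * math.sin(math.radians(doa_deg)) / speed
    return [cmath.exp(-2j * math.pi * freq_hz * m * delay)
            for m in range(num_sensors)]

# Half-wavelength spacing at 1 kHz: an endfire source (90 degrees) gives a
# phase step of about -pi between neighbouring sensors.
half_wavelength = 343.0 / (2 * 1000.0)
endfire = steering_vector(4, half_wavelength, 1000.0, 90.0)
```

The phase step depends on frequency, which is why wideband methods need some way, such as frequency focusing, to combine sub-bands coherently.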
