Researcher profile

Jiri Malek

Jiri Malek contributes to research discovery and scholarly infrastructure.

Researcher | Affiliation not imported | Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging
6 works
0 followers
5 topics
4 close collaborators

Published work

6 published items

preprint | 2022 | arXiv

Target Speech Extraction: Independent Vector Extraction Guided by Supervised Speaker Identification

This manuscript proposes a novel robust procedure for the extraction of a speaker of interest (SOI) from a mixture of audio sources. The estimation of the SOI is performed via independent vector extraction (IVE). Since the blind IVE cannot distinguish the target source by itself, it is guided towards the SOI via frame-wise speaker identification based on deep learning. Still, an incorrect speaker can be extracted due to guidance failings, especially when processing challenging data. To identify such cases, we propose a criterion for non-intrusively assessing the estimated speaker. It utilizes the same model as the speaker identification, so no additional training is required. When incorrect extraction is detected, we propose a "deflation" step in which the incorrect source is subtracted from the mixture and, subsequently, another attempt to extract the SOI is performed. The process is repeated until successful extraction is achieved. The proposed procedure is experimentally tested on artificial and real-world datasets containing challenging phenomena: source movements, reverberation, transient noise, or microphone failures. The method is compared with state-of-the-art blind algorithms as well as with current fully supervised deep learning-based methods.
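The retry logic described in this abstract (extract, assess, deflate, try again) can be illustrated with a minimal control-flow sketch. It is not the authors' implementation: the function name `extract_with_deflation`, the three callables, and the `max_attempts` cap are all hypothetical, and the toy stand-ins below replace the actual IVE extraction, speaker-identification criterion, and source subtraction, which the abstract does not specify in detail.

```python
from typing import Callable, Optional, TypeVar

M = TypeVar("M")  # mixture representation (hypothetical)
S = TypeVar("S")  # extracted-source representation (hypothetical)


def extract_with_deflation(
    mixture: M,
    extract: Callable[[M], S],
    is_target: Callable[[S], bool],
    subtract: Callable[[M, S], M],
    max_attempts: int = 5,  # assumption: the abstract states no attempt limit
) -> Optional[S]:
    """Retry extraction, removing each wrongly extracted source from the mixture."""
    for _ in range(max_attempts):
        estimate = extract(mixture)
        if is_target(estimate):
            return estimate
        # "Deflation": subtract the incorrect source, then try again.
        mixture = subtract(mixture, estimate)
    return None


# Toy stand-ins: the "mixture" is a list of labels, not audio.
toy_mixture = ["interferer_a", "interferer_b", "soi"]
result = extract_with_deflation(
    toy_mixture,
    extract=lambda m: m[0],
    is_target=lambda s: s == "soi",
    subtract=lambda m, s: [x for x in m if x != s],
)
print(result)  # soi
```

The cap on attempts is a design choice of this sketch; it keeps the loop from running forever if the target is never identified.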

preprint | 2019 | arXiv

Block-Online Multi-Channel Speech Enhancement Using DNN-Supported Relative Transfer Function Estimates

This work addresses the problem of block-online processing for multi-channel speech enhancement. Such processing is vital in scenarios with moving speakers and/or when very short utterances are processed, e.g., in voice assistant scenarios. We consider several variants of a system that performs beamforming supported by DNN-based voice activity detection (VAD) followed by post-filtering. The speaker is targeted through estimating relative transfer functions between microphones. Each block of the input signals is processed independently in order to make the method applicable in highly dynamic environments. Owing to the short length of the processed block, the statistics required by the beamformer are estimated less precisely. The influence of this inaccuracy is studied and compared to the processing regime when recordings are treated as one block (batch processing). The experimental evaluation of the proposed method is performed on the large CHiME-4 datasets and on another dataset featuring a moving target speaker. The experiments are evaluated in terms of objective and perceptual criteria (such as signal-to-interference ratio (SIR) or perceptual evaluation of speech quality (PESQ), respectively). Moreover, the word error rate (WER) achieved by a baseline automatic speech recognition system is evaluated, for which the enhancement method serves as a front-end solution. The results indicate that the proposed method is robust with respect to the short length of the processed block. Significant improvements in terms of the criteria and WER are observed even for a block length of 250 ms.

preprint | 2015 | arXiv

Spatial Source Subtraction Based on Incomplete Measurements of Relative Transfer Function

Relative impulse responses between microphones are usually long and dense due to the reverberant acoustic environment. Estimating them from short and noisy recordings poses a long-standing challenge of audio signal processing. In this paper we apply a novel strategy based on ideas of Compressed Sensing. The relative transfer function (RTF) corresponding to the relative impulse response can often be estimated accurately from noisy data but only for certain frequencies. This means that often only an incomplete measurement of the RTF is available. A complete RTF estimate can be obtained through finding its sparsest representation in the time-domain: that is, through computing the sparsest among the corresponding relative impulse responses. Based on this approach, we propose to estimate the RTF from noisy data in three steps. First, the RTF is estimated using any conventional method such as the non-stationarity-based estimator by Gannot et al. or through Blind Source Separation. Second, frequencies are determined for which the RTF estimate appears to be accurate. Third, the RTF is reconstructed through solving a weighted $\ell_1$ convex program, which we propose to solve via a computationally efficient variant of the SpaRSA (Sparse Reconstruction by Separable Approximation) algorithm. An extensive experimental study with real-world recordings has been conducted. It has been shown that the proposed method is capable of improving many conventional estimators used as the first step in most situations.

preprint | 2012 | arXiv

Determining the Short-Range Spin Correlations in Cuprate Chain Materials with Resonant Inelastic X-ray Scattering

We report a high-resolution resonant inelastic soft x-ray scattering study of the quantum magnetic spin-chain materials Li2CuO2 and CuGeO3. By tuning the incoming photon energy to the oxygen K-edge, a strong excitation around 3.5 eV energy loss is clearly resolved for both materials. Comparing the experimental data to many-body calculations, we identify this excitation as a Zhang-Rice singlet exciton on neighboring CuO4-plaquettes. We demonstrate that the strong temperature dependence of the inelastic scattering related to this high-energy exciton makes it possible to probe short-range spin correlations on the 1 meV scale with outstanding sensitivity.

preprint | 2012 | arXiv

The dual nature of As-vacancies in LaFeAsO-derived superconductors: magnetic moment formation while preserving superconductivity

As-vacancies (V_As) in La-1111-systems, which are nominally non-magnetic defects, are shown to create in their vicinity by symmetry ferromagnetically oriented local magnetic moments due to the strong, covalent bonds with neighboring Fe atoms that they break. From microscopic theory in terms of an appropriately modified Anderson-Wolff model, we find that the moment formation results in a substantially enhanced paramagnetic susceptibility in both the normal and superconducting (SC) state. Although the V_As act as magnetic scatterers, they do not degrade SC properties, which can even be improved by V_As by suppressing a competing or coexisting commensurate spin density wave or its remnant fluctuations. Due to the induced local magnetic moments an s_++-scenario in related systems is unlikely.

preprint | 2012 | arXiv

The strength of frustration and quantum fluctuations in LiVCuO4

For the 1D-frustrated ferromagnetic J_1-J_2 model with interchain coupling added, we analyze the dynamical and static structure factor S(k,omega), the pitch angle phi of the magnetic structure, the magnetization curve of edge-shared chain cuprates, and focus on LiCuVO4 for which neither a perturbed spinon nor a spin wave approach can be applied. phi is found to be most sensitive to the interplay of frustration and quantum fluctuations. For LiVCuO4 the obtained exchange parameters J are in accord with the results for a realistic 5-band extended Hubbard model and LSDA + U predictions yielding alpha=J_2/|J_1| about 0.75 in contrast to 5.5 > alpha > 1.42 suggested in the literature. The alpha-regimes of the empirical phi-values in NaCu2O2 and linarite are considered, too.