Researcher profile

Jing Shi

Jing Shi contributes to research discovery and scholarly infrastructure.

Researcher · Affiliation not imported · Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging · Verification L1 · Unclaimed author
13 works
0 followers
7 topics
4 close collaborators

Published work

13 published item(s)

preprint · 2022 · arXiv

Discretization and Re-synthesis: an alternative method to solve the Cocktail Party Problem

Deep learning based models have significantly improved the performance of speech separation with input mixtures like the cocktail party. Prominent methods (e.g., frequency-domain and time-domain speech separation) usually build regression models to predict the ground-truth speech from the mixture, using the masking-based design and the signal-level loss criterion (e.g., MSE or SI-SNR). This study demonstrates, for the first time, that the synthesis-based approach can also perform well on this problem, with great flexibility and strong potential. Specifically, we propose a novel speech separation/enhancement model based on the recognition of discrete symbols, and convert the paradigm of the speech separation/enhancement related tasks from regression to classification. By utilizing the synthesis model with the input of discrete symbols, after the prediction of discrete symbol sequence, each target speech could be re-synthesized. Evaluation results based on the WSJ0-2mix and VCTK-noisy corpora in various settings show that our proposed method can steadily synthesize the separated speech with high speech quality and without any interference, which is difficult to avoid in regression-based methods. In addition, with negligible loss of listening quality, the speaker conversion of enhanced/separated speech could be easily realized through our method.
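For readers unfamiliar with the signal-level criterion mentioned in this abstract, the following is a minimal sketch of the commonly used SI-SNR definition. It is not taken from the paper: the function name `si_snr`, the `eps` stabilizer, and the pure-Python list representation are assumptions made for illustration.

```python
import math
from typing import Sequence


def si_snr(estimate: Sequence[float], reference: Sequence[float],
           eps: float = 1e-12) -> float:
    """Scale-invariant signal-to-noise ratio in dB (higher is better).

    Illustrative helper, not the authors' code.
    """
    if len(estimate) != len(reference) or not reference:
        raise ValueError("signals must be non-empty and of equal length")
    n = len(reference)
    # Remove the mean so a constant offset does not affect the score.
    est_mean = sum(estimate) / n
    ref_mean = sum(reference) / n
    est = [x - est_mean for x in estimate]
    ref = [x - ref_mean for x in reference]
    # Project the estimate onto the reference to get the target component.
    dot = sum(e * r for e, r in zip(est, ref))
    ref_energy = sum(r * r for r in ref)
    scale = dot / (ref_energy + eps)
    target = [scale * r for r in ref]
    # Whatever is left after the projection counts as noise.
    noise = [e - t for e, t in zip(est, target)]
    target_energy = sum(t * t for t in target)
    noise_energy = sum(v * v for v in noise)
    return 10.0 * math.log10((target_energy + eps) / (noise_energy + eps))


reference = [1.0, -1.0, 1.0, -1.0]
estimate = [1.1, -0.9, 0.9, -1.1]
print(round(si_snr(estimate, reference), 3))
```

In this toy case the error is orthogonal to the reference and carries 1% of its energy, so the score is about 20 dB, and rescaling the estimate leaves it unchanged. A regression-based separator would typically minimize the negative of this quantity.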

preprint · 2021 · arXiv

Role of Magnon-Magnon Scattering in Magnon Polaron Spin Seebeck Effect

The spin Seebeck effect (SSE) signal of magnon polarons in bulk-Y3Fe5O12 (YIG)/Pt heterostructures is found to drastically change as a function of temperature. It appears as a dip in the total SSE signal at low temperatures, but as the temperature increases, the dip gradually decreases before turning to a peak. We attribute the observed dip-to-peak transition to the rapid rise of the four-magnon scattering rate. Our analysis provides important insights into the microscopic origin of the hybridized excitations and the overall temperature dependence of the SSE anomalies.

preprint · 2021 · arXiv

Training Noisy Single-Channel Speech Separation With Noisy Oracle Sources: A Large Gap and A Small Step

As the performance of single-channel speech separation systems has improved, there has been a desire to move to more challenging conditions than the clean, near-field speech that initial systems were developed on. When training deep learning separation models, a need for ground truth leads to training on synthetic mixtures. As such, training in noisy conditions requires either using noise synthetically added to clean speech, preventing the use of in-domain data for a noisy-condition task, or training using mixtures of noisy speech, requiring the network to additionally separate the noise. We demonstrate the relative inseparability of noise and that this noisy speech paradigm leads to significant degradation of system performance. We also propose an SI-SDR-inspired training objective that tries to exploit the inseparability of noise to implicitly partition the signal and discount noise separation errors, enabling the training of better separation systems with noisy oracle sources.

preprint · 2021 · arXiv

Wide Field Imaging of van der Waals Ferromagnet Fe3GeTe2 by Spin Defects in Hexagonal Boron Nitride

Emergent color centers with accessible spins hosted by van der Waals materials have attracted substantial interest in recent years due to their significant potential for implementing transformative quantum sensing technologies. Hexagonal boron nitride (hBN) is naturally relevant in this context due to its remarkable ease of integration into devices consisting of low-dimensional materials. Taking advantage of boron vacancy spin defects in hBN, we report nanoscale quantum imaging of low-dimensional ferromagnetism sustained in Fe3GeTe2/hBN van der Waals heterostructures. Exploiting spin relaxometry methods, we have further observed spatially varying magnetic fluctuations in the exfoliated Fe3GeTe2 flake, whose magnitude reaches a peak value around the Curie temperature. Our results demonstrate the capability of spin defects in hBN to investigate local magnetic properties of layered materials in an accessible and precise way, which can be extended readily to a broad range of miniaturized van der Waals heterostructure systems.

preprint · 2020 · arXiv

Current-induced CrI3 surface spin-flop transition probed by proximity magnetoresistance in Pt

By exploiting proximity coupling, we probe the spin state of the surface layers of CrI3, a van der Waals magnetic semiconductor, by measuring the induced magnetoresistance (MR) of Pt in Pt/CrI3 nano-devices. We fabricate the devices with clean and stable interfaces by placing a freshly exfoliated CrI3 flake atop a pre-patterned thin Pt strip and encapsulating the Pt/CrI3 heterostructure with hexagonal boron nitride (hBN) in a protected environment. In devices spanning a wide range of CrI3 thicknesses (30 to 150 nm), we observe that an abrupt upward jump in Pt MR emerges at a 2 T magnetic field applied perpendicularly to the layers when the current density exceeds 2.5 × 10^10 A/m^2, followed by a gradual decrease over a range of 5 T. These distinct MR features suggest a spin-flop transition, which reveals strong antiferromagnetic interlayer coupling in the surface layers of CrI3. We study the current dependence by holding the Pt/CrI3 sample at approximately the same temperature to exclude the Joule heating effect, and find that the MR jump increases with the current density, indicating a spin current origin. This spin current effect provides a new route to control spin configurations in insulating antiferromagnets, which is potentially useful for spintronic applications.

preprint · 2020 · arXiv

Neural Speaker Diarization with Speaker-Wise Chain Rule

Speaker diarization is an essential step for processing multi-speaker audio. Although an end-to-end neural diarization (EEND) method achieved state-of-the-art performance, it is limited to a fixed number of speakers. In this paper, we solve this fixed-number-of-speakers issue with a novel speaker-wise conditional inference method based on the probabilistic chain rule. In the proposed method, each speaker's speech activity is regarded as a single random variable and is estimated sequentially, conditioned on the previously estimated speech activities of the other speakers. Similar to other sequence-to-sequence models, the proposed method produces a variable number of speakers with a stop sequence condition. We evaluated the proposed method on multi-speaker audio recordings with a variable number of speakers. Experimental results show that the proposed method can correctly produce diarization results with a variable number of speakers and outperforms the state-of-the-art end-to-end speaker diarization methods in terms of diarization error rate.
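The sequential decoding with a stop condition described in this abstract can be sketched roughly as the loop below. This is only an illustration of the control flow, not the paper's model: `decode_speakers`, `toy_step`, the `max_speakers` cap, and the bitmask encoding of frames are all hypothetical, and `toy_step` stands in for a trained network that would condition on the content of the earlier outputs rather than just their count.

```python
from typing import Callable, List, Optional, Sequence

Activity = List[int]  # one 0/1 speech-activity flag per frame


def decode_speakers(
    frames: Sequence[int],
    step: Callable[[Sequence[int], List[Activity]], Optional[Activity]],
    max_speakers: int = 10,
) -> List[Activity]:
    """Decode one speaker at a time, conditioning on earlier outputs."""
    decoded: List[Activity] = []
    for _ in range(max_speakers):
        # Each step sees the input and every activity decoded so far.
        activity = step(frames, decoded)
        # Stop condition: no further speaker, or an all-silent activity.
        if activity is None or not any(activity):
            break
        decoded.append(activity)
    return decoded


def toy_step(frames: Sequence[int], decoded: List[Activity]) -> Activity:
    """Hypothetical stand-in for a network.

    Each frame is a bitmask whose bit k says whether speaker k is active.
    The number of already decoded speakers selects which bit to read.
    """
    k = len(decoded)
    return [(frame >> k) & 1 for frame in frames]


print(decode_speakers([1, 3, 2, 0], toy_step))
```

With the four toy frames above, the loop returns two activity sequences and then stops on the first all-silent one, so the number of speakers is not fixed in advance.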

preprint · 2020 · arXiv

Sequence to Multi-Sequence Learning via Conditional Chain Mapping for Mixture Signals

Neural sequence-to-sequence models are well established for applications which can be cast as mapping a single input sequence into a single output sequence. In this work, we focus on one-to-many sequence transduction problems, such as extracting multiple sequential sources from a mixture sequence. We extend the standard sequence-to-sequence model to a conditional multi-sequence model, which explicitly models the relevance between multiple output sequences with the probabilistic chain rule. Based on this extension, our model can conditionally infer output sequences one-by-one by making use of both input and previously-estimated contextual output sequences. This model additionally has a simple and efficient stop criterion for the end of the transduction, making it able to infer the variable number of output sequences. We take speech data as a primary test field to evaluate our methods since the observed speech data is often composed of multiple sources due to the nature of the superposition principle of sound waves. Experiments on several different tasks including speech separation and multi-speaker speech recognition show that our conditional multi-sequence models lead to consistent improvements over the conventional non-conditional models.
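The chain-rule factorization this abstract refers to can be written out as follows. The symbols are assumptions introduced here for illustration: $O$ denotes the input mixture sequence and $s_1, \dots, s_N$ the output sequences. The abstract does not give the notation, so this is the generic probabilistic chain rule, not necessarily the paper's exact formulation.

```latex
% Conditional (chain-rule) model: each output depends on the input
% and on all previously estimated outputs.
p(s_1, \dots, s_N \mid O) = \prod_{i=1}^{N} p(s_i \mid O, s_1, \dots, s_{i-1})

% A non-conditional baseline, read here as treating the outputs as
% conditionally independent given the input.
p(s_1, \dots, s_N \mid O) \approx \prod_{i=1}^{N} p(s_i \mid O)
```

The first line is an exact identity. The second is an approximation that drops the dependence between outputs, which is presumably the gap the conditional model is meant to close.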

preprint · 2020 · arXiv

Speaker-Conditional Chain Model for Speech Separation and Extraction

Speech separation has been extensively explored to tackle the cocktail party problem. However, these studies are still far from having enough generalization capabilities for real scenarios. In this work, we propose a common strategy named Speaker-Conditional Chain Model to process complex speech recordings. In the proposed method, our model first infers the identities of variable numbers of speakers from the observation based on a sequence-to-sequence model. Then, it takes the information from the inferred speakers as conditions to extract their speech sources. With the predicted speaker information from the whole observation, our model helps solve the problems of conventional speech separation and speaker extraction for multi-round long recordings. Experiments on standard fully-overlapped speech separation benchmarks show results comparable with prior studies, while our proposed model adapts better to multi-round long recordings.

preprint · 2020 · arXiv

Spin Seebeck Effect near the Antiferromagnetic Spin-Flop Transition

We develop a low-temperature, long-wavelength theory for the interfacial spin Seebeck effect (SSE) in easy-axis antiferromagnets. The field-induced spin-flop (SF) transition of Néel order is associated with a qualitative change in SSE behavior: Below SF, there are two spin carriers with opposite magnetic moments, with the carriers polarized along the field forming a majority magnon band. Above SF, the low-energy, ferromagnetic-like mode has magnetic moment opposite the field. This results in a sign change of the SSE across SF, which agrees with recent measurements on Cr$_2$O$_3$/Pt and Cr$_2$O$_3$/Ta devices [Li et al., Nature 578, 70 (2020)]. In our theory, SSE is due to a Néel spin current below SF and a magnetic spin current above SF. Using the ratio of the associated Néel to magnetic spin-mixing conductances as a single constant fitting parameter, we reproduce the field dependence of the experimental data and partially the temperature dependence of the relative SSE jump across SF.

preprint · 2020 · arXiv

Unveiling valley lifetimes of free charge carriers in monolayer WSe$_2$

We report on nanosecond long, gate-dependent valley lifetimes of free charge carriers in monolayer WSe$_2$, unambiguously identified by the combination of time-resolved Kerr rotation and electrical transport measurements. While the valley polarization increases when tuning the Fermi level into the conduction or valence band, there is a strong decrease of the respective valley lifetime consistent with both electron-phonon and spin-orbit scattering. The longest lifetimes are seen for spin-polarized bound excitons in the band gap region. We explain our findings via two distinct, Fermi level-dependent scattering channels of optically excited, valley polarized bright trions either via dark or bound states. By electrostatic gating we demonstrate that the transition metal dichalcogenide WSe$_2$ can be tuned to be either an ideal host for long-lived localized spin states or allow for nanosecond valley lifetimes of free charge carriers (> 10 ns).

preprint · 2020 · arXiv

Valley lifetimes of conduction band electrons in monolayer WSe$_2$

One of the main tasks in the investigation of 2-dimensional transition metal dichalcogenides is the determination of valley lifetimes. In this work, we combine time-resolved Kerr rotation with electrical transport measurements to explore the gate-dependent valley lifetimes of free conduction band electrons of monolayer WSe$_2$. When tuning the Fermi energy into the conduction band we observe a strong decrease of the respective valley lifetimes which is consistent with both spin-orbit and electron-phonon scattering. We explain the formation of a valley polarization by the scattering of optically excited valley polarized bright trions into dark states by intervalley scattering. Furthermore, we show that the conventional time-resolved Kerr rotation measurement scheme has to be modified to account for photo-induced gate screening effects. Disregarding this adaptation can lead to erroneous conclusions drawn from gate-dependent optical measurements and can completely mask the true gate-dependent valley dynamics.

preprint · 2019 · arXiv

Fe$_{5-x}$Ge$_2$Te$_2$ -- A new exfoliable itinerant ferromagnet with high Curie temperature and large perpendicular magnetic anisotropy

Layered van der Waals (vdW) crystals with intrinsic magnetic properties such as high Curie temperature (TC) and large perpendicular magnetic anisotropy (PMA) are key to the development and application of spintronic devices. The ferromagnetic vdW metal Fe$_{3-x}$GeTe$_2$ (FGT) has gained prominence recently due to its high TC (220 K) and strong PMA. Here, we introduce a new metallic vdW ferromagnet, Fe$_{5-x}$Ge$_2$Te$_2$ or FG2T, which was successfully synthesized and fully characterized. FG2T is a metal that orders ferromagnetically with a very sharp transition at 250 K (bulk and single crystal thin flakes) and shows large PMA, as found by both experimental and computational studies. This work enables novel heterostructure devices with near room temperature capabilities by using FG2T as a spin injector.