Source author record

Yutian Wang

Yutian Wang appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, unclaimed source record

cond-mat.mtrl-sci eess.AS Sound Artificial Intelligence Computer Vision math.OC Multimedia physics.optics

Catalog footprint

What is connected

16 works

8 topics

4 close collaborators

preprint, 2022, arXiv

Atomic-scale Deformation Process of Glasses Unveiled by Stress-induced Structural Anisotropy

Experimentally resolving atomic-scale structural changes of a deformed glass remains challenging owing to the disordered nature of glass structure. Here, we show that structural anisotropy emerges as a general hallmark for different types of glasses (metallic glasses, oxide glass, amorphous selenium, and polymer glass) after thermo-mechanical deformation, and that it correlates strongly with local nonaffine atomic displacements detected by the high-energy X-ray diffraction technique. By analyzing the anisotropic pair density function, we unveil the atomic-level mechanism responsible for the plastic flow, which notably differs between metallic glasses and covalent glasses. The structural rearrangements in metallic glasses are mediated through cutting and formation of atomic bonds, which occurs in some localized inelastic regions embedded in an elastic matrix, whereas those of covalent glasses are mediated through the rotation of atomic bonds or chains without bond length change, which occurs in a less localized manner.

preprint, 2022, arXiv

DBT-Net: Dual-branch federative magnitude and phase estimation with attention-in-attention transformer for monaural speech enhancement

The decoupling-style concept is gaining traction in the speech enhancement area: it decouples the original complex spectrum estimation task into multiple easier sub-tasks (i.e., magnitude-only recovery and residual complex spectrum estimation), resulting in better performance and easier interpretability. In this paper, we propose a dual-branch federative magnitude and phase estimation framework, dubbed DBT-Net, for monaural speech enhancement, aiming at recovering the coarse- and fine-grained regions of the overall spectrum in parallel. From the complementary perspective, the magnitude estimation branch is designed to filter out dominant noise components in the magnitude domain, while the complex spectrum purification branch is elaborately designed to inpaint the missing spectral details and implicitly estimate the phase information in the complex-valued spectral domain. To facilitate the information flow between the branches, interaction modules are introduced to leverage features learned from one branch, so as to suppress the undesired parts and recover the missing components of the other branch. Instead of adopting the conventional RNNs and temporal convolutional networks for sequence modeling, we employ a novel attention-in-attention transformer-based network within each branch for better feature learning. More specifically, it is composed of several adaptive spectro-temporal attention transformer-based modules and an adaptive hierarchical attention module, aiming to capture long-term time-frequency dependencies and further aggregate intermediate hierarchical contextual information. Comprehensive evaluations on the WSJ0-SI84 + DNS-Challenge and VoiceBank + DEMAND datasets demonstrate that the proposed approach consistently outperforms previous advanced systems and yields state-of-the-art performance in terms of speech quality and intelligibility.

preprint, 2022, arXiv

Deep BSDE-ML Learning and Its Application to Model-Free Optimal Control

A modified Deep BSDE (backward stochastic differential equation) learning method with measurability loss, called the Deep BSDE-ML method, is introduced in this paper to solve a kind of linear decoupled forward-backward stochastic differential equations (FBSDEs), which is encountered in the policy evaluation step of learning the optimal feedback policies of a class of stochastic control problems. The measurability loss is characterized via the measurability of the BSDE's state at the forward initial time, which differs from the loss related to the terminal state in the known Deep BSDE method. Though the minima of the two loss functions are shown to be equal, this measurability loss is proved to be equal to the expected mean squared error between the true diffusion term of the BSDE and its approximation. This crucial observation extends the application of the Deep BSDE method -- approximating the gradients of the solution of a partial differential equation (PDE) instead of the solution itself. Simultaneously, a learning-based framework is introduced to search for an optimal feedback control of a deterministic nonlinear system. Specifically, by introducing Gaussian exploration noise, we aim to learn a robust optimal controller under this stochastic case. This reformulation sacrifices the optimality to some extent, but as suggested in reinforcement learning (RL), exploration noise is essential to enable model-free learning.

preprint, 2022, arXiv

Dual-branch Attention-In-Attention Transformer for single-channel speech enhancement

Curriculum learning begins to thrive in the speech enhancement area, which decouples the original spectrum estimation task into multiple easier sub-tasks to achieve better performance. Motivated by that, we propose a dual-branch attention-in-attention transformer dubbed DB-AIAT to handle both coarse- and fine-grained regions of the spectrum in parallel. From a complementary perspective, a magnitude masking branch is proposed to coarsely estimate the overall magnitude spectrum, and simultaneously a complex refining branch is elaborately designed to compensate for the missing spectral details and implicitly derive phase information. Within each branch, we propose a novel attention-in-attention transformer-based module to replace the conventional RNNs and temporal convolutional networks for temporal sequence modeling. Specifically, the proposed attention-in-attention transformer consists of adaptive temporal-frequency attention transformer blocks and an adaptive hierarchical attention module, aiming to capture long-term temporal-frequency dependencies and further aggregate global hierarchical contextual information. Experimental results on Voice Bank + DEMAND demonstrate that DB-AIAT yields state-of-the-art performance (e.g., 3.31 PESQ, 95.6% STOI and 10.79dB SSNR) over previous advanced systems with a relatively small model size (2.81M).
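
As a rough illustration of the dual-branch idea described above, the sketch below combines a magnitude mask (coarse estimate that keeps the noisy phase) with a complex residual (fine correction that can change both magnitude and phase). The abstract does not state how the two branch outputs are merged; the additive fusion, the function name `fuse_branches`, and the toy inputs are all assumptions made for illustration, not the authors' implementation.

```python
import cmath

def fuse_branches(noisy_spec, mag_mask, residual):
    """Combine a magnitude mask and a complex residual per time-frequency bin.

    noisy_spec: complex noisy STFT bins (hypothetical toy input)
    mag_mask:   non-negative real gains, standing in for the masking branch
    residual:   complex corrections, standing in for the refining branch
    """
    fused = []
    for x, m, r in zip(noisy_spec, mag_mask, residual):
        # Coarse estimate: scale the magnitude, keep the noisy phase.
        coarse = cmath.rect(m * abs(x), cmath.phase(x))
        # Fine estimate: the complex residual adjusts magnitude and phase.
        # Additive fusion is an assumption, not taken from the abstract.
        fused.append(coarse + r)
    return fused

noisy = [1 + 1j, -2 + 0j, 0.5j]
mask = [0.5, 1.0, 0.0]
resid = [0j, 0.1 - 0.2j, 0.3 + 0j]
enhanced = fuse_branches(noisy, mask, resid)
```

With a zero residual the output keeps the noisy phase exactly, which is why a mask-only system cannot correct phase errors; the residual term is what allows the phase to move.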

preprint, 2022, arXiv

Joint magnitude estimation and phase recovery using Cycle-in-Cycle GAN for non-parallel speech enhancement

Owing to the lack of adequate paired noisy-clean speech corpora in many real scenarios, non-parallel training is a promising task for DNN-based speech enhancement methods. However, because of the severe mismatch between input and target speech, many previous studies only focus on magnitude spectrum estimation and leave the phase unaltered, resulting in degraded speech quality under low signal-to-noise ratio conditions. To tackle this problem, we decouple the difficult target w.r.t. original spectrum optimization into spectral magnitude and phase, and a novel Cycle-in-Cycle generative adversarial network (dubbed CinCGAN) is proposed to jointly estimate the spectral magnitude and phase information stage by stage under unpaired data. In the first stage, we pretrain a magnitude CycleGAN to coarsely estimate the spectral magnitude of clean speech. In the second stage, we incorporate the pretrained CycleGAN with a complex-valued CycleGAN as a cycle-in-cycle structure to simultaneously recover phase information and refine the overall spectrum. Experimental results demonstrate that the proposed approach significantly outperforms previous baselines under non-parallel training. The evaluation on training the models with standard paired data also shows that CinCGAN achieves remarkable performance, especially in reducing background noise and speech distortion.

preprint, 2022, arXiv

Optimizing Shoulder to Shoulder: A Coordinated Sub-Band Fusion Model for Real-Time Full-Band Speech Enhancement

Due to the high computational complexity of modeling more frequency bands, it is still intractable to conduct real-time full-band speech enhancement based on deep neural networks. Recent studies typically utilize compressed, perceptually motivated features with relatively low frequency resolution to filter the full-band spectrum by one-stage networks, leading to limited speech quality improvements. In this paper, we propose a coordinated sub-band fusion network for full-band speech enhancement, which aims to recover the low- (0-8 kHz), middle- (8-16 kHz), and high-band (16-24 kHz) in a step-wise manner. Specifically, a dual-stream network is first pretrained to recover the low-band complex spectrum, and another two sub-networks are designed as the middle- and high-band noise suppressors in the magnitude-only domain. To fully capitalize on the information intercommunication, we employ a sub-band interaction module to provide external knowledge guidance across different frequency bands. Extensive experiments show that the proposed method yields consistent performance advantages over state-of-the-art full-band baselines.
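
To make the three-band partition above concrete, the following sketch maps STFT bin indices to the low (0-8 kHz), middle (8-16 kHz), and high (16-24 kHz) bands. The 48 kHz sample rate is inferred from the 24 kHz upper edge; the FFT size of 960 and the function name `split_bands` are hypothetical choices for illustration and are not taken from the paper.

```python
def split_bands(n_fft=960, sample_rate=48000, edges_hz=(8000, 16000)):
    """Return the STFT bin ranges for the low, middle, and high bands.

    n_fft and sample_rate are assumed values; only the band edges
    (8 kHz and 16 kHz) come from the abstract.
    """
    hz_per_bin = sample_rate / n_fft      # frequency spacing between bins
    n_bins = n_fft // 2 + 1               # one-sided spectrum, DC..Nyquist
    low_end = int(edges_hz[0] / hz_per_bin)
    mid_end = int(edges_hz[1] / hz_per_bin)
    return {
        "low": range(0, low_end),         # bins below 8 kHz
        "middle": range(low_end, mid_end),  # bins from 8 kHz up to 16 kHz
        "high": range(mid_end, n_bins),   # bins from 16 kHz to Nyquist
    }

bands = split_bands()
```

Each band can then be handed to its own sub-network, so no single model has to cover every bin of the full-band spectrum.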

preprint, 2022, arXiv

Unsupervised Quantized Prosody Representation for Controllable Speech Synthesis

In this paper, we propose a novel prosody disentanglement method for prosodic Text-to-Speech (TTS) models, which introduces the vector quantization (VQ) method to the auxiliary prosody encoder to obtain the decomposed prosody representations in an unsupervised manner. Relying on its advantages, the speaking styles, such as pitch, speaking velocity, local pitch variance, etc., are decomposed automatically into the latent quantized vectors. We also investigate the internal mechanism of the VQ disentanglement process by means of a latent variables counter and find that higher value dimensions usually represent prosody information. Experiments show that our model can control the speaking styles of synthesis results by directly manipulating the latent variables. The objective and subjective evaluations show that our model outperforms popular models.
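
The core vector quantization step mentioned above can be sketched as a nearest-neighbour lookup: a continuous encoder output is replaced by the closest entry of a codebook. This is the generic VQ operation only; the codebook values, the function name `quantize`, and the two-dimensional toy vectors are assumptions for illustration and say nothing about the paper's actual encoder or training procedure.

```python
def quantize(vector, codebook):
    """Map a continuous vector to its nearest codebook entry.

    Returns (index, code). Squared Euclidean distance is used,
    which is the usual choice for VQ but an assumption here.
    """
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    index = min(range(len(codebook)),
                key=lambda i: sq_dist(vector, codebook[i]))
    return index, codebook[index]

# Hypothetical codebook with three 2-D entries.
codebook = [[0.0, 0.0], [1.0, 1.0], [-1.0, 2.0]]
index, code = quantize([0.9, 1.2], codebook)
```

Because the output is a discrete index, swapping that index for another codebook entry at synthesis time is one plausible way to manipulate a latent variable directly.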

preprint, 2020, arXiv

Soliton Distillation in Fiber Lasers

Pure solitons are, for the first time, distilled from the resonant continuous wave (CW) background in a fiber laser by utilizing the nonlinear Fourier transform (NFT). It is identified that the soliton and the resonant CW background have different eigenvalue distributions in the nonlinear frequency domain. Similar to water distillation, we propose the approach of soliton distillation: applying the NFT to a steady pulse generated from a fiber laser, then filtering out the eigenvalues of the resonant CW background in the nonlinear frequency domain, and finally recovering the soliton by the inverse NFT (INFT). Simulation results verify that the soliton can be distinguished from the resonant CW background in the nonlinear frequency domain and that pure solitons can be obtained by INFT.

preprint, 2016, arXiv

Precise tuning of the Curie temperature of (Ga,Mn)As-based magnetic semiconductors by hole compensation: Support for valence-band ferromagnetism

For the prototype diluted ferromagnetic semiconductor (Ga,Mn)As, there is a fundamental concern about the electronic states near the Fermi level, i.e., whether the Fermi level resides in a well-separated impurity band derived from Mn doping (impurity-band model) or in the valence band that is already merged with the Mn-derived impurity band (valence-band model). We investigate this question by carefully shifting the Fermi level by means of carrier compensation. We use helium-ion implantation, a standard industry technology, to precisely compensate the hole doping of GaAs-based diluted ferromagnetic semiconductors while keeping the Mn concentration constant. We monitor the change of Curie temperature ($T_C$) and conductivity. For a broad range of samples including (Ga,Mn)As and (Ga,Mn)(As,P) with various Mn and P concentrations, we observe a smooth decrease of $T_C$ with carrier compensation over a wide temperature range while the conduction is changed from metallic to insulating. The existence of $T_C$ below 10 K is also confirmed in heavily compensated samples. Our experimental results are naturally explained within the valence-band picture.

preprint, 2015, arXiv

Defect-induced magnetism in graphite through neutron irradiation

We have investigated the variation in the magnetization of highly ordered pyrolytic graphite (HOPG) after neutron irradiation, which introduces defects in the bulk sample and consequently gives rise to a large magnetic signal. We observe strong paramagnetism in HOPG, increasing with the neutron fluence. We correlate the induced paramagnetism with structural defects by comparison with density-functional theory calculations. In addition to the in-plane vacancies, the trans-planar defects also contribute to the magnetization. The lack of any magnetic order between the local moments is possibly due to the absence of hydrogen/nitrogen chemisorption, or because magnetic order cannot be established at all in the bulk form.

preprint, 2015, arXiv

Defect-induced magnetism in SiC: Interplay between ferromagnetism and paramagnetism

Defect-induced ferromagnetism has triggered a lot of investigations and controversies. The major issue is that the induced ferromagnetic signal is so weak that it can sufficiently be accounted for by trace contamination. To resolve this issue, we studied the variation of the magnetic properties of SiC after neutron irradiation with fluence covering four orders of magnitude. A large paramagnetic component has been induced and scales up with defect concentration, which can be well accounted for by uncoupled divacancies. However, the ferromagnetic contribution is still weak and only appears in the low fluence range of neutrons or after annealing treatments. First-principles calculations hint towards a mutually exclusive role of the concentration of defects: Defects favor spin polarization at the expense of magnetic interaction. Combining both experimental and first-principles calculation results, the defect-induced ferromagnetism can be understood as a local effect which cannot be scaled up with the volume. Therefore, our investigation answers the long-standing question why the defect-induced ferromagnetic signal is weak.

preprint, 2014, arXiv

Disentangling defect-induced ferromagnetism in SiC

We present a detailed investigation of the magnetic properties in SiC single crystals bombarded with neon ions. Through careful measuring of the magnetization of virgin and irradiated SiC, we decompose the magnetization of SiC into paramagnetic, superparamagnetic, and ferromagnetic contributions. The ferromagnetic contribution persists well above room temperature and exhibits a pronounced magnetic anisotropy. We qualitatively explain the magnetic properties as a result of the intrinsic clustering tendency of defects.

preprint, 2014, arXiv

Structural and magnetic properties of irradiated SiC

We present a comprehensive structural characterization of ferromagnetic SiC single crystals induced by Ne ion irradiation. The ferromagnetism has been confirmed by electron spin resonance, and possible transition metal impurities can be excluded as the origin of the observed ferromagnetism. Using X-ray diffraction and Rutherford backscattering/channeling spectroscopy, we estimate the damage to the crystallinity of SiC, which mutually influences the ferromagnetism in SiC.

preprint, 2012, arXiv

A United Image Force for Deformable Models and Direct Transforming Geometric Active Contorus to Snakes by Level Sets

A uniform distribution of the image force field around the object speeds up the convergence of the segmentation process. However, achieving this uniformity leaves the force constructed from the heat diffusion model unable to indicate the object boundaries accurately. The image force based on the electrostatic field model can perform an exact shape recovery. First, this study introduces a fusion scheme of these two image forces, which is capable of extracting the object boundary with high precision and fast speed. Until now, there has been no satisfactory analysis of the relationship between Snakes and Geometric Active Contours (GAC). The second contribution of this study shows that the GAC model can be deduced directly from the Snakes model. It proves that the terms in GAC and Snakes correspond to each other and have similar functions. However, the two models are expressed using different mathematics. Further, because it loses the ability to rotate the contour, the adoption of level sets can limit the usage of GAC in some circumstances.

preprint, 2012, arXiv

Ferromagnetic InMnAs on InAs Prepared by Ion Implantation and Pulsed Laser Annealing

Ferromagnetic InMnAs has been prepared by Mn ion implantation and pulsed laser annealing. The InMnAs layer reveals a saturated magnetization of 2.6 $\mu_B$/Mn at 5 K and a perpendicular magnetic anisotropy. The Curie temperature is determined to be 46 K, which is higher than those in previous reports with similar Mn concentrations. Ferromagnetism is further evidenced by the large magnetic circular dichroism.

preprint, 2012, arXiv

Magnetic Mn5Ge3 nanocrystals embedded in crystalline Ge: a magnet/semiconductor hybrid synthesized by ion implantation

The integration of ferromagnetic Mn5Ge3 with the Ge matrix is promising for spin injection in a silicon-compatible geometry. In this paper, we report the preparation of magnetic Mn5Ge3 nanocrystals embedded inside the Ge matrix by Mn ion implantation at elevated temperature. By X-ray diffraction and transmission electron microscopy, we observe crystalline Mn5Ge3 with variable size depending on the Mn ion fluence. The electronic structure of Mn in Mn5Ge3 nanocrystals is a 3d6 configuration, the same as in bulk Mn5Ge3. A large positive magnetoresistance has been observed at low temperatures. It can be explained by the conductivity inhomogeneity in the magnetic/semiconductor hybrid system.
