Source author record

Yi Luo

Yi Luo appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Catalog footprint

What is connected

47 works

26 topics

4 close collaborators

preprint 2026 arXiv

DataClawBench: An Agent Benchmark for Exploratory Real-World Financial Data Analysis

Autonomous data analysis agents are increasingly expected to conduct exploratory analysis over underexplored data environments. This burden is especially salient in complex financial analytics, where relevant evidence is rarely pre-specified. However, existing benchmarks typically evaluate such agents in prior-guided settings, providing selected data sources, explicit data schemas, or cleaned data, thereby understating the exploratory burden. We introduce DataClawBench, a benchmark for exploratory real-world financial data analysis under limited prior guidance. DataClawBench contains approximately 2.06 million real-world records across enterprise, industry, and policy domains, with native data noise preserved. It further includes 492 cross-domain tasks derived from think-tank consulting scenarios, each annotated with intermediate milestones that diagnose exploration and reasoning failures beyond outcome accuracy. A systematic evaluation of eight advanced LLMs under the OpenClaw agent reveals that exploratory data analysis breaks agent reliability: more exploration does not reliably translate into task-relevant progress or correct final answers.

preprint 2022 arXiv

A Time-domain Real-valued Generalized Wiener Filter for Multi-channel Neural Separation Systems

Frequency-domain beamformers have been successful in a wide range of multi-channel neural separation systems in the past years. However, the operations in conventional frequency-domain beamformers are typically independently defined and complex-valued, which results in two drawbacks: the former does not fully utilize the advantage of end-to-end optimization, and the latter may introduce numerical instability during the training phase. Motivated by the recent success of end-to-end neural separation systems, in this paper we propose the time-domain real-valued generalized Wiener filter (TD-GWF), a linear filter defined on a 2-D learnable real-valued signal transform. TD-GWF splits the transformed representation into groups and performs a minimum mean-square error (MMSE) estimation on all available channels on each of the groups. We show how TD-GWF can be connected to conventional filter-and-sum beamformers when a certain signal transform and number of groups are specified. Moreover, given the recent success of sequential neural beamforming frameworks, we show how TD-GWF can be applied in such frameworks to perform iterative beamforming and separation to obtain an overall performance gain. Comprehensive experimental results show that TD-GWF performs consistently better than conventional frequency-domain beamformers in the sequential neural beamforming pipeline with various neural network architectures, microphone array scenarios, and task configurations.

preprint 2022 arXiv

An Information-theoretical Secured Byzantine-fault Tolerance Consensus in Quantum Key Distribution Network

Quantum key distribution (QKD) networks are expected to provide information-theoretically secure (ITS) communication over long distances. QKD networks based on the trusted-relay architecture are now the most widely used scheme in practice. However, it is unrealistic to assume that all relays are fully trustworthy in complex networks. In the past, only a few studies have theoretically analyzed passive eavesdropping attacks by dishonest relays and the corresponding defense methods. However, we have found that active attacks by dishonest relays can be more threatening. Considering both passive and active attacks, we treat dishonest relays as Byzantine nodes and analyze the upper limit on the number of Byzantine nodes that the QKD network can accommodate. In this paper, we propose an ITS Byzantine-fault tolerance (BFT) QKD network scheme to achieve end-to-end key distribution based on point-to-point QKD links. To ensure consistency and provide BFT ability in the QKD network, we design an ITSBFT-consensus protocol for this network scheme. To ensure the information-theoretic security of consensus, we design a temporary signature scheme based on point-to-point QKD link keys. To prevent Byzantine nodes from disrupting the execution process of key distribution, we design an end-to-end key distribution scheme combined with consensus. We theoretically analyze the proposed ITSBFT-QKD network scheme from four aspects: QKD key distribution security, temporary signature security, consensus security, and leader election fairness. The simulation results demonstrate the feasibility and performance of the scheme.

preprint 2022 arXiv

Analysis of Diffractive Neural Networks for Seeing Through Random Diffusers

Imaging through diffusive media is a challenging problem, where the existing solutions heavily rely on digital computers to reconstruct distorted images. We provide a detailed analysis of a computer-free, all-optical imaging method for seeing through random, unknown phase diffusers using diffractive neural networks, covering different deep learning-based training strategies. By analyzing various diffractive networks designed to image through random diffusers with different correlation lengths, a trade-off between the image reconstruction fidelity and distortion reduction capability of the diffractive network was observed. During its training, random diffusers with a range of correlation lengths were used to improve the diffractive network's generalization performance. Increasing the number of random diffusers used in each epoch reduced the overfitting of the diffractive network's imaging performance to known diffusers. We also demonstrated that the use of additional diffractive layers improved the generalization capability to see through new, random diffusers. Finally, we introduced deliberate misalignments in training to 'vaccinate' the network against random layer-to-layer shifts that might arise due to the imperfect assembly of the diffractive networks. These analyses provide a comprehensive guide in designing diffractive networks to see through random diffusers, which might profoundly impact many fields, such as biomedical imaging, atmospheric physics, and autonomous driving.

preprint 2022 arXiv

FRA-RIR: Fast Random Approximation of the Image-source Method

The training of modern speech processing systems often requires a large amount of simulated room impulse response (RIR) data in order to allow the systems to generalize well in real-world, reverberant environments. However, simulating realistic RIR data typically requires accurate physical modeling, and the acceleration of such simulation process typically requires certain computational platforms such as a graphics processing unit (GPU). In this paper, we propose FRA-RIR, a fast random approximation method of the widely-used image-source method (ISM), to efficiently generate realistic RIR data without specific computational devices. FRA-RIR replaces the physical simulation in the standard ISM by a series of random approximations, which significantly speeds up the simulation process and enables its application in on-the-fly data generation pipelines. Experiments show that FRA-RIR can not only be significantly faster than other existing ISM-based RIR simulation tools on standard computational platforms, but also improves the performance of speech denoising systems evaluated on real-world RIR when trained with simulated RIR. A Python implementation of FRA-RIR is available online at https://github.com/yluo42/FRA-RIR.

preprint 2022 arXiv

Improving Choral Music Separation through Expressive Synthesized Data from Sampled Instruments

Choral music separation refers to the task of extracting tracks of voice parts (e.g., soprano, alto, tenor, and bass) from mixed audio. The lack of datasets has impeded research on this topic as previous work has only been able to train and evaluate models on a few minutes of choral music data due to copyright issues and dataset collection difficulties. In this paper, we investigate the use of synthesized training data for the source separation task on real choral music. We make three contributions: first, we provide an automated pipeline for synthesizing choral music data from sampled instrument plugins within controllable options for instrument expressiveness. This produces an 8.2-hour-long choral music dataset from the JSB Chorales Dataset and one can easily synthesize additional data. Second, we conduct an experiment to evaluate multiple separation models on available choral music separation datasets from previous work. To the best of our knowledge, this is the first experiment to comprehensively evaluate choral music separation. Third, experiments demonstrate that the synthesized choral data is of sufficient quality to improve the model's performance on real choral music datasets. This provides additional experimental statistics and data support for the choral music separation study.

preprint 2022 arXiv

Massively Parallel Universal Linear Transformations using a Wavelength-Multiplexed Diffractive Optical Network

We report deep learning-based design of a massively parallel broadband diffractive neural network for all-optically performing a large group of arbitrarily-selected, complex-valued linear transformations between an input and output field-of-view, each with N_i and N_o pixels, respectively. This broadband diffractive processor is composed of N_w wavelength channels, each of which is uniquely assigned to a distinct target transformation. A large set of arbitrarily-selected linear transformations can be individually performed through the same diffractive network at different illumination wavelengths, either simultaneously or sequentially (wavelength scanning). We demonstrate that such a broadband diffractive network, regardless of its material dispersion, can successfully approximate N_w unique complex-valued linear transforms with a negligible error when the number of diffractive neurons (N) in its design matches or exceeds 2 x N_w x N_i x N_o. We further report that the spectral multiplexing capability (N_w) can be increased by increasing N; our numerical analyses confirm these conclusions for N_w > 180, which can be further increased to e.g., ~2000 depending on the upper bound of the approximation error. Massively parallel, wavelength-multiplexed diffractive networks will be useful for designing high-throughput intelligent machine vision systems and hyperspectral processors that can perform statistical inference and analyze objects/scenes with unique spectral properties.
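
The neuron-count condition quoted in the abstract above (N matching or exceeding 2 x N_w x N_i x N_o) is simple arithmetic, and a short sketch can make its scaling concrete. The function name and the example pixel counts below are hypothetical choices for illustration only; they are not taken from the paper.

```python
def min_diffractive_neurons(n_w: int, n_i: int, n_o: int) -> int:
    """Smallest N that satisfies N >= 2 * N_w * N_i * N_o (condition from the abstract)."""
    if min(n_w, n_i, n_o) < 1:
        raise ValueError("all counts must be positive integers")
    return 2 * n_w * n_i * n_o


# Hypothetical sizes: 180 wavelength channels, 5x5-pixel input and output fields of view.
required = min_diffractive_neurons(n_w=180, n_i=25, n_o=25)
print(f"minimum number of diffractive neurons: {required}")
```

Because the bound is linear in N_w, doubling the number of wavelength channels doubles the required neuron count when the input and output sizes are fixed.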

preprint 2022 arXiv

Node Representation Learning in Graph via Node-to-Neighbourhood Mutual Information Maximization

The key towards learning informative node representations in graphs lies in how to gain contextual information from the neighbourhood. In this work, we present a simple-yet-effective self-supervised node representation learning strategy via directly maximizing the mutual information between the hidden representations of nodes and their neighbourhood, which can be theoretically justified by its link to graph smoothing. Following InfoNCE, our framework is optimized via a surrogate contrastive loss, where the positive selection underpins the quality and efficiency of representation learning. To this end, we propose a topology-aware positive sampling strategy, which samples positives from the neighbourhood by considering the structural dependencies between nodes and thus enables positive selection upfront. In the extreme case when only one positive is sampled, we fully avoid expensive neighbourhood aggregation. Our methods achieve promising performance on various node classification datasets. It is also worth mentioning that, by applying our loss function to MLP-based node encoders, our methods can be orders of magnitude faster than existing solutions. Our code and supplementary materials are available at https://github.com/dongwei156/n2n.
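
The abstract above mentions an InfoNCE-style surrogate contrastive loss with a single sampled positive. As a rough illustration, the sketch below computes a generic InfoNCE loss for one anchor node in plain Python. It is not the authors' implementation: the dot-product similarity, the temperature value, and the toy vectors are all assumptions made for this example.

```python
import math


def dot(u, v):
    """Dot-product similarity between two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def info_nce(anchor, positive, negatives, temperature=0.5):
    """Generic InfoNCE loss: -log softmax score of the positive among all candidates."""
    logits = [dot(anchor, positive) / temperature]
    logits += [dot(anchor, neg) / temperature for neg in negatives]
    # Log-sum-exp with max subtraction for numerical stability.
    peak = max(logits)
    log_denominator = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return log_denominator - logits[0]


# Toy 2-D embeddings (hypothetical values).
node = [1.0, 0.0]
neighbour = [1.0, 0.0]          # positive sampled from the neighbourhood
others = [[0.0, 1.0], [-1.0, 0.0]]  # negatives
print(info_nce(node, neighbour, others))
```

The loss is small when the anchor is most similar to its sampled positive and grows when a negative scores higher, which is the behaviour a contrastive objective of this family relies on.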

preprint 2022 arXiv

On the Use of Deep Mask Estimation Module for Neural Source Separation Systems

Most of the recent neural source separation systems rely on a masking-based pipeline where a set of multiplicative masks are estimated from and applied to a signal representation of the input mixture. The estimation of such masks, in almost all network architectures, is done by a single layer followed by an optional nonlinear activation function. However, recent literature has investigated the use of a deep mask estimation module and observed performance improvement compared to a shallow mask estimation module. In this paper, we analyze the role of such a deeper mask estimation module by connecting it to a recently proposed unsupervised source separation method, and empirically show that the deep mask estimation module is an efficient approximation of the so-called overseparation-grouping paradigm with the conventional shallow mask estimation layers.

preprint 2022 arXiv

To image, or not to image: Class-specific diffractive cameras with all-optical erasure of undesired objects

Privacy protection is a growing concern in the digital era, with machine vision techniques widely used throughout public and private settings. Existing methods address this growing problem by, e.g., encrypting camera images or obscuring/blurring the imaged information through digital algorithms. Here, we demonstrate a camera design that performs class-specific imaging of target objects with instantaneous all-optical erasure of other classes of objects. This diffractive camera consists of transmissive surfaces structured using deep learning to perform selective imaging of target classes of objects positioned at its input field-of-view. After their fabrication, the thin diffractive layers collectively perform optical mode filtering to accurately form images of the objects that belong to a target data class or group of classes, while instantaneously erasing objects of the other data classes at the output field-of-view. Using the same framework, we also demonstrate the design of class-specific permutation cameras, where the objects of a target data class are pixel-wise permuted for all-optical class-specific encryption, while the other objects are irreversibly erased from the output image. The success of class-specific diffractive cameras was experimentally demonstrated using terahertz (THz) waves and 3D-printed diffractive layers that selectively imaged only one class of the MNIST handwritten digit dataset, all-optically erasing the other handwritten digits. This diffractive camera design can be scaled to different parts of the electromagnetic spectrum, including, e.g., the visible and infrared wavelengths, to provide transformative opportunities for privacy-preserving digital cameras and task-specific data-efficient imaging.

preprint 2021 arXiv

Atomic-Scale Probing of Heterointerface Phonon Bridges in Nitride Semiconductor

Interface phonon modes that are generated by several atomic layers at the heterointerface play a major role in the interface thermal conductance for nanoscale high-power devices such as nitride-based high-electron-mobility transistors and light emitting diodes. Here we measure the local phonon spectra across AlN/Si and AlN/Al interfaces using atomically resolved vibrational electron energy-loss spectroscopy in a scanning transmission electron microscope. At the AlN/Si interface, we observe various localized phonon modes, of which the extended and interfacial modes act as bridges to connect the bulk AlN modes and bulk Si modes, and are expected to boost the inelastic phonon transport and thus substantially contribute to the interface thermal conductance. In comparison, no such phonon bridge is observed at the AlN/Al interface, for which partially extended modes dominate the interface thermal conductivity. This work provides valuable insights into understanding the interfacial thermal transport in nitride semiconductors and useful guidance for thermal management via interface engineering.

preprint 2021 arXiv

Cascadable all-optical NAND gates using diffractive networks

Owing to its potential advantages such as scalability, low latency and power efficiency, optical computing has seen rapid advances over the last decades. A core unit of a potential all-optical processor would be the NAND gate, which can be cascaded to perform an arbitrary logical operation. Here, we present the design and analysis of cascadable all-optical NAND gates using diffractive neural networks. We encoded the logical values at the input and output planes of a diffractive NAND gate using the relative optical power of two spatially-separated apertures. Based on this architecture, we numerically optimized the design of a diffractive neural network composed of 4 passive layers to all-optically perform NAND operation using the diffraction of light, and cascaded these diffractive NAND gates to perform complex logical functions by successively feeding the output of one diffractive NAND gate into another. We demonstrated the cascadability of our diffractive NAND gates by using identical diffractive designs to all-optically perform AND and OR operations, as well as a half-adder. Cascadable all-optical NAND gates composed of spatially-engineered passive diffractive layers can serve as a core component of various optical computing platforms.
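
The cascading idea in the abstract above rests on NAND being functionally complete. The sketch below shows, in ordinary Boolean logic, how identical NAND gates can be wired into AND, OR, and a half-adder. It models only the logic, not the optics: the aperture-power encoding and the diffractive layers are not simulated, and all function names are illustrative.

```python
def nand(a: bool, b: bool) -> bool:
    """The single primitive gate; every other gate below is built from it."""
    return not (a and b)


def and_gate(a: bool, b: bool) -> bool:
    x = nand(a, b)
    return nand(x, x)  # NAND followed by a NAND-based inverter


def or_gate(a: bool, b: bool) -> bool:
    return nand(nand(a, a), nand(b, b))  # invert both inputs, then NAND


def xor_gate(a: bool, b: bool) -> bool:
    x = nand(a, b)
    return nand(nand(a, x), nand(b, x))  # standard four-NAND XOR


def half_adder(a: bool, b: bool) -> tuple:
    """Return (sum, carry) using only NAND-derived gates."""
    return xor_gate(a, b), and_gate(a, b)


for a in (False, True):
    for b in (False, True):
        print(a, b, half_adder(a, b))
```

Each composite gate feeds the output of one NAND into the input of another, which mirrors the successive feeding of gate outputs described in the abstract.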

preprint 2021 arXiv

Characterization of exhaled e-cigarette aerosols in a vape shop using a field-portable holographic on-chip microscope

The past decade marked a drastic increase in the usage of electronic cigarettes (e-cigs). The adverse health impact of secondhand exposure due to exhaled e-cig particles has raised significant concerns, demanding further research on the characteristics of these particles. In this work, we report direct volatility measurements on exhaled e-cig aerosols using a field-portable device (termed c-Air) enabled by deep learning and lens-free holographic microscopy; for this analysis, we performed a series of field experiments in a vape shop where customers used/vaped their e-cig products. During four days of experiments, we periodically sampled the indoor air with intervals of ~15 minutes and collected the exhaled particles with c-Air. Time-lapse inline holograms of the collected particles were recorded by c-Air and reconstructed using a convolutional neural network yielding phase-recovered microscopic images of the particles. Volumetric decay of individual particles due to evaporation was used as an indicator of the volatility of each aerosol. Volatility dynamics quantified through c-Air experiments showed that indoor vaping increased the volatility of particles as well as the percentage of volatile and semi-volatile particles in air. The reported methodology and findings can guide further studies on volatility characterization of e-cig emission and regulations on indoor vaping.

preprint 2021 arXiv

Computational Imaging Without a Computer: Seeing Through Random Diffusers at the Speed of Light

Imaging through diffusers presents a challenging problem with various digital image reconstruction solutions demonstrated to date using computers. We present a computer-free, all-optical image reconstruction method to see through random diffusers at the speed of light. Using deep learning, a set of diffractive surfaces are designed/trained to all-optically reconstruct images of objects that are covered by random phase diffusers. We experimentally demonstrated this concept using coherent THz illumination and all-optically reconstructed objects distorted by unknown, random diffusers, never used during training. Unlike digital methods, all-optical diffractive reconstructions do not require power except for the illumination light. This diffractive solution to see through diffusers can be extended to other wavelengths, and might fuel various applications in biomedical imaging, astronomy, atmospheric sciences, oceanography, security, robotics, among others.

preprint 2021 arXiv

Dual-Path Modeling for Long Recording Speech Separation in Meetings

Continuous speech separation (CSS) is the task of separating the speech sources from a long, partially overlapped recording that involves a varying number of speakers. A straightforward extension of conventional utterance-level speech separation to the CSS task is to segment the long recording with a fixed-size window and process each window separately. Though effective, this extension fails to model the long dependency in speech and thus leads to sub-optimum performance. The recently proposed dual-path modeling could be a remedy to this problem, thanks to its capability in jointly modeling the cross-window dependency and the local-window processing. In this work, we further extend the dual-path modeling framework for the CSS task. A transformer-based dual-path system is proposed, which integrates transform layers for global modeling. The proposed models are applied to LibriCSS, a real recorded multi-talk dataset, and consistent WER reduction can be observed in the ASR evaluation for separated speech. Also, a dual-path transformer equipped with convolutional layers is proposed. It significantly reduces the computation amount by 30% with better WER evaluation. Furthermore, the online processing dual-path models are investigated, which show a 10% relative WER reduction compared to the baseline.

preprint 2021 arXiv

High-current CNT films grown directly on commercially available 2.5D substrates for low-voltage field-emission electron sources

Carbon nanotube (CNT) based electronic devices are promising for beyond-silicon solid-state electronics and vacuum micro-nano-electronics. Despite rapid progress in CNT field-effect transistor related solid-state electronics, the development of CNT-based vacuum nanoelectronic devices is substantially hindered by the longstanding challenge of achieving high-current field-emission (FE) electron sources at low operating voltage. In addition to CNTs' properties, FE characteristics are also affected by substrate morphology and interface state. This work demonstrates high-current FE characteristics at relatively low operating voltage by using CNT films grown directly on commercially available 2.5D substrates with matched feature size and improved interface contact. Simulation results indicate that commercially available 2.5D substrates, including nickel foam (NiF) and carbon cloth (CC), with appropriate feature size would dramatically help to enhance emission current at a relatively lower voltage. A modified fabrication process results in improved contact between CNTs and the underlying 2.5D substrates. A twenty-fold higher emission current density with a halved turn-on electric field, achieved by CNTs grown directly on randomly picked NiF, shows the potential of 2.5D substrates with good contact for improving FE characteristics. Finally, a high emission current (6 mA) with an approximately 75 percent decrease in turn-on electric field was realized by matching the feature size of the 2.5D substrate with that of the CNTs, bringing us significantly closer to reliable high-current and low-voltage FE electron sources for practical applications.

preprint 2021 arXiv

Low half-wave-voltage, ultra-high bandwidth thin-film LiNbO3 modulator based on hybrid waveguide and periodic capacitively loaded electrodes

A novel thin-film LiNbO3 (TFLN) electro-optic modulator is proposed and demonstrated. A LiNbO3-silica hybrid waveguide is adopted to maintain low optical loss for an electrode spacing as narrow as 3 μm, resulting in a record low half-wave-voltage length product of only 1.7 V·cm. Capacitively loaded traveling-wave electrodes (CL-TWEs) are employed to reduce the microwave loss, while a quartz substrate is used in place of a silicon substrate to achieve velocity matching. The fabricated TFLN modulator with a 5-mm-long modulation region exhibits a half-wave-voltage of 3.4 V and merely 1.3 dB roll-off in electro-optic response up to 67 GHz, and a 3-dB modulation bandwidth over 110 GHz is predicted.
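
The two figures quoted in the abstract above, a half-wave-voltage length product of 1.7 V per cm-inverse and a half-wave voltage of 3.4 V for a 5-mm modulation region, are related by a single division. The sketch below is a consistency check on those reported numbers; the function name is a hypothetical helper, and the 10-mm case is an extrapolation that assumes the product stays constant with length.

```python
def half_wave_voltage(vpi_length_product_v_cm: float, length_mm: float) -> float:
    """Half-wave voltage (V) from the voltage-length product (V*cm) and device length (mm)."""
    if length_mm <= 0:
        raise ValueError("length must be positive")
    length_cm = length_mm / 10.0
    return vpi_length_product_v_cm / length_cm


# Values reported in the abstract: 1.7 V*cm product, 5 mm modulation region.
print(half_wave_voltage(1.7, 5.0))

# Hypothetical longer device, assuming the same product holds.
print(half_wave_voltage(1.7, 10.0))
```

The first call reproduces the reported 3.4 V, and the second shows the usual trade-off: a longer modulation region lowers the drive voltage.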

preprint 2021 arXiv

Observation of optical gyromagnetic properties in a magneto-plasmonic metamaterial

Metamaterials with artificial optical properties have attracted significant research interest. In particular, artificial magnetic resonances with a non-unity permeability tensor at optical frequencies in metamaterials have been reported. However, only non-unity diagonal elements of the permeability tensor have been demonstrated to date. A gyromagnetic permeability tensor with non-zero off-diagonal elements has not been observed at optical frequencies. Here we report the observation of gyromagnetic properties in the near-infrared wavelength range in a magneto-plasmonic metamaterial. The non-zero off-diagonal permeability tensor element causes the transverse magneto-optical Kerr effect (TMOKE) under s-polarized incidence that otherwise vanishes if the permeability tensor is not gyromagnetic. By retrieving the permeability tensor elements from reflection, transmission, and TMOKE spectra, we show that the effective off-diagonal permeability tensor elements reach the $10^{-3}$ level at the resonance wavelength (~900 nm) of the split-ring resonators, which is at least two orders of magnitude higher than that of magneto-optical materials at the same wavelength. The artificial gyromagnetic permeability is attributed to the change in the local electric field direction modulated by the split-ring resonators. Our study demonstrates the possibility of engineering the permeability and permittivity tensors in metamaterials at arbitrary frequencies, thereby promising a variety of applications of next-generation nonreciprocal photonic devices, magneto-plasmonic sensors, and active metamaterials.

preprint 2021 arXiv

Ultrafast Parallel LiDAR with Time-encoding and Spectral Scanning: Breaking the Time-of-flight Limit

Light detection and ranging (LiDAR) has been widely used in autonomous driving and large-scale manufacturing. Although state-of-the-art scanning LiDAR can perform long-range three-dimensional imaging, the frame rate is limited by both round-trip delay and the beam steering speed, hindering the development of high-speed autonomous vehicles. For hundred-meter level ranging applications, a several-time speedup is highly desirable. Here, we uniquely combine fiber-based encoders with wavelength-division multiplexing devices to implement all-optical time-encoding on the illumination light. Using this method, parallel detection and fast inertia-free spectral scanning can be achieved simultaneously with single-pixel detection. As a result, the frame rate of a scanning LiDAR can be multiplied with scalability. We demonstrate a 4.4-fold speedup for a maximum 75-m detection range, compared with a time-of-flight-limited laser ranging system. This approach has the potential to improve the velocity of LiDAR-based autonomous vehicles to the regime of hundred kilometers per hour and open up a new paradigm for ultrafast-frame-rate LiDAR imaging.
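
The time-of-flight limit named in the abstract above follows from the round-trip delay of light. The sketch below estimates that limit for the 75-m range and applies the reported 4.4-fold speedup. It assumes propagation at the vacuum speed of light and a conventional system that waits for each echo before sending the next pulse; the function names are illustrative and the resulting rates are back-of-envelope estimates, not figures from the paper.

```python
SPEED_OF_LIGHT_M_S = 299_792_458.0  # vacuum value; propagation in air is marginally slower


def round_trip_time_s(max_range_m: float) -> float:
    """Time for a pulse to reach a target at max_range_m and return."""
    return 2.0 * max_range_m / SPEED_OF_LIGHT_M_S


def tof_limited_rate_hz(max_range_m: float) -> float:
    """Highest per-point measurement rate if only one pulse may be in flight at a time."""
    return 1.0 / round_trip_time_s(max_range_m)


max_range = 75.0   # metres, from the abstract
speedup = 4.4      # reported fold speedup, from the abstract

baseline = tof_limited_rate_hz(max_range)
print(f"round trip: {round_trip_time_s(max_range) * 1e9:.1f} ns")
print(f"time-of-flight-limited rate: {baseline / 1e6:.2f} MHz")
print(f"rate with {speedup}x speedup: {baseline * speedup / 1e6:.2f} MHz")
```

At 75 m the round trip takes about half a microsecond, so a one-pulse-at-a-time system is capped near two million points per second before any parallelization.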

preprint 2020 arXiv

An End-to-end Architecture of Online Multi-channel Speech Separation

Multi-speaker speech recognition has been one of the key challenges in conversation transcription as it breaks the single active speaker assumption employed by most state-of-the-art speech recognition systems. Speech separation is considered as a remedy to this problem. Previously, we introduced a system, called unmixing, fixed-beamformer and extraction (UFE), that was shown to be effective in addressing the speech overlap problem in conversation transcription. With UFE, an input mixed signal is processed by fixed beamformers, followed by a neural network post filtering. Although promising results were obtained, the system contains multiple individually developed modules, leading to potentially sub-optimum performance. In this work, we introduce an end-to-end modeling version of UFE. To enable gradient propagation all the way, an attentional selection module is proposed, where an attentional weight is learnt for each beamformer and spatial feature sampled over space. Experimental results show that the proposed system achieves comparable performance in an offline evaluation with the original separate processing-based pipeline, while producing remarkable improvements in an online evaluation.

preprint 2020 arXiv

Continuous speech separation: dataset and analysis

This paper describes a dataset and protocols for evaluating continuous speech separation algorithms. Most prior studies on speech separation use pre-segmented signals of artificially mixed speech utterances which are mostly fully overlapped, and the algorithms are evaluated based on signal-to-distortion ratio or similar performance metrics. However, in natural conversations, a speech signal is continuous, containing both overlapped and overlap-free components. In addition, the signal-based metrics have very weak correlations with automatic speech recognition (ASR) accuracy. We think that not only does this make it hard to assess the practical relevance of the tested algorithms, it also hinders researchers from developing systems that can be readily applied to real scenarios. In this paper, we define continuous speech separation (CSS) as a task of generating a set of non-overlapped speech signals from a continuous audio stream that contains multiple utterances that are partially overlapped by a varying degree. A new real recorded dataset, called LibriCSS, is derived from LibriSpeech by concatenating the corpus utterances to simulate a conversation and capturing the audio replays with far-field microphones. A Kaldi-based ASR evaluation protocol is also established by using a well-trained multi-conditional acoustic model. By using this dataset, several aspects of a recently proposed speaker-independent CSS algorithm are investigated. The dataset and evaluation scripts are available to facilitate the research in this direction.

preprint 2020 arXiv

Dual-path RNN: efficient long sequence modeling for time-domain single-channel speech separation

Recent studies in deep learning-based speech separation have proven the superiority of time-domain approaches to conventional time-frequency-based methods. Unlike the time-frequency domain approaches, the time-domain separation systems often receive input sequences consisting of a huge number of time steps, which introduces challenges for modeling extremely long sequences. Conventional recurrent neural networks (RNNs) are not effective for modeling such long sequences due to optimization difficulties, while one-dimensional convolutional neural networks (1-D CNNs) cannot perform utterance-level sequence modeling when their receptive field is smaller than the sequence length. In this paper, we propose the dual-path recurrent neural network (DPRNN), a simple yet effective method for organizing RNN layers in a deep structure to model extremely long sequences. DPRNN splits the long sequential input into smaller chunks and applies intra- and inter-chunk operations iteratively, where the input length can be made proportional to the square root of the original sequence length in each operation. Experiments show that by replacing 1-D CNN with DPRNN and applying sample-level modeling in the time-domain audio separation network (TasNet), a new state-of-the-art performance on WSJ0-2mix is achieved with a 20 times smaller model than the previous best system.
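
The chunking step described in the abstract above can be sketched without any neural network. The code below splits a sequence into half-overlapping chunks and picks a chunk size near the square root of twice the sequence length, so that both the intra-chunk length and the number of chunks grow roughly with the square root of the input length. The 50% overlap, the tail-only zero padding, and the size heuristic are simplifying assumptions for this sketch, and the RNN layers themselves are omitted.

```python
import math


def suggested_chunk_size(length: int) -> int:
    """Even chunk size close to sqrt(2 * length) (assumed heuristic for 50% overlap)."""
    size = max(2, round(math.sqrt(2 * length)))
    return size + (size % 2)  # force an even size so the hop is exactly half


def segment(sequence, chunk_size: int, pad_value=0.0):
    """Split a sequence into chunks of chunk_size with a hop of chunk_size // 2."""
    if chunk_size < 2 or chunk_size % 2:
        raise ValueError("chunk_size must be an even integer >= 2")
    hop = chunk_size // 2
    padded = list(sequence)
    # Pad the tail until the final chunk is complete.
    while len(padded) < chunk_size or (len(padded) - chunk_size) % hop:
        padded.append(pad_value)
    return [padded[start:start + chunk_size]
            for start in range(0, len(padded) - chunk_size + 1, hop)]


samples = [0.0] * 32000  # hypothetical input, e.g. 4 s of audio at 8 kHz
chunk_size = suggested_chunk_size(len(samples))
chunks = segment(samples, chunk_size)

# An intra-chunk pass would run along each row (length chunk_size);
# an inter-chunk pass would run across rows at a fixed position (length len(chunks)).
print(f"intra-chunk length: {chunk_size}, inter-chunk length: {len(chunks)}")
```

For the 32000-sample input, both path lengths come out near 250 steps instead of 32000, which is the reduction the dual-path arrangement is designed to exploit.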

preprint 2020 arXiv

End-to-end Microphone Permutation and Number Invariant Multi-channel Speech Separation

An important problem in ad-hoc microphone speech separation is how to guarantee the robustness of a system with respect to the locations and numbers of microphones. The former requires the system to be invariant to different indexing of the microphones with the same locations, while the latter requires the system to be able to process inputs with varying dimensions. Conventional optimization-based beamforming techniques satisfy these requirements by definition, while for deep learning-based end-to-end systems those constraints are not fully addressed. In this paper, we propose transform-average-concatenate (TAC), a simple design paradigm for channel permutation and number invariant multi-channel speech separation. Based on the filter-and-sum network (FaSNet), a recently proposed end-to-end time-domain beamforming system, we show how TAC significantly improves the separation performance across various numbers of microphones in noisy reverberant separation tasks with ad-hoc arrays. Moreover, we show that TAC also significantly improves the separation performance with fixed geometry array configuration, further proving the effectiveness of the proposed paradigm in the general problem of multi-microphone speech separation.
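The name transform-average-concatenate suggests a three-step operation, which the sketch below illustrates on plain Python lists. The abstract does not specify the layers involved, so the fixed scaling used as the per-channel transform and the function name `tac` are hypothetical placeholders; in a trained system the transform would be a learned layer.

```python
def tac(channels, transform=lambda v: [2.0 * x for x in v]):
    """Illustrative transform-average-concatenate step.

    channels:  list of per-microphone feature vectors (lists of floats).
    transform: placeholder for a shared per-channel layer (an assumption).
    """
    # Transform: apply the same function to every channel.
    transformed = [transform(v) for v in channels]
    # Average: pool across channels, which ignores channel order and count.
    n = len(transformed)
    mean = [sum(column) / n for column in zip(*transformed)]
    # Concatenate: append the pooled vector to each channel's own features.
    return [t + mean for t in transformed]


if __name__ == "__main__":
    mics = [[1.0, 2.0], [3.0, 4.0]]
    print(tac(mics))
    # Reordering the microphones only reorders the outputs.
    print(tac(mics[::-1]) == tac(mics)[::-1])
    # The same function accepts a different number of microphones.
    print(len(tac(mics + [[5.0, 6.0]])))
```

Because the only cross-channel operation is a mean, the same code handles any number of channels, and permuting the inputs permutes the outputs in the same way, which matches the two robustness requirements stated in the abstract.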

preprint2020arXiv

Low energy magnons in the chiral ferrimagnet $\text{Cu}_2\text{OSeO}_3$: a coarse-grained approach

We report a comprehensive neutron scattering study of low energy magnetic excitations in the breathing pyrochlore helimagnet $\text{Cu}_2\text{OSeO}_3$. Fully documenting the four lowest energy magnetic modes that leave the ferrimagnetic configuration of the "strong tetrahedra" intact ($|\hbar\omega|<13$ meV), we find gapless quadratic dispersion at the $\Gamma$ point for energies above 0.2 meV, two doublets separated by 1.6(2) meV at the $R$ point, and a bounded continuum at the $X$ point. Our constrained rigid spin cluster model relates these features to Dzyaloshinskii-Moriya (DM) interactions and the incommensurate helical ground state. Combining conventional spin wave theory with a spin cluster form-factor accurately reproduces the measured equal time structure factor through multiple Brillouin zones. An effective spin Hamiltonian describing the complex anisotropic inter-cluster interactions is obtained.

preprint2020arXiv

Real-time binaural speech separation with preserved spatial cues

Deep learning speech separation algorithms have achieved great success in improving the quality and intelligibility of separated speech from mixed audio. Most previous methods focused on generating a single-channel output for each of the target speakers, hence discarding the spatial cues needed for the localization of sound sources in space. However, preserving the spatial information is important in many applications that aim to accurately render the acoustic scene such as in hearing aids and augmented reality (AR). Here, we propose a speech separation algorithm that preserves the interaural cues of separated sound sources and can be implemented with low latency and high fidelity, therefore enabling a real-time modification of the acoustic scene. Based on the time-domain audio separation network (TasNet), a single-channel time-domain speech separation system that can be implemented in real-time, we propose a multi-input-multi-output (MIMO) end-to-end extension of TasNet that takes binaural mixed audio as input and simultaneously separates target speakers in both channels. Experimental results show that the proposed end-to-end MIMO system is able to significantly improve the separation performance and keep the perceived location of the modified sources intact in various acoustic scenes.

preprint2020arXiv

Separating Varying Numbers of Sources with Auxiliary Autoencoding Loss

Many recent source separation systems are designed to separate a fixed number of sources out of a mixture. In cases where the source activation patterns are unknown, such systems have to either adjust the number of outputs or identify the invalid outputs among the valid ones. Iterative separation methods have gained much attention in the community as they can flexibly decide the number of outputs. However, (1) they typically rely on long-term information to determine the stopping time for the iterations, which makes them hard to operate in a causal setting; (2) they lack a "fault tolerance" mechanism when the estimated number of sources is different from the actual number. In this paper, we propose a simple training method, the auxiliary autoencoding permutation invariant training (A2PIT), to alleviate the two issues. A2PIT assumes a fixed number of outputs and uses an auxiliary autoencoding loss to force the invalid outputs to be copies of the input mixture, and detects invalid outputs in a fully unsupervised way during the inference phase. Experiment results show that A2PIT is able to improve the separation performance across various numbers of speakers and effectively detect the number of speakers in a mixture.

preprint2020arXiv

Terahertz Pulse Shaping Using Diffractive Surfaces

Recent advances in deep learning have been providing non-intuitive solutions to various inverse problems in optics. At the intersection of machine learning and optics, diffractive networks merge wave-optics with deep learning to design task-specific elements to all-optically perform various tasks such as object classification and machine vision. Here, we present a diffractive network, which is used to shape an arbitrary broadband pulse into a desired optical waveform, forming a compact pulse engineering system. We experimentally demonstrate the synthesis of square pulses with different temporal-widths by manufacturing passive diffractive layers that collectively control both the spectral amplitude and the phase of an input terahertz pulse. Our results constitute the first demonstration of direct pulse shaping in terahertz spectrum, where a complex-valued spectral modulation function directly acts on terahertz frequencies. Furthermore, a Lego-like physical transfer learning approach is presented to illustrate pulse-width tunability by replacing part of an existing network with newly trained diffractive layers, demonstrating its modularity. This learning-based diffractive pulse engineering framework can find broad applications in e.g., communications, ultra-fast imaging and spectroscopy.

preprint2019arXiv

Fragmentation and isomerization of polycyclic aromatic hydrocarbons in the interstellar medium: coronene as a case study

Aims. Due to the limitations of current computational technology, the fragmentation and isomerization products of vibrationally-excited polycyclic aromatic hydrocarbon (PAH) molecules and their derivatives are poorly studied. In this work, we investigate the intermediate products of PAHs and their derivatives as well as the gas-phase reactions relevant to the interstellar medium, with coronene as a case study. Methods. Based on the semi-empirical method of PM3 as implemented in the CP2K program, molecular dynamics simulations are performed to model the major processes (e.g., vibrations, fragmentations, and isomerizations) of coronene and its derivatives (e.g., methylated coronene, hydrogenated coronene, dehydrogenated coronene, nitrogen-substituted coronene, and oxygen-substituted coronene) at temperatures of 3000 K and 4000 K. Results. We find that the anharmonic effects are crucial for the simulation of vibrational excitation. For the molecules studied here, H2, CO, HCN, and CH2 are the major fragments. Following the dissociation of these small units, most of the molecules could maintain their ring structures, but a few molecules would break completely into carbon chains. The transformation from hexagon to pentagon or heptagon may occur and the heteroatomic substitutions (e.g., N- or O-substitutions) facilitate the transformation.

preprint2019arXiv

Visually Constructing the Chemical Structure of a Single Molecule by Scanning Raman Picoscopy

The strong spatial confinement of a nanocavity plasmonic field has made it possible to visualize the inner structure of a single molecule and even to distinguish its vibrational modes in real space. With such ever-improved spatial resolution, it is anticipated that full vibrational imaging of a molecule could be achieved to reveal molecular structural details. Here we demonstrate full Raman images of individual vibrational modes on the Ångström level for a single Mg-porphine molecule, revealing distinct characteristics of each vibrational mode in real space. Furthermore, by exploiting the underlying interference effect and Raman fingerprint database, we propose a new methodology for structural determination, coined as scanning Raman picoscopy, to show how such ultrahigh-resolution spectromicroscopic vibrational images can be used to visually assemble the chemical structure of a single molecule through a simple Lego-like building process.

preprint2016arXiv

"WM"-Shaped Growth of GaN on Patterned Sapphire Substrates

In metal organic vapor phase epitaxy of GaN, the growth mode is sensitive to reactor temperature. In this study, V-pit-shaped GaN has been grown on normal c-plane cone-patterned sapphire substrate by decreasing the growth temperature of high-temperature-GaN to around 950 °C, which leads to the 3-dimensional growth of GaN. The so-called "WM" well describes the shape in which the bottom of each GaN V-pit sits right over the top of a sapphire cone, and the regular arrangement of V-pits strictly follows the pattern of the sapphire substrate. Two types of semipolar facets, $(1\bar{1}01)$ and $(11\bar{2}2)$, are exposed on the sidewalls of the V-pits. Furthermore, by raising the growth temperature to 1000 °C, the growth mode of GaN can be switched to 2-dimensional growth. Accordingly, the size of the V-pits becomes smaller and the area of c-plane GaN becomes larger, while the total thickness of GaN remains almost unchanged during this process. As long as the 2-dimensional growth lasts, the V-pits will disappear and only flat c-plane GaN remains. This means the area ratio of c-plane and semipolar plane GaN can be controlled by the duration of 2-dimensional growth.

preprint2016arXiv

A PMT-like high gain avalanche photodiode based on GaN/AlN periodical stacked structure

The avalanche photodiode (APD) has been intensively investigated as a promising candidate to replace photomultiplier tubes (PMT) for weak light detection. However, in conventional APDs, a large portion of carrier energy drawn from the electric field is thermalized, and the multiplication efficiencies of electron and hole are low and close. In order to achieve high gain, the device should work under breakdown bias, where carrier multiplication proceeds bi-directionally to form a positive feedback multiplication circle. However, breakdown is hard to control, so in practice APDs should work under Geiger mode as a compromise between sustainable detection and high gain. The complexity of the system seriously restricts the application. Here, we demonstrate an avalanche photodiode holding high gain without breakdown, which means no quenching circuit is needed for sustainable detection. The device is based on a GaN/AlN periodically-stacked-structure (PSS), wherein the electron holds much higher efficiency than the hole to draw energy from the electric field, and avalanche happens uni-directionally with high efficiency. A recorded high gain (10^4) tested under constant bias is obtained in a prototype device, wherein the stable gain can be determined by the periodicity of the GaN/AlN PSS. This work not only brings a new light into the avalanche multiplication mechanism, but also paves a technological path with high commercial value to realize highly sensitive avalanche devices working under constant bias like PMT.

preprint2016arXiv

Broadband frequency comb generation in aluminum nitride-on-sapphire microresonators

Development of chip-scale optical frequency comb with the coverage from ultra-violet (UV) to mid-infrared (MIR) wavelength is of great significance. To expand the comb spectrum into the challenging UV region, a material platform with high UV transparency is crucial. In this paper, crystalline aluminum nitride (AlN)-on-sapphire film is demonstrated for efficient Kerr frequency comb generation. Near-infrared (NIR) comb with nearly octave-spanning coverage and low parametric threshold is achieved in continuous-wave pumped high-quality-factor AlN microring resonators. The competition between stimulated Raman scattering (SRS) and hyperparametric oscillation is investigated, along with broadband comb generation via Raman-assisted four-wave mixing (FWM). Thanks to its wide bandgap, excellent crystalline quality as well as intrinsic quadratic and cubic susceptibilities, AlN-on-sapphire platform should be appealing for integrated nonlinear optics from MIR to UV region.

preprint2016arXiv

Continuous-wave Raman Lasing in Aluminum Nitride Microresonators

We report the first investigation on continuous-wave Raman lasing in high-quality-factor aluminum nitride (AlN) microring resonators. Although wurtzite AlN is known to exhibit six Raman-active phonons, single-mode Raman lasing with low threshold and high slope efficiency is demonstrated. Selective excitation of A$_1^\mathrm{TO}$ and E$_2^\mathrm{high}$ phonons with Raman shifts of $\sim$612 and 660 cm$^{-1}$ is observed by adjusting the polarization of the pump light. A theoretical analysis of Raman scattering efficiency within ${c}$-plane (0001) of AlN is carried out to help account for the observed lasing behavior. Bidirectional lasing is experimentally confirmed as a result of symmetric Raman gain in micro-scale waveguides. Furthermore, second-order Raman lasing with unparalleled output power of $\sim$11.3 mW is obtained, which offers the capability to yield higher order Raman lasers for mid-infrared applications.

preprint2016arXiv

InGaN/GaN Multi-Quantum-Well and Light-Emitting Diode Based on V-pit-Shaped GaN Grown on Patterned Sapphire Substrate

V-pit-defects in GaN-based light-emitting diodes induced by dislocations are considered beneficial to electroluminescence because they relax the strain in InGaN quantum wells and also enhance the hole lateral injection through sidewall of V-pits. In this paper, regularly arranged V-pits are formed on c-plane GaN grown by metal organic vapor phase epitaxy on conventional c-plane cone-patterned sapphire substrates. The size of V-pits and area of flat GaN can be adjusted by changing growth temperature. Five pairs of InGaN/GaN multi-quantum-well and also a light-emitting diode structure are grown on this V-pit-shaped GaN. Two peaks around 410 nm and 450 nm appearing in both photoluminescence and cathodoluminescence spectra are from the semipolar InGaN/GaN multi-quantum-well on sidewalls of V-pits and c-plane InGaN/GaN multi-quantum-well, respectively. In addition, dense bright spots can be observed on the surface of the light-emitting diode when it works under small injection current, which are believed to be owing to the enhanced hole injection around V-pits.

preprint2016arXiv

Understanding different efficiency droop behaviors in InGaN-based near-UV, blue and green light-emitting diodes through differential carrier lifetime measurements

The efficiency droop effect under high injection in GaN-based light emitting diodes (LEDs) strongly depends on wavelength, which is still not well understood. In this paper, through differential carrier lifetime measurements on commercialized near-UV, blue, and green LEDs, their different efficiency droop behaviors are attributed to different carrier lifetimes, which are prolonged as wavelength increases. This relationship between carrier lifetime and indium composition of the InGaN quantum well is believed to be owing to the polarization-induced quantum-confined Stark effect. Long carrier lifetime not only increases the probability of carrier leakage, but also results in high carrier concentration in the quantum well. In other words, under the same current density, the carrier concentration in the active region of the near-UV LED is the lowest while that in the green one is the highest. If the efficiency droop is considered as a function of carrier concentration, the behaviors of LEDs with different wavelengths do not show any abnormality. The reason why the efficiency droop becomes more serious at lower temperature can also be explained by this model. Based on this result, possible solutions to overcome efficiency droop are discussed. It seems that decreasing the carrier lifetime is a fundamental approach to solve the problem.

preprint2015arXiv

Coherent Resonances Observed in the Dissociative Electron Attachments to Carbon Monoxide

Following our previous finding of coherent interference of the resonant states of CO^- formed by low-energy electron attachment [Phys. Rev. A 88, 012708 (2013)], here we provide more evidence of the coherent interference. In particular, by measuring the completely backward distributions of the O^- fragment ion of the temporary CO^- in the energy range 11.3-12.6 eV, we find that the state configuration in the interference changes with increasing electron attachment energy. Therefore, different pure states, namely, coherent resonances, can be formed when the close-lying resonant states are coherently superposed by a broad-band electron pulse.

preprint2015arXiv

Raman Images of a Single Molecule in a Highly Confined Plasmonic Field

Under the local plasmonic excitation, the Raman images of a single molecule can now reach sub-nanometer resolution. We report here a theoretical description of the interaction between a molecule and a highly confined plasmonic field. It is shown that when the spatial distribution of the plasmonic field is comparable with the size of the molecule, the optical transition matrix of the molecule becomes dependent on the position and the spatial distribution of the plasmonic field, resulting in a spatially resolved Raman image of the molecule. It is found that the resonant Raman image reflects the electronic transition density of the molecule. In combination with first principles calculations, the simulated Raman image of a porphyrin derivative adsorbed on the silver surface nicely reproduces its experimental counterpart. The present theory provides the basic framework for describing linear and nonlinear responses of molecules under the highly confined plasmonic field.

preprint2015arXiv

Significant contributions of Albrecht's $A$ term to non-resonant Raman scattering processes

The Raman intensity can be well described by the famous Albrecht equation that consists of A and B terms. It is well known that the contribution from Albrecht's A term can be neglected without loss of accuracy for far off-resonant Raman scattering processes. However, as demonstrated in this study, this widely accepted long-standing assumption fails drastically for totally symmetric vibration modes of molecules in general off-resonant Raman scattering. Perturbed first principles calculations for the water molecule show that strong constructive interference between the A and B terms occurs for the Raman intensity of the symmetric O-H stretching mode, which can account for about 40% of the total intensity. Meanwhile, a minor destructive interference is found for the angle bending mode. The state to state mapping between Albrecht's theory and the perturbation theory allows us to verify the accuracy of the widely employed perturbation method for the dynamic/resonant Raman intensities. The model calculations rationalized from the water molecule with the bending mode show that the perturbation method is a good approximation only when the absolute energy difference between the first excited state and the incident light is more than five times the vibrational energy in the ground state.

preprint2014arXiv

Degrees-of-Freedom Regions for $K$-User MISO Time-Correlated Broadcast Channel

In this paper, we study the achievable degrees-of-freedom (DoF) regions of the $K$-user multiple-input-single-output (MISO) time correlated broadcast channel (BC). The time correlation induces knowledge of the current channel state information at transmitter (CSIT) with an estimation error $P^{-\alpha}$, where $P$ is the signal-to-noise ratio (SNR). We consider the following two scenarios: $(i)$ $K$-user with $K$-antenna base station (BS) and $(ii)$ $3$-user with $2$-antenna BS. In case of symmetric DoF tuples, where all the users obtain the same DoF, we derive the total DoF equal to $\frac{K(1-\alpha)}{1+\frac{1}{2}+\cdots+\frac{1}{K}}+K\alpha$ for the first scenario and $\frac{3+\alpha}{2}$ for the second one. In particular, we provide the achievability schemes for these two DoF tuples. We also consider the asymmetric case where one of the users is guaranteed one DoF, and provide the achievability scheme. Notably, the consistency of the proposed DoF regions with an already published outer bound, as well as with the Maddah-Ali-Tse (MAT) scheme, which assumes only perfect delayed CSIT, and the ZF beamforming schemes (perfect current CSIT), supports the optimality of the proposed achievability schemes.
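As a quick numerical check of the two symmetric total-DoF expressions quoted in this abstract, the sketch below evaluates them with exact fractions. The function names are illustrative only; the formulas are the ones stated above, with $\alpha$ taken in $[0, 1]$.

```python
from fractions import Fraction


def total_dof_k_users(num_users, alpha):
    """Total symmetric DoF for K users and a K-antenna base station.

    Evaluates K(1 - alpha) / (1 + 1/2 + ... + 1/K) + K * alpha.
    """
    alpha = Fraction(alpha)
    harmonic = sum(Fraction(1, k) for k in range(1, num_users + 1))
    return num_users * (1 - alpha) / harmonic + num_users * alpha


def total_dof_three_users_two_antennas(alpha):
    """Total symmetric DoF for 3 users and a 2-antenna base station."""
    return (3 + Fraction(alpha)) / 2


if __name__ == "__main__":
    for a in (Fraction(0), Fraction(1, 2), Fraction(1)):
        print(a, total_dof_k_users(3, a), total_dof_three_users_two_antennas(a))
```

At $\alpha = 0$ the first expression reduces to $K/(1+\frac{1}{2}+\cdots+\frac{1}{K})$ and at $\alpha = 1$ it reaches $K$, while the second expression ranges from $3/2$ to $2$, the number of transmit antennas.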

preprint2013arXiv

From microscopic theory to macroscopic theory: a systematic study on static modeling for liquid crystals

In this paper, we propose a systematic way of liquid crystal modeling to build a connection between microscopic theory and macroscopic theory. A new $Q$-tensor theory based on Onsager's molecular theory which leads to liquid crystals with certain shape has been proposed. Making the uniaxial assumption, we can recover the Oseen-Frank theory from the derived $Q$-tensor theory, and the Oseen-Frank model coefficients can be examined. In addition, the smectic-A phase can also be characterized by the derived macroscopic model.

preprint2011arXiv

CO2 dissociation activated through electron attachment on reduced rutile TiO2(110)-1x1 surface

Converting CO$_2$ to useful compounds through the solar photocatalytic reduction has been one of the most promising strategies for artificial carbon recycling. The highly relevant photocatalytic substrate for CO$_2$ conversion has been the popular TiO$_2$ surfaces. However, the lack of accurate fundamental parameters that determine the CO$_2$ reduction on TiO$_2$ has limited our ability to control these complicated photocatalysis processes. We have systematically studied the reduction of CO$_2$ at specific sites of the rutile TiO$_2$(110)-1x1 surface using scanning tunneling microscopy at 80 K. The dissociation of CO$_2$ molecules is found to be activated by one electron attachment process and its energy threshold, corresponding to the CO$_2^{\dot-}$/CO$_2$ redox potential, is unambiguously determined to be 2.3 eV higher than the onset of the TiO$_2$ conduction band. The dissociation rate as a function of electron injection energy is also provided. Such information can be used as practical guidelines for the design of effective catalysts for CO$_2$ photoreduction.

preprint2011arXiv

Evidence of Photocatalytic Dissociation of Water on TiO2 with Atomic Resolution

Photocatalytic water splitting reaction on TiO2 surface is one of the fundamental issues that bears significant implication in hydrogen energy technology and has been extensively studied. However, the existence of the very first reaction step, the direct photo-dissociation of water, has been disregarded. Here, we provide unambiguous experimental evidence to demonstrate that adsorbed water molecules on reduced rutile TiO2(110)-1x1 surface can be dissociated under UV irradiation using low temperature scanning tunneling microscopy. It is identified that a water molecule at a fivefold coordinated Ti (Ti5c) site can be photocatalytically dissociated, resulting in a hydroxyl at Ti5c and another hydroxyl at the bridge oxygen row. Our findings reveal a missing link in the photocatalytic water splitting reaction chain, which greatly contributes to the detailed understanding of the underlying mechanism.

preprint2011arXiv

Laser-launched evanescent surface plasmon polariton field utilized as a direct coherent pumping source to generate emitted nonlinear four-wave mixing radiation

We develop a concept of surface plasmon polariton (SPP)-based four-wave mixing (4WM), in which a laser-launched evanescent SPP field is utilized as a coherent pumping source that participates directly in a nonlinear 4WM process at the dielectric/metal interface. Conversion efficiency of the resulting 4WM radiation is expected to be dramatically increased due to the local-field enhancement effect. Feasibility of implementing this concept at the air/gold film and graphene flake/gold film interfaces is further examined by numerical simulations. The concept shows intriguing promise for applications in newly emerging nanophotonics, optoelectronics, and active plasmonics.

preprint2011arXiv

Nonlinear electron scattering activated by surface plasmon excitation of Ag nanostructures

The discovery of many fascinating new phenomena associated with the surface plasmon polariton (SPP) has triggered the rapid development of nanophotonics and nanoelectronics. We report here the experimental observation of a fundamentally new physical process, nonlinear electron scattering, stimulated by the SPP excitation of Ag nanostructures on graphite surface in scanning probe electron energy loss spectroscopy. The observed intensity of SPP energy loss peak normalized to the elastic scattering intensity shows clearly a quadratic dependence on the external electric field strength generated by the tip-sample bias. The strong coherent nature of the SPP has made the observation possible and a two-step scattering process is proposed to explain this novel nonlinear effect. Our findings shed new light on the nature of SPP and pave the way to new spectroscopic applications.

preprint2010arXiv

A density matrix approach for the electroluminescence of molecules in a scanning tunneling microscope

The electroluminescence of molecules confined inside a nanocavity in a scanning tunneling microscope possesses many intriguing but unexplained features. We present here a general theoretical approach based on the density matrix formalism to describe the electroluminescence from molecules near a metal surface induced by both electron tunneling and local surface plasmon excitations simultaneously. It reveals the underlying physical mechanism for the external bias dependent electroluminescence. The important role played by the local surface plasmon on the electroluminescence is highlighted. Calculations for porphyrin derivatives have reproduced corresponding experimental spectra and nicely explained the observed unusually large variation of emission spectral profiles. This general theoretical approach can find many applications in the design of molecular electronic and photonic devices.

preprint2010arXiv

Rotation and dissociation dynamics of a single O2 molecule on the Pt(111) surface determined from a first principles study

The STM induced rotation and dissociation dynamics of a single oxygen molecule on the Pt(111) surface have been finally determined by first principles calculations together with a newly developed statistical model for inelastic electron tunneling. Several long-standing puzzles associated with these dynamic processes in this classic system have been fully resolved. It is found that the unexpectedly low energy barrier of the O2 rotation originates from an ingenious pathway, while the prior occupation of the metastable hcp-hollow site after the O2 dissociation can be attributed to a dynamic process of surface accommodation. The experimentally observed non-integer power-law dependence of the rotation rate as a function of the current can be perfectly explained by taking into account the randomness of multi-electron inelastic tunneling processes.

preprint2005arXiv

First-Principles Simulations of Inelastic Electron Tunneling Spectroscopy of Molecular Junctions

A generalized Green's function theory is developed to simulate the inelastic electron tunneling spectroscopy (IETS) of molecular junctions. It has been applied to a realistic molecular junction with an octanedithiolate embedded between two gold contacts in combination with the hybrid density functional theory calculations. The calculated spectra are in excellent agreement with recent experimental results. Strong temperature dependence of the experimental IETS spectra is also reproduced. It is shown that the IETS is extremely sensitive to the intra-molecular conformation and to the molecule-metal contact geometry.

47 published item(s)

DataClawBench: An Agent Benchmark for Exploratory Real-World Financial Data Analysis

A Time-domain Real-valued Generalized Wiener Filter for Multi-channel Neural Separation Systems

An Information-theoretical Secured Byzantine-fault Tolerance Consensus in Quantum Key Distribution Network

Analysis of Diffractive Neural Networks for Seeing Through Random Diffusers

FRA-RIR: Fast Random Approximation of the Image-source Method

Improving Choral Music Separation through Expressive Synthesized Data from Sampled Instruments

Massively Parallel Universal Linear Transformations using a Wavelength-Multiplexed Diffractive Optical Network

Node Representation Learning in Graph via Node-to-Neighbourhood Mutual Information Maximization

On the Use of Deep Mask Estimation Module for Neural Source Separation Systems

To image, or not to image: Class-specific diffractive cameras with all-optical erasure of undesired objects

Atomic-Scale Probing of Heterointerface Phonon Bridges in Nitride Semiconductor

Cascadable all-optical NAND gates using diffractive networks

Characterization of exhaled e-cigarette aerosols in a vape shop using a field-portable holographic on-chip microscope

Computational Imaging Without a Computer: Seeing Through Random Diffusers at the Speed of Light

Dual-Path Modeling for Long Recording Speech Separation in Meetings

High-current CNT films grown directly on commercially available 2.5D substrates for low-voltage field-emission electron sources

Low half-wave-voltage, ultra-high bandwidth thin-film LiNbO3 modulator based on hybrid waveguide and periodic capacitively loaded electrodes

Observation of optical gyromagnetic properties in a magneto-plasmonic metamaterial

Ultrafast Parallel LiDAR with Time-encoding and Spectral Scanning: Breaking the Time-of-flight Limit

An End-to-end Architecture of Online Multi-channel Speech Separation

Continuous speech separation: dataset and analysis

Dual-path RNN: efficient long sequence modeling for time-domain single-channel speech separation

End-to-end Microphone Permutation and Number Invariant Multi-channel Speech Separation

Low energy magnons in the chiral ferrimagnet $\text{Cu}_2\text{OSeO}_3$: a coarse-grained approach

Real-time binaural speech separation with preserved spatial cues

Separating Varying Numbers of Sources with Auxiliary Autoencoding Loss

Terahertz Pulse Shaping Using Diffractive Surfaces

Fragmentation and isomerization of polycyclic aromatic hydrocarbons in the interstellar medium: coronene as a case study

Visually Constructing the Chemical Structure of a Single Molecule by Scanning Raman Picoscopy

"WM"-Shaped Growth of GaN on Patterned Sapphire Substrates

A PMT-like high gain avalanche photodiode based on GaN/AlN periodical stacked structure

Broadband frequency comb generation in aluminum nitride-on-sapphire microresonators

Continuous-wave Raman Lasing in Aluminum Nitride Microresonators

InGaN/GaN Multi-Quantum-Well and Light-Emitting Diode Based on V-pit-Shaped GaN Grown on Patterned Sapphire Substrate

Understanding different efficiency droop behaviors in InGaN-based near-UV, blue and green light-emitting diodes through differential carrier lifetime measurements

Coherent Resonances Observed in the Dissociative Electron Attachments to Carbon Monoxide

Raman Images of a Single Molecule in a Highly Confined Plasmonic Field

Significant contributions of Albrecht's $A$ term to non-resonant Raman scattering processes

Degrees-of-Freedom Regions for $K$-User MISO Time-Correlated Broadcast Channel

From microscopic theory to macroscopic theory: a systematic study on static modeling for liquid crystals

CO2 dissociation activated through electron attachment on reduced rutile TiO2(110)-1x1 surface

Evidence of Photocatalytic Dissociation of Water on TiO2 with Atomic Resolution

Laser-launched evanescent surface plasmon polariton field utilized as a direct coherent pumping source to generate emitted nonlinear four-wave mixing radiation

Nonlinear electron scattering activated by surface plasmon excitation of Ag nanostructures

A density matrix approach for the electroluminescence of molecules in a scanning tunneling microscope

Rotation and dissociation dynamics of a single O2 molecule on the Pt(111) surface determined from a first principles study

First-Principles Simulations of Inelastic Electron Tunneling Spectroscopy of Molecular Junctions