Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
22 works
0 followers
18 topics
4 close collaborators


Published work

22 published item(s)

Preprint, 2022, arXiv

A Fast Attention Network for Joint Intent Detection and Slot Filling on Edge Devices

Intent detection and slot filling are two main tasks in natural language understanding and play an essential role in task-oriented dialogue systems. The joint learning of both tasks can improve inference accuracy and is popular in recent work. However, most joint models ignore the inference latency and cannot meet the need to deploy dialogue systems at the edge. In this paper, we propose a Fast Attention Network (FAN) for joint intent detection and slot filling tasks, guaranteeing both accuracy and latency. Specifically, we introduce a clean and parameter-refined attention module to enhance the information exchange between intent and slot, improving semantic accuracy by more than 2%. FAN can be implemented on different encoders and delivers more accurate models at every speed level. Our experiments on the Jetson Nano platform show that FAN processes fifteen utterances per second with a small accuracy drop, demonstrating its effectiveness and efficiency on edge devices.

Preprint, 2022, arXiv

A$^3$T: Alignment-Aware Acoustic and Text Pretraining for Speech Synthesis and Editing

Recently, speech representation learning has improved many speech-related tasks such as speech recognition, speech classification, and speech-to-text translation. However, all of these tasks concern speech understanding; for the inverse direction, speech synthesis, the potential of representation learning is yet to be realized, owing to the challenging nature of generating high-quality speech. To address this problem, we propose our framework, Alignment-Aware Acoustic-Text Pretraining (A$^3$T), which reconstructs masked acoustic signals with text input and acoustic-text alignment during training. In this way, the pretrained model can generate high-quality reconstructed spectrograms, which can be applied directly to speech editing and unseen-speaker TTS. Experiments show that A$^3$T outperforms SOTA models on speech editing, and improves multi-speaker speech synthesis without the external speaker verification model.

Preprint, 2022, arXiv

Data-Driven Adaptive Simultaneous Machine Translation

In simultaneous translation (SimulMT), the most widely used strategy is the wait-k policy thanks to its simplicity and effectiveness in balancing translation quality and latency. However, wait-k suffers from two major limitations: (a) it is a fixed policy that cannot adaptively adjust latency given context, and (b) its training is much slower than full-sentence translation. To alleviate these issues, we propose a novel and efficient training scheme for adaptive SimulMT by augmenting the training corpus with adaptive prefix-to-prefix pairs, while the training complexity remains the same as that of training full-sentence translation models. Experiments on two language pairs show that our method outperforms all strong baselines in terms of translation quality and latency.
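For readers unfamiliar with the fixed wait-k policy that this abstract takes as its baseline, the following is a minimal sketch of its read/write schedule: the decoder first reads k source tokens and then alternates between writing one target token and reading one more source token until the source is exhausted. The function names are hypothetical and chosen for illustration; the sketch does not cover the adaptive training scheme the paper proposes.

```python
from typing import List


def wait_k_schedule(src_len: int, tgt_len: int, k: int) -> List[int]:
    """For each 0-based target position t, return how many source tokens
    have been read before target token t is written under wait-k."""
    if k < 1:
        raise ValueError("k must be at least 1")
    # The decoder stays k tokens behind the source until the source runs out.
    return [min(k + t, src_len) for t in range(tgt_len)]


def wait_k_actions(src_len: int, tgt_len: int, k: int) -> List[str]:
    """Expand the schedule into an explicit READ/WRITE action sequence."""
    actions: List[str] = []
    read = 0
    for needed in wait_k_schedule(src_len, tgt_len, k):
        while read < needed:
            actions.append("READ")
            read += 1
        actions.append("WRITE")
    return actions


if __name__ == "__main__":
    print(wait_k_schedule(5, 6, 2))
    print(wait_k_actions(3, 3, 2))
```

Because the lag k is fixed in advance, the schedule depends only on sentence lengths and never on the content being translated, which is the inflexibility described in limitation (a).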

Preprint, 2022, arXiv

Experimental demonstration of genuine tripartite nonlocality under strict locality conditions

Nonlocality captures one of the counterintuitive features of nature that defies classical intuition. Recent investigations reveal that our physical world's nonlocality is at least tripartite; i.e., genuinely tripartite nonlocal correlations in nature cannot be reproduced by any causal theory involving bipartite nonclassical resources and unlimited shared randomness. Here, by allowing the fair sampling assumption and postselection, we experimentally demonstrate such genuine tripartite nonlocality in a network under strict locality constraints that are ensured by spacelike separating all relevant events and employing fast quantum random number generators and high-speed polarization measurements. In particular, for a photonic quantum triangular network we observe a locality-loophole-free violation of the Bell-type inequality by 7.57 standard deviations for a postselected tripartite Greenberger-Horne-Zeilinger state of fidelity $(93.13 \pm 0.24)\%$, which convincingly disproves the possibility of simulating genuine tripartite nonlocality by bipartite nonlocal resources with globally shared randomness.

Preprint, 2022, arXiv

PaddleSpeech: An Easy-to-Use All-in-One Speech Toolkit

PaddleSpeech is an open-source all-in-one speech toolkit. It aims at facilitating the development and research of speech processing technologies by providing an easy-to-use command-line interface and a simple code structure. This paper describes the design philosophy and core architecture of PaddleSpeech to support several essential speech-to-text and text-to-speech tasks. PaddleSpeech achieves competitive or state-of-the-art performance on various speech datasets and implements the most popular methods. It also provides recipes and pretrained models to quickly reproduce the experimental results in this paper. PaddleSpeech is publicly available at https://github.com/PaddlePaddle/PaddleSpeech.

Preprint, 2022, arXiv

Prediction and Control of Focal Seizure Spread: Random Walk with Restart on Heterogeneous Brain Networks

Whole-brain models offer a promising method of predicting seizure spread, which is critical for successful surgical treatment of focal epilepsy. Existing methods are largely based on the structural connectome, which ignores the effects of heterogeneity in regional excitability of brains. In this study, we used a whole-brain model to show that heterogeneity in nodal excitability had a significant impact on seizure propagation in the networks, and compromised the prediction accuracy with structural connections. We then addressed this problem with an algorithm based on random walk with restart on graphs. We demonstrated that by establishing a relationship between the restarting probability and the excitability for each node, this algorithm could significantly improve the seizure spread prediction accuracy in heterogeneous networks, and was more robust against the extent of heterogeneity. We also strategized surgical seizure control as a process to identify and remove the key nodes (connections) responsible for the early spread of seizures from the focal region. Compared to strategies based on structural connections, virtual surgery with a strategy based on mRWER generated outcomes with a high success rate while maintaining low damage to the brain by removing fewer anatomical connections. These findings may have potential applications in developing personalized surgery strategies for epilepsy.
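As background for the random walk with restart mentioned in this abstract, here is a minimal sketch of the standard form of the algorithm with a single, uniform restart probability. It is not the paper's mRWER variant: the abstract does not specify how the restart probability is tied to each node's excitability, so that relationship is not reproduced here. The function name, the default `restart` value and the toy graph are assumptions made for illustration.

```python
from typing import List


def random_walk_with_restart(adj: List[List[float]], seed: int,
                             restart: float = 0.3, tol: float = 1e-10,
                             max_iter: int = 1000) -> List[float]:
    """Iterate p <- (1 - restart) * W p + restart * e, where W is the
    column-normalised adjacency matrix and e is the indicator of the seed."""
    n = len(adj)
    col_sums = [sum(adj[i][j] for i in range(n)) for j in range(n)]
    # W[i][j] is the probability of stepping from node j to node i.
    W = [[adj[i][j] / col_sums[j] if col_sums[j] else 0.0 for j in range(n)]
         for i in range(n)]
    e = [0.0] * n
    e[seed] = 1.0
    p = e[:]
    for _ in range(max_iter):
        nxt = [(1.0 - restart) * sum(W[i][j] * p[j] for j in range(n))
               + restart * e[i] for i in range(n)]
        # Stop once the L1 change between iterations is negligible.
        if sum(abs(a - b) for a, b in zip(nxt, p)) < tol:
            return nxt
        p = nxt
    return p


if __name__ == "__main__":
    # Toy path graph 0 - 1 - 2, with the walk restarting at node 0.
    path = [[0, 1, 0],
            [1, 0, 1],
            [0, 1, 0]]
    print(random_walk_with_restart(path, seed=0))
```

The resulting vector ranks nodes by proximity to the seed, which is how such scores are typically used to estimate where activity starting at a focal node is likely to spread first.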

Preprint, 2021, arXiv

MAM: Masked Acoustic Modeling for End-to-End Speech-to-Text Translation

End-to-end Speech-to-text Translation (E2E-ST), which directly translates source language speech to target language text, is widely useful in practice, but traditional cascaded approaches (ASR+MT) often suffer from error propagation in the pipeline. On the other hand, existing end-to-end solutions heavily depend on the source language transcriptions for pre-training or multi-task training with Automatic Speech Recognition (ASR). We instead propose a simple technique to learn a robust speech encoder in a self-supervised fashion only on the speech side, which can utilize speech data without transcription. This technique, termed Masked Acoustic Modeling (MAM), not only provides an alternative solution to improving E2E-ST, but also can perform pre-training on any acoustic signals (including non-speech ones) without annotation. We conduct our experiments over 8 different translation directions. In the setting without using any transcriptions, our technique achieves an average improvement of +1.1 BLEU, and +2.3 BLEU with MAM pre-training. Pre-training of MAM with arbitrary acoustic signals also yields an average improvement of +1.6 BLEU for those languages. Compared with the ASR multi-task learning solution, which relies on transcription during training, our pre-trained MAM model, which does not use transcription, achieves similar accuracy.

Preprint, 2021, arXiv

Stable Online Computation Offloading via Lyapunov-guided Deep Reinforcement Learning

In this paper, we consider a multi-user mobile-edge computing (MEC) network with time-varying wireless channels and stochastic user task data arrivals in sequential time frames. In particular, we aim to design an online computation offloading algorithm to maximize the network data processing capability subject to the long-term data queue stability and average power constraints. The online algorithm is practical in the sense that the decisions for each time frame are made without the assumption of knowing future channel conditions and data arrivals. We formulate the problem as a multi-stage stochastic mixed integer non-linear programming (MINLP) problem that jointly determines the binary offloading (each user computes the task either locally or at the edge server) and system resource allocation decisions in sequential time frames. To address the coupling in the decisions of different time frames, we propose a novel framework, named LyDROO, that combines the advantages of Lyapunov optimization and deep reinforcement learning (DRL). Specifically, LyDROO first applies Lyapunov optimization to decouple the multi-stage stochastic MINLP into deterministic per-frame MINLP subproblems of much smaller size. Then, it integrates model-based optimization and model-free DRL to solve the per-frame MINLP problems with very low computational complexity. Simulation results show that the proposed LyDROO achieves optimal computation performance while satisfying all the long-term constraints. Besides, it induces very low execution latency that is particularly suitable for real-time implementation in fast fading environments.

Preprint, 2021, arXiv

Visualizing Deep Learning-based Radio Modulation Classifier

Deep learning has recently been successfully applied in automatic modulation classification by extracting and classifying radio features in an end-to-end way. However, deep learning-based radio modulation classifiers lack interpretability, and there is little explanation or visibility into what kinds of radio features are extracted and chosen for classification. In this paper, we visualize different deep learning-based radio modulation classifiers by introducing a class activation vector. Specifically, both a convolutional neural network (CNN) based classifier and a long short-term memory (LSTM) based classifier are studied separately, and their extracted radio features are visualized. Extensive numerical results show that both the CNN-based classifier and the LSTM-based classifier extract similar radio features relating to modulation reference points. In particular, for the LSTM-based classifier, the radio features obtained are similar to the knowledge of human experts. Our numerical results indicate that the radio features extracted by deep learning-based classifiers greatly depend on the contents carried by radio signals, and a short radio sample may lead to misclassification.

Preprint, 2020, arXiv

Deep Reinforcement Learning for Online Computation Offloading in Wireless Powered Mobile-Edge Computing Networks

Wireless powered mobile-edge computing (MEC) has recently emerged as a promising paradigm to enhance the data processing capability of low-power networks, such as wireless sensor networks and internet of things (IoT). In this paper, we consider a wireless powered MEC network that adopts a binary offloading policy, so that each computation task of wireless devices (WDs) is either executed locally or fully offloaded to an MEC server. Our goal is to acquire an online algorithm that optimally adapts task offloading decisions and wireless resource allocations to the time-varying wireless channel conditions. This requires quickly solving hard combinatorial optimization problems within the channel coherence time, which is hardly achievable with conventional numerical optimization methods. To tackle this problem, we propose a Deep Reinforcement learning-based Online Offloading (DROO) framework that implements a deep neural network as a scalable solution that learns the binary offloading decisions from experience. It eliminates the need to solve combinatorial optimization problems, and thus greatly reduces the computational complexity, especially in large-size networks. To further reduce the complexity, we propose an adaptive procedure that automatically adjusts the parameters of the DROO algorithm on the fly. Numerical results show that the proposed algorithm can achieve near-optimal performance while significantly decreasing the computation time by more than an order of magnitude compared with existing optimization methods. For example, the CPU execution latency of DROO is less than $0.1$ second in a $30$-user network, making real-time and optimal offloading truly viable even in a fast fading environment.

Preprint, 2020, arXiv

Joint Optimization of Service Caching Placement and Computation Offloading in Mobile Edge Computing Systems

In mobile edge computing (MEC) systems, edge service caching refers to pre-storing the necessary programs for executing computation tasks at MEC servers. At resource-constrained edge servers, service caching placement is in general a complicated problem that highly correlates to the offloading decisions of computation tasks. In this paper, we consider a single edge server that assists a mobile user (MU) in executing a sequence of computation tasks. In particular, the MU can run its customized programs at the edge server, while the server can selectively cache the previously generated programs for future service reuse. To minimize the computation delay and energy consumption of the MU, we formulate a mixed integer non-linear programming (MINLP) problem that jointly optimizes the service caching placement, computation offloading, and system resource allocation. We first derive the closed-form expressions of the optimal resource allocation, and subsequently transform the MINLP into an equivalent pure 0-1 integer linear programming (ILP) problem. To further reduce the complexity in solving the ILP, we exploit the underlying structures in optimal solutions, and devise a reduced-complexity alternating minimization technique to update the caching placement and offloading decision alternately. Simulations show that the proposed techniques achieve substantial resource savings compared to other representative benchmark methods.

Preprint, 2020, arXiv

NEJM-enzh: A Parallel Corpus for English-Chinese Translation in the Biomedical Domain

Machine translation requires large amounts of parallel text. While such datasets are abundant in domains such as newswire, they are less accessible in the biomedical domain. Chinese and English are two of the most widely spoken languages, yet to our knowledge a parallel corpus in the biomedical domain does not exist for this language pair. In this study, we develop an effective pipeline to acquire and process an English-Chinese parallel corpus, consisting of about 100,000 sentence pairs and 3,000,000 tokens on each side, from the New England Journal of Medicine (NEJM). We show that training on out-of-domain data and fine-tuning with as few as 4,000 NEJM sentence pairs improve translation quality by 25.3 (13.4) BLEU for en$\to$zh (zh$\to$en) directions. Translation quality continues to improve at a slower pace on larger in-domain datasets, with an increase of 33.0 (24.3) BLEU for en$\to$zh (zh$\to$en) directions on the full dataset.

Preprint, 2020, arXiv

Opportunistic Decoding with Timely Correction for Simultaneous Translation

Simultaneous translation has many important application scenarios and has recently attracted much attention from both academia and industry. Most existing frameworks, however, have difficulties in balancing translation quality and latency, i.e., the decoding policy is usually either too aggressive or too conservative. We propose an opportunistic decoding technique with timely correction ability, which always (over-)generates a certain amount of extra words at each step to keep the audience on track with the latest information. At the same time, it also corrects, in a timely fashion, the mistakes in the former overgenerated words when observing more source context to ensure high translation quality. Experiments show that our technique achieves a substantial reduction in latency and up to a +3.1 increase in BLEU, with a revision rate under 8% in Chinese-to-English and English-to-Chinese translation.

Preprint, 2020, arXiv

Optimal approximations of available states and a triple uncertainty relation

We investigate the optimal convex approximation of the quantum state with respect to a set of available states. By isometric transformation, we present the general mathematical model and its solutions together with a triple uncertainty equality relation. Meanwhile, we show a concise inequality criterion for decomposing qubit mixed states. The new results include previous ones as special cases. Our model and method may be applied to solve similar problems in high-dimensional and multipartite scenarios.

Preprint, 2020, arXiv

Quantum signatures of transitions from stable fixed points to limit cycles in optomechanical systems

Optomechanical systems, owing to their inherent nonlinear optomechanical coupling, exhibit rich nonlinear dynamics with different types of motion. An interesting question is whether there exist common quantum features from which the nonlinear dynamical transitions from one type to another can be inferred. In this paper, we study the quantum signatures of transitions from stable fixed points to limit cycles in an optomechanical phonon laser system. Our calculations show that the entanglement for stable fixed points does not change with time in the long run, whereas for limit cycles it oscillates periodically with time at the mechanical vibration frequency. Most strikingly, the entanglement quite close to the boundary line remains constant and is very robust to thermal phonon noise, a strong indication of this particular classical transition.

Preprint, 2020, arXiv

Simultaneous Translation Policies: From Fixed to Adaptive

Adaptive policies are better than fixed policies for simultaneous translation, since they can flexibly balance the tradeoff between translation quality and latency based on the current context information. However, previous methods for obtaining adaptive policies either rely on a complicated training process or underperform simple fixed policies. We design an algorithm to achieve adaptive policies via a simple heuristic composition of a set of fixed policies. Experiments on Chinese-to-English and German-to-English show that our adaptive policies can outperform fixed ones by up to 4 BLEU points for the same latency, and more surprisingly, they even surpass the BLEU score of full-sentence translation in the greedy mode (and come very close to beam mode), but with much lower latency.

Preprint, 2020, arXiv

ThreshKnot: Thresholded ProbKnot for Improved RNA Secondary Structure Prediction

RNA structure prediction is a challenging problem, especially with pseudoknots. Recently, there has been a shift from the classical minimum free energy-based methods (MFE) to partition function-based ones that assemble structures using base-pairing probabilities. Two examples of the latter group are the popular maximum expected accuracy (MEA) method and the ProbKnot method. ProbKnot is a fast heuristic that pairs nucleotides that are reciprocally most probable pairing partners, and unlike MEA, can also predict structures with pseudoknots. However, ProbKnot's full potential has been largely overlooked. In particular, when introduced, it did not have an MEA-like hyperparameter that can balance between positive predictive value (PPV) and sensitivity. We show that a simple thresholded version of ProbKnot, which we call ThreshKnot, leads to more accurate overall predictions by filtering out unlikely pairs whose probabilities fall under a given threshold. We also show that on three widely-used folding engines (RNAstructure, Vienna RNAfold, and CONTRAfold), ThreshKnot always outperforms the much more involved MEA algorithm in (1) its higher structure prediction accuracy, (2) its capability to predict pseudoknots, and (3) its faster runtime and easier implementation. This suggests that ThreshKnot should replace MEA as the default partition function-based structure prediction algorithm. ThreshKnot is already available in the widely used RNAstructure software package version 6.2 (released November 27, 2019): https://rna.urmc.rochester.edu/RNAstructure.html
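The selection rule described in this abstract (pair nucleotides that are each other's most probable partner, then drop pairs whose probability falls under a threshold) can be sketched in a few lines. This is a single-pass illustration only, not the RNAstructure implementation: the function name, the default threshold of 0.3 and the toy probability matrix are assumptions, and a real base-pairing probability matrix would come from a folding engine's partition function calculation.

```python
from typing import List, Tuple


def threshknot_pairs(probs: List[List[float]],
                     threshold: float = 0.3) -> List[Tuple[int, int]]:
    """Return pairs (i, j), i < j, where i and j are mutually most probable
    partners and their pairing probability is at least `threshold`.
    `probs` is a symmetric matrix of base-pairing probabilities."""
    n = len(probs)
    if n < 2:
        return []
    # best[i] is the index of the most probable pairing partner of i.
    best = []
    for i in range(n):
        candidates = [j for j in range(n) if j != i]
        best.append(max(candidates, key=lambda j: probs[i][j]))
    pairs = []
    for i in range(n):
        j = best[i]
        # Keep the pair only if the preference is reciprocal and likely enough.
        if i < j and best[j] == i and probs[i][j] >= threshold:
            pairs.append((i, j))
    return pairs


if __name__ == "__main__":
    # Hypothetical 4-nucleotide matrix: 0-3 pair strongly, 1-2 pair weakly.
    toy = [[0.0, 0.0, 0.0, 0.9],
           [0.0, 0.0, 0.2, 0.0],
           [0.0, 0.2, 0.0, 0.0],
           [0.9, 0.0, 0.0, 0.0]]
    print(threshknot_pairs(toy))
    print(threshknot_pairs(toy, threshold=0.0))
```

Nothing in the rule forces the selected pairs to be nested, so crossing pairs (pseudoknots) can appear in the output, and raising the threshold trades sensitivity for positive predictive value.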

Preprint, 2019, arXiv

LinearFold: linear-time approximate RNA folding by 5'-to-3' dynamic programming and beam search

Motivation: Predicting the secondary structure of an RNA sequence is useful in many applications. Existing algorithms (based on dynamic programming) suffer from a major limitation: their runtimes scale cubically with the RNA length, and this slowness limits their use in genome-wide applications. Results: We present a novel alternative $O(n^3)$-time dynamic programming algorithm for RNA folding that is amenable to heuristics that make it run in $O(n)$ time and $O(n)$ space, while producing a high-quality approximation to the optimal solution. Inspired by incremental parsing for context-free grammars in computational linguistics, our alternative dynamic programming algorithm scans the sequence in a left-to-right (5'-to-3') direction rather than in a bottom-up fashion, which allows us to employ the effective beam pruning heuristic. Our work, though inexact, is the first RNA folding algorithm to achieve linear runtime (and linear space) without imposing constraints on the output structure. Surprisingly, our approximate search results in even higher overall accuracy on a diverse database of sequences with known structures. More interestingly, it leads to significantly more accurate predictions on the longest sequence families in that database (16S and 23S Ribosomal RNAs), as well as improved accuracies for long-range base pairs (500+ nucleotides apart), both of which are well known to be challenging for the current models. Availability: Our source code is available at https://github.com/LinearFold/LinearFold, and our webserver is at http://linearfold.org (sequence limit: 100,000nt).

Preprint, 2018, arXiv

Transport signatures of relativistic quantum scars in a graphene cavity

We use quantum transport measurements and theoretical calculations to study a relativistic quantum cavity system realized by etching a cavity out of a graphene sheet. The conductance of the graphene cavity has been measured as a function of the back gate voltage (or the Fermi energy) and the magnetic field applied perpendicular to the graphene sheet, and characteristic conductance contour patterns are observed in the measurements. In particular, two types of high conductance contour lines, i.e., straight and parabolic-like high conductance contour lines, are found in the measurements. The theoretical calculations are performed within the framework of the tight-binding approach and the Green's function formalism. Characteristic high conductance contour features similar to those in the experiments are found in the calculations. The wave functions calculated at points selected along a straight conductance contour line are found to be dominated by a chain of scars of high probability distributions arranged as a necklace following the shape of the cavity, and the current density distributions calculated at these points are dominated by an overall vortex in the cavity. These characteristics are found to be insensitive to increasing magnetic field. However, the wave function probability distributions and the current density distributions calculated at points selected along a parabolic-like contour line show a clear dependence on increasing magnetic field, and the current density distributions at these points are characterized by the complex formation of several localized vortices in the cavity. Our work brings new insight into quantum chaos in relativistic particle systems and should greatly stimulate experimental and theoretical efforts in this still emerging field.

Preprint, 2013, arXiv

Controlling collective dynamics in complex, minority-game resource-allocation systems

Resource allocation takes place in various kinds of real-world complex systems, such as traffic systems, social services institutions or organizations, or even ecosystems. The fundamental principle underlying complex resource-allocation dynamics is Boolean interactions associated with minority games, as resources are generally limited and agents tend to choose the least used resource based on available information. A common but harmful dynamical behavior in resource-allocation systems is herding, where there are time intervals during which a large majority of the agents compete for a few resources, leaving many other resources unused. Accompanying the herd behavior are thus strong fluctuations with time in the number of resources being used. In this paper, we articulate and establish that an intuitive control strategy, namely pinning control, is effective at harnessing the herding dynamics. In particular, by fixing the choices of resources for a few agents while leaving the majority of the agents free, herding can be eliminated completely. Our investigation is systematic in that we consider random and targeted pinning and a variety of network topologies, and we carry out a comprehensive analysis in the framework of mean-field theory to understand the working of control. The basic philosophy is then that, when a few agents waive their freedom to choose resources by receiving sufficient incentives, the majority of the agents benefit in that they will make fair, efficient, and effective use of the available resources. Our work represents a basic and general framework to address the fundamental issue of fluctuations in complex dynamical systems with significant applications to social, economical and political systems.

Preprint, 2013, arXiv

Hidden symmetry and Collective behavior

We study the relationship between the partially synchronous state and the coupling structure in general dynamical systems. Our results show that, contrary to the widely accepted concept, topological symmetry in a coupling structure is a sufficient condition but not a necessary condition. Furthermore, we find the necessary and sufficient condition for the existence of partial synchronization and develop a method to obtain all of the existing partially synchronous solutions for all nonspecific dynamics from a very large number of possible candidates.

Preprint, 2012, arXiv

Emergence of grouping in multi-resource minority game dynamics

The Minority Game (MG) has become a paradigm to probe complex social and economical phenomena where adaptive agents compete for a limited resource, and it finds applications in statistical and nonlinear physics as well. In the traditional MG model, agents are assumed to have access to global information about the past history of the underlying system, and they react by choosing one of the two available options associated with a single resource. Complex systems arising in a modern society, however, can possess many resources so that the number of available strategies/resources can be multiple. We propose a class of models to investigate MG dynamics with multiple strategies. In particular, in such a system, at any time an agent can either choose a minority strategy (say with probability p) based on available local information or simply choose a strategy randomly (with probability 1 - p). The parameter p thus defines the minority-preference probability, which is key to the dynamics of the underlying system. A striking finding is the emergence of strategy-grouping states where a particular number of agents choose a particular subset of strategies. We develop an analytic theory based on the mean-field framework to understand the "bifurcation" to the grouping states and their evolution. The grouping phenomenon has also been revealed in a real-world example of the subsystem of 27 stocks in the Shanghai Stock Market's Steel Plate. Our work demonstrates that complex systems following the MG rules can spontaneously self-organize into certain divided states, and our model represents a basic mathematical framework to address this kind of phenomenon in social, economical, and even political systems.
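The update rule in this abstract (with probability p an agent picks the minority strategy it sees locally, otherwise it picks at random) can be illustrated with a toy simulation. This is a loose sketch under stated assumptions, not the paper's model: the abstract does not define "local information", so here each agent samples a few other agents' previous choices (the hypothetical `sample_size` parameter), and all names and default values are chosen for illustration.

```python
import random
from collections import Counter
from typing import List


def simulate_minority_game(n_agents: int = 100, n_strategies: int = 3,
                           p: float = 0.5, sample_size: int = 5,
                           steps: int = 50, seed: int = 0) -> List[List[int]]:
    """Return, for each time step, the number of agents on each strategy."""
    rng = random.Random(seed)
    choices = [rng.randrange(n_strategies) for _ in range(n_agents)]
    history = []
    for _ in range(steps):
        new_choices = []
        for _agent in range(n_agents):
            if rng.random() < p:
                # Assumed local information: previous choices of a few
                # randomly sampled agents.
                observed = Counter(rng.choice(choices)
                                   for _ in range(sample_size))
                # Pick the least used strategy among those observed; unseen
                # strategies count as zero and ties go to the lowest index.
                new_choices.append(min(range(n_strategies),
                                       key=lambda s: observed[s]))
            else:
                new_choices.append(rng.randrange(n_strategies))
        choices = new_choices
        counts = Counter(choices)
        history.append([counts[s] for s in range(n_strategies)])
    return history


if __name__ == "__main__":
    for row in simulate_minority_game(steps=5):
        print(row)
```

Sweeping p between 0 and 1 and plotting the per-strategy counts over time is one way to look for the fluctuations and grouping behavior the abstract describes, although this simplified rule is not guaranteed to reproduce the paper's results.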