Source author record

Che-Wei Huang

Che-Wei Huang appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

eess.AS, Computation and Language, Machine Learning, Sound, cond-mat.mes-hall, eess.SP, quant-ph

Catalog footprint

What is connected

4 works

7 topics

4 close collaborators


Preprint, 2022, arXiv

Incremental user embedding modeling for personalized text classification

Individual user profiles and interaction histories play a significant role in providing customized experiences in real-world applications such as chatbots, social media, retail, and education. Adaptive user representation learning by utilizing user personalized information has become increasingly challenging due to ever-growing history data. In this work, we propose an incremental user embedding modeling approach, in which embeddings of a user's recent interaction histories are dynamically integrated into the accumulated history vectors via a transformer encoder. This modeling paradigm allows us to create generalized user representations in a consecutive manner and also alleviates the challenges of data management. We demonstrate the effectiveness of this approach by applying it to a personalized multi-class classification task based on the Reddit dataset, and achieve 9% and 30% relative improvements in prediction accuracy over a baseline system for two experiment settings through appropriate comment history encoding and task modeling.

Preprint, 2020, arXiv

Efficient minimum word error rate training of RNN-Transducer for end-to-end speech recognition

In this work, we propose a novel and efficient minimum word error rate (MWER) training method for RNN-Transducer (RNN-T). Unlike previous work on this topic, which performs on-the-fly limited-size beam-search decoding and generates alignment scores for expected edit-distance computation, in our proposed method, we re-calculate and sum scores of all the possible alignments for each hypothesis in N-best lists. The hypothesis probability scores and back-propagated gradients are calculated efficiently using the forward-backward algorithm. Moreover, the proposed method allows us to decouple the decoding and training processes, and thus we can perform offline parallel-decoding and MWER training for each subset iteratively. Experimental results show that this proposed semi-on-the-fly method can speed up the on-the-fly method by 6 times and result in a similar WER improvement (3.6%) over a baseline RNN-T model. The proposed MWER training can also effectively reduce high-deletion errors (9.2% WER-reduction) introduced by RNN-T models when EOS is added for the endpointer. Further improvement can be achieved if we use a proposed RNN-T rescoring method to re-rank hypotheses and use an external RNN-LM to perform additional rescoring. The best system achieves a 5% relative improvement on an English test-set of real far-field recordings and an 11.6% WER reduction on music-domain utterances.
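As a rough illustration of the expected edit-distance objective mentioned in this abstract, the sketch below computes an MWER-style loss over an N-best list. It assumes the commonly used formulation in which hypothesis probabilities are renormalized over the N-best list and the mean error count is subtracted as a baseline; the abstract does not spell out the exact loss, so this is an assumption. The function names `word_errors` and `mwer_loss` are hypothetical, and the hypothesis log-probabilities are taken as given inputs rather than obtained by summing over RNN-T alignments with the forward-backward algorithm.

```python
import math
from typing import List, Sequence


def word_errors(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Word-level Levenshtein distance between a reference and a hypothesis."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def mwer_loss(log_probs: List[float],
              hyps: List[Sequence[str]],
              ref: Sequence[str]) -> float:
    """Expected word errors over an N-best list, relative to the mean error.

    Assumption: log_probs[i] is the total log-probability of hyps[i]
    (in the paper's setting, summed over all alignments); here it is
    simply supplied by the caller.
    """
    # Renormalize the hypothesis probabilities over the N-best list.
    m = max(log_probs)
    weights = [math.exp(lp - m) for lp in log_probs]
    total = sum(weights)
    posteriors = [w / total for w in weights]

    errors = [word_errors(ref, h) for h in hyps]
    mean_err = sum(errors) / len(errors)  # baseline for variance reduction
    return sum(p * (e - mean_err) for p, e in zip(posteriors, errors))
```

Lowering this loss shifts probability mass toward hypotheses with fewer word errors than the N-best average. Because the inputs are only a list of hypotheses and their scores, the N-best lists can be decoded offline and the loss computed afterwards, which is consistent with the decoupling of decoding and training described in the abstract.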

Preprint, 2020, arXiv

Streaming ResLSTM with Causal Mean Aggregation for Device-Directed Utterance Detection

In this paper, we propose a streaming model to distinguish voice queries intended for a smart-home device from background speech. The proposed model consists of multiple CNN layers with residual connections, followed by a stacked LSTM architecture. The streaming capability is achieved by using unidirectional LSTM layers and a causal mean aggregation layer to form the final utterance-level prediction up to the current frame. In order to avoid redundant computation during online streaming inference, we use a caching mechanism for every convolution operation. Experimental results on a device-directed vs. non device-directed task show that the proposed model yields an equal error rate reduction of 41% compared to our previous best model on this task. Furthermore, we show that the proposed model is able to accurately predict earlier in time compared to the attention-based models.
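The causal mean aggregation described in this abstract can be sketched as a running average: the output at frame t is the mean of all frame-level vectors up to and including frame t, so no future frames are needed. The following is a minimal sketch under that reading; the function name `causal_mean` and the use of plain Python lists for frame vectors are assumptions for illustration, not the authors' implementation.

```python
from typing import Iterable, Iterator, List


def causal_mean(frames: Iterable[List[float]]) -> Iterator[List[float]]:
    """Yield, for each incoming frame, the mean of all frames seen so far."""
    running_sum: List[float] = []
    count = 0
    for frame in frames:
        if not running_sum:
            running_sum = [0.0] * len(frame)
        # Update the running sum in O(dim) per frame; nothing is recomputed.
        running_sum = [s + x for s, x in zip(running_sum, frame)]
        count += 1
        yield [s / count for s in running_sum]


# Usage: three one-dimensional frame outputs arriving in a stream.
for t, pooled in enumerate(causal_mean([[1.0], [3.0], [5.0]]), start=1):
    print(t, pooled)
```

Keeping only a running sum and a frame count means an utterance-level prediction is available at every frame with constant memory, which fits the streaming setting the abstract describes.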

Preprint, 2012, arXiv

A quantum effect in the classical limit: nonequilibrium tunneling in the Duffing oscillator

For suitable parameters, the classical Duffing oscillator has a known bistability in its stationary states, with low- and high-amplitude branches. As expected from the analogy with a particle in a double-well potential, transitions between these states become possible either at finite temperature, or in the quantum regime due to tunneling. In this analogy, besides local stability, one can also discuss global stability by comparing the two potential minima. For the Duffing oscillator, the stationary states emerge dynamically so that a priori, a potential-minimum criterion for them does not exist. However, global stability is still relevant, and definable as the state containing the majority population for long times, low temperature, and close to the classical limit. Further, the crossover point is the parameter value at which global stability abruptly changes from one state to the other. For the double-well model, the crossover point is defined by potential-minimum degeneracy. Given that this analogy is so effective in other respects, it is thus striking that for the Duffing oscillator, the crossover point turns out to be non-unique. Rather, none of the three aforementioned limits commute with each other, and the limiting behaviour depends on the order in which they are taken. More generally, as both $\hbar\to 0$ and $T\to 0$, the ratio $\hbar\omega_0/k_\mathrm{B}T$ continues to be a key parameter and can have any nonnegative value. This points to an apparent conceptual difference between equilibrium and nonequilibrium tunneling. We present numerical evidence by studying the pertinent quantum master equation in the photon-number basis. Independent verification and some further understanding are obtained using a semi-analytical approach in the coherent-state representation.