Source author record

Xiaodong Zeng

Xiaodong Zeng appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Machine Learning Artificial Intelligence eess.AS physics.optics quant-ph Sound

Catalog footprint

What is connected

5works

6topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2026arXiv

Unifying Speech Recognition, Synthesis and Conversion with Autoregressive Transformers

Traditional speech systems typically rely on separate, task-specific models for text-to-speech (TTS), automatic speech recognition (ASR), and voice conversion (VC), resulting in fragmented pipelines that limit scalability, efficiency, and cross-task generalization. In this paper, we present General-Purpose Audio (GPA), a unified audio foundation model that integrates multiple core speech tasks within a single large language model (LLM) architecture. GPA operates on a shared discrete audio token space and supports instruction-driven task induction, enabling a single autoregressive model to flexibly perform TTS, ASR, and VC without architectural modifications. This unified design combines a fully autoregressive formulation over discrete speech tokens, joint multi-task training across speech domains, and a scalable inference pipeline that achieves high concurrency and throughput. The resulting model family supports efficient multi-scale deployment, including a lightweight 0.3B-parameter variant optimized for edge and resource-constrained environments. Together, these design choices demonstrate that a unified autoregressive architecture can achieve competitive performance across diverse speech tasks while remaining viable for low-latency, practical deployment.

preprint2022arXiv

A Policy Efficient Reduction Approach to Convex Constrained Deep Reinforcement Learning

Although well-established in general reinforcement learning (RL), value-based methods are rarely explored in constrained RL (CRL) for their incapability of finding policies that can randomize among multiple actions. To apply value-based methods to CRL, a recent groundbreaking line of game-theoretic approaches uses the mixed policy that randomizes among a set of carefully generated policies to converge to the desired constraint-satisfying policy. However, these approaches require storing a large set of policies, which is not policy efficient, and may incur prohibitive memory costs in constrained deep RL. To address this problem, we propose an alternative approach. Our approach first reformulates the CRL to an equivalent distance optimization problem. With a specially designed linear optimization oracle, we derive a meta-algorithm that solves it using any off-the-shelf RL algorithm and any conditional gradient (CG) type algorithm as subroutines. We then propose a new variant of the CG-type algorithm, which generalizes the minimum norm point (MNP) method. The proposed method matches the convergence rate of the existing game-theoretic approaches and achieves the worst-case optimal policy efficiency. The experiments on a navigation task show that our method reduces the memory costs by an order of magnitude, and meanwhile achieves better performance, demonstrating both its effectiveness and efficiency.

preprint2021arXiv

Adversarial Learning for Incentive Optimization in Mobile Payment Marketing

Many payment platforms hold large-scale marketing campaigns, which allocate incentives to encourage users to pay through their applications. To maximize the return on investment, incentive allocations are commonly solved in a two-stage procedure. After training a response estimation model to estimate the users' mobile payment probabilities (MPP), a linear programming process is applied to obtain the optimal incentive allocation. However, the large amount of biased data in the training set, generated by the previous biased allocation policy, causes a biased estimation. This bias deteriorates the performance of the response model and misleads the linear programming process, dramatically degrading the performance of the resulting allocation policy. To overcome this obstacle, we propose a bias correction adversarial network. Our method leverages the small set of unbiased data obtained under a full-randomized allocation policy to train an unbiased model and then uses it to reduce the bias with adversarial learning. Offline and online experimental results demonstrate that our method outperforms state-of-the-art approaches and significantly improves the performance of the resulting allocation policy in a real-world marketing campaign.

preprint2021arXiv

Plasmonic band and defect mode of one dimensional graphene lattice

Photonic crystals based on graphene plasmons (GPs) are highly tunable and can accurately control photonic transmission at the nanoscales. In this work, the transfer matrix method (TMM) is introduced to study graphene plasmonic crystal (GPC) with periodic surface conductivity in the case of normal incidence. The introduction of TMM after considering the abnormal phase scattering of the abrupt interface gives an idea to accurately manipulate plasmonic crystal structures, and can reduce the calculation workload to a certain extent. The effectiveness of the proposed method is verified with the plane wave expansion method in our model. Furthermore, we study the defect mode and the plasmonic Tamm state in GPC by the transfer matrix method.

preprint2015arXiv

Single Photon Transport through an Atomic Chain Coupled to a One-dimensional Nanophotonic Waveguide

We study the dynamics of a single photon pulse travels through a linear atomic chain coupled to a one-dimensional (1D) single mode photonic waveguide. We derive a time-dependent dynamical theory for this collective many-body system which allows us to study the real time evolution of the photon transport and the atomic excitations. Our analytical result is consistent with previous numerical calculations when there is only one atom. For an atomic chain, the collective interaction between the atoms mediated by the waveguide mode can significantly change the dynamics of the system. The reflectivity of a photon can be tuned by changing the ratio of coupling strength and the photon linewidth or by changing the number of atoms in the chain. The reflectivity of a single photon pulse with finite bandwidth can even approach $100\%$. The spectrum of the reflected and transmitted photon can also be significantly different from the single atom case. Many interesting physical phenomena can occur in this system such as the photonic bandgap effects, quantum entanglement generation, Fano-like interference, and superradiant effects. For engineering, this system may serve as a single photon frequency filter, single photon modulation and may find important applications in quantum information.