Researcher profile

Yuhang Jia

Yuhang Jia contributes to research discovery and scholarly infrastructure.

Researcher
Affiliation not imported
Open to collaborate

Trust snapshot

Quick read

Trust 17 - Unverified
Verification L1
Unclaimed author
4 works
0 followers
5 topics
4 close collaborators

Published work

4 published items

Preprint, 2026, arXiv

CosyEdit: Unlocking End-to-End Speech Editing Capability from Zero-Shot Text-to-Speech Models

Automatic speech editing aims to modify spoken content based on textual instructions, yet traditional cascade systems suffer from complex preprocessing pipelines and a reliance on explicit external temporal alignment. Addressing these limitations, we propose CosyEdit, an end-to-end speech editing model adapted from CosyVoice through task-specific fine-tuning and an optimized inference procedure, which internalizes speech-text alignment while ensuring high consistency between the speech before and after editing. By fine-tuning on only 250 hours of supervised data from our curated GigaEdit dataset, our 400M-parameter model achieves reliable speech editing performance. Experiments on the RealEdit benchmark indicate that CosyEdit not only outperforms several billion-parameter language model baselines but also matches the performance of state-of-the-art cascade approaches. These results demonstrate that, with task-specific fine-tuning and inference optimization, robust and efficient speech editing capabilities can be unlocked from a zero-shot TTS model, yielding a novel and cost-effective end-to-end solution for high-quality speech editing.

Preprint, 2022, arXiv

Low-complexity Robust Optimization for an IRS-assisted Multi-Cell Network

The impacts of channel estimation errors, inter-cell interference, phase adjustment cost, and computation cost on an intelligent reflecting surface (IRS)-assisted system are severe in practice but have been ignored for simplicity in most existing works. In this paper, we investigate a multi-antenna base station (BS) serving a single-antenna user with the help of a multi-element IRS in the presence of channel estimation errors and inter-cell interference. We consider imperfect channel state information (CSI) at the BS, i.e., imperfect CSIT, and focus on the robust optimization of the BS's instantaneous CSI-adaptive beamforming and the IRS's quasi-static phase shifts. First, we formulate this robust optimization for ergodic rate maximization as a very challenging two-timescale stochastic non-convex problem. Then, we obtain a closed-form beamformer for any given phase shifts and a more tractable single-timescale stochastic non-convex problem for the phase shifts only. Next, we propose a low-complexity stochastic algorithm to obtain quasi-static phase shifts that correspond to a KKT point of the single-timescale stochastic problem. It is worth noting that the proposed method offers a closed-form robust instantaneous CSI-adaptive beamforming design that can promptly adapt to rapid CSI changes over slots and a robust quasi-static phase shift design with low computation and phase adjustment costs in the presence of channel estimation errors and inter-cell interference. Finally, numerical results demonstrate the notable gains of the proposed robust joint design over existing ones and reveal the practical value of the proposed solutions.

Preprint, 2021, arXiv

Device Activity Detection for Massive Grant-Free Access Under Frequency-Selective Rayleigh Fading

Device activity detection and channel estimation for massive grant-free access under frequency-selective fading have remained an outstanding problem. This paper aims to address the challenge. Specifically, we present an orthogonal frequency division multiplexing (OFDM)-based massive grant-free access scheme for a wideband system with one M-antenna base station (BS), N single-antenna Internet of Things (IoT) devices, and P channel taps. We obtain two different but equivalent models for the received pilot signals under frequency-selective Rayleigh fading. Based on each model, we formulate device activity detection as a non-convex maximum likelihood estimation (MLE) problem and propose an iterative algorithm to obtain a stationary point using optimal techniques. The two proposed MLE-based methods have the same computational complexity order O(NPL^2), irrespective of M, and reduce to the existing MLE-based device activity detection method when P=1. Conventional channel estimation methods can be readily applied for channel estimation of detected active devices under frequency-selective Rayleigh fading, based on one of the derived models for the received pilot signals. Numerical results show that the two proposed methods have different preferable system parameters and complement each other to offer a promising device activity detection design for grant-free massive access under frequency-selective Rayleigh fading.

Preprint, 2021, arXiv

Statistical Device Activity Detection for OFDM-based Massive Grant-Free Access

Existing works on grant-free access, proposed to support massive machine-type communication (mMTC) for the Internet of Things (IoT), mainly concentrate on narrowband systems under flat fading. However, little is known about massive grant-free access for wideband systems under frequency-selective fading. This paper investigates massive grant-free access in a wideband system under frequency-selective fading. First, we present an orthogonal frequency division multiplexing (OFDM)-based massive grant-free access scheme. Then, we propose two different but equivalent models for the received pilot signal, which are essential for designing various device activity detection and channel estimation methods for OFDM-based massive grant-free access. One directly models the received signal for actual devices, whereas the other can be interpreted as a signal model for virtual devices. Next, we investigate statistical device activity detection under frequency-selective Rayleigh fading based on the two signal models. We first model device activities as unknown deterministic quantities and propose three maximum likelihood (ML) estimation-based device activity detection methods with different detection accuracies and computation times. We also model device activities as random variables with a known joint distribution and propose three maximum a posteriori probability (MAP) estimation-based device activity detection methods, which further enhance the accuracies of the corresponding ML estimation-based methods. Optimization techniques and matrix analysis are applied in designing and analyzing these methods. Finally, numerical results show that the proposed statistical device activity detection methods outperform existing state-of-the-art device activity detection methods under frequency-selective Rayleigh fading.