Source author record

Yuhang Jia

Yuhang Jia appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Information Theory math.IT eess.AS eess.SP Sound

Catalog footprint

What is connected

4 works

5 topics

4 close collaborators

preprint · 2026 · arXiv

CosyEdit: Unlocking End-to-End Speech Editing Capability from Zero-Shot Text-to-Speech Models

Automatic speech editing aims to modify spoken content based on textual instructions, yet traditional cascade systems suffer from complex preprocessing pipelines and a reliance on explicit external temporal alignment. Addressing these limitations, we propose CosyEdit, an end-to-end speech editing model adapted from CosyVoice through task-specific fine-tuning and an optimized inference procedure, which internalizes speech-text alignment while ensuring high consistency between the speech before and after editing. By fine-tuning on only 250 hours of supervised data from our curated GigaEdit dataset, our 400M-parameter model achieves reliable speech editing performance. Experiments on the RealEdit benchmark indicate that CosyEdit not only outperforms several billion-parameter language model baselines but also matches the performance of state-of-the-art cascade approaches. These results demonstrate that, with task-specific fine-tuning and inference optimization, robust and efficient speech editing capabilities can be unlocked from a zero-shot TTS model, yielding a novel and cost-effective end-to-end solution for high-quality speech editing.

preprint · 2022 · arXiv

Low-complexity Robust Optimization for an IRS-assisted Multi-Cell Network

The impacts of channel estimation errors, inter-cell interference, phase adjustment cost, and computation cost on an intelligent reflecting surface (IRS)-assisted system are severe in practice but have been ignored for simplicity in most existing works. In this paper, we investigate a multi-antenna base station (BS) serving a single-antenna user with the help of a multi-element IRS in the presence of channel estimation errors and inter-cell interference. We consider imperfect channel state information (CSI) at the BS, i.e., imperfect CSIT, and focus on the robust optimization of the BS's instantaneous CSI-adaptive beamforming and the IRS's quasi-static phase shifts. First, we formulate this robust optimization for ergodic rate maximization as a very challenging two-timescale stochastic non-convex problem. Then, we obtain a closed-form beamformer for any given phase shifts and a more tractable single-timescale stochastic non-convex problem in the phase shifts alone. Next, we propose a low-complexity stochastic algorithm to obtain quasi-static phase shifts that correspond to a KKT point of the single-timescale stochastic problem. It is worth noting that the proposed method offers a closed-form robust instantaneous CSI-adaptive beamforming design that can promptly adapt to rapid CSI changes over slots and a robust quasi-static phase shift design with low computation and phase adjustment costs in the presence of channel estimation errors and inter-cell interference. Finally, numerical results demonstrate the notable gains of the proposed robust joint design over existing ones and reveal the practical value of the proposed solutions.

preprint · 2021 · arXiv

Device Activity Detection for Massive Grant-Free Access Under Frequency-Selective Rayleigh Fading

Device activity detection and channel estimation for massive grant-free access under frequency-selective fading remain an outstanding problem. This paper aims to address the challenge. Specifically, we present an orthogonal frequency division multiplexing (OFDM)-based massive grant-free access scheme for a wideband system with one M-antenna base station (BS), N single-antenna Internet of Things (IoT) devices, and P channel taps. We obtain two different but equivalent models for the received pilot signals under frequency-selective Rayleigh fading. Based on each model, we formulate device activity detection as a non-convex maximum likelihood estimation (MLE) problem and propose an iterative algorithm to obtain a stationary point using optimal techniques. The two proposed MLE-based methods have the same computational complexity order O(NPL^2), irrespective of M, and reduce to the existing MLE-based device activity detection method when P=1. Conventional channel estimation methods can be readily applied to the detected active devices under frequency-selective Rayleigh fading, based on one of the derived models for the received pilot signals. Numerical results show that the two proposed methods have different preferable system parameters and complement each other to offer a promising device activity detection design for grant-free massive access under frequency-selective Rayleigh fading.

preprint · 2021 · arXiv

Statistical Device Activity Detection for OFDM-based Massive Grant-Free Access

Existing works on grant-free access, proposed to support massive machine-type communication (mMTC) for the Internet of Things (IoT), mainly concentrate on narrowband systems under flat fading. However, little is known about massive grant-free access for wideband systems under frequency-selective fading. This paper investigates massive grant-free access in a wideband system under frequency-selective fading. First, we present an orthogonal frequency division multiplexing (OFDM)-based massive grant-free access scheme. Then, we propose two different but equivalent models for the received pilot signal, which are essential for designing various device activity detection and channel estimation methods for OFDM-based massive grant-free access. One directly models the received signal for actual devices, whereas the other can be interpreted as a signal model for virtual devices. Next, we investigate statistical device activity detection under frequency-selective Rayleigh fading based on the two signal models. We first model device activities as unknown deterministic quantities and propose three maximum likelihood (ML) estimation-based device activity detection methods with different detection accuracies and computation times. We also model device activities as random variables with a known joint distribution and propose three maximum a posteriori probability (MAP) estimation-based device activity detection methods, which further enhance the accuracies of the corresponding ML estimation-based methods. Optimization techniques and matrix analysis are applied in designing and analyzing these methods. Finally, numerical results show that the proposed statistical device activity detection methods outperform existing state-of-the-art device activity detection methods under frequency-selective Rayleigh fading.
