Source author record

Chao Xu

Chao Xu appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Computer Vision cond-mat.str-el eess.SP Robotics

Catalog footprint

What is connected

5works

4topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2026arXiv

Delay-Doppler-Domain Channel Estimation and Reduced-Complexity Detection of Faster-than-Nyquist Signaling Aided OTFS

We conceive a novel channel estimation and data detection scheme for OTFS-modulated faster-than-Nyquist (FTN) transmission over doubly selective fading channels, aiming for enhancing the spectral efficiency and Doppler resilience. The delay-Doppler (DD) domain's input-output relationship of OTFS-FTN signaling is derived by employing a root-raised cosine (RRC) shaping filter. More specifically, we design our DD-domain channel estimator for FTN-based pilot transmission, where the pilot symbol interval is lower than that defined by the classic Nyquist criterion. Moreover, we propose a reduced-complexity linear minimum mean square error equalizer, supporting noise whitening, where the FTN-induced inter-symbol interference (ISI) matrix is approximated by a sparse one. Our performance results demonstrate that the proposed OTFS-FTN scheme is capable of enhancing the achievable information rate, while attaining a comparable BER performance to both that of its Nyquist-based OTFS counterpart and to other FTN transmission schemes, which employ the same RRC shaping filter.

preprint2026arXiv

Electric field effects in one-dimensional spin-1/2 $K_1J_1Γ_1Γ_1^\prime K_2J_2$ model with ferromagnetic Kitaev coupling

We perform a systematic study on the effects of electric fields in the Luttinger liquid phase of the one-dimensional spin-$1/2$ $K_1J_1Γ_1Γ_1^\prime K_2J_2$ model in the region of ferromagnetic nearest-neighboring Kitaev coupling. We find that while electric fields along $(1,1,1)$-direction maintain the Luttinger liquid behavior, fields along other directions drive the system to a dimerized state. An estimation is made on how effective a $(1,1,1)$-field is for tuning the Luttinger parameter in real materials. Our work is useful for understanding the effects of electric fields in one-dimensional generalized Kitaev spin models, and provides a starting point for exploring the electric-field-related physics in two dimensions based on a quasi-one-dimensional approach.

preprint2026arXiv

NEWTON: Agentic Planning for Physically Grounded Video Generation

Video generation models produce visually compelling results but systematically violate physical commonsense -- on VideoPhy-2, the best model achieves only 32.6% joint accuracy. We identify a specification bottleneck: text prompts are lossy compression of the physical world, omitting the parameters that fully determine dynamics, and no amount of model scaling can recover what was never specified. From this diagnosis we derive three properties that physics conditioning must satisfy -- sufficiency, dynamism, and verifiability -- and show that no existing approach satisfies all three. We present NEWTON, in which video generation is demoted from the system output to one action inside an agent's toolbox: a learned planner orchestrates physics-aware tools (keyframe generation, scientific computation, prompt refinement) to construct rich conditioning, and a verifier closes the loop for iterative re-planning. The planner is the sole trainable component, optimized on-policy via Flow-GRPO inside the live multi-turn loop. On VideoPhy-2, NEWTON improves joint accuracy from 21.4% to 29.7% on LTX-Video and from 30.7% to 37.4% on Veo-3.1, without modifying either generator. Our project page: https://Newton026.github.io/newton

preprint2026arXiv

U-REPA: Aligning Diffusion U-Nets to ViTs

Representation Alignment (REPA) that aligns Diffusion Transformer (DiT) hidden-states with ViT visual encoders has proven highly effective in DiT training, demonstrating superior convergence properties, but it has not been validated on the canonical diffusion U-Net architecture that shows faster convergence compared to DiTs. However, adapting REPA to U-Net architectures presents unique challenges: (1) different block functionalities necessitate revised alignment strategies; (2) spatial-dimension inconsistencies emerge from U-Net's spatial downsampling operations; (3) space gaps between U-Net and ViT hinder the effectiveness of tokenwise alignment. To encounter these challenges, we propose \textbf{U-REPA}, a representation alignment paradigm that bridges U-Net hidden states and ViT features as follows: Firstly, we propose via observation that due to skip connection, the middle stage of U-Net is the best alignment option. Secondly, we propose upsampling of U-Net features after passing them through MLPs. Thirdly, we observe difficulty when performing tokenwise similarity alignment, and further introduces a manifold loss that regularizes the relative similarity between samples. Experiments indicate that the resulting U-REPA could achieve excellent generation quality and greatly accelerates the convergence speed. With CFG guidance interval, U-REPA could reach $FID<1.5$ in 200 epochs or 1M iterations on ImageNet 256 $\times$ 256, and needs only half the total epochs to perform better than REPA under sd-vae-ft-ema. Codes: https://github.com/YuchuanTian/U-REPA

preprint2025arXiv

CREPES-X: Hierarchical Bearing-Distance-Inertial Direct Cooperative Relative Pose Estimation System

Relative localization is critical for cooperation in autonomous multi-robot systems. Existing approaches either rely on shared environmental features or inertial assumptions or suffer from non-line-of-sight degradation and outliers in complex environments. Robust and efficient fusion of inter-robot measurements such as bearings, distances, and inertials for tens of robots remains challenging. We present CREPES-X (Cooperative RElative Pose Estimation System with multiple eXtended features), a hierarchical relative localization framework that enhances speed, accuracy, and robustness under challenging conditions, without requiring any global information. CREPES-X starts with a compact hardware design: InfraRed (IR) LEDs, an IR camera, an ultra-wideband module, and an IMU housed in a cube no larger than 6cm on each side. Then CREPES-X implements a two-stage hierarchical estimator to meet different requirements, considering speed, accuracy, and robustness. First, we propose a single-frame relative estimator that provides instant relative poses for multi-robot setups through a closed-form solution and robust bearing outlier rejection. Then a multi-frame relative estimator is designed to offer accurate and robust relative states by exploring IMU pre-integration via robocentric relative kinematics with loosely- and tightly-coupled optimization. Extensive simulations and real-world experiments validate the effectiveness of CREPES-X, showing robustness to up to 90% bearing outliers, proving resilience in challenging conditions, and achieving RMSE of 0.073m and 1.817° in real-world datasets.

Institution

Affiliation not imported yet

This author record came from a source that does not expose affiliation metadata. Once the author claims the profile or we enrich the record from another provider, this section will link to the concrete institution.

Topic footprint