Source author record

Yue Wang

Yue Wang appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

Computer Vision astro-ph.HE Robotics quant-ph Artificial Intelligence astro-ph.IM

Catalog footprint

What is connected

12 works

6 topics

4 close collaborators


preprint, 2026, arXiv

Brightest GRB flare observed in GRB 221009A: bridge the last gap between flare and prompt emission in GRB

Flares are usually observed during the afterglow phase of Gamma-Ray Bursts (GRBs) in the soft X-ray, optical and radio bands, but rarely in the gamma-ray band. Despite the extraordinary brightness of the burst, GECAM-C accurately measured both the bright prompt emission and the flare emission of GRB 221009A without instrumental effects, offering a good opportunity to study the relation between them. In this work, we present a comprehensive analysis of the flare emission of GRB 221009A, which is composed of a series of flares. Among them, we identify an exceptionally bright flare with an isotropic energy of $E_{\rm iso} = 1.82 \times 10^{53}$ erg, a record among GRB flares. It exhibits the highest peak energy ever detected in GRB flares, $E_{\rm peak} \sim 300$ keV, making it a genuine gamma-ray flare. It also shows rapid rise and decay timescales, significantly shorter than those of typical X-ray flares observed in the soft X-ray or optical band, but comparable to those observed in prompt emission. Despite these exceptional properties, the flare shares several common properties with typical GRB flares. We note that this is the first observation of a GRB flare in the keV-MeV band with sufficiently high temporal resolution and high statistics, which bridges the last gap between prompt emission and flares.

preprint, 2026, arXiv

Efficient Preparation of Quantum States via Randomized Truncation

While the preparation of a general quantum state is challenging, realistic problem instances, such as those encountered in quantum chemistry and quantum machine learning, typically exhibit hierarchical amplitude structures, consisting of a small number of large components alongside a vast number of small but non-negligible ones. Standard approaches that deterministically truncate the small amplitudes incur an approximation error that scales linearly with the discarded amplitude mass, enforcing a rigid trade-off between precision and circuit depth. Here, we circumvent this challenge by introducing a randomized state-preparation protocol with probabilistic amplification of small amplitudes using ensembles of low-complexity circuits. Analytically, we prove that this approach significantly reduces the number of encoded amplitudes, halving the requirement for exponentially decaying states and offering asymptotically larger gains for heavy-tailed power-law decays. Numerical simulations on LiH molecular wavefunctions and deep-learning-derived states demonstrate reductions of up to 99 percent in CNOT and T-gate counts compared with deterministic methods. These results establish a resource-efficient paradigm for initializing complex states, relaxing gate-synthesis precision requirements for both near-term and fault-tolerant hardware, and improving the end-to-end feasibility of quantum computing.
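
The idea of probabilistically amplifying small amplitudes can be illustrated with a purely classical analogy. The sketch below is not the paper's protocol: it is a generic unbiased randomized sparsification of a real vector, and the function name `randomized_truncate`, the `threshold` parameter, and the example amplitudes are all hypothetical. Each entry smaller than the threshold is either dropped or promoted to the threshold magnitude, with a probability chosen so that the average over many draws reproduces the original entry. State normalization and complex phases are ignored.

```python
import math
import random

def randomized_truncate(amps, threshold, rng):
    """Return a sparse random copy of `amps` whose expectation equals `amps`.

    Entries with magnitude >= threshold are kept as they are. A smaller entry a
    is promoted to magnitude `threshold` with probability |a| / threshold and
    set to zero otherwise, so its expected value is exactly a.
    (Classical illustration only; assumes real-valued amplitudes.)
    """
    out = []
    for a in amps:
        mag = abs(a)
        if mag == 0.0 or mag >= threshold:
            out.append(a)
        elif rng.random() < mag / threshold:
            out.append(math.copysign(threshold, a))
        else:
            out.append(0.0)
    return out

# Hypothetical amplitudes: two large components and a tail of small ones.
rng = random.Random(0)
amps = [0.9, -0.4, 0.05, -0.02, 0.01]
threshold = 0.2
n_samples = 20000

totals = [0.0] * len(amps)
nonzero = 0
for _ in range(n_samples):
    sample = randomized_truncate(amps, threshold, rng)
    nonzero += sum(1 for s in sample if s != 0.0)
    for i, s in enumerate(sample):
        totals[i] += s

mean = [t / n_samples for t in totals]   # approaches `amps` as n_samples grows
avg_nonzero = nonzero / n_samples        # average number of entries per draw
```

In this toy setting each individual draw carries fewer nonzero entries than the full vector, while the ensemble average carries no systematic truncation error. Deterministic truncation at the same threshold would instead discard the small entries in every draw.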

preprint, 2026, arXiv

Fiducial Exoskeletons: Image-Centric Robot State Estimation

We introduce Fiducial Exoskeletons, an image-based reformulation of 3D robot state estimation that replaces cumbersome procedures and motor-centric pipelines with single-image inference. Traditional approaches - especially robot-camera extrinsic estimation - often rely on high-precision actuators and require time-consuming routines such as hand-eye calibration. In contrast, modern learning-based robot control is increasingly trained and deployed from RGB observations on lower-cost hardware. Our key insight is twofold. First, we cast robot state estimation as 6D pose estimation of each link from a single RGB image: the robot-camera base transform is obtained directly as the estimated base-link pose, and the joint state is recovered via a lightweight global optimization that enforces kinematic consistency with the observed link poses (optionally warm-started with encoder readings). Second, we make per-link 6D pose estimation robust and simple - even without learning - by introducing the fiducial exoskeleton: a lightweight 3D-printed mount with a fiducial marker on each link and known marker-link geometry. This design yields robust camera-robot extrinsics, per-link SE(3) poses, and joint-angle state from a single image, enabling robust state estimation even on unplugged robots. Demonstrated on a low-cost robot arm, fiducial exoskeletons substantially simplify setup while improving calibration, state accuracy, and downstream 3D control performance. We release code and printable hardware designs to enable further algorithm-hardware co-design.

preprint, 2026, arXiv

InfiniDepth: Arbitrary-Resolution and Fine-Grained Depth Estimation with Neural Implicit Fields

Existing depth estimation methods are fundamentally limited to predicting depth on discrete image grids. Such representations restrict their scalability to arbitrary output resolutions and hinder the recovery of geometric detail. This paper introduces InfiniDepth, which represents depth as neural implicit fields. Through a simple yet effective local implicit decoder, we can query depth at continuous 2D coordinates, enabling arbitrary-resolution and fine-grained depth estimation. To better assess our method's capabilities, we curate a high-quality 4K synthetic benchmark from five different games, spanning diverse scenes with rich geometric and appearance details. Extensive experiments demonstrate that InfiniDepth achieves state-of-the-art performance on both synthetic and real-world benchmarks across relative and metric depth estimation tasks, particularly excelling in fine-detail regions. It also benefits the task of novel view synthesis under large viewpoint shifts, producing high-quality results with fewer holes and artifacts.

preprint, 2026, arXiv

Inverse Rendering for High-Genus 3D Surface Meshes from Multi-view Images with Persistent Homology Priors

Reconstructing 3D objects from images is inherently an ill-posed problem due to ambiguities in geometry, appearance, and topology. This paper introduces collaborative inverse rendering with persistent homology priors, a novel strategy that leverages topological constraints to resolve these ambiguities. By incorporating priors that capture critical features such as tunnel loops and handle loops, our approach directly addresses the difficulty of reconstructing high-genus surfaces. The collaboration between photometric consistency from multi-view images and homology-based guidance enables recovery of complex high-genus geometry while circumventing catastrophic failures such as collapsing tunnels or losing high-genus structure. Instead of neural networks, our method relies on gradient-based optimization within a mesh-based inverse rendering framework to highlight the role of topological priors. Experimental results show that incorporating persistent homology priors leads to lower Chamfer Distance (CD) and higher Volume IoU compared to state-of-the-art mesh-based methods, demonstrating improved geometric accuracy and robustness against topological failure.

preprint, 2026, arXiv

Learning Domain Agnostic Latent Embeddings of 3D Faces for Zero-shot Animal Expression Transfer

We present a zero-shot framework for transferring human facial expressions to 3D animal face meshes. Our method combines intrinsic geometric descriptors (HKS/WKS) with a mesh-agnostic latent embedding that disentangles facial identity and expression. The ID latent space captures species-independent facial structure, while the expression latent space encodes deformation patterns that generalize across humans and animals. Trained only with human expression pairs, the model learns the embeddings, decoupling, and recoupling of cross-identity expressions, enabling expression transfer without requiring animal expression data. To enforce geometric consistency, we employ Jacobian loss together with vertex-position and Laplacian losses. Experiments show that our approach achieves plausible cross-species expression transfer, effectively narrowing the geometric gap between human and animal facial shapes.

preprint, 2026, arXiv

OmniSelect: Dynamic Modality-Aware Token Compression for Efficient Omni-modal Large Language Models

Omnimodal large language models (OmniLLMs) have recently gained increasing attention for unified audio-video understanding. However, processing long multimodal token sequences introduces substantial computational overhead, making efficient token compression crucial. Existing methods typically rely on fixed, modality-specific guidance, which fails to account for the varying importance of modalities across different queries. To address this limitation, we propose $\textbf{OmniSelect}$, a training-free, modality-adaptive token pruning framework that dynamically selects appropriate compression strategies for multimodal inputs. Specifically, we leverage a lightweight AudioCLIP model to estimate cross-modal relevance and categorize each input into one of three pruning regimes: Audio-Centric, Video-Centric, and Uniform pruning. Based on these relevance scores, OmniSelect further performs fine-grained token pruning within each temporal group, adaptively allocating pruning ratios to preserve informative tokens across modalities. By explicitly modeling modality preference and enabling dynamic strategy selection, OmniSelect effectively avoids the pitfalls of one-size-fits-all compression. Extensive experiments demonstrate that our method achieves efficient multimodal token reduction while maintaining strong performance, without requiring any additional training.

preprint, 2026, arXiv

On the Ultra-Long Gamma-Ray Transient GRB 250702B/EP250702

GRB 250702B/EP250702a is an interesting long-duration gamma-ray transient whose nature is under debate. To obtain a full picture in the gamma-ray band, we carry out a comprehensive targeted search for burst emission in a wide window of 30 days, jointly using Insight-HXMT, GECAM and Fermi/GBM data within the ETJASMIN framework. In the gamma-ray band, we find a 50-second precursor about 25 hours before the 4-hour main burst, which generally consists of 4 emission episodes. Remarkably, we find that the soft X-ray emission (after the main burst) decays as a power law, with a start time aligned with the last episode of the main emission and an index of -5/3, perfectly consistent with the canonical prediction of fallback accretion. We conclude that the properties of the precursor, the main burst and the following soft X-ray emission strongly support the atypical collapsar Ultra-Long Gamma-Ray Burst (ULGRB) scenario rather than the Tidal Disruption Event (TDE) scenario, and that all of this gamma-ray and soft X-ray emission probably originates from a relativistic jet whose luminosity is dominated by the fallback accretion rate during the death collapse of a supergiant star.
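
A power-law decay with index -5/3 means the flux scales as (t - t0)^(-5/3), which appears as a straight line of slope -5/3 in log-log space once the start time t0 is fixed. The sketch below shows how such an index could be recovered with an ordinary least-squares fit. It is only an illustration under stated assumptions: the data are synthetic and noise-free, the function name `fit_power_law_index`, the start time and the normalization are hypothetical, and the abstract does not describe the authors' actual fitting procedure.

```python
import math

def fit_power_law_index(times, fluxes, t0=0.0):
    """Least-squares slope of log(flux) versus log(t - t0).

    For flux = A * (t - t0)**alpha this slope equals alpha.
    Every time must be later than t0 and every flux must be positive.
    """
    xs = [math.log(t - t0) for t in times]
    ys = [math.log(f) for f in fluxes]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx

# Synthetic light curve (made-up numbers, not measurements from the paper).
t0 = 100.0  # hypothetical decay start time, arbitrary units
times = [t0 + dt for dt in (10.0, 30.0, 100.0, 300.0, 1000.0)]
fluxes = [5.0 * (t - t0) ** (-5.0 / 3.0) for t in times]

index = fit_power_law_index(times, fluxes, t0=t0)
```

The fitted slope depends strongly on the assumed t0, which is why the alignment of the start time with the last episode of the main emission matters for the interpretation. With real data, measurement uncertainties would also need to enter the fit as weights.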

preprint, 2026, arXiv

OpenNavMap: Structure-Free Topometric Mapping via Large-Scale Collaborative Localization

Scalable and maintainable map representations are fundamental to enabling large-scale visual navigation and facilitating the deployment of robots in real-world environments. While collaborative localization across multi-session mapping enhances efficiency, traditional structure-based methods struggle with high maintenance costs and fail in feature-less environments or under significant viewpoint changes typical of crowd-sourced data. To address this, we propose OpenNavMap, a lightweight, structure-free topometric system leveraging 3D geometric foundation models for on-demand reconstruction. Our method unifies dynamic programming-based sequence matching, geometric verification, and confidence-calibrated optimization to achieve robust, coarse-to-fine submap alignment without requiring pre-built 3D models. Evaluations on the Map-Free benchmark demonstrate superior accuracy over structure-from-motion and regression baselines, achieving an average translation error of 0.62m. Furthermore, the system maintains global consistency across 15km of multi-session data with an absolute trajectory error below 3m for map merging. Finally, we validate practical utility through 12 successful autonomous image-goal navigation tasks on simulated and physical robots. Code and datasets will be publicly available at https://rpl-cs-ucl.github.io/OpenNavMap_page.

preprint, 2026, arXiv

Randomization Accelerates Series-Truncated Quantum Algorithms

Quantum algorithms typically demand prohibitively complicated circuits to solve practical problems. Previous studies have shown that classical randomness can accelerate some specific quantum algorithms. In this work, we introduce the Randomized Truncated Series (RTS) which extends this acceleration to all quantum algorithms that rely on truncated series approximations. RTS offers two key advantages: it quadratically suppresses truncation errors and allows for continuous adjustment of the effective truncation order. By leveraging random mixing between two quantum circuits, RTS ensures that their probabilistic combination accurately realizes the desired algorithm, while significantly reducing the average circuit size. We demonstrate the versatility of RTS through concrete applications. Our results shed light on the path toward practical quantum advantage.

preprint, 2025, arXiv

CREPES-X: Hierarchical Bearing-Distance-Inertial Direct Cooperative Relative Pose Estimation System

Relative localization is critical for cooperation in autonomous multi-robot systems. Existing approaches either rely on shared environmental features or inertial assumptions or suffer from non-line-of-sight degradation and outliers in complex environments. Robust and efficient fusion of inter-robot measurements such as bearings, distances, and inertial measurements for tens of robots remains challenging. We present CREPES-X (Cooperative RElative Pose Estimation System with multiple eXtended features), a hierarchical relative localization framework that enhances speed, accuracy, and robustness under challenging conditions, without requiring any global information. CREPES-X starts with a compact hardware design: InfraRed (IR) LEDs, an IR camera, an ultra-wideband module, and an IMU housed in a cube no larger than 6cm on each side. Then CREPES-X implements a two-stage hierarchical estimator to meet different requirements, considering speed, accuracy, and robustness. First, we propose a single-frame relative estimator that provides instant relative poses for multi-robot setups through a closed-form solution and robust bearing outlier rejection. Then a multi-frame relative estimator is designed to offer accurate and robust relative states by exploring IMU pre-integration via robocentric relative kinematics with loosely- and tightly-coupled optimization. Extensive simulations and real-world experiments validate the effectiveness of CREPES-X, showing robustness to up to 90% bearing outliers, proving resilience in challenging conditions, and achieving RMSE of 0.073m and 1.817° in real-world datasets.

preprint, 2025, arXiv

LUNCH: A Lightweight Unified Deep-Learning Framework for General Transients Classification in High-Energy Time-Domain Astronomy

The increasing data volume of high-energy space monitors necessitates real-time, automated transient classification for multi-messenger follow-up. Conventional methods rely on empirical features like hardness ratios and reliable localization, which are not always precisely available during early detection. We developed the Lightweight Unified Neural Classifier for High-energy Transients (LUNCH) - an end-to-end deep-learning framework that performs general transient classification directly from raw multi-band light curves, eliminating the need for background subtraction or source localization. Its dual-scale architecture fuses long- and short-scale temporal evolution adaptively. Evaluated on 15 years of Fermi/GBM triggers, the optimal model achieves 97.23% accuracy when trained on complete energy spectra. A lightweight version using only three broad energy bands retains 95.07% accuracy, demonstrating that coarse spectral information fused with temporal context enables robust discrimination. The system significantly outperforms the GBM in-flight classifier on three months of independent test data. Feature visualization reveals well-separated class clusters, confirming physical interpretability. LUNCH combines high accuracy, low computational cost, and instrument-agnostic inputs, offering a practical solution for real-time in-flight processing that enables timely triggers for immediate multi-wavelength and multi-messenger follow-up observations in future time-domain missions.

Institution

Affiliation not imported yet

This author record came from a source that does not expose affiliation metadata. Once the author claims the profile or we enrich the record from another provider, this section will link to the concrete institution.
