Researcher profile

Jiahui Wang

Jiahui Wang contributes to research discovery and scholarly infrastructure.

Researcher
Affiliation not imported
Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
14 works
0 followers
16 topics
4 close collaborators

Published work

14 published items

preprint, 2026, arXiv

AstraFlow: Dataflow-Oriented Reinforcement Learning for Agentic LLMs

Reinforcement learning (RL) is increasingly used to improve the reasoning, coding, and tool-use capabilities of large language models, but agentic RL remains prohibitively expensive. Scaling RL to agentic LLMs requires supporting complex workloads, including multi-policy collaborative training, while efficiently using elastic, heterogeneous, and cross-region compute resources. Existing LLM RL systems support some of these capabilities, but each new extension often requires dedicated system engineering. This burden arises from trainer-centered control architectures and the lack of principled abstractions for RL system components. To address these limitations, we propose AstraFlow, a dataflow-oriented RL system that replaces conventional trainer-centered control with principled component abstractions. In AstraFlow, rollout services, dataflow management, and training are decoupled into autonomous components, enabling the system to natively support complex multi-policy agentic RL workloads and efficiently exploit diverse compute resources. We evaluate AstraFlow across math, code, search, and AgentBench workloads, showing that the same system supports multi-policy training, elastic scaling, heterogeneous cross-region execution, and composable data algorithms without system-level code changes. In multi-policy collaborative training, AstraFlow achieves comparable or better accuracy than existing RL systems while speeding up training time by 2.7x.

preprint, 2026, arXiv

Converting qubit relaxation into erasures with a single fluxonium

Qubits that experience predominantly erasure errors offer distinct advantages for fault-tolerant operation. Indeed, dual-rail encoded erasure qubits in superconducting cavities and transmons have demonstrated high-fidelity operations by converting physical-qubit relaxation into logical-qubit erasures, but this comes at the cost of increased hardware overhead and circuit complexity. Here, we address these limitations by realizing erasure conversion in a single fluxonium operated at zero flux, where the logical state is encoded in its 0-2 subspace. A single, carefully engineered resonator provides both mid-circuit erasure detection and end-of-line (EOL) logical measurement. Post-selection on non-erasure outcomes results in a more than four-fold increase of the logical lifetime, from $193~\mu$s to $869~\mu$s. Finally, we characterize measurement-induced logical dephasing as a function of measurement power and frequency, and infer that each erasure check contributes a negligible error of $7.2\times 10^{-5}$. These results establish integer-fluxonium as a promising, resource-efficient platform for erasure-based error mitigation, without requiring additional hardware.
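As a quick arithmetic check of the "more than four-fold" statement, using only the two logical lifetimes quoted in the abstract (no other data is assumed):

```latex
\frac{T_{\text{post-selected}}}{T_{\text{raw}}}
  = \frac{869~\mu\mathrm{s}}{193~\mu\mathrm{s}}
  \approx 4.5
```

The symbols $T_{\text{raw}}$ and $T_{\text{post-selected}}$ are labels introduced here for illustration; they do not appear in the abstract.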

preprint, 2026, arXiv

Global Parametric Gates for Multi-qubit Entanglement

We propose and experimentally demonstrate a global parametric gate that generates multi-qubit entangled states in a single step. By applying a parametric drive to a common qubit at precise detunings relative to computational qubits, we directly produce two-, three-, and four-qubit entanglement with state fidelities of $99.4\% \pm 0.2\%$, $93.4\% \pm 0.3\%$, and $91.4\% \pm 0.3\%$, respectively. This scheme enables efficient, reconfigurable control using only microwave drives and is compatible with fixed-frequency qubits. Error analyses indicate that infidelity stems primarily from decoherence and coherent control errors, with negligible contributions from static ZZ coupling and flux noise. Furthermore, simulations with state-of-the-art parameters predict this global gate can generate high-fidelity (99.70%) entanglement in systems of up to six qubits.

preprint, 2026, arXiv

Illusion-Aware Visual Preprocessing and Anti-Illusion Prompting for Classic Illusion Understanding in Vision-Language Models

Vision-Language Models (VLMs) exhibit systematic bias toward visual illusions, recalling memorized facts rather than perceiving actual visual differences. This paper presents a training-free framework for the 5th DataCV Challenge Task 1 at CVPR 2026, addressing this perception-versus-memory conflict through three complementary strategies: (1) illusion-aware image preprocessing that weakens illusion-inducing context via type-specific transformations (edge extraction, color isolation, morphological processing, and reference-line overlay), (2) anti-illusion prompt engineering that guides VLMs toward qualitative visual comparison, and (3) a multi-vote ensemble that further improves robustness. Our method achieves 90.48% accuracy on the official 630-image test set using Claude (claude-opus-4-6) with a 5-vote majority ensemble, and 98.41% on a human-verified subset. The approach requires no finetuning, relying solely on visual manipulation and prompt design. Our solution secured 2nd place in the challenge, only 0.47% behind the 1st-place solution. Code is available at https://github.com/jasminezz/sf-illusion-aware-vlm.git.
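The majority-vote step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function name `majority_vote`, the answer labels, and the tie-breaking rule (the earliest answer wins) are all assumptions made for illustration, and the abstract does not say how ties or answer normalization are handled.

```python
from collections import Counter


def majority_vote(answers):
    """Return the most frequent answer among several model responses.

    Answers are normalized (whitespace stripped, lowercased) before counting.
    Ties are broken in favor of the answer that appeared first; this rule is
    an assumption, not something stated in the abstract.
    """
    if not answers:
        raise ValueError("need at least one vote")
    normalized = [a.strip().lower() for a in answers]
    counts = Counter(normalized)
    best = max(counts.values())
    # Scan in original order so the earliest top-scoring answer wins ties.
    for answer in normalized:
        if counts[answer] == best:
            return answer


# Hypothetical responses from five independent queries about one image.
votes = ["Left", "right", "left ", "left", "same"]
print(majority_vote(votes))  # prints "left"
```

With an odd number of votes and two possible answers a tie cannot occur, which may be one reason to prefer a 5-vote setup; with more than two answer options a tie-breaking rule such as the one assumed here is still needed.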

preprint, 2026, arXiv

PaMoSplat: Part-Aware Motion-Guided Gaussian Splatting for Dynamic Scene Reconstruction

Dynamic scene reconstruction represents a fundamental yet demanding challenge in computer vision and robotics. While recent progress in 3DGS-based methods has advanced dynamic scene modeling, obtaining high-fidelity rendering and accurate tracking in scenarios with substantial, intricate motions remains significantly challenging. To address these challenges, we propose PaMoSplat, a novel dynamic Gaussian splatting framework incorporating part awareness and motion priors. Our approach is grounded in two key observations: 1) Parts serve as primitives for scene deformation, and 2) Motion cues from optical flow can effectively guide part motion. Specifically, PaMoSplat initializes by lifting multi-view segmentation masks into 3D space via graph clustering, establishing coherent Gaussian parts. For subsequent timestamps, we leverage a differential evolutionary algorithm to estimate the rigid motion of these parts using multi-view optical flow cues, providing a robust warm-start for further optimization. Additionally, PaMoSplat introduces an adaptive iteration count mechanism, internal learnable rigidity, and flow-supervised rendering loss to accelerate and optimize the training process. Comprehensive evaluations across diverse scenes, including real-world environments, demonstrate that PaMoSplat delivers superior rendering quality, improved tracking precision, and faster convergence compared to existing methods. Furthermore, it enables multiple part-level downstream applications, such as 4D scene editing.

preprint, 2026, arXiv

Unlocking the Potentials of Retrieval-Augmented Generation for Diffusion Language Models

Diffusion Language Models (DLMs) have recently demonstrated remarkable capabilities in natural language processing tasks. However, the potential of Retrieval-Augmented Generation (RAG), which has been highly successful in enhancing large language models (LLMs), has not been well explored for DLMs, owing to the fundamental difference between LLM and DLM decoding. To fill this critical gap, we systematically test the performance of DLMs within the RAG framework. Our findings reveal that DLMs coupled with RAG show promising potential, with a stronger dependency on contextual information, but suffer from limited generation precision. We identify a key underlying issue: Response Semantic Drift (RSD), where the generated answer progressively deviates from the query's original semantics, leading to low-precision content. We trace this problem to the denoising strategies in DLMs, which fail to maintain semantic alignment with the query throughout the iterative denoising process. To address this, we propose Semantic-Preserving REtrieval-Augmented Diffusion (SPREAD), a novel framework that introduces a query-relevance-guided denoising strategy. By actively guiding the denoising trajectory, SPREAD ensures the generation remains anchored to the query's semantics and effectively suppresses drift. Experimental results demonstrate that SPREAD significantly enhances the precision of generated answers and effectively mitigates RSD within the RAG framework.

preprint, 2022, arXiv

A flying Schrödinger cat in multipartite entangled states

Schrödinger's cat originates from the famous thought experiment querying the counterintuitive quantum superposition of macroscopic objects. As a natural extension, several "cats" (quasi-classical objects) can be prepared in coherent quantum superposition states, known as multipartite cat states, which demonstrate quantum entanglement among macroscopically distinct objects. Here we present a highly scalable approach to deterministically create flying multipartite Schrödinger cat states, by reflecting coherent state photons from a microwave cavity containing a superconducting qubit. We perform full quantum state tomography on the cat states with up to four photonic modes and confirm the existence of quantum entanglement among them. We also witness the hybrid entanglement between discrete-variable states (the qubit) and continuous-variable states (the flying multipartite cat) through a joint quantum state tomography. Our work demonstrates an important experimental control method in the microwave region and provides an enabling step for implementing a series of quantum metrology and quantum information processing protocols based on cat states.

preprint, 2022, arXiv

CAM/CAD Point Cloud Part Segmentation via Few-Shot Learning

3D part segmentation is an essential step in advanced CAM/CAD workflows. Precise 3D segmentation contributes to a lower defect rate of workpieces produced by manufacturing equipment (such as computer-controlled CNC machines), thereby improving work efficiency and attaining the attendant economic benefits. Most existing works on 3D model segmentation are based on fully supervised learning, which trains AI models on large, annotated datasets. However, the resulting models are highly reliant on the completeness of the available dataset, and they generalize relatively poorly to new, unknown segmentation types (i.e., additional novel classes). In this work, we propose and develop a few-shot learning-based approach for effective part segmentation in CAM/CAD; it is designed to significantly enhance generalization ability and to adapt flexibly to new segmentation tasks using only a few samples. As a result, it not only reduces the need for exhaustive, and usually unattainable, completeness of supervision datasets, but also improves flexibility for real-world applications. As a further improvement, we additionally adopt the transform net and the center loss block in the network. These components serve to improve the comprehension of 3D features across the various possible instances of the whole workpiece and to ensure that samples of the same class are closely distributed in feature space.

preprint, 2022, arXiv

Cross-Enhancement Transformer for Action Segmentation

Temporal convolutions have been the paradigm of choice in action segmentation; they enlarge long-term receptive fields by stacking more convolution layers. However, the higher layers lose local information necessary for frame recognition. To solve this problem, this paper proposes a novel encoder-decoder structure called the Cross-Enhancement Transformer. Our approach effectively learns temporal structure representations with an interactive self-attention mechanism: the convolutional feature maps of each encoder layer are concatenated with a set of decoder features produced via self-attention. Local and global information are therefore used simultaneously across a series of frame actions. In addition, a new loss function that penalizes over-segmentation errors is proposed to enhance the training process. Experiments show that our framework achieves state-of-the-art performance on three challenging datasets: 50Salads, Georgia Tech Egocentric Activities, and the Breakfast dataset.

preprint, 2022, arXiv

Experimental preparation of generalized cat states for itinerant microwave photons

Generalized cat states represent arbitrary superpositions of coherent states, which are of great importance in various quantum information processing protocols. Here we demonstrate a versatile approach to creating generalized itinerant cat states in the microwave domain, by reflecting coherent state photons from a microwave cavity containing a superconducting qubit. We show that, with coherent control of the qubit state, full control over the coherent state superposition can be realized. The prepared cat states are verified through quantum state tomography of the qubit-state-dependent reflected photon field. We further quantify quantum coherence in the prepared cat states based on the resource theory, revealing good experimental control over the coherent state superpositions. The photon number statistics and the squeezing properties are also analyzed. Remarkably, fourth-order squeezing is observed in the experimental states. These results open up new possibilities for applying generalized cat states to quantum information processing.

preprint, 2022, arXiv

Masked Self-Supervision for Remaining Useful Lifetime Prediction in Machine Tools

Prediction of Remaining Useful Lifetime (RUL) for machines and tools in the modern manufacturing and automation workplace is essential in Industry 4.0. Continuous tool wear or, worse, sudden machine breakdown leads to various manufacturing failures that cause economic loss. With the availability of deep learning approaches, their great potential for RUL prediction has resulted in several models driven by operation data from manufacturing machines. Current efforts based on fully supervised models rely heavily on data labeled with RULs. However, the required RUL prediction data (i.e., the annotated and labeled data from faulty and/or degraded machines) can only be obtained after a machine breakdown occurs. The scarcity of broken machines in real-world manufacturing and automation workplaces increases the difficulty of obtaining sufficient annotated and labeled data. In contrast, data from healthy machines is much easier to collect. Noting this challenge and the potential for improved effectiveness and applicability, we propose and develop a method based on the idea of masked autoencoders, which utilizes unlabeled data for self-supervision. This masked self-supervised learning approach is designed to build a deep learning model for RUL prediction from unlabeled data. Experiments to verify its effectiveness are conducted on the C-MAPSS datasets (collected from NASA turbofan engine data). The results show that our approach performs better, in both accuracy and effectiveness, for RUL prediction than approaches utilizing a fully supervised model.

preprint, 2022, arXiv

Massively parallel pixel-by-pixel nanophotonic optimization using a Green's function formalism

We introduce an efficient parallelization scheme to implement pixel-by-pixel nanophotonic optimization using a Green's function based formalism. The crucial insight in our proposal is the reframing of the optimization algorithm as a large-scale data processing pipeline, which allows for the efficient distribution of computational tasks across thousands of workers. We demonstrate the utility of our implementation by exercising it to optimize a high numerical aperture focusing metalens at problem sizes that would otherwise be far out of reach for the Green's function based method. Finally, we highlight the connection to powerful ideas from reinforcement learning as a natural corollary of reinterpreting the nanophotonic inverse design problem as a graph traversal enabled by the pixel-by-pixel optimization paradigm.

preprint, 2021, arXiv

Mirror symmetric on-chip frequency circulation of light

Integrated circulators and isolators are important for developing on-chip optical technologies, such as laser cavities, communication systems, and quantum information processors. These devices appear to inherently require mirror symmetry breaking to separate backwards from forwards propagation, so existing implementations rely upon magnetic materials, or interactions driven by propagating waves. In contrast to previous work, we demonstrate a mirror symmetric nonreciprocal device. Our device comprises three coupled photonic resonators implemented in thin-film lithium niobate. Applying radio frequency modulation, we drive conversion between the frequency eigenmodes of this system. We measure nearly 40 dB of isolation for approximately 75 mW of RF power near 1550 nm. We simultaneously generate nonreciprocal conversion between all of the eigenmodes in order to demonstrate circulation. Mirror symmetric circulation significantly simplifies the fabrication and operation of nonreciprocal integrated devices. Finally, we consider applications of such on-chip isolators and circulators, such as full-duplex isolation within a single waveguide.

preprint, 2019, arXiv

Chaos Phase Induced Mass-producible Monolayer Two-dimensional Material

The crystal phase is well studied and presents a periodic atomic arrangement in a three-dimensional lattice, but the "amorphous phase" is poorly understood. Here, starting from a cage-like bicyclocalix[2]arene[2]triazine building block, a brand-new 2D MOF is constructed with an extremely weak interlaminar interaction between adjacent 2D-crystal layers. Inter-layer slip happens under external disturbance and leads to the loss of periodicity in one dimension of the crystal lattice, resulting in an interim phase between the crystal and amorphous phases: the chaos phase, non-periodic at the microscopic scale but ordered at the mesoscopic scale. This chaos-phase 2D MOF is a disordered self-assembly of black-phosphorus-like 3D layers, which have excellent mechanical strength and a thickness of 1.15 nm. The bulk 2D-MOF material is readily exfoliated into monolayer nanosheets at gram scale with unprecedented evenness and homogeneity, as well as previously unattained lateral size (>10 μm); these nanosheets constitute the first mass-producible monolayer 2D material and can form wafer-scale films on a substrate.