Researcher profile

Yidong Huang

Yidong Huang contributes to research discovery and scholarly infrastructure.

Researcher | Affiliation not imported | Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging | Verification L1 | Unclaimed author
11 works
0 followers
7 topics
4 close collaborators

Published work

11 published items

preprint | 2026 | arXiv

PhyMotion: Structured 3D Motion Reward for Physics-Grounded Human Video Generation

Generating realistic human motion is a central yet unsolved challenge in video generation. While reinforcement learning (RL)-based post-training has driven recent gains in general video quality, extending it to human motion remains bottlenecked by a reward signal that cannot reliably score motion realism. Existing video rewards primarily rely on 2D perceptual signals, without explicitly modeling the 3D body state, contact, and dynamics underlying articulated human motion, and often assign high scores to videos with floating bodies or physically implausible movements. To address this, we propose PhyMotion, a structured, fine-grained motion reward that grounds recovered 3D human trajectories in a physics simulator and evaluates motion quality along multiple dimensions of physical feasibility. Concretely, we recover SMPL body meshes from generated videos, retarget them onto a humanoid in the MuJoCo physics simulator, and evaluate the resulting motion along three axes: kinematic plausibility, contact and balance consistency, and dynamic feasibility. Each component provides a continuous and interpretable signal tied to a specific aspect of motion quality, allowing the reward to capture which aspects of motion are physically correct or violated. Experiments show that PhyMotion achieves stronger correlation with human judgments than existing reward formulations. These gains carry over to RL-based post-training, where optimizing PhyMotion leads to larger and more consistent improvements than optimizing existing rewards, improving motion realism across both autoregressive and bidirectional video generators under both automatic metrics and blind human evaluation (+68 Elo gain). Ablations show that the three axes provide complementary supervision signals, while the reward preserves overall video generation quality with only modest training overhead.
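The idea of a structured reward with three separately interpretable axes can be illustrated with a minimal sketch. The abstract does not say how the components are scaled or combined, so the [0, 1] score range, the weighted mean, and the names `MotionScores` and `combined_reward` are all assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class MotionScores:
    """Hypothetical per-video component scores, each assumed to lie in [0, 1]."""
    kinematic: float  # kinematic plausibility
    contact: float    # contact and balance consistency
    dynamic: float    # dynamic feasibility


def combined_reward(scores, weights=(1.0, 1.0, 1.0)):
    """Weighted mean of the three axes (the combination rule is an assumption)."""
    wk, wc, wd = weights
    total = wk + wc + wd
    return (wk * scores.kinematic + wc * scores.contact + wd * scores.dynamic) / total


# A "floating body" clip scores poorly on contact even if its poses look fine.
floating = MotionScores(kinematic=0.9, contact=0.2, dynamic=0.6)
grounded = MotionScores(kinematic=0.9, contact=0.9, dynamic=0.8)
```

Keeping the components separate until the final step is what lets a reward of this kind report which aspect of the motion was violated, rather than only a single scalar.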

preprint | 2026 | arXiv

Quantum Interaction Between Free Electrons and Light Involving First-order and Second-order Process

The photon-induced near-field electron microscopy (PINEM) effect has revealed the quantum interaction between free electrons and optical near fields, demonstrating many novel phenomena for manipulating free-electron wave packets and detecting or shaping quantum photonic states. However, free electrons generally absorb or emit only one photon at a time, and the physical mechanism and phenomena of free-electron-two-photon interaction have not yet been studied. Moreover, the relationship between PINEM, the Kapitza-Dirac (KD) effect, and nonlinear Compton scattering is still unclear. Here we develop the full quantum theory of electron-photon interaction considering the two-photon process. It is revealed that the emission or absorption of two photons by electrons can be greatly enhanced by manipulating the electric field component of the optical near field, and that quantum interference between single-photon and two-photon processes can occur in some circumstances, which affects the photon number state, the electron energy states, and electron-photon entanglement. Meanwhile, it is found that the KD effect (elastic electron-photon scattering) and nonlinear Compton scattering (inelastic electron-photon scattering) are also two-photon processes, and the distribution of electrons can be deduced analytically from the full quantum theory. Our work uncovers the potentially rich phenomena that arise when free electrons interact with two photons and paves the way for more in-depth studies of nonlinear processes in electron-photon quantum interactions.
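For orientation, the familiar single-photon PINEM baseline can be sketched numerically. In the standard treatment with a classical, coherent near field, the probability that an electron gains or loses n photon energies is P_n = J_n(2|g|)^2, where g is a dimensionless coupling strength. The sketch below evaluates that textbook expression only; it does not reproduce the two-photon theory of the abstract, and the function names are illustrative.

```python
import math


def bessel_j(n, x, steps=2000):
    """J_n(x) for integer n via Bessel's integral, using the midpoint rule."""
    h = math.pi / steps
    total = 0.0
    for k in range(steps):
        tau = (k + 0.5) * h
        total += math.cos(n * tau - x * math.sin(tau))
    return total * h / math.pi


def pinem_sidebands(g_abs, n_max):
    """Sideband probabilities P_n = J_n(2|g|)^2 for n = -n_max..n_max."""
    return {n: bessel_j(n, 2.0 * g_abs) ** 2 for n in range(-n_max, n_max + 1)}


# Coupling |g| = 1; sidebands beyond |n| = 10 carry negligible weight here.
probs = pinem_sidebands(g_abs=1.0, n_max=10)
```

The resulting spectrum is symmetric in gain and loss and sums to one, which is a quick sanity check on any implementation of the single-photon picture.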

preprint | 2024 | arXiv

SUANPAN: Scalable Photonic Linear Vector Machine

Photonic linear operation is a promising approach to handle the extensive vector multiplications in artificial intelligence techniques due to the natural bosonic parallelism and high-speed information transmission of photonics. Although it is believed that maximizing the interaction of the light beams is necessary to fully utilize the parallelism, and tremendous efforts have been made in past decades, the achieved dimensionality of vector-matrix multiplication is very limited due to the difficulty of scaling up a tightly interconnected or highly coupled optical system. Additionally, there is still a lack of a universal photonic computing architecture that can be readily merged with existing computing systems to meet the computing power demand of AI techniques. Here, we propose a programmable and reconfigurable photonic linear vector machine that performs only the inner product of two vectors. It is formed by a series of independent basic computing units, each of which is just one pair of light emitter and photodetector. Since there is no interaction among light beams inside, extreme scalability could be achieved by simply duplicating the independent basic computing unit, with no requirement for large-scale analog-to-digital converter and digital-to-analog converter arrays. Our architecture is inspired by the traditional Chinese Suanpan, or abacus, and is thus denoted as photonic SUANPAN. As a proof of principle, the SUANPAN architecture is implemented with an 8×8 vertical-cavity surface-emitting laser array and an 8×8 MoTe2 two-dimensional material photodetector array. We believe that our proposed photonic SUANPAN is capable of serving as a fundamental linear vector machine that can be readily merged with existing electronic digital computing systems and has the potential to enhance the computing power for various future AI applications.
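The architecture described above, an inner product built from independent emitter/detector pairs, can be modelled in a few lines. This is a minimal sketch under an assumed encoding: one vector sets the emitter powers and the other sets the detector responsivities, so each unit's photocurrent is their product. The abstract does not specify the encoding, and the function names are hypothetical.

```python
def unit_photocurrent(emitter_power, responsivity):
    """One emitter/detector pair: photocurrent = optical power * responsivity."""
    return emitter_power * responsivity


def photonic_inner_product(x, w):
    """Sum the photocurrents of len(x) independent units (no beam interaction)."""
    if len(x) != len(w):
        raise ValueError("vectors must have the same length")
    return sum(unit_photocurrent(xi, wi) for xi, wi in zip(x, w))


def matvec(matrix, x):
    """A matrix-vector product expressed as one inner product per row."""
    return [photonic_inner_product(x, row) for row in matrix]
```

Because each unit is independent, scaling to longer vectors in this model is just a longer `zip`, which mirrors the "duplicate the basic unit" argument in the abstract. Signed values would need additional handling that is not modelled here.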

preprint | 2023 | arXiv

Deep-learning-based on-chip rapid spectral imaging with high spatial resolution

Spectral imaging extends the concept of traditional color cameras to capture images across multiple spectral channels and has broad application prospects. Conventional spectral cameras based on scanning methods suffer from low acquisition speed and large volume. On-chip computational spectral imaging based on metasurface filters provides a promising scheme for portable applications, but suffers from long computation times for point-by-point iterative spectral reconstruction and from a mosaic effect in the reconstructed spectral images. In this study, we demonstrated on-chip rapid spectral imaging that eliminates the mosaic effect in the spectral image by deep-learning-based spectral data cube reconstruction. We experimentally achieved a speed improvement of four orders of magnitude over iterative spectral reconstruction and a spectral reconstruction fidelity of over 99% for a standard color board. In particular, we demonstrated video-rate spectral imaging for moving objects and outdoor driving scenes with good performance for recognizing metamerism, where the concolorous sky and white cars can be distinguished via their spectra, showing great potential for autonomous driving and other practical applications in the field of intelligent perception.

preprint | 2022 | arXiv

A photon counting reconstructive spectrometer combining metasurfaces and superconducting nanowire single-photon detectors

Faint light spectroscopy has many important applications such as fluorescence spectroscopy, lidar and astronomical observations. However, long measurement times limit its application to real-time measurement. In this work, a photon counting reconstructive spectrometer combining metasurfaces and superconducting nanowire single-photon detectors (SNSPDs) was proposed. A prototype device was fabricated on a silicon-on-insulator (SOI) substrate, and its performance was characterized. Experimental results show that this device supports spectral reconstruction of monochromatic light with a resolution of 2 nm in the wavelength region of 1500 nm to 1600 nm. The detection efficiency of this device is 1.4% to 3.2% in this wavelength region. The measurement time required by this photon counting reconstructive spectrometer was also investigated experimentally, showing its potential to be applied in scenarios requiring real-time measurement.
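A toy version of reconstructive spectrometry for monochromatic input can make the idea concrete. Each metasurface filter is assumed to have a known response per wavelength bin, and the measured photon counts are matched against the response columns. The random response matrix, the choice of 16 filters, and the cosine-similarity matching rule are assumptions for illustration; only the 1500 nm to 1600 nm range and the 2 nm step (51 bins) come from the abstract.

```python
import math
import random


def make_response(num_filters, num_wavelengths, seed=0):
    """Hypothetical filter responses: response[i][k] lies in [0, 1)."""
    rng = random.Random(seed)
    return [[rng.random() for _ in range(num_wavelengths)]
            for _ in range(num_filters)]


def identify_wavelength(counts, response):
    """Index of the wavelength bin whose response column best matches counts."""
    norm_counts = math.sqrt(sum(c * c for c in counts))
    best_k, best_score = -1, -1.0
    for k in range(len(response[0])):
        column = [row[k] for row in response]
        norm_col = math.sqrt(sum(v * v for v in column))
        # Cosine similarity between the measurement and this column.
        score = sum(c * v for c, v in zip(counts, column)) / (norm_counts * norm_col)
        if score > best_score:
            best_k, best_score = k, score
    return best_k


# 51 bins cover 1500 nm to 1600 nm in 2 nm steps.
response = make_response(num_filters=16, num_wavelengths=51)
true_k = 20
counts = [1000 * row[true_k] for row in response]  # noiseless photon counts
wavelength_nm = 1500 + 2 * identify_wavelength(counts, response)
```

With real photon counting, shot noise would blur the match, which is one reason the required measurement time matters for real-time use.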

preprint | 2022 | arXiv

Discovering Intrinsic Reward with Contrastive Random Walk

The aim of this paper is to demonstrate the efficacy of using Contrastive Random Walk as a curiosity method to achieve faster convergence to the optimal policy. Contrastive Random Walk defines the transition matrix of a random walk with the help of neural networks. It learns a meaningful state representation with a closed loop. The loss of Contrastive Random Walk serves as an intrinsic reward and is added to the environment reward. Our method works well in non-tabular sparse reward scenarios, in the sense that it receives the highest reward within the same number of iterations compared to other methods. Contrastive Random Walk is also more robust: the performance does not change much with different random initializations of environments. We also find that adaptive restart and an appropriate temperature are crucial to the performance of Contrastive Random Walk.
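The mechanism described above (a softmax transition matrix, a closed-loop walk, and a loss reused as intrinsic reward) can be sketched as follows. This is a minimal illustration, assuming each step holds a small set of state embeddings, dot-product similarity, a forward-then-backward (palindrome) loop, and a scaling factor `beta`; none of these specifics are given in the abstract, and in practice the embeddings would come from a neural network.

```python
import math


def softmax(row):
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    z = sum(exps)
    return [e / z for e in exps]


def transition(src, dst, temperature):
    """Row-stochastic matrix: softmax over dot-product similarities."""
    return [softmax([sum(a * b for a, b in zip(u, v)) / temperature for v in dst])
            for u in src]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def crw_loss(frames, temperature=0.1):
    """Cycle-consistency loss over the loop frames[0..T..0] (needs >= 2 frames)."""
    path = frames + frames[-2::-1]  # walk forward, then back to the start
    walk = transition(path[0], path[1], temperature)
    for t in range(1, len(path) - 1):
        walk = matmul(walk, transition(path[t], path[t + 1], temperature))
    n = len(walk)
    # Each walker should return to the node it started from.
    return -sum(math.log(walk[i][i]) for i in range(n)) / n


def shaped_reward(env_reward, intrinsic, beta=0.1):
    """Environment reward plus scaled intrinsic reward (beta is an assumption)."""
    return env_reward + beta * intrinsic
```

When embeddings are well separated the walk returns home and the loss is near zero; when states are indistinguishable the loss rises, giving a larger curiosity bonus. The temperature controls how sharp the transitions are, consistent with the abstract's remark that it matters for performance.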

preprint | 2022 | arXiv

Programmable Unitary Operations for Orbital Angular Momentum Encoded States

We have proposed and demonstrated a scalable and efficient scheme for programmable unitary operations in the orbital angular momentum (OAM) domain. Based on matrix decomposition into diagonal and Fourier factors, arbitrary matrix operators can be implemented using only diagonal matrices acting alternately on the OAM domain and the azimuthal angle domain, which are linked by the Fourier transform. With numerical simulations, unitary matrices with dimensionality of 3×3 are designed and discussed for the OAM domain. The parallelism of our proposed scheme is also demonstrated with two 3×3 matrices. Furthermore, as an alternative way to verify our proposal, proof-of-principle experiments have been performed in the path domain with the same matrix decomposition method, in which an average fidelity of 0.97 is obtained from 80 experimental results with dimensionality of 3×3.
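The alternating structure described above can be sketched numerically: a diagonal mask in one domain, a Fourier transform into the other domain, another diagonal mask, an inverse transform back, and so on. The sketch below assumes phase-only diagonal masks and a unitary DFT as the link between domains; the sign convention and the helper names are illustrative, and finding the phases that realize a given target matrix is not attempted.

```python
import cmath
import math


def dft_matrix(n):
    """Unitary n x n discrete Fourier transform matrix."""
    scale = 1 / math.sqrt(n)
    return [[scale * cmath.exp(-2j * math.pi * r * c / n) for c in range(n)]
            for r in range(n)]


def diag_phase(phases):
    """Diagonal matrix with unit-modulus entries exp(i * phase)."""
    n = len(phases)
    return [[cmath.exp(1j * phases[r]) if r == c else 0j for c in range(n)]
            for r in range(n)]


def matmul(a, b):
    return [[sum(a[r][k] * b[k][c] for k in range(len(b)))
             for c in range(len(b[0]))] for r in range(len(a))]


def dagger(a):
    """Conjugate transpose."""
    return [[a[c][r].conjugate() for c in range(len(a))] for r in range(len(a[0]))]


def compose(phase_layers):
    """Apply diagonal layers alternately in the two Fourier-linked domains."""
    n = len(phase_layers[0])
    fourier = dft_matrix(n)
    inverse = dagger(fourier)
    u = diag_phase(phase_layers[0])
    for i, phases in enumerate(phase_layers[1:], start=1):
        link = fourier if i % 2 == 1 else inverse  # into the other domain, then back
        u = matmul(diag_phase(phases), matmul(link, u))
    return u


def is_unitary(u, tol=1e-9):
    prod = matmul(dagger(u), u)
    n = len(u)
    return all(abs(prod[r][c] - (1 if r == c else 0)) < tol
               for r in range(n) for c in range(n))


# Three layers on a 3-dimensional space: first domain, second domain, first domain.
layers = [[0.1, 0.2, 0.3], [1.0, -0.5, 0.7], [0.0, 2.0, -1.0]]
u = compose(layers)
```

Every factor is unitary, so the product is unitary for any choice of phases; programming the device then amounts to choosing the phase layers.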

preprint | 2020 | arXiv

An entanglement-based quantum network based on symmetric dispersive optics quantum key distribution

Quantum key distribution (QKD) is a crucial technology for information security in the future. Developing simple and efficient ways to establish QKD among multiple users is important for extending the applications of QKD in communication networks. Herein, we proposed a scheme of symmetric dispersive optics QKD (DO-QKD) and demonstrated an entanglement-based quantum network based on it. In the experiment, a broadband entangled photon pair source was shared by end users via wavelength and space division multiplexing. The wide spectrum of generated entangled photon pairs was divided into 16 combinations of frequency-conjugate channels. Photon pairs in each channel combination supported a fully connected subnet with 8 users via a passive beam splitter. The results showed that an entanglement-based QKD network with over 100 users could be supported by one entangled photon pair source in this architecture. It has great potential for applications in local quantum networks with large numbers of users.
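The user count can be checked with simple arithmetic. The sketch below assumes the 16 subnets are disjoint (no user belongs to two subnets), which the abstract does not state explicitly; under that reading the total is consistent with the "over 100 users" claim.

```python
from itertools import combinations

CHANNEL_COMBINATIONS = 16  # from the abstract
USERS_PER_SUBNET = 8       # from the abstract


def subnet_links(users):
    """All user pairs in a fully connected subnet."""
    return list(combinations(range(users), 2))


# Assumption: subnets do not share users.
total_users = CHANNEL_COMBINATIONS * USERS_PER_SUBNET
links_per_subnet = len(subnet_links(USERS_PER_SUBNET))
```

Each fully connected subnet of 8 users contains 8 choose 2 pairwise links, all served by the same channel combination through the passive beam splitter.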

preprint | 2020 | arXiv

Programmable coherent linear quantum operations with high-dimensional optical spatial modes

A simple and flexible scheme for high-dimensional linear quantum operations on optical transverse spatial modes is demonstrated. The quantum Fourier transformation (QFT) and quantum state tomography (QST) via symmetric informationally complete positive operator-valued measures (SIC POVMs) are implemented with dimensionality of 15. The matrix fidelity of QFT is 0.85, while the statistical fidelity of SIC POVMs and fidelity of QST are ~0.97 and up to 0.853, respectively. We believe that our device has the potential for further exploration of high-dimensional spatial entanglement provided by spontaneous parametric down conversion in nonlinear crystals.
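To make the quantities above concrete, the sketch below builds the 15-dimensional QFT matrix and evaluates one commonly used matrix fidelity, the normalized Hilbert-Schmidt overlap |Tr(T†M)|² / (Tr(T†T) Tr(M†M)). The abstract does not define its fidelity measures, so this definition is an assumption. The count of SIC POVM elements, d² for dimension d, is a standard property.

```python
import cmath
import math


def qft_matrix(d):
    """Unitary d x d quantum Fourier transform matrix."""
    scale = 1 / math.sqrt(d)
    return [[scale * cmath.exp(2j * math.pi * r * c / d) for c in range(d)]
            for r in range(d)]


def hs_inner(a, b):
    """Hilbert-Schmidt inner product Tr(a^dagger b)."""
    return sum(a[r][c].conjugate() * b[r][c]
               for r in range(len(a)) for c in range(len(a[0])))


def matrix_fidelity(target, measured):
    """Normalized overlap in [0, 1]; insensitive to global phase and scale."""
    overlap = abs(hs_inner(target, measured)) ** 2
    return overlap / (hs_inner(target, target).real * hs_inner(measured, measured).real)


D = 15
qft = qft_matrix(D)
identity = [[1.0 + 0j if r == c else 0j for c in range(D)] for r in range(D)]
num_sic_elements = D * D  # a SIC POVM in dimension d has d^2 elements
```

A measured matrix that matches the target up to an overall phase or attenuation gives a fidelity of 1 under this definition, while an unrelated matrix such as the identity scores close to 0.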

preprint | 2018 | arXiv

Universal linear optical operations on discrete phase-coherent spatial modes

Linear optical operations are fundamental and significant for both quantum mechanics and classical technologies. We demonstrate a non-cascaded approach to perform arbitrary unitary and non-unitary linear operations for N-dimensional phase-coherent spatial modes with meticulously designed phase gratings. As implemented on spatial light modulators (SLMs), the unitary transformation matrix has been realized with dimensionalities ranging from 7 to 24, and the corresponding fidelities range from 95.1% to 82.1%. For the non-unitary operators, a matrix is presented for the tomography of a 4-level quantum system with a fidelity of 94.9%. Thus, the linear operator has been successfully implemented with much higher dimensionality than in previous reports. Our method is not limited to SLMs and can be easily applied to other devices. We therefore believe that our proposal provides another option for performing linear operations with a simple, fixed, error-tolerant and scalable scheme.

preprint | 2017 | arXiv

Identifying the tilt angle and correcting the orbital angular momentum spectrum dispersion of misaligned light beam

Axis tilt of a light beam in an optical system introduces dispersion of the orbital angular momentum (OAM) spectrum. To deal with it, a two-step method is proposed and demonstrated. First, the tilt angle of the optical axis is identified using a deduced relation between the tilt angle and the variation of OAM topological charges with different reference axes, which is obtained with the help of a charge-coupled device (CCD) camera. In our experiments, the precision of the measured tilt angle is about 10^-4 rad with OAM orders of -3 to 3. Using the measured angle value, the additional phase delay due to axis tilt can be calculated, so that the dispersion of the OAM spectrum can be corrected with a simple formula even while the optical axis is not aligned. The experimental results indicate that the original OAM spectrum has been successfully extracted not only for the pure OAM state but also for superposed OAM states.