Researcher profile

Huan Yu

Huan Yu contributes to research discovery and scholarly infrastructure.

Researcher · Affiliation not imported · Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging · Verification L1 · Unclaimed author
12 works
0 followers
10 topics
4 close collaborators

Published work

12 published items

preprint · 2026 · arXiv

ARGUS: Policy-Adaptive Ad Governance via Evolving Reinforcement with Adversarial Umpiring

Online advertising governance faces significant challenges due to the non-stationary nature of regulatory policies, where emerging mandates (e.g., restrictions on education or aesthetic anxiety) create severe label inconsistencies and reasoning ambiguities in historical datasets. In this paper, we propose ARGUS, a policy-adaptive governance system that enables evolving reinforcement through multi-agent adversarial umpiring. ARGUS addresses the sparsity of new policy data by employing a three-stage framework: (1) Policy Seeding for initial perception; (2) Adversarial Label Rectification, which utilizes a "Prosecutor-Defender-Umpire" architecture to resolve conflicts between stale labels and new mandates; and (3) Latent Knowledge Discovery, which employs a tripartite dialectical discussion to unearth sophisticated, "gray-area" violations. By leveraging RAG-enhanced policy knowledge and Chain-of-Thought synthesis as dynamic rewards for reinforcement learning, ARGUS synchronizes its reasoning pathways with evolving regulations. Extensive experiments on both industrial and public datasets demonstrate that ARGUS significantly outperforms traditional fine-tuning baselines, achieving superior policy-adaptive learning with minimal gold data.

preprint · 2022 · arXiv

Anisotropic Hardy-Sobolev inequality in mixed Lorentz spaces with applications to the axisymmetric Navier-Stokes equations

In this paper, we establish several new anisotropic Hardy-Sobolev inequalities in mixed Lebesgue spaces and mixed Lorentz spaces, which cover many known corresponding results. As an application, inequalities of this type allow us to generalize some regularity criteria for the 3D axisymmetric Navier-Stokes equations.

preprint · 2022 · arXiv

Energy and helicity conservation for the generalized quasi-geostrophic equation

In this paper, we consider the 2-D generalized surface quasi-geostrophic equation with the velocity $v$ determined by $v=\mathcal{R}^{\perp}Λ^{γ-1}θ$. It is shown that the $L^p$ type energy norm of weak solutions is conserved provided $θ\in L^{p+1}(0,T; B^{\frac{γ}{3}}_{p+1, c(\mathbb{N})})$ for $0<γ<\frac{3}{2}$ or $θ\in L^{p+1}(0,T; B^{α}_{p+1,\infty})$ for any $γ-1<α<1$ with $\frac{3}{2}\leq γ<2$. Moreover, we also prove that the helicity of weak solutions satisfying $\nabla θ\in L^{3}(0,T;\dot{B}^{\frac{γ}{3}}_{3,c(\mathbb{N})})$ for $0<γ<\frac{3}{2}$ or $\nabla θ\in L^{3}(0,T; \dot{B}^{α}_{3,\infty})$ for any $γ-1<α<1$ with $\frac{3}{2}\leq γ<2$ is invariant. Therefore, the accurate relationships between the critical regularity for the energy (helicity) conservation of the weak solutions and the regularity of velocity in 2-D generalized quasi-geostrophic equation are presented.

preprint · 2022 · arXiv

Kinetic Simulation on Electron, Proton and Helium Acceleration in a Nonrelativistic Quasiparallel shock

In addition to electrons and protons, nonrelativistic quasiparallel shocks are expected to possess the ability to accelerate heavy ions. The shocks in supernova remnants are generally supposed to be accelerators of the Galactic cosmic rays, which consist of many species of particles. We investigate diffusive shock acceleration (DSA) of electrons, protons and helium ions in a nonrelativistic quasiparallel shock through 1D particle-in-cell (PIC) simulation with a helium-to-proton number density ratio of $0.1$, which is relevant for the Galactic cosmic rays. The simulation indicates that waves can be excited by the flow of the energetic protons and helium ions upstream of the nonrelativistic quasiparallel shock with a sonic Mach number of 14 and an Alfvén Mach number of 19.5 in the shock rest frame, and the charged particles are scattered by the self-generated waves and accelerated gradually. Moreover, the spectra of the charged particles downstream of the shock are thermal plus a nonthermal tail, and the acceleration is efficient with about $7\%$ and $5.4\%$ of the bulk kinetic energy transferred into the nonthermal protons and helium ions in the near downstream region at the end of the simulation, respectively.

preprint · 2021 · arXiv

GM-Livox: An Integrated Framework for Large-Scale Map Construction with Multiple Non-repetitive Scanning LiDARs

With its ability to provide direct and sufficiently accurate range measurements, light detection and ranging (LiDAR) plays an essential role in localization and detection for autonomous vehicles. Since a single LiDAR intermittently suffers from hardware failure and performance degradation, we present a multi-LiDAR integration scheme in this article. Our framework tightly couples multiple non-repetitive scanning LiDARs with inertial, encoder, and global navigation satellite system (GNSS) measurements for pose estimation and simultaneous global map generation. First, we formulate a precise synchronization strategy to integrate isolated sensors, and the feature points extracted from separate LiDARs are merged into a single sweep. The fused scans are used to compute the scan-matching correspondences, which can be further refined by additional real-time kinematic (RTK) measurements. On this basis, we construct a factor graph along with the inertial preintegration result, estimated ground constraints, and RTK data. To maintain a restricted number of poses for estimation, we deploy a keyframe-based sliding-window optimization strategy in our system. The real-time performance is guaranteed with multi-threaded computation, and extensive experiments are conducted in challenging scenarios. Experimental results show that the utilization of multiple LiDARs boosts the system performance in both robustness and accuracy.

preprint · 2021 · arXiv

Investigating the energy distribution of the high-energy particles in the Crab nebula

The Crab nebula is a prominent pulsar wind nebula (PWN) detected in multiband observations ranging from radio to very high-energy (VHE) $γ$-rays. Recently, $γ$-rays with energies above $1\,\mathrm{PeV}$ were detected by the Large High Altitude Air Shower Observatory (LHAASO), so the energy of the most energetic particles in the nebula can be constrained. In this paper, we investigate the broadest spectral energy distribution of the Crab nebula and the energy distribution of the electrons emitting the multiwavelength nonthermal emission based on a one-zone time-dependent model. The nebula is powered by the pulsar, and high-energy electrons/positrons with a broken power-law spectrum are continually injected into the nebula as the pulsar spins down. Multiwavelength nonthermal emission is generated by the leptons through synchrotron radiation and inverse Compton scattering. Using appropriate parameters, the detected fluxes for the nebula can be well reproduced, especially for the $γ$-rays from $10^2\,\mathrm{MeV}$ to $1\,\mathrm{PeV}$. The results show that the detected $γ$-rays can be produced by the leptons via inverse Compton scattering, and the lower limit of the Lorentz factor of the most energetic leptons is $\sim 8.5\times10^{9}$. It can be concluded that there are electrons/positrons with energies higher than $4.3\,\mathrm{PeV}$ in the Crab nebula.
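
The step from the quoted Lorentz factor to the quoted energy is a one-line conversion, $E = γ m_e c^2$. The sketch below is not taken from the paper; it only checks that arithmetic. The function name `lepton_energy_pev` is hypothetical, and the electron rest energy is the standard value of about 0.511 MeV.

```python
# Electron rest energy m_e c^2 in eV (standard value, about 0.511 MeV).
ELECTRON_REST_ENERGY_EV = 0.51099895e6


def lepton_energy_pev(lorentz_factor: float) -> float:
    """Return the total energy E = gamma * m_e c^2 of an electron/positron in PeV.

    Hypothetical helper for illustration; not part of the paper's model.
    """
    energy_ev = lorentz_factor * ELECTRON_REST_ENERGY_EV
    return energy_ev / 1e15  # 1 PeV = 1e15 eV


if __name__ == "__main__":
    # Lower limit on the Lorentz factor quoted in the abstract.
    print(f"{lepton_energy_pev(8.5e9):.2f} PeV")  # prints "4.34 PeV"
```

A Lorentz factor of $8.5\times10^{9}$ therefore corresponds to roughly $4.3\,\mathrm{PeV}$, consistent with the energy stated in the abstract.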

preprint · 2021 · arXiv

Reinforcement Learning versus PDE Backstepping and PI Control for Congested Freeway Traffic

We develop reinforcement learning (RL) boundary controllers to mitigate stop-and-go traffic congestion on a freeway segment. The traffic dynamics of the freeway segment are governed by a macroscopic Aw-Rascle-Zhang (ARZ) model, consisting of $2\times 2$ quasi-linear partial differential equations (PDEs) for traffic density and velocity. Boundary stabilization of the linearized ARZ PDE model has been solved by PDE backstepping, guaranteeing spatial $L^2$ norm regulation of the traffic state to uniform density and velocity and ensuring that traffic oscillations are suppressed. Collocated Proportional (P) and Proportional-Integral (PI) controllers also provide stability guarantees under certain restricted conditions, and are always applicable as model-free control options through gain tuning by trial and error, or by model-free optimization. Although these approaches are mathematically elegant, the stabilization result only holds locally and is usually affected by changes in the model parameters. Therefore, we reformulate the PDE boundary control problem as an RL problem that pursues stabilization without knowing the system dynamics, simply by observing the state values. Proximal policy optimization, a neural network-based policy gradient algorithm, is employed to obtain RL controllers by interacting with a numerical simulator of the ARZ PDE. Being stabilization-inspired, the RL state-feedback boundary controllers are compared and evaluated against the rigorously stabilizing controllers in two cases: (i) in a system with perfect knowledge of the traffic flow dynamics, and then (ii) in one with only partial knowledge. We obtain RL controllers that nearly recover the performance of the backstepping, P, and PI controllers with perfect knowledge and outperform them in some cases with partial knowledge.
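
As a rough illustration of the P and PI feedback laws mentioned in the abstract, the sketch below applies them to a toy scalar plant, $\dot{x} = -x + u$. This is an assumption made purely for illustration: it is not the ARZ PDE, not the paper's boundary controller, and the gains, step size, and the function name `simulate_pi` are all hypothetical.

```python
def simulate_pi(kp: float = 2.0, ki: float = 1.0, setpoint: float = 1.0,
                dt: float = 0.01, steps: int = 5000) -> float:
    """Simulate PI control of the toy plant x' = -x + u with forward Euler.

    Returns the plant state after `steps` time steps. Setting ki = 0 gives
    a pure proportional (P) controller.
    """
    x = 0.0         # plant state
    integral = 0.0  # running integral of the tracking error
    for _ in range(steps):
        error = setpoint - x
        integral += error * dt
        u = kp * error + ki * integral  # PI control law
        x += dt * (-x + u)              # forward-Euler step of the toy plant
    return x


if __name__ == "__main__":
    print(f"P only : {simulate_pi(ki=0.0):.3f}")  # settles near kp / (1 + kp)
    print(f"PI     : {simulate_pi():.3f}")        # settles near the setpoint
```

On this toy plant the proportional controller alone leaves a steady-state offset, while the integral term removes it. This is the usual reason for tuning both gains, whether by trial and error or by optimization.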

preprint · 2020 · arXiv

An Extension of Calderón-Zygmund type singular integral

In this paper, we consider a kind of singular integral which can be viewed as an extension of the classical Calderón-Zygmund type singular integral. We establish an estimate of the singular integral in the $L^q$ space for $1<q<\infty$. In particular, the Calderón-Zygmund estimate can be recovered from our estimate. The proof of our main result is via the so-called "geometric approach", which was applied in \cite{CP} to the $L^q$ estimate of elliptic equations and in \cite{LW,Wang} to a new proof of the Calderón-Zygmund estimate.

preprint · 2020 · arXiv

Modular Transfer Learning with Transition Mismatch Compensation for Excessive Disturbance Rejection

Underwater robots in shallow waters usually suffer from strong wave forces, which may frequently exceed the robot's control constraints. Learning-based controllers are suitable for disturbance rejection control, but the excessive disturbances heavily affect the state transition in a Markov Decision Process (MDP) or Partially Observable Markov Decision Process (POMDP). Also, pure learning procedures on the targeted system may encounter damaging exploratory actions or unpredictable system variations, and training exclusively on a prior model usually cannot address model mismatch from the targeted system. In this paper, we propose a transfer learning framework that adapts a control policy for excessive disturbance rejection of an underwater robot under dynamics model mismatch. A modular network of learning policies is applied, composed of a Generalized Control Policy (GCP) and an Online Disturbance Identification Model (ODI). GCP is first trained over a wide array of disturbance waveforms. ODI then learns to use past states and actions of the system to predict the disturbance waveforms, which are provided as input to GCP (along with the system state). A transfer reinforcement learning algorithm using Transition Mismatch Compensation (TMC) is developed based on the modular architecture; it learns an additional compensatory policy by minimizing the mismatch of transitions predicted by the two dynamics models of the source and target tasks. We demonstrate on a pose regulation task in simulation that TMC is able to successfully reject the disturbances and stabilize the robot under an empirical model of the robot system, while also improving sample efficiency.

preprint · 2020 · arXiv

Suppression of Oscillations in Two-Class Traffic by Full-State Feedback

This paper develops a full-state feedback controller that damps out oscillations in traffic density and traffic velocity whose dynamical behavior is governed by the linearized two-class Aw-Rascle (AR) model. The traffic is considered to be in the congested regime and is subdivided into two classes, where each class represents vehicles with the same size and driver behavior. The macroscopic second-order two-class AR model consists of four first-order hyperbolic partial differential equations (PDEs) and introduces a concept of area occupancy to depict the mixed density of two-class vehicles in the traffic. Moreover, the linearized model equations show heterodirectional behavior with both positive and negative characteristic speeds in the congested regime. The control objective is to achieve convergence to a constant equilibrium in finite time. The control input is realized by ramp metering acting at the outlet of the considered track section. The backstepping method is employed to design full-state feedback for the $4\times 4$ hyperbolic PDEs. The performance of the full-state feedback controller is verified by simulation.

preprint · 2019 · arXiv

Early acceleration of electrons and protons at the nonrelativistic quasiparallel shocks with different obliquity angles

The early acceleration of protons and electrons in nonrelativistic collisionless shocks with three obliquities is investigated through 1D particle-in-cell simulations. In the simulations, the charged particles possessing a velocity of $0.2\, c$ flow towards a reflecting boundary, and shocks with a sonic Mach number of $13.4$ and an Alfvén Mach number of $16.5$ in the downstream shock frame are generated. In these quasi-parallel shocks with the obliquity angles $θ= 15^\circ$, $30^\circ$, and $45^\circ$, some of the protons and the electrons can be injected into the acceleration processes, and their downstream spectra in the momentum space show a power law tail at a time of $1.89\times10^5 ω_{\rm pe}^{-1}$, where $ω_{\rm pe}$ is the electron plasma frequency. Moreover, the charged particles reflected at the shock excite magnetic waves upstream of the shock. The shock drift acceleration is more prominent with a larger obliquity angle for the shocks, but the accelerated particles diffuse parallel to the shock propagation direction more easily to participate in the diffusive shock acceleration. At this time, still in the early acceleration stage, more energetic protons and electrons appear downstream of the shock for $θ= 15^\circ$ compared with the other two obliquities; moreover, in the upstream region, the spectrum of the accelerated electrons is the hardest for $θ_{\rm nB} = 45^\circ$ among the three obliquities, whereas the proton spectra for $θ_{\rm nB} = 15^\circ$ and $45^\circ$ are similar as a result of the competition between the effectiveness of the shock drift acceleration and that of the diffusive shock acceleration.

preprint · 2019 · arXiv

Numerically investigating the morphology of the supernova remnant SN 1006 in the ambient medium with a density discontinuity

Multiband observations of the type Ia supernova remnant SN 1006 indicate peculiar properties in its emission morphology in the radio, optical and X-ray bands. In the hard X-rays, the remnant is bilateral with two opposite bright limbs with prominent protrusions. Moreover, a filament has been detected at radio, optical and soft X-ray wavelengths. The reason for these peculiar features in the morphology of the remnant is investigated using 3D HD simulations. With the assumption that the supernova ejecta evolves in an ambient medium with a density discontinuity, the radius of the remnant's boundary is smaller in the tenuous medium, and the shell consists of two hemispheres with different radii. Along particular lines of sight, protrusions appear on the periphery of the remnant since the emission from the edge of the hemisphere with the larger radius is located outside that from the shell of the smaller hemisphere. Furthermore, the northwest filament of SN 1006 arises as a result of the intersection of the line of sight and the shocked material near the edges of the two hemispheres. It can be concluded that the protrusions on the northeast and southwest limbs and the northwest filament in the morphology of SN 1006 can be reproduced by the remnant interacting with a medium with a density discontinuity.