Researcher profile

Miao Wang

Miao Wang contributes to research discovery and scholarly infrastructure.

Researcher · Affiliation not imported · Open to collaborate

Trust snapshot

Quick read

Trust 21 (Emerging) · Verification L1 · Unclaimed author
8 works
0 followers
10 topics
4 close collaborators

Published work

8 published items

preprint · 2026 · arXiv

Reward-Decomposed Reinforcement Learning for Immersive Video Role-Playing

Text-based role-playing models can imitate character styles, yet they often fail to reflect a scene's atmosphere and evolving tension, both essential for immersive applications such as Virtual Reality (VR) games and interactive narratives. We study video-grounded role-playing dialogue and introduce EBM-RL (Eye-Brain-Mouth Reinforcement Learning), a decoupled GRPO-based framework that explicitly separates observation ([perception]), reasoning ([think]), and utterance ([answer]). This structure promotes human-like sensory grounding by compelling the model to first attend to visual cues, then form internal interpretations, and finally generate context-appropriate dialogue. EBM-RL integrates four complementary rewards: (i) CLIP-based scene-text alignment to improve ambiance and emotion; (ii) a Perceptual-Cognitive reward that encourages [perception] and [think] processes that increase the likelihood of the reference response; (iii) answer accuracy to ensure faithfulness; and (iv) a dense format reward to enforce the desired structured output. Extensive experiments demonstrate that EBM-RL substantially outperforms text-only role-playing baselines and larger-scale vision-language models on our immersive role-playing benchmark, delivering simultaneous gains in visual-atmosphere consistency and character authenticity. Beyond the role-playing domain, EBM-RL also exhibits strong zero-shot generalization: without any additional fine-tuning, it consistently improves performance on out-of-domain VideoQA benchmarks. We additionally release an open-source dataset for video-grounded role-playing dialogue.
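
The way several reward terms might feed a GRPO-style update can be sketched as follows. This is a minimal illustration, not the authors' implementation: the reward names, the equal weights, and the example scores are assumptions, and only the group-relative standardization of rewards reflects the general GRPO recipe.

```python
from statistics import mean, pstdev

# Hypothetical weights for the four reward terms named in the abstract.
# The weighting actually used by EBM-RL is not stated here.
WEIGHTS = {
    "scene_alignment": 1.0,       # CLIP-based scene-text alignment
    "perceptual_cognitive": 1.0,  # reward on [perception] and [think]
    "accuracy": 1.0,              # answer accuracy
    "format": 1.0,                # structured-output format reward
}


def total_reward(components, weights=WEIGHTS):
    """Weighted sum of the reward components for one sampled response."""
    return sum(weights[name] * components[name] for name in weights)


def group_relative_advantages(rewards, eps=1e-8):
    """Standardize rewards within one group of responses to the same prompt."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Made-up scores for three responses sampled for the same prompt.
group = [
    {"scene_alignment": 0.5, "perceptual_cognitive": 0.5, "accuracy": 1.0, "format": 1.0},
    {"scene_alignment": 0.25, "perceptual_cognitive": 0.25, "accuracy": 0.0, "format": 1.0},
    {"scene_alignment": 0.75, "perceptual_cognitive": 0.5, "accuracy": 1.0, "format": 0.5},
]
rewards = [total_reward(c) for c in group]
advantages = group_relative_advantages(rewards)
```

Responses that score above the group mean receive positive advantages and are reinforced; those below the mean are discouraged.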

preprint · 2026 · arXiv

SC-MAS: Constructing Cost-Efficient Multi-Agent Systems with Edge-Level Heterogeneous Collaboration

Large Language Model (LLM)-based Multi-Agent Systems (MAS) enhance complex problem solving through multi-agent collaboration, but often incur substantially higher costs than single-agent systems. Recent MAS routing methods aim to balance performance and overhead by dynamically selecting agent roles and language models. However, these approaches typically rely on a homogeneous collaboration mode, where all agents follow the same interaction pattern, limiting collaboration flexibility across different roles. Motivated by Social Capital Theory, which emphasizes that different roles benefit from distinct forms of collaboration, we propose SC-MAS, a framework for constructing heterogeneous and cost-efficient multi-agent systems. SC-MAS models MAS as directed graphs, where edges explicitly represent pairwise collaboration strategies, allowing different agent pairs to interact through tailored communication patterns. Given an input query, a unified controller progressively constructs an executable MAS by selecting task-relevant agent roles, assigning edge-level collaboration strategies, and allocating appropriate LLM backbones to individual agents. Experiments on multiple benchmarks demonstrate the effectiveness of SC-MAS. In particular, SC-MAS improves accuracy by 3.35% on MMLU while reducing inference cost by 15.38%, and achieves a 3.53% accuracy gain with a 12.13% cost reduction on MBPP. These results validate the feasibility of SC-MAS and highlight the effectiveness of heterogeneous collaboration in multi-agent systems.
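
The directed-graph view described in this abstract can be illustrated with a small data structure in which each agent carries a backbone and each edge carries its own collaboration strategy. This is only a sketch under stated assumptions: the class name, role names, strategy labels, and backbone labels are hypothetical, and the topological ordering assumes the graph is acyclic, which the abstract does not state.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter


@dataclass
class MultiAgentSystem:
    # role name -> LLM backbone assigned to that agent
    backbones: dict = field(default_factory=dict)
    # (sender, receiver) -> collaboration strategy used on that edge
    edges: dict = field(default_factory=dict)

    def add_agent(self, role, backbone):
        self.backbones[role] = backbone

    def connect(self, sender, receiver, strategy):
        if sender not in self.backbones or receiver not in self.backbones:
            raise KeyError("both agents must be added before connecting them")
        self.edges[(sender, receiver)] = strategy

    def execution_order(self):
        # Assumption: the graph is acyclic, so agents can run in topological order.
        predecessors = {role: set() for role in self.backbones}
        for sender, receiver in self.edges:
            predecessors[receiver].add(sender)
        return list(TopologicalSorter(predecessors).static_order())


# Hypothetical roles, backbones, and strategies, for illustration only.
mas = MultiAgentSystem()
mas.add_agent("planner", "large-model")
mas.add_agent("coder", "small-model")
mas.add_agent("reviewer", "small-model")
mas.connect("planner", "coder", "instruct")
mas.connect("coder", "reviewer", "debate")
mas.connect("planner", "reviewer", "summarize")
order = mas.execution_order()
```

Storing the strategy on the edge rather than on the agent is what lets two pairs that share an agent interact in different ways.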

preprint · 2022 · arXiv

C3-STISR: Scene Text Image Super-resolution with Triple Clues

Scene text image super-resolution (STISR) is regarded as an important pre-processing task for recognizing text in low-resolution scene text images. Most recent approaches use the recognizer's feedback as a clue to guide super-resolution. However, directly using the recognition clue has two problems: 1) Compatibility: it takes the form of a probability distribution and therefore has an obvious modal gap with STISR, a pixel-level task; 2) Inaccuracy: it usually contains wrong information, which misleads the main task and degrades super-resolution performance. In this paper, we present a novel method, C3-STISR, that jointly exploits the recognizer's feedback, visual information, and linguistic information as clues to guide super-resolution. Here, the visual clue comes from images of the texts predicted by the recognizer, which are informative and more compatible with the STISR task, while the linguistic clue is generated by a pre-trained character-level language model, which is able to correct the predicted texts. We design effective extraction and fusion mechanisms for the triple cross-modal clues to generate comprehensive and unified guidance for super-resolution. Extensive experiments on TextZoom show that C3-STISR outperforms the SOTA methods in fidelity and recognition performance. Code is available at https://github.com/zhaominyiz/C3-STISR.

preprint · 2022 · arXiv

Measurement of infrared magic wavelength for an all-optical trapping of $^{40}$Ca$^{+}$ ion clock

For the first time, we experimentally determine the infrared magic wavelength for the $^{40}$Ca$^{+}$ $4s\, ^{2}\!S_{1/2} \rightarrow 3d\,^{2}\!D_{5/2}$ electric quadrupole transition by observing the cancellation of the light shift in a $^{40}$Ca$^{+}$ optical clock. A "magic" magnetic field direction is chosen to make the magic wavelength insensitive to both the linear polarization purity and the polarization direction of the laser. The determined magic wavelength for this transition is 1056.37(9)~nm, which is not only in good agreement with theoretical predictions but also more precise by a factor of about 300. Using this measured magic wavelength, we also derive the differential static polarizability to be $-44.32(32)$~a.u., which will be an important input for evaluating the blackbody radiation shift at room temperature. Our work paves the way for all-optical trapping of a $^{40}$Ca$^{+}$ optical clock.

preprint · 2020 · arXiv

An ACE/CRIS-observation-based Galactic Cosmic Rays heavy nuclei spectra model II

An observation-based Galactic Cosmic Ray (GCR) spectral model for heavy nuclei is developed. Zhao and Qin (J. Geophys. Res. Space Phys. 118, 1837 (2013)) proposed an empirical elemental GCR spectra model for nuclear charge 5-28 over the energy range from 30 to 500 MeV/nuc, which has proved successful in predicting yearly averaged GCR heavy nuclei spectra. Based on the latest highly statistically precise measurements from ACE/CRIS, a further elemental GCR model with monthly averaged spectra is presented. The model can reproduce past GCR intensity and predict future GCR intensity monthly by correlating model parameters with the continuous sunspot number (SSN) record. The effects of solar activity on GCR modulation are considered separately for odd and even solar cycles. Compared with other comprehensive GCR models, our modeling results are satisfyingly consistent with the GCR spectral measurements from ACE/SIS and IMP-8, and have prediction accuracy comparable to that of the Badhwar & O'Neill 2014 model. A detailed error analysis is also provided. Finally, the GCR carbon and iron nuclei fluxes for the subsequent two solar cycles (SC 25 and 26) are predicted, and they show a potential trend of reduced flux amplitude, which is suspected to be related to possible weak solar cycles.

preprint · 2020 · arXiv

Energy and Information Management of Electric Vehicular Network: A Survey

The connected vehicle paradigm empowers vehicles with the capability to communicate with neighboring vehicles and infrastructure, shifting the role of vehicles from a transportation tool to an intelligent service platform. Meanwhile, transportation electrification pushes forward electric vehicle (EV) commercialization to reduce the greenhouse gas emissions from petroleum combustion. The unstoppable trends of connected vehicles and EVs transform the traditional vehicular system into an electric vehicular network (EVN), a clean, mobile, and safe system. However, due to the mobility and heterogeneity of the EVN, improper management of the network could result in charging overload and data congestion. Thus, energy and information management of the EVN should be carefully studied. In this paper, we provide a comprehensive survey on the deployment and management of the EVN, considering all three aspects of energy flow, data communication, and computation. We first introduce the management framework of the EVN. Then, research works on EV aggregator (AG) deployment are reviewed to provide energy and information infrastructure for the EVN. Based on the deployed AGs, we review research on EV scheduling, which includes both charging and vehicle-to-grid (V2G) scheduling. Moreover, related works on information communication and computing are surveyed under each scenario. Finally, we discuss open research issues in the EVN.

preprint · 2020 · arXiv

SREC: Proactive Self-Remedy of Energy-Constrained UAV-Based Networks via Deep Reinforcement Learning

Energy-aware control for multiple unmanned aerial vehicles (UAVs) is one of the major research interests in UAV-based networking. Yet few existing works have focused on how the network should react around the time the UAV lineup changes. In this work, we study proactive self-remedy of energy-constrained UAV networks when one or more UAVs are short of energy and about to quit for charging. We target an energy-aware optimal UAV control policy that proactively relocates the UAVs when any UAV is about to quit the network, rather than passively dispatching the remaining UAVs after the quit. Specifically, a deep reinforcement learning (DRL)-based self-remedy approach, named SREC-DRL, is proposed to maximize the accumulated user satisfaction scores for a certain period within which at least one UAV will quit the network. To handle the continuous state and action space in the problem, the state-of-the-art actor-critic DRL algorithm, i.e., deep deterministic policy gradient (DDPG), is applied with better convergence stability. Numerical results demonstrate that, compared with the passive reaction method, the proposed SREC-DRL approach shows a $12.12\%$ gain in accumulative user satisfaction score during the remedy period.

preprint · 2019 · arXiv

Perturbed Field Ionization for Improved State Selectivity

Selective field ionization is used to determine the state or distribution of states to which a Rydberg atom is excited. By evolving a small perturbation to the ramped electric field using a genetic algorithm, the shape of the time-resolved ionization signal can be controlled. This allows for separation of signals from pairs of states that would be indistinguishable with unperturbed selective field ionization. Measurements and calculations are presented that demonstrate this technique and shed light on how the perturbation directs the pathway of the electron to ionization. Pseudocode for the genetic algorithm is provided. Using the improved resolution afforded by this technique, quantitative measurements of the $36p_{3/2}+36p_{3/2}\rightarrow 36s_{1/2}+37s_{1/2}$ dipole-dipole interaction are made.
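
The idea of evolving a small perturbation with a genetic algorithm can be sketched generically as below. This is not the pseudocode from the paper: the function names, population settings, and especially `toy_fitness` are assumptions for illustration. In the experiment the fitness would come from the measured ionization signal, whereas here it is a stand-in that simply rewards matching a fixed target pattern.

```python
import random


def evolve(fitness, n_genes=8, pop_size=20, generations=30,
           mutation_sigma=0.1, seed=0):
    """Evolve a list of perturbation amplitudes that maximizes `fitness`."""
    rng = random.Random(seed)
    population = [[rng.uniform(-1.0, 1.0) for _ in range(n_genes)]
                  for _ in range(pop_size)]
    history = []  # best fitness seen in each generation
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        history.append(fitness(ranked[0]))
        parents = ranked[: pop_size // 2]   # truncation selection
        children = [ranked[0][:]]           # elitism: keep the best unchanged
        while len(children) < pop_size:
            mother, father = rng.sample(parents, 2)
            # Uniform crossover followed by Gaussian mutation of every gene.
            child = [rng.choice(pair) for pair in zip(mother, father)]
            child = [g + rng.gauss(0.0, mutation_sigma) for g in child]
            children.append(child)
        population = children
    best = max(population, key=fitness)
    history.append(fitness(best))
    return best, history


def toy_fitness(genes):
    """Stand-in fitness: negative squared distance to an arbitrary target."""
    target = [0.5, -0.5] * (len(genes) // 2)
    return -sum((g - t) ** 2 for g, t in zip(genes, target))


best, history = evolve(toy_fitness)
```

Because the best individual is copied unchanged into the next generation, the best fitness recorded in `history` never decreases.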