Researcher profile

Rui Mao

Rui Mao contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 19 - UnverifiedVerification L1Unclaimed author
5works
0followers
4topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

5 published item(s)

preprint2026arXiv

AudioFace: Language-Assisted Speech-Driven Facial Animation with Multimodal Language Models

Speech-driven facial animation requires accurate correspondence between acoustic signals and facial motion, especially for articulation-related mouth movements. However, directly mapping speech audio to facial coefficients often overlooks the linguistic and phonetic structure underlying speech production. In this paper, we propose AudioFace, a language-assisted framework for speech-driven blendshape generation that treats mouth-related facial coefficient prediction as a structured generation problem guided by linguistic and articulatory information. Instead of relying solely on acoustic features, our method leverages the prior knowledge of multimodal large language models and introduces transcript- and phoneme-level cues to bridge speech signals with interpretable facial actions. Extensive experiments show that AudioFace achieves superior performance across multiple evaluation metrics, validating the effectiveness of language-assisted and multimodal-prior-guided speech-driven facial animation.

preprint2026arXiv

LLMdoctor: Token-Level Flow-Guided Preference Optimization for Efficient Test-Time Alignment of Large Language Models

Aligning Large Language Models (LLMs) with human preferences is critical, yet traditional fine-tuning methods are computationally expensive and inflexible. While test-time alignment offers a promising alternative, existing approaches often rely on distorted trajectory-level signals or inefficient sampling, fundamentally capping performance and failing to preserve the generative diversity of the base model. This paper introduces LLMdoctor, a novel framework for efficient test-time alignment that operates via a patient-doctor paradigm. It integrates token-level reward acquisition with token-level flow-guided preference optimization (TFPO) to steer a large, frozen patient LLM with a smaller, specialized doctor model. Unlike conventional methods that rely on trajectory-level rewards, LLMdoctor first extracts fine-grained, token-level preference signals from the patient model's behavioral variations. These signals then guide the training of the doctor model via TFPO, which establishes flow consistency across all subtrajectories, enabling precise token-by-token alignment while inherently preserving generation diversity. Extensive experiments demonstrate that LLMdoctor significantly outperforms existing test-time alignment methods and even surpasses the performance of full fine-tuning approaches like DPO.

preprint2026arXiv

MAXS: Meta-Adaptive Exploration with LLM Agents

Large Language Model (LLM) Agents exhibit inherent reasoning abilities through the collaboration of multiple tools. However, during agent inference, existing methods often suffer from (i) locally myopic generation, due to the absence of lookahead, and (ii) trajectory instability, where minor early errors can escalate into divergent reasoning paths. These issues make it difficult to balance global effectiveness and computational efficiency. To address these two issues, we propose meta-adaptive exploration with LLM agents https://github.com/exoskeletonzj/MAXS, a meta-adaptive reasoning framework based on LLM Agents that flexibly integrates tool execution and reasoning planning. MAXS employs a lookahead strategy to extend reasoning paths a few steps ahead, estimating the advantage value of tool usage, and combines step consistency variance and inter-step trend slopes to jointly select stable, consistent, and high-value reasoning steps. Additionally, we introduce a trajectory convergence mechanism that controls computational cost by halting further rollouts once path consistency is achieved, enabling a balance between resource efficiency and global effectiveness in multi-tool reasoning. We conduct extensive empirical studies across three base models (MiMo-VL-7B, Qwen2.5-VL-7B, Qwen2.5-VL-32B) and five datasets, demonstrating that MAXS consistently outperforms existing methods in both performance and inference efficiency. Further analysis confirms the effectiveness of our lookahead strategy and tool usage.

preprint2025arXiv

QAOA-MaxCut has barren plateaus for almost all graphs

The QAOA has been the subject of intense study over recent years, yet the corresponding Dynamical Lie Algebra (DLA)--a key indicator of the expressivity and trainability of VQAs--remains poorly understood beyond highly symmetric instances. An exponentially scaling DLA dimension is associated with the presence of so-called barren plateaus (BP) in the optimization landscape, which renders training intractable. In this work, we investigate the DLA of QAOA applied to the canonical MaxCut, for both weighted and unweighted graphs. For weighted graphs, we show that when the weights are drawn from a continuous distribution, the DLA dimension grows as $Θ(4^n)$ almost surely for all connected graphs except paths and cycles. In the more common unweighted setting, we show that asymptotically all but an exponentially vanishing fraction of graphs have $Θ(4^n)$ large DLA dimension. The entire simple Lie algebra decomposition of the corresponding DLAs is also identified, from which we prove that the variance of the loss function is $O(1/2^n)$, implying that QAOA on these weighted and unweighted graphs all suffers from BP. Moreover, we give explicit constructions for families of graphs whose DLAs have exponential dimension, including cases whose MaxCut is in $\mathsf P$. Our proof of the unweighted case is based on a number of splitting lemmas and DLA-freeness conditions that allow one to convert prohibitively complicated Lie algebraic problems into amenable graph theoretic problems. These form the basis for a new algorithm that computes such DLAs orders of magnitude faster than previous methods, reducing runtimes from days to seconds on standard hardware. We apply this algorithm to MQLib, a classical MaxCut benchmark suite covering over 3,500 instances with up to 53,130 vertices, and find that, ignoring edge weights, at least 75% of the instances possess a DLA of dimension at least $2^{128}$.

preprint2020arXiv

Quantum coherence and spin nematic to nematic quantum phase transitions in biquadratic spin-1 and -2 XY chains with rhombic single-ion anisotropy

We investigate quantum phase transitions and quantum coherence in infinite biquadratic spin-1 and -2 XY chains with rhombic single-ion anisotropy. All considered coherence measures such as the $l_1$ norm of coherence, the relative entropy of coherence, and the quantum Jensen-Shannon divergence, and the quantum mutual information show consistently that singular behaviors occur for the spin-1 system, which enables to identity quantum phase transitions. For the spin-2 system, the relative entropy of coherence and the quantum mutual information properly detect no singular behavior in the whole system parameter range, while the $l_1$ norm of coherence and the quantum Jensen-Shannon divergence show a conflicting singular behavior of their first-order derivatives. Examining local magnetic moments and spin quadrupole moments lead to the explicit identification of novel orderings of spin quadrupole moments with zero magnetic moments in the whole parameter space. We find the three uniaxial spin nematic quadrupole phases for the spin-1 system and the two biaxial spin nematic phases for the spin-2 system. For the spin-2 system, the two orthogonal biaxial spin nematic states are connected adiabatically without an explicit phase transition, which can be called quantum crossover. The quantum crossover region is estimated by using the quantum fidelity. Whereas for the spin-1 system, the two discontinuous quantum phase transitions occur between three distinct uniaxial spin nematic phases. We discuss the quantum coherence measures and the quantum mutual information in connection with the quantum phase transitions including the quantum crossover.