Source author record

Shuai Shao

Shuai Shao appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

physics.soc-ph, Computer Vision, Social and Information Networks, Computational Complexity, Information Theory, math.IT, physics.data-an, Artificial Intelligence, cond-mat.stat-mech, eess.AS, eess.SP, Machine Learning, quant-ph, Sound

Catalog footprint

What is connected

15 works

14 topics

4 close collaborators

preprint, 2026, arXiv

Extreme Value Policy Optimization for Safe Reinforcement Learning

Ensuring safety is a critical challenge in applying Reinforcement Learning (RL) to real-world scenarios. Constrained Reinforcement Learning (CRL) addresses this by maximizing returns under predefined constraints, typically formulated as the expected cumulative cost. However, expectation-based constraints overlook rare but high-impact extreme value events in the tail distribution, such as black swan incidents, which can lead to severe constraint violations. To address this issue, we propose the Extreme Value policy Optimization (EVO) algorithm, leveraging Extreme Value Theory (EVT) to model and exploit extreme reward and cost samples, reducing constraint violations. EVO introduces an extreme quantile optimization objective to explicitly capture extreme samples in the cost tail distribution. Additionally, we propose an extreme prioritization mechanism during replay, amplifying the learning signal from rare but high-impact extreme samples. Theoretically, we establish upper bounds on expected constraint violations during policy updates, guaranteeing strict constraint satisfaction at a zero-violation quantile level. Further, we demonstrate that EVO achieves a lower probability of constraint violations than expectation-based methods and exhibits lower variance than quantile regression methods. Extensive experiments show that EVO significantly reduces constraint violations during training while maintaining competitive policy performance compared to baselines.

preprint, 2026, arXiv

MMSkills: Towards Multimodal Skills for General Visual Agents

Reusable skills have become a core substrate for improving agent capabilities, yet most existing skill packages encode reusable behavior primarily as textual prompts, executable code, or learned routines. For visual agents, however, procedural knowledge is inherently multimodal: reuse depends not only on what operation to perform, but also on recognizing the relevant state, interpreting visual evidence of progress or failure, and deciding what to do next. We formalize this requirement as multimodal procedural knowledge and address three practical challenges: (I) what a multimodal skill package should contain; (II) where such packages can be derived from public interaction experience; and (III) how agents can consult multimodal evidence at inference time without excessive image context or over-anchoring to reference screenshots. We introduce MMSkills, a framework for representing, generating, and using reusable multimodal procedures for runtime visual decision making. Each MMSkill is a compact, state-conditioned package that couples a textual procedure with runtime state cards and multi-view keyframes. To construct these packages, we develop an agentic trajectory-to-skill Generator that transforms public non-evaluation trajectories into reusable multimodal skills through workflow grouping, procedure induction, visual grounding, and meta-skill-guided auditing. To use them, we introduce a branch-loaded multimodal skill agent: selected state cards and keyframes are inspected in a temporary branch, aligned with the live environment, and distilled into structured guidance for the main agent. Experiments across GUI and game-based visual-agent benchmarks show that MMSkills consistently improve both frontier and smaller multimodal agents, suggesting that external multimodal procedural knowledge complements model-internal priors.

preprint, 2026, arXiv

Rotate Your Character: Revisiting Video Diffusion Models for High-Quality 3D Character Generation

Generating high-quality 3D characters from single images remains a significant challenge in digital content creation, particularly due to complex body poses and self-occlusion. In this paper, we present RCM (Rotate your Character Model), an advanced image-to-video diffusion framework tailored for high-quality novel view synthesis (NVS) and 3D character generation. Compared to existing diffusion-based approaches, RCM offers several key advantages: (1) transferring characters with any complex poses into a canonical pose, enabling consistent novel view synthesis across the entire viewing orbit, (2) high-resolution orbital video generation at 1024x1024 resolution, (3) controllable observation positions given different initial camera poses, and (4) multi-view conditioning supporting up to 4 input images, accommodating diverse user scenarios. Extensive experiments demonstrate that RCM outperforms state-of-the-art methods in both novel view synthesis and 3D generation quality.

preprint, 2026, arXiv

TAGRPO: Boosting GRPO on Image-to-Video Generation with Direct Trajectory Alignment

Recent studies have demonstrated the efficacy of integrating Group Relative Policy Optimization (GRPO) into flow matching models, particularly for text-to-image and text-to-video generation. However, we find that directly applying these techniques to image-to-video (I2V) models often fails to yield consistent reward improvements. To address this limitation, we present TAGRPO, a robust post-training framework for I2V models inspired by contrastive learning. Our approach is grounded in the observation that rollout videos generated from identical initial noise provide superior guidance for optimization. Leveraging this insight, we propose a novel GRPO loss applied to intermediate latents, encouraging direct alignment with high-reward trajectories while maximizing distance from low-reward counterparts. Furthermore, we introduce a memory bank for rollout videos to enhance diversity and reduce computational overhead. Despite its simplicity, TAGRPO achieves significant improvements over DanceGRPO in I2V generation.

preprint, 2022, arXiv

A Conformer-based Waveform-domain Neural Acoustic Echo Canceller Optimized for ASR Accuracy

Acoustic Echo Cancellation (AEC) is essential for accurate recognition of queries spoken to a smart speaker that is playing out audio. Previous work has shown that a neural AEC model operating on log-mel spectral features (denoted "logmel" hereafter) can greatly improve Automatic Speech Recognition (ASR) accuracy when optimized with an auxiliary loss utilizing a pre-trained ASR model encoder. In this paper, we develop a conformer-based waveform-domain neural AEC model inspired by the "TasNet" architecture. The model is trained by jointly optimizing Negative Scale-Invariant SNR (SISNR) and ASR losses on a large speech dataset. On a realistic rerecorded test set, we find that cascading a linear adaptive AEC and a waveform-domain neural AEC is very effective, giving 56-59% word error rate (WER) reduction over the linear AEC alone. On this test set, the 1.6M parameter waveform-domain neural AEC also improves over a larger 6.5M parameter logmel-domain neural AEC model by 20-29% in easy to moderate conditions. By operating on smaller frames, the waveform neural model is able to perform better at smaller sizes and is better suited for applications where memory is limited.
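The abstract above names negative scale-invariant SNR (SISNR) as one of the two jointly optimized losses. As a rough illustration of that quantity, here is a minimal plain-Python sketch using the common zero-mean, projection-based definition. The function names and the `eps` stabilizer are illustrative assumptions rather than details from the paper, and the ASR loss term is not modeled.

```python
import math

def si_snr_db(estimate, target, eps=1e-8):
    """Scale-invariant SNR in dB between two equal-length signals (lists of floats)."""
    n = len(target)
    # Remove the mean of each signal so that DC offsets do not affect the score.
    mean_e = sum(estimate) / n
    mean_t = sum(target) / n
    e = [x - mean_e for x in estimate]
    t = [x - mean_t for x in target]
    # Project the estimate onto the target; the projection is the "signal" part.
    dot = sum(a * b for a, b in zip(e, t))
    target_energy = sum(b * b for b in t)
    scale = dot / (target_energy + eps)
    projection = [scale * b for b in t]
    # Whatever is left after the projection counts as "noise".
    residual = [a - p for a, p in zip(e, projection)]
    signal_power = sum(p * p for p in projection)
    noise_power = sum(r * r for r in residual)
    return 10.0 * math.log10((signal_power + eps) / (noise_power + eps))

def neg_si_snr_loss(estimate, target):
    """Loss to minimize: a higher SI-SNR gives a lower loss."""
    return -si_snr_db(estimate, target)
```

Because the estimate is projected onto the target before the ratio is taken, rescaling the estimate by a constant leaves the score essentially unchanged, which is the property the "scale-invariant" name refers to.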

preprint, 2022, arXiv

Birds of A Feather Flock Together: Category-Divergence Guidance for Domain Adaptive Segmentation

Unsupervised domain adaptation (UDA) aims to enhance the generalization capability of a model from a source domain to a target domain. Present UDA models focus on alleviating the domain shift by minimizing the feature discrepancy between the source domain and the target domain but usually ignore the class confusion problem. In this work, we propose an Inter-class Separation and Intra-class Aggregation (ISIA) mechanism. It encourages cross-domain representation consistency within the same category and differentiation among different categories. In this way, the features belonging to the same categories are aligned together and the confusable categories are separated. By measuring the alignment complexity of each category, we design an Adaptive-weighted Instance Matching (AIM) strategy to further optimize the instance-level adaptation. Based on our proposed methods, we also propose a hierarchical unsupervised domain adaptation framework for the cross-domain semantic segmentation task. By performing image-level, feature-level, category-level and instance-level alignment, our method achieves stronger generalization of the model from the source domain to the target domain. In two typical cross-domain semantic segmentation tasks, i.e., GTA5 to Cityscapes and SYNTHIA to Cityscapes, our method achieves state-of-the-art segmentation accuracy. We also build two cross-domain semantic segmentation datasets based on publicly available data, i.e., remote sensing building segmentation and road segmentation, for domain adaptive segmentation.

preprint, 2020, arXiv

A Dichotomy for Real Boolean Holant Problems

We prove a complexity dichotomy for Holant problems on the boolean domain with arbitrary sets of real-valued constraint functions. These constraint functions need not be symmetric nor do we assume any auxiliary functions as in previous results. It is proved that for every set $\mathcal{F}$ of real-valued constraint functions, Holant$(\mathcal{F})$ is either P-time computable or #P-hard. The classification has an explicit criterion. This is the culmination of much research on this problem, and it uses previous results and techniques from many researchers. Some particularly intriguing concrete functions $f_6$, $f_8$ and their associated families with extraordinary closure properties related to Bell states in quantum information theory play an important role in this proof.
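For readers unfamiliar with the setting, a Holant problem on a graph sums, over all 0/1 assignments to the edges, the product of each vertex's constraint function evaluated on its incident edges. The brute-force sketch below illustrates this standard definition on tiny instances only. It runs in exponential time and says nothing about the dichotomy itself; the helper names `holant` and `exact_one` are hypothetical.

```python
from itertools import product

def holant(num_edges, vertices):
    """Brute-force Holant value.

    vertices: list of (incident_edge_indices, f), where f maps a tuple of
    0/1 edge values to a real number (the vertex's constraint function).
    """
    total = 0.0
    for sigma in product((0, 1), repeat=num_edges):
        term = 1.0
        for incident, f in vertices:
            term *= f(tuple(sigma[e] for e in incident))
            if term == 0.0:
                break  # this assignment contributes nothing
        total += term
    return total

def exact_one(bits):
    """Constraint that is 1 when exactly one incident edge is selected, else 0."""
    return 1.0 if sum(bits) == 1 else 0.0

# 4-cycle with edges 0:(v0,v1), 1:(v1,v2), 2:(v2,v3), 3:(v3,v0).
cycle4 = [((0, 3), exact_one), ((0, 1), exact_one),
          ((1, 2), exact_one), ((2, 3), exact_one)]
```

With `exact_one` at every vertex, the Holant value counts perfect matchings, so `holant(4, cycle4)` counts the perfect matchings of the 4-cycle.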

preprint, 2020, arXiv

From Holant to Quantum Entanglement and Back

Holant problems are intimately connected with quantum theory as tensor networks. We first use techniques from Holant theory to derive new and improved results for quantum entanglement theory. We discover two particular entangled states $|{Ψ_6}\rangle$ of 6 qubits and $|{Ψ_8}\rangle$ of 8 qubits respectively, that have extraordinary and unique closure properties in terms of the Bell property. Then we use entanglement properties of constraint functions to derive a new complexity dichotomy for all real-valued Holant problems containing an odd-arity signature. The signatures need not be symmetric, and no auxiliary signatures are assumed.

preprint, 2016, arXiv

Localized attack on clustering networks

Clustering networks are a class of complex networks whose structure and cascading processes have attracted considerable study. We primarily analyze how the clustering coefficient affects several network models built from single clustering networks under localized attack. These models, which include the double clustering network, the star-like network of networks (NON) with clustering, and the random regular (RR) NON of ER networks with clustering, consist of at least two interdependent networks whose degree of dependence is measured by the coupling strength. We show, both analytically and numerically, how the coupling strength and the clustering coefficient affect the percolation threshold, the size of the giant component, and the critical coupling point at which the phase transition changes from second order to first order as the coupling strength between the networks increases. Finally, we study two types of clustering network: in the first, as in the double clustering network, every subnetwork has an identical degree distribution; in the second, the subnetworks have different degree distributions. The former type is treated both analytically and numerically, while the latter is treated only numerically. In each section, we compare the results obtained under localized attack with those under random attack, following Shao et al. [22].

preprint, 2016, arXiv

On the Dual of the Coulter-Matthews Bent Functions

For any bent function, it is very interesting to determine its dual function because the dual function is also bent in certain cases. For $k$ odd and $\gcd(n, k)=1$, it is known that the Coulter-Matthews bent function $f(x)=Tr(ax^{\frac{3^k+1}{2}})$ is weakly regular bent over $\mathbb{F}_{3^n}$, where $a\in\mathbb{F}_{3^n}^{*}$, and $Tr(\cdot):\mathbb{F}_{3^n}\rightarrow\mathbb{F}_3$ is the trace function. In this paper, we investigate the dual function of $f(x)$, and derive a universal formula. In particular, for two cases, we determine the formula explicitly: for the case of $n=3t+1$ and $k=2t+1$ with $t\geq 2$, the dual function is given by $$Tr\left(-\frac{x^{3^{2t+1}+3^{t+1}+2}}{a^{3^{2t+1}+3^{t+1}+1}}-\frac{x^{3^{2t}+1}}{a^{-3^{2t}+3^{t}+1}}+\frac{x^{2}}{a^{-3^{2t+1}+3^{t+1}+1}}\right);$$ and for the case of $n=3t+2$ and $k=2t+1$ with $t\geq 2$, the dual function is given by $$Tr\left(-\frac{x^{3^{2t+2}+1}}{a^{3^{2t+2}-3^{t+1}+3}}-\frac{x^{2\cdot3^{2t+1}+3^{t+1}+1}}{a^{3^{2t+2}+3^{t+1}+1}}+\frac{x^2}{a^{-3^{2t+2}+3^{t+1}+3}}\right).$$ As a byproduct, we find two new classes of ternary bent functions with only three terms. Moreover, we also prove that in certain cases $f(x)$ is regular bent.
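As background for the terms "weakly regular bent" and "dual" used above, the usual textbook definitions can be written as follows for a function $g:\mathbb{F}_{p^n}\rightarrow\mathbb{F}_p$ with $\omega=e^{2\pi i/p}$ (here $p=3$). The symbols $g$, $\widehat{g}$, $g^{*}$ and $\epsilon$ are notation assumed for this note, and sign conventions may differ from those in the paper.

```latex
% Walsh transform of g at b \in \mathbb{F}_{p^n}
\widehat{g}(b) \;=\; \sum_{x\in\mathbb{F}_{p^n}} \omega^{\,g(x)-Tr(bx)}

% g is bent if every Walsh coefficient has the same magnitude
\bigl|\widehat{g}(b)\bigr|^{2} \;=\; p^{n} \quad \text{for all } b\in\mathbb{F}_{p^n}

% g is weakly regular bent, with dual g^{*}, if for one constant \epsilon with |\epsilon| = 1
\widehat{g}(b) \;=\; \epsilon\, p^{n/2}\, \omega^{\,g^{*}(b)} \quad \text{for all } b\in\mathbb{F}_{p^n}

% g is regular bent in the special case \epsilon = 1
```

Under these definitions, determining the dual amounts to finding an explicit expression for $g^{*}$, which is what the two trace formulas in the abstract provide for the Coulter-Matthews function.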

preprint, 2015, arXiv

The influence of the broadness of the degree distribution on network's robustness: comparing localized attack and random attack

The stability of networks is greatly influenced by their degree distributions and in particular by their broadness. Networks with broader degree distributions are usually more robust to random failures but less robust to localized attacks. To better understand the effect of the broadness of the degree distribution we study here two models where the broadness is controlled and compare their robustness against localized attacks (LA) and random attacks (RA). We study analytically and by numerical simulations the cases where the degrees in the networks follow a Bi-Poisson distribution $P(k)=αe^{-λ_1}\frac{λ_1^k}{k!}+(1-α) e^{-λ_2}\frac{λ_2^k}{k!},α\in[0,1]$, and a Gaussian distribution $P(k)=A \cdot \exp\left(-\frac{(k-μ)^2}{2σ^2}\right)$ with a normalization constant $A$ where $k\geq 0$. In the Bi-Poisson distribution the broadness is controlled by the values of $α$, $λ_1$ and $λ_2$, while in the Gaussian distribution it is controlled by the standard deviation, $σ$. We find that only for $α=0$ or $α=1$, namely degrees obeying a pure Poisson distribution, LA and RA are the same, but for all other cases networks are more vulnerable under LA compared to RA. For the Gaussian distribution, with the average degree $μ$ fixed, we find that when $σ^2$ is smaller than $μ$ the network is more vulnerable against random attack. However, when $σ^2$ is larger than $μ$ the network becomes more vulnerable against localized attack. Similar qualitative results are also shown for interdependent networks.
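To make the Bi-Poisson formula above concrete, the sketch below evaluates $P(k)$ numerically and computes its mean and variance. Using the variance as a proxy for "broadness" is an assumption of this note, not a definition taken from the paper; the function names and the truncation point `k_max` are likewise illustrative. For a two-component Poisson mixture, the mean is $αλ_1+(1-α)λ_2$ and the variance equals the mean plus $α(1-α)(λ_1-λ_2)^2$, so the variance equals the mean exactly when the mixture collapses to a single Poisson.

```python
import math

def poisson_pmf(k, lam):
    """Poisson probability computed in log space to avoid overflow for large k."""
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))

def bi_poisson_pmf(k, alpha, lam1, lam2):
    """P(k) = alpha * Poisson(k; lam1) + (1 - alpha) * Poisson(k; lam2)."""
    return alpha * poisson_pmf(k, lam1) + (1.0 - alpha) * poisson_pmf(k, lam2)

def degree_moments(alpha, lam1, lam2, k_max=200):
    """Mean and variance of the degree distribution, truncated at k_max."""
    mean = 0.0
    second = 0.0
    for k in range(k_max + 1):
        p = bi_poisson_pmf(k, alpha, lam1, lam2)
        mean += k * p
        second += k * k * p
    return mean, second - mean * mean
```

For example, `degree_moments(0.5, 2.0, 8.0)` can be compared against `degree_moments(1.0, 2.0, 8.0)` to see how mixing two Poisson components widens the distribution relative to a pure Poisson one.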

preprint, 2014, arXiv

Percolation of localized attack on complex networks

The robustness of complex networks against node failure and malicious attack has been of interest for decades, while most of the research has focused on random attack or hub-targeted attack. In many real-world scenarios, however, attacks are neither random nor hub-targeted, but localized, where a group of neighboring nodes in a network are attacked and fail. In this paper we develop a percolation framework to analytically and numerically study the robustness of complex networks against such localized attack. In particular, we investigate this robustness in Erdős-Rényi networks, random-regular networks, and scale-free networks. Our results provide insight into how to better protect networks, enhance cybersecurity, and facilitate the design of more robust infrastructures.
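The difference between a localized attack and a random attack can be sketched as a small simulation. The code below is a simplified illustration under stated assumptions, not the paper's percolation framework: the localized attack removes nodes in breadth-first order around a random root (restarting from a new root if a component is exhausted), the graph is a plain Erdős-Rényi sample, and all function names are hypothetical.

```python
import random
from collections import deque

def er_graph(n, avg_degree, rng):
    """Erdős-Rényi graph as a list of neighbor sets, with edge probability avg_degree/(n-1)."""
    p = avg_degree / (n - 1)
    adj = [set() for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                adj[i].add(j)
                adj[j].add(i)
    return adj

def localized_attack(adj, num_remove, rng):
    """Remove a root and then its neighbors, shell by shell, in BFS order."""
    removed, seen = [], set()
    roots = list(range(len(adj)))
    rng.shuffle(roots)
    for root in roots:
        if len(removed) >= num_remove:
            break
        if root in seen:
            continue
        seen.add(root)
        queue = deque([root])
        while queue and len(removed) < num_remove:
            u = queue.popleft()
            removed.append(u)
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    queue.append(v)
    return set(removed)

def random_attack(adj, num_remove, rng):
    """Remove num_remove nodes chosen uniformly at random."""
    return set(rng.sample(range(len(adj)), num_remove))

def giant_component_size(adj, removed):
    """Size of the largest connected component among surviving nodes."""
    seen = set(removed)
    best = 0
    for start in range(len(adj)):
        if start in seen:
            continue
        seen.add(start)
        queue, size = deque([start]), 0
        while queue:
            u = queue.popleft()
            size += 1
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    queue.append(v)
        best = max(best, size)
    return best
```

A typical use is to build one graph with `er_graph(1000, 4.0, random.Random(0))`, apply both attacks with the same `num_remove`, and compare `giant_component_size` for the two removed sets over several seeds.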

preprint, 2013, arXiv

Robustness of partially interdependent network formed of clustered networks

Clustering, or transitivity, has been observed in real networks, and its effects on their structure and function have been discussed extensively. The focus of these studies has been on clustering of single networks, while the effect of clustering on the robustness of coupled networks has received very little attention. Only the case of a pair of fully coupled networks with clustering has been studied recently. Here we generalize the study of clustering of a fully coupled pair of networks to the study of partially interdependent networks of networks with clustering within the network components. We show, both analytically and numerically, how clustering within the networks affects the percolation properties of interdependent networks, including the percolation threshold, the size of the giant component and the critical coupling point where the first order phase transition changes to a second order phase transition as the coupling between the networks is reduced. We study two types of clustering: one type proposed by Newman, where the average degree is kept constant while changing the clustering, and the other proposed by Hackett et al., where the degree distribution is kept constant. The first type of clustering is treated both analytically and numerically, while the second one is treated only numerically.

preprint, 2013, arXiv

The Proof of Lin's Conjecture via the Decimation-Hadamard Transform

In 1998, Lin presented a conjecture on a class of ternary sequences with ideal 2-level autocorrelation in his Ph.D. thesis. Those sequences have a very simple structure, i.e., their trace representation has two trace monomial terms. In this paper, we present a proof of the conjecture. The mathematical tools employed are the second-order multiplexing decimation-Hadamard transform, Stickelberger's theorem, the Teichmüller character, and combinatorial techniques for enumerating the Hamming weights of ternary numbers. As a by-product, we also prove that the ternary sequences conjectured by Lin are Hadamard equivalent to ternary $m$-sequences.

preprint, 2012, arXiv

The robustness of interdependent clustered networks

It was recently found that cascading failures can cause the abrupt breakdown of a system of interdependent networks. Using the percolation method developed for single clustered networks by Newman [Phys. Rev. Lett. 103, 058701 (2009)], we develop an analytical method for studying how clustering within the networks of a system of interdependent networks affects the system's robustness. We find that clustering significantly increases the vulnerability of the system, which is represented by the increased value of the percolation threshold $p_c$ in interdependent networks.
