Researcher profile

Hang Wang

Hang Wang contributes to research discovery and scholarly infrastructure.

Researcher
Affiliation not imported
Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
14 works
0 followers
10 topics
4 close collaborators

Published work

14 published items

Preprint, 2026, arXiv

AtlasVA: Self-Evolving Visual Skill Memory for Teacher-Free VLM Agents

Vision-language model (VLM) agents increasingly rely on memory-augmented reinforcement learning to reuse experience across long-horizon tasks, yet most existing frameworks store memory as text and depend on proprietary teacher models to summarize or refine it. This design is poorly matched to spatial decision making: geometric priors are compressed into lossy language, and sparse interaction is often supervised through delayed textual feedback rather than dense visually grounded signals. We argue that reusable experience for VLM agents should remain visually grounded. Based on this insight, we propose AtlasVA, a teacher-free visual skill memory framework that organizes memory into three complementary layers: spatial heatmaps, visual exemplars, and symbolic text skills. AtlasVA further evolves danger and affinity atlases directly from trajectory statistics and lightweight grid heuristics, and reuses these self-evolving atlases as potential-based shaping rewards for reinforcement learning. This unifies perception, memory, and optimization without external LLM supervision. Experiments on Sokoban, FrozenLake, 3D embodied navigation, and 3D robotic manipulation benchmarks show that AtlasVA consistently outperforms text-centric memory baselines and competitive VLM agents, with especially strong gains on spatially intensive tasks. Homepage: https://wangpan-ustc.github.io/AtlasvaWeb
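The abstract mentions reusing danger and affinity atlases as potential-based shaping rewards. The following is a minimal sketch of the general potential-based shaping form, F(s, s') = gamma * Phi(s') - Phi(s), and not the authors' implementation. The dictionary-based atlases, the choice Phi = affinity - danger, and the function names `potential` and `shaped_reward` are all assumptions made for illustration.

```python
from typing import Dict, Tuple

Cell = Tuple[int, int]  # hypothetical grid coordinate (row, col)


def potential(cell: Cell, affinity: Dict[Cell, float], danger: Dict[Cell, float]) -> float:
    """Assumed potential: high where the affinity atlas is high, low where danger is high."""
    return affinity.get(cell, 0.0) - danger.get(cell, 0.0)


def shaped_reward(
    env_reward: float,
    cell: Cell,
    next_cell: Cell,
    affinity: Dict[Cell, float],
    danger: Dict[Cell, float],
    gamma: float = 0.99,
) -> float:
    """Add the potential-based shaping term gamma * Phi(s') - Phi(s) to the environment reward."""
    phi_s = potential(cell, affinity, danger)
    phi_next = potential(next_cell, affinity, danger)
    return env_reward + gamma * phi_next - phi_s


# Toy atlases: one attractive cell and one dangerous cell.
affinity = {(0, 1): 1.0}
danger = {(1, 0): 2.0}

toward_goal = shaped_reward(0.0, (0, 0), (0, 1), affinity, danger, gamma=1.0)
toward_hazard = shaped_reward(0.0, (0, 0), (1, 0), affinity, danger, gamma=1.0)
```

With gamma set to 1, the shaping terms telescope along any trajectory that returns to its starting cell, which is why this form of shaping is known to leave the optimal policy unchanged.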

Preprint, 2026, arXiv

CMTA: Leveraging Cross-Modal Temporal Artifacts for Generalizable AI-Generated Video Detection

The proliferation of advanced AI video synthesis techniques poses an unprecedented challenge to digital video authenticity. Existing AI-generated video (AIGV) detection methods primarily focus on uni-modal or spatiotemporal artifacts, but they overlook the rich cues within the visual-textual cross-modal space, especially the temporal stability of semantic alignment. In this work, we identify a distinctive fingerprint in AIGVs, termed cross-modal temporal artifact (CMTA). Unlike real videos that exhibit natural temporal fluctuations in cross-modal alignment due to semantic variations, AIGVs display unnaturally stable semantic trajectories governed by given input prompts. To bridge this gap, we propose the CMTA framework, a cross-modal detection approach that captures these unique temporal artifacts through joint cross-modal embedding and multi-grained temporal modeling. Specifically, CMTA leverages BLIP to generate frame-level image captions and utilizes CLIP to extract corresponding visual-textual representations. A coarse-grained temporal modeling branch is then designed to characterize temporal fluctuations in cross-modal alignment with a GRU. In parallel, a fine-grained branch is constructed to capture intricate inter-frame variations from integrated visual-textual features with a Transformer encoder. Extensive experiments on 40 subsets across four large-scale datasets, including GenVideo, EvalCrafter, VideoPhy, and VidProM, validate that our approach sets a new state-of-the-art while exhibiting superior cross-generator generalization. Code and models of CMTA will be released at https://github.com/hwang-cs-ime/CMTA
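The core observation above, that cross-modal alignment fluctuates over time in real videos but stays unusually stable in generated ones, can be illustrated with a simple statistic. This sketch assumes per-frame image and caption embeddings have already been extracted (for example with CLIP and BLIP, as the abstract describes) and are passed in as plain lists of floats. It substitutes a standard deviation for the paper's GRU and Transformer branches, so it shows only the intuition, not the CMTA model; all function names are hypothetical.

```python
import math
from typing import List, Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two vectors; returns 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def alignment_trajectory(
    frame_embs: Sequence[Sequence[float]],
    caption_embs: Sequence[Sequence[float]],
) -> List[float]:
    """Per-frame similarity between each frame embedding and its caption embedding."""
    return [cosine(f, c) for f, c in zip(frame_embs, caption_embs)]


def temporal_fluctuation(trajectory: Sequence[float]) -> float:
    """Population standard deviation of the alignment trajectory over time."""
    mean = sum(trajectory) / len(trajectory)
    return math.sqrt(sum((x - mean) ** 2 for x in trajectory) / len(trajectory))


# Toy example: a perfectly stable trajectory versus a fluctuating one.
stable = alignment_trajectory([[1.0, 0.0], [1.0, 0.0]], [[1.0, 0.0], [1.0, 0.0]])
varying = alignment_trajectory([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [1.0, 0.0]])
```

Under the abstract's hypothesis, a low `temporal_fluctuation` value would be weak evidence that a clip is generated; a real detector would learn this decision from data instead of thresholding a single statistic.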

Preprint, 2022, arXiv

A Review on Serious Games in E-learning

E-learning is a widely used learning method, but as society has developed, traditional E-learning methods have exposed some shortcomings, such as a dull teaching style that makes it difficult to engage students and hold their attention in class. The application of serious games in E-learning can make up for these shortcomings and effectively improve the quality of teaching. When applying serious games to E-learning, there are two main considerations: educational goals and game design. A successful serious game should organically combine the two aspects and balance the educational and entertaining nature of serious games. This paper mainly discusses the role of serious games in E-learning, various elements of game design, the classification of the educational goals of serious games and the relationship between educational goals and game design. In addition, we try to classify serious games and match educational goals with game types to provide guidance and assistance in the design of serious games. This paper also summarizes some shortcomings that serious games may have in the application of E-learning.

Preprint, 2022, arXiv

Anomaly Detection of Adversarial Examples using Class-conditional Generative Adversarial Networks

Deep Neural Networks (DNNs) have been shown to be vulnerable to Test-Time Evasion attacks (TTEs, or adversarial examples), which, by making small changes to the input, alter the DNN's decision. We propose an unsupervised attack detector on DNN classifiers based on class-conditional Generative Adversarial Networks (GANs). We model the distribution of clean data conditioned on the predicted class label by an Auxiliary Classifier GAN (AC-GAN). Given a test sample and its predicted class, three detection statistics are calculated based on the AC-GAN Generator and Discriminator. Experiments on image classification datasets under various TTE attacks show that our method outperforms previous detection methods. We also investigate the effectiveness of anomaly detection using different DNN layers (input features or internal-layer features) and demonstrate, as one might expect, that anomalies are harder to detect using features closer to the DNN's output layer.

Preprint, 2022, arXiv

Graphical Modeling for Multi-Source Domain Adaptation

Multi-Source Domain Adaptation (MSDA) focuses on transferring the knowledge from multiple source domains to the target domain, which is a more practical and challenging problem compared to the conventional single-source domain adaptation. In this problem, it is essential to model multiple source domains and target domain jointly, and an effective domain combination scheme is also highly required. The graphical structure among different domains is useful to tackle these challenges, in which the interdependency among various instances/categories can be effectively modeled. In this work, we propose two types of graphical models, i.e. Conditional Random Field for MSDA (CRF-MSDA) and Markov Random Field for MSDA (MRF-MSDA), for cross-domain joint modeling and learnable domain combination. In a nutshell, given an observation set composed of a query sample and the semantic prototypes (i.e. representative category embeddings) on various domains, the CRF-MSDA model seeks to learn the joint distribution of labels conditioned on the observations. We attain this goal by constructing a relational graph over all observations and conducting local message passing on it. By comparison, MRF-MSDA aims to model the joint distribution of observations over different Markov networks via an energy-based formulation, and it can naturally perform label prediction by summing the joint likelihoods over several specific networks. Compared to the CRF-MSDA counterpart, the MRF-MSDA model is more expressive and possesses lower computational cost. We evaluate these two models on four standard benchmark data sets of MSDA with distinct domain shift and data complexity, and both models achieve superior performance over existing methods on all benchmarks. In addition, the analytical studies illustrate the effect of different model components and provide insights about how the cross-domain joint modeling performs.

Preprint, 2022, arXiv

K-homology and K-theory of pure Braid groups

We produce an explicit description of the K-theory and K-homology of the pure braid group on $n$ strands. We describe the Baum--Connes correspondence between the generators of the left- and right-hand sides for $n=4$. Using functoriality of the assembly map and direct computations, we recover Oyono-Oyono's result on the Baum--Connes conjecture for pure braid groups. We also discuss the case of the full braid group $B_3$.

Preprint, 2022, arXiv

Topological K-theory for discrete groups and Index theory

We give a complete solution, for discrete countable groups, to the problem of defining and computing a geometric pairing between the left-hand side of the Baum-Connes assembly map, given in terms of geometric cycles associated to proper actions on manifolds, and cyclic periodic cohomology of the group algebra. Indeed, for any such group $Γ$ (without any further assumptions on it) we construct an explicit morphism from the left-hand side of the Baum-Connes assembly map to the periodic cyclic homology of the group algebra. This morphism, called here the Chern-Baum-Connes assembly map, allows us to give a proper and explicit formulation for a Chern-Connes pairing with the periodic cyclic cohomology of the group algebra. Several theorems are needed to formulate the Chern-Baum-Connes assembly map. In particular we establish a delocalised Riemann-Roch theorem, the wrong way functoriality for periodic delocalised cohomology for $Γ$-proper actions, the construction of a Chern morphism between the left-hand side of Baum-Connes and a delocalised cohomology group associated to $Γ$ which is an isomorphism once tensoring with $\mathbb{C}$, and the construction of an explicit cohomological assembly map between the delocalised cohomology group associated to $Γ$ and the homology group $H_*(Γ,FΓ)$. We then give an index theoretical formula for the above mentioned pairing (for any $Γ$) in terms of pairings of invariant forms, associated to geometric cycles and given in terms of delocalised Chern and Todd classes, and currents naturally associated to group cocycles using Burghelea's computation. As part of our results we prove that the left-hand side group used in this paper is isomorphic to the usual analytic model for the left-hand side of the assembly map.

Preprint, 2020, arXiv

An equivariant Atiyah-Patodi-Singer index theorem for proper actions II: the $K$-theoretic index

Consider a proper, isometric action by a unimodular locally compact group $G$ on a Riemannian manifold $M$ with boundary, such that $M/G$ is compact. Then an equivariant Dirac-type operator $D$ on $M$ under a suitable boundary condition has an equivariant index $\operatorname{index}_G(D)$ in the $K$-theory of the reduced group $C^*$-algebra $C^*_rG$ of $G$. This is a common generalisation of the Baum-Connes analytic assembly map and the (equivariant) Atiyah-Patodi-Singer index. In part I of this series, a numerical index $\operatorname{index}_g(D)$ was defined for an element $g \in G$, in terms of a parametrix of $D$ and a trace associated to $g$. An Atiyah-Patodi-Singer type index formula was obtained for this index. In this paper, we show that, under certain conditions, $τ_g(\operatorname{index}_G(D)) = \operatorname{index}_g(D)$, for a trace $τ_g$ defined by the orbital integral over the conjugacy class of $g$. This implies that the index theorem from part I yields information about the $K$-theoretic index $\operatorname{index}_G(D)$. It also shows that $\operatorname{index}_g(D)$ is a homotopy-invariant quantity.

Preprint, 2020, arXiv

An equivariant orbifold index for proper actions

For a proper, cocompact action by a locally compact group of the form $H \times G$, with $H$ compact, we define an $H \times G$-equivariant index of $H$-transversally elliptic operators, which takes values in $KK_*(C^*H, C^*G)$. This simultaneously generalises the Baum--Connes analytic assembly map, Atiyah's index of transversally elliptic operators, and Kawasaki's orbifold index. This index also generalises the assembly map to elliptic operators on orbifolds. In the special case where the manifold in question is a real semisimple Lie group, $G$ is a cocompact lattice and $H$ is a maximal compact subgroup, we realise the Dirac induction map from the Connes--Kasparov conjecture as a Kasparov product and obtain an index theorem for Spin-Dirac operators on compact locally symmetric spaces.

Preprint, 2020, arXiv

Cross-domain Detection via Graph-induced Prototype Alignment

Applying the knowledge of an object detector trained on a specific domain directly to a new domain is risky, as the gap between the two domains can severely degrade the model's performance. Furthermore, since different instances commonly embody distinct modal information in object detection scenarios, feature alignment between the source and target domains is hard to realize. To mitigate these problems, we propose a Graph-induced Prototype Alignment (GPA) framework to seek category-level domain alignment via elaborate prototype representations. In a nutshell, more precise instance-level features are obtained through graph-based information propagation among region proposals, and, on this basis, the prototype representation of each class is derived for category-level domain alignment. In addition, in order to alleviate the negative effect of class imbalance on domain adaptation, we design a Class-reweighted Contrastive Loss to harmonize the adaptation training process. Combined with Faster R-CNN, the proposed framework conducts feature alignment in a two-stage manner. Comprehensive results on various cross-domain detection tasks demonstrate that our approach outperforms existing methods by a remarkable margin. Our code is available at https://github.com/ChrisAllenMing/GPA-detection.
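To make the idea of category-level alignment via prototypes concrete, here is a small sketch, assuming a prototype is simply the unweighted mean of the instance features of a class and that the domain gap for a class is the squared Euclidean distance between its source and target prototypes. The GPA paper derives prototypes after graph-based propagation and trains with a contrastive loss, neither of which is reproduced here; `class_prototypes` and `prototype_gap` are hypothetical names.

```python
from collections import defaultdict
from typing import Dict, Hashable, List, Sequence


def class_prototypes(
    features: Sequence[Sequence[float]],
    labels: Sequence[Hashable],
) -> Dict[Hashable, List[float]]:
    """Average the feature vectors of each class to get one prototype per class."""
    sums: Dict[Hashable, List[float]] = {}
    counts: Dict[Hashable, int] = defaultdict(int)
    for feat, label in zip(features, labels):
        if label not in sums:
            sums[label] = [0.0] * len(feat)
        sums[label] = [s + x for s, x in zip(sums[label], feat)]
        counts[label] += 1
    return {label: [s / counts[label] for s in total] for label, total in sums.items()}


def prototype_gap(
    source: Dict[Hashable, List[float]],
    target: Dict[Hashable, List[float]],
) -> Dict[Hashable, float]:
    """Squared Euclidean distance between prototypes of classes present in both domains."""
    shared = source.keys() & target.keys()
    return {c: sum((a - b) ** 2 for a, b in zip(source[c], target[c])) for c in shared}


# Toy source and target features with class labels.
src = class_prototypes([[0.0, 0.0], [2.0, 2.0], [1.0, 0.0]], ["car", "car", "person"])
tgt = class_prototypes([[1.0, 3.0]], ["car"])
gap = prototype_gap(src, tgt)
```

Minimizing such per-class gaps pulls same-class prototypes from the two domains together, which is the intuition behind category-level alignment.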

Preprint, 2020, arXiv

Learning to Combine: Knowledge Aggregation for Multi-Source Domain Adaptation

Transferring knowledge learned from multiple source domains to a target domain is a more practical and challenging task than conventional single-source domain adaptation. Furthermore, the increase of modalities brings more difficulty in aligning feature distributions among multiple domains. To mitigate these problems, we propose a Learning to Combine for Multi-Source Domain Adaptation (LtC-MSDA) framework via exploring interactions among domains. In a nutshell, a knowledge graph is constructed on the prototypes of various domains to realize the information propagation among semantically adjacent representations. On this basis, a graph model is learned to predict query samples under the guidance of correlated prototypes. In addition, we design a Relation Alignment Loss (RAL) to facilitate the consistency of categories' relational interdependency and the compactness of features, which boosts features' intra-class invariance and inter-class separability. Comprehensive results on public benchmark datasets demonstrate that our approach outperforms existing methods by a remarkable margin. Our code is available at https://github.com/ChrisAllenMing/LtC-MSDA

Preprint, 2018, arXiv

Positive Scalar Curvature and Poincare Duality for Proper Actions

For G an almost-connected Lie group, we study G-equivariant index theory for proper co-compact actions with various applications, including obstructions to and existence of G-invariant Riemannian metrics of positive scalar curvature. We prove a rigidity result for almost-complex manifolds, generalising Hattori's results, and an analogue of Petrie's conjecture. When G is an almost-connected Lie group or a discrete group, we establish Poincare duality between G-equivariant K-homology and K-theory, observing that Poincare duality does not necessarily hold for general G.