Researcher profile

Nenggan Zheng

Nenggan Zheng contributes to research discovery and scholarly infrastructure.

Researcher | Affiliation not imported | Open to collaborate

Trust snapshot

Quick read

Trust 13 - Unverified | Verification L1 | Unclaimed author
2 works
0 followers
2 topics
4 close collaborators

Published work

2 published items

Preprint | 2026 | arXiv

PIDNet: Progressive Implicit Decouple Network for Multimodal Action Quality Assessment

Action quality assessment (AQA) aims to automatically quantify the execution quality of human actions in videos and is valuable for applications such as competitive sports judging. In multimodal AQA, quality evidence from different modalities is heterogeneous, and quality cues evolve progressively over time. Existing methods often rely on coarse fusion or unified temporal modeling, which may blur modality-specific cues, preserve cross-modal redundancy, and weaken stage-specific quality evidence. To address these issues, we propose a progressive implicit decoupling and fusion network (PIDNet) that progressively integrates modality-specific information, cross-modal complementary cues, and global quality semantics for accurate assessment. Specifically, we design an iMambaWave module that maps RGB, optical flow, and audio features into a shared latent space and disentangles them with a Bi-Mamba branch and a wavelet-transform branch to capture long-range temporal dependencies and local perturbation details, respectively. A gated aggregation mechanism adaptively fuses temporal and frequency-domain information. We further build a three-stage progressive fusion network using Group3M blocks, where modality complementary attention retrieves cross-modal evidence while suppressing redundancy, and multi-scale convolutions enrich feature representations. Experiments on the Rhythmic Gymnastics and Fis-V datasets show that PIDNet achieves highly competitive score correlation with favorable error control compared with existing unimodal and multimodal methods. Ablation studies verify the effectiveness of each component. Moreover, iMambaWave consistently improves visual representation and temporal modeling across multiple backbones, showing good generalization and plug-and-play capability.
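The abstract mentions a gated aggregation mechanism that adaptively fuses temporal and frequency-domain information. The sketch below illustrates one common form of gated fusion, a sigmoid gate computed from the concatenated branches and used as a per-element mixing weight. It is an assumption-laden illustration, not PIDNet's implementation: the function name `gated_fuse`, the single linear gate, the shapes, and the random inputs are all hypothetical.

```python
import numpy as np

def gated_fuse(temporal, frequency, weight, bias):
    """Mix two (T, D) feature maps with a learned sigmoid gate.

    Hypothetical sketch: the gate is one linear layer over the concatenated
    branches; the paper's actual gating design may differ.
    """
    concat = np.concatenate([temporal, frequency], axis=-1)   # (T, 2D)
    gate = 1.0 / (1.0 + np.exp(-(concat @ weight + bias)))    # (T, D), values in (0, 1)
    # Per-element convex combination of the two branches.
    return gate * temporal + (1.0 - gate) * frequency

rng = np.random.default_rng(0)
T, D = 5, 4                                  # time steps, feature width (illustrative)
temporal = rng.normal(size=(T, D))           # stand-in for the temporal branch output
frequency = rng.normal(size=(T, D))          # stand-in for the wavelet branch output
weight = rng.normal(size=(2 * D, D)) * 0.1   # untrained gate weights
bias = np.zeros(D)

fused = gated_fuse(temporal, frequency, weight, bias)
```

Because the gate lies strictly between 0 and 1, each fused value falls between the corresponding temporal and frequency values; with all-zero gate weights the result reduces to their plain average.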

Preprint | 2022 | arXiv

SIGMA: A Structural Inconsistency Reducing Graph Matching Algorithm

Graph matching finds the correspondence of nodes across two correlated graphs and lies at the core of many applications. When graph side information is not available, the node correspondence is estimated on the sole basis of network topologies. In this paper, we propose a novel criterion to measure the graph matching accuracy, structural inconsistency (SI), which is defined based on the network topological structure. Specifically, SI incorporates the heat diffusion wavelet to accommodate the multi-hop structure of the graphs. Based on SI, we propose a Structural Inconsistency reducing Graph Matching Algorithm (SIGMA), which improves the alignment scores of node pairs that have low SI values in each iteration. Under suitable assumptions, SIGMA can reduce SI values of true counterparts. Furthermore, we demonstrate that SIGMA can be derived by using a mirror descent method to solve the Gromov-Wasserstein distance problem with novel K-hop-structure-based matching costs. Extensive experiments show that our method outperforms state-of-the-art methods.
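The abstract says SI incorporates the heat diffusion wavelet to capture multi-hop structure. As a rough illustration of that ingredient only, the sketch below computes the standard graph heat kernel exp(-tL) from the combinatorial Laplacian L = D - A. The function name `heat_kernel`, the choice of Laplacian, the diffusion time, and the toy path graph are assumptions made for illustration; this is not the SI criterion or the SIGMA algorithm itself.

```python
import numpy as np

def heat_kernel(adj, t):
    """Return exp(-t * L) for the combinatorial Laplacian L = D - A.

    Assumes an undirected graph, so L is symmetric and eigh applies.
    """
    adj = np.asarray(adj, dtype=float)
    lap = np.diag(adj.sum(axis=1)) - adj
    evals, evecs = np.linalg.eigh(lap)
    # Scale each eigenvector by exp(-t * eigenvalue), then recombine.
    return (evecs * np.exp(-t * evals)) @ evecs.T

# Toy 4-node path graph: 0 - 1 - 2 - 3 (illustrative data only).
path = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
], dtype=float)

H = heat_kernel(path, t=1.0)
```

Row i of `H` describes how heat placed on node i spreads over the graph by time t, so nearby nodes receive more than distant ones. Larger t reaches further hops, which is the sense in which such a kernel reflects multi-hop structure.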