Researcher profile

Zhicheng Wang

Zhicheng Wang contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 21 - EmergingVerification L1Unclaimed author
16works
0followers
8topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

16 published item(s)

preprint2026arXiv

Beyond Point-wise Neural Collapse: A Topology-Aware Hierarchical Classifier for Class-Incremental Learning

The Nearest Class Mean (NCM) classifier is widely favored in Class-Incremental Learning (CIL) for its superior resistance to catastrophic forgetting compared to Fully Connected layers. While Neural Collapse (NC) theory supports NCM's optimality by assuming features collapse into single points, non-linear feature drift and insufficient training in CIL often prevent this ideal state. Consequently, classes manifest as complex manifolds rather than collapsed points, rendering the single-point NCM suboptimal. To address this, we propose Hierarchical-Cluster SOINN (HC-SOINN), a novel classifier that captures the topological structure of these manifolds via a ``local-to-global'' representation. Furthermore, we introduce Structure-Topology Alignment via Residuals (STAR) method, which employs a fine-grained pointwise trajectory tracking mechanism to actively deform the learned topology, allowing it to adapt precisely to complex non-linear feature drift. Theoretical analysis and Procrustes distance experiments validate our framework's resilience to manifold deformations. We integrated HC-SOINN into seven state-of-the-art methods by replacing their original classifiers, achieving consistent improvements that highlight the effectiveness and robustness of our approach. Code is available at https://github.com/yhyet/HC_SOINN.

preprint2026arXiv

The Cartesian Shortcut: Re-evaluate Vision Reasoning in Polar Coordinate Space

As current Multimodal Large Language Models rapidly saturate canonical visual reasoning benchmarks, a key question emerges: do these strong scores genuinely reflect robust visual understanding? We identify a pervasive vulnerability, the \textbf{Cartesian Shortcut}: visual reasoning benchmarks prevalently build on orthogonal grid-based layouts that can be readily discretized into explicit textual coordinates. Models systematically exploit this property, heavily leveraging text-based deductive reasoning to assist visual problem-solving. To systematically dismantle this shortcut, we introduce \textbf{Polaris-Bench}, which re-formulates 53 visual reasoning tasks in Polar coordinate space with paired Cartesian counterparts as reference, while preserving consistent logical constraints and task semantics -- thus fundamentally breaking the orthogonal prior that models exploit. Comprehensive evaluation across $14$ state-of-the-art MLLMs reveals that frontier models achieving $70$--$83\%$ on Cartesian layouts collapse to $31$--$39\%$ on Polar equivalents, with degradation persisting even under complete logical equivalence. Moreover, reasoning gains observed on Cartesian layouts are severely diminished on Polar equivalents. These findings expose a critical deficiency in current MLLMs: the lack of topology-invariant visual reasoning.

preprint2024arXiv

FuRPE: Learning Full-body Reconstruction from Part Experts

In the field of full-body reconstruction, the scarcity of annotated data often impedes the efficacy of prevailing methods. To address this issue, we introduce FuRPE, a novel framework that employs part-experts and an ingenious pseudo ground-truth selection scheme to derive high-quality pseudo labels. These labels, central to our approach, equip our network with the capability to efficiently learn from the available data. Integral to FuRPE is a unique exponential moving average training strategy and expert-derived feature distillation strategy. These novel elements of FuRPE not only serve to further refine the model but also to reduce potential biases that may arise from inaccuracies in pseudo labels, thereby optimizing the network's training process and enhancing the robustness of the model. We apply FuRPE to train both two-stage and fully convolutional single-stage full-body reconstruction networks. Our exhaustive experiments on numerous benchmark datasets illustrate a substantial performance boost over existing methods, underscoring FuRPE's potential to reshape the state-of-the-art in full-body reconstruction.

preprint2022arXiv

Object Level Depth Reconstruction for Category Level 6D Object Pose Estimation From Monocular RGB Image

Recently, RGBD-based category-level 6D object pose estimation has achieved promising improvement in performance, however, the requirement of depth information prohibits broader applications. In order to relieve this problem, this paper proposes a novel approach named Object Level Depth reconstruction Network (OLD-Net) taking only RGB images as input for category-level 6D object pose estimation. We propose to directly predict object-level depth from a monocular RGB image by deforming the category-level shape prior into object-level depth and the canonical NOCS representation. Two novel modules named Normalized Global Position Hints (NGPH) and Shape-aware Decoupled Depth Reconstruction (SDDR) module are introduced to learn high fidelity object-level depth and delicate shape representations. At last, the 6D object pose is solved by aligning the predicted canonical representation with the back-projected object-level depth. Extensive experiments on the challenging CAMERA25 and REAL275 datasets indicate that our model, though simple, achieves state-of-the-art performance.

preprint2022arXiv

SimCC: a Simple Coordinate Classification Perspective for Human Pose Estimation

The 2D heatmap-based approaches have dominated Human Pose Estimation (HPE) for years due to high performance. However, the long-standing quantization error problem in the 2D heatmap-based methods leads to several well-known drawbacks: 1) The performance for the low-resolution inputs is limited; 2) To improve the feature map resolution for higher localization precision, multiple costly upsampling layers are required; 3) Extra post-processing is adopted to reduce the quantization error. To address these issues, we aim to explore a brand new scheme, called \textit{SimCC}, which reformulates HPE as two classification tasks for horizontal and vertical coordinates. The proposed SimCC uniformly divides each pixel into several bins, thus achieving \emph{sub-pixel} localization precision and low quantization error. Benefiting from that, SimCC can omit additional refinement post-processing and exclude upsampling layers under certain settings, resulting in a more simple and effective pipeline for HPE. Extensive experiments conducted over COCO, CrowdPose, and MPII datasets show that SimCC outperforms heatmap-based counterparts, especially in low-resolution settings by a large margin.

preprint2021arXiv

EDNet: Efficient Disparity Estimation with Cost Volume Combination and Attention-based Spatial Residual

Existing state-of-the-art disparity estimation works mostly leverage the 4D concatenation volume and construct a very deep 3D convolution neural network (CNN) for disparity regression, which is inefficient due to the high memory consumption and slow inference speed. In this paper, we propose a network named EDNet for efficient disparity estimation. Firstly, we construct a combined volume which incorporates contextual information from the squeezed concatenation volume and feature similarity measurement from the correlation volume. The combined volume can be next aggregated by 2D convolutions which are faster and require less memory than 3D convolutions. Secondly, we propose an attention-based spatial residual module to generate attention-aware residual features. The attention mechanism is applied to provide intuitive spatial evidence about inaccurate regions with the help of error maps at multiple scales and thus improve the residual learning efficiency. Extensive experiments on the Scene Flow and KITTI datasets show that EDNet outperforms the previous 3D CNN based works and achieves state-of-the-art performance with significantly faster speed and less memory consumption.

preprint2021arXiv

Physics-informed neural networks with hard constraints for inverse design

Inverse design arises in a variety of areas in engineering such as acoustic, mechanics, thermal/electronic transport, electromagnetism, and optics. Topology optimization is a major form of inverse design, where we optimize a designed geometry to achieve targeted properties and the geometry is parameterized by a density function. This optimization is challenging, because it has a very high dimensionality and is usually constrained by partial differential equations (PDEs) and additional inequalities. Here, we propose a new deep learning method -- physics-informed neural networks with hard constraints (hPINNs) -- for solving topology optimization. hPINN leverages the recent development of PINNs for solving PDEs, and thus does not rely on any numerical PDE solver. However, all the constraints in PINNs are soft constraints, and hence we impose hard constraints by using the penalty method and the augmented Lagrangian method. We demonstrate the effectiveness of hPINN for a holography problem in optics and a fluid problem of Stokes flow. We achieve the same objective as conventional PDE-constrained optimization methods based on adjoint methods and numerical PDE solvers, but find that the design obtained from hPINN is often simpler and smoother for problems whose solution is not unique. Moreover, the implementation of inverse design with hPINN can be easier than that of conventional methods.

preprint2021arXiv

Using Low-rank Representation of Abundance Maps and Nonnegative Tensor Factorization for Hyperspectral Nonlinear Unmixing

Tensor-based methods have been widely studied to attack inverse problems in hyperspectral imaging since a hyperspectral image (HSI) cube can be naturally represented as a third-order tensor, which can perfectly retain the spatial information in the image. In this article, we extend the linear tensor method to the nonlinear tensor method and propose a nonlinear low-rank tensor unmixing algorithm to solve the generalized bilinear model (GBM). Specifically, the linear and nonlinear parts of the GBM can both be expressed as tensors. Furthermore, the low-rank structures of abundance maps and nonlinear interaction abundance maps are exploited by minimizing their nuclear norm, thus taking full advantage of the high spatial correlation in HSIs. Synthetic and real-data experiments show that the low rank of abundance maps and nonlinear interaction abundance maps exploited in our method can improve the performance of the nonlinear unmixing. A MATLAB demo of this work will be available at https://github.com/LinaZhuang for the sake of reproducibility.

preprint2020arXiv

A Light-Weight Object Detection Framework with FPA Module for Optical Remote Sensing Imagery

With the development of remote sensing technology, the acquisition of remote sensing images is easier and easier, which provides sufficient data resources for the task of detecting remote sensing objects. However, how to detect objects quickly and accurately from many complex optical remote sensing images is a challenging hot issue. In this paper, we propose an efficient anchor free object detector, CenterFPANet. To pursue speed, we use a lightweight backbone and introduce the asymmetric revolution block. To improve the accuracy, we designed the FPA module, which links the feature maps of different levels, and introduces the attention mechanism to dynamically adjust the weights of each level of feature maps, which solves the problem of detection difficulty caused by large size range of remote sensing objects. This strategy can improve the accuracy of remote sensing image object detection without reducing the detection speed. On the DOTA dataset, CenterFPANet mAP is 64.00%, and FPS is 22.2, which is close to the accuracy of the anchor-based methods currently used and much faster than them. Compared with Faster RCNN, mAP is 6.76% lower but 60.87% faster. All in all, CenterFPANet achieves a balance between speed and accuracy in large-scale optical remote sensing object detection.

preprint2020arXiv

Active- and transfer-learning applied to microscale-macroscale coupling to simulate viscoelastic flows

Active- and transfer-learning are applied to polymer flows for the multiscale discovery of effective constitutive approximations required in viscoelastic flow simulation. The result is macroscopic rheology directly connected to a microstructural model. Micro and macroscale simulations are adaptively coupled by means of Gaussian process regression to run the expensive microscale computations only as necessary. This active-learning guided multiscale method can automatically detect the inaccuracy of the learned constitutive closure and initiate simulations at new sampling points informed by proper acquisition functions, leading to an autonomic microscale-macroscale coupled system. Also, we develop a new dissipative particle dynamics model with the range of interaction cutoff between particles allowed to vary with the local strain-rate invariant, which is able to capture both the shear-thinning viscosity and the normal stress difference functions consistent with rheological experiments for aqueous polyacrylamide solutions. Our numerical experiments demonstrate the effectiveness of using active- and transfer-learning schemes to on-the-fly couple a spectral element solver and a mesoscopic particle-based simulator, and verify that the microscale-macroscale coupled model with effective constitutive closure learned from microscopic dynamics can outperform empirical constitutive models compared to experimental observations. The effective closure learned in a channel simulation is then transferred directly to the flow past a circular cylinder, where the results show that only two additional microscopic simulations are required to achieve a satisfactory constitutive model to once again close the continuum equations. This new paradigm of active- and transfer-learning for multiscale modeling is readily applicable to other microscale-macroscale coupled simulations of complex fluids and other materials.

preprint2020arXiv

High-Order Information Matters: Learning Relation and Topology for Occluded Person Re-Identification

Occluded person re-identification (ReID) aims to match occluded person images to holistic ones across dis-joint cameras. In this paper, we propose a novel framework by learning high-order relation and topology information for discriminative features and robust alignment. At first, we use a CNN backbone and a key-points estimation model to extract semantic local features. Even so, occluded images still suffer from occlusion and outliers. Then, we view the local features of an image as nodes of a graph and propose an adaptive direction graph convolutional (ADGC)layer to pass relation information between nodes. The proposed ADGC layer can automatically suppress the message-passing of meaningless features by dynamically learning di-rection and degree of linkage. When aligning two groups of local features from two images, we view it as a graph matching problem and propose a cross-graph embedded-alignment (CGEA) layer to jointly learn and embed topology information to local features, and straightly predict similarity score. The proposed CGEA layer not only take full use of alignment learned by graph matching but also re-place sensitive one-to-one matching with a robust soft one. Finally, extensive experiments on occluded, partial, and holistic ReID tasks show the effectiveness of our proposed method. Specifically, our framework significantly outperforms state-of-the-art by6.5%mAP scores on Occluded-Duke dataset.

preprint2020arXiv

Learning Delicate Local Representations for Multi-Person Pose Estimation

In this paper, we propose a novel method called Residual Steps Network (RSN). RSN aggregates features with the same spatial size (Intra-level features) efficiently to obtain delicate local representations, which retain rich low-level spatial information and result in precise keypoint localization. Additionally, we observe the output features contribute differently to final performance. To tackle this problem, we propose an efficient attention mechanism - Pose Refine Machine (PRM) to make a trade-off between local and global representations in output features and further refine the keypoint locations. Our approach won the 1st place of COCO Keypoint Challenge 2019 and achieves state-of-the-art results on both COCO and MPII benchmarks, without using extra training data and pretrained model. Our single model achieves 78.6 on COCO test-dev, 93.0 on MPII test dataset. Ensembled models achieve 79.2 on COCO test-dev, 77.1 on COCO test-challenge dataset. The source code is publicly available for further research at https://github.com/caiyuanhao1998/RSN/

preprint2020arXiv

NMR and NQR studies on transition-metal arsenide superconductors LaRu2As2, KCa2Fe4As4F2, and A2Cr3As3

We report 75As-nuclear magnetic resonance (NMR) and nuclear quadrupole resonance (NQR) measurements on transition-metal arsenides LaRu2As2, KCa2Fe4As4F2, and A2Cr3As3. In the superconducting state of LaRu2As2, a Hebel- Slichter coherence peak is found in the temperature dependence of the spin-lattice relaxation rate-1/T1 just below Tc, which indicates that LaRu2As2 is a full-gap superperconducor. For KCa2Fe4As4F2, antiferromagnetic spin fluctuations are observed in the normal state. We further find that the anisotropy rate RAF = Tc1/Tab1 is small and temperature independent, implying that the low energy spin fluctuations are isotropic in spin space. Our results indicate that KCa2Fe4As4F2 is a moderately overdoped iron-arsenide high-temperature superconductor with a stoichiometric composition. For A2Cr3As3, we calculate the electric field gradient by first-principle method and assign the 75As-NQR peaks with two crystallographically different As sites, paving the way for further NMR investigation.

preprint2020arXiv

Remarks on the theta correspondence over finite fields

S.-Y. Pan decomposes the uniform projection of the Weil representation of a finite symplectic-odd orthogonal dual pair, in terms of Deligne-Lusztig virtual characters, assuming that the order of the finite field is large enough. In this paper we use Pan's decomposition to study the theta correspondence for this kind of dual pairs, following the approach of Adams-Moy and Aubert-Michel-Rouquier. Our results give the theta correspondence between unipotent representations and certain quadratic unipotent representations.