Researcher profile

Mengzhu Wang

Mengzhu Wang contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 17 - UnverifiedVerification L1Unclaimed author
4works
0followers
3topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

4 published item(s)

preprint2026arXiv

Delay-Empowered Causal Hierarchical Reinforcement Learning

Many real-world tasks involve delayed effects, where the outcomes of actions emerge after varying time lags. Existing delay-aware reinforcement learning methods often rely on state augmentation, prior knowledge of delay distributions, or access to non-delayed data, limiting their generalization. Hierarchical reinforcement learning, by contrast, inherently offers advantages in handling delays due to its hierarchical structure, yet existing methods are restricted to fixed delays. To address these limitations, we propose Delay-Empowered Causal Hierarchical Reinforcement Learning (DECHRL). DECHRL explicitly models both the causal structure of state transitions and their associated stochastic delay distributions. These are then incorporated into a delay-aware empowerment objective that drives proactive exploration toward highly controllable states, thereby improving performance under temporal uncertainty. We evaluate DECHRL in modified 2D-Minecraft and MiniGrid environments featuring stochastic delays. Experimental results show that DECHRL effectively models temporal delays and significantly outperforms baselines in decision-making under temporal uncertainty.

preprint2022arXiv

On the Equity of Nuclear Norm Maximization in Unsupervised Domain Adaptation

Nuclear norm maximization has shown the power to enhance the transferability of unsupervised domain adaptation model (UDA) in an empirical scheme. In this paper, we identify a new property termed equity, which indicates the balance degree of predicted classes, to demystify the efficacy of nuclear norm maximization for UDA theoretically. With this in mind, we offer a new discriminability-and-equity maximization paradigm built on squares loss, such that predictions are equalized explicitly. To verify its feasibility and flexibility, two new losses termed Class Weighted Squares Maximization (CWSM) and Normalized Squares Maximization (NSM), are proposed to maximize both predictive discriminability and equity, from the class level and the sample level, respectively. Importantly, we theoretically relate these two novel losses (i.e., CWSM and NSM) to the equity maximization under mild conditions, and empirically suggest the importance of the predictive equity in UDA. Moreover, it is very efficient to realize the equity constraints in both losses. Experiments of cross-domain image classification on three popular benchmark datasets show that both CWSM and NSM contribute to outperforming the corresponding counterparts.

preprint2022arXiv

Semantic Data Augmentation based Distance Metric Learning for Domain Generalization

Domain generalization (DG) aims to learn a model on one or more different but related source domains that could be generalized into an unseen target domain. Existing DG methods try to prompt the diversity of source domains for the model's generalization ability, while they may have to introduce auxiliary networks or striking computational costs. On the contrary, this work applies the implicit semantic augmentation in feature space to capture the diversity of source domains. Concretely, an additional loss function of distance metric learning (DML) is included to optimize the local geometry of data distribution. Besides, the logits from cross entropy loss with infinite augmentations is adopted as input features for the DML loss in lieu of the deep features. We also provide a theoretical analysis to show that the logits can approximate the distances defined on original features well. Further, we provide an in-depth analysis of the mechanism and rational behind our approach, which gives us a better understanding of why leverage logits in lieu of features can help domain generalization. The proposed DML loss with the implicit augmentation is incorporated into a recent DG method, that is, Fourier Augmented Co-Teacher framework (FACT). Meanwhile, our method also can be easily plugged into various DG methods. Extensive experiments on three benchmarks (Digits-DG, PACS and Office-Home) have demonstrated that the proposed method is able to achieve the state-of-the-art performance.

preprint2020arXiv

Improving Unsupervised Domain Adaptation by Reducing Bi-level Feature Redundancy

Reducing feature redundancy has shown beneficial effects for improving the accuracy of deep learning models, thus it is also indispensable for the models of unsupervised domain adaptation (UDA). Nevertheless, most recent efforts in the field of UDA ignores this point. Moreover, main schemes realizing this in general independent of UDA purely involve a single domain, thus might not be effective for cross-domain tasks. In this paper, we emphasize the significance of reducing feature redundancy for improving UDA in a bi-level way. For the first level, we try to ensure compact domain-specific features with a transferable decorrelated normalization module, which preserves specific domain information whilst easing the side effect of feature redundancy on the sequel domain-invariance. In the second level, domain-invariant feature redundancy caused by domain-shared representation is further mitigated via an alternative brand orthogonality for better generalization. These two novel aspects can be easily plugged into any BN-based backbone neural networks. Specifically, simply applying them to ResNet50 has achieved competitive performance to the state-of-the-arts on five popular benchmarks. Our code will be available at https://github.com/dreamkily/gUDA.