Source author record

Mingzhong Wang

Mingzhong Wang appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Topics: Machine Learning, Artificial Intelligence, cond-mat.supr-con

Catalog footprint

What is connected

3 works

3 topics

4 close collaborators

Preprint · 2026 · arXiv

Offline Meta-Reinforcement Learning with Flow-Based Task Inference and Adaptive Correction of Feature Overgeneralization

Offline meta-reinforcement learning (OMRL) combines the strengths of learning from diverse datasets in offline RL with the adaptability to new tasks of meta-RL, promising safe and efficient knowledge acquisition by RL agents. However, OMRL still suffers from extrapolation errors due to out-of-distribution (OOD) actions, compounded by broad task distributions and Markov Decision Process (MDP) ambiguity in meta-RL setups. Existing research indicates that the generalization of the $Q$ network affects the extrapolation error in offline RL. This paper investigates this relationship by decomposing the $Q$ value into feature and weight components, observing that while decomposition enhances adaptability and convergence in the case of high-quality data, it often leads to policy degeneration or collapse in complex tasks. We observe that decomposed $Q$ values introduce a large estimation bias when the feature encounters OOD samples, a phenomenon we term "feature overgeneralization". To address this issue, we propose FLORA, which identifies OOD samples by modeling feature distributions and estimating their uncertainties. FLORA integrates a return feedback mechanism to adaptively adjust feature components. Furthermore, to learn precise task representations, FLORA explicitly models the complex task distribution using a chain of invertible transformations. We theoretically and empirically demonstrate that FLORA achieves rapid adaptation and meta-policy improvement compared to baselines across various environments.
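The decomposition described in this abstract can be illustrated with a minimal sketch, assuming a linear form $Q(s, a) = \phi(s, a)^\top w$. This is not FLORA itself: the abstract does not specify how feature distributions are modeled, so a simple Gaussian fit with a Mahalanobis score stands in for the uncertainty estimate, and every name and value below (`q_value`, `mahalanobis_sq`, the weights, the sample features) is hypothetical.

```python
import numpy as np

def q_value(features: np.ndarray, weights: np.ndarray) -> np.ndarray:
    """Decomposed Q estimate, assuming the linear form Q(s, a) = phi(s, a) . w."""
    return features @ weights

def mahalanobis_sq(features: np.ndarray, mean: np.ndarray, cov: np.ndarray) -> np.ndarray:
    """Squared Mahalanobis distance of each feature row to a Gaussian fit.

    A stand-in uncertainty score (an assumption, not the paper's estimator):
    larger values mean the feature lies further from the training features.
    """
    diff = features - mean
    solved = np.linalg.solve(cov, diff.T).T
    return np.sum(diff * solved, axis=1)

rng = np.random.default_rng(0)

# Hypothetical in-distribution features phi(s, a) from an offline dataset.
train_features = rng.normal(0.0, 1.0, size=(500, 4))
mean = train_features.mean(axis=0)
cov = np.cov(train_features, rowvar=False) + 1e-6 * np.eye(4)

# Hypothetical task-specific weights w.
weights = np.array([0.5, -0.2, 0.1, 0.3])

in_dist = rng.normal(0.0, 1.0, size=(1, 4))  # resembles the training features
ood = np.full((1, 4), 8.0)                   # far from the training features

candidates = np.vstack([in_dist, ood])
scores = mahalanobis_sq(candidates, mean, cov)
q_estimates = q_value(candidates, weights)

print("uncertainty scores:", scores)
print("Q estimates:", q_estimates)
```

In this toy setup the OOD feature receives a much larger score than the in-distribution one, which is the kind of signal a method could use to down-weight or correct the corresponding $Q$ estimate.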

Preprint · 2022 · arXiv

SimSR: Simple Distance-based State Representation for Deep Reinforcement Learning

This work explores how to learn robust and generalizable state representations from image-based observations with deep reinforcement learning methods. Addressing the computational complexity, stringent assumptions and representation collapse challenges in existing work on bisimulation metrics, we devise the Simple State Representation (SimSR) operator. SimSR enables us to design a stochastic approximation method that can practically learn the mapping functions (encoders) from observations to the latent representation space. In addition to theoretical analysis and comparison with existing work, we ran experiments comparing our method with recent state-of-the-art solutions on visual MuJoCo tasks. The results show that our model generally achieves better performance, robustness and generalization.
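As a rough illustration of a distance-based representation objective in the bisimulation family, the sketch below assumes a cosine-based distance between L2-normalized embeddings and a target built from the reward difference plus the discounted next-state distance. The abstract does not give the form of the SimSR operator, so these choices, the discount `gamma`, and all function names are assumptions made for illustration only.

```python
import numpy as np

def cosine_distance(z_i: np.ndarray, z_j: np.ndarray) -> np.ndarray:
    """1 - cosine similarity between embeddings (assumed distance form)."""
    z_i = z_i / np.linalg.norm(z_i, axis=-1, keepdims=True)
    z_j = z_j / np.linalg.norm(z_j, axis=-1, keepdims=True)
    return 1.0 - np.sum(z_i * z_j, axis=-1)

def distance_target(r_i, r_j, next_z_i, next_z_j, gamma: float = 0.99):
    """Bisimulation-style target: |r_i - r_j| + gamma * d(next_i, next_j)."""
    return np.abs(r_i - r_j) + gamma * cosine_distance(next_z_i, next_z_j)

def representation_loss(z_i, z_j, target) -> float:
    """Mean squared error between the current latent distance and the target."""
    return float(np.mean((cosine_distance(z_i, z_j) - target) ** 2))

# Hypothetical embeddings for two states and their successors.
z_a = np.array([[1.0, 0.0]])
z_b = np.array([[0.0, 1.0]])
next_z_a = np.array([[1.0, 1.0]])
next_z_b = np.array([[2.0, 2.0]])  # same direction as next_z_a

target = distance_target(np.array([1.0]), np.array([0.0]), next_z_a, next_z_b)
loss = representation_loss(z_a, z_b, target)
print("target distance:", target, "loss:", loss)
```

In a real agent the embeddings would come from a learned encoder and the loss would be minimized by gradient descent; NumPy is used here only to keep the arithmetic visible.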

Preprint · 2020 · arXiv

Smart metastructure method for increasing TC of Bi(Pb)SrCaCuO high-temperature superconductors

Improving the critical transition temperature (T_C) of Bi(Pb)SrCaCuO (B(P)SCCO) high-temperature superconductors is important, but considerable challenges remain. In this study, on the basis of the metamaterial structure and the idea that injecting energy will promote the formation of Cooper pairs, a smart meta-superconductor B(P)SCCO consisting of B(P)SCCO microparticles and Y2O3:Eu3+ + Ag or Y2O3:Eu3+ luminophor was designed. In an applied electric field, the Y2O3:Eu3+ + Ag or Y2O3:Eu3+ luminophor generates electroluminescence (EL), thereby raising the T_C via EL energy injection. A series of Y2O3:Eu3+ + Ag topological luminophor-doped B(P)SCCO samples was prepared. Results showed that Y2O3:Eu3+ + Ag was dispersed around B(P)SCCO particles, forming a metastructure. Accordingly, the onset transition temperature (T_(C,on)) and zero-resistance transition temperature (T_(C,0)) of B(P)SCCO increased. A B(P)SCCO sample doped with 0.2 wt% Y2O3 or Y2O3:Sm3+ nonluminous inhomogeneous phase was also prepared to confirm that the change in T_C stems from EL rather than from a rare-earth effect. Results indicated that the T_C of the Y2O3 or Y2O3:Sm3+ doped sample decreased, whereas the T_C of the 0.2 wt% Y2O3:Eu3+ + Ag or Y2O3:Eu3+ luminophor-doped sample improved. This outcome further demonstrated that the smart metastructure method can improve the T_C of B(P)SCCO.