Source author record

Zhenzhong Wang

Zhenzhong Wang appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Applications Artificial Intelligence econ.EM Machine Learning Neural and Evolutionary Computing

Catalog footprint

What is connected

3works

5topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2026arXiv

Explainable Molecular Property Prediction: Aligning Chemical Concepts with Predictions via Language Models

Providing explainable molecular property predictions is critical for many scientific domains, such as drug discovery and material science. Though transformer-based language models have shown great potential in accurate molecular property prediction, they neither provide chemically meaningful explanations nor faithfully reveal the molecular structure-property relationships. In this work, we develop a framework for explainable molecular property prediction based on language models, dubbed as Lamole, which can provide chemical concepts-aligned explanations. We take a string-based molecular representation -- Group SELFIES -- as input tokens to pretrain and fine-tune our Lamole, as it provides chemically meaningful semantics. By disentangling the information flows of Lamole, we propose combining self-attention weights and gradients for better quantification of each chemically meaningful substructure's impact on the model's output. To make the explanations more faithfully respect the structure-property relationship, we then carefully craft a marginal loss to explicitly optimize the explanations to be able to align with the chemists' annotations. We bridge the manifold hypothesis with the elaborated marginal loss to prove that the loss can align the explanations with the tangent space of the data manifold, leading to concept-aligned explanations. Experimental results over six mutagenicity datasets and one hepatotoxicity dataset demonstrate Lamole can achieve comparable classification accuracy and boost the explanation accuracy by up to 14.3%, being the state-of-the-art in explainable molecular property prediction.

preprint2021arXiv

Manifold Interpolation for Large-Scale Multi-Objective Optimization via Generative Adversarial Networks

Large-scale multiobjective optimization problems (LSMOPs) are characterized as involving hundreds or even thousands of decision variables and multiple conflicting objectives. An excellent algorithm for solving LSMOPs should find Pareto-optimal solutions with diversity and escape from local optima in the large-scale search space. Previous research has shown that these optimal solutions are uniformly distributed on the manifold structure in the low-dimensional space. However, traditional evolutionary algorithms for solving LSMOPs have some deficiencies in dealing with this structural manifold, resulting in poor diversity, local optima, and inefficient searches. In this work, a generative adversarial network (GAN)-based manifold interpolation framework is proposed to learn the manifold and generate high-quality solutions on this manifold, thereby improving the performance of evolutionary algorithms. We compare the proposed algorithm with several state-of-the-art algorithms on large-scale multiobjective benchmark functions. Experimental results have demonstrated the significant improvements achieved by this framework in solving LSMOPs.

preprint2020arXiv

Variable Selection in Macroeconomic Forecasting with Many Predictors

In the data-rich environment, using many economic predictors to forecast a few key variables has become a new trend in econometrics. The commonly used approach is factor augment (FA) approach. In this paper, we pursue another direction, variable selection (VS) approach, to handle high-dimensional predictors. VS is an active topic in statistics and computer science. However, it does not receive as much attention as FA in economics. This paper introduces several cutting-edge VS methods to economic forecasting, which includes: (1) classical greedy procedures; (2) l1 regularization; (3) gradient descent with sparsification and (4) meta-heuristic algorithms. Comprehensive simulation studies are conducted to compare their variable selection accuracy and prediction performance under different scenarios. Among the reviewed methods, a meta-heuristic algorithm called sequential Monte Carlo algorithm performs the best. Surprisingly the classical forward selection is comparable to it and better than other more sophisticated algorithms. In addition, we apply these VS methods on economic forecasting and compare with the popular FA approach. It turns out for employment rate and CPI inflation, some VS methods can achieve considerable improvement over FA, and the selected predictors can be well explained by economic theories.