Source author record

Zhiguang Wang

Zhiguang Wang appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Machine Learning · Neural and Evolutionary Computing · cond-mat.mtrl-sci · Artificial Intelligence · Computation and Language · Information Theory · math.IT

Catalog footprint

What is connected

13 works

7 topics

4 close collaborators

preprint · 2022 · arXiv

Exploring Hidden Semantics in Neural Networks with Symbolic Regression

Many recent studies focus on developing mechanisms to explain the black-box behaviors of neural networks (NNs). However, little work has been done to extract the potential hidden semantics (mathematical representation) of a neural network. A succinct and explicit mathematical representation of a NN model could improve the understanding and interpretation of its behaviors. To address this need, we propose a novel symbolic regression method for neural networks (called SRNet) to discover the mathematical expressions of a NN. SRNet creates a Cartesian genetic programming variant (NNCGP) to represent the hidden semantics of a single layer in a NN. It then leverages a multi-chromosome NNCGP to represent the hidden semantics of all layers of the NN. The method uses a $(1+\lambda)$ evolutionary strategy (called MNNCGP-ES) to extract the final mathematical expressions of all layers in the NN. Experiments on 12 symbolic regression benchmarks and 5 classification benchmarks show that SRNet can not only reveal the complex relationships between the layers of a NN but also extract the mathematical representation of the whole NN. Compared with LIME and MAPLE, SRNet has higher interpolation accuracy and tends to approximate the real model on the practical dataset.
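
The $(1+\lambda)$ evolutionary strategy named in this abstract can be illustrated with a minimal sketch. It is not the MNNCGP-ES implementation: the genotype here is a plain coefficient list rather than an NNCGP chromosome, and the function names, the toy target `3x + 2`, the mutation scale and the rule that ties favour the offspring (a common choice in Cartesian genetic programming) are all assumptions made for illustration.

```python
import random


def one_plus_lambda_es(fitness, mutate, parent, lam=4, generations=200, seed=0):
    """Minimise `fitness` with a (1+lambda) evolutionary strategy."""
    rng = random.Random(seed)
    parent_fit = fitness(parent)
    for _ in range(generations):
        # All `lam` offspring are mutated copies of the single current parent.
        scored = [(fitness(child), child)
                  for child in (mutate(parent, rng) for _ in range(lam))]
        child_fit, child = min(scored, key=lambda pair: pair[0])
        # The best offspring replaces the parent when it is no worse
        # (accepting ties is an assumption, not taken from the abstract).
        if child_fit <= parent_fit:
            parent, parent_fit = child, child_fit
    return parent, parent_fit


# Hypothetical toy problem: recover a and b in a*x + b from samples of 3x + 2.
XS = [-2.0, -1.0, 0.0, 1.0, 2.0]
TARGET = [3.0 * x + 2.0 for x in XS]


def mse(coeffs):
    a, b = coeffs
    return sum((a * x + b - y) ** 2 for x, y in zip(XS, TARGET)) / len(XS)


def perturb(coeffs, rng):
    # Copy the parent and add Gaussian noise to one randomly chosen coefficient.
    child = list(coeffs)
    i = rng.randrange(len(child))
    child[i] += rng.gauss(0.0, 0.3)
    return child


start = [0.0, 0.0]
best, best_fit = one_plus_lambda_es(mse, perturb, start)
print(best, best_fit)
```

Because the parent is only ever replaced by an offspring that is no worse, the returned error can never exceed the error of the starting point.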

preprint · 2022 · arXiv

Taylor Genetic Programming for Symbolic Regression

Genetic programming (GP) is a commonly used approach to solve symbolic regression (SR) problems. Compared with machine learning or deep learning methods that depend on a pre-defined model and a training dataset for solving SR problems, GP is more focused on finding the solution in a search space. Although GP performs well on large-scale benchmarks, it randomly transforms individuals to search for results without taking advantage of the characteristics of the dataset. As a result, the search process of GP is usually slow, and the final results can be unstable. To guide GP by these characteristics, we propose a new method for SR, called Taylor genetic programming (TaylorGP) (Code and appendix at https://kgae-cup.github.io/TaylorGP/). TaylorGP leverages a Taylor polynomial to approximate the symbolic equation that fits the dataset. It also utilizes the Taylor polynomial to extract the features of the symbolic equation: low order polynomial discrimination, variable separability, boundary, monotonicity, and parity. GP is enhanced by these Taylor polynomial techniques. Experiments are conducted on three kinds of benchmarks: classical SR, machine learning, and physics. The experimental results show that TaylorGP not only has higher accuracy than the nine baseline methods, but is also faster in finding stable results.
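
As one illustration of how a Taylor polynomial can expose a feature such as parity, the sketch below classifies a one-variable polynomial expanded around 0 by looking at which coefficients vanish. It relies only on the standard fact that an even function has no odd-order terms and an odd function has no even-order terms. The function name, the tolerance and the example coefficients are assumptions; this is not TaylorGP's actual procedure.

```python
def parity_from_taylor(coeffs, tol=1e-8):
    """Classify parity from Taylor coefficients c0, c1, c2, ... around x = 0."""
    even_terms_zero = all(abs(c) <= tol for c in coeffs[0::2])
    odd_terms_zero = all(abs(c) <= tol for c in coeffs[1::2])
    if even_terms_zero and odd_terms_zero:
        return "both"      # the zero polynomial is even and odd at once
    if odd_terms_zero:
        return "even"      # only even powers survive, so f(-x) = f(x)
    if even_terms_zero:
        return "odd"       # only odd powers survive, so f(-x) = -f(x)
    return "neither"


# Truncated Taylor series of cos(x), sin(x) and exp(x) around 0.
cos_coeffs = [1.0, 0.0, -1.0 / 2.0, 0.0, 1.0 / 24.0]
sin_coeffs = [0.0, 1.0, 0.0, -1.0 / 6.0]
exp_coeffs = [1.0, 1.0, 1.0 / 2.0, 1.0 / 6.0]

labels = [parity_from_taylor(c) for c in (cos_coeffs, sin_coeffs, exp_coeffs)]
print(labels)
```

A search procedure could use such a label to discard candidate expressions with the wrong symmetry, which is one plausible way a feature like this narrows the search space.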

preprint · 2020 · arXiv

Continual Learning in Task-Oriented Dialogue Systems

Continual learning in task-oriented dialogue systems can allow us to add new domains and functionalities over time without incurring the high cost of retraining the whole system. In this paper, we propose a continual learning benchmark for task-oriented dialogue systems with 37 domains to be learned continuously in four settings: intent recognition, state tracking, natural language generation, and end-to-end. Moreover, we implement and compare multiple existing continual learning baselines, and we propose a simple yet effective architectural method based on residual adapters. Our experiments demonstrate that the proposed architectural method and a simple replay-based strategy perform comparably well, but both achieve inferior performance to the multi-task learning baseline, in which all the data are shown at once, showing that continual learning in task-oriented dialogue systems is a challenging task. Furthermore, we reveal several trade-offs between different continual learning methods in terms of parameter usage and memory size, which are important in the design of a task-oriented dialogue system. The proposed benchmark is released together with several baselines to promote more research in this direction.

preprint · 2020 · arXiv

Information Seeking in the Spirit of Learning: a Dataset for Conversational Curiosity

Open-ended human learning and information-seeking are increasingly mediated by digital assistants. However, such systems often ignore the user's pre-existing knowledge. Assuming a correlation between engagement and user responses such as "liking" messages or asking followup questions, we design a Wizard-of-Oz dialog task that tests the hypothesis that engagement increases when users are presented with facts related to what they know. Through crowd-sourcing of this experiment, we collect and release 14K dialogs (181K utterances) where users and assistants converse about geographic topics like geopolitical entities and locations. This dataset is annotated with pre-existing user knowledge, message-level dialog acts, grounding to Wikipedia, and user reactions to messages. Responses using a user's prior knowledge increase engagement. We incorporate this knowledge into a multi-task model that reproduces human assistant policies and improves over a BERT content model by 13 mean reciprocal rank points.
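
The improvement in this abstract is reported in mean reciprocal rank (MRR). As a reminder of how that metric is computed, here is a minimal sketch; the function name and the toy rankings are hypothetical and are not drawn from the dataset. MRR lies between 0 and 1, and a "point" is conventionally one hundredth of that range, although the abstract does not spell this out.

```python
def mean_reciprocal_rank(ranked_lists, relevant_items):
    """Average over queries of 1 / rank of the relevant item (0 if it is absent)."""
    total = 0.0
    for ranked, relevant in zip(ranked_lists, relevant_items):
        for position, item in enumerate(ranked, start=1):
            if item == relevant:
                total += 1.0 / position
                break
    return total / len(ranked_lists)


# Hypothetical toy example: the gold item "a" is ranked 1st, 2nd and 3rd.
rankings = [["a", "b", "c"], ["b", "a", "c"], ["c", "b", "a"]]
gold = ["a", "a", "a"]
mrr = mean_reciprocal_rank(rankings, gold)
print(round(mrr, 4))  # (1 + 1/2 + 1/3) / 3
```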

preprint · 2016 · arXiv

Adaptive Normalized Risk-Averting Training For Deep Neural Networks

This paper proposes a set of new error criteria and learning approaches, Adaptive Normalized Risk-Averting Training (ANRAT), to attack the non-convex optimization problem in training deep neural networks (DNNs). Theoretically, we demonstrate its effectiveness on global and local convexity lower-bounded by the standard $L_p$-norm error. By analyzing the gradient on the convexity index $\lambda$, we explain why learning $\lambda$ adaptively using gradient descent works. In practice, we show how this method improves the training of deep neural networks to solve visual recognition tasks on the MNIST and CIFAR-10 datasets. Without using pretraining or other tricks, we obtain results comparable or superior to those reported in recent literature on the same tasks using standard ConvNets + MSE/cross entropy. Performance on deep/shallow multilayer perceptrons and Denoised Auto-encoders is also explored. ANRAT can be combined with other quasi-Newton training methods, innovative network variants, regularization techniques and other specific tricks in DNNs. As an alternative to unsupervised pretraining, it provides a new perspective on the non-convex optimization problem in DNNs.

preprint · 2016 · arXiv

Representation Learning with Deconvolution for Multivariate Time Series Classification and Visualization

We propose a new model based on deconvolutional networks and SAX discretization to learn the representation for multivariate time series. Deconvolutional networks fully exploit the powerful expressiveness of deep neural networks in an unsupervised manner. We design a network structure specifically to capture the cross-channel correlation with deconvolution, forcing the pooling operation to perform dimension reduction along each position in the individual channel. Discretization based on Symbolic Aggregate Approximation is applied to the feature vectors to further extract the bag of features. We show how this representation and bag of features help with classification. A full comparison with the sequence-distance-based approach is provided to demonstrate the effectiveness of our approach on the standard datasets. We further build the Markov matrix from the discretized representation produced by the deconvolution to visualize the time series as complex networks, which show more class-specific statistical properties and clear structures with respect to different labels.

preprint · 2016 · arXiv

Time Series Classification from Scratch with Deep Neural Networks: A Strong Baseline

We propose a simple but strong baseline for time series classification from scratch with deep neural networks. Our proposed baseline models are purely end-to-end, without any heavy preprocessing of the raw data or feature crafting. The proposed Fully Convolutional Network (FCN) achieves premium performance compared with other state-of-the-art approaches, and our exploration of very deep neural networks with the ResNet structure is also competitive. The global average pooling in our convolutional model enables the exploitation of the Class Activation Map (CAM) to find the contributing region in the raw data for specific labels. Our models provide a simple choice for real-world applications and a good starting point for future research. An overall analysis is provided to discuss the generalization capability of our models, learned features, network structures and the classification semantics.
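
The link between global average pooling and the Class Activation Map can be made concrete with a small sketch. It assumes the usual CAM construction, in which the class score is a weighted sum of the pooled channels and the map is the same weighted sum taken before pooling. The feature maps and class weights below are made-up numbers, not outputs of the FCN described in the abstract.

```python
def global_average_pool(feature_maps):
    """Average each channel of the last convolutional layer over time."""
    return [sum(channel) / len(channel) for channel in feature_maps]


def class_activation_map(feature_maps, class_weights):
    """CAM over time for one class: sum_k w_k * f_k(t)."""
    length = len(feature_maps[0])
    return [sum(w * channel[t] for w, channel in zip(class_weights, feature_maps))
            for t in range(length)]


# Hypothetical activations: 2 channels, 4 time steps.
features = [[0.0, 1.0, 4.0, 1.0],
            [2.0, 2.0, 0.0, 0.0]]
weights = [1.0, -0.5]  # hypothetical weights of one class after pooling

pooled = global_average_pool(features)
logit = sum(w * p for w, p in zip(weights, pooled))  # class score, bias omitted
cam = class_activation_map(features, weights)
print(cam, logit)
```

Averaging the map over time reproduces the class score (without bias), which is why the peaks of the map can be read as the regions of the raw series that contribute most to that class.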

preprint · 2015 · arXiv

Empirical Studies on Symbolic Aggregation Approximation Under Statistical Perspectives for Knowledge Discovery in Time Series

Symbolic Aggregation approXimation (SAX) has been the de facto standard representation method for knowledge discovery in time series on a number of tasks and applications. So far, very little work has been done to empirically investigate the intrinsic properties and statistical mechanics of SAX words. In this paper, we apply several statistical measurements and propose a new one, the information embedding cost (IEC), to analyze the statistical behaviors of the symbolic dynamics. Our experiments on the benchmark datasets and the clinical signals demonstrate that SAX can always reduce the complexity while preserving the core information embedded in the original time series with significant embedding efficiency. Our proposed IEC score provides an a priori way to determine whether SAX is adequate for a specific dataset, and it can be generalized to evaluate other symbolic representations. Our work provides an analytical framework with several statistical tools to analyze, evaluate and further improve the symbolic dynamics for knowledge discovery in time series.
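
For readers unfamiliar with SAX, the sketch below shows the standard three steps: z-normalise the series, average it over equal-width segments (piecewise aggregate approximation), and map each average to a letter using breakpoints that cut the standard normal distribution into equiprobable regions. The function name and the toy series are assumptions, and the sketch assumes a non-constant series whose length is divisible by the number of segments. It does not implement the IEC score proposed in the abstract.

```python
from statistics import NormalDist, mean, pstdev


def sax_word(series, n_segments, alphabet="abcd"):
    """Convert a numeric series into a SAX word of length `n_segments`."""
    mu, sigma = mean(series), pstdev(series)
    z = [(x - mu) / sigma for x in series]            # z-normalisation
    seg_len = len(z) // n_segments
    paa = [mean(z[i * seg_len:(i + 1) * seg_len])     # segment averages
           for i in range(n_segments)]
    k = len(alphabet)
    # Breakpoints split N(0, 1) into k regions of equal probability.
    breakpoints = [NormalDist().inv_cdf(i / k) for i in range(1, k)]
    # Each average gets the letter of the region it falls into.
    return "".join(alphabet[sum(v > b for b in breakpoints)] for v in paa)


word = sax_word([0, 1, 2, 3, 4, 5, 6, 7], n_segments=4)
print(word)
```

On this steadily increasing toy series the four segment averages fall into four different regions, so each letter of the alphabet appears once, in order.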

preprint · 2015 · arXiv

Imaging Time-Series to Improve Classification and Imputation

Inspired by recent successes of deep learning in computer vision, we propose a novel framework for encoding time series as different types of images, namely, Gramian Angular Summation/Difference Fields (GASF/GADF) and Markov Transition Fields (MTF). This enables the use of techniques from computer vision for time series classification and imputation. We used Tiled Convolutional Neural Networks (tiled CNNs) on 20 standard datasets to learn high-level features from the individual and compound GASF-GADF-MTF images. Our approaches achieve highly competitive results when compared to nine of the current best time series classification approaches. Inspired by the bijection property of GASF on 0/1 rescaled data, we train Denoised Auto-encoders (DA) on the GASF images of four standard and one synthesized compound dataset. The imputation MSE on test data is reduced by 12.18%-48.02% when compared to using the raw data. An analysis of the features and weights learned via tiled CNNs and DAs explains why the approaches work.
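
The Gramian Angular Summation and Difference Fields can be sketched in a few lines. The series is rescaled, each value is mapped to an angle with `arccos`, and the images are filled with `cos(phi_i + phi_j)` and `sin(phi_i - phi_j)`. Function names and the three-point toy series are assumptions, and the sketch assumes a non-constant series. It also illustrates the bijection property mentioned in the abstract: after rescaling to [0, 1], the GASF diagonal equals `2x^2 - 1`, so the rescaled series can be read back from it.

```python
import math


def rescale(series, lo, hi):
    """Min-max rescale a non-constant series into [lo, hi]."""
    mn, mx = min(series), max(series)
    return [lo + (hi - lo) * (x - mn) / (mx - mn) for x in series]


def gramian_angular_fields(series, lo=-1.0, hi=1.0):
    """Return (GASF, GADF, rescaled series) as nested lists."""
    x = rescale(series, lo, hi)
    # Clamp before arccos to guard against tiny floating-point overshoot.
    phi = [math.acos(max(-1.0, min(1.0, v))) for v in x]
    gasf = [[math.cos(a + b) for b in phi] for a in phi]
    gadf = [[math.sin(a - b) for b in phi] for a in phi]
    return gasf, gadf, x


def recover_from_gasf_diagonal(gasf):
    """Invert cos(2*phi) = 2x^2 - 1, valid when the series was rescaled to [0, 1]."""
    return [math.sqrt(max(0.0, (gasf[i][i] + 1.0) / 2.0)) for i in range(len(gasf))]


gasf, gadf, scaled = gramian_angular_fields([1.0, 2.0, 3.0], lo=0.0, hi=1.0)
recovered = recover_from_gasf_diagonal(gasf)
print(scaled, recovered)
```

The GASF is symmetric and the GADF is antisymmetric with a zero diagonal, which follows directly from the cosine of a sum and the sine of a difference.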

preprint · 2015 · arXiv

Self-blocking of interstitial clusters near metallic grain boundaries

Nanocrystalline materials (NCs) have been known for decades to potentially possess a novel self-healing ability against radiation damage, which has been shown to be linked in particular to the preferential occupation of interstitials at the grain boundary (GB) and to promoted vacancy-interstitial annihilation. A major obstacle to better understanding this healing property is the lack of an atomistic picture of the interstitial states near GBs, owing to the severe separation of the timescale of interstitial segregation from that of other events and to the abundance of interstitials at the GB. Here, we report a generic "self-blocking" effect of the interstitial cluster (SIA$_n$) near the metallic GB in W, Mo and Fe. When a SIA$_n$ is created near the GB, it is immediately trapped by the GB during the GB structural relaxation and acts as a block there, impeding the GB's further spontaneous trapping of SIA$_n$ in the vicinity and leaving these SIA$_n$ stuck near the GB. Surprisingly, a SIA$_n$ in the stuck state has a far larger number of annihilation sites with vacancies near the GB than a SIA$_n$ trapped at the GB, because it largely maintains its bulk configuration. In addition, it has an unexpectedly long-ranged repulsive interaction with SIAs in the bulk region, which may further affect the GB's trapping of the SIA$_n$. The self-blocking effect might shed light on a more critical and extended role of the GB in healing radiation damage in NCs than the limited role previously recognized, which was based on the GB's trapping of SIAs and the resulting vacancy-SIA recombination.

preprint · 2015 · arXiv

Spatially Encoding Temporal Correlations to Classify Temporal Data Using Convolutional Neural Networks

We propose an off-line approach to explicitly encode temporal patterns spatially as different types of images, namely, Gramian Angular Fields and Markov Transition Fields. This enables the use of techniques from computer vision for feature learning and classification. We used Tiled Convolutional Neural Networks to learn high-level features from individual GAF, MTF, and GAF-MTF images on 12 benchmark time series datasets and two real spatial-temporal trajectory datasets. The classification results of our approach are competitive with state-of-the-art approaches on both types of data. An analysis of the features and weights learned by the CNNs explains why the approach works.
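
The Markov Transition Field mentioned here can be sketched as follows: values are assigned to quantile bins, a row-normalised transition matrix `W` is estimated from consecutive time steps, and the image entry for time steps `i` and `j` is the transition probability between their bins. The function name, the rank-based binning (which splits ties arbitrarily) and the toy series are assumptions made for illustration.

```python
def markov_transition_field(series, n_bins=4):
    """Return (MTF, transition matrix W, bin index per time step)."""
    n = len(series)
    # Rank-based quantile binning: the lowest n/n_bins values get bin 0, and so on.
    order = sorted(range(n), key=lambda i: series[i])
    bins = [0] * n
    for rank, i in enumerate(order):
        bins[i] = min(rank * n_bins // n, n_bins - 1)
    # Count first-order transitions between the bins of consecutive time steps.
    counts = [[0] * n_bins for _ in range(n_bins)]
    for a, b in zip(bins, bins[1:]):
        counts[a][b] += 1
    # Row-normalise the counts into transition probabilities.
    W = []
    for row in counts:
        total = sum(row)
        W.append([c / total if total else 0.0 for c in row])
    # Spread the probabilities back along both time axes.
    mtf = [[W[bins[i]][bins[j]] for j in range(n)] for i in range(n)]
    return mtf, W, bins


series = [0.1, 0.9, 0.3, 0.7, 0.2, 0.8, 0.4, 0.6]
mtf, W, bins = markov_transition_field(series, n_bins=2)
print(bins)
print(W)
```

In this alternating toy series every low value is followed by a high one and vice versa, so the transition matrix is the 2x2 swap matrix and the field is a checkerboard.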

preprint · 2014 · arXiv

Crafting the strain state in epitaxial thin films: A case study of CoFe2O4 films on Pb(Mg,Nb)O3-PbTiO3

The strain dependence of electric and magnetic properties has been widely investigated, both from a fundamental science perspective and an applications point of view. Electromechanical coupling through field-induced polarization rotation (PRO) and polarization reorientation (PRE) in piezoelectric single crystals can provide an effective strain in film/substrate epitaxial heterostructures. However, the specific pathway of PRO and PRE is a complex thermodynamic process, depending on chemical composition, temperature, electric field and mechanical load. Here, systematic studies of the temperature-dependent field-induced phase transitions in Pb(Mg,Nb)O3-PbTiO3 single crystals with different initial phase and orientation configurations have been performed. Different types of strains, volatile/nonvolatile and biaxial/uniaxial, have been measured by both macroscopic and in-situ X-ray diffraction techniques. In addition, the strain state of epitaxial Mn-doped CoFe2O4 thin films was examined by magnetic anisotropy measurements, where a giant magnetoelectric coupling has been demonstrated.

preprint · 2013 · arXiv

An operational window for radiation-resistant materials based on sequentially healing grain interiors and boundaries

Design of nuclear materials with high radiation tolerance has great significance [1], especially for the next generation of nuclear energy systems [2,3]. The response of nano- and poly-crystals to irradiation depends on the radiation temperature, dose-rate and grain size [4-13]. However, these dependencies have been studied and interpreted individually, and thus the ability to predict the radiation performance of materials in extreme environments is severely lacking. Here we propose an operational window for radiation-resistant materials, which is based on a perspective of interactions among irradiation-induced interstitials, vacancies, and grain boundaries. Using atomic simulations, we find that healing grain boundaries takes much longer than healing grain interiors. Not previously noticed, this finding suggests that priority should be given to recovery of the grain boundary itself. This large disparity in healing time is reflected in the spectra of defect-recombination energy barriers by the presence of one high-barrier peak in addition to the peak of low barriers. The insight gained from the study opens new avenues for examining the role of grain boundaries in healing the material. In particular, we sketch out the radiation-endurance window in the parameter space of temperature, dose-rate and grain size. The window helps evaluate material performance and develop materials resistant to radiation damage.
