Source author record

Zhiguo Wang

Zhiguo Wang appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Computation and Language, Machine Learning, Computer Vision, math.OC, Applications, physics.optics, Artificial Intelligence, cond-mat.mtrl-sci, Information Retrieval, math.DS, math.ST, Neurons and Cognition, physics.app-ph, Statistics Theory

Catalog footprint

What is connected

23 works

14 topics

4 close collaborators


preprint · 2026 · arXiv

Proximal-Based Generative Modeling for Bayesian Inverse Problems

Score-based diffusion models demonstrate superior performance in generative tasks but encounter fundamental bottlenecks in inverse problems due to the analytical intractability of the time-dependent likelihood score. To bridge this gap, we propose a novel proximal-based generative modeling (PGM) framework that rigorously circumvents explicit likelihood evaluation. Our framework is built upon a theoretical equivalence between Gaussian convolution in diffusion processes and Moreau-Yosida regularization in nonsmooth optimization. This enables a new sampling mechanism driven by the proposed Moreau score, which admits a closed-form expression via proximal operators. Moreover, we introduce Moreau score matching to learn the proximal operators that rely solely on samples drawn from the prior distribution. Theoretically, PGM eliminates the early-stopping bias inherent in the score-based diffusion model and achieves non-asymptotic convergence. Experiments demonstrate that PGM significantly surpasses state-of-the-art methods in reconstruction quality and sampling time.
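As a small illustration of the Moreau-Yosida regularization the abstract refers to, the sketch below uses f(x) = |x| as a toy function (an assumption chosen for this example, not taken from the paper). It computes the proximal operator (soft-thresholding), the Moreau envelope, and the envelope's gradient through the standard identity (x - prox(x)) / lam. It does not reproduce the paper's Moreau score or its learning procedure; all function names are hypothetical.

```python
def prox_abs(x: float, lam: float) -> float:
    """Proximal operator of f(x) = |x| with parameter lam (soft-thresholding)."""
    if x > lam:
        return x - lam
    if x < -lam:
        return x + lam
    return 0.0


def moreau_envelope_abs(x: float, lam: float) -> float:
    """Moreau envelope min_y |y| + (x - y)^2 / (2 * lam), evaluated at y = prox(x)."""
    y = prox_abs(x, lam)
    return abs(y) + (x - y) ** 2 / (2.0 * lam)


def moreau_gradient_abs(x: float, lam: float) -> float:
    """Gradient of the envelope via the identity (x - prox(x)) / lam."""
    return (x - prox_abs(x, lam)) / lam
```

For this toy function the envelope is the Huber function: quadratic for |x| <= lam and linear beyond it, so its gradient is defined everywhere even though |x| itself is not differentiable at 0.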

preprint · 2023 · arXiv

Beyond ADMM: A Unified Client-variance-reduced Adaptive Federated Learning Framework

As a novel distributed learning paradigm, federated learning (FL) faces serious challenges in dealing with massive numbers of clients that have heterogeneous data distributions and heterogeneous computation and communication resources. Various client-variance-reduction schemes and client sampling strategies have each been introduced to improve the robustness of FL. Among others, primal-dual algorithms such as the alternating direction method of multipliers (ADMM) have been found to be resilient to data distribution and to outperform most primal-only FL algorithms. However, the reason behind this remains a mystery. In this paper, we first reveal that federated ADMM is essentially a client-variance-reduced algorithm. While this explains the inherent robustness of federated ADMM, its vanilla version cannot adapt to the degree of client heterogeneity. Moreover, the global model at the server under client sampling is biased, which slows down practical convergence. To go beyond ADMM, we propose a novel primal-dual FL algorithm, termed FedVRA, that allows one to adaptively control the variance-reduction level and the bias of the global model. In addition, FedVRA unifies several representative FL algorithms in the sense that they are either special instances of FedVRA or close to it. Extensions of FedVRA to semi-supervised and unsupervised learning are also presented. Experiments based on (semi-)supervised image classification tasks demonstrate the superiority of FedVRA over existing schemes in learning scenarios with massive heterogeneous clients and client sampling.

preprint · 2023 · arXiv

Flexo-photovoltaic effect and above-bandgap photovoltage in halide perovskites

Halide perovskites have outstanding photovoltaic properties which have been optimized through interfacial engineering. However, as these materials approach the limits imposed by the physics of semiconductor junctions, it is urgent to explore alternatives, such as the bulk photovoltaic effect, whose physical origin is different and not bound by the same limits. In this context, we focus on the flexo-photovoltaic effect, a type of bulk photovoltaic effect that was recently observed in oxides under strain gradients. We have measured the flexo-photovoltaic effect of MAPbBr3 and MAPbI3 crystals under bending and found it to be orders of magnitude larger than for SrTiO3, the benchmark flexo-photovoltaic oxide. For sufficiently large strain gradients, photovoltages bigger than the bandgap can be produced. Bulk photovoltaic effects are additive and, for MAPbI3, the flexo-photovoltage exists on top of a native bulk photovoltage that is hysteretic, consistent with the electrically switchable macroscopic polarization of this material. The results suggest that harnessing the flexo-photovoltaic effect through strain gradient engineering can provide a functional leap forward for halide perovskites.

preprint · 2022 · arXiv

An Unbiased Symmetric Matrix Estimator for Topology Inference under Partial Observability

Network topology inference is a fundamental problem in many applications of network science, such as locating the source of fake news and detecting brain connectivity networks. Many real-world situations suffer from the critical problem that only a limited subset of the nodes can be observed. This letter considers the problem of network topology inference under the framework of partial observability. Based on the vector autoregressive model, we propose a novel unbiased estimator for the symmetric network topology under Gaussian noise and the Laplacian combination rule. Theoretically, we prove that it converges to the network combination matrix in probability. Furthermore, by utilizing the Gaussian mixture model algorithm, an effective algorithm called the network inference Gauss algorithm is developed to infer the network structure. Finally, numerical experiments demonstrate that, compared with state-of-the-art methods, the proposed algorithm enjoys better performance in the case of small sample sizes.

preprint · 2022 · arXiv

REKnow: Enhanced Knowledge for Joint Entity and Relation Extraction

Relation extraction is an important but challenging task that aims to extract all hidden relational facts from the text. With the development of deep language models, relation extraction methods have achieved good performance on various benchmarks. However, we observe two shortcomings of previous methods: first, there is no unified framework that works well under various relation extraction settings; second, external knowledge is not effectively used as background information. In this work, we propose a knowledge-enhanced generative model to mitigate these two issues. Our generative model is a unified framework that sequentially generates relational triplets under various relation extraction settings and explicitly utilizes relevant knowledge from a Knowledge Graph (KG) to resolve ambiguities. Our model achieves superior performance on multiple benchmarks and settings, including WebNLG, NYT10, and TACRED.

preprint · 2021 · arXiv

Entity-level Factual Consistency of Abstractive Text Summarization

A key challenge for abstractive summarization is ensuring factual consistency of the generated summary with respect to the original document. For example, state-of-the-art models trained on existing datasets exhibit entity hallucination, generating names of entities that are not present in the source document. We propose a set of new metrics to quantify the entity-level factual consistency of generated summaries, and we show that the entity hallucination problem can be alleviated by simply filtering the training data. In addition, we propose adding a summary-worthy entity classification task to the training process, as well as a joint entity and summary generation approach, which yield further improvements in entity-level metrics.
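The idea of measuring entity hallucination and filtering training data can be sketched roughly as follows. This is only an illustration under stated assumptions: entities are assumed to be already extracted as strings (a real pipeline would use a named-entity recognizer), matching is a case-insensitive substring test, and the function names and the strict "keep only fully supported summaries" rule are hypothetical rather than the paper's exact metric definitions.

```python
def entity_precision(summary_entities: list[str], source_text: str) -> float:
    """Fraction of summary entities that also appear in the source text."""
    if not summary_entities:
        # Assumption: a summary with no entities has nothing to hallucinate.
        return 1.0
    source = source_text.lower()
    found = sum(1 for entity in summary_entities if entity.lower() in source)
    return found / len(summary_entities)


def filter_training_pairs(
    pairs: list[tuple[str, str, list[str]]],
) -> list[tuple[str, str, list[str]]]:
    """Keep (source, summary, summary_entities) triples with no unsupported entity."""
    return [
        (source, summary, entities)
        for source, summary, entities in pairs
        if entity_precision(entities, source) == 1.0
    ]
```

A pair whose reference summary mentions an entity absent from the source is dropped, so the model is not trained to produce names it cannot ground in the document.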

preprint · 2020 · arXiv

A Promotion Method for Generation Error Based Video Anomaly Detection

Surveillance video anomaly detection aims to detect events that rarely or never happen in a certain scene. Generation error (GE)-based methods exhibit excellent performance on this task. They first train a generative neural network (GNN) to generate normal samples, then judge samples with large GEs to be anomalies. Almost all GE-based methods utilize frame-level GEs to detect anomalies. However, anomalies generally occur in local areas, so the frame-level GE introduces the GEs of normal areas into the anomaly decision, which brings two problems: i) the GE of normal areas reduces the anomaly saliency of the anomalous frame; ii) different videos have different normal GE levels, so it is hard to set a uniform threshold for all videos. To address these problems, we propose a promotion method: utilize the maximum of the block-level GEs on the frame to detect anomalies. First, we calculate the block-level GE at each position on the frame. Then, we use the maximum of these block-level GEs to detect anomalies. Experiments based on existing GNN models are carried out on multiple datasets. The results demonstrate the effectiveness of the proposed method, which achieves state-of-the-art performance.
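The contrast between a frame-level score and the maximum block-level score can be sketched as below. This is a minimal illustration, assuming the per-pixel generation errors are already available as a 2D list and that a block's GE is the mean error inside a square window; the block size, stride, and function names are hypothetical choices, not values from the paper.

```python
def frame_level_ge(errors: list[list[float]]) -> float:
    """Mean generation error over the whole frame."""
    values = [e for row in errors for e in row]
    return sum(values) / len(values)


def max_block_ge(errors: list[list[float]], block: int, stride: int = 1) -> float:
    """Largest mean error over all block x block windows of the frame."""
    height, width = len(errors), len(errors[0])
    if block > height or block > width:
        raise ValueError("block must fit inside the frame")
    best = float("-inf")
    for top in range(0, height - block + 1, stride):
        for left in range(0, width - block + 1, stride):
            total = sum(
                errors[r][c]
                for r in range(top, top + block)
                for c in range(left, left + block)
            )
            best = max(best, total / (block * block))
    return best
```

For a 4x4 frame that is zero everywhere except a 2x2 patch with error 1.0, the frame-level score is 0.25 while the maximum 2x2 block score is 1.0, which shows how averaging over normal areas dilutes a local anomaly.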

preprint · 2020 · arXiv

Hybrid Tree-based Models for Insurance Claims

Two-part models and Tweedie generalized linear models (GLMs) have been used to model loss costs for short-term insurance contracts. For most portfolios of insurance claims, there is typically a large proportion of zero claims, which leads to imbalances that result in inferior prediction accuracy for these traditional approaches. This article proposes the use of tree-based models with a hybrid structure that involves a two-step algorithm as an alternative to these traditional models. The first step is the construction of a classification tree to build the probability model for frequency. In the second step, we employ elastic net regression models at each terminal node of the classification tree to build the distribution model for severity. This hybrid structure captures the benefits of tuning hyperparameters at each step of the algorithm; this allows for improved prediction accuracy, and tuning can be performed to meet specific business objectives. We examine and compare the predictive performance of such a hybrid tree-based structure in relation to the traditional Tweedie model using both real and synthetic datasets. Our empirical results show that these hybrid tree-based models produce more accurate predictions without the loss of intuitive interpretation.

preprint · 2020 · arXiv

Optimally Combining Classifiers for Semi-Supervised Learning

This paper considers semi-supervised learning for tabular data. It is widely known that XGBoost, which is based on tree models, works well on heterogeneous features, while the transductive support vector machine can exploit the low-density separation assumption. However, little work has been done to combine them for end-to-end semi-supervised learning. In this paper, we find that these two methods have complementary properties and larger diversity, which motivates us to propose a new semi-supervised learning method that adaptively combines the strengths of XGBoost and the transductive support vector machine. Instead of the majority vote rule, an optimization problem in terms of the ensemble weight is established, which helps to obtain more accurate pseudo labels for unlabeled data. The experimental results on the UCI data sets and a real commercial data set demonstrate the superior classification performance of our method over five state-of-the-art algorithms, improving test accuracy by about $3\%-4\%$. The partial code can be found at https://github.com/hav-cam-mit/CTO.

preprint · 2020 · arXiv

Template-Based Question Generation from Retrieved Sentences for Improved Unsupervised Question Answering

Question Answering (QA) is in increasing demand as the amount of information available online and the desire for quick access to this content grows. A common approach to QA has been to fine-tune a pretrained language model on a task-specific labeled dataset. This paradigm, however, relies on scarce, and costly to obtain, large-scale human-labeled data. We propose an unsupervised approach to training QA models with generated pseudo-training data. We show that generating questions for QA training by applying a simple template on a related, retrieved sentence rather than the original context sentence improves downstream QA performance by allowing the model to learn more complex context-question relationships. Training a QA model on this data gives a relative improvement over a previous unsupervised model in F1 score on the SQuAD dataset by about 14%, and 20% when the answer is a named entity, achieving state-of-the-art performance on SQuAD for unsupervised QA.

preprint · 2020 · arXiv

Triplet Online Instance Matching Loss for Person Re-identification

Mining the shared features of the same identity in different scenes, and the unique features of different identities in the same scene, are the most significant challenges in the field of person re-identification (ReID). The Online Instance Matching (OIM) loss function and the Triplet loss function are the main methods for person ReID. Unfortunately, both of them have drawbacks. OIM loss treats all samples equally and puts no emphasis on hard samples. Triplet loss requires complicated and fussy batch construction and converges slowly. To address these problems, we propose a Triplet Online Instance Matching (TOIM) loss function, which emphasizes hard samples and effectively improves the accuracy of person ReID. It combines the advantages of OIM loss and Triplet loss and simplifies batch construction, which leads to more rapid convergence. It can be trained online when handling the joint detection and identification task. To validate our loss function, we collect and annotate a large-scale benchmark dataset (UESTC-PR) based on images taken from surveillance cameras, which contains 499 identities and 60,437 images. We evaluated our proposed loss function on Duke, Market-1501 and UESTC-PR using ResNet-50, and the results show that it outperforms the baseline methods, including Softmax loss, OIM loss and Triplet loss, by a maximum of 21.7%.

preprint · 2016 · arXiv

AMR-to-text generation as a Traveling Salesman Problem

The task of AMR-to-text generation is to generate grammatical text that sustains the semantic meaning for a given AMR graph. We attack the task by first partitioning the AMR graph into smaller fragments, and then generating the translation for each fragment, before finally deciding the order by solving an asymmetric generalized traveling salesman problem (AGTSP). A Maximum Entropy classifier is trained to estimate the traveling costs, and a TSP solver is used to find the optimized solution. The final model reports a BLEU score of 22.44 on the SemEval-2016 Task8 dataset.

preprint · 2016 · arXiv

Coverage Embedding Models for Neural Machine Translation

In this paper, we enhance the attention-based neural machine translation (NMT) by adding explicit coverage embedding models to alleviate issues of repeating and dropping translations in NMT. For each source word, our model starts with a full coverage embedding vector to track the coverage status, and then keeps updating it with neural networks as the translation goes. Experiments on the large-scale Chinese-to-English task show that our enhanced model improves the translation quality significantly on various test sets over the strong large vocabulary NMT system.

preprint · 2016 · arXiv

Monte Carlo Set-Membership Filtering for Nonlinear Dynamic Systems

When the underlying probability density functions of nonlinear dynamic systems are unknown, the filtering problem is known to be challenging. This paper attempts to make progress on this problem by proposing a new class of filtering methods in the bounded-noise setting via set-membership theory and a Monte Carlo (boundary) sampling technique, called the Monte Carlo set-membership filter. The set-membership prediction and measurement update are derived by recent convex optimization methods based on the S-procedure and the Schur complement. To guarantee on-line usage, the nonlinear dynamics are linearized about the current estimate and the remainder terms are then bounded by an optimization ellipsoid, which can be described as a semi-infinite optimization problem. In general, it is an analytically intractable problem when dynamic systems are nonlinear. However, for a typical nonlinear dynamic system in target tracking, we can analytically derive some regular properties for the remainder. Moreover, based on the remainder properties and the inverse function theorem, the semi-infinite optimization problem can be efficiently solved by the Monte Carlo boundary sampling technique. Numerical examples show that when the probability density functions of the noises are unknown, the Monte Carlo set-membership filter performs better than the particle filter.

preprint · 2016 · arXiv

Multi-Perspective Context Matching for Machine Comprehension

Previous machine comprehension (MC) datasets are either too small to train end-to-end deep learning models, or not difficult enough to evaluate the ability of current MC techniques. The newly released SQuAD dataset alleviates these limitations, and gives us a chance to develop more realistic MC models. Based on this dataset, we propose a Multi-Perspective Context Matching (MPCM) model, which is an end-to-end system that directly predicts the answer beginning and ending points in a passage. Our model first adjusts each word-embedding vector in the passage by multiplying it by a relevancy weight computed against the question. Then, we encode the question and the weighted passage using bi-directional LSTMs. For each point in the passage, our model matches the context of this point against the encoded question from multiple perspectives and produces a matching vector. Given those matching vectors, we employ another bi-directional LSTM to aggregate all the information and predict the beginning and ending points. Experimental results on the test set of SQuAD show that our model achieves a competitive result on the leaderboard.

preprint · 2016 · arXiv

Sense Embedding Learning for Word Sense Induction

Conventional word sense induction (WSI) methods usually represent each instance with discrete linguistic features or cooccurrence features, and train a model for each polysemous word individually. In this work, we propose to learn sense embeddings for the WSI task. In the training stage, our method induces several sense centroids (embeddings) for each polysemous word. In the testing stage, our method represents each instance as a contextual vector, and induces its sense by finding the nearest sense centroid in the embedding space. The advantages of our method are (1) distributed sense vectors are taken as the knowledge representations, which are trained discriminatively and usually have better performance than traditional count-based distributional models, and (2) a general model for the whole vocabulary is jointly trained to induce sense centroids under the multitask learning framework. Evaluated on the SemEval-2010 WSI dataset, our method outperforms all participants and most of the recent state-of-the-art methods. We further verify the two advantages by comparing with carefully designed baselines.
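The testing-stage step described above, assigning an instance to its nearest sense centroid, can be sketched as follows. This is a minimal illustration that assumes the contextual vector and the centroids are already computed and that "nearest" means highest cosine similarity; the similarity measure and the function names are assumptions, not details confirmed by the abstract.

```python
import math


def cosine(u: list[float], v: list[float]) -> float:
    """Cosine similarity between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # Assumption: treat a zero vector as similar to nothing.
    return dot / (norm_u * norm_v)


def induce_sense(context_vector: list[float], centroids: list[list[float]]) -> int:
    """Return the index of the sense centroid most similar to the context vector."""
    scores = [cosine(context_vector, centroid) for centroid in centroids]
    return max(range(len(scores)), key=scores.__getitem__)
```

With centroids `[[1.0, 0.0], [0.0, 1.0]]`, a context vector such as `[0.9, 0.1]` is assigned to sense 0 and `[0.2, 0.8]` to sense 1.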

preprint · 2016 · arXiv

Supervised Attentions for Neural Machine Translation

In this paper, we improve the attention or alignment accuracy of neural machine translation by utilizing the alignments of training sentence pairs. We simply compute the distance between the machine attentions and the "true" alignments, and minimize this cost in the training procedure. Our experiments on a large-scale Chinese-to-English task show that our model improves both translation and alignment quality significantly over the large-vocabulary neural machine translation system, and even beats a state-of-the-art traditional syntax-based system.
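A rough sketch of such an attention cost is shown below. It assumes that attentions and reference alignments are given as matrices with one row per target word and one column per source word, that the distance is a squared Euclidean distance averaged over target words, and that it is added to the translation loss with a weight. The abstract does not specify the distance or the weighting, so both are assumptions and the function names are hypothetical.

```python
def attention_distance(
    attention: list[list[float]], alignment: list[list[float]]
) -> float:
    """Squared distance between attention and alignment rows, averaged over target words."""
    if len(attention) != len(alignment):
        raise ValueError("attention and alignment need the same number of rows")
    total = 0.0
    for attn_row, align_row in zip(attention, alignment):
        total += sum((a - b) ** 2 for a, b in zip(attn_row, align_row))
    return total / len(attention)


def joint_loss(
    translation_loss: float,
    attention: list[list[float]],
    alignment: list[list[float]],
    weight: float = 1.0,
) -> float:
    """Translation loss plus a weighted attention-supervision penalty."""
    return translation_loss + weight * attention_distance(attention, alignment)
```

When the attention matrix equals the reference alignment the penalty is zero, so the extra term only affects training where the model attends to the wrong source words.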

preprint · 2016 · arXiv

Vocabulary Manipulation for Neural Machine Translation

In order to capture rich language phenomena, neural machine translation models have to use a large vocabulary size, which requires high computing time and large memory usage. In this paper, we alleviate this issue by introducing a sentence-level or batch-level vocabulary, which is only a very small sub-set of the full output vocabulary. For each sentence or batch, we only predict the target words in its sentence-level or batch-level vocabulary. Thus, we reduce both the computing time and the memory usage. Our method simply takes into account the translation options of each word or phrase in the source sentence, and picks a very small target vocabulary for each sentence based on a word-to-word translation model or a bilingual phrase library learned from a traditional machine translation model. Experimental results on the large-scale English-to-French task show that our method achieves better translation performance by 1 BLEU point over the large vocabulary neural machine translation system of Jean et al. (2015).
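The sentence-level vocabulary idea can be sketched as follows. This is only an illustration: the translation table is a toy dictionary standing in for a word-to-word translation model, the `always_include` list (for example an end-of-sentence symbol) is an assumption added for the example, and the function names are hypothetical. The softmax is then normalized over the small vocabulary instead of the full output vocabulary.

```python
import math


def sentence_vocabulary(
    source_words: list[str],
    translation_table: dict[str, list[str]],
    always_include: list[str],
) -> list[str]:
    """Collect candidate target words for one source sentence."""
    vocab = set(always_include)
    for word in source_words:
        vocab.update(translation_table.get(word, []))
    return sorted(vocab)


def restricted_softmax(logits: dict[str, float], vocab: list[str]) -> dict[str, float]:
    """Softmax over the sentence-level vocabulary only."""
    peak = max(logits[word] for word in vocab)  # subtract the max for stability
    exps = {word: math.exp(logits[word] - peak) for word in vocab}
    normalizer = sum(exps.values())
    return {word: value / normalizer for word, value in exps.items()}
```

Because the normalizer sums over only the selected words, the cost of the output layer scales with the size of the sentence-level vocabulary rather than with the full target vocabulary.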

preprint · 2015 · arXiv

FAQ-based Question Answering via Word Alignment

In this paper, we propose a novel word-alignment-based method to solve the FAQ-based question answering task. First, we employ a neural network model to calculate question similarity, where the word alignment between two questions is used for extracting features. Second, we design a bootstrap-based feature extraction method to extract a small set of effective lexical features. Third, we propose a learning-to-rank algorithm to train parameters more suitable for the ranking tasks. Experimental results, conducted on three languages (English, Spanish and Japanese), demonstrate that the question similarity model is more effective than baseline systems, the sparse features bring 5% improvements on top-1 accuracy, and the learning-to-rank algorithm works significantly better than the traditional method. We further evaluate our method on the answer sentence selection task. Our method outperforms all the previous systems on the standard TREC data set.

preprint · 2015 · arXiv

Photonic Floquet Topological Insulator in an Atomic Ensemble

We demonstrate the photonic Floquet topological insulator (PFTI) in an atomic vapor with nonlinear susceptibilities. The interference of three coupling fields splits the energy levels periodically to form a periodic refractive index structure with honeycomb symmetry that can be adjusted by the choice of frequency detunings and intensities of the coupling fields, which all affect the appearance of Dirac cones in the momentum space. When the honeycomb lattice sites are helically ordered along the propagation direction, we obtain a PFTI in the atomic vapor in which an obliquely incident beam moves along the zigzag edge without scattering energy into the PFTI, due to the confinement of the edge states. The appearance of Dirac cones and the formation of PFTI is strongly affected by the nonlinear susceptibilities; i.e. the PFTI can be shut off by the third-order nonlinear susceptibility and re-opened up by the fifth-order one.

preprint · 2015 · arXiv

Posterior Cramer-Rao Bounds for Discrete-Time Nonlinear Filtering with Finitely Correlated Noise

In this paper, an approximate recursive formula for the mean-square error lower bound of the discrete-time nonlinear filtering problem with temporally correlated noises is derived based on the Van Trees (posterior) version of the Cramer-Rao inequality. The formula is unified in the sense that it can be applied to multi-step correlated process noise, multi-step correlated measurement noise and multi-step cross-correlated process and measurement noise simultaneously. The lower bound is evaluated on two typical target tracking examples. Both show that the new lower bound is significantly different from that of the method which ignores the correlation of noises. Thus, when the bounds are applied to sensor selection problems, the number of sensors selected to obtain a desired estimation performance becomes very different.

preprint · 2014 · arXiv

Three-dimensional nonparaxial accelerating beams from the transverse Whittaker integral

We investigate three-dimensional nonparaxial linear accelerating beams arising from the transverse Whittaker integral. They include different Mathieu, Weber, and Fresnel beams, among others. These beams accelerate along a semicircular trajectory, with almost invariant nondiffracting shapes. The transverse patterns of accelerating beams are determined by their angular spectra, which are constructed from the Mathieu functions, Weber functions, and Fresnel integrals. Our results not only enrich the understanding of multidimensional nonparaxial accelerating beams, but also display their real applicative potential, owing to the usefulness of Mathieu and Weber functions, and Fresnel integrals in describing a wealth of wave phenomena in nature.

preprint · 2013 · arXiv

Using a Dynamic Neural Field Model to Explore a Direct Collicular Inhibition Account of Inhibition of Return

When the interval between a transient flash of light (a "cue") and a second visual response signal (a "target") exceeds at least 200 ms, responding is slowest in the direction indicated by the first signal. This phenomenon is commonly referred to as inhibition of return (IOR). The dynamic neural field model (DNF) has proven to have broad explanatory power for IOR, effectively capturing many empirical results. Previous work has used a short-term depression (STD) implementation of IOR, but this approach fails to explain many behavioral phenomena observed in the literature. Here, we explore a variant model of IOR involving a combination of STD and delayed direct collicular inhibition. We demonstrate that this hybrid model can better reproduce established behavioural results. We use the results of this model to propose several experiments that would yield particularly valuable insight into the nature of the neurophysiological mechanisms underlying IOR.
