Researcher profile

Zhihan Zhang

Zhihan Zhang contributes to research discovery and scholarly infrastructure.

Researcher
Affiliation not imported
Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
7 works
0 followers
8 topics
4 close collaborators

Published work

7 published items

Preprint, 2025, arXiv

Efficient self-consistent learning of gate set Pauli noise

Understanding quantum noise is an essential step towards building practical quantum information processing systems. Pauli noise is a useful model that has been widely applied in quantum benchmarking, error mitigation, and error correction. Despite intensive study, most existing works focus on learning Pauli noise channels associated with some specific gates rather than treating the gate set as a whole. A learning algorithm that is self-consistent, complete, and efficient at the same time is yet to be established. In this work, we study the task of gate set Pauli noise learning, where a set of quantum gates, state preparation, and measurements all suffer from unknown Pauli noise channels with a customized noise ansatz. Using tools from algebraic graph theory, we analytically characterize the self-consistently learnable degrees of freedom for Pauli noise models with arbitrary linear ansatz, and design experiments to efficiently learn all the learnable information. Specifically, we show that all learnable information about the gate noise can be learned to relative precision, under mild assumptions on the noise ansatz. We then demonstrate the flexibility of our theory by applying it to concrete physically motivated ansatzes (such as spatially local or quasi-local noise) and experimentally relevant gate sets (such as parallel CZ gates). These results not only enhance the theoretical understanding of quantum noise learning, but also provide a feasible recipe for characterizing existing and near-future quantum information processing devices.

Preprint, 2022, arXiv

Diversifying Content Generation for Commonsense Reasoning with Mixture of Knowledge Graph Experts

Generative commonsense reasoning (GCR) in natural language is to reason about the commonsense while generating coherent text. Recent years have seen a surge of interest in improving the generation quality of commonsense reasoning tasks. Nevertheless, these approaches have seldom investigated diversity in the GCR tasks, which aims to generate alternative explanations for a real-world situation or predict all possible outcomes. Diversifying GCR is challenging as it expects to generate multiple outputs that are not only semantically different but also grounded in commonsense knowledge. In this paper, we propose MoKGE, a novel method that diversifies the generative reasoning by a mixture of expert (MoE) strategy on commonsense knowledge graphs (KG). A set of knowledge experts seek diverse reasoning on KG to encourage various generation outputs. Empirical experiments demonstrated that MoKGE can significantly improve the diversity while achieving on par performance on accuracy on two GCR benchmarks, based on both automatic and human evaluations.

Preprint, 2022, arXiv

On the Relationship Between Counterfactual Explainer and Recommender

Recommender systems employ machine learning models to learn from historical data to predict the preferences of users. Deep neural network (DNN) models such as neural collaborative filtering (NCF) are increasingly popular. However, the tangibility and trustworthiness of the recommendations are questionable due to the complexity and lack of explainability of the models. To enable explainability, recent techniques such as ACCENT and FIA are looking for counterfactual explanations that are specific historical actions of a user, the removal of which leads to a change to the recommendation result. In this work, we present a general framework for both DNN and non-DNN models so that the counterfactual explainers all belong to it with specific choices of components. This framework first estimates the influence of a certain historical action after its removal and then uses search algorithms to find the minimal set of such actions for the counterfactual explanation. With this framework, we are able to investigate the relationship between the explainers and recommenders. We empirically study two recommender models (NCF and Factorization Machine) and two datasets (MovieLens and Yelp). We analyze the relationship between the performance of the recommender and the quality of the explainer. We observe that with standard evaluation metrics, the explainers deliver worse performance when the recommendations are more accurate. This indicates that having good explanations to correct predictions is harder than having them to wrong predictions. The community needs more fine-grained evaluation metrics to measure the quality of counterfactual explanations to recommender systems.

Preprint, 2021, arXiv

Knowledge-Aware Procedural Text Understanding with Multi-Stage Training

Procedural text describes dynamic state changes during a step-by-step natural process (e.g., photosynthesis). In this work, we focus on the task of procedural text understanding, which aims to comprehend such documents and track entities' states and locations during a process. Although recent approaches have achieved substantial progress, their results are far behind human performance. Two challenges, the difficulty of commonsense reasoning and data insufficiency, still remain unsolved, which require the incorporation of external knowledge bases. Previous works on external knowledge injection usually rely on noisy web mining tools and heuristic rules with limited applicable scenarios. In this paper, we propose a novel KnOwledge-Aware proceduraL text understAnding (KOALA) model, which effectively leverages multiple forms of external knowledge in this task. Specifically, we retrieve informative knowledge triples from ConceptNet and perform knowledge-aware reasoning while tracking the entities. Besides, we employ a multi-stage training schema which fine-tunes the BERT model over unlabeled data collected from Wikipedia before further fine-tuning it on the final model. Experimental results on two procedural text datasets, ProPara and Recipes, verify the effectiveness of the proposed methods, in which our model achieves state-of-the-art performance in comparison to various baselines.

Preprint, 2021, arXiv

SPAN: Spatial Pyramid Attention Network for Image Manipulation Localization

We present a novel framework, Spatial Pyramid Attention Network (SPAN), for detection and localization of multiple types of image manipulations. The proposed architecture efficiently and effectively models the relationship between image patches at multiple scales by constructing a pyramid of local self-attention blocks. The design includes a novel position projection to encode the spatial positions of the patches. SPAN is trained on a generic, synthetic dataset but can also be fine-tuned for specific datasets. The proposed method shows significant gains in performance on standard datasets over previous state-of-the-art methods.

Preprint, 2020, arXiv

DCA: Diversified Co-Attention towards Informative Live Video Commenting

We focus on the task of Automatic Live Video Commenting (ALVC), which aims to generate real-time video comments with both video frames and other viewers' comments as inputs. A major challenge in this task is how to properly leverage the rich and diverse information carried by video and text. In this paper, we aim to collect diversified information from video and text for informative comment generation. To achieve this, we propose a Diversified Co-Attention (DCA) model for this task. Our model builds bidirectional interactions between video frames and surrounding comments from multiple perspectives via metric learning, to collect a diversified and informative context for comment generation. We also propose an effective parameter orthogonalization technique to avoid excessive overlap of information learned from different perspectives. Results show that our approach outperforms existing methods in the ALVC task, achieving new state-of-the-art results.

Preprint, 2020, arXiv

Learning Attribute-Structure Co-Evolutions in Dynamic Graphs

Most graph neural network models learn embeddings of nodes in static attributed graphs for predictive analysis. Recent attempts have been made to learn temporal proximity of the nodes. We find that real dynamic attributed graphs exhibit complex co-evolution of node attributes and graph structure. Learning node embeddings for forecasting change of node attributes and birth and death of links over time remains an open problem. In this work, we present a novel framework called CoEvoGNN for modeling dynamic attributed graph sequence. It preserves the impact of earlier graphs on the current graph by embedding generation through the sequence. It has a temporal self-attention mechanism to model long-range dependencies in the evolution. Moreover, CoEvoGNN optimizes model parameters jointly on two dynamic tasks, attribute inference and link prediction over time. So the model can capture the co-evolutionary patterns of attribute change and link formation. This framework can adapt to any graph neural algorithms so we implemented and investigated three methods based on it: CoEvoGCN, CoEvoGAT, and CoEvoSAGE. Experiments demonstrate the framework (and its methods) outperform strong baselines on predicting an entire unseen graph snapshot of personal attributes and interpersonal links in dynamic social graphs and financial graphs.