Source author record

An Zhang

An Zhang appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Machine Learning, math.AP, math.FA, Artificial Intelligence, Information Retrieval, math.CA, math.RT, math.SP

Catalog footprint

What is connected

16 works

8 topics

4 close collaborators


preprint · 2026 · arXiv

Skill1: Unified Evolution of Skill-Augmented Agents via Reinforcement Learning

A persistent skill library allows language model agents to reuse successful strategies across tasks. Maintaining such a library requires three coupled capabilities. The agent selects a relevant skill, utilizes it during execution, and distills new skills from experience. Existing methods optimize these capabilities in isolation or with separate reward sources, resulting in partial and conflicting evolution. We propose Skill1, a framework that trains a single policy to co-evolve skill selection, utilization, and distillation toward a shared task-outcome objective. The policy generates a query to search the skill library, re-ranks candidates to select one, solves the task conditioned on it, and distills a new skill from the trajectory. All learning derives from a single task-outcome signal. Its low-frequency trend credits selection and its high-frequency variation credits distillation. Experiments on ALFWorld and WebShop show that Skill1 outperforms prior skill-based and reinforcement learning baselines. Training dynamics confirm the co-evolution of the three capabilities, and ablations show that removing any credit signal degrades the evolution.
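The abstract does not say how the task-outcome signal is split into a low-frequency trend and a high-frequency variation. One simple way to picture such a split, offered here only as an illustrative assumption and not as the authors' method, is an exponential moving average: the smoothed value plays the role of the trend and the per-episode residual plays the role of the variation. The function name `split_outcome_signal` and the `decay` parameter are hypothetical.

```python
def split_outcome_signal(outcomes, decay=0.9):
    """Split per-episode outcomes into a smoothed trend and a residual.

    Hypothetical illustration only: an exponential moving average is one
    possible low-pass filter; the paper's actual decomposition is not
    described in the abstract.
    """
    trends, residuals = [], []
    trend = outcomes[0]  # initialise the running average at the first outcome
    for outcome in outcomes:
        trend = decay * trend + (1.0 - decay) * outcome  # low-frequency part
        trends.append(trend)
        residuals.append(outcome - trend)  # high-frequency part
    return trends, residuals


# Toy sequence of task outcomes (1.0 = success, 0.0 = failure).
outcomes = [0.0, 1.0, 0.0, 1.0, 1.0, 1.0]
trends, residuals = split_outcome_signal(outcomes)
```

By construction each outcome equals its trend plus its residual, so no part of the signal is lost or counted twice.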

preprint · 2023 · arXiv

CrossCBR: Cross-view Contrastive Learning for Bundle Recommendation

Bundle recommendation aims to recommend a bundle of related items to users, which can satisfy the users' various needs with one-stop convenience. Recent methods usually take advantage of both user-bundle and user-item interaction information to obtain informative representations for users and bundles, corresponding to bundle view and item view, respectively. However, they either use a unified view without differentiation or loosely combine the predictions of two separate views, while the crucial cooperative association between the two views' representations is overlooked. In this work, we propose to model the cooperative association between the two different views through cross-view contrastive learning. By encouraging the alignment of the two separately learned views, each view can distill complementary information from the other view, achieving mutual enhancement. Moreover, by enlarging the dispersion of different users/bundles, the self-discrimination of representations is enhanced. Extensive experiments on three public datasets demonstrate that our method outperforms SOTA baselines by a large margin. Meanwhile, our method requires minimal parameters of three sets of embeddings (user, bundle, and item) and the computational costs are largely reduced due to a more concise graph structure and graph learning module. In addition, various ablation and model studies demystify the working mechanism and justify our hypothesis. Codes and datasets are available at https://github.com/mysbupt/CrossCBR.
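The alignment and dispersion effects described in this abstract are typical of an InfoNCE-style contrastive objective. The sketch below is a generic NumPy version of that loss, not code from the CrossCBR repository: the function name `info_nce`, the temperature of 0.25, and the random toy embeddings are all assumptions made for illustration. Row i of each array is taken to be the same user or bundle seen from the two views.

```python
import numpy as np


def info_nce(view_a, view_b, temperature=0.25):
    """Generic InfoNCE loss between two views (illustrative, not CrossCBR's code).

    Row i of view_a and row i of view_b form the positive pair; every other
    row of view_b acts as a negative for row i of view_a.
    """
    # Cosine similarity: L2-normalise each embedding first.
    a = view_a / np.linalg.norm(view_a, axis=1, keepdims=True)
    b = view_b / np.linalg.norm(view_b, axis=1, keepdims=True)
    logits = a @ b.T / temperature
    # Subtract the row maximum for a numerically stable log-softmax.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # The positives sit on the diagonal.
    return float(-np.mean(np.diag(log_prob)))


rng = np.random.default_rng(0)
bundle_view = rng.normal(size=(8, 16))                      # toy bundle-view embeddings
aligned_item_view = bundle_view + 0.01 * rng.normal(size=(8, 16))
unrelated_item_view = rng.normal(size=(8, 16))

loss_aligned = info_nce(bundle_view, aligned_item_view)
loss_unrelated = info_nce(bundle_view, unrelated_item_view)
```

Minimising this quantity pulls the two views of the same entity together (alignment) while pushing different entities apart (dispersion), which is why the aligned toy pair yields the lower loss.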

preprint · 2023 · arXiv

Duplex Hecke Algebras and Related Quantum Schur Duality

This article introduces the duplex Hecke algebra, which is an infinite dimensional algebra generated by two Hecke algebras. This concept originates from the degenerate duplex Hecke algebra in the theory of Schur-Weyl duality related to enhanced reductive algebraic groups. We will study the finite dimensional natural representation of the duplex Hecke algebra on tensor space and prove that the duplex Hecke algebra forms a duality with the Levi type quantum group.

preprint · 2023 · arXiv

SMPL: Simulated Industrial Manufacturing and Process Control Learning Environments

Traditional biological and pharmaceutical manufacturing plants are controlled by human workers or pre-defined thresholds. Modernized factories have advanced process control algorithms such as model predictive control (MPC). However, there is little exploration of applying deep reinforcement learning to control manufacturing plants. One of the reasons is the lack of high fidelity simulations and standard APIs for benchmarking. To bridge this gap, we develop an easy-to-use library that includes five high-fidelity simulation environments: BeerFMTEnv, ReactorEnv, AtropineEnv, PenSimEnv and mAbEnv, which cover a wide range of manufacturing processes. We build these environments on published dynamics models. Furthermore, we benchmark online and offline, model-based and model-free reinforcement learning algorithms for comparisons of follow-up research.

preprint · 2022 · arXiv

Deconfounding to Explanation Evaluation in Graph Neural Networks

Explainability of graph neural networks (GNNs) aims to answer "Why the GNN made a certain prediction?", which is crucial to interpret the model prediction. The feature attribution framework distributes a GNN's prediction to its input features (e.g., edges), identifying an influential subgraph as the explanation. When evaluating the explanation (i.e., subgraph importance), a standard way is to audit the model prediction based on the subgraph solely. However, we argue that a distribution shift exists between the full graph and the subgraph, causing the out-of-distribution problem. Furthermore, with an in-depth causal analysis, we find the OOD effect acts as the confounder, which brings spurious associations between the subgraph importance and model prediction, making the evaluation less reliable. In this work, we propose Deconfounded Subgraph Evaluation (DSE) which assesses the causal effect of an explanatory subgraph on the model prediction. While the distribution shift is generally intractable, we employ the front-door adjustment and introduce a surrogate variable of the subgraphs. Specifically, we devise a generative model to generate the plausible surrogates that conform to the data distribution, thus approaching the unbiased estimation of subgraph importance. Empirical results demonstrate the effectiveness of DSE in terms of explanation fidelity.

preprint · 2022 · arXiv

Discovering Invariant Rationales for Graph Neural Networks

Intrinsic interpretability of graph neural networks (GNNs) is to find a small subset of the input graph's features -- rationale -- which guides the model prediction. Unfortunately, the leading rationalization models often rely on data biases, especially shortcut features, to compose rationales and make predictions without probing the critical and causal patterns. Moreover, such data biases easily change outside the training distribution. As a result, these models suffer from a huge drop in interpretability and predictive performance on out-of-distribution data. In this work, we propose a new strategy of discovering invariant rationale (DIR) to construct intrinsically interpretable GNNs. It conducts interventions on the training distribution to create multiple interventional distributions. Then it approaches the causal rationales that are invariant across different distributions while filtering out the spurious patterns that are unstable. Experiments on both synthetic and real-world datasets validate the superiority of our DIR in terms of interpretability and generalization ability on graph classification over the leading baselines. Code and datasets are available at https://github.com/Wuyxin/DIR-GNN.

preprint · 2022 · arXiv

Let Invariant Rationale Discovery Inspire Graph Contrastive Learning

Leading graph contrastive learning (GCL) methods perform graph augmentations in two fashions: (1) randomly corrupting the anchor graph, which could cause the loss of semantic information, or (2) using domain knowledge to maintain salient features, which undermines the generalization to other domains. Taking an invariance look at GCL, we argue that a high-performing augmentation should preserve the salient semantics of anchor graphs regarding instance-discrimination. To this end, we relate GCL with invariant rationale discovery, and propose a new framework, Rationale-aware Graph Contrastive Learning (RGCL). Specifically, without supervision signals, RGCL uses a rationale generator to reveal salient features about graph instance-discrimination as the rationale, and then creates rationale-aware views for contrastive learning. This rationale-aware pre-training scheme endows the backbone model with the powerful representation ability, further facilitating the fine-tuning on downstream tasks. On MNIST-Superpixel and MUTAG datasets, visual inspections on the discovered rationales showcase that the rationale generator successfully captures the salient features (i.e. distinguishing semantic nodes in graphs). On biochemical molecule and social network benchmark datasets, the state-of-the-art performance of RGCL demonstrates the effectiveness of rationale-aware views for contrastive learning. Our codes are available at https://github.com/lsh0520/RGCL.

preprint · 2022 · arXiv

Parabolic methods for ultraspherical interpolation inequalities

The carré du champ method is a powerful technique for proving interpolation inequalities with explicit constants in the presence of a non-trivial metric on a manifold. The method applies to some classical Gagliardo-Nirenberg-Sobolev inequalities on the sphere, with optimal constants. Very nonlinear regimes close to the critical Sobolev exponent can be covered using nonlinear parabolic flows of porous medium or fast diffusion type. Considering power law weights is a natural question in relation to symmetry breaking issues for Caffarelli-Kohn-Nirenberg inequalities, but regularity estimates for a complete justification of the computation are missing. We provide the first example of a complete parabolic proof based on a nonlinear flow by regularizing the singularity induced by the weight. Our result is established in the simplified framework of a diffusion built on the ultraspherical operator, which amounts to reducing the problem to functions on the sphere with simple symmetry properties.

preprint · 2022 · arXiv

Reinforced Causal Explainer for Graph Neural Networks

Explainability is crucial for probing graph neural networks (GNNs), answering questions like "Why the GNN model makes a certain prediction?". Feature attribution is a prevalent technique of highlighting the explanatory subgraph in the input graph, which plausibly leads the GNN model to make its prediction. Various attribution methods exploit gradient-like or attention scores as the attributions of edges, then select the salient edges with top attribution scores as the explanation. However, most of these works make an untenable assumption - the selected edges are linearly independent - thus leaving the dependencies among edges largely unexplored, especially their coalition effect. We demonstrate unambiguous drawbacks of this assumption - making the explanatory subgraph unfaithful and verbose. To address this challenge, we propose a reinforcement learning agent, Reinforced Causal Explainer (RC-Explainer). It frames the explanation task as a sequential decision process - an explanatory subgraph is successively constructed by adding a salient edge to connect the previously selected subgraph. Technically, its policy network predicts the action of edge addition, and gets a reward that quantifies the action's causal effect on the prediction. Such reward accounts for the dependency of the newly-added edge and the previously-added edges, thus reflecting whether they collaborate together and form a coalition to pursue better explanations. As such, RC-Explainer is able to generate faithful and concise explanations, and has a better generalization power to unseen graphs. When explaining different GNNs on three graph classification datasets, RC-Explainer achieves better or comparable performance to SOTA approaches w.r.t. predictive accuracy and contrastivity, and safely passes sanity checks and visual inspections. Codes are available at https://github.com/xiangwang1223/reinforced_causal_explainer.

preprint · 2020 · arXiv

Disentangled Graph Collaborative Filtering

Learning informative representations of users and items from the interaction data is of crucial importance to collaborative filtering (CF). Present embedding functions exploit user-item relationships to enrich the representations, evolving from a single user-item instance to the holistic interaction graph. Nevertheless, they largely model the relationships in a uniform manner, while neglecting the diversity of user intents on adopting the items, which could be to pass time, for interest, or shopping for others like families. Such a uniform approach to modeling user interests easily results in suboptimal representations, failing to model diverse relationships and disentangle user intents in representations. In this work, we pay special attention to user-item relationships at the finer granularity of user intents. We hence devise a new model, Disentangled Graph Collaborative Filtering (DGCF), to disentangle these factors and yield disentangled representations. Specifically, by modeling a distribution over intents for each user-item interaction, we iteratively refine the intent-aware interaction graphs and representations. Meanwhile, we encourage independence of different intents. This leads to disentangled representations, effectively distilling information pertinent to each intent. We conduct extensive experiments on three benchmark datasets, and DGCF achieves significant improvements over several state-of-the-art models like NGCF, DisenGCN, and MacridVAE. Further analyses offer insights into the advantages of DGCF on the disentanglement of user intents and interpretability of representations. Our codes are available at https://github.com/xiangwang1223/disentangled_graph_collaborative_filtering.

preprint · 2016 · arXiv

Flows and functional inequalities for fractional operators

This paper collects results concerning global rates and large time asymptotics of a fractional fast diffusion on the Euclidean space, which is deeply related with a family of fractional Gagliardo-Nirenberg-Sobolev inequalities. Generically, self-similar solutions are not optimal for the Gagliardo-Nirenberg-Sobolev inequalities, in strong contrast with usual standard fast diffusion equations based on non-fractional operators. Various aspects of the stability of the self-similar solutions and of the entropy methods like carré du champ and Rényi entropy powers methods are investigated and raise a number of open problems.

preprint · 2016 · arXiv

Optimal functional inequalities for fractional operators on the sphere and applications

This paper is devoted to optimal functional inequalities for fractional Laplace operators on the sphere. Based on spectral properties, subcritical inequalities are established. Their consequences for fractional heat flows are considered. These subcritical inequalities interpolate between fractional Sobolev and subcritical fractional logarithmic Sobolev inequalities. Their optimal constants are determined by a spectral gap. In the subcritical range, the method also provides us with remainder terms which can be considered as an improved version of the optimal inequalities. We also consider inequalities which interpolate between fractional logarithmic Sobolev and fractional Poincaré inequalities. Finally, weighted inequalities involving the fractional Laplacian are obtained in the Euclidean space, using a stereographic projection and scaling properties.

preprint · 2014 · arXiv

Remainder Terms for Several Inequalities on Some Groups of Heisenberg-type

We give some estimates of the remainder terms for several conformally-invariant Sobolev-type inequalities on the Heisenberg group, in analogy with the Euclidean case. By considering the variation of associated functionals, we give a stability result for two dual forms, the fractional Sobolev (Folland-Stein) and Hardy-Littlewood-Sobolev inequalities, in terms of distance to the submanifold of extremizers. Then we compare their remainder terms to improve the inequalities in another way. We also compare, in the limit case $s = Q$ (or $λ = 0$), the remainder terms of the Beckner-Onofri inequality and its dual, the logarithmic Hardy-Littlewood-Sobolev inequality. In addition, we list without proof some results for the other two cases of groups of Iwasawa-type. Our results generalize earlier works on Euclidean spaces by Chen, Frank, Weth [CFW13] and Dolbeault, Jankowiak [DJ14] to some groups of Heisenberg-type.

preprint · 2014 · arXiv

Restriction Theorems On Métivier Groups Associated to Joint Functional Calculus

In this article, we obtain the spectral resolution $\mathcal{P}_μ^{m}$ of the operators $m(\mathcal{L}, -Δ_\mathfrak{z})$, the joint functional calculus of the sub-Laplacian and the Laplacian on the centre of a Métivier group. Then, we give some group analogues of the Thomas-Stein-type restriction theorem, asserting the mixed-norm boundedness of the restriction operators $\mathcal{P}_μ^{m}$ for two classes of functions $m=(a^α+b^β)^γ$ and $m=(1+a^α+b^β)^γ$ with $α, β>0, γ\neq0$.

preprint · 2014 · arXiv

Sharp Hardy-Littlewood-Sobolev Inequalities on Octonionic Heisenberg Group

This paper is the second in a series, following our work [CLZ13], on sharp Hardy-Littlewood-Sobolev inequalities on groups of Heisenberg type. The first important breakthrough was made by Frank and Lieb in [FL12]. In this paper, analogous results are obtained for the octonionic Heisenberg group.

preprint · 2014 · arXiv

Sharp Hardy-Littlewood-Sobolev Inequalities on Quaternionic Heisenberg Groups

In this paper, we obtain several sharp Hardy-Littlewood-Sobolev-type inequalities on quaternionic Heisenberg groups (a general form due to Folland and Stein [FS74]), using the symmetrization-free method of Frank and Lieb [FL12], who considered the analogues on the classical Heisenberg group. First, we give the sharp Hardy-Littlewood-Sobolev inequalities, both on the quaternionic Heisenberg group and in their equivalent form on the quaternionic sphere, for exponents bigger than 4. As expected, the extremizer is, almost uniquely, the constant function on the sphere. Then their dual form, the sharp conformally-invariant Sobolev inequalities, and the right endpoint limit case, the Log-Sobolev inequality, are also obtained. For small exponents less than 4, the constant function is only proved to be a local extremizer. The conformal symmetry of the inequalities and the zero center-of-mass technique play a critical role in the argument.
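For orientation, the classical Euclidean Hardy-Littlewood-Sobolev inequality, which the Heisenberg-type results listed here take as their model, can be written as follows. This is the standard textbook statement, not a formula quoted from the abstracts; the symbols $f$, $g$, $p$, $r$ and the constant $C(n,λ,p)$ are the usual notation and are assumptions relative to this page.

$$
\left| \int_{\mathbb{R}^n} \int_{\mathbb{R}^n} \frac{f(x)\, g(y)}{|x-y|^{\lambda}} \, dx \, dy \right|
\le C(n,\lambda,p)\, \|f\|_{L^p(\mathbb{R}^n)}\, \|g\|_{L^r(\mathbb{R}^n)},
\qquad 0 < \lambda < n, \quad p, r > 1, \quad \frac{1}{p} + \frac{1}{r} + \frac{\lambda}{n} = 2 .
$$

"Sharp" refers to identifying the smallest admissible constant $C$ together with the functions that attain it. In the Heisenberg-type setting the Euclidean distance is replaced by a homogeneous norm on the group and $n$ by the homogeneous dimension $Q$.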
