Researcher profile

Ruijie Wang

Ruijie Wang contributes to research discovery and scholarly infrastructure.

Researcher | Affiliation not imported | Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging | Verification L1 | Unclaimed author
10 works
0 followers
9 topics
4 close collaborators

Published work

10 published items

preprint | 2026 | arXiv

MasFACT: Continual Multi-Agent Topology Learning via Geometry-Aware Posterior Transfer

Multi-agent systems (MAS) powered by large language models (LLMs) have emerged as a powerful paradigm for complex problem solving, where performance critically depends on the underlying inter-agent communication topology. However, existing topology generation methods mainly optimize for isolated tasks, while real-world deployments involve streams of evolving tasks, requiring previously effective collaboration patterns to be retained and reused rather than rediscovered or overwritten. We identify a previously underexplored failure mode, topology forgetting, in which adapting to new tasks shifts the topology generator away from communication structures required by earlier tasks. This issue stems from cross-task misalignment in both agent-level functional semantics and relational communication structures. To address this challenge, we propose MasFACT, a geometry-aware posterior transfer framework that preserves and reuses historical collaboration knowledge as transferable topology priors. We transfer these priors across task-specific agent spaces through Fused Gromov-Wasserstein optimal transport and perform PAC-Bayes-guided conservative posterior adaptation to balance task-specific plasticity with structural stability. Experiments across class-, domain-, and task-level continual settings demonstrate that MasFACT consistently improves average accuracy while reducing topology forgetting compared to strong topology generation and replay-based baselines, and can be seamlessly integrated with different MAS topology generators.

preprint | 2026 | arXiv

GNN2R: Weakly-Supervised Rationale-Providing Question Answering over Knowledge Graphs

Despite the rapid progress of large language models (LLMs), knowledge graph-based question answering (KGQA) remains essential for producing verifiable and hallucination-resistant answers in many real-world settings where answer trustworthiness and computational efficiency are highly valued. However, most existing KGQA methods provide only final answers in the form of KG entities. Without explicit explanations, ideally in the form of an intermediate reasoning process over relevant KG triples, the QA results are difficult to inspect and interpret. Moreover, this limitation prevents the rich and verifiable knowledge encoded in KGs, which is a key advantage of KGQA over LLMs, from being fully leveraged. Addressing this issue remains highly challenging due to the lack of annotated intermediate reasoning processes and the requirement of high efficiency in KGQA. In this paper, we propose a novel Graph Neural Network-based Two-Step Reasoning method (GNN2R) that can efficiently retrieve both final answers and corresponding reasoning subgraphs as verifiable rationales, using only weak supervision from widely available final answer annotations. We extensively evaluated GNN2R and demonstrated that it substantially outperforms existing state-of-the-art KGQA methods in terms of effectiveness, efficiency, and the quality of generated explanations. The complete code and pre-trained models are available at https://github.com/ruijie-wang-uzh/GNN2R.

preprint | 2026 | arXiv

S2Aligner: Pair-Efficient and Transferable Pre-Training for Sparse Text-Attributed Graphs

Pre-training on text-attributed graphs (TAGs) is central to building transferable graph foundation models, where LLM-as-Aligner methods align graph and text representations through the semantic knowledge of large language models. However, these methods usually assume that node texts provide sufficient and reliable supervision, an assumption often violated in real-world sparse TAGs. When textual anchors are missing, noisy, or uneven across domains, graph structures must be aligned with weak semantic evidence, leading to unreliable structure-semantics correspondence and sparsity-induced transfer bias. This paper presents S2Aligner, a sparsity-aware and structure-enhanced LLM-as-Aligner framework for graph-text pre-training on sparse TAGs. The key idea is to decouple semantic alignment from structural modeling, allowing topology-aware signals to enhance alignment without contaminating the shared semantic space. Specifically, S2Aligner decomposes graph-text representations into semantic and structural components, uses structure-oriented reconstruction with consistency control to inject reliable topology cues into text representations, and suppresses inconsistent structural signals under textual sparsity. Moreover, S2Aligner introduces sparsity-aware cross-domain risk balancing, which calibrates domain risks through a global-domain density ratio and downweights unreliable sparse samples via graph reliability estimation. Theoretical analysis shows that this objective reduces cross-domain generalization gaps by controlling domain risk discrepancy. Extensive experiments across diverse graph domains, sparsity levels, and downstream tasks demonstrate that S2Aligner consistently outperforms existing baselines.

preprint | 2022 | arXiv

RETE: Retrieval-Enhanced Temporal Event Forecasting on Unified Query Product Evolutionary Graph

With the increasing demands on e-commerce platforms, large volumes of user action history are emerging. These rich action records are vital for understanding users' interests and intents. Prior work on user behavior prediction mainly focuses on interactions with product-side information. However, interactions with search queries, which usually act as a bridge between users and products, are still under-investigated. In this paper, we explore a new problem named temporal event forecasting, a generalized user behavior prediction task in a unified query product evolutionary graph, to embrace both query and product recommendation in a temporal manner. This setting involves two challenges: (1) the action data for most users is scarce; (2) user preferences are dynamically evolving and shifting over time. To tackle these issues, we propose a novel Retrieval-Enhanced Temporal Event (RETE) forecasting framework. Unlike existing methods that enhance user representations by coarsely absorbing information from connected entities in the whole graph, RETE efficiently and dynamically retrieves relevant entities centered on each user as high-quality subgraphs, preventing noise propagation from the dense evolutionary graph structures that incorporate abundant search queries. Meanwhile, RETE autoregressively accumulates retrieval-enhanced user representations from each time step to capture evolutionary patterns for joint query and product prediction. Empirically, extensive experiments on both the public benchmark and four real-world industrial datasets demonstrate the effectiveness of the proposed RETE method.

preprint | 2022 | arXiv

Unsupervised Belief Representation Learning with Information-Theoretic Variational Graph Auto-Encoders

This paper develops a novel unsupervised algorithm for belief representation learning in polarized networks that (i) uncovers the latent dimensions of the underlying belief space and (ii) jointly embeds users and content items (that they interact with) into that space in a manner that facilitates a number of downstream tasks, such as stance detection, stance prediction, and ideology mapping. Inspired by total correlation in information theory, we propose the Information-Theoretic Variational Graph Auto-Encoder (InfoVGAE) that learns to project both users and content items (e.g., posts that represent user views) into an appropriate disentangled latent space. To better disentangle latent variables in that space, we develop a total correlation regularization module and a Proportional-Integral (PI) control module, and adopt a rectified Gaussian distribution to ensure orthogonality. The latent representation of users and content can then be used to quantify their ideological leaning and detect/predict their stances on issues. We evaluate the performance of the proposed InfoVGAE on three real-world datasets, of which two are collected from Twitter and one from U.S. Congress voting records. The evaluation results show that our model outperforms state-of-the-art unsupervised models, reducing user clustering errors by 10.5% and achieving 12.1% higher F1 scores for stance separation of content items. In addition, InfoVGAE produces results comparable to those of supervised models. We also discuss its performance on stance prediction and user ranking within ideological groups.

preprint | 2020 | arXiv

Analyzing the Design Space of Re-opening Policies and COVID-19 Outcomes in the US

Recent re-opening policies in the US, following a period of social distancing measures, introduced a significant increase in daily COVID-19 infections, calling for a roll-back or substantial revisiting of these policies in many states. The situation is suggestive of difficulties modeling the impact of partial distancing/re-opening policies on future epidemic spread for purposes of choosing safe alternatives. More specifically, one needs to understand the impact of manipulating the availability of social interaction venues (e.g., schools, workplaces, and retail establishments) on virus spread. We introduce a model, inspired by social networks research, that answers the above question. Our model compartmentalizes interaction venues into categories we call mixing domains, enabling one to predict COVID-19 contagion trends in different geographic regions under different what-if assumptions on partial re-opening of individual domains. We apply our model to several highly impacted states showing (i) how accurately it predicts the extent of current resurgence (from available policy descriptions), and (ii) what alternatives might be more effective at mitigating the second wave. We further compare policies that rely on partial venue closure to policies that espouse widespread periodic testing instead (i.e., in lieu of social distancing). Our models predict that the benefits of (mandatory) testing overshadow the benefits of partial venue closure, suggesting that perhaps more efforts should be directed to such a mitigation strategy.

preprint | 2020 | arXiv

Author Name Disambiguation on Heterogeneous Information Network with Adversarial Representation Learning

Author name ambiguity causes inadequacy and inconvenience in academic information retrieval, which raises the necessity of author name disambiguation (AND). Existing AND methods can be divided into two categories: models that focus on content information to distinguish whether two papers are written by the same author, and models that focus on relation information to represent information as edges on the network and to quantify the similarity among papers. However, the former require adequate labeled samples and informative negative samples, and are also ineffective in measuring the high-order connections among papers, while the latter need complicated feature engineering or supervision to construct the network. We propose a novel generative adversarial framework to grow the two categories of models together: (i) the discriminative module distinguishes whether two papers are from the same author, and (ii) the generative module selects possibly homogeneous papers directly from the heterogeneous information network, which eliminates the complicated feature engineering. In this way, the discriminative module guides the generative module to select homogeneous papers, and the generative module generates high-quality negative samples to train the discriminative module to make it aware of high-order connections among papers. Furthermore, a self-training strategy for the discriminative module and a random-walk-based generating algorithm are designed to make the training stable and efficient. Extensive experiments on two real-world AND benchmarks demonstrate that our model provides significant performance improvement over the state-of-the-art methods.

preprint | 2020 | arXiv

Effects of heterogeneous self-protection awareness on resource-epidemic coevolution dynamics

Recent studies have demonstrated that the allocation of individual resources has a significant influence on the dynamics of epidemic spreading. In real scenarios, individuals have different levels of awareness for self-protection when facing the outbreak of an epidemic. To investigate the effects of the heterogeneous self-awareness distribution on the epidemic dynamics, we propose a resource-epidemic coevolution model in this paper. We first study the effects of the heterogeneous distributions of node degree and self-awareness on the epidemic dynamics on artificial networks. Through extensive simulations, we find that the heterogeneity of self-awareness distribution suppresses the outbreak of an epidemic, and the heterogeneity of degree distribution enhances the epidemic spreading. Next, we study how the correlation between node degree and self-awareness affects the epidemic dynamics. The results reveal that when the correlation is positive, the heterogeneity of self-awareness restrains the epidemic spreading. However, when there is a significant negative correlation, neither a strongly heterogeneous nor a strongly homogeneous distribution of self-awareness is conducive to disease suppression. We find an optimal heterogeneity of self-awareness, at which the disease can be suppressed to the greatest extent. Further research shows that the epidemic threshold increases monotonically when the correlation changes from most negative to most positive, and a critical value of the correlation coefficient is found. When the coefficient is below the critical value, an optimal heterogeneity of self-awareness exists; otherwise, the epidemic threshold decreases monotonically with the decline of the self-awareness heterogeneity. Finally, we verify the results on four typical real-world networks and find that the results on the real-world networks are consistent with those on the artificial networks.

preprint | 2020 | arXiv

Revisiting Over-smoothing in Deep GCNs

Over-smoothing has been assumed to be the major cause of performance drop in deep graph convolutional networks (GCNs). In this paper, we propose a new view that deep GCNs can actually learn to anti-oversmooth during training. This work interprets a standard GCN architecture as layerwise integration of a Multi-layer Perceptron (MLP) and graph regularization. We analyze and conclude that, before training, the final representation of a deep GCN does over-smooth; however, it learns anti-oversmoothing during training. Based on this conclusion, the paper further designs a cheap but effective trick to improve GCN training. We verify our conclusions and evaluate the trick on three citation networks and further provide insights on neighborhood aggregation in GCNs.
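
The over-smoothing effect named in this abstract can be illustrated with a small sketch. It is not the paper's model or its training trick: it assumes a weight-free, linear propagation step (mean aggregation over each node and its neighbors) on a hypothetical four-node path graph, with one scalar feature per node. The names `propagate`, `spread_by_depth`, and `PATH_GRAPH` are illustrative only.

```python
# Sketch of over-smoothing under repeated neighborhood averaging.
# Assumption: no learned weights and no nonlinearity, so this only mimics
# the propagation part of a GCN layer, not a trained network.

# Hypothetical toy graph: a path 0 - 1 - 2 - 3, as an adjacency matrix.
PATH_GRAPH = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]


def propagate(adj, feats):
    """One propagation step: each node takes the mean of itself and its neighbors."""
    n = len(adj)
    out = []
    for i in range(n):
        members = [i] + [j for j in range(n) if adj[i][j]]
        out.append(sum(feats[j] for j in members) / len(members))
    return out


def spread(feats):
    """Gap between the largest and smallest node feature."""
    return max(feats) - min(feats)


def spread_by_depth(adj, feats, depth):
    """Feature spread before propagation and after each of `depth` steps."""
    spreads = [spread(feats)]
    for _ in range(depth):
        feats = propagate(adj, feats)
        spreads.append(spread(feats))
    return spreads


if __name__ == "__main__":
    history = spread_by_depth(PATH_GRAPH, [1.0, 0.0, 0.0, -1.0], 16)
    for layer, value in enumerate(history):
        print(f"after {layer:2d} steps: spread = {value:.6f}")
```

In this toy setting the spread between node features shrinks with every step, so deep stacks of pure propagation make nodes indistinguishable. Whether and how a trained GCN counteracts this is the subject of the paper itself and is not modeled here.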

preprint | 2020 | arXiv

Self-awareness based resource allocation strategy for containment of epidemic spreading

Resource support between individuals is of particular importance in controlling or mitigating epidemic spreading, especially during pandemics. However, the question remains of how we can protect ourselves from being infected while helping others by donating resources to fight the epidemic. To answer this question, we propose a novel resource allocation model that considers individuals' awareness of self-protection. In the model, a tuning parameter is introduced to quantify the reaction strength of individuals when they are aware of the disease. A coupled model of resource allocation and disease spreading is then proposed to study the impact of self-awareness on resource allocation and, in turn, on the dynamics of epidemic spreading. Through theoretical analysis and extensive Monte Carlo simulations, we find that in the stationary state, the system converges to one of two states: entirely healthy or completely infected, which indicates an abrupt increase in the prevalence when there is a shortage of resources. More importantly, we find that being either too cautious or too selfless during the outbreak of an epidemic is unsuitable for disease control. Through extensive simulations, we find the optimal point, at which the epidemic threshold reaches its maximum value and an outbreak can be delayed to the greatest extent. Finally, we further study the effects of network structure on the coupled dynamics. We find that degree heterogeneity promotes the outbreak of disease, and that the network structure does not alter the optimal phenomenon in behavior response.