Researcher profile

Ruofan Liu

Ruofan Liu contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 13 - UnverifiedVerification L1Unclaimed author
2works
0followers
4topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

2 published item(s)

preprint2026arXiv

XSearch: Explainable Code Search via Concept-to-Code Alignment

Semantic code search has been widely adopted in both academia and industry. These approaches embed natural-language queries and code snippets into a shared embedding space and retrieve results based on vector similarity. Despit strong performance on benchmark datasets, they often suffer from poor explainability and generalization. Retrieved code may appear semantically similar yet miss critical functional requirements of the query, while providing no explanation of why the result was retrieved. Moreover, such failures become more severe under distribution shift, where models struggle to generalize to unseen benchmarks. In this work, we propose XSearch, an intrinsically explainable code search framework. Our key insight is that by relying on global embedding similarity, existing retrievers inherently take an inductive view. They learn statistical patterns rather than truly understanding the query's functional requirements. We address this problem by reformulating code search as a deductive concept alignment problem. XSearch (i) identifies functional concepts in the query and (ii) explicitly aligns them with corresponding code statements. This explain-then-predict design produces inherent concept-level explanations and mitigates shortcut learning that harms out-of-distribution generalization. We train an encoder with explicit concept-alignment objectives and perform retrieval through explicit matching between query concepts and code statements. Experiments show that, trained on CodeSearchNet using GraphCodeBERT (125M parameters), XSearch improves performance on out-of-distribution benchmarks from 0.02 to 0.33 (15x) over eight state-of-the-art retrievers, and consistently outperforms both encoder- and decoder-based baselines with up to 7B parameters. A user study demonstrates that concept-alignment explanations enable users to evaluate retrieved results faster and more accurately.

preprint2021arXiv

DeepVisualInsight: Time-Travelling Visualization for Spatio-Temporal Causality of Deep Classification Training

Understanding how the predictions of deep learning models are formed during the training process is crucial to improve model performance and fix model defects, especially when we need to investigate nontrivial training strategies such as active learning, and track the root cause of unexpected training results such as performance degeneration. In this work, we propose a time-travelling visual solution DeepVisualInsight (DVI), aiming to manifest the spatio-temporal causality while training a deep learning image classifier. The spatio-temporal causality demonstrates how the gradient-descent algorithm and various training data sampling techniques can influence and reshape the layout of learnt input representation and the classification boundaries in consecutive epochs. Such causality allows us to observe and analyze the whole learning process in the visible low dimensional space. Technically, we propose four spatial and temporal properties and design our visualization solution to satisfy them. These properties preserve the most important information when inverse-)projecting input samples between the visible low-dimensional and the invisible high-dimensional space, for causal analyses. Our extensive experiments show that, comparing to baseline approaches, we achieve the best visualization performance regarding the spatial/temporal properties and visualization efficiency. Moreover, our case study shows that our visual solution can well reflect the characteristics of various training scenarios, showing good potential of DVI as a debugging tool for analyzing deep learning training processes.