Researcher profile

Hao Zhong

Hao Zhong contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 17 - UnverifiedVerification L1Unclaimed author
4works
0followers
5topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

4 published item(s)

preprint2026arXiv

CatchAll: Repository-Aware Exception Handling with Knowledge-Guided LLMs

Exception handling is a vital forward error-recovery mechanism in many programming languages, enabling developers to manage runtime anomalies through structured constructs (e.g., try-catch blocks). Improper or missing exception handling often leads to severe consequences, including system crashes and resource leaks. While large language models (LLMs) have demonstrated strong capabilities in code generation, they struggle with exception handling at the repository level, due to complex dependencies and contextual constraints. In this work, we propose CatchAll, a novel LLM-based approach for repository-aware exception handling. CatchAll equips LLMs with three complementary layers of exception-handling knowledge: (1) API-level exception knowledge, obtained from an empirically constructed API-exception mapping that characterizes the exception-throwing behaviors of APIs in real-world codebases; (2) repository-level execution context, which captures exception propagation by modeling contextual call traces around the target code; and (3) cross-repository handling knowledge, distilled from reusable exception-handling patterns mined from historical code across projects. The knowledge is encoded into structured prompts to guide the LLM in generating accurate and context-aware exception-handling code. To evaluate CatchAll, we construct two new benchmarks for repository-aware exception handling: a large-scale dataset RepoExEval and an executable subset RepoExEval-Exec. Experiments demonstrate that RepoExEval consistently outperforms state-of-the-art baselines, achieving a CodeBLEU score of 0.31 (vs. 0.27% for the best baseline), intent prediction accuracy of 60.1% (vs. 48.0%), and Pass@1 of 29% (vs. 25%). These results affirm RepoExEval's effectiveness in real-world repository-level exception handling.

preprint2026arXiv

SpatCode: Rotary-based Unified Encoding Framework for Efficient Spatiotemporal Vector Retrieval

Spatiotemporal vector retrieval has emerged as a critical paradigm in modern information retrieval, enabling efficient access to massive, heterogeneous data that evolve over both time and space. However, existing spatiotemporal retrieval methods are often extensions of conventional vector search systems that rely on external filters or specialized indices to incorporate temporal and spatial constraints, leading to inefficiency, architectural complexity, and limited flexibility in handling heterogeneous modalities. To overcome these challenges, we present a unified spatiotemporal vector retrieval framework that integrates temporal, spatial, and semantic cues within a coherent similarity space while maintaining scalability and adaptability to continuous data streams. Specifically, we propose (1) a Rotary-based Unified Encoding Method that embeds time and location into rotational position vectors for consistent spatiotemporal representation; (2) a Circular Incremental Update Mechanism that supports efficient sliding-window updates without global re-encoding or index reconstruction; and (3) a Weighted Interest-based Retrieval Algorithm that adaptively balances modality weights for context-aware and personalized retrieval. Extensive experiments across multiple real-world datasets demonstrate that our framework substantially outperforms state-of-the-art baselines in both retrieval accuracy and efficiency, while maintaining robustness under dynamic data evolution. These results highlight the effectiveness and practicality of the proposed approach for scalable spatiotemporal information retrieval in intelligent systems.

preprint2022arXiv

Privacy-Utility Trade-Off

In this paper, we investigate the privacy-utility trade-off (PUT) problem, which considers the minimal privacy loss at a fixed expense of utility. Several different kinds of privacy in the PUT problem are studied, including differential privacy, approximate differential privacy, maximal information, maximal leakage, Renyi differential privacy, Sibson mutual information and mutual information. The average Hamming distance is used to measure the distortion caused by the privacy mechanism. We consider two scenarios: global privacy and local privacy. In the framework of global privacy framework, the privacy-distortion function is upper-bounded by the privacy loss of a special mechanism, and lower-bounded by the optimal privacy loss with any possible prior input distribution. In the framework of local privacy, we generalize a coloring method for the PUT problem.

preprint2021arXiv

Investigating and Recommending Co-Changed Entities for JavaScript Programs

JavaScript (JS) is one of the most popular programming languages due to its flexibility and versatility, but maintaining JS code is tedious and error-prone. In our research, we conducted an empirical study to characterize the relationship between co-changed software entities (e.g., functions and variables), and built a machine learning (ML)-based approach to recommend additional entity to edit given developers' code changes. Specifically, we first crawled 14,747 commits in 10 open-source projects; for each commit, we created one or more change dependency graphs (CDGs) to model the referencer-referencee relationship between co-changed entities. Next, we extracted the common subgraphs between CDGs to locate recurring co-change patterns between entities. Finally, based on those patterns, we extracted code features from co-changed entities and trained an ML model that recommends entities-to-change given a program commit. According to our empirical investigation, (1) three recurring patterns commonly exist in all projects; (2) 80%--90% of co-changed function pairs either invoke the same function(s), access the same variable(s), or contain similar statement(s); (3) our ML-based approach CoRec recommended entity changes with high accuracy (73%--78%). CoRec complements prior work because it suggests changes based on program syntax, textual similarity, as well as software history; it achieved higher accuracy than two existing tools in our evaluation.