Researcher profile

Yingfei Xiong

Yingfei Xiong contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 17 - UnverifiedVerification L1Unclaimed author
4works
0followers
4topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

4 published item(s)

preprint2022arXiv

A Syntax-Guided Edit Decoder for Neural Program Repair

Automated Program Repair (APR) helps improve the efficiency of software development and maintenance. Recent APR techniques use deep learning, particularly the encoder-decoder architecture, to generate patches. Though existing DL-based APR approaches have proposed different encoder architectures, the decoder remains to be the standard one, which generates a sequence of tokens one by one to replace the faulty statement. This decoder has multiple limitations: 1) allowing to generate syntactically incorrect programs, 2) inefficiently representing small edits, and 3) not being able to generate project-specific identifiers. In this paper, we propose Recoder, a syntax-guided edit decoder with placeholder generation. Recoder is novel in multiple aspects: 1) Recoder generates edits rather than modified code, allowing efficient representation of small edits; 2) Recoder is syntax-guided, with the novel provider/decider architecture to ensure the syntactic correctness of the patched program and accurate generation; 3) Recoder generates placeholders that could be instantiated as project-specific identifiers later. We conduct experiments to evaluate Recoder on 395 bugs from Defects4J v1.2 and 420 additional bugs from Defects4J v2.0. Our results show that Recoder repairs 53 bugs on Defects4J v1.2, which achieves 21.4% improvement over the previous state-of-the-art approach for single-hunk bugs (TBar). Importantly, to our knowledge, Recoder is the first DL-based APR approach that has outperformed the traditional APR approaches on this dataset. Furthermore, Recoder also repairs 19 bugs on the additional bugs from Defects4J v2.0, which is 137.5% more than TBar (8 bugs) and 850% more than SimFix (2 bugs). This result suggests that Recoder has better generalizability than existing APR approaches.

preprint2022arXiv

Generalized Equivariance and Preferential Labeling for GNN Node Classification

Existing graph neural networks (GNNs) largely rely on node embeddings, which represent a node as a vector by its identity, type, or content. However, graphs with unattributed nodes widely exist in real-world applications (e.g., anonymized social networks). Previous GNNs either assign random labels to nodes (which introduces artefacts to the GNN) or assign one embedding to all nodes (which fails to explicitly distinguish one node from another). Further, when these GNNs are applied to unattributed node classification problems, they have an undesired equivariance property, which are fundamentally unable to address the data with multiple possible outputs. In this paper, we analyze the limitation of existing approaches to node classification problems. Inspired by our analysis, we propose a generalized equivariance property and a Preferential Labeling technique that satisfies the desired property asymptotically. Experimental results show that we achieve high performance in several unattributed node classification tasks.

preprint2020arXiv

Interactive Patch Filtering as Debugging Aid

It is widely recognized that program repair tools need to have a high precision to be useful, i.e., the generated patches need to have a high probability to be correct. However, it is fundamentally difficult to ensure the correctness of the patches, and many tools compromise other aspects of repair performance such as recall for an acceptable precision. In this paper we ask a question: can a repair tool with a low precision be still useful? To explore this question, we propose an interactive filtering approach to patch review, which filters out incorrect patches by asking questions to the developers. Our intuition is that incorrect patches can still help understand the bug. With proper tool support, the benefit outweighs the cost even if there are many incorrect patches. We implemented the approach as an Eclipse plugin tool, InPaFer, and evaluated it with a simulated experiment and a user study with 30 developers. The results show that our approach improve the repair performance of developers, with 62.5% more successfully repaired bugs and 25.3% less debugging time in average. In particular, even if the generated patches are all incorrect, the performance of the developers would not be significantly reduced, and could be improved when some patches provide useful information for repairing, such as the faulty location and a partial fix.

preprint2020arXiv

OCoR: An Overlapping-Aware Code Retriever

Code retrieval helps developers reuse the code snippet in the open-source projects. Given a natural language description, code retrieval aims to search for the most relevant code among a set of code. Existing state-of-the-art approaches apply neural networks to code retrieval. However, these approaches still fail to capture an important feature: overlaps. The overlaps between different names used by different people indicate that two different names may be potentially related (e.g., "message" and "msg"), and the overlaps between identifiers in code and words in natural language descriptions indicate that the code snippet and the description may potentially be related. To address these problems, we propose a novel neural architecture named OCoR, where we introduce two specifically-designed components to capture overlaps: the first embeds identifiers by character to capture the overlaps between identifiers, and the second introduces a novel overlap matrix to represent the degrees of overlaps between each natural language word and each identifier. The evaluation was conducted on two established datasets. The experimental results show that OCoR significantly outperforms the existing state-of-the-art approaches and achieves 13.1% to 22.3% improvements. Moreover, we also conducted several in-depth experiments to help understand the performance of different components in OCoR.