Source author record

Anran Li

Anran Li appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher | Unclaimed source record

Software Engineering, Machine Learning, math.OC, physics.optics

Catalog footprint

What is connected

4 works

4 topics

4 close collaborators

Preprint, 2023, arXiv

An Empirical Study on Noisy Label Learning for Program Understanding

Recently, deep learning models have been widely applied in program understanding tasks, and these models achieve state-of-the-art results on many benchmark datasets. A major challenge of deep learning for program understanding is that the effectiveness of these approaches depends on the quality of their datasets, and these datasets often contain noisy data samples. A typical kind of noise in program understanding datasets is label noise, which means that the target outputs for some inputs are incorrect. Researchers have proposed various approaches to alleviate the negative impact of noisy labels, forming a new research topic: noisy label learning (NLL). In this paper, we conduct an empirical study on the effectiveness of noisy label learning on deep learning for program understanding datasets. We evaluate various NLL approaches and deep learning models on three tasks: program classification, vulnerability detection, and code summarization. From the evaluation results, we arrive at the following findings: 1) Small trained-from-scratch models are susceptible to label noise in program understanding, while large pre-trained models are highly robust against it. 2) NLL approaches significantly improve program classification accuracy for small models on noisy training sets, but they only slightly improve the classification accuracy of large pre-trained models. 3) NLL can effectively detect synthetic noise in program understanding, but struggles to detect real-world noise. We believe our findings can provide insights into the abilities of NLL in program understanding, and shed light on future work in tackling noise in software engineering datasets. We have released our code at https://github.com/jacobwwh/noise_SE.
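To make the notions of synthetic label noise and noise detection in the abstract above concrete, here is a minimal Python sketch. It is an illustration only and is not taken from the paper or its released code. It assumes the common "symmetric" noise model, where each label is flipped to a different class chosen uniformly at random. The function names `inject_symmetric_noise` and `detection_precision_recall` are hypothetical.

```python
import random


def inject_symmetric_noise(labels, num_classes, noise_rate, seed=0):
    """Flip each label to a different class with probability noise_rate.

    Assumption: symmetric (uniform) noise; the paper may use other schemes.
    Returns the noisy labels and a boolean mask marking flipped samples.
    """
    rng = random.Random(seed)
    noisy, flipped = [], []
    for y in labels:
        if rng.random() < noise_rate:
            # Choose uniformly among the classes other than the true one.
            wrong = [c for c in range(num_classes) if c != y]
            noisy.append(rng.choice(wrong))
            flipped.append(True)
        else:
            noisy.append(y)
            flipped.append(False)
    return noisy, flipped


def detection_precision_recall(flagged, flipped):
    """Score a noise detector: flagged = predicted noisy, flipped = truly noisy."""
    true_pos = sum(1 for f, g in zip(flagged, flipped) if f and g)
    n_flagged = sum(flagged)
    n_flipped = sum(flipped)
    precision = true_pos / n_flagged if n_flagged else 0.0
    recall = true_pos / n_flipped if n_flipped else 0.0
    return precision, recall


if __name__ == "__main__":
    clean = [i % 3 for i in range(300)]
    noisy, mask = inject_symmetric_noise(clean, num_classes=3, noise_rate=0.2)
    print("flipped samples:", sum(mask))
```

With synthetic noise the mask of flipped samples is known, so a detector can be scored directly. For real-world noise no such mask exists without manual relabeling, which is one reason detection is harder to evaluate in that setting.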

Preprint, 2023, arXiv

Learning Program Representations with a Tree-Structured Transformer

Learning vector representations for programs is a critical step in applying deep learning techniques to program understanding tasks. Various neural network models have been proposed to learn from tree-structured program representations, e.g., the abstract syntax tree (AST) and the concrete syntax tree (CST). However, most neural architectures either fail to capture the long-range dependencies that are ubiquitous in programs, or cannot learn effective representations for syntax tree nodes, making them incapable of performing node-level prediction tasks, e.g., bug localization. In this paper, we propose Tree-Transformer, a novel recursive tree-structured neural network that learns vector representations for source code. We propose a multi-head attention mechanism to model the dependency between sibling and parent-child node pairs. Moreover, we propose a bi-directional propagation strategy that passes node information in two directions, bottom-up and top-down, along trees. In this way, Tree-Transformer can learn both node features and global contextual information. Extensive experimental results show that Tree-Transformer significantly outperforms existing tree-based and graph-based program representation learning approaches in both tree-level and node-level prediction tasks.

Preprint, 2015, arXiv

Online Resource Allocation with Customer Choice

We introduce a general model of resource allocation with customer choice. In this model, there are multiple resources that are available over a finite horizon. The resources are non-replenishable and perishable. Each unit of a resource can be instantly made into one of several products. There are multiple customer types arriving randomly over time. An assortment of products must be offered to each arriving customer, depending on the type of the customer, the time of arrival, and the remaining inventory. From this assortment, the customer selects a product according to a general choice model. The selection generates a product-dependent and customer-type-dependent reward. The objective of the system is to maximize the total expected reward earned over the horizon. The above problem has a number of applications, including personalized assortment optimization, revenue management of parallel flights, and web- and mobile-based appointment scheduling. We derive online algorithms that are asymptotically optimal and achieve the best constant relative performance guarantees for this class of problems.
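The setting described in the abstract above can be sketched as a small simulation. The Python code below is an illustration under stated assumptions, not the paper's algorithm. It assumes a multinomial logit choice model with a no-purchase option of weight 1, and a naive baseline policy that offers every product whose resource is still in stock. All names (`mnl_choice`, `simulate`, and the parameters) are hypothetical.

```python
import random


def mnl_choice(assortment, weights, rng):
    """Sample a product under a multinomial logit model (an assumed choice model).

    The no-purchase option has weight 1; returns None when nothing is bought.
    """
    total = 1.0 + sum(weights[p] for p in assortment)
    u = rng.random() * total
    acc = 0.0
    for p in assortment:
        acc += weights[p]
        if u < acc:
            return p
    return None


def simulate(inventory, product_resource, rewards, weights, type_probs,
             horizon, seed=0):
    """Simulate arrivals over a finite horizon with non-replenishable inventory.

    inventory: resource -> initial units
    product_resource: product -> resource it consumes (one unit per sale)
    rewards[ctype][product], weights[ctype][product]: per customer type
    type_probs: customer type -> arrival probability
    """
    rng = random.Random(seed)
    remaining = dict(inventory)
    total_reward = 0.0
    types = list(type_probs)
    probs = [type_probs[k] for k in types]
    for _ in range(horizon):
        ctype = rng.choices(types, weights=probs)[0]
        # Baseline policy (assumption): offer everything still in stock.
        assortment = [p for p, r in product_resource.items() if remaining[r] > 0]
        choice = mnl_choice(assortment, weights[ctype], rng)
        if choice is not None:
            remaining[product_resource[choice]] -= 1
            total_reward += rewards[ctype][choice]
    return total_reward, remaining


if __name__ == "__main__":
    reward, left = simulate(
        inventory={"r1": 2, "r2": 1},
        product_resource={"a": "r1", "b": "r1", "c": "r2"},
        rewards={"x": {"a": 1.0, "b": 2.0, "c": 3.0},
                 "y": {"a": 2.0, "b": 1.0, "c": 1.0}},
        weights={"x": {"a": 1.0, "b": 0.5, "c": 0.2},
                 "y": {"a": 0.3, "b": 1.0, "c": 1.0}},
        type_probs={"x": 0.6, "y": 0.4},
        horizon=20,
    )
    print(reward, left)
```

A baseline like this ignores the time of arrival and the value of holding inventory for later, higher-reward customers. The abstract's assortment decision depends on customer type, arrival time, and remaining inventory, which is where a more careful policy would replace the marked line.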

Preprint, 2015, arXiv

Ultrahigh Enhancement of Electromagnetic Fields by Exciting Localized with Extended Surface Plasmons

Excitation of localized surface plasmons (LSPs) of metal nanoparticles (NPs) residing on a flat metal film has attracted great attention recently because the enhanced electromagnetic (EM) fields are found to be higher than for NPs on a dielectric substrate. In the present work, it is shown that a much higher enhancement of EM fields is obtained by exciting the LSPs through extended surface plasmons (ESPs) generated at the metallic film surface using the Kretschmann-Raether configuration. We show that the largest EM field enhancement and the highest surface-enhanced fluorescence intensity are obtained when the incidence angle is the ESP resonance angle of the underlying metal film. Finite-difference time-domain simulations indicate that excitation of LSPs using ESPs can generate an EM field intensity 1-3 orders of magnitude higher than direct excitation of the LSPs using incidence from free space. The ultrahigh enhancement is attributed to the strong confinement of the ESP waves in the vertical direction. The drastically intensified EM fields are significant for highly sensitive refractive index sensing, surface-enhanced spectroscopies, and enhancing the efficiency of optoelectronic devices.
