Source author record

Sen Nie

Sen Nie appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, unclaimed source record

physics.soc-ph Software Engineering cond-mat.stat-mech math.OC

Catalog footprint

What is connected

4 works

4 topics

4 close collaborators


preprint 2022 arXiv

1-to-1 or 1-to-n? Investigating the effect of function inlining on binary similarity analysis

Binary similarity analysis is critical to many code-reuse-related issues, and the "1-to-1" mechanism is widely applied, where one function in a binary file is matched against one function in a source file or binary file. However, we discover that function mapping is a more complex "1-to-n" or even "n-to-n" problem due to the existence of function inlining. In this paper, we investigate the effect of function inlining on binary similarity analysis. We first construct four inlining-oriented datasets for four similarity analysis tasks: code search, OSS reuse detection, vulnerability detection, and patch presence test. We then study the extent of function inlining, the performance of existing works under function inlining, and the effectiveness of existing inlining-simulation strategies. Results show that the proportion of function inlining can reach nearly 70%, while most existing works neglect it and use the "1-to-1" mechanism. The mismatches cause a 30% loss in performance during code search and a 40% loss during vulnerability detection. Moreover, two existing inlining-simulation strategies can only recover 60% of the inlined functions. We discover that inlining is usually cumulative as the optimization level increases. We suggest conditional inlining and incremental inlining for designing low-cost and high-coverage inlining-simulation strategies.

preprint 2022 arXiv

Unleashing the Power of Compiler Intermediate Representation to Enhance Neural Program Embeddings

Neural program embeddings have demonstrated considerable promise in a range of program analysis tasks, including clone identification, program repair, code completion, and program synthesis. However, most existing methods generate neural program embeddings directly from program source code, by learning from features such as tokens, abstract syntax trees, and control flow graphs. This paper takes a fresh look at how to improve program embeddings by leveraging compiler intermediate representation (IR). We first demonstrate simple yet highly effective methods for enhancing embedding quality by training embedding models alongside source code and LLVM IR generated by default optimization levels (e.g., -O2). We then introduce IRGen, a framework based on genetic algorithms (GA), to identify (near-)optimal sequences of optimization flags that can significantly improve embedding quality.
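To make the idea of a genetic search over flag sequences concrete, here is a minimal sketch. It is not IRGen: the abstract does not describe the framework's encoding, operators, or fitness measure, so everything below is assumed for illustration. The flag names in `FLAG_POOL` are placeholders rather than real compiler options, and `toy_fitness` is a hypothetical stand-in for "compile with this sequence, train an embedding model, measure its quality".

```python
import random

# Hypothetical flag pool: placeholder names, not real compiler options.
FLAG_POOL = ["flag_a", "flag_b", "flag_c", "flag_d", "flag_e", "flag_f"]


def toy_fitness(seq):
    """Hypothetical stand-in for embedding quality (higher is better).

    A real setup would compile with `seq`, train an embedding model on the
    resulting IR, and return a quality metric. This toy version rewards
    distinct flags and one arbitrary ordering constraint.
    """
    score = len(set(seq))
    if "flag_a" in seq and "flag_c" in seq and seq.index("flag_c") < seq.index("flag_a"):
        score += 2
    return score


def crossover(a, b, rng):
    """One-point crossover: prefix of `a` joined with suffix of `b`."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]


def mutate(seq, rng, rate=0.2):
    """Replace each position with a random flag with probability `rate`."""
    return [rng.choice(FLAG_POOL) if rng.random() < rate else f for f in seq]


def genetic_search(fitness, length=4, pop_size=20, generations=30, seed=0):
    """Elitist GA over fixed-length flag sequences; returns the best one found."""
    rng = random.Random(seed)
    pop = [[rng.choice(FLAG_POOL) for _ in range(length)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        elite = pop[: pop_size // 2]  # the top half survives unchanged
        children = [
            mutate(crossover(rng.choice(elite), rng.choice(elite), rng), rng)
            for _ in range(pop_size - len(elite))
        ]
        pop = elite + children
    return max(pop, key=fitness)


if __name__ == "__main__":
    best = genetic_search(toy_fitness)
    print(best, toy_fitness(best))
```

Because the top half of each generation is carried over unchanged, the best score never decreases between generations. In a realistic setting the fitness call would dominate the cost, so caching scores per sequence would likely matter more than the choice of operators.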

preprint 2021 arXiv

Optimal control of complex networks with conformity behavior

Despite significant advances in identifying driver nodes and the energy required for network control, a framework that incorporates more complicated dynamics remains challenging. Here, we incorporate conformity behavior into network control, showing that the control of undirected networked systems with conformity becomes easier once the number of external inputs exceeds a critical point. We find that this critical point is fundamentally determined by the network connectivity. In particular, we investigate the nodal structural characteristics in network control and propose an optimal control strategy to reduce the energy required to control networked systems with conformity behavior. We examine these findings in various synthetic and real networks, confirming that they hold broadly in describing the control energy of networked systems. Our results advance the understanding of network control in practical applications.

preprint 2014 arXiv

Effect of correlations on controllability transition in network control

The numerical controllability transition means that successful control can be achieved by increasing the number of driver nodes to a certain point. Motivated by the fact that degree correlation plays a major role in dynamics on networks, we study the impact of various degree correlations in different networks on the controllability transition point. We find that, in sparse networks, the transition point exhibits a local maximum when the degree correlation r is around 0.1 in ER networks and around 0 in SF networks. As the average degree increases, the local maximum disappears, and the controllability transition cannot be influenced by degree correlation and degree distribution in dense ER networks. The results are supported by numerical simulations and provide more detail for estimating the minimum number of driver nodes in large networks.
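For readers unfamiliar with the degree correlation r mentioned above, it is commonly defined as the Pearson correlation between the degrees of the nodes at either end of an edge. The sketch below assumes that definition (the abstract does not state which one the paper uses) and an undirected simple graph given as an edge list with each edge listed once. The function name `degree_correlation` is illustrative.

```python
def degree_correlation(edges):
    """Pearson correlation of the degrees at the two ends of each edge.

    `edges` is a list of (u, v) pairs for an undirected simple graph,
    with each edge listed once and no self-loops (assumed input format).
    """
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1

    # Count every edge in both directions so the measure is symmetric.
    xs, ys = [], []
    for u, v in edges:
        xs += [degree[u], degree[v]]
        ys += [degree[v], degree[u]]

    n = len(xs)
    mean = sum(xs) / n  # identical for xs and ys by symmetry
    cov = sum((x - mean) * (y - mean) for x, y in zip(xs, ys)) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    if var == 0:
        raise ValueError("degree correlation is undefined when all degrees are equal")
    return cov / var


if __name__ == "__main__":
    star = [("c", "l1"), ("c", "l2"), ("c", "l3")]
    print(degree_correlation(star))
```

Positive values indicate that high-degree nodes tend to connect to other high-degree nodes, and negative values indicate the opposite. A star graph is the extreme negative case, with r = -1.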