Researcher profile

Fiorella Artuso

Fiorella Artuso contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 13 - UnverifiedVerification L1Unclaimed author
2works
0followers
3topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

2 published item(s)

preprint2022arXiv

BinBert: Binary Code Understanding with a Fine-tunable and Execution-aware Transformer

A recent trend in binary code analysis promotes the use of neural solutions based on instruction embedding models. An instruction embedding model is a neural network that transforms sequences of assembly instructions into embedding vectors. If the embedding network is trained such that the translation from code to vectors partially preserves the semantic, the network effectively represents an assembly code model. In this paper we present BinBert, a novel assembly code model. BinBert is built on a transformer pre-trained on a huge dataset of both assembly instruction sequences and symbolic execution information. BinBert can be applied to assembly instructions sequences and it is fine-tunable, i.e. it can be re-trained as part of a neural architecture on task-specific data. Through fine-tuning, BinBert learns how to apply the general knowledge acquired with pre-training to the specific task. We evaluated BinBert on a multi-task benchmark that we specifically designed to test the understanding of assembly code. The benchmark is composed of several tasks, some taken from the literature, and a few novel tasks that we designed, with a mix of intrinsic and downstream tasks. Our results show that BinBert outperforms state-of-the-art models for binary instruction embedding, raising the bar for binary code understanding.

preprint2021arXiv

In Nomine Function: Naming Functions in Stripped Binaries with Neural Networks

In this paper we investigate the problem of automatically naming pieces of assembly code. Where by naming we mean assigning to an assembly function a string of words that would likely be assigned by a human reverse engineer. We formally and precisely define the framework in which our investigation takes place. That is we define the problem, we provide reasonable justifications for the choices that we made for the design of training and the tests. We performed an analysis on a large real-world corpora constituted by nearly 9 millions of functions taken from more than 22k softwares. In such framework we test baselines coming from the field of Natural Language Processing (e.g., Seq2Seq networks and Transformer). Interestingly, our evaluation shows promising results beating the state-of-the-art and reaching good performance. We investigate the applicability of tine-tuning (i.e., taking a model already trained on a large generic corpora and retraining it for a specific task). Such technique is popular and well-known in the NLP field. Our results confirm that fine-tuning is effective even when neural networks are applied to binaries. We show that a model, pre-trained on the aforementioned corpora, when fine-tuned has higher performances on specific domains (such as predicting names in system utilites, malware, etc).