Researcher profile

Marcelo d'Amorim

Marcelo d'Amorim contributes to research discovery and scholarly infrastructure.

Published work

4 published items

Preprint, 2022, arXiv

Fuzzing Class Specifications

Expressing class specifications via executable constraints is important for various software engineering tasks such as test generation, bug finding and automated debugging, but developers rarely write them. Techniques that infer specifications from code exist to fill this gap, but they are designed to support specific kinds of assertions and are difficult to adapt to support different assertion languages, e.g., to add support for quantification, or additional comparison operators, such as membership or containment. To address the above issue, we present SpecFuzzer, a novel technique that combines grammar-based fuzzing, dynamic invariant detection, and mutation analysis, to automatically produce class specifications. SpecFuzzer uses: (i) a fuzzer as a generator of candidate assertions derived from a grammar that is automatically obtained from the class definition; (ii) a dynamic invariant detector -- Daikon -- to filter out assertions invalidated by a test suite; and (iii) a mutation-based mechanism to cluster and rank assertions, so that similar constraints are grouped and the stronger ones are prioritized. Grammar-based fuzzing enables SpecFuzzer to be straightforwardly adapted to support different specification languages, by manipulating the fuzzing grammar, e.g., to include additional operators. We evaluate our technique on a benchmark of 43 Java methods employed in the evaluation of the state-of-the-art techniques GAssert and EvoSpex. Our results show that SpecFuzzer can easily support a more expressive assertion language, over which it is more effective than GAssert and EvoSpex in inferring specifications, according to standard performance metrics.
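
The first two stages described in this abstract (grammar-based generation of candidate assertions, then filtering against observed executions) can be illustrated with a minimal sketch. This is not SpecFuzzer itself: the grammar, the field names `size` and `capacity`, and the recorded states are hypothetical, a direct evaluation over recorded states stands in for Daikon, and the mutation-based clustering and ranking stage is omitted.

```python
import random

# Hypothetical toy grammar for assertions over two integer fields of a class.
# SpecFuzzer derives its grammar from the class definition; this one is hand-written.
GRAMMAR = {
    "<assertion>": [["<expr>", " ", "<cmp>", " ", "<expr>"]],
    "<expr>": [["size"], ["capacity"], ["0"]],
    "<cmp>": [["<="], ["=="], [">="], ["!="]],
}


def expand(symbol, rng):
    """Randomly expand a grammar symbol into a concrete string."""
    if symbol not in GRAMMAR:
        return symbol  # terminal symbol: emit as-is
    production = rng.choice(GRAMMAR[symbol])
    return "".join(expand(part, rng) for part in production)


def fuzz_candidates(n, seed=0):
    """Generate up to n distinct candidate assertions from the grammar."""
    rng = random.Random(seed)
    return sorted({expand("<assertion>", rng) for _ in range(n)})


def holds_on_all(assertion, states):
    """Stand-in for the invariant detector: evaluate on every recorded state."""
    return all(eval(assertion, {"__builtins__": {}}, dict(s)) for s in states)


def infer(states, n=200, seed=0):
    """Keep only the candidates that no recorded state invalidates."""
    return [a for a in fuzz_candidates(n, seed) if holds_on_all(a, states)]


# Hypothetical object states, as if recorded while running a test suite.
OBSERVED_STATES = [
    {"size": 0, "capacity": 4},
    {"size": 3, "capacity": 4},
    {"size": 4, "capacity": 4},
]

if __name__ == "__main__":
    for assertion in infer(OBSERVED_STATES):
        print(assertion)
```

Extending the assertion language in this sketch only requires editing `GRAMMAR`, for example adding `"<"` to the `<cmp>` productions, which mirrors the adaptability argument made in the abstract.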

Preprint, 2022, arXiv

Software Engineering User Study Recruitment on Prolific: An Experience Report

Online participant recruitment platforms such as Prolific have been gaining popularity in research, as they enable researchers to easily access large pools of participants. However, participant quality can be an issue; participants may give incorrect information to gain access to more studies, adding unwanted noise to results. This paper details our experience recruiting participants from Prolific for a user study requiring programming skills in Node.js, with the aim of helping other researchers conduct similar studies. We explore a method of recruiting programmer participants using prescreening validation, attention checks and a series of programming knowledge questions. We received 680 responses, and determined that 55 met the criteria to be invited to our user study. We ultimately conducted user study sessions via video calls with 10 participants. We conclude this paper with a series of recommendations for researchers.

Preprint, 2021, arXiv

Shipwright: A Human-in-the-Loop System for Dockerfile Repair

Docker is a tool for lightweight OS-level virtualization. Docker images are created by performing a build, controlled by a source-level artifact called a Dockerfile. We studied Dockerfiles on GitHub, and -- to our great surprise -- found that over a quarter of the examined Dockerfiles failed to build (and thus to produce images). To address this problem, we propose SHIPWRIGHT, a human-in-the-loop system for finding repairs to broken Dockerfiles. SHIPWRIGHT uses a modified version of the BERT language model to embed build logs and to cluster broken Dockerfiles. Using these clusters and a search-based procedure, we were able to design 13 rules for making automated repairs to Dockerfiles. With the aid of SHIPWRIGHT, we submitted 45 pull requests (with a 42.2% acceptance rate) to GitHub projects with broken Dockerfiles. Furthermore, in a "time-travel" analysis of broken Dockerfiles that were later fixed, we found that SHIPWRIGHT proposed repairs that were equivalent to human-authored patches in 22.77% of the cases we studied. Finally, we compared our work with recent, state-of-the-art, static Dockerfile analyses, and found that, while static tools detected possible build-failure-inducing issues in 20.6--33.8% of the files we examined, SHIPWRIGHT was able to detect possible issues in 73.25% of the files and, additionally, provide automated repairs for 18.9% of the files.
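
The idea of embedding build logs and clustering the failures can be sketched as follows. This is only an illustration under stated assumptions: a bag-of-words vector stands in for the modified BERT embedding, a greedy threshold-based grouping stands in for whatever clustering procedure SHIPWRIGHT uses, and the log lines and the `threshold` parameter are hypothetical.

```python
import math
from collections import Counter


def embed(log):
    """Stand-in embedding: bag-of-words counts (not the BERT model from the paper)."""
    return Counter(log.lower().split())


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def cluster(logs, threshold=0.5):
    """Greedy clustering: join the first cluster whose representative is similar enough."""
    clusters = []  # each entry: (representative vector, member logs)
    for log in logs:
        vec = embed(log)
        for representative, members in clusters:
            if cosine(representative, vec) >= threshold:
                members.append(log)
                break
        else:
            clusters.append((vec, [log]))
    return [members for _, members in clusters]


# Hypothetical build-log excerpts, for illustration only.
LOGS = [
    "E: Unable to locate package python-pip",
    "E: Unable to locate package libfoo-dev",
    "npm ERR! code ENOENT",
    "npm ERR! code EACCES",
]

if __name__ == "__main__":
    for group in cluster(LOGS):
        print(group)
```

Grouping failures this way lets a human inspect one cluster at a time and decide whether a single repair rule could cover all of its members, which is the human-in-the-loop step the abstract describes.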

Preprint, 2020, arXiv

Generating Adversarial Inputs Using A Black-box Differential Technique

Neural Networks (NNs) are known to be vulnerable to adversarial attacks. A malicious agent initiates these attacks by perturbing an input into another one such that the two inputs are classified differently by the NN. In this paper, we consider a special class of adversarial examples, which can exhibit not only the weakness of NN models, as typical adversarial examples do, but also the different behavior between two NN models. We call them difference-inducing adversarial examples or DIAEs. Specifically, we propose DAEGEN, the first black-box differential technique for adversarial input generation. DAEGEN takes as input two NN models of the same classification problem and reports as output an adversarial example. The obtained adversarial example is a DIAE, so that it represents a point-wise difference in the input space between the two NN models. Algorithmically, DAEGEN uses a local search-based optimization algorithm to find DIAEs by iteratively perturbing an input to maximize the difference between the two models' predictions on that input. We conduct experiments on a spectrum of benchmark datasets (e.g., MNIST, ImageNet, and Driving) and NN models (e.g., LeNet, ResNet, Dave, and VGG). Experimental results are promising. First, we compare DAEGEN with two existing white-box differential techniques (DeepXplore and DLFuzz) and find that under the same setting, DAEGEN is 1) effective, i.e., it is the only technique that succeeds in generating attacks in all cases, 2) precise, i.e., the adversarial attacks are very likely to fool machines and humans, and 3) efficient, i.e., it requires a reasonable number of classification queries. Second, we compare DAEGEN with state-of-the-art black-box adversarial attack methods (simba and tremba), by adapting them to work in a differential setting. The experimental results show that DAEGEN performs better than both of them.
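
The black-box local search described in this abstract can be illustrated with a minimal sketch. It is not DAEGEN's actual algorithm: the two linear "models", the hill-climbing acceptance rule, and the parameters `step`, `budget` and `max_iters` are all assumptions made for illustration. The only property carried over from the abstract is that the search queries model outputs alone and perturbs an input until the two models disagree.

```python
import random


def model_a(x):
    """Hypothetical toy model: score for class 1 (weights the first feature)."""
    return 0.9 * x[0] + 0.1 * x[1]


def model_b(x):
    """Hypothetical toy model for the same task (weights the second feature)."""
    return 0.1 * x[0] + 0.9 * x[1]


def label(score):
    """Threshold a score into a binary class label."""
    return int(score >= 0.5)


def disagreement(x):
    """Objective: how far apart the two models' scores are on x."""
    return abs(model_a(x) - model_b(x))


def find_diae(seed_input, step=0.05, budget=0.3, max_iters=500, rng_seed=0):
    """Hill-climb from seed_input until the models assign different labels.

    Only model outputs are queried (black-box). Each coordinate stays within
    `budget` of the seed and inside [0, 1]. Returns (input, query_count).
    """
    rng = random.Random(rng_seed)
    best = list(seed_input)
    queries = 0
    for _ in range(max_iters):
        if label(model_a(best)) != label(model_b(best)):
            break  # difference-inducing input found
        candidate = list(best)
        i = rng.randrange(len(candidate))
        low = max(seed_input[i] - budget, 0.0)
        high = min(seed_input[i] + budget, 1.0)
        candidate[i] = min(max(candidate[i] + rng.choice([-step, step]), low), high)
        queries += 2  # one query per model for the candidate
        if disagreement(candidate) > disagreement(best):
            best = candidate  # keep only moves that widen the gap
    return best, queries


if __name__ == "__main__":
    adversarial, used = find_diae([0.6, 0.6])
    print(adversarial, label(model_a(adversarial)), label(model_b(adversarial)), used)
```

On the seed `[0.6, 0.6]` both toy models predict class 1; the search widens the gap between the two features until one model's score drops below the threshold while the other's stays above it.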