Researcher profile

Marcel Böhme

Marcel Böhme contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 15 - UnverifiedVerification L1Unclaimed author
3works
0followers
2topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

3 published item(s)

preprint2022arXiv

Effectiveness and Scalability of Fuzzing Techniques in CI/CD Pipelines

Fuzzing has proven to be a fundamental technique to automated software testing but also a costly one. With the increased adoption of CI/CD practices in software development, a natural question to ask is `What are the best ways to integrate fuzzing into CI/CD pipelines considering the velocity in code changes and the automated delivery/deployment practices?'. Indeed, a recent study by Böhme and Zhu shows that four in every five bugs have been introduced by recent code changes (i.e. regressions). In this paper, we take a close look at the integration of fuzzers to CI/CD pipelines from both automated software testing and continuous development angles. Firstly, we study an optimization opportunity to triage commits that do not require fuzzing and find, through experimental analysis, that the average fuzzing effort in CI/CD can be reduced by ~63% in three of the nine libraries we analyzed (>40% for six libraries). Secondly, we investigate the impact of fuzzing campaign duration on the CI/CD process: A shorter fuzzing campaign such as 15 minutes (as opposed to the wisdom of 24 hours in the field) facilitates a faster pipeline and can still uncover important bugs, but may also reduce its capability to detect sophisticated bugs. Lastly, we discuss a prioritization strategy that automatically assigns resources to fuzzing campaigns based on a set of predefined priority strategies. Our findings suggest that continuous fuzzing (as part of the automated testing in CI/CD) is indeed beneficial and there are many optimization opportunities to improve the effectiveness and scalability of fuzz testing.

preprint2021arXiv

Locating Faults with Program Slicing: An Empirical Analysis

Statistical fault localization is an easily deployed technique for quickly determining candidates for faulty code locations. If a human programmer has to search the fault beyond the top candidate locations, though, more traditional techniques of following dependencies along dynamic slices may be better suited. In a large study of 457 bugs (369 single faults and 88 multiple faults) in 46 open source C programs, we compare the effectiveness of statistical fault localization against dynamic slicing. For single faults, we find that dynamic slicing was eight percentage points more effective than the best performing statistical debugging formula; for 66% of the bugs, dynamic slicing finds the fault earlier than the best performing statistical debugging formula. In our evaluation, dynamic slicing is more effective for programs with single fault, but statistical debugging performs better on multiple faults. Best results, however, are obtained by a hybrid approach: If programmers first examine at most the top five most suspicious locations from statistical debugging, and then switch to dynamic slices, on average, they will need to examine 15% (30 lines) of the code. These findings hold for 18 most effective statistical debugging formulas and our results are independent of the number of faults (i.e. single or multiple faults) and error type (i.e. artificial or real errors).

preprint2018arXiv

Smart Greybox Fuzzing

Coverage-based greybox fuzzing (CGF) is one of the most successful methods for automated vulnerability detection. Given a seed file (as a sequence of bits), CGF randomly flips, deletes or bits to generate new files. CGF iteratively constructs (and fuzzes) a seed corpus by retaining those generated files which enhance coverage. However, random bitflips are unlikely to produce valid files (or valid chunks in files), for applications processing complex file formats. In this work, we introduce smart greybox fuzzing (SGF) which leverages a high-level structural representation of the seed file to generate new files. We define innovative mutation operators that work on the virtual file structure rather than on the bit level which allows SGF to explore completely new input domains while maintaining file validity. We introduce a novel validity-based power schedule that enables SGF to spend more time generating files that are more likely to pass the parsing stage of the program, which can expose vulnerabilities much deeper in the processing logic. Our evaluation demonstrates the effectiveness of SGF. On several libraries that parse structurally complex files, our tool AFLSmart explores substantially more paths (up to 200%) and exposes more vulnerabilities than baseline AFL. Our tool AFLSmart has discovered 42 zero-day vulnerabilities in widely-used, well-tested tools and libraries; so far 17 CVEs were assigned.