Researcher profile

Peter Schrammel

Peter Schrammel contributes to research discovery and scholarly infrastructure.

Researcher
Affiliation not imported
Open to collaborate

Trust snapshot

Quick read

Trust 15 - Unverified
Verification L1
Unclaimed author
3 works
0 followers
4 topics
4 close collaborators

Published work

3 published items

Preprint, 2026, arXiv

RESTestBench: A Benchmark for Evaluating the Effectiveness of LLM-Generated REST API Test Cases from NL Requirements

Existing REST API testing tools are typically evaluated using code coverage and crash-based fault metrics. However, recent LLM-based approaches increasingly generate tests from NL requirements to validate functional behaviour, making traditional metrics weak proxies for whether generated tests validate intended behaviour. To address this gap, we present RESTestBench, a benchmark comprising three REST services paired with manually verified NL requirements in both precise and vague variants, enabling controlled and reproducible evaluation of requirement-based test generation. RESTestBench further introduces a requirements-based mutation testing metric that measures the fault-detection effectiveness of a generated test case with respect to a specific requirement, extending the property-based approach of Bartocci et al. Using RESTestBench, we evaluate two approaches across multiple state-of-the-art LLMs: (i) non-refinement-based generation, and (ii) refinement-based generation guided by interaction with the running SUT. In the refinement experiments, RESTestBench assesses how exposure to the actual implementation, valid or mutated, affects test effectiveness. Our results show that test effectiveness drops considerably when the generator interacts with faulty or mutated code, especially for vague requirements, sometimes negating the benefit of refinement and indicating that incorporating actual SUT behaviour is unnecessary when requirement detail is high.
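As a rough illustration of what a requirement-level mutation metric could look like, the sketch below scores one test against mutants that each violate one requirement. It is not the definition used in RESTestBench: the scoring rule (a test that fails on the original service scores zero), the function name `requirement_mutation_score`, and the toy "discount cap" service and mutants are all assumptions made for this example.

```python
from typing import Callable, List

# Assumption: a "service" is modelled as a plain function rather than a REST API,
# and a test is a function that returns True when it passes against a service.
Service = Callable[[int], int]
Test = Callable[[Service], bool]


def requirement_mutation_score(test: Test, original: Service, mutants: List[Service]) -> float:
    """Fraction of the requirement's mutants that the test kills.

    Assumption: a test that fails on the original service is treated as
    invalid and scores 0.0.
    """
    if not mutants:
        raise ValueError("need at least one mutant for the requirement")
    if not test(original):
        return 0.0
    killed = sum(1 for mutant in mutants if not test(mutant))
    return killed / len(mutants)


# Hypothetical requirement: "the returned discount is capped at 50".
def original(amount: int) -> int:
    return min(amount, 50)


def mutant_no_cap(amount: int) -> int:
    return amount  # cap removed


def mutant_wrong_cap(amount: int) -> int:
    return min(amount, 60)  # cap set to the wrong value


def precise_test(service: Service) -> bool:
    # Exercises the cap itself, so it can tell the mutants apart.
    return service(55) == 50 and service(10) == 10


def weak_test(service: Service) -> bool:
    # Never reaches the cap, so every mutant survives.
    return service(10) == 10


mutants = [mutant_no_cap, mutant_wrong_cap]
precise_score = requirement_mutation_score(precise_test, original, mutants)
weak_score = requirement_mutation_score(weak_test, original, mutants)
```

Under these assumptions, `precise_test` kills both mutants while `weak_test` kills neither, which is the kind of difference that coverage or crash counts alone would not reveal.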

Preprint, 2022, arXiv

BlueCov: Integrating Test Coverage and Model Checking with JBMC

Automated test case generation tools help businesses to write tests and increase the safety net provided by high regression test coverage when making code changes. Test generation needs to cover as much as possible of the uncovered code while avoiding generating redundant tests for code that is already covered by an existing test-suite. In this paper we present our work on a tool for the real world application of integrating formal analysis with automatic test case generation. The test case generation is based on coverage analysis using the Java bounded model checker (JBMC). Counterexamples of the model checker can be translated into Java method calls with specific parameters. In order to avoid the generation of redundant tests, it is necessary to measure the coverage in the exact same way as JBMC generates its coverage goals. Each existing coverage measurement tool uses a slightly different instrumentation and thus a different coverage criterion. This makes integration with a test case generator based on formal analysis difficult. Therefore, we developed BlueCov as a specific runtime coverage measurement tool which uses the exact same coverage criteria as JBMC does. This approach also allows for incremental test-case generation, only generating test coverage for previously untested code, e.g., to complete existing test suites.
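The incremental step described in this abstract can be sketched as a set difference over coverage goals. This is a minimal illustration, not BlueCov's or JBMC's API: the goal identifiers, the `generate_test_for` callback (standing in for a model-checker run), and the function name `incremental_generation` are hypothetical.

```python
from typing import Callable, Iterable, List, Optional, Set


def incremental_generation(
    all_goals: Iterable[str],
    covered_by_existing_suite: Set[str],
    generate_test_for: Callable[[str], Optional[str]],
) -> List[str]:
    """Generate tests only for goals the existing suite does not cover.

    Both arguments must use the same goal identifiers, i.e. the same
    coverage criterion; otherwise the set difference is meaningless.
    """
    new_tests = []
    for goal in sorted(set(all_goals) - covered_by_existing_suite):
        test = generate_test_for(goal)
        if test is not None:  # None models a goal the generator cannot reach
            new_tests.append(test)
    return new_tests


# Hypothetical goal identifiers and a stand-in generator.
goals = ["Foo.bar:1", "Foo.bar:2", "Foo.baz:1"]
already_covered = {"Foo.bar:1"}


def fake_generator(goal: str) -> Optional[str]:
    # Pretend "Foo.baz:1" is unreachable, so no test exists for it.
    return None if goal == "Foo.baz:1" else f"test for {goal}"


new_tests = incremental_generation(goals, already_covered, fake_generator)
```

The sketch shows why the matching criterion matters: if the runtime tool named its goals differently from the generator, already covered code would look uncovered and redundant tests would be produced.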

Preprint, 2020, arXiv

2LS: Heap Analysis and Memory Safety (Competition Contribution)

2LS is a framework for analysis of sequential C programs that can verify and refute program assertions and termination. The 2LS framework is built upon the CPROVER infrastructure and implements template-based synthesis techniques, e.g. to find invariants and ranking functions, and incremental loop unwinding techniques to find counterexamples and k-induction proofs. The main improvements in this year's version are the ability of 2LS to analyse programs requiring combined reasoning about shape and content of dynamic data structures, and an instrumentation for memory safety properties.