Source author record

Peter Schrammel

Peter Schrammel appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Software Engineering Logic in Computer Science Programming Languages Artificial Intelligence Systems and Control

Catalog footprint

What is connected

12works

5topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2026arXiv

RESTestBench: A Benchmark for Evaluating the Effectiveness of LLM-Generated REST API Test Cases from NL Requirements

Existing REST API testing tools are typically evaluated using code coverage and crash-based fault metrics. However, recent LLM-based approaches increasingly generate tests from NL requirements to validate functional behaviour, making traditional metrics weak proxies for whether generated tests validate intended behaviour. To address this gap, we present RESTestBench, a benchmark comprising three REST services paired with manually verified NL requirements in both precise and vague variants, enabling controlled and reproducible evaluation of requirement-based test generation. RESTestBench further introduces a requirements-based mutation testing metric that measures the fault-detection effectiveness of a generated test case with respect to a specific requirement, extending the property-based approach of Bartocci et al. . Using RESTestBench, we evaluate two approaches across multiple state-of-the-art LLMs: (i) non-refinement-based generation, and (ii) refinement-based generation guided by interaction with the running SUT. In the refinement experiments, RESTestBench assesses how exposure to the actual implementation, valid or mutated, affects test effectiveness. Our results show that test effectiveness drops considerably when the generator interacts with faulty or mutated code, especially for vague requirements, sometimes negating the benefit of refinement and indicating that incorporating actual SUT behaviour is unnecessary when requirement detail is high.

preprint2022arXiv

BlueCov: Integrating Test Coverage and Model Checking with JBMC

Automated test case generation tools help businesses to write tests and increase the safety net provided by high regression test coverage when making code changes. Test generation needs to cover as much as possible of the uncovered code while avoiding generating redundant tests for code that is already covered by an existing test-suite. In this paper we present our work on a tool for the real world application of integrating formal analysis with automatic test case generation. The test case generation is based on coverage analysis using the Java bounded model checker (JBMC). Counterexamples of the model checker can be translated into Java method calls with specific parameters. In order to avoid the generation of redundant tests, it is necessary to measure the coverage in the exact same way as JBMC generates its coverage goals. Each existing coverage measurement tool uses a slightly different instrumentation and thus a different coverage criterion. This makes integration with a test case generator based on formal analysis difficult. Therefore, we developed BlueCov as a specific runtime coverage measurement tool which uses the exact same coverage criteria as JBMC does. This approach also allows for incremental test-case generation, only generating test coverage for previously untested code, e.g., to complete existing test suites.

preprint2020arXiv

2LS: Heap Analysis and Memory Safety (Competition Contribution)

2LS is a framework for analysis of sequential C programs that can verify and refute program assertions and termination. The 2LS framework is built upon the CPROVER infrastructure and implements template-based synthesis techniques, e.g. to find invariants and ranking functions, and incremental loop unwinding techniques to find counterexamples and k-induction proofs. The main improvements in this year's version are the ability of 2LS to analyse programs requiring combined reasoning about shape and content of dynamic data structures, and an instrumentation for memory safety properties.

preprint2016arXiv

Challenges in Decomposing Encodings of Verification Problems

Modern program verifiers use logic-based encodings of the verification problem that are discharged by a back end reasoning engine. However, instances of such encodings for large programs can quickly overwhelm these back end solvers. Hence, we need techniques to make the solving process scale to large systems, such as partitioning (divide-and-conquer) and abstraction. In recent work, we showed how decomposing the formula encoding of a termination analysis can significantly increase efficiency. The analysis generates a sequence of logical formulas with existentially quantified predicates that are solved by a synthesis-based program analysis engine. However, decomposition introduces abstractions in addition to those required for finding the unknown predicates in the formula, and can hence deteriorate precision. We discuss the challenges associated with such decompositions and their interdependencies with the solving process.

preprint2016arXiv

Sound Static Deadlock Analysis for C/Pthreads (Extended Version)

We present a static deadlock analysis approach for C/pthreads. The design of our method has been guided by the requirement to analyse real-world code. Our approach is sound (i.e., misses no deadlocks) for programs that have defined behaviour according to the C standard, and precise enough to prove deadlock-freedom for a large number of programs. The method consists of a pipeline of several analyses that build on a new context- and thread-sensitive abstract interpretation framework. We further present a lightweight dependency analysis to identify statements relevant to deadlock analysis and thus speed up the overall analysis. In our experimental evaluation, we succeeded to prove deadlock-freedom for 262 programs from the Debian GNU/Linux distribution with in total 2.6 MLOC in less than 11 hours.

preprint2015arXiv

Assisted Coverage Closure

The malfunction of safety-critical systems may cause damage to people and the environment. Software within those systems is rigorously designed and verified according to domain specific guidance, such as ISO26262 for automotive safety. This paper describes academic and industrial co-operation in tool development to support one of the most stringent of the requirements --- achieving full code coverage in requirements-driven testing. We present a verification workflow supported by a tool that integrates the coverage measurement tool RapiCover with the test-vector generator FShell. The tool assists closing the coverage gap by providing the engineer with test vectors that help in debugging coverage-related code quality issues and creating new test cases, as well as justifying the presence of unreachable parts of the code in order to finally achieve full effective coverage according to the required criteria. To illustrate the practical utility of the tool, we report about an application of the tool to a case study from automotive industry.

preprint2015arXiv

Safety Verification and Refutation by k-invariants and k-induction (extended version)

Most software verification tools can be classified into one of a number of established families, each of which has their own focus and strengths. For example, concrete counterexample generation in model checking, invariant inference in abstract interpretation and completeness via annotation for deductive verification. This creates a significant and fundamental usability problem as users may have to learn and use one technique to find potential problems but then need an entirely different one to show that they have been fixed. This paper presents a single, unified algorithm kIkI, which strictly generalises abstract interpretation, bounded model checking and k-induction. This not only combines the strengths of these techniques but allows them to interact and reinforce each other, giving a `single-tool' approach to verification.

preprint2015arXiv

Synthesising Interprocedural Bit-Precise Termination Proofs (extended version)

Proving program termination is key to guaranteeing absence of undesirable behaviour, such as hanging programs and even security vulnerabilities such as denial-of-service attacks. To make termination checks scale to large systems, interprocedural termination analysis seems essential, which is a largely unexplored area of research in termination analysis, where most effort has focussed on difficult single-procedure problems. We present a modular termination analysis for C programs using template-based interprocedural summarisation. Our analysis combines a context-sensitive, over-approximating forward analysis with the inference of under-approximating preconditions for termination. Bit-precise termination arguments are synthesised over lexicographic linear ranking function templates. Our experimental results show that our tool 2LS outperforms state-of-the-art alternatives, and demonstrate the clear advantage of interprocedural reasoning over monolithic analysis in terms of efficiency, while retaining comparable precision.

preprint2014arXiv

Abstract Acceleration of General Linear Loops

We present abstract acceleration techniques for computing loop invariants for numerical programs with linear assignments and conditionals. Whereas abstract interpretation techniques typically over-approximate the set of reachable states iteratively, abstract acceleration captures the effect of the loop with a single, non-iterative transfer function applied to the initial states at the loop head. In contrast to previous acceleration techniques, our approach applies to any linear loop without restrictions. Its novelty lies in the use of the Jordan normal form decomposition of the loop body to derive symbolic expressions for the entries of the matrix modeling the effect of n>=0 iterations of the loop. The entries of such a matrix depend on $n$ through complex polynomial, exponential and trigonometric functions. Therefore, we introduces an abstract domain for matrices that captures the linear inequality relations between these complex expressions. This results in an abstract matrix for describing the fixpoint semantics of the loop. Our approach integrates smoothly into standard abstract interpreters and can handle programs with nested loops and loops containing conditional branches. We evaluate it over small but complex loops that are commonly found in control software, comparing it with other tools for computing linear loop invariants. The loops in our benchmarks typically exhibit polynomial, exponential and oscillatory behaviors that present challenges to existing approaches. Our approach finds non-trivial invariants to prove useful bounds on the values of variables for such loops, clearly outperforming the existing approaches in terms of precision while exhibiting good performance.

preprint2014arXiv

Incremental Bounded Model Checking for Embedded Software (extended version)

Program analysis is on the brink of mainstream in embedded systems development. Formal verification of behavioural requirements, finding runtime errors and automated test case generation are some of the most common applications of automated verification tools based on Bounded Model Checking. Existing industrial tools for embedded software use an off-the-shelf Bounded Model Checker and apply it iteratively to verify the program with an increasing number of unwindings. This approach unnecessarily wastes time repeating work that has already been done and fails to exploit the power of incremental SAT solving. This paper reports on the extension of the software model checker CBMC to support incremental Bounded Model Checking and its successful integration with the industrial embedded software verification tool BTC EmbeddedTester. We present an extensive evaluation over large industrial embedded programs, which shows that incremental Bounded Model Checking cuts runtimes by one order of magnitude in comparison to the standard non-incremental approach, enabling the application of formal verification to large and complex embedded software.

preprint2014arXiv

Speeding Up Logico-Numerical Strategy Iteration (extended version)

We introduce an efficient combination of polyhedral analysis and predicate partitioning. Template polyhedral analysis abstracts numerical variables inside a program by one polyhedron per control location, with a priori fixed directions for the faces. The strongest inductive invariant in such an abstract domain may be computed by upward strategy iteration. If the transition relation includes disjunctions and existential quantifiers (a succinct representation for an exponential set of paths), this invariant can be computed by a combination of strategy iteration and satisfiability modulo theory (SMT) solving. Unfortunately, the above approaches lead to unacceptable space and time costs if applied to a program whose control states have been partitioned according to predicates. We therefore propose a modification of the strategy iteration algorithm where the strategies are stored succinctly, and the linear programs to be solved at each iteration step are simplified according to an equivalence relation. We have implemented the technique in a prototype tool and we demonstrate on a series of examples that the approach performs significantly better than previous strategy iteration techniques.

preprint2013arXiv

Chaining Test Cases for Reactive System Testing (extended version)

Testing of synchronous reactive systems is challenging because long input sequences are often needed to drive them into a state at which a desired feature can be tested. This is particularly problematic in on-target testing, where a system is tested in its real-life application environment and the time required for resetting is high. This paper presents an approach to discovering a test case chain---a single software execution that covers a group of test goals and minimises overall test execution time. Our technique targets the scenario in which test goals for the requirements are given as safety properties. We give conditions for the existence and minimality of a single test case chain and minimise the number of test chains if a single test chain is infeasible. We report experimental results with a prototype tool for C code generated from Simulink models and compare it to state-of-the-art test suite generators.

Peter Schrammel

What is connected

Connect this record

See the researcher in context

Building this map preview

12 published item(s)

RESTestBench: A Benchmark for Evaluating the Effectiveness of LLM-Generated REST API Test Cases from NL Requirements

BlueCov: Integrating Test Coverage and Model Checking with JBMC

2LS: Heap Analysis and Memory Safety (Competition Contribution)

Challenges in Decomposing Encodings of Verification Problems

Sound Static Deadlock Analysis for C/Pthreads (Extended Version)

Assisted Coverage Closure

Safety Verification and Refutation by k-invariants and k-induction (extended version)

Synthesising Interprocedural Bit-Precise Termination Proofs (extended version)

Abstract Acceleration of General Linear Loops

Incremental Bounded Model Checking for Embedded Software (extended version)

Speeding Up Logico-Numerical Strategy Iteration (extended version)

Chaining Test Cases for Reactive System Testing (extended version)