Source author record

Cristina David

Cristina David appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Logic in Computer Science Programming Languages Artificial Intelligence Software Engineering Computational Engineering, Finance, and Science Human-Computer Interaction Information Retrieval Machine Learning

Catalog footprint

What is connected

9works

8topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2025arXiv

AI-Assisted Authoring for Transparent, Data-Driven Documents

We introduce _transparent documents_, interactive web-based scholarly articles which allow readers to explore the relationship to the underlying data by hovering over fragments of text, and present an LLM-based tool for authoring transparent documents, building on recent developments in data provenance for general-purpose programming languages. As a target platform, our implementation uses Fluid, an open source programming language with a provenance-tracking runtime. Our agent-based tool supports a human author during the creation of transparent documents, identifying fragments of text which can be computed from data, such as numerical values selected from records or computed by aggregations like sum and mean, comparatives and superlatives like _better than_ and _largest_, trend-adjectives like _growing_, and similar quantitative or semi-quantitative phrases, and then attempts to synthesise a suitable Fluid query over the data which generates the target string. The resulting expression is inserted into the article's web page, turning the static text fragment into an interactable data-driven element able to reveal the data that underwrites the natural language claim. We evaluate our approach on a subset of SciGen, an open source dataset consisting of tables from scientific articles and their corresponding descriptions, which we extend with hand-generated counterfactual test cases to evaluate how well machine-generated expressions generalise. Our results show that gpt4o is often able to synthesise compound expressions extensionally compatible with our gold solutions.

preprint2022arXiv

Decomposition Without Regret

Programming languages are embracing both functional and object-oriented paradigms. A key difference between the two paradigms is the way of achieving data abstraction. That is, how to organize data with associated operations. There are important tradeoffs between functional and object-oriented decomposition in terms of extensibility and expressiveness. Unfortunately, programmers are usually forced to select a particular decomposition style in the early stage of programming. Once the wrong design decision has been made, the price for switching to the other decomposition style could be rather high since pervasive manual refactoring is often needed. To address this issue, this paper presents a bidirectional transformation system between functional and object-oriented decomposition. We formalize the core of the system in the FOOD calculus, which captures the essence of functional and object-oriented decomposition. We prove that the transformation preserves the type and semantics of the original program. We further implement FOOD in Scala as a translation tool called Cook and conduct several case studies to demonstrate the applicability and effectiveness of Cook.

preprint2022arXiv

Using Graph Neural Networks for Program Termination

Termination analyses investigate the termination behavior of programs, intending to detect nontermination, which is known to cause a variety of program bugs (e.g. hanging programs, denial-of-service vulnerabilities). Beyond formal approaches, various attempts have been made to estimate the termination behavior of programs using neural networks. However, the majority of these approaches continue to rely on formal methods to provide strong soundness guarantees and consequently suffer from similar limitations. In this paper, we move away from formal methods and embrace the stochastic nature of machine learning models. Instead of aiming for rigorous guarantees that can be interpreted by solvers, our objective is to provide an estimation of a program's termination behavior and of the likely reason for nontermination (when applicable) that a programmer can use for debugging purposes. Compared to previous approaches using neural networks for program termination, we also take advantage of the graph representation of programs by employing Graph Neural Networks. To further assist programmers in understanding and debugging nontermination bugs, we adapt the notions of attention and semantic segmentation, previously used for other application domains, to programs. Overall, we designed and implemented classifiers for program termination based on Graph Convolutional Networks and Graph Attention Networks, as well as a semantic segmentation Graph Neural Network that localizes AST nodes likely to cause nontermination. We also illustrated how the information provided by semantic segmentation can be combined with program slicing to further aid debugging.

preprint2015arXiv

Danger Invariants

Static analysers search for overapproximating proofs of safety commonly known as safety invariants. Fundamentally, such analysers summarise traces into sets of states, thus trading the ability to distinguish traces for computational tractability. Conversely, static bug finders (e.g. Bounded Model Checking) give evidence for the failure of an assertion in the form of a counterexample, which can be inspected by the user. However, static bug finders fail to scale when analysing programs with bugs that require many iterations of a loop as the computational effort grows exponentially with the depth of the bug. We propose a novel approach for finding bugs, which delivers the performance of abstract interpretation together with the concrete precision of BMC. To do this, we introduce the concept of danger invariants -- the dual to safety invariants. Danger invariants summarise sets of traces that are guaranteed to reach an error state. This summarisation allows us to find deep bugs without false alarms and without explicitly unwinding loops. We present a second-order formulation of danger invariants and use the Second-Order SAT solver described in previous work to compute danger invariants for intricate programs taken from the literature.

preprint2015arXiv

Second-Order Propositional Satisfiability

Fundamentally, every static program analyser searches for a proof through a combination of heuristics providing candidate solutions and a candidate validation technique. Essentially, the heuristic reduces a second-order problem to a first-order/propositional one, while the validation is often just a call to a SAT/SMT solver. This results in a monolithic design of such analyses that conflates the formulation of the problem with the solving process. Consequently, any change to the latter causes changes to the whole analysis. This design is dictated by the state of the art in solver technology. While SAT/SMT solvers have experienced tremendous progress, there are barely any second-order solvers. This paper takes a step towards addressing this situation by proposing a decidable fragment of second-order logic that is still expressive enough to capture numerous program analysis problems (e.g. safety proving, bug finding, termination and non-termination proving, superoptimisation). We refer to the satisfiability problem for this fragment as Second-Order SAT and show it is NEXPTIME-complete. Finally, we build a decision procedure for Second-Order SAT based on program synthesis and present experimental evidence that our approach is tractable for program analysis problems.

preprint2015arXiv

Synthesising Interprocedural Bit-Precise Termination Proofs (extended version)

Proving program termination is key to guaranteeing absence of undesirable behaviour, such as hanging programs and even security vulnerabilities such as denial-of-service attacks. To make termination checks scale to large systems, interprocedural termination analysis seems essential, which is a largely unexplored area of research in termination analysis, where most effort has focussed on difficult single-procedure problems. We present a modular termination analysis for C programs using template-based interprocedural summarisation. Our analysis combines a context-sensitive, over-approximating forward analysis with the inference of under-approximating preconditions for termination. Bit-precise termination arguments are synthesised over lexicographic linear ranking function templates. Our experimental results show that our tool 2LS outperforms state-of-the-art alternatives, and demonstrate the clear advantage of interprocedural reasoning over monolithic analysis in terms of efficiency, while retaining comparable precision.

preprint2015arXiv

Using Program Synthesis for Program Analysis

In this paper, we identify a fragment of second-order logic with restricted quantification that is expressive enough to capture numerous static analysis problems (e.g. safety proving, bug finding, termination and non-termination proving, superoptimisation). We call this fragment the {\it synthesis fragment}. Satisfiability of a formula in the synthesis fragment is decidable over finite domains; specifically the decision problem is NEXPTIME-complete. If a formula in this fragment is satisfiable, a solution consists of a satisfying assignment from the second order variables to \emph{functions over finite domains}. To concretely find these solutions, we synthesise \emph{programs} that compute the functions. Our program synthesis algorithm is complete for finite state programs, i.e. every \emph{function} over finite domains is computed by some \emph{program} that we can synthesise. We can therefore use our synthesiser as a decision procedure for the synthesis fragment of second-order logic, which in turn allows us to use it as a powerful backend for many program analysis tasks. To show the tractability of our approach, we evaluate the program synthesiser on several static analysis problems.

preprint2014arXiv

Propositional Reasoning about Safety and Termination of Heap-Manipulating Programs

This paper shows that it is possible to reason about the safety and termination of programs handling potentially cyclic, singly-linked lists using propositional reasoning even when the safety invariants and termination arguments depend on constraints over the lengths of lists. For this purpose, we propose the theory SLH of singly-linked lists with length, which is able to capture non-trivial interactions between shape and arithmetic. When using the theory of bit-vector arithmetic as a background, SLH is efficiently decidable via a reduction to SAT. We show the utility of SLH for software verification by using it to express safety invariants and termination arguments for programs manipulating potentially cyclic, singly-linked lists with unrestricted, unspecified sharing. We also provide an implementation of the decision procedure and use it to check safety and termination proofs for several heap-manipulating programs.

preprint2014arXiv

Unrestricted Termination and Non-Termination Arguments for Bit-Vector Programs

Proving program termination is typically done by finding a well-founded ranking function for the program states. Existing termination provers typically find ranking functions using either linear algebra or templates. As such they are often restricted to finding linear ranking functions over mathematical integers. This class of functions is insufficient for proving termination of many terminating programs, and furthermore a termination argument for a program operating on mathematical integers does not always lead to a termination argument for the same program operating on fixed-width machine integers. We propose a termination analysis able to generate nonlinear, lexicographic ranking functions and nonlinear recurrence sets that are correct for fixed-width machine arithmetic and floating-point arithmetic Our technique is based on a reduction from program \emph{termination} to second-order \emph{satisfaction}. We provide formulations for termination and non-termination in a fragment of second-order logic with restricted quantification which is decidable over finite domains. The resulted technique is a sound and complete analysis for the termination of finite-state programs with fixed-width integers and IEEE floating-point arithmetic.

Cristina David

What is connected

Connect this record

See the researcher in context

Building this map preview

9 published item(s)

AI-Assisted Authoring for Transparent, Data-Driven Documents

Decomposition Without Regret

Using Graph Neural Networks for Program Termination

Danger Invariants

Second-Order Propositional Satisfiability

Synthesising Interprocedural Bit-Precise Termination Proofs (extended version)

Using Program Synthesis for Program Analysis

Propositional Reasoning about Safety and Termination of Heap-Manipulating Programs

Unrestricted Termination and Non-Termination Arguments for Bit-Vector Programs