Source author record

Gabriel Scherer

Gabriel Scherer appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Programming Languages Logic in Computer Science math.CT

Catalog footprint

What is connected

10works

3topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2026arXiv

The free bifibration on a functor

We consider the problem of constructing the free bifibration generated by a functor of categories $p : D \to C$. This problem was previously considered by Lamarche, and is closely related to the problem, considered by Dawson, Paré, and Pronk, of ``freely adjoining adjoints'' to a category. We develop a proof-theoretic approach to the problem, beginning with a construction of the free bifibration $Λ_p : Bifib(p)\to C$ in which objects of $Bifib(p)$ are formulas of a primitive ``bifibrational logic'', and arrows are derivations in a cut-free sequent calculus modulo a notion of permutation equivalence. We show that instantiating the construction to the identity functor generates a _zigzag double category_ $\mathbb{Z}(C)$, which is also the free double category with companions and conjoints (or fibrant double category) on $C$. The approach adapts smoothly to the more general task of building $(P,N)$-fibrations, where one only asks for pushforwards along arrows in $P$ and pullbacks along arrows in $N$ for some subsets of arrows; this encompasses Kock and Joyal's notion of _ambifibration_ when $(P,N)$ form a factorization system. We establish a series of progressively stronger normal forms, guided by ideas of _focusing_ from proof theory, and obtain a canonicity result under assumption that the base category is factorization preordered relative to $P$ and $N$. This canonicity result allows us to decide the word problem and to enumerate relative homsets without duplicates. Finally, we describe several examples of a combinatorial nature, including a category of plane trees generated as a free bifibration over $ω$, and a category of increasing forests generated as a free ambifibration over $Δ$, which contains the lattices of noncrossing partitions as quotients of its fibers by the Beck-Chevalley condition for bicartesian squares.

preprint2022arXiv

Debootstrapping without Archeology: Stacked Implementations in Camlboot

Context: It is common for programming languages that their reference implementation is implemented in the language itself. This requires a "bootstrap": a copy of a previous version of the implementation is provided along with the sources, to be able to run the implementation itself. Those bootstrap files are opaque binaries; they could contain bugs, or even malicious changes that could reproduce themselves when running the source version of the language implementation -- this is called the "trusting trust attack". For this reason, a collective project called Bootstrappable was launched in 2016 to remove those bootstraps, providing alternative build paths that do not rely on opaque binaries. Inquiry: Debootstrapping generally combines a mix of two approaches. The "archaeological" approach works by locating old versions of systems, or legacy alternative implementations, that do not need the bootstrap, and by preserving or restoring the ability to run them. The "tailored" approach re-implements a new, non-bootstrapped implementation of the system to debootstrap. Currently, the "tailored" approach is dominant for low-level system components (C, coreutils), and the "archaeological" approach is dominant among the few higher-level languages that were debootstrapped. Approach: We advocate for the benefits of "tailored" debootstrapping implementations of high-level languages. The new implementation needs not be production-ready, it suffices that it is able to run the reference implementation correctly. We argue that this is feasible with a reasonable development effort, with several side benefits besides debootstrapping. Knowledge: We propose a specific design of composing/stacking several implementations: a reference interpreter for the language of interest, implemented in a small subset of the language, and a compiler for this small subset (in another language). Developing a reference interpreter is valuable independently of debootstrapping: it may help clarify the language semantics, and can be reused for other purposes such as differential testing of the other implementations. Grounding: We present Camlboot, our project to debootstrap the OCaml compiler, version 4.07. Once we converged on this final design, the last version of Camlboot took about a person-month of implementation effort, demonstrating feasibility. Using diverse double-compilation, we were able to prove the absence of trusting trust attack in the existing bootstrap of the standard OCaml implementation. Importance: To our knowledge, this document is the first scholarly discussion of "tailored" debootstrapping for high-level programming languages. Debootstrapping is an interesting problem which recently grew an active community of free software contributors, but so far the interactions with the programming-language research community have been minimal. We share our experience on Camlboot, trying to highlight aspects that are of interest to other language designers and implementors; we hope to foster stronger ties between the Bootstrappable project and relevant academic communities. In particular, the debootstrapping experience has been an interesting reflection on OCaml design and implementation, and we hope that other language implementors would find it equally valuable.

preprint2021arXiv

Tail Modulo Cons

OCaml function calls consume space on the system stack. Operating systems set default limits on the stack space which are much lower than the available memory. If a program runs out of stack space, they get the dreaded "Stack Overflow" exception -- they crash. As a result, OCaml programmers have to be careful, when they write recursive functions, to remain in the so-called _tail-recursive_ fragment, using _tail_ calls that do not consume stack space. This discipline is a source of difficulties for both beginners and experts. Beginners have to be taught recursion, and then tail-recursion. Experts disagree on the "right" way to write `List.map`. The direct version is beautiful but not tail-recursive, so it crashes on larger inputs. The naive tail-recursive transformation is (slightly) slower than the direct version, and experts may want to avoid that cost. Some libraries propose horrible implementations, unrolling code by hand, to compensate for this performance loss. In general, tail-recursion requires the programmer to manually perform sophisticated program transformations. In this work we propose an implementation of "Tail Modulo Cons" (TMC) for OCaml. TMC is a program transformation for a fragment of non-tail-recursive functions, that rewrites them in _destination-passing style_. The supported fragment is smaller than other approaches such as continuation-passing-style, but the performance of the transformed code is on par with the direct, non-tail-recursive version. Many useful functions that traverse a recursive datastructure and rebuild another recursive structure are in the TMC fragment, in particular `List.map` (and `List.filter`, `List.append`, etc.). Finally those functions can be written in a way that is beautiful, correct on all inputs, and efficient.

preprint2020arXiv

Dependent Pearl: Normalization by realizability

For those of us who generally live in the world of syntax, semantic proof techniques such as reducibility, realizability or logical relations seem somewhat magical despite -- or perhaps due to -- their seemingly unreasonable effectiveness. Why do they work? At which point in the proof is "the real work" done? Hoping to build a programming intuition of these proofs, we implement a normalization argument for the simply-typed lambda-calculus with sums: instead of a proof, it is described as a program in a dependently-typed meta-language. The semantic technique we set out to study is Krivine's classical realizability, which amounts to a proof-relevant presentation of reducibility arguments -- unary logical relations. Reducibility assigns a predicate to each type, realizability assigns a set of realizers, which are abstract machines that extend lambda-terms with a first-class notion of contexts. Normalization is a direct consequence of an adequacy theorem or "fundamental lemma", which states that any well-typed term translates to a realizer of its type. We show that the adequacy theorem, when written as a dependent program, corresponds to an evaluation procedure. In particular, a weak normalization proof precisely computes a series of reduction from the input term to a normal form. Interestingly, the choices that we make when we define the reducibility predicates -- truth and falsity witnesses for each connective -- determine the evaluation order of the proof, with each datatype constructor behaving in a lazy or strict fashion. While most of the ideas in this presentation are folklore among specialists, our dependently-typed functional program provides an accessible presentation to a wider audience. In particular, our work provides a gentle introduction to abstract machine calculi which have recently been used as an effective research vehicle.

preprint2017arXiv

Correctness of Speculative Optimizations with Dynamic Deoptimization

High-performance dynamic language implementations make heavy use of speculative optimizations to achieve speeds close to statically compiled languages. These optimizations are typically performed by a just-in-time compiler that generates code under a set of assumptions about the state of the program and its environment. In certain cases, a program may execute code compiled under assumptions that are no longer valid. The implementation must then deoptimize the program on-the-fly; this entails finding semantically equivalent code that does not rely on invalid assumptions, translating program state to that expected by the target code, and transferring control. This paper looks at the interaction between optimization and deoptimization, and shows that reasoning about speculation is surprisingly easy when assumptions are made explicit in the program representation. This insight is demonstrated on a compiler intermediate representation, named \sourir, modeled after the high-level representation for a dynamic language. Traditional compiler optimizations such constant folding, dead code elimination, and function inlining are shown to be correct in the presence of assumptions. Furthermore, the paper establishes the correctness of compiler transformations specific to deoptimization: namely unrestricted deoptimization, predicate hoisting, and assume composition.

preprint2016arXiv

Deciding equivalence with sums and the empty type

The logical technique of focusing can be applied to the $λ$-calculus; in a simple type system with atomic types and negative type formers (functions, products, the unit type), its normal forms coincide with $βη$-normal forms. Introducing a saturation phase gives a notion of quasi-normal forms in presence of positive types (sum types and the empty type). This rich structure let us prove the decidability of $βη$-equivalence in presence of the empty type, the fact that it coincides with contextual equivalence, and a finite model property.

preprint2013arXiv

GADTs meet subtyping

While generalized algebraic datatypes (\GADTs) are now considered well-understood, adding them to a language with a notion of subtyping comes with a few surprises. What does it mean for a \GADT parameter to be covariant? The answer turns out to be quite subtle. It involves fine-grained properties of the subtyping relation that raise interesting design questions. We allow variance annotations in \GADT definitions, study their soundness, and present a sound and complete algorithm to check them. Our work may be applied to real-world ML-like languages with explicit subtyping such as OCaml, or to languages with general subtyping constraints.

preprint2013arXiv

Tracking Data-Flow with Open Closure Types

Type systems hide data that is captured by function closures in function types. In most cases this is a beneficial design that favors simplicity and compositionality. However, some applications require explicit information about the data that is captured in closures. This paper introduces open closure types, that is, function types that are decorated with type contexts. They are used to track data-flow from the environment into the function closure. A simply-typed lambda calculus is used to study the properties of the type theory of open closure types. A distinctive feature of this type theory is that an open closure type of a function can vary in different type contexts. To present an application of the type theory, it is shown that a type derivation establishes a simple non-interference property in the sense of information-flow theory. A publicly available prototype implementation of the system can be used to experiment with type derivations for example programs.

preprint2012arXiv

GADT meet Subtyping

While generalized abstract datatypes (GADT) are now considered well-understood, adding them to a language with a notion of subtyping comes with a few surprises. What does it mean for a GADT parameter to be covariant? The answer turns out to be quite subtle. It involves fine-grained properties of the subtyping relation that raise interesting design questions. We allow variance annotations in GADT definitions, study their soundness, and present a sound and complete algorithm to check them. Our work may be applied to real-world ML-like languages with explicit subtyping such as OCaml, or to languages with general subtyping constraints.

preprint2012arXiv

On Irrelevance and Algorithmic Equality in Predicative Type Theory

Dependently typed programs contain an excessive amount of static terms which are necessary to please the type checker but irrelevant for computation. To separate static and dynamic code, several static analyses and type systems have been put forward. We consider Pfenning's type theory with irrelevant quantification which is compatible with a type-based notion of equality that respects eta-laws. We extend Pfenning's theory to universes and large eliminations and develop its meta-theory. Subject reduction, normalization and consistency are obtained by a Kripke model over the typed equality judgement. Finally, a type-directed equality algorithm is described whose completeness is proven by a second Kripke model.

Gabriel Scherer

What is connected

Connect this record

See the researcher in context

Building this map preview

10 published item(s)

The free bifibration on a functor

Debootstrapping without Archeology: Stacked Implementations in Camlboot

Tail Modulo Cons

Dependent Pearl: Normalization by realizability

Correctness of Speculative Optimizations with Dynamic Deoptimization

Deciding equivalence with sums and the empty type

GADTs meet subtyping

Tracking Data-Flow with Open Closure Types

GADT meet Subtyping

On Irrelevance and Algorithmic Equality in Predicative Type Theory