Source author record

Sam Tobin-Hochstadt

Sam Tobin-Hochstadt appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Programming Languages

Catalog footprint

What is connected

9works

1topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2022arXiv

Forward Build Systems, Formally

Build systems are a fundamental part of software construction, but their correctness has received comparatively little attention, relative to more prominent parts of the toolchain. In this paper, we address the correctness of \emph{forward build systems}, which automatically determine the dependency structure of the build, rather than having it specified by the programmer. We first define what it means for a forward build system to be correct -- it must behave identically to simply executing the programmer-specified commands in order. Of course, realistic build systems avoid repeated work, stop early when possible, and run commands in parallel, and we prove that these optimizations, as embodied in the recent forward build system \textsc{Rattle}, preserve our definition of correctness. Along the way, we show that other forward build systems, such as \textsc{Fabricate} and \textsc{Memoize}, are also correct. We carry out all of our work in \Agda, and describe in detail the assumptions underlying both \textsc{Rattle} itself and our modeling of it.

preprint2020arXiv

Build Scripts with Perfect Dependencies

Build scripts for most build systems describe the actions to run, and the dependencies between those actions---but often build scripts get those dependencies wrong. Most build scripts have both too few dependencies (leading to incorrect build outputs) and too many dependencies (leading to excessive rebuilds and reduced parallelism). Any programmer who has wondered why a small change led to excess compilation, or who resorted to a "clean" step, has suffered the ill effects of incorrect dependency specification. We outline a build system where dependencies are not specified, but instead captured by tracing execution. The consequence is that dependencies are always correct by construction and build scripts are easier to write. The simplest implementation of our approach would lose parallelism, but we are able to recover parallelism using speculation.

preprint2016arXiv

Higher-order symbolic execution for contract verification and refutation

We present a new approach to automated reasoning about higher-order programs by endowing symbolic execution with a notion of higher-order, symbolic values. Our approach is sound and relatively complete with respect to a first-order solver for base type values. Therefore, it can form the basis of automated verification and bug-finding tools for higher-order programs. To validate our approach, we use it to develop and evaluate a system for verifying and refuting behavioral software contracts of components in a functional language, which we call soft contract verification. In doing so, we discover a mutually beneficial relation between behavioral contracts and higher-order symbolic execution. Our system uses higher-order symbolic execution, leveraging contracts as a source of symbolic values including unknown behavioral values, and employs an updatable heap of contract invariants to reason about flow-sensitive facts. Whenever a contract is refuted, it reports a concrete counterexample reproducing the error, which may involve solving for an unknown function. The approach is able to analyze first-class contracts, recursive data structures, unknown functions, and control-flow-sensitive refinements of values, which are all idiomatic in dynamic languages. It makes effective use of an off-the-shelf solver to decide problems without heavy encodings. The approach is competitive with a wide range of existing tools---including type systems, flow analyzers, and model checkers---on their own benchmarks. We have built a tool which analyzes programs written in Racket, and report on its effectiveness in verifying and refuting contracts.

preprint2016arXiv

Occurrence Typing Modulo Theories

We present a new type system combining occurrence typing, previously used to type check programs in dynamically-typed languages such as Racket, JavaScript, and Ruby, with dependent refinement types. We demonstrate that the addition of refinement types allows the integration of arbitrary solver-backed reasoning about logical propositions from external theories. By building on occurrence typing, we can add our enriched type system as an extension of Typed Racket---adding dependency and refinement reuses the existing formalism while increasing its expressiveness. Dependent refinement types allow Typed Racket programmers to express rich type relationships, ranging from data structure invariants such as red-black tree balance to preconditions such as vector bounds. Refinements allow programmers to embed the propositions that occurrence typing in Typed Racket already reasons about into their types. Further, extending occurrence typing to refinements allows us to make the underlying formalism simpler and more powerful. In addition to presenting the design of our system, we present a formal model of the system, show how to integrate it with theories over both linear arithmetic and bitvectors, and evaluate the system in the context of the full Typed Racket implementation. Specifically, we take safe vector access as a case study, and examine all vector accesses in a 56,000 line corpus of Typed Racket programs. Our system is able to prove that 50% of these are safe with no new annotation, and with a few annotations and modifications, we can capture close to 80%.

preprint2014arXiv

Soft Contract Verification

Behavioral software contracts are a widely used mechanism for governing the flow of values between components. However, run-time monitoring and enforcement of contracts imposes significant overhead and delays discovery of faulty components to run-time. To overcome these issues, we present soft contract verification, which aims to statically prove either complete or partial contract correctness of components, written in an untyped, higher-order language with first-class contracts. Our approach uses higher-order symbolic execution, leveraging contracts as a source of symbolic values including unknown behavioral values, and employs an updatable heap of contract invariants to reason about flow-sensitive facts. We prove the symbolic execution soundly approximates the dynamic semantics and that verified programs can't be blamed. The approach is able to analyze first-class contracts, recursive data structures, unknown functions, and control-flow-sensitive refinements of values, which are all idiomatic in dynamic languages. It makes effective use of an off-the-shelf solver to decide problems without heavy encodings. The approach is competitive with a wide range of existing tools---including type systems, flow analyzers, and model checkers---on their own benchmarks.

preprint2012arXiv

Higher-Order Symbolic Execution via Contracts

We present a new approach to automated reasoning about higher-order programs by extending symbolic execution to use behavioral contracts as symbolic values, enabling symbolic approximation of higher-order behavior. Our approach is based on the idea of an abstract reduction semantics that gives an operational semantics to programs with both concrete and symbolic components. Symbolic components are approximated by their contract and our semantics gives an operational interpretation of contracts-as-values. The result is a executable semantics that soundly predicts program behavior, including contract failures, for all possible instantiations of symbolic components. We show that our approach scales to an expressive language of contracts including arbitrary programs embedded as predicates, dependent function contracts, and recursive contracts. Supporting this feature-rich language of specifications leads to powerful symbolic reasoning using existing program assertions. We then apply our approach to produce a verifier for contract correctness of components, including a sound and computable approximation to our semantics that facilitates fully automated contract verification. Our implementation is capable of verifying contracts expressed in existing programs, and of justifying valuable contract-elimination optimizations.

preprint2011arXiv

Extensible Pattern Matching in an Extensible Language

Pattern matching is a widely used technique in functional languages, especially those in the ML and Haskell traditions, where it is at the core of the semantics. In languages in the Lisp tradition, in contrast, pattern matching it typically provided by libraries built with macros. We present match, a sophisticated pattern matcher for Racket, implemented as language extension. using macros. The system supports novel and widely-useful pattern-matching forms, and is itself extensible. The extensibility of match is implemented via a general technique for creating extensible language extensions.

preprint2011arXiv

Semantic Solutions to Program Analysis Problems

Problems in program analysis can be solved by developing novel program semantics and deriving abstractions conventionally. For over thirty years, higher-order program analysis has been sold as a hard problem. Its solutions have required ingenuity and complex models of approximation. We claim that this difficulty is due to premature focus on abstraction and propose a new approach that emphasizes semantics. Its simplicity enables new analyses that are beyond the current state of the art.

preprint2011arXiv

The Design and Implementation of Typed Scheme: From Scripts to Programs

When scripts in untyped languages grow into large programs, maintaining them becomes difficult. A lack of explicit type annotations in typical scripting languages forces programmers to must (re)discover critical pieces of design information every time they wish to change a program. This analysis step both slows down the maintenance process and may even introduce mistakes due to the violation of undiscovered invariants. This paper presents Typed Scheme, an explicitly typed extension of PLT Scheme, an untyped scripting language. Its type system is based on the novel notion of occurrence typing, which we formalize and mechanically prove sound. The implementation of Typed Scheme additionally borrows elements from a range of approaches, including recursive types, true unions and subtyping, plus polymorphism combined with a modicum of local inference. The formulation of occurrence typing naturally leads to a simple and expressive version of predicates to describe refinement types. A Typed Scheme program can use these refinement types to keep track of arbitrary classes of values via the type system. Further, we show how the Typed Scheme type system, in conjunction with simple recursive types, is able to encode refinements of existing datatypes, thus expressing both proposed variations of refinement types.

Sam Tobin-Hochstadt

What is connected

Connect this record

See the researcher in context

Building this map preview

9 published item(s)

Forward Build Systems, Formally

Build Scripts with Perfect Dependencies

Higher-order symbolic execution for contract verification and refutation

Occurrence Typing Modulo Theories

Soft Contract Verification

Higher-Order Symbolic Execution via Contracts

Extensible Pattern Matching in an Extensible Language

Semantic Solutions to Program Analysis Problems

The Design and Implementation of Typed Scheme: From Scripts to Programs