Source author record

Paulette Clancy

Paulette Clancy appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, unclaimed source record

Artificial Intelligence, Computation and Language, physics.chem-ph, Quantitative Methods, Software Engineering

Catalog footprint

What is connected

2 works

5 topics

4 close collaborators

Preprint, 2026, arXiv

Can Coding Agents Reproduce Findings in Computational Materials Science?

Large language models are increasingly deployed as autonomous coding agents and have achieved remarkably strong performance on software engineering benchmarks. However, it is unclear whether such success transfers to computational scientific workflows, where tasks require not only strong coding ability, but also the ability to navigate complex, domain-specific procedures and to interpret results in the context of scientific claims. To address this question, we present AutoMat, a benchmark for evaluating LLM-based agents' ability to reproduce claims from computational materials science. AutoMat poses three interrelated challenges: recovering underspecified computational procedures, navigating specialized toolchains, and determining whether the resulting evidence supports a claim. By working closely with subject matter experts, we curate a set of claims from real materials science papers to test whether coding agents can recover and execute the end-to-end workflow needed to support (or undermine) such claims. We then evaluate multiple representative coding agent settings across several foundation models. Our results show that current LLM-based agents obtain low overall success rates on AutoMat, with the best-performing setting achieving a success rate of only 54.1%. Error analysis further reveals that agents perform worst when workflows must be reconstructed from paper text alone and that they fail primarily due to incomplete procedures, methodological deviations, and execution fragility. Taken together, these findings position AutoMat as both a benchmark for computational scientific reproducibility and a tool for diagnosing the current limitations of agentic systems in AI-for-science settings.

Preprint, 2010, arXiv

Accurate implementation of leaping in space: The spatial partitioned-leaping algorithm

There is a great need for accurate and efficient computational approaches that can account for both the discrete and stochastic nature of chemical interactions as well as spatial inhomogeneities and diffusion. This is particularly true in biology and nanoscale materials science, where the common assumptions of deterministic dynamics and well-mixed reaction volumes often break down. In this article, we present a spatial version of the partitioned-leaping algorithm (PLA), a multiscale accelerated-stochastic simulation approach built upon the tau-leaping framework of Gillespie. We pay special attention to the details of the implementation, particularly as it pertains to the time step calculation procedure. We point out conceptual errors that have been made in this regard in prior implementations of spatial tau-leaping and illustrate the manifestation of these errors through practical examples. Finally, we discuss the fundamental difficulties associated with incorporating efficient exact-stochastic techniques, such as the next-subvolume method, into a spatial-leaping framework and suggest possible solutions.
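
For readers unfamiliar with the tau-leaping framework that the abstract builds on, the following is a minimal sketch of plain, non-spatial tau-leaping with a fixed step. It is not the partitioned-leaping algorithm and does not reproduce the paper's time step calculation procedure. The function names (`poisson_sample`, `tau_leap`), the step-halving guard against negative populations, and the example isomerization system with its rate constants are all assumptions made for illustration.

```python
import math
import random


def poisson_sample(mean, rng):
    """Draw a Poisson variate by Knuth's multiplication method (adequate for small means)."""
    if mean <= 0.0:
        return 0
    threshold = math.exp(-mean)
    count = 0
    product = rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def tau_leap(state, reactions, tau, t_end, seed=0):
    """Advance a well-mixed system by fixed-step tau-leaping.

    state:     dict mapping species name -> copy number
    reactions: list of (propensity_function, stoichiometry_dict) pairs
    Returns a list of (time, state) snapshots; the input state is not modified.
    """
    rng = random.Random(seed)
    state = dict(state)
    t = 0.0
    trajectory = [(t, dict(state))]
    while t < t_end:
        step = min(tau, t_end - t)
        while True:
            # Propensities are evaluated once and held fixed over the leap.
            firings = [poisson_sample(a(state) * step, rng) for a, _ in reactions]
            proposal = dict(state)
            for k, (_, change) in zip(firings, reactions):
                for species, delta in change.items():
                    proposal[species] += k * delta
            if min(proposal.values()) >= 0:
                break
            # Illustrative guard (an assumption): retry with a smaller step
            # whenever a leap would drive a population negative.
            step /= 2.0
        state = proposal
        t += step
        trajectory.append((t, dict(state)))
    return trajectory


# Hypothetical example: reversible isomerization A <-> B.
example_reactions = [
    (lambda s: 1.0 * s["A"], {"A": -1, "B": 1}),  # A -> B, assumed rate 1.0
    (lambda s: 0.5 * s["B"], {"A": 1, "B": -1}),  # B -> A, assumed rate 0.5
]
example_trajectory = tau_leap({"A": 100, "B": 0}, example_reactions, tau=0.01, t_end=1.0)
print(example_trajectory[-1])
```

Each leap fires every reaction channel a Poisson-distributed number of times, with mean equal to the propensity times the step. A spatial version would additionally treat diffusion between subvolumes as reaction channels, which is where the step selection issues raised in the abstract arise.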