Source author record

Nicholas Roberts

Nicholas Roberts appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Machine Learning, Artificial Intelligence, Computation and Language, Computer Vision, math.NA, physics.comp-ph, Populations and Evolution

Catalog footprint

What is connected

4 works

7 topics

4 close collaborators


preprint · 2026 · arXiv

Breakeven complexity: A new perspective on neural partial differential equation solvers

Neural surrogate solvers of partial differential equations (PDEs) promise dramatic speedups over numerical methods, especially in scenarios requiring many solves. However, current accuracy-based evaluations do not fully consider two central issues: (1) neural solvers incur substantial up-front costs for data generation, training, and tuning; and (2) classical solvers can also generate low-fidelity solutions at a sufficiently low simulation cost. To explicitly account for these realities and fully incorporate end-to-end costs, we propose an evaluation framework centered on breakeven complexity, a metric that counts the forward solves before a learned solver is cost-effective relative to an error-equivalent traditional solver. To evaluate this measure, we apply scaling laws to determine how much training budget to allocate to data generation and discuss how to achieve smooth error-matching in diverse settings. We evaluate the breakeven complexity of multiple neural PDE solvers on three PDEs on 2D periodic domains from APEBench and a novel benchmark of flows past multiple obstacles generated by the GPU-native PyFR code. Among other findings, our results suggest that neural PDE solvers become more effective as problems get harder in terms of cost, dimension, rollout, physics regime (e.g. higher Reynolds number), etc.
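The cost argument in the abstract above can be illustrated with a short calculation. This is a minimal sketch, not the paper's definition: it assumes a single lumped up-front cost (data generation, training, and tuning) and constant per-solve costs, and it assumes the classical cost refers to a solver already matched to the neural solver's error. The function name, parameter names, and numbers are all hypothetical.

```python
import math


def breakeven_solves(upfront_cost, neural_cost_per_solve, classical_cost_per_solve):
    """Smallest N with upfront_cost + N * neural_cost <= N * classical_cost.

    All costs share one unit (for example GPU-seconds). The classical cost is
    assumed to belong to an error-equivalent traditional solver.
    """
    saving_per_solve = classical_cost_per_solve - neural_cost_per_solve
    if saving_per_solve <= 0:
        # The learned solver is never cheaper per solve, so it never pays off.
        return math.inf
    return math.ceil(upfront_cost / saving_per_solve)


# Hypothetical numbers: 1000 units up front, 0.5 per neural solve,
# 2.5 per classical solve at matched error.
n_star = breakeven_solves(1000.0, 0.5, 2.5)
print(n_star)  # 500
```

Under these assumptions, a lower breakeven count means the learned solver pays for itself sooner, so a costlier classical solve at matched error shortens the payback period.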

preprint · 2026 · arXiv

Tabby: A Language Model Architecture for Tabular and Structured Data Synthesis

While advances in large language models (LLMs) have greatly improved the quality of synthetic text data in recent years, synthesizing tabular data has received relatively less attention. We address this disparity with Tabby, a simple but powerful post-training modification to the standard Transformer language model architecture, enabling its use for tabular dataset synthesis. Tabby enables the representation of differences across columns using Gated Mixture-of-Experts, with column-specific sets of parameters. Empirically, Tabby results in data quality near or equal to that of real data. By pairing our novel LLM table training technique, Plain, with Tabby, we observe up to a 44% improvement in quality over previous methods. We also show that Tabby extends beyond tables to more general structured data, reaching parity with real data on a nested JSON dataset as well.

preprint · 2023 · arXiv

NAS-Bench-360: Benchmarking Neural Architecture Search on Diverse Tasks

Most existing neural architecture search (NAS) benchmarks and algorithms prioritize well-studied tasks, e.g. image classification on CIFAR or ImageNet. This makes the performance of NAS approaches in more diverse areas poorly understood. In this paper, we present NAS-Bench-360, a benchmark suite to evaluate methods on domains beyond those traditionally studied in architecture search, and use it to address the following question: do state-of-the-art NAS methods perform well on diverse tasks? To construct the benchmark, we curate ten tasks spanning a diverse array of application domains, dataset sizes, problem dimensionalities, and learning objectives. Each task is carefully chosen to interoperate with modern CNN-based search methods while possibly being far afield from its original development domain. To speed up and reduce the cost of NAS research, for two of the tasks we release the precomputed performance of 15,625 architectures comprising a standard CNN search space. Experimentally, we show the need for more robust NAS evaluation of the kind NAS-Bench-360 enables by showing that several modern NAS procedures perform inconsistently across the ten tasks, with many catastrophically poor results. We also demonstrate how NAS-Bench-360 and its associated precomputed results will enable future scientific discoveries by testing whether several recent hypotheses promoted in the NAS literature hold on diverse tasks. NAS-Bench-360 is hosted at https://nb360.ml.cmu.edu.

preprint · 2014 · arXiv

The Effects of Regional Vaccination Heterogeneity on Measles Outbreaks with France as a Case Study

The rubeola virus, commonly known as measles, is one of the major causes of vaccine-preventable deaths among children worldwide, even though an effective vaccine is widely available. Even in developed countries, elimination efforts have fallen short, as seen in recent outbreaks in Europe, where over 30,000 cases were reported in 2010. The string of measles outbreaks in France from 2008 to 2011 is of particular interest because of the documented disparity in regional vaccination coverage. The impact of heterogeneous vaccine coverage on disease transmission is of broad interest and is the focus of this study. A Susceptible-Exposed-Infectious-Recovered (SEIR) multi-patch epidemiological model capturing the regional differences in vaccination rates and mixing is introduced. The mathematical analysis of a two-patch system is carried out to inform our understanding of the behavior of multi-patch systems. Numerical simulations are generated to aid the study of the system's qualitative dynamics. Data from the recent French outbreaks were used to generate parameter values and to help connect theory with application. Our findings show that heterogeneous vaccination coverage increases the controlled reproduction number relative to comparable homogeneous coverage.
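For readers unfamiliar with multi-patch SEIR models, the sketch below shows the general shape of a two-patch simulation with different vaccination coverage in each patch. It is an illustration under stated assumptions, not the paper's model: it uses a generic SEIR system without births or deaths, forward-Euler time stepping, equal patch populations, and a hypothetical mixing matrix. Every parameter value is a placeholder rather than a value fitted to the French outbreak data.

```python
def simulate_seir_patches(coverage, mixing, beta=1.5, sigma=0.125, gamma=0.125,
                          population=1000.0, seed_infected=1.0,
                          days=200.0, dt=0.05):
    """Forward-Euler integration of a generic multi-patch SEIR model.

    coverage[i]  : fraction of patch i vaccinated at time 0 (placed in R)
    mixing[i][j] : weight of patch j's prevalence in patch i's force of infection
    All rates are per day and are illustrative placeholders.
    """
    n = len(coverage)
    S = [population * (1.0 - v) - seed_infected for v in coverage]
    E = [0.0] * n
    I = [seed_infected] * n
    R = [population * v for v in coverage]

    for _ in range(round(days / dt)):
        # Force of infection per patch, computed from the state at the start of the step.
        prevalence = [I[j] / population for j in range(n)]
        force = [beta * sum(mixing[i][j] * prevalence[j] for j in range(n))
                 for i in range(n)]
        for i in range(n):
            new_exposed = force[i] * S[i] * dt
            new_infectious = sigma * E[i] * dt
            new_recovered = gamma * I[i] * dt
            S[i] -= new_exposed
            E[i] += new_exposed - new_infectious
            I[i] += new_infectious - new_recovered
            R[i] += new_recovered
    return S, E, I, R


# Hypothetical scenario: 90% coverage in one patch, 70% in the other,
# with most contacts occurring within each patch.
S, E, I, R = simulate_seir_patches([0.9, 0.7], [[0.9, 0.1], [0.1, 0.9]])
```

Comparing runs with heterogeneous coverage such as [0.9, 0.7] against homogeneous coverage at the same mean, [0.8, 0.8], is one way to explore the kind of question the abstract raises, although this toy model makes no claim to reproduce the paper's findings.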