Researcher profile

David Hall

David Hall contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 21 - EmergingVerification L1Unclaimed author
11works
0followers
11topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

11 published item(s)

preprint2026arXiv

Relative Scaling Laws for LLMs

Scaling laws describe how language models improve with additional data, parameters, and compute. While widely used, they are typically measured on aggregate test sets. Aggregate evaluations yield clean trends but average over heterogeneous subpopulations, obscuring performance disparities. We introduce relative scaling laws, which track how performance gaps between test distributions evolve with scale rather than focusing solely on absolute error. Using 255 decoder-only Transformers trained under matched-compute (IsoFLOP) budgets from $10^{18}$--$10^{20}$ FLOPs on standard pretraining datasets, we find diverse trajectories: academic domains on MMLU converge toward parity; regional English dialects shift depending on population size; and clusters of AI risk behaviours split, with capability- and influence-related risks increasing during pretraining while adversarial risks do not. These results show that although scaling improves overall performance, it is not a universal equalizer. To support further study, we release all model checkpoints from this work to enable practitioners to measure relative alongside traditional scaling laws, in order to better prioritize robustness challenges in light of the bitter lesson.

preprint2022arXiv

FourCastNet: A Global Data-driven High-resolution Weather Model using Adaptive Fourier Neural Operators

FourCastNet, short for Fourier Forecasting Neural Network, is a global data-driven weather forecasting model that provides accurate short to medium-range global predictions at $0.25^{\circ}$ resolution. FourCastNet accurately forecasts high-resolution, fast-timescale variables such as the surface wind speed, precipitation, and atmospheric water vapor. It has important implications for planning wind energy resources, predicting extreme weather events such as tropical cyclones, extra-tropical cyclones, and atmospheric rivers. FourCastNet matches the forecasting accuracy of the ECMWF Integrated Forecasting System (IFS), a state-of-the-art Numerical Weather Prediction (NWP) model, at short lead times for large-scale variables, while outperforming IFS for variables with complex fine-scale structure, including precipitation. FourCastNet generates a week-long forecast in less than 2 seconds, orders of magnitude faster than IFS. The speed of FourCastNet enables the creation of rapid and inexpensive large-ensemble forecasts with thousands of ensemble-members for improving probabilistic forecasting. We discuss how data-driven deep learning models such as FourCastNet are a valuable addition to the meteorology toolkit to aid and augment NWP models.

preprint2022arXiv

FourCastNet: Accelerating Global High-Resolution Weather Forecasting using Adaptive Fourier Neural Operators

Extreme weather amplified by climate change is causing increasingly devastating impacts across the globe. The current use of physics-based numerical weather prediction (NWP) limits accuracy due to high computational cost and strict time-to-solution limits. We report that a data-driven deep learning Earth system emulator, FourCastNet, can predict global weather and generate medium-range forecasts five orders-of-magnitude faster than NWP while approaching state-of-the-art accuracy. FourCast-Net is optimized and scales efficiently on three supercomputing systems: Selene, Perlmutter, and JUWELS Booster up to 3,808 NVIDIA A100 GPUs, attaining 140.8 petaFLOPS in mixed precision (11.9%of peak at that scale). The time-to-solution for training FourCastNet measured on JUWELS Booster on 3,072GPUs is 67.4minutes, resulting in an 80,000times faster time-to-solution relative to state-of-the-art NWP, in inference. FourCastNet produces accurate instantaneous weather predictions for a week in advance, enables enormous ensembles that better capture weather extremes, and supports higher global forecast resolutions.

preprint2022arXiv

Mitigating the effects of particle background on the Athena Wide-Field Imager

The Wide Field Imager (WFI) flying on Athena will usher in the next era of studying the hot and energetic Universe. WFI observations of faint, diffuse sources will be limited by uncertainty in the background produced by high-energy particles. These particles produce easily identified "cosmic-ray tracks" along with signals from secondary photons and electrons generated by particle interactions with the instrument. The signal from these secondaries is identical to the X-rays focused by the optics, and cannot be filtered without also eliminating these precious photons. As part of a larger effort to understand the WFI background, we here present results from a study of background-reduction techniques that exploit the spatial correlation between cosmic-ray particle tracks and secondary events. We use Geant4 simulations to generate a realistic particle background, sort this into simulated WFI frames, and process those frames in a similar way to the expected flight and ground software to produce a WFI observation containing only particle background. The technique under study, Self Anti-Coincidence or SAC, then selectively filters regions of the detector around particle tracks, turning the WFI into its own anti-coincidence detector. We show that SAC is effective at improving the systematic uncertainty for observations of faint, diffuse sources, but at the cost of statistical uncertainty due to a reduction in signal. If sufficient pixel pulse-height information is telemetered to the ground for each frame, then this technique can be applied selectively based on the science goals, providing flexibility without affecting the data quality for other science. The results presented here are relevant for any future silicon-based pixelated X-ray imaging detector, and could allow the WFI and similar instruments to probe to truly faint X-ray surface brightness.

preprint2021arXiv

Task-Oriented Dialogue as Dataflow Synthesis

We describe an approach to task-oriented dialogue in which dialogue state is represented as a dataflow graph. A dialogue agent maps each user utterance to a program that extends this graph. Programs include metacomputation operators for reference and revision that reuse dataflow fragments from previous turns. Our graph-based state enables the expression and manipulation of complex user intents, and explicit metacomputation makes these intents easier for learned models to predict. We introduce a new dataset, SMCalFlow, featuring complex dialogues about events, weather, places, and people. Experiments show that dataflow graphs and metacomputation substantially improve representability and predictability in these natural dialogues. Additional experiments on the MultiWOZ dataset show that our dataflow representation enables an otherwise off-the-shelf sequence-to-sequence model to match the best existing task-specific state tracking model. The SMCalFlow dataset and code for replicating experiments are available at https://www.microsoft.com/en-us/research/project/dataflow-based-dialogue-semantic-machines.

preprint2020arXiv

BenchBot: Evaluating Robotics Research in Photorealistic 3D Simulation and on Real Robots

We introduce BenchBot, a novel software suite for benchmarking the performance of robotics research across both photorealistic 3D simulations and real robot platforms. BenchBot provides a simple interface to the sensorimotor capabilities of a robot when solving robotics research problems; an interface that is consistent regardless of whether the target platform is simulated or a real robot. In this paper we outline the BenchBot system architecture, and explore the parallels between its user-centric design and an ideal research development process devoid of tangential robot engineering challenges. The paper describes the research benefits of using the BenchBot system, including: enhanced capacity to focus solely on research problems, direct quantitative feedback to inform research development, tools for deriving comprehensive performance characteristics, and submission formats which promote sharability and repeatability of research outcomes. BenchBot is publicly available (http://benchbot.org), and we encourage its use in the research community for comprehensively evaluating the simulated and real world performance of novel robotic algorithms.

preprint2020arXiv

Characterisation of the Particle-Induced Background of XMM-Newton EPIC-pn: Short and Long Term Variability

The particle-induced background of X-ray observatories is produced by Galactic Cosmic Ray (GCR) primary protons, electrons, and He ions. Events due to direct interaction with the detector are usually removed by on board processing. The interactions of these primary particles with the detector environment produce secondary particles that mimic X-ray events from celestial sources and are much more difficult to identify. The filter wheel closed data from the XMM-Newton EPIC-pn camera in small window mode (SWM) contains both the X-ray-like background events and the events due to direct interactions with the primary particles. From this data we demonstrate that X-ray-like background events are spatially correlated with the primary particle interaction. This result can be used to further characterise and reduce the non-X-ray background in silicon-based X-ray detectors in current and future missions. We also show that spectrum and pattern fractions of secondary particle events are different from those produced by cosmic X-rays.

preprint2020arXiv

Probabilistic Object Detection: Definition and Evaluation

We introduce Probabilistic Object Detection, the task of detecting objects in images and accurately quantifying the spatial and semantic uncertainties of the detections. Given the lack of methods capable of assessing such probabilistic object detections, we present the new Probability-based Detection Quality measure (PDQ).Unlike AP-based measures, PDQ has no arbitrary thresholds and rewards spatial and label quality, and foreground/background separation quality while explicitly penalising false positive and false negative detections. We contrast PDQ with existing mAP and moLRP measures by evaluating state-of-the-art detectors and a Bayesian object detector based on Monte Carlo Dropout. Our experiments indicate that conventional object detectors tend to be spatially overconfident and thus perform poorly on the task of probabilistic object detection. Our paper aims to encourage the development of new object detection approaches that provide detections with accurately estimated spatial and label uncertainties and are of critical importance for deployment on robots and embodied AI systems in the real world.

preprint2020arXiv

The Robotic Vision Scene Understanding Challenge

Being able to explore an environment and understand the location and type of all objects therein is important for indoor robotic platforms that must interact closely with humans. However, it is difficult to evaluate progress in this area due to a lack of standardized testing which is limited due to the need for active robot agency and perfect object ground-truth. To help provide a standard for testing scene understanding systems, we present a new robot vision scene understanding challenge using simulation to enable repeatable experiments with active robot agency. We provide two challenging task types, three difficulty levels, five simulated environments and a new evaluation measure for evaluating 3D cuboid object maps. Our aim is to drive state-of-the-art research in scene understanding through enabling evaluation and comparison of active robotic vision systems.

preprint2020arXiv

What can robotics research learn from computer vision research?

The computer vision and robotics research communities are each strong. However progress in computer vision has become turbo-charged in recent years due to big data, GPU computing, novel learning algorithms and a very effective research methodology. By comparison, progress in robotics seems slower. It is true that robotics came later to exploring the potential of learning -- the advantages over the well-established body of knowledge in dynamics, kinematics, planning and control is still being debated, although reinforcement learning seems to offer real potential. However, the rapid development of computer vision compared to robotics cannot be only attributed to the former's adoption of deep learning. In this paper, we argue that the gains in computer vision are due to research methodology -- evaluation under strict constraints versus experiments; bold numbers versus videos.

preprint2019arXiv

An energy consistent discretization of the nonhydrostatic equations in primitive variables

We derive a formulation of the nonhydrostatic equations in spherical geometry with a Lorenz staggered vertical discretization. The combination conserves a discrete energy in exact time integration when coupled with a mimetic horizontal discretization. The formulation is a version of Dubos and Tort (2014) rewritten in terms of primitive variables. It is valid for terrain following mass or height coordinates and for both Eulerian or vertically Lagrangian discretizations. The discretization relies on an extension to Simmons and Burridge (1981) vertical differencing which we show obeys a discrete derivative product rule. This product rule allows us to simplify the treatment of the vertical transport terms. Energy conservation is obtained via a term-by-term balance in the kinetic, internal and potential energy budgets, ensuring an energy-consistent discretization with no spurious sources of energy. We demonstrate convergence with respect to time truncation error in a spectral element code with a HEVI IMEX timestepping algorithm