Researcher profile

Christopher Kelly

Christopher Kelly contributes to research discovery and scholarly infrastructure.

Researcher
Affiliation not imported
Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
7 works
0 followers
11 topics
4 close collaborators

Published work

7 published items

preprint · 2026 · arXiv

Principles and Guidelines for Randomized Controlled Trials in AI Evaluation

This work establishes a foundational framework for standardizing AI evaluation RCTs (sometimes called human uplift studies). Drawing on experimental practices from disciplines with established RCT traditions, including software engineering, economics, clinical and health sciences, and psychology, we adopt the four-validity framework of Shadish et al. (2002) and extend it with a fifth principle on transparency, repeatability, and verification, adapted from the Transparency and Openness Promotion (TOP) Guidelines (Center for Open Science, 2025). We operationalize all five principles into 33 guidelines adapted for AI evaluation RCT contexts, expressed as requirements with rationales, implementation instructions, and evidence bases. We position the principles and guidelines as serving three key roles for AI evaluation RCTs: a design tool for planning studies, an evaluation rubric for assessing existing work, and a blueprint for standard setting as the field converges on norms. Our framework extends prior work by centering evaluation on human performance rather than model output alone, formalizing causal inference through RCT methodology for AI contexts, integrating heterogeneity analysis and practical significance assessment, implementing a graded transparency and repeatability framework, and addressing AI-specific challenges including model versioning, human-AI interaction dynamics, contamination and spillover effects, and equitable impact assessment.

preprint · 2022 · arXiv

Algorithms for Domain Wall Fermions

We discuss algorithms for domain wall fermions, focussing on accelerating Hybrid Monte Carlo sampling of gauge configurations. Two are covered: firstly, a new multigrid algorithm for domain wall solvers and, secondly, a domain-decomposed Hybrid Monte Carlo approach applied to large subvolumes and optimised for GPU-accelerated nodes. We propose a formulation of DD-RHMC that is suitable for the simulation of odd numbers of fermions.

preprint · 2022 · arXiv

Discovering new physics in rare kaon decays

The decays and mixing of $K$ mesons are remarkably sensitive to the weak interactions of quarks and leptons at high energies. They provide important tests of the standard model at both first and second order in the Fermi constant $G_F$ and offer a window into possible new phenomena at energies as high as 1,000 TeV. These possibilities become even more compelling as the growing capabilities of lattice QCD make high-precision standard model predictions possible. Here we discuss and attempt to forecast some of these capabilities.

preprint · 2022 · arXiv

Lattice QCD and the Computational Frontier

The search for new physics requires a joint experimental and theoretical effort. Lattice QCD is already an essential tool for obtaining precise model-free theoretical predictions of the hadronic processes underlying many key experimental searches, such as those involving heavy flavor physics, the anomalous magnetic moment of the muon, nucleon-neutrino scattering, and rare, second-order electroweak processes. As experimental measurements become more precise over the next decade, lattice QCD will play an increasing role in providing the needed matching theoretical precision. Achieving the needed precision requires simulations with lattices with substantially increased resolution. As we push to finer lattice spacing we encounter an array of new challenges. They include algorithmic and software-engineering challenges, challenges in computer technology and design, and challenges in maintaining the necessary human resources. In this white paper we describe those challenges and discuss ways they are being dealt with. Overcoming them is key to supporting the community effort required to deliver the needed theoretical support for experiments in the coming decade.

preprint · 2021 · arXiv

Deep learning to achieve clinically applicable segmentation of head and neck anatomy for radiotherapy

Over half a million individuals are diagnosed with head and neck cancer each year worldwide. Radiotherapy is an important curative treatment for this disease, but it requires manual, time-consuming delineation of radio-sensitive organs at risk (OARs). This planning process can delay treatment, while also introducing inter-operator variability with resulting downstream radiation dose differences. While auto-segmentation algorithms offer a potentially time-saving solution, the challenges in defining, quantifying and achieving expert performance remain. Adopting a deep learning approach, we demonstrate a 3D U-Net architecture that achieves expert-level performance in delineating 21 distinct head and neck OARs commonly segmented in clinical practice. The model was trained on a dataset of 663 deidentified computed tomography (CT) scans acquired in routine clinical practice, with both segmentations taken from clinical practice and segmentations created by experienced radiographers as part of this research, all in accordance with consensus OAR definitions. We demonstrate the model's clinical applicability by assessing its performance on a test set of 21 CT scans from clinical practice, each with the 21 OARs segmented by two independent experts. We also introduce surface Dice similarity coefficient (surface DSC), a new metric for the comparison of organ delineation, to quantify deviation between OAR surface contours rather than volumes, better reflecting the clinical task of correcting errors in the automated organ segmentations. The model's generalisability is then demonstrated on two distinct open source datasets, reflecting different centres and countries from those used in model training. With appropriate validation studies and regulatory approvals, this system could improve the efficiency, consistency, and safety of radiotherapy pathways.

preprint · 2020 · arXiv

Testing And Hardening IoT Devices Against the Mirai Botnet

A large majority of cheap Internet of Things (IoT) devices that arrive brand new, configured with out-of-the-box settings, are not properly secured by their manufacturers and are vulnerable to existing malware lurking on the Internet. Among this malware is the Mirai botnet, whose source code has been leaked to the world, allowing any malicious actor to configure and unleash it. A combination of software assets not being utilised safely and effectively is exposing consumers to a full compromise. We configured and attacked four different IoT devices using the Mirai libraries. Our experiments concluded that three of the four devices were vulnerable to the Mirai malware and became infected when deployed using their default configuration. This demonstrates that the original security configurations are not sufficient to provide acceptable levels of protection for consumers, leaving their devices exposed and vulnerable. By analysing the Mirai libraries and their attack vectors, we were able to determine appropriate device configuration countermeasures to harden the devices against this botnet, which were successfully validated through experimentation.

preprint · 2019 · arXiv

Lattice simulations with G-parity Boundary Conditions

We discuss G-parity lattice boundary conditions as a means to impose momentum on the pion ground state without breaking isospin symmetry. This technique is expected to be critical for the precision measurement of $K\rightarrow(\pi\pi)_{I=0}$ matrix elements where physical kinematics demands moving pions in the final state and the statistical noise caused by disconnected contributions will make it difficult to use multi-exponential fits to isolate this as an excited state. We present a formalism for computing hadronic Green's functions with G-parity boundary conditions, derive the discretized action and its symmetries, discuss how the strange quark can be introduced and detail techniques for the numerical implementation of these boundary conditions. We demonstrate and test these methods using several $16^3\times 32$ dynamical domain wall ensembles with a $420$ MeV pion mass and G-parity boundary conditions in one and two spatial directions.