Source author record

Tom Kempton

Tom Kempton appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

math.DS, math.NT, math.CA, Artificial Intelligence, Computation and Language, Machine Learning

Catalog footprint

What is connected

11 works

6 topics

4 close collaborators

preprint, 2026, arXiv

Log-Likelihood, Simpson's Paradox, and the Detection of Machine-Generated Text

The ability to reliably distinguish human-written text from that generated by large language models is of profound societal importance. The dominant approach to this problem exploits the likelihood hypothesis: that machine-generated text should appear more probable to a detector language model than human-written text. However, we demonstrate that the token-level signal distinguishing human and machine text is non-uniform across the hidden space of the detector model, and naively averaging likelihood-based token scores across regions with fundamentally different statistical structure, as most detectors do, causes a form of Simpson's paradox: a strong local signal is destroyed by inappropriate aggregation. To correct for this, we introduce a learned local calibration step grounded in Bayesian decision theory. Rather than aggregating raw token scores, we first learn lightweight predictors of the score distributions conditioned on position in hidden space, and aggregate calibrated log-likelihood ratios instead. This single intervention dramatically and consistently improves detection performance across all baseline detectors and all datasets we consider. For example, our calibrated variant of Fast-DetectGPT improves AUROC from $0.63$ to $0.85$ on GPT-5.4 text, and a locally-calibrated DMAP detector we introduce achieves state-of-the-art performance across the board. That said, our central contribution is not a new detector, but a precise diagnosis of a significant cause of under-performance of existing detectors and a principled, modular remedy compatible with any token-averaging pipeline. This will serve as a foundation for the community to build upon, with natural avenues including richer distributional models, improved calibration strategies, and principled ensembling with hidden-space geometry signals via the full Bayes-optimal decision rule.
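The aggregation failure described in this abstract can be illustrated with a toy simulation. This is a minimal sketch under loud assumptions, not the paper's method: the two discrete "regions" stand in for positions in hidden space, the Gaussian score model and the means in `MEANS` are invented for illustration, and the per-region Gaussian fit replaces the learned lightweight predictors the abstract mentions. The human/machine score gap has opposite sign in the two regions, so a plain average of token scores carries almost no signal, while the average of per-region log-likelihood ratios separates the classes.

```python
import math
import random
from statistics import mean, pstdev

random.seed(0)

# Hypothetical score means: the sign of the human/machine gap flips by region.
MEANS = {("human", 0): 0.0, ("machine", 0): 1.0,
         ("human", 1): 1.0, ("machine", 1): 0.0}

def sample_doc(label, n_tokens=50):
    """Return a list of (region, score) pairs for one synthetic document."""
    doc = []
    for _ in range(n_tokens):
        region = random.randint(0, 1)
        doc.append((region, random.gauss(MEANS[(label, region)], 1.0)))
    return doc

def fit(docs_by_label):
    """Fit a Gaussian (mean, std) to the scores in each (label, region) cell."""
    params = {}
    for label, docs in docs_by_label.items():
        for region in (0, 1):
            scores = [s for d in docs for r, s in d if r == region]
            params[(label, region)] = (mean(scores), pstdev(scores))
    return params

def log_gauss(x, mu, sigma):
    """Log-density of a normal distribution."""
    return (-0.5 * ((x - mu) / sigma) ** 2
            - math.log(sigma) - 0.5 * math.log(2 * math.pi))

def naive_score(doc):
    """Average raw token scores, ignoring the region."""
    return mean([s for _, s in doc])

def calibrated_score(doc, params):
    """Average per-token log-likelihood ratios, conditioned on the region."""
    return mean([log_gauss(s, *params[("machine", r)])
                 - log_gauss(s, *params[("human", r)]) for r, s in doc])

def auroc(pos, neg):
    """Probability that a positive outranks a negative (ties count half)."""
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

labels = ("human", "machine")
train = {lab: [sample_doc(lab) for _ in range(200)] for lab in labels}
test = {lab: [sample_doc(lab) for _ in range(200)] for lab in labels}
params = fit(train)

naive_auc = auroc([naive_score(d) for d in test["machine"]],
                  [naive_score(d) for d in test["human"]])
calib_auc = auroc([calibrated_score(d, params) for d in test["machine"]],
                  [calibrated_score(d, params) for d in test["human"]])
print(f"naive AUROC: {naive_auc:.3f}  calibrated AUROC: {calib_auc:.3f}")
```

In this synthetic setup the naive AUROC should sit near chance and the calibrated one near 1. The numbers say nothing about real detectors, only about the mechanism.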

preprint, 2021, arXiv

Measures on the Spectra of Algebraic Integers

Given a real number beta > 1, the spectrum of beta is a well-studied dynamical object. In this article we show the existence of a certain measure on the spectrum of beta related to the distribution of random polynomials in beta, and discuss the local structure of this measure. We also make links with the question of the Hausdorff dimension of the corresponding Bernoulli convolution.

preprint, 2016, arXiv

Planar self-affine sets with equal Hausdorff, box and affinity dimensions

Using methods from ergodic theory along with properties of the Furstenberg measure we obtain conditions under which certain classes of plane self-affine sets have Hausdorff or box-counting dimensions equal to their affinity dimension. We exhibit some new specific classes of self-affine sets for which these dimensions are equal.

preprint, 2015, arXiv

Bernoulli Convolutions and 1D Dynamics

We describe a family $ϕ_λ$ of dynamical systems on the unit interval which preserve Bernoulli convolutions. We show that if there are parameter ranges for which these systems are piecewise convex, then the corresponding Bernoulli convolution will be absolutely continuous with bounded density. We study the systems $ϕ_λ$ and give some numerical evidence to suggest values of $λ$ for which $ϕ_λ$ may be piecewise convex.

preprint, 2015, arXiv

The dimension of projections of self-affine sets and measures

Let E be a plane self-affine set defined by affine transformations with linear parts given by matrices with positive entries. We show that if mu is a Bernoulli measure on E with dim_H mu = dim_L mu, where dim_H and dim_L denote Hausdorff and Lyapunov dimensions, then the projection of mu in all but at most one direction has Hausdorff dimension min{dim_H mu,1}. We transfer this result to sets and show that many self-affine sets have projections of dimension min{dim_H E,1} in all but at most one direction.

preprint, 2015, arXiv

The random continued fraction transformation

We introduce a random dynamical system related to continued fraction expansions. It uses random combination of the Gauss map and the Rényi (or backwards) continued fraction map. We explore the continued fraction expansions that this system produces as well as the dynamical properties of the system.

preprint, 2015, arXiv

The Scenery Flow for Self-Affine Measures

We describe the scaling scenery associated to Bernoulli measures supported on separated self-affine sets under the condition that certain projections of the measure are absolutely continuous.

preprint, 2013, arXiv

On the Invariant Density of the Random Beta-Transformation

We construct a Lebesgue measure preserving natural extension of the random beta-transformation. This allows us to give a formula for the density of the absolutely continuous invariant probability measure, answering a question of Dajani and de Vries, and also to evaluate some estimates on the typical branching rate of the set of beta-expansions of a real number.
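For readers unfamiliar with the random beta-transformation, the following minimal sketch uses the standard two-digit formulation, assumed here rather than taken from the paper: for beta in (1, 2] and x in [0, 1/(beta-1)], the map sends x to beta*x - d, where the digit d is forced to be 0 to the left of the switch region [1/beta, 1/(beta(beta-1))], forced to be 1 to its right, and chosen by a coin flip inside it. The function names and the floating-point treatment are illustrative choices.

```python
import random

def random_beta_digits(x, beta, n, rng):
    """Return n digits in {0, 1} from one random orbit of x."""
    upper = 1.0 / (beta - 1.0)          # right endpoint of the domain
    lo, hi = 1.0 / beta, upper / beta   # endpoints of the switch region
    digits = []
    for _ in range(n):
        if x < lo:
            d = 0                       # only digit 0 keeps x in the domain
        elif x > hi:
            d = 1                       # only digit 1 keeps x in the domain
        else:
            d = rng.randint(0, 1)       # both digits allowed: flip a coin
        digits.append(d)
        x = beta * x - d
    return digits

def value(digits, beta):
    """Evaluate the partial sum of d_i * beta^(-i) for i = 1..n."""
    return sum(d * beta ** -(i + 1) for i, d in enumerate(digits))

beta = (1 + 5 ** 0.5) / 2               # golden mean
x = 0.7
digits = random_beta_digits(x, beta, 30, random.Random(1))
print(digits, x - value(digits, beta))
```

Every orbit yields a valid beta-expansion of x, so the partial sum approaches x from below with error at most beta^(-n)/(beta-1). Always choosing 1 in the switch region gives the greedy expansion, and always choosing 0 gives the lazy one.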

preprint, 2013, arXiv

Sets of beta-expansions and the Hausdorff Measure of Slices through Fractals

We study natural measures on sets of beta-expansions and on slices through self-similar sets. In the setting of beta-expansions, these allow us to better understand the measure of maximal entropy for the random beta-transformation and to reinterpret a result of Lindenstrauss, Peres and Schlag in terms of equidistribution. Each of these applications is relevant to the study of Bernoulli convolutions. In the fractal setting this allows us to understand how to disintegrate Hausdorff measure by slicing, leading to conditions under which almost every slice through a self-similar set has positive Hausdorff measure, generalising long-known results about almost everywhere values of the Hausdorff dimension.

preprint, 2012, arXiv

Counting Beta Expansions and the Absolute Continuity of Bernoulli Convolutions

We study the typical growth rate of the number of words of length n which can be extended to beta-expansions of x. In the general case we give a lower bound for the growth rate, while in the case that the Bernoulli convolution associated to parameter beta is absolutely continuous we are able to give the growth rate precisely. This gives new necessary and sufficient conditions for the absolute continuity of Bernoulli convolutions.
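The counting quantity in this abstract can be computed by brute force for small n. The sketch below assumes digits in {0, 1} and beta in (1, 2], and uses the standard criterion that a word d_1...d_n extends to a beta-expansion of x exactly when the rescaled remainder beta^n (x - sum of d_i beta^(-i)) lies in [0, 1/(beta-1)]. The function name `count_prefixes` and the tolerance `tol` are hypothetical, and floating-point arithmetic makes this a numerical illustration, not an exact computation.

```python
def count_prefixes(x, beta, n, tol=1e-12):
    """Count words of length n that extend to a beta-expansion of x."""
    upper = 1.0 / (beta - 1.0)
    if x < -tol or x > upper + tol:
        return 0                        # remainder left the domain: dead branch
    if n == 0:
        return 1                        # a full admissible word of length n
    # Try both digits; each shifts the remainder by x -> beta * x - d.
    return (count_prefixes(beta * x, beta, n - 1, tol)
            + count_prefixes(beta * x - 1.0, beta, n - 1, tol))

phi = (1 + 5 ** 0.5) / 2
for n in (2, 4, 8, 12):
    print(n, count_prefixes(0.7, phi, n))
```

The growth rate studied in the abstract corresponds to the behaviour of (1/n) log of this count as n grows. The recursion takes time proportional to the number of admissible words, so it is only practical for modest n.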

preprint, 2012, arXiv

Digit Frequencies and Bernoulli Convolutions

It is well known that the Bernoulli convolution $ν_β$ associated to the golden mean has Hausdorff dimension less than 1, i.e. that there exists a set $A$ with $ν_β(A)=1$ and $\dim_H(A)<1$. We construct such a set $A$ explicitly and discuss how our approach might be generalised to prove the singularity of other Bernoulli convolutions.
