Researcher profile

Elle Najt

Elle Najt contributes to research discovery and scholarly infrastructure.

Researcher
Affiliation not imported
Open to collaborate

Trust snapshot

Quick read

Trust 19 - Unverified
Verification L1
Unclaimed author
5 works
0 followers
9 topics
4 close collaborators

Published work

5 published items

preprint, 2026, arXiv

SLEIGHT-Bench: A Benchmark of Evasion Attacks Against Agent Monitors

Since autonomous coding agents generate complex behaviors at high volume, we may want to use other LLMs to monitor actions to reduce the risk from dangerous misaligned behavior. To better understand the limitations of such monitors against the diverse attack strategies that a coding agent could use, we present SLEIGHT-Bench (Subtle Low-itEration Insight-Guided Harmful Transcripts), a benchmark of synthetic transcripts containing 40 attacks across 11 categories, each showing a coding agent covertly pursuing a harmful objective (e.g. weight exfiltration, credential theft, unmonitored rogue deployment). We find that 20 of 40 attacks are never caught by an Opus 4.6 monitor with extended thinking across 10 trials at a 1% false-positive rate, while the overall catch rate is 32%. We identify several evasion strategies such as exploiting model priors, user instruction ambiguity, and state manipulation, which facilitate creating attacks that evade frontier monitors. We also elicit stronger monitor performance using coding agents as monitors versus regular prompted monitors, and for some evasion strategies show improved catch rates with targeted monitor prompts. Our dataset and evaluation framework are available at https://github.com/safety-research/sleight-bench and https://huggingface.co/datasets/sleightbench/SLEIGHT-Bench.
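The abstract reports two different statistics: the number of attacks never caught across trials, and the overall catch rate at a fixed false-positive rate. The sketch below shows one way these could be computed from monitor suspicion scores. It is not the benchmark's evaluation code: the calibration rule, the function names `threshold_at_fpr` and `summarize`, and all scores are assumptions made up for illustration.

```python
def threshold_at_fpr(benign_scores, max_fpr):
    """Return the smallest observed score t such that the fraction of
    benign scores strictly above t is at most max_fpr (assumed rule)."""
    for t in sorted(benign_scores):
        fpr = sum(s > t for s in benign_scores) / len(benign_scores)
        if fpr <= max_fpr:
            return t


def summarize(attack_trials, threshold):
    """Return (overall catch rate, names of attacks never caught).

    attack_trials maps an attack name to its per-trial suspicion scores;
    a trial counts as caught when its score exceeds the threshold.
    """
    caught = {name: [s > threshold for s in scores]
              for name, scores in attack_trials.items()}
    total_trials = sum(len(flags) for flags in caught.values())
    overall = sum(sum(flags) for flags in caught.values()) / total_trials
    never_caught = [name for name, flags in caught.items() if not any(flags)]
    return overall, never_caught


# Toy, invented inputs: 100 benign scores and two hypothetical attacks.
benign = list(range(100))
attacks = {
    "attack_a": [99, 50, 99, 10],   # caught in some trials
    "attack_b": [20, 30, 40, 50],   # never exceeds the threshold
}

threshold = threshold_at_fpr(benign, max_fpr=0.01)
overall_rate, never = summarize(attacks, threshold)
```

In this toy setup an attack can contribute to the overall catch rate while another is missed in every trial, which is why the two figures in the abstract need not move together.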

preprint, 2022, arXiv

The Legend of Zelda: The Complexity of Mechanics

We analyze some of the many game mechanics available to Link in the classic Legend of Zelda series of video games. In each case, we prove that the generalized game with that mechanic is polynomial, NP-complete, NP-hard and in PSPACE, or PSPACE-complete. In the process we give an overview of many of the hardness proof techniques developed for video games over the past decade: the motion-planning-through-gadgets framework, the planar doors framework, the doors-and-buttons framework, the "Nintendo" platform game / SAT framework, and the collectible tokens and toll roads / Hamiltonicity framework.

preprint, 2020, arXiv

Empirical Sampling of Connected Graph Partitions for Redistricting

The space of connected graph partitions underlies statistical models used as evidence in court cases and reform efforts that analyze political districting plans. In response to the demands of redistricting applications, researchers have developed sampling methods that traverse this space, building on techniques developed for statistical physics. In this paper, we study connections between redistricting and statistical physics, and in particular with self-avoiding walks. We exploit knowledge of phase transitions and asymptotic behavior in self-avoiding walks to analyze two questions of crucial importance for Markov Chain Monte Carlo analysis of districting plans. First, we examine mixing times of a popular Glauber-dynamics-based Markov chain and show how the self-avoiding walk phase transitions interact with mixing time. We examine factors new to the redistricting context that complicate the picture, notably the population balance requirements, connectivity requirements, and the irregular graphs used. Second, we analyze the robustness of the qualitative properties of typical districting plans with respect to score functions and a certain lattice-like graph, called the state-dual graph, that is used as a discretization of geographic regions in most districting analysis. This helps us better understand the complex relationship between typical properties of districting plans and the score functions designed by political districting analysts. We conclude with directions for research at the interface of statistical physics, Markov chains, and political districting.

preprint, 2019, arXiv

Complexity and Geometry of Sampling Connected Graph Partitions

In this paper, we prove intractability results about sampling from the set of partitions of a planar graph into connected components. Our proofs are motivated by a technique introduced by Jerrum, Valiant, and Vazirani. Moreover, we use gadgets inspired by their technique to provide families of graphs where the "flip walk" Markov chain used in practice for this sampling task exhibits exponentially slow mixing. Supporting our theoretical results, we present some empirical evidence demonstrating the slow mixing of the flip walk on grid graphs and on real data. Inspired by connections to the statistical physics of self-avoiding walks, we investigate the sensitivity of certain popular sampling algorithms to the graph topology. Finally, we discuss a few cases where the sampling problem is tractable. Applications to political redistricting have recently brought increased attention to this problem, and we articulate open questions about this application that are highlighted by our results.
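The abstract names the flip walk without defining it. A minimal sketch of one common formulation on a small grid graph follows: propose reassigning a single uniformly chosen node to the other part, and accept only if both parts remain connected. This is an assumed simplification with two parts and no population-balance constraint, not the authors' implementation; `grid_neighbors`, `is_connected`, and `flip_step` are hypothetical names.

```python
import random
from collections import deque


def grid_neighbors(node, n):
    """Yield the 4-neighbors of `node` inside an n-by-n grid."""
    r, c = node
    for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        if 0 <= r + dr < n and 0 <= c + dc < n:
            yield (r + dr, c + dc)


def is_connected(nodes, n):
    """BFS check that `nodes` is non-empty and induces a connected subgraph."""
    if not nodes:
        return False
    start = next(iter(nodes))
    seen = {start}
    queue = deque([start])
    while queue:
        for nb in grid_neighbors(queue.popleft(), n):
            if nb in nodes and nb not in seen:
                seen.add(nb)
                queue.append(nb)
    return len(seen) == len(nodes)


def flip_step(assignment, n, rng):
    """Propose flipping one node; keep the old state if a part disconnects."""
    node = rng.choice(sorted(assignment))
    proposal = dict(assignment)
    proposal[node] = 1 - assignment[node]
    parts = [{v for v, d in proposal.items() if d == k} for k in (0, 1)]
    if all(is_connected(part, n) for part in parts):
        return proposal
    return assignment


n = 4
rng = random.Random(0)  # fixed seed so the run is repeatable
# Start from a left/right split of the grid into two connected parts.
state = {(r, c): 0 if c < n // 2 else 1 for r in range(n) for c in range(n)}
for _ in range(200):
    state = flip_step(state, n, rng)
```

Because each accepted move changes one node, the chain explores the space through small local steps, which gives some intuition for why mixing can be slow on graphs with bottlenecks.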

preprint, 2019, arXiv

The Gerrymandering Jumble: Map Projections Permute Districts' Compactness Scores

In political redistricting, the compactness of a district is used as a quantitative proxy for its fairness. Several well-established, yet competing, notions of geographic compactness are commonly used to evaluate the shapes of regions, including the Polsby-Popper score, the convex hull score, and the Reock score, and these scores are used to compare two or more districts or plans. In this paper, we prove mathematically that any map projection from the sphere to the plane reverses the ordering of the scores of some pair of regions for all three of these scores. Empirically, we demonstrate that the effect of using the Cartesian latitude-longitude projection on the order of Reock scores is quite dramatic.
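As a small illustration of how a distortion can reorder compactness scores, the sketch below computes the Polsby-Popper score, 4πA/P² for a region of area A and perimeter P, for two rectangles before and after stretching the x-axis. The planar stretch is only a toy stand-in for projection distortion and is an assumption of this example; the paper's result concerns projections from the sphere to the plane. The helper names are hypothetical.

```python
import math


def area(poly):
    """Shoelace formula for a simple polygon given as (x, y) vertices."""
    pairs = zip(poly, poly[1:] + poly[:1])
    return abs(sum(x1 * y2 - x2 * y1 for (x1, y1), (x2, y2) in pairs)) / 2


def perimeter(poly):
    """Sum of edge lengths, closing the polygon back to its first vertex."""
    return sum(math.dist(p, q) for p, q in zip(poly, poly[1:] + poly[:1]))


def polsby_popper(poly):
    """Polsby-Popper score: 4 * pi * area / perimeter**2 (1.0 for a disk)."""
    return 4 * math.pi * area(poly) / perimeter(poly) ** 2


def rectangle(w, h):
    return [(0, 0), (w, 0), (w, h), (0, h)]


def stretch_x(poly, factor):
    """Scale x-coordinates only, a toy model of anisotropic distortion."""
    return [(x * factor, y) for x, y in poly]


wide = rectangle(2, 1)
tall = rectangle(1, 3)

wide_wins_before = polsby_popper(wide) > polsby_popper(tall)
wide_wins_after = (polsby_popper(stretch_x(wide, 3))
                   > polsby_popper(stretch_x(tall, 3)))
```

Here the 2-by-1 rectangle scores higher than the 1-by-3 rectangle before the stretch, and lower after it, since the stretch turns them into a 6-by-1 rectangle and a 3-by-3 square.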