Researcher profile

Michael Hahn

Michael Hahn contributes to research discovery and scholarly infrastructure.

Researcher

Affiliation not imported

Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
8 works
0 followers
8 topics
4 close collaborators

Published work

8 published items

preprint, 2026, arXiv

Barriers to Universal Reasoning With Transformers (And How to Overcome Them)

Chain-of-Thought (CoT) has been shown to empirically improve Transformers' performance, and theoretically increase their expressivity to Turing completeness. However, whether Transformers can learn to generalize to CoT traces longer than those seen during training is understudied. We use recent theoretical frameworks for Transformer length generalization and find that -- under standard positional encodings and a finite alphabet -- Transformers with CoT cannot solve problems beyond $TC^0$, i.e. the expressivity benefits do not hold under the stricter requirement of length-generalizable learnability. However, if we allow the vocabulary to grow with problem size, we attain a length-generalizable simulation of Turing machines where the CoT trace length is linear in the simulated runtime up to a constant. Our construction overcomes two core obstacles to reliable length generalization: repeated copying and last-occurrence retrieval. We assign each tape position a unique signpost token, and log only value changes to enable recovery of the current tape symbol through counts, circumventing both barriers. Further, we empirically show that the use of such signpost tokens and value change encodings provides actionable guidance to improve length generalization on hard problems.
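The signpost idea in the abstract above can be illustrated with a toy sketch. It assumes a binary tape whose cells all start at 0, so that the current symbol at a position is the parity of the number of logged changes for that position's signpost. The abstract does not spell out the encoding, so the parity rule, the token format `P<position>`, and the helper names `log_write` and `read_bit` are illustrative assumptions, not the authors' construction.

```python
def log_write(trace, tape, position, new_bit):
    """Write new_bit to the tape; log a signpost token only if the value changes."""
    old_bit = tape.get(position, 0)  # assumption: every cell starts at 0
    if old_bit != new_bit:
        trace.append(f"P{position}")  # hypothetical signpost token for this cell
    tape[position] = new_bit


def read_bit(trace, position):
    """Recover the current bit by counting logged changes at this signpost."""
    return trace.count(f"P{position}") % 2


trace = []  # the change log (stands in for the CoT trace)
tape = {}   # reference tape, kept only to check the recovered values

for position, bit in [(0, 1), (1, 1), (0, 0), (2, 1), (0, 1), (1, 1)]:
    log_write(trace, tape, position, bit)

# The count-based read agrees with the reference tape at every position.
recovered = {p: read_bit(trace, p) for p in range(4)}
```

Reading a cell here needs only a count over the trace, with no copying of earlier tape contents and no search for the most recent write, which is the property the abstract attributes to the construction.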

preprint, 2026, arXiv

How Few-Shot Examples Add Up: A Causal Decomposition of Function Vectors in In-Context Learning

In-context learning (ICL) excels at new tasks from minimal examples, yet we still lack a mechanistic explanation of how few-shot prompts shape a model's function vector (FV)--a causal activation direction that drives task behavior on the ICL query. Across tasks and models, an $n$-shot FV is well-approximated by a linear combination of example-level sub-FVs, suggesting additive and composable contributions from individual demonstrations. Beyond additivity, we show that models contextualize individual examples' representations based on prior examples to adaptively reweight which demonstrations dominate the FV: attention shifts toward examples that are more informative and less ambiguous under the context. Finally, a causal decomposition separates Query-Key routing from Value updates, finding that contextualization's most consistent contributions to FV quality arise from Query-Key alignment--particularly in ambiguous settings--while Value-mediated effects are more heterogeneous. Together, these results unify additive superposition with context-dependent attention reweighting into a mechanistic, testable account of how few-shot prompts implement tasks.
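As a rough illustration of what "well-approximated by a linear combination of example-level sub-FVs" means, the sketch below fits least-squares weights for a target vector over a small set of sub-vectors. All vectors are synthetic random data, and the dimensions, weights, noise level, and function names are assumptions made for the example. It does not reproduce the paper's extraction of function vectors from model activations.

```python
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_weights(sub_fvs, target):
    """Least-squares weights w minimizing ||target - sum_i w_i * sub_fvs[i]||."""
    gram = [[dot(u, v) for v in sub_fvs] for u in sub_fvs]
    rhs = [dot(u, target) for u in sub_fvs]
    return solve_linear_system(gram, rhs)  # normal equations


rng = random.Random(0)
dim = 16
true_weights = [0.5, 0.3, 0.2]  # synthetic mixing weights
sub_fvs = [[rng.gauss(0, 1) for _ in range(dim)] for _ in true_weights]

# Synthetic "n-shot FV": a weighted sum of the sub-vectors plus small noise.
target = [
    sum(w * v[k] for w, v in zip(true_weights, sub_fvs)) + rng.gauss(0, 0.01)
    for k in range(dim)
]

weights = fit_weights(sub_fvs, target)
approx = [sum(w * v[k] for w, v in zip(weights, sub_fvs)) for k in range(dim)]
residual = [t - a for t, a in zip(target, approx)]
relative_error = (dot(residual, residual) / dot(target, target)) ** 0.5
```

A small `relative_error` is what "well-approximated" would look like in this toy setting. The reweighting by context that the abstract describes would show up as fitted weights that differ across demonstrations.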

preprint, 2026, arXiv

SafeReview: Defending LLM-based Review Systems Against Adversarial Hidden Prompts

As Large Language Models (LLMs) are increasingly integrated into academic peer review, their vulnerability to adversarial prompts -- adversarial instructions embedded in submissions to manipulate outcomes -- emerges as a critical threat to scholarly integrity. To counter this, we propose a novel adversarial framework where a Generator model, trained to create sophisticated attack prompts, is jointly optimized with a Defender model tasked with their detection. This system is trained using a loss function inspired by Information Retrieval Generative Adversarial Networks, which fosters a dynamic co-evolution between the two models, forcing the Defender to develop robust capabilities against continuously improving attack strategies. The resulting framework demonstrates significantly enhanced resilience to novel and evolving threats compared to static defenses, thereby establishing a critical foundation for securing the integrity of peer review.

preprint, 2026, arXiv

Tug-of-war between idioms' figurative and literal interpretations in LLMs

Idioms present a unique challenge for language models due to their non-compositional figurative interpretations, which often strongly diverge from the idiom's literal interpretation. In this paper, we employ causal tracing to systematically analyze how pretrained causal transformers deal with this ambiguity. We localize three mechanisms: (i) Early sublayers and specific attention heads retrieve an idiom's figurative interpretation, while suppressing its literal interpretation. (ii) When disambiguating context precedes the idiom, the model leverages it from the earliest layer and later layers refine the interpretation if the context conflicts with the retrieved interpretation. (iii) Then, selective, competing pathways carry both interpretations: an intermediate pathway prioritizes the figurative interpretation and a parallel direct route favors the literal interpretation, ensuring that both readings remain available. Our findings provide mechanistic evidence for idiom comprehension in autoregressive transformers.

preprint, 2022, arXiv

Crosslinguistic word order variation reflects evolutionary pressures of dependency and information locality

Languages vary considerably in syntactic structure. About 40% of the world's languages have subject-verb-object order, and about 40% have subject-object-verb order. Extensive work has sought to explain this word order variation across languages. However, the existing approaches are not able to explain coherently the frequency distribution and evolution of word order in individual languages. We propose that variation in word order reflects different ways of balancing competing pressures of dependency locality and information locality, whereby languages favor placing elements together when they are syntactically related or contextually informative about each other. Using data from 80 languages in 17 language families and phylogenetic modeling, we demonstrate that languages evolve to balance these pressures, such that word order change is accompanied by change in the frequency distribution of the syntactic structures which speakers communicate to maintain overall efficiency. Variability in word order thus reflects different ways in which languages resolve these evolutionary pressures. We identify relevant characteristics that result from this joint optimization, particularly the frequency with which subjects and objects are expressed together for the same verb. Our findings suggest that syntactic structure and usage across languages co-adapt to support efficient communication under limited cognitive resources.

preprint, 2022, arXiv

Evidence for Parametric Decay Instability in the Lower Solar Atmosphere

We find evidence for the first observation of the parametric decay instability (PDI) in the lower solar atmosphere. Specifically, we find that the power spectrum of density fluctuations near the solar transition region resembles the power spectrum of the velocity fluctuations, but with the frequency axis scaled up by about a factor of two. These results are from an analysis of the Si IV lines observed by the Interface Region Imaging Spectrometer (IRIS) in the transition region of a polar coronal hole. We also find that the density fluctuations have radial velocity of about 75 km/s and that the velocity fluctuations are much faster with an estimated speed of 250 km/s, as is expected for sound waves and Alfvén waves, respectively, in the transition region. Theoretical calculations show that this frequency relationship is consistent with those expected from PDI for the plasma conditions of the observed region. These measurements suggest an interaction between sound waves and Alfvén waves in the transition region that is evidence for the parametric decay instability.

preprint, 2019, arXiv

Laboratory Calibrations of Fe XII-XIV Line-Intensity Ratios for Electron Density Diagnostics

We have used an electron beam ion trap to measure electron-density-diagnostic line-intensity ratios for extreme ultraviolet lines from Fe XII, XIII, and XIV at wavelengths of 185-205 and 255-276 Angstroms. These ratios can be used as density diagnostics for astrophysical spectra and are especially relevant to solar physics. We found that density diagnostics using the Fe XIII 196.53/202.04 and the Fe XIV 264.79/274.21 and 270.52/274.21 line ratios are reliable using the atomic data calculated with the Flexible Atomic Code (FAC). On the other hand, we found a large discrepancy between the FAC theory and experiment for the commonly used Fe XII (186.85 + 186.88)/195.12 line ratio. These FAC theory calculations give similar results to the data tabulated in CHIANTI, which are commonly used to analyze solar observations. Our results suggest that the discrepancies seen between solar coronal density measurements using the Fe XII (186.85 + 186.88)/195.12 and Fe XIII 196.54/202.04 line ratios are likely due to issues with the atomic calculations for Fe XII.

preprint, 2019, arXiv

Measured reduction in Alfvén wave energy propagating through longitudinal gradients scaled to match solar coronal holes

We have explored the effectiveness of a longitudinal gradient in Alfvén speed in reducing the energy of propagating Alfvén waves under conditions scaled to match solar coronal holes. The experiments were conducted in the Large Plasma Device at the University of California, Los Angeles. Our results show that the energy of the transmitted Alfvén wave decreases as the inhomogeneity parameter, $λ/L_{\rm A}$, increases. Here, $λ$ is the wavelength of the Alfvén wave and $L_{\rm A}$ is the scale length of Alfvén speed gradient. For gradients similar to those in coronal holes, the waves are observed to lose a factor of $\approx 5$ more energy than they do when propagating through a uniform plasma without a gradient. We have carried out further experiments and analyses to constrain the cause of wave energy reduction in the gradient. The loss of Alfvén wave energy from mode coupling is unlikely, as we have not detected any other modes. Contrary to theoretical expectations, the reduction in the energy of the transmitted wave is not accompanied by a detectable reflected wave. Nonlinear effects are ruled out as the amplitude of the initial wave is too small and the wave frequency well below the ion cyclotron frequency. Since the total energy must be conserved, it is possible that the lost wave energy is being deposited in the plasma. Further studies are needed to explore where the energy is going.