Source author record

John Hewitt

John Hewitt appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

astro-ph.HE, Computation and Language, Machine Learning, Artificial Intelligence, cs.CY

Catalog footprint

What is connected

5 works

5 topics

4 close collaborators

preprint, 2026, arXiv

Subliminal Steering: Stronger Encoding of Hidden Signals

Subliminal learning describes a student language model inheriting a behavioral bias by fine-tuning on seemingly innocuous data generated by a biased teacher model. Prior work has begun to characterize this phenomenon but leaves open questions about the scope of signals it can transfer, the mechanisms that explain it, and the precision with which a bias can be encoded by seemingly unrelated data. We tackle all three problems by introducing subliminal steering, a variant of subliminal learning in which the teacher's bias is implemented not via a system prompt, as in prior work, but through a steering vector trained to maximize the likelihood of a set of target samples. First, we show that subliminal steering transfers complex multi-word biases, whereas prior work focused on single-word preferences, demonstrating a large scope of subliminally transferable signals. Second, we provide mechanistic evidence that subliminal learning transfers not only the target behavioral bias, but also the steering vector itself, localized to the layers at which the teacher was steered. Finally, we show that the bias is encoded with surprising precision. We train a new steering vector directly on the subliminally-laden dataset and find that it attains high cosine similarity with the original vector.
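The closing comparison in this abstract rests on cosine similarity between two steering vectors. As a minimal sketch of that metric only (not the authors' code), the snippet below computes it in plain Python. The function name `cosine_similarity` and the toy vectors are hypothetical and are not values from the paper.

```python
import math


def cosine_similarity(u, v):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


# Hypothetical toy vectors standing in for an original and a re-trained
# steering vector; they point the same way but differ in scale.
original_vector = [0.5, -1.0, 2.0]
recovered_vector = [1.0, -2.0, 4.0]

print(cosine_similarity(original_vector, recovered_vector))
```

Because the metric ignores vector length, a value near 1 indicates that two vectors share a direction even when their magnitudes differ.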

preprint, 2022, arXiv

On the Opportunities and Risks of Foundation Models

AI is undergoing a paradigm shift with the rise of models (e.g., BERT, DALL-E, GPT-3) that are trained on broad data at scale and are adaptable to a wide range of downstream tasks. We call these models foundation models to underscore their critically central yet incomplete character. This report provides a thorough account of the opportunities and risks of foundation models, ranging from their capabilities (e.g., language, vision, robotics, reasoning, human interaction) and technical principles (e.g., model architectures, training procedures, data, systems, security, evaluation, theory) to their applications (e.g., law, healthcare, education) and societal impact (e.g., inequity, misuse, economic and environmental impact, legal and ethical considerations). Though foundation models are based on standard deep learning and transfer learning, their scale results in new emergent capabilities, and their effectiveness across so many tasks incentivizes homogenization. Homogenization provides powerful leverage but demands caution, as the defects of the foundation model are inherited by all the adapted models downstream. Despite the impending widespread deployment of foundation models, we currently lack a clear understanding of how they work, when they fail, and what they are even capable of due to their emergent properties. To tackle these questions, we believe much of the critical research on foundation models will require deep interdisciplinary collaboration commensurate with their fundamentally sociotechnical nature.

preprint, 2020, arXiv

Finding Universal Grammatical Relations in Multilingual BERT

Recent work has found evidence that Multilingual BERT (mBERT), a transformer-based multilingual masked language model, is capable of zero-shot cross-lingual transfer, suggesting that some aspects of its representations are shared cross-lingually. To better understand this overlap, we extend recent work on finding syntactic trees in neural networks' internal representations to the multilingual setting. We show that subspaces of mBERT representations recover syntactic tree distances in languages other than English, and that these subspaces are approximately shared across languages. Motivated by these results, we present an unsupervised analysis method that provides evidence mBERT learns representations of syntactic dependency labels, in the form of clusters which largely agree with the Universal Dependencies taxonomy. This evidence suggests that even without explicit supervision, multilingual masked language models learn certain linguistic universals.
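The abstract does not spell out how a subspace "recovers" tree distances. One common formulation in the structural-probe line of work it extends scores a word pair by the squared length of their difference vector after a linear projection. The sketch below assumes that formulation. The matrix `B`, the helper names, and the toy vectors are hypothetical, and a real probe would learn `B` from parsed sentences rather than fix it by hand.

```python
def project(matrix, vector):
    """Multiply a matrix (given as a list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def probe_squared_distance(matrix, h_i, h_j):
    """Squared L2 distance between two word vectors inside the probed subspace."""
    diff = [a - b for a, b in zip(h_i, h_j)]
    projected = project(matrix, diff)
    return sum(p * p for p in projected)


# Hypothetical rank-2 projection that keeps the first two dimensions
# and discards the third.
B = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
]

# Hypothetical contextual vectors for two words in one sentence.
h_a = [1.0, 2.0, 5.0]
h_b = [0.0, 0.0, -3.0]

print(probe_squared_distance(B, h_a, h_b))
```

Under this assumption, the two vectors differ most in the discarded third dimension, which contributes nothing to the score. That is the sense in which a low-rank subspace can isolate syntactic structure from the rest of the representation.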

preprint, 2016, arXiv

Discovery of X-ray Emission from the Galactic Supernova Remnant G32.8-0.1 with Suzaku

We present the first dedicated X-ray study of the supernova remnant (SNR) G32.8-0.1 (Kes 78) with Suzaku. X-ray emission from the whole SNR shell has been detected for the first time. The X-ray morphology is well correlated with the emission from the radio shell, while anti-correlated with the molecular cloud found in the SNR field. The X-ray spectrum shows not only conventional low-temperature (kT ~ 0.6 keV) thermal emission in a non-equilibrium ionization state, but also a very high temperature (kT ~ 3.4 keV) component with a very low ionization timescale (~ 2.7e9 cm^{-3}s), or a hard non-thermal component with a photon index Gamma~2.3. The average density of the low-temperature plasma is rather low, of the order of 10^{-3}--10^{-2} cm^{-3}, implying that this SNR is expanding into a low-density cavity. We discuss the X-ray emission of the SNR, also detected in TeV with H.E.S.S., together with multi-wavelength studies of the remnant and other gamma-ray emitting SNRs, such as W28 and RCW 86. Analysis of a time-variable source, 2XMM J185114.3-000004, found in the northern part of the SNR, is also reported for the first time. Rapid time variability and a heavily absorbed hard X-ray spectrum suggest that this source could be a new supergiant fast X-ray transient.

preprint, 2015, arXiv

New Identification of the Mixed-Morphology Supernova Remnant G298.6-0.0 with Possible Gamma-ray Association

We present an X-ray analysis of the Galactic supernova remnant (SNR) G298.6-0.0 with Suzaku. The X-ray image shows a center-filled structure inside the radio shell, implying that this SNR is a mixed-morphology (MM) SNR. The spectrum is well reproduced by a single-temperature plasma model in ionization equilibrium, with a temperature of 0.78 (0.70-0.87) keV. The total plasma mass of 30 solar masses indicates that the plasma is of interstellar medium origin. The association with a GeV gamma-ray source, 3FGL J1214.0-6236, on the shell of the SNR is discussed in comparison with other MM SNRs with GeV gamma-ray associations. It is found that the flux ratio between absorption-corrected thermal X-rays and GeV gamma-rays decreases as the MM SNRs evolve to larger physical sizes. The absorption-corrected X-ray flux of G298.6-0.0 and the GeV gamma-ray flux of 3FGL J1214.0-6236 closely follow this trend, implying that 3FGL J1214.0-6236 is likely to be the GeV counterpart of G298.6-0.0.