Source author record

Jared Kaplan

Jared Kaplan appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

hep-th, hep-ph, cond-mat.str-el, Machine Learning, hep-ex, Artificial Intelligence, Computation and Language, astro-ph.CO, astro-ph.HE, cond-mat.stat-mech, Cryptography and Security, gr-qc

Catalog footprint

What is connected

42 works

12 topics

4 close collaborators

preprint · 2026 · arXiv

Constitutional Classifiers++: Efficient Production-Grade Defenses against Universal Jailbreaks

We introduce enhanced Constitutional Classifiers that deliver production-grade jailbreak robustness with dramatically reduced computational costs and refusal rates compared to previous-generation defenses. Our system combines several key insights. First, we develop exchange classifiers that evaluate model responses in their full conversational context, which addresses vulnerabilities in last-generation systems that examine outputs in isolation. Second, we implement a two-stage classifier cascade where lightweight classifiers screen all traffic and escalate only suspicious exchanges to more expensive classifiers. Third, we train efficient linear probe classifiers and ensemble them with external classifiers to simultaneously improve robustness and reduce computational costs. Together, these techniques yield a production-grade system achieving a 40x computational cost reduction compared to our baseline exchange classifier, while maintaining a 0.05% refusal rate on production traffic. Through extensive red-teaming comprising over 1,700 hours, we demonstrate strong protection against universal jailbreaks -- no attack on this system successfully elicited responses to all eight target queries comparable in detail to an undefended model. Our work establishes Constitutional Classifiers as practical and efficient safeguards for large language models.
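The two-stage cascade described in this abstract can be sketched as follows. This is a minimal illustration only: the function names `cheap_score` and `expensive_score` and both threshold values are hypothetical and do not come from the paper.

```python
from typing import Callable


def cascade_flag(
    exchange: str,
    cheap_score: Callable[[str], float],      # hypothetical lightweight classifier
    expensive_score: Callable[[str], float],  # hypothetical costly classifier
    escalate_at: float = 0.2,                 # made-up escalation threshold
    block_at: float = 0.5,                    # made-up blocking threshold
) -> bool:
    """Return True if the exchange should be blocked."""
    # Stage 1: the lightweight classifier screens every exchange.
    if cheap_score(exchange) < escalate_at:
        return False
    # Stage 2: only suspicious exchanges reach the expensive classifier.
    return expensive_score(exchange) >= block_at
```

The cost saving in such a design comes from the second classifier running only on the small share of traffic that the first stage escalates.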

preprint · 2022 · arXiv

Scaling Laws and Interpretability of Learning from Repeated Data

Recent large language models have been trained on vast datasets, but also often on repeated data, either intentionally for the purpose of upweighting higher quality data, or unintentionally because data deduplication is not perfect and the model is exposed to repeated data at the sentence, paragraph, or document level. Some works have reported substantial negative performance effects of this repeated data. In this paper we attempt to study repeated data systematically and to understand its effects mechanistically. To do this, we train a family of models where most of the data is unique but a small fraction of it is repeated many times. We find a strong double descent phenomenon, in which repeated data can lead test loss to increase midway through training. A predictable range of repetition frequency leads to surprisingly severe degradation in performance. For instance, performance of an 800M parameter model can be degraded to that of a 2x smaller model (400M params) by repeating 0.1% of the data 100 times, despite the other 90% of the training tokens remaining unique. We suspect there is a range in the middle where the data can be memorized and doing so consumes a large fraction of the model's capacity, and this may be where the peak of degradation occurs. Finally, we connect these observations to recent mechanistic interpretability work - attempting to reverse engineer the detailed computations performed by the model - by showing that data repetition disproportionately damages copying and internal structures associated with generalization, such as induction heads, providing a possible mechanism for the shift from generalization to memorization. Taken together, these results provide a hypothesis for why repeating a relatively small fraction of data in large language models could lead to disproportionately large harms to performance.
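The "0.1% repeated 100 times" and "other 90%" figures in this abstract fit together under one reading, sketched below: the repeated subset is assumed to be sized relative to the total training token budget. That reading and the helper name `repeated_token_share` are assumptions made here, not statements from the paper.

```python
def repeated_token_share(repeated_fraction: float, repeats: int) -> float:
    """Share of all training tokens taken up by the repeated subset.

    Assumes `repeated_fraction` is measured against the total token budget.
    """
    return repeated_fraction * repeats


share = repeated_token_share(0.001, 100)  # 0.1% of the data, seen 100 times
print(f"repeated: {share:.0%}, unique: {1 - share:.0%}")
```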

preprint · 2022 · arXiv

Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback

We apply preference modeling and reinforcement learning from human feedback (RLHF) to finetune language models to act as helpful and harmless assistants. We find this alignment training improves performance on almost all NLP evaluations, and is fully compatible with training for specialized skills such as python coding and summarization. We explore an iterated online mode of training, where preference models and RL policies are updated on a weekly cadence with fresh human feedback data, efficiently improving our datasets and models. Finally, we investigate the robustness of RLHF training, and identify a roughly linear relation between the RL reward and the square root of the KL divergence between the policy and its initialization. Alongside our main results, we perform peripheral analyses on calibration, competing objectives, and the use of OOD detection, compare our models with human writers, and provide samples from our models using prompts appearing in recent related work.

preprint · 2021 · arXiv

Scaling Laws for Transfer

We study empirical scaling laws for transfer learning between distributions in an unsupervised, fine-tuning setting. When we train increasingly large neural networks from-scratch on a fixed-size dataset, they eventually become data-limited and stop improving in performance (cross-entropy loss). When we do the same for models pre-trained on a large language dataset, the slope in performance gains is merely reduced rather than going to zero. We calculate the effective data "transferred" from pre-training by determining how much data a transformer of the same size would have required to achieve the same loss when training from scratch. In other words, we focus on units of data while holding everything else fixed. We find that the effective data transferred is described well in the low data regime by a power-law of parameter count and fine-tuning dataset size. We believe the exponents in these power-laws correspond to measures of the generality of a model and proximity of distributions (in a directed rather than symmetric sense). We find that pre-training effectively multiplies the fine-tuning dataset size. Transfer, like overall performance, scales predictably in terms of parameters, data, and compute.
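One way to write the relation described in this abstract is sketched below. The symbols are labels chosen here for illustration: $D_T$ for the effective data transferred, $D_F$ for the fine-tuning dataset size, $N$ for the parameter count, and $k$, $\alpha$, $\beta$ for fitted constants. The abstract itself states only that a power law in these two quantities holds in the low-data regime.

```latex
D_T \approx k \, D_F^{\alpha} \, N^{\beta},
\qquad
\text{effective data multiplier} = \frac{D_F + D_T}{D_F}
```

Under this reading, the statement that pre-training "multiplies" the fine-tuning dataset corresponds to the ratio on the right being greater than one.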

preprint · 2020 · arXiv

A Neural Scaling Law from the Dimension of the Data Manifold

When data is plentiful, the loss achieved by well-trained neural networks scales as a power-law $L \propto N^{-\alpha}$ in the number of network parameters $N$. This empirical scaling law holds for a wide variety of data modalities, and may persist over many orders of magnitude. The scaling law can be explained if neural models are effectively just performing regression on a data manifold of intrinsic dimension $d$. This simple theory predicts that the scaling exponents $\alpha \approx 4/d$ for cross-entropy and mean-squared error losses. We confirm the theory by independently measuring the intrinsic dimension and the scaling exponents in a teacher/student framework, where we can study a variety of $d$ and $\alpha$ by dialing the properties of random teacher networks. We also test the theory with CNN image classifiers on several datasets and with GPT-type language models.
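As a small worked example of the prediction stated in this abstract, the sketch below evaluates the exponent of roughly 4/d and the resulting change in loss when the parameter count grows. The function names and the example dimension d = 16 are arbitrary choices made here, not values from the paper.

```python
def predicted_exponent(intrinsic_dim: float) -> float:
    """Scaling exponent alpha ~ 4/d for a data manifold of dimension d."""
    if intrinsic_dim <= 0:
        raise ValueError("intrinsic dimension must be positive")
    return 4.0 / intrinsic_dim


def loss_ratio(param_factor: float, alpha: float) -> float:
    """Factor by which a loss L ~ N**(-alpha) changes when N is scaled by param_factor."""
    return param_factor ** (-alpha)


alpha = predicted_exponent(16)           # d = 16 is an arbitrary example value
print(alpha, loss_ratio(10.0, alpha))    # exponent, and loss change for 10x more parameters
```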

preprint · 2020 · arXiv

Language Models are Few-Shot Learners

Recent work has demonstrated substantial gains on many NLP tasks and benchmarks by pre-training on a large corpus of text followed by fine-tuning on a specific task. While typically task-agnostic in architecture, this method still requires task-specific fine-tuning datasets of thousands or tens of thousands of examples. By contrast, humans can generally perform a new language task from only a few examples or from simple instructions - something which current NLP systems still largely struggle to do. Here we show that scaling up language models greatly improves task-agnostic, few-shot performance, sometimes even reaching competitiveness with prior state-of-the-art fine-tuning approaches. Specifically, we train GPT-3, an autoregressive language model with 175 billion parameters, 10x more than any previous non-sparse language model, and test its performance in the few-shot setting. For all tasks, GPT-3 is applied without any gradient updates or fine-tuning, with tasks and few-shot demonstrations specified purely via text interaction with the model. GPT-3 achieves strong performance on many NLP datasets, including translation, question-answering, and cloze tasks, as well as several tasks that require on-the-fly reasoning or domain adaptation, such as unscrambling words, using a novel word in a sentence, or performing 3-digit arithmetic. At the same time, we also identify some datasets where GPT-3's few-shot learning still struggles, as well as some datasets where GPT-3 faces methodological issues related to training on large web corpora. Finally, we find that GPT-3 can generate samples of news articles which human evaluators have difficulty distinguishing from articles written by humans. We discuss broader societal impacts of this finding and of GPT-3 in general.

preprint · 2020 · arXiv

Scaling Laws for Neural Language Models

We study empirical scaling laws for language model performance on the cross-entropy loss. The loss scales as a power-law with model size, dataset size, and the amount of compute used for training, with some trends spanning more than seven orders of magnitude. Other architectural details such as network width or depth have minimal effects within a wide range. Simple equations govern the dependence of overfitting on model/dataset size and the dependence of training speed on model size. These relationships allow us to determine the optimal allocation of a fixed compute budget. Larger models are significantly more sample-efficient, such that optimally compute-efficient training involves training very large models on a relatively modest amount of data and stopping significantly before convergence.
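A power-law trend of the kind described in this abstract appears as a straight line in log-log space, so its exponent can be estimated with an ordinary least-squares fit. The sketch below uses synthetic, noise-free data; the constants 5.0 and 0.1 are made up for illustration and are not results from the paper.

```python
import math


def fit_power_law(xs, ys):
    """Fit y = c * x**(-alpha) by least squares on (log x, log y); return (alpha, c)."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(lx)
    mx, my = sum(lx) / n, sum(ly) / n
    # Slope of the best-fit line through the log-log points.
    slope = sum((a - mx) * (b - my) for a, b in zip(lx, ly)) / sum(
        (a - mx) ** 2 for a in lx
    )
    intercept = my - slope * mx
    return -slope, math.exp(intercept)


# Synthetic model sizes and losses (illustrative values only).
sizes = [1e6, 1e7, 1e8, 1e9]
losses = [5.0 * n ** -0.1 for n in sizes]
alpha, c = fit_power_law(sizes, losses)
print(alpha, c)
```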

preprint · 2019 · arXiv

A Species or Weak-Gravity Bound for Large $N$ Gauge Theories Coupled to Gravity

Causality constrains the gravitational interactions of massive higher spin particles in both AdS and flat spacetime. We explore the extent to which these constraints apply to composite particles, explaining why they do not rule out macroscopic objects or hydrogen atoms. However, we find that they do apply to glueballs and mesons in confining large $N$ gauge theories. Assuming such theories contain massive bound states of general spin, we find parametric bounds in $(3+1)$ spacetime dimensions of the form $N \lesssim \frac{M_{Pl}}{\Lambda_{\text{QCD}}}$ relating $N$, the QCD scale, and the Planck scale. We also argue that a stronger bound replacing $\Lambda_{\text{QCD}}$ with the UV cut-off scale may be derived from eikonal scattering in flat spacetime.

preprint · 2016 · arXiv

A Quantum Correction To Chaos

We use results on Virasoro conformal blocks to study chaotic dynamics in CFT$_2$ at large central charge $c$. The Lyapunov exponent $\lambda_L$, which is a diagnostic for the early onset of chaos, receives $1/c$ corrections that may be interpreted as $\lambda_L = \frac{2\pi}{\beta} \left( 1 + \frac{12}{c} \right)$. However, out of time order correlators receive other equally important $1/c$ suppressed contributions that do not have such a simple interpretation. We revisit the proof of a bound on $\lambda_L$ that emerges at large $c$, focusing on CFT$_2$ and explaining why our results do not conflict with the analysis leading to the bound. We also comment on relationships between chaos, scattering, causality, and bulk locality.
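To get a feel for the size of the correction quoted in this abstract, the sketch below evaluates the expression for the Lyapunov exponent at a given inverse temperature and central charge. The function name and the sample values are illustrative choices made here.

```python
import math


def lyapunov_exponent(beta: float, c: float) -> float:
    """Evaluate (2*pi/beta) * (1 + 12/c), the form quoted in the abstract."""
    return (2 * math.pi / beta) * (1 + 12 / c)


beta, c = 1.0, 1200.0                      # arbitrary example values
leading = 2 * math.pi / beta               # large-c value without the correction
relative_shift = lyapunov_exponent(beta, c) / leading - 1
print(relative_shift)                      # fractional correction, equal to 12/c
```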

preprint · 2016 · arXiv

Conformal Blocks Beyond the Semi-Classical Limit

Black hole microstates and their approximate thermodynamic properties can be studied using heavy-light correlation functions in AdS/CFT. Universal features of these correlators can be extracted from the Virasoro conformal blocks in CFT2, which encapsulate quantum gravitational effects in AdS3. At infinite central charge c, the Virasoro vacuum block provides an avatar of the black hole information paradox in the form of periodic Euclidean-time singularities that must be resolved at finite c. We compute Virasoro blocks in the heavy-light, large c limit, extending our previous results by determining perturbative 1/c corrections. We obtain explicit closed-form expressions for both the `semi-classical' $h_L^2 / c^2$ and `quantum' $h_L / c^2$ corrections to the vacuum block, and we provide integral formulas for general Virasoro blocks. We comment on the interpretation of our results for thermodynamics, discussing how monodromies in Euclidean time can arise from AdS calculations using `geodesic Witten diagrams'. We expect that only non-perturbative corrections in 1/c can resolve the singularities associated with the information paradox.

preprint · 2016 · arXiv

On Information Loss in AdS$_3$/CFT$_2$

We discuss information loss from black hole physics in AdS$_3$, focusing on two sharp signatures infecting CFT$_2$ correlators at large central charge $c$: 'forbidden singularities' arising from Euclidean-time periodicity due to the effective Hawking temperature, and late-time exponential decay in the Lorentzian region. We study an infinite class of examples where forbidden singularities can be resolved by non-perturbative effects at finite $c$, and we show that the resolution has certain universal features that also apply in the general case. Analytically continuing to the Lorentzian regime, we find that the non-perturbative effects that resolve forbidden singularities qualitatively change the behavior of correlators at times $t \sim S_{BH}$, the black hole entropy. This may resolve the exponential decay of correlators at late times in black hole backgrounds. By Borel resumming the $1/c$ expansion of exact examples, we explicitly identify 'information-restoring' effects from heavy states that should correspond to classical solutions in AdS$_3$. Our results suggest a line of inquiry towards a more precise formulation of the gravitational path integral in AdS$_3$.

preprint · 2015 · arXiv

Eikonalization of Conformal Blocks

Classical field configurations such as the Coulomb potential and Schwarzschild solution are built from the t-channel exchange of many light degrees of freedom. We study the CFT analog of this phenomenon, which we term the `eikonalization' of conformal blocks. We show that when an operator $T$ appears in the OPE $\mathcal{O}(x) \mathcal{O}(0)$, then the large spin $\ell$ Fock space states $[TT \cdots T]_{\ell}$ also appear in this OPE with a computable coefficient. The sum over the exchange of these Fock space states in an $\langle \mathcal{O} \mathcal{O} \mathcal{O} \mathcal{O} \rangle$ correlator builds the classical `$T$ field' in the dual AdS description. In some limits the sum of all Fock space exchanges can be represented as the exponential of a single $T$ exchange in the 4-pt correlator of $\mathcal{O}$. Our results should be useful for systematizing $1/\ell$ perturbation theory in general CFTs and simplifying the computation of large spin OPE coefficients. As examples we obtain the leading $\log \ell$ dependence of Fock space conformal block coefficients, and we directly compute the OPE coefficients of the simplest `triple-trace' operators.

preprint · 2015 · arXiv

Enhanced Pairing of Quantum Critical Metals Near d=3+1

We study the dynamics of a quantum critical boson coupled to a Fermi surface in intermediate energy regimes where the Landau damping of the boson can be parametrically controlled, either via large Fermi velocity or by large N techniques. We focus on developing a systematic approach to studying the BCS instability, including careful treatment of the enhanced log^2 and log^3 singularities which appear already at 1-loop. We also treat possible instabilities to charge density wave (CDW) formation, and compare the scales Lambda_{BCS} and Lambda_{CDW} of the onset of the instabilities in different parametric regimes. We address the question of whether the dressing of the fermions into a non-Fermi liquid via interactions with the order parameter field can happen at energies > Lambda_{BCS}, Lambda_{CDW}.

preprint · 2015 · arXiv

Hawking from Catalan

The Virasoro algebra determines all `graviton' matrix elements in AdS$_3$/CFT$_2$. We study the explicit exchange of any number of Virasoro gravitons between heavy and light CFT$_2$ operators at large central charge. These graviton exchanges can be written in terms of new on-shell tree diagrams, organized in a perturbative expansion in $h_H/c$, the heavy operator dimension divided by the central charge. The Virasoro vacuum conformal block, which is the sum of all the tree diagrams, obeys a differential recursion relation generalizing that of the Catalan numbers. We use this recursion relation to sum the on-shell diagrams to all orders, computing the Virasoro vacuum block. Extrapolating to large $h_H/c$ determines the Hawking temperature of a BTZ black hole in dual AdS$_3$ theories.
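For reference, the ordinary Catalan recursion that this abstract says is generalized can be sketched as below. This is the standard textbook recursion, C_0 = 1 and C_{n+1} = sum over i of C_i * C_{n-i}; it is not the differential recursion for the Virasoro vacuum block derived in the paper.

```python
def catalan_numbers(count: int) -> list[int]:
    """First `count` Catalan numbers via C_0 = 1, C_{n+1} = sum_i C_i * C_{n-i}."""
    numbers = [1] if count > 0 else []
    while len(numbers) < count:
        n = len(numbers) - 1
        # Convolve the sequence with itself to get the next term.
        numbers.append(sum(numbers[i] * numbers[n - i] for i in range(n + 1)))
    return numbers


print(catalan_numbers(6))  # [1, 1, 2, 5, 14, 42]
```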

preprint · 2015 · arXiv

Virasoro Conformal Blocks and Thermality from Classical Background Fields

We show that in 2d CFTs at large central charge, the coupling of the stress tensor to heavy operators can be re-absorbed by placing the CFT in a non-trivial background metric. This leads to a more precise computation of the Virasoro conformal blocks between heavy and light operators, which are shown to be equivalent to global conformal blocks evaluated in the new background. We also generalize to the case where the operators carry U(1) charges. The refined Virasoro blocks can be used as the seed for a new Virasoro block recursion relation expanded in the heavy-light limit. We comment on the implications of our results for the universality of black hole thermality in $AdS_3$, or equivalently, the eigenstate thermalization hypothesis for $CFT_2$ at large central charge.

preprint · 2014 · arXiv

An Effective Theory for Holographic RG Flows

We study the dilaton action induced by RG flows between holographic CFT fixed points. For this purpose we introduce a general bulk effective theory for the Goldstone boson of the broken spacetime symmetry, providing an AdS analog of the EFT of Inflation. In two dimensions, we use the effective theory to compute the dilaton action, as well as the UV and IR conformal anomalies, without further assumptions. In higher dimensions we take a `slow-flow' limit analogous to the assumption of slow-roll in Inflation, and in this context we obtain the dilaton action, focusing on terms proportional to the difference of the A-type anomalies. We include Gauss-Bonnet terms in the gravitational action in order to verify that our method correctly differentiates between A-type and other anomalies.

preprint · 2014 · arXiv

Covariant Approaches to Superconformal Blocks

We develop techniques for computing superconformal blocks in 4d superconformal field theories. First we study the super-Casimir differential equation, deriving simple new expressions for superconformal blocks for 4-point functions containing chiral operators in theories with N-extended supersymmetry. We also reproduce these results by extending the "shadow formalism" of Ferrara, Gatto, Grillo, and Parisi to supersymmetric theories, where superconformal blocks can be represented as superspace integrals of three-point functions multiplied by shadow three-point functions.

preprint · 2014 · arXiv

Slow Fermions in Quantum Critical Metals

We study the low-energy behavior of metals coupled to gapless bosons. This problem arises in several contexts in modern condensed matter physics; we focus on the theory of metals near continuous quantum phase transitions (where the boson is the order parameter). In the vicinity of d=3 spatial dimensions, the upper critical dimension of the theory, the ratio of fermion and boson speeds, v/c, acts as an additional control parameter, enabling us to access IR fixed points where this ratio vanishes. This limit corresponds to a non-Fermi liquid coupled to bosons with critical exponents governed by the Wilson-Fisher fixed point.

preprint · 2014 · arXiv

Universality of Long-Distance AdS Physics from the CFT Bootstrap

We begin by explicating a recent proof of the cluster decomposition principle in AdS_{d+1} from the CFT_d bootstrap in d > 2. The CFT argument also computes the leading interactions between distant objects in AdS, and we confirm the universal agreement between the CFT bootstrap and AdS gravity in the semi-classical limit. We proceed to study the generalization to 2d CFTs, which requires knowledge of the Virasoro conformal blocks in a lightcone OPE limit. We compute these blocks in a semiclassical, large central charge approximation, and use them to prove a suitably modified theorem. In particular, from the 2d bootstrap we prove the existence of large spin operators with fixed 'anomalous dimensions' indicative of the presence of deficit angles in AdS_3. As we approach the threshold for the BTZ black hole, interpreted as a CFT scaling dimension, the twist spectrum of large spin operators becomes dense. Due to the exchange of the Virasoro identity block, primary states above the BTZ threshold mimic a thermal background for light operators. We derive the BTZ quasi-normal modes, and we use the bootstrap equation to prove that the twist spectrum is dense. Corrections to thermality could be obtained from a more refined computation of the Virasoro conformal blocks.

preprint · 2013 · arXiv

Conformal Blocks in the Large D Limit

We derive conformal blocks in an inverse spacetime dimension expansion. In this large D limit, the blocks are naturally written in terms of a new combination of conformal cross-ratios. We comment on the implications for the conformal bootstrap at large D.

preprint · 2013 · arXiv

Decoupling of High Dimension Operators from the Low Energy Sector in Holographic Models

We study the decoupling of high dimension operators from the description of the low-energy spectrum in theories where conformal symmetry is broken by a single scale, which we refer to as `broken CFTs'. Holographic duality suggests that this decoupling occurs in generic backgrounds. We show how the decoupling of high mass states in the (d+1)-dimensional bulk relates to the decoupling of high energy states in the d-dimensional broken CFT. In other words, we explain why both high dimension operators and high mass states in the CFT decouple from the low-energy physics of the mesons and glueballs. In many cases, the decoupling can occur exponentially fast in the dimension of the operator. Holography motivates a new kind of form factor proportional to the two point function between broken CFT operators with very different scaling dimensions. This new notion of decoupling can provide a systematic justification for holographic descriptions of QCD and condensed matter systems with only light degrees of freedom in the bulk.

preprint · 2013 · arXiv

Non-Fermi liquid behavior of large N_B quantum critical metals

The problem of continuous quantum phase transitions in metals involves critical bosons coupled to a Fermi surface. We solve the theory in the limit of a large number, N_B, of bosonic flavors, where the bosons transform in the adjoint representation, while the fermions are in the fundamental representation of a global SU(N_B) flavor symmetry group. The leading large N_B solution corresponds to a non-Fermi liquid coupled to Wilson-Fisher bosons. In a certain energy range, the fermion velocity vanishes - resulting in the destruction of the Fermi surface. Subleading 1/N_B corrections correspond to a qualitatively different form of Landau damping of the bosonic critical fluctuations. We discuss the model in d=3-epsilon but because of the additional control afforded by large N_B, our results are valid down to d=2. In the limit epsilon << 1, the large N_B solution is consistent with the RG analysis of Ref. 1.

preprint · 2013 · arXiv

Non-Fermi liquid fixed point in a Wilsonian theory of quantum critical metals

We study the problem of disorder-free metals near a continuous quantum critical point. We depart from the standard paradigm of Hertz and Millis, and treat both fermions and bosons (i.e. order parameter fields) on equal footing. We construct a Wilsonian effective field theory that integrates out only high energy boson and fermion modes. Below the upper critical dimension of the theory (d=3 spatial dimensions), we find new fixed points in which the bosons are described by the Wilson-Fisher fixed point and are coupled to a non-Fermi liquid metal. We describe subtleties with the renormalization group flow of four-Fermi interactions, which can be surmounted in a controlled large N limit. In this limit, we find that the theory has no superconducting instability.

preprint · 2013 · arXiv

The Analytic Bootstrap and AdS Superhorizon Locality

We take an analytic approach to the CFT bootstrap, studying the 4-pt correlators of d > 2 dimensional CFTs in an Eikonal-type limit, where the conformal cross ratios satisfy |u| << |v| < 1. We prove that every CFT with a scalar operator ϕ must contain infinite sequences of operators O_{τ,l} with twist approaching τ -> 2Δ_ϕ + 2n for each integer n as l -> infinity. We show how the rate of approach is controlled by the twist and OPE coefficient of the leading twist operator in the ϕ x ϕ OPE, and we discuss SCFTs and the 3d Ising Model as examples. Additionally, we show that the OPE coefficients of other large spin operators appearing in the OPE are bounded as l -> infinity. We interpret these results as a statement about superhorizon locality in AdS for general CFTs.

preprint · 2012 · arXiv

A New Theory of Anyons

We study a 2+1 dimensional theory of bosons and fermions with an omega ~ k^2 dispersion relation. The most general interactions consistent with specific symmetries impart fractional statistics to the fermions. Unlike examples involving Chern-Simons gauge theories, our statistical phases derive from the exchange of gapless propagating bosons with marginal interactions. Even though no gap exists, we show that the anyonic statistics are precisely defined. Symmetries combine with the vacuum structure to guarantee the non-renormalization of our anyonic phases.

preprint · 2012 · arXiv

AdS Field Theory from Conformal Field Theory

We provide necessary and sufficient conditions for a Conformal Field Theory to have a description in terms of a perturbative Effective Field Theory in AdS. The first two conditions are well-known: the existence of a perturbative `1/N' expansion and an approximate Fock space of states generated by a finite number of low-dimension operators. We add a third condition, that the Mellin amplitudes of the CFT correlators must be well-approximated by functions that are bounded by a polynomial at infinity in Mellin space, or in other words, that the Mellin amplitudes have an effective theory-type expansion. We explain the relationship between our conditions and unitarity, and provide an analogy with scattering amplitudes that becomes exact in the flat space limit of AdS. The analysis also yields a simple connection between conformal blocks and AdS diagrams, providing a new calculational tool very much in the spirit of the S-Matrix program. We also begin to explore the potential pathologies associated with higher spin fields in AdS by generalizing Weinberg's soft theorems to AdS/CFT. The AdS analog of Weinberg's argument constrains the interactions of conserved currents in CFTs, but there are potential loopholes that are unavailable to theories of massless higher spin particles in flat spacetime.

preprint · 2012 · arXiv

Analyticity and the Holographic S-Matrix

We derive a simple relation between the Mellin amplitude for AdS/CFT correlation functions and the bulk S-Matrix in the flat spacetime limit, proving a conjecture of Penedones. As a consequence of the Operator Product Expansion, the Mellin amplitude for any unitary CFT must be a meromorphic function with simple poles on the real axis. This provides a powerful and suggestive handle on the locality vis-a-vis analyticity properties of the S-Matrix. We begin to explore analyticity by showing how the familiar poles and branch cuts of scattering amplitudes arise from the holographic description. For this purpose we compute examples of Mellin amplitudes corresponding to 1-loop and 2-loop Witten diagrams in AdS. We also examine the flat spacetime limit of conformal blocks, implicitly relating the S-Matrix program to the Bootstrap program for CFTs. We use this connection to show how the existence of small black holes in AdS leads to a universal prediction for the conformal block decomposition of the dual CFT.

preprint · 2012 · arXiv

Unitarity and the Holographic S-Matrix

The bulk S-Matrix can be given a non-perturbative definition in terms of the flat space limit of AdS/CFT. We show that the unitarity of the S-Matrix, i.e. the optical theorem, can be derived by studying the behavior of the OPE and the conformal block decomposition in the flat space limit. When applied to perturbation theory in AdS, this gives a holographic derivation of the cutting rules for Feynman diagrams. To demonstrate these facts we introduce some new techniques for the analysis of conformal field theories. Chief among these is a method for conglomerating local primary operators to extract the contribution of an individual primary in their OPE. This provides a method for isolating the contribution of specific conformal blocks which we use to prove an important relation between certain conformal block coefficients and anomalous dimensions. These techniques make essential use of the simplifications that occur when CFT correlators are expressed in terms of a Mellin amplitude.

preprint · 2011 · arXiv

A Natural Language for AdS/CFT Correlators

We provide dramatic evidence that `Mellin space' is the natural home for correlation functions in CFTs with weakly coupled bulk duals. In Mellin space, CFT correlators have poles corresponding to an OPE decomposition into `left' and `right' sub-correlators, in direct analogy with the factorization channels of scattering amplitudes. In the regime where these correlators can be computed by tree level Witten diagrams in AdS, we derive an explicit formula for the residues of Mellin amplitudes at the corresponding factorization poles, and we use the conformal Casimir to show that these amplitudes obey algebraic finite difference equations. By analyzing the recursive structure of our factorization formula we obtain simple diagrammatic rules for the construction of Mellin amplitudes corresponding to tree-level Witten diagrams in any bulk scalar theory. We prove the diagrammatic rules using our finite difference equations. Finally, we show that our factorization formula and our diagrammatic rules morph into the flat space S-Matrix of the bulk theory, reproducing the usual Feynman rules, when we take the flat space limit of AdS/CFT. Throughout we emphasize a deep analogy with the properties of flat space scattering amplitudes in momentum space, which suggests that the Mellin amplitude may provide a holographic definition of the flat space S-Matrix.

preprint · 2011 · arXiv

Heavy Flavor Simplified Models at the LHC

We consider a comprehensive set of simplified models that contribute to final states with top and bottom quarks at the LHC. These simplified models are used to create minimal search strategies that ensure optimal coverage of new heavy flavor physics involving the pair production of color octets and triplets. We provide a set of benchmarks that are representative of model space, which can be used by experimentalists to perform their own optimization of search strategies. For data sets larger than 1/fb, same-sign dilepton and 3b search regions become very powerful. Expected sensitivities from existing and optimized searches are given.

preprint2011arXiv

LHC Predictions from a Tevatron Anomaly in the Top Quark Forward-Backward Asymmetry

We examine the implications of the recent CDF measurement of the top-quark forward-backward asymmetry, focusing on a scenario with a new color octet vector boson at 1-3 TeV. We study several models, as well as a general effective field theory, and determine the parameter space which provides the best simultaneous fit to the CDF asymmetry, the Tevatron top pair production cross section, and the exclusion regions from LHC dijet resonance and contact interaction searches. Flavor constraints on these models are more subtle and less severe than the literature indicates. We find a large region of allowed parameter space at high axigluon mass and a smaller region at low mass; we match the latter to an SU(3)xSU(3)/SU(3) coset model with a heavy vector-like fermion. Our scenario produces discoverable effects at the LHC with only 1-2 inverse femtobarns of luminosity at 7-8 TeV. Lastly, we point out that a Tevatron measurement of the b-quark forward-backward asymmetry would be very helpful in characterizing the physics underlying the top-quark asymmetry.

preprint2011arXiv

Scattering States in AdS/CFT

We show that suitably regulated multi-trace primary states in large N CFTs behave like `in' and `out' scattering states in the flat-space limit of AdS. Their transition matrix elements approach the exact scattering amplitudes for the bulk theory, providing a natural CFT definition of the flat space S-Matrix. We study corrections resulting from the AdS curvature and particle propagation far from the center of AdS, and show that AdS simply provides an IR regulator that disappears in the flat space limit.

preprint2011arXiv

Simplified Models for LHC New Physics Searches

This document proposes a collection of simplified models relevant to the design of new-physics searches at the LHC and the characterization of their results. Both ATLAS and CMS have already presented some results in terms of simplified models, and we encourage them to continue and expand this effort, which supplements both signature-based results and benchmark model interpretations. A simplified model is defined by an effective Lagrangian describing the interactions of a small number of new particles. Simplified models can equally well be described by a small number of masses and cross-sections. These parameters are directly related to collider physics observables, making simplified models a particularly effective framework for evaluating searches and a useful starting point for characterizing positive signals of new physics. This document serves as an official summary of the results from the "Topologies for Early LHC Searches" workshop, held at SLAC in September of 2010, the purpose of which was to develop a set of representative models that can be used to cover all relevant phase space in experimental searches. Particular emphasis is placed on searches relevant for the first ~50-500 pb^{-1} of data and those motivated by supersymmetric models. This note largely summarizes material posted at http://lhcnewphysics.org/, which includes simplified model definitions, Monte Carlo material, and supporting contacts within the theory community. We also comment on future developments that may be useful as more data is gathered and analyzed by the experiments.

preprint2010arXiv

Discovering New Light States at Neutrino Experiments

Experiments designed to measure neutrino oscillations also provide major opportunities for discovering very weakly coupled states. In order to produce neutrinos, experiments such as LSND collide thousands of Coulombs of protons into fixed targets, while MINOS and MiniBooNE also focus and then dump beams of muons. The neutrino detectors beyond these beam dumps are therefore an excellent arena in which to look for long-lived pseudoscalars or for vector bosons that kinetically mix with the photon. We show that these experiments have significant sensitivity beyond previous beam dumps, and are able to partially close the gap between laboratory experiments and supernovae constraints on pseudoscalars. Future upgrades to the NuMI beamline and Project X will lead to even greater opportunities for discovery. We also discuss thin target experiments with muon beams, such as those available in COMPASS, and show that they constitute a powerful probe for leptophilic PNGBs.

preprint2010arXiv

On the Origin of Light Dark Matter Species

TeV-mass dark matter charged under a new GeV-scale gauge force can explain electronic cosmic-ray anomalies. We propose that the CoGeNT and DAMA direct detection experiments are observing scattering of light stable states -- "GeV-Matter" -- that are charged under this force and constitute a small fraction of the dark matter halo. Dark higgsinos in a supersymmetric dark sector are natural candidates for GeV-Matter that scatter off protons with a universal cross-section of 5 x 10^{-38} cm^2 and can naturally be split by 10-30 keV so that their dominant interaction with protons is down-scattering. As an example, down-scattering of an O(5) GeV dark higgsino can simultaneously explain the spectra observed by both CoGeNT and DAMA. The event rates in these experiments correspond to a GeV-Matter abundance of 0.2-1% of the halo mass density. This abundance can arise directly from thermal freeze-out at weak coupling, or from the late decay of an unstable TeV-scale WIMP. Our proposal can be tested by searches for exotics in the BaBar and Belle datasets.

preprint2010arXiv

What is the Simplest Quantum Field Theory?

Conventional wisdom says that the simpler the Lagrangian of a theory the simpler its perturbation theory, but an increased understanding of the structure of the S-matrix in gauge theories and gravity has been pointing to the opposite conclusion. In this paper we suggest that N=8 SUGRA has the simplest interacting S-matrix in 4D. Using Grassmann coherent states for external particles shows that amplitudes with maximal SUSY are smooth objects, with the action of SUSY manifest. We show that all tree amplitudes in N=4 SYM and N=8 SUGRA vanish at (supersymmetric) infinite complex momentum, and can thus be determined by recursion relations. We also identify the action of the non-linearly realized E_{7(7)} symmetry of N=8 SUGRA on scattering amplitudes. We give a simple discussion of the structure of 1-loop amplitudes in any QFT, in close parallel to recent work of Forde, showing that the coefficients of scalar "triangle" and "bubble" integrals are determined by the "pole at infinite momentum" of tree amplitude products appearing in cuts. The on-shell superspace for maximal SUSY makes it easy to compute the multiplet sums that arise in these cuts, leading to a simple proof of the absence of triangles and bubbles at 1-loop. We also argue that rational terms are absent. This establishes the recent conjecture that 1-loop amplitudes in N=8 SUGRA have only scalar box integrals, just as N=4 SYM. It is natural to conjecture that with maximal SUSY, amplitudes are completely determined by their leading singularities even beyond tree- and 1-loop level; this would directly imply the perturbative finiteness of N=8 SUGRA. The remarkable properties of scattering amplitudes call for an explanation in terms of a "weak-weak" dual formulation of QFT, a holographic dual of flat space.

preprint2009arXiv

A Duality For The S Matrix

We propose a dual formulation for the S Matrix of N = 4 SYM. The dual provides a basis for the "leading singularities" of scattering amplitudes to all orders in perturbation theory, which are sharply defined, IR safe data that uniquely determine the full amplitudes at tree level and 1-loop, and are conjectured to do so at all loop orders. The scattering amplitude for n particles in the sector with k negative helicity gluons is associated with a simple integral over the space of k planes in n dimensions, with the action of parity and cyclic symmetries manifest. The residues of the integrand compute a basis for the leading singularities. A given leading singularity is associated with a particular choice of integration contour, which we explicitly identify at tree level and 1-loop for all NMHV amplitudes as well as the 8 particle NNMHV amplitude. We also identify a number of 2-loop leading singularities for up to 8 particles. There are a large number of relations among residues which follow from the multi-variable generalization of Cauchy's theorem known as the "global residue theorem". These relations imply highly non-trivial identities guaranteeing the equivalence of many different representations of the same amplitude. They also enforce the cancellation of non-local poles as well as consistent infrared structure at loop level. Our conjecture connects the physics of scattering amplitudes to a particular subvariety in a Grassmannian; space-time locality is reflected in the topological properties of this space.

preprint2009arXiv

The S-Matrix in Twistor Space

The simplicity and hidden symmetries of (Super) Yang-Mills and (Super)Gravity scattering amplitudes suggest the existence of a "weak-weak" dual formulation in which these structures are made manifest at the expense of manifest locality. We suggest that this dual description lives in (2,2) signature and is naturally formulated in twistor space. We recast the BCFW recursion relations in an on-shell form that begs to be transformed into twistor space. Our twistor transformation is inspired by Witten's, but differs in treating twistor and dual twistor variables more equally. In these variables the three and four-point amplitudes are amazingly simple; the BCFW relations are represented by diagrammatic rules that precisely define the "twistor diagrams" of Andrew Hodges. The "Hodges diagrams" for Yang-Mills theory are disks and not trees; they reveal striking connections between amplitudes and suggest a new form for them in momentum space. We also obtain a twistorial formulation of gravity. All tree amplitudes can be combined into an "S-Matrix" functional which is the natural holographic observable in asymptotically flat space; the BCFW formula turns into a quadratic equation for this "S-Matrix", providing a holographic description of N=4 SYM and N=8 Supergravity at tree level. We explore loop amplitudes in (2,2) signature and twistor space, beginning with a discussion of IR behavior. We find that the natural pole prescription renders the amplitudes well-defined and free of IR divergences. Loop amplitudes vanish for generic momenta, and in twistor space are even simpler than their tree-level counterparts! This further supports the idea that there exists a sharply defined object corresponding to the S-Matrix in (2,2) signature, computed by a dual theory naturally living in twistor space.

preprint2009arXiv

Unraveling L_{n,k}: Grassmannian Kinematics

It was recently proposed that the leading singularities of the S-Matrix of N = 4 super Yang-Mills theory arise as the residues of a contour integral over a Grassmannian manifold, with space-time locality encoded through residue theorems generalizing Cauchy's theorem to more than one variable. We provide a method to identify the residue corresponding to any leading singularity, and we carry this out very explicitly for all leading singularities at tree level and one-loop. We also give several examples at higher loops, including all generic two-loop leading singularities and an interesting four-loop object. As a special case we consider a 12-pt N^4MHV leading singularity at two loops that has a new kinematic structure involving double square roots. Our analysis results in a simple picture for how the topological structure of loop graphs is reflected in various substructures within the Grassmannian.

preprint2008arXiv

On Tree Amplitudes in Gauge Theory and Gravity

The BCFW recursion relations provide a powerful way to compute tree amplitudes in gauge theories and gravity, but only hold if some amplitudes vanish when two of the momenta are taken to infinity in a particular complex direction. This is a very surprising property, since individual Feynman diagrams all diverge at infinite momentum. In this paper we give a simple physical understanding of amplitudes in this limit, which corresponds to a hard particle with (complex) light-like momentum moving in a soft background, and can be conveniently studied using the background field method exploiting background light-cone gauge. An important role is played by enhanced spin symmetries at infinite momentum--a single copy of a "Lorentz" group for gauge theory and two copies for gravity--which together with Ward identities give a systematic expansion for amplitudes at large momentum. We use this to study tree amplitudes in a wide variety of theories, and in particular demonstrate that certain pure gauge and gravity amplitudes do vanish at infinity. Thus the BCFW recursion relations can be used to compute completely general gluon and graviton tree amplitudes in any number of dimensions. We briefly comment on the implications of these results for computing massive 4D amplitudes by KK reduction, as well as understanding the unexpected cancelations that have recently been found in loop-level gravity amplitudes.

preprint2007arXiv

The Plasma Puddle as a Perturbative Black Hole

We argue that the weak coupling regime of a large N gauge theory in the Higgs phase contains black hole-like objects. These so-called ``plasma puddles'' are meta-stable lumps of hot plasma lying in locally un-Higgsed regions of space. They decay via O(1/N) thermal radiation and, perhaps surprisingly, absorb all incident matter. We show that an incident particle of energy E striking the plasma puddle will shower into an enormous number of decay products whose multiplicity grows linearly with E, and whose average energy is independent of E. Once these ultra-soft particles reach the interior they are thermalized by the plasma within, and so the object appears ``black.'' We determine some gross properties like the size and temperature of the plasma puddle in terms of fundamental parameters in the gauge theory. Interestingly, demanding that the plasma puddle emit thermal Hawking radiation implies that the object is black (i.e. absorbs all incident particles), which implies classical stability, which implies satisfaction of the Bekenstein entropy bound. Because of the AdS/CFT duality and the many similarities between plasma puddles and black holes, we conjecture that black objects are a robust feature of quantum gravity.

preprint2006arXiv

Dark Matter Generation and Split Supersymmetry

We analyze a simple Split Supersymmetry scenario where fermion masses come from anomaly mediation, yielding m_s ~ 1000 TeV, m_{3/2} ~ 100 TeV, and m_f ~ 1 TeV. We consider non-thermal dark matter production in the presence of moduli, and we find that the decay chains of moduli to LSPs and moduli to gravitinos to LSPs generate dark matter more efficiently than perturbative freeze-out, allowing for a light, LHC visible spectrum. These decaying moduli can also weaken cosmological constraints on the axion decay constant. With squark masses of order 1000 TeV, LHC gluinos will decay millimeters from their primary vertices, resulting in a striking experimental signature, and the suppression of Flavor Changing Neutral Currents is almost sufficient to allow arbitrary mixing in squark mass matrices.

42 published item(s)

Constitutional Classifiers++: Efficient Production-Grade Defenses against Universal Jailbreaks

Scaling Laws and Interpretability of Learning from Repeated Data

Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback

Scaling Laws for Transfer

A Neural Scaling Law from the Dimension of the Data Manifold

Language Models are Few-Shot Learners

Scaling Laws for Neural Language Models

A Species or Weak-Gravity Bound for Large $N$ Gauge Theories Coupled to Gravity

A Quantum Correction To Chaos

Conformal Blocks Beyond the Semi-Classical Limit

On Information Loss in AdS$_3$/CFT$_2$

Eikonalization of Conformal Blocks

Enhanced Pairing of Quantum Critical Metals Near d=3+1

Hawking from Catalan

Virasoro Conformal Blocks and Thermality from Classical Background Fields

An Effective Theory for Holographic RG Flows

Covariant Approaches to Superconformal Blocks

Slow Fermions in Quantum Critical Metals

Universality of Long-Distance AdS Physics from the CFT Bootstrap

Conformal Blocks in the Large D Limit

Decoupling of High Dimension Operators from the Low Energy Sector in Holographic Models

Non-Fermi liquid behavior of large N_B quantum critical metals

Non-Fermi liquid fixed point in a Wilsonian theory of quantum critical metals

The Analytic Bootstrap and AdS Superhorizon Locality

A New Theory of Anyons

AdS Field Theory from Conformal Field Theory

Analyticity and the Holographic S-Matrix

Unitarity and the Holographic S-Matrix

A Natural Language for AdS/CFT Correlators

Heavy Flavor Simplified Models at the LHC

LHC Predictions from a Tevatron Anomaly in the Top Quark Forward-Backward Asymmetry

Scattering States in AdS/CFT

Simplified Models for LHC New Physics Searches

Discovering New Light States at Neutrino Experiments

On the Origin of Light Dark Matter Species

What is the Simplest Quantum Field Theory?

A Duality For The S Matrix

The S-Matrix in Twistor Space

Unraveling L_{n,k}: Grassmannian Kinematics

On Tree Amplitudes in Gauge Theory and Gravity

The Plasma Puddle as a Perturbative Black Hole

Dark Matter Generation and Split Supersymmetry