Researcher profile

Jonathan Frazer

Jonathan Frazer contributes to research discovery and scholarly infrastructure.

Researcher | Affiliation not imported | Open to collaborate

Trust snapshot

Quick read

Trust 15 - Unverified | Verification L1 | Unclaimed author
3 works
0 followers
4 topics
4 close collaborators

Published work

3 published items

Preprint | 2022 | arXiv

Tranception: protein fitness prediction with autoregressive transformers and inference-time retrieval

The ability to accurately model the fitness landscape of protein sequences is critical to a wide range of applications, from quantifying the effects of human variants on disease likelihood, to predicting immune-escape mutations in viruses and designing novel biotherapeutic proteins. Deep generative models of protein sequences trained on multiple sequence alignments have been the most successful approaches so far to address these tasks. The performance of these methods is however contingent on the availability of sufficiently deep and diverse alignments for reliable training. Their potential scope is thus limited by the fact many protein families are hard, if not impossible, to align. Large language models trained on massive quantities of non-aligned protein sequences from diverse families address these problems and show potential to eventually bridge the performance gap. We introduce Tranception, a novel transformer architecture leveraging autoregressive predictions and retrieval of homologous sequences at inference to achieve state-of-the-art fitness prediction performance. Given its markedly higher performance on multiple mutants, robustness to shallow alignments and ability to score indels, our approach offers significant gain of scope over existing approaches. To enable more rigorous model testing across a broader range of protein families, we develop ProteinGym -- an extensive set of multiplexed assays of variant effects, substantially increasing both the number and diversity of assays compared to existing benchmarks.
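
As a rough sketch of the scoring idea in the abstract above: an autoregressive model assigns a sequence a likelihood one residue at a time, and a variant is commonly scored by its log-likelihood ratio against the wild type. The symbols $x^{\mathrm{mut}}$, $x^{\mathrm{wt}}$, $P_\theta$, $P_{\mathrm{retr}}$ and the mixing weight $\alpha$ are notation assumed here for illustration. The abstract does not say how the retrieved homologous sequences are combined with the autoregressive prediction, so the second line shows only one plausible form.

```latex
\log P_\theta(x) = \sum_{i=1}^{L} \log P_\theta(x_i \mid x_{<i}), \qquad
\mathrm{score}(x^{\mathrm{mut}}) = \log P_\theta(x^{\mathrm{mut}}) - \log P_\theta(x^{\mathrm{wt}})

\log P(x_i \mid x_{<i}) \approx (1-\alpha)\,\log P_\theta(x_i \mid x_{<i}) + \alpha\,\log P_{\mathrm{retr}}(x_i)
```

Because the likelihood factorizes over positions rather than over alignment columns, a score of this kind is defined for sequences of any length, which is consistent with the abstract's claim that the approach can score indels.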

Preprint | 2012 | arXiv

Inflationary perturbation theory is geometrical optics in phase space

A pressing problem in comparing inflationary models with observation is the accurate calculation of correlation functions. One approach is to evolve them using ordinary differential equations ("transport equations"), analogous to the Schwinger-Dyson hierarchy of in-out quantum field theory. We extend this approach to the complete set of momentum space correlation functions. A formal solution can be obtained using raytracing techniques adapted from geometrical optics. We reformulate inflationary perturbation theory in this language, and show that raytracing reproduces the familiar "delta N" Taylor expansion. Our method produces ordinary differential equations which allow the Taylor coefficients to be computed efficiently. We use raytracing methods to express the gauge transformation between field fluctuations and the curvature perturbation, zeta, in geometrical terms. Using these results we give a compact expression for the nonlinear gauge-transform part of fNL in terms of the principal curvatures of uniform energy-density hypersurfaces in field space.
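
For orientation, the "delta N" Taylor expansion mentioned in the abstract above is conventionally written as follows. Here $N$ is assumed to denote the number of e-folds from an initial spatially flat hypersurface to a final uniform energy-density hypersurface, $\delta\phi^\alpha$ are the field fluctuations on the initial hypersurface, and commas denote partial derivatives with respect to the fields. This index notation is a common convention and is not taken from the page itself.

```latex
\zeta = \delta N = N_{,\alpha}\,\delta\phi^\alpha
      + \frac{1}{2}\,N_{,\alpha\beta}\,\delta\phi^\alpha\,\delta\phi^\beta + \cdots
```

The Taylor coefficients $N_{,\alpha}$ and $N_{,\alpha\beta}$ are the quantities that the abstract says can be computed efficiently by solving ordinary differential equations.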

Preprint | 2011 | arXiv

Exploring a string-like landscape

We explore inflationary trajectories within randomly-generated two-dimensional potentials, considered as a toy model of the string landscape. Both the background and perturbation equations are solved numerically, the latter using the two-field formalism of Peterson and Tegmark which fully incorporates the effect of isocurvature perturbations. Sufficient inflation is a rare event, occurring for only roughly one in $10^5$ potentials. For models generating sufficient inflation, we find that the majority of runs satisfy current constraints from WMAP. The scalar spectral index is less than 1 in all runs. The tensor-to-scalar ratio is below the current limit, while typically large enough to be detected by next-generation CMB experiments and perhaps also by Planck. In many cases the inflationary consistency equation is broken by the effect of isocurvature modes.