Researcher profile

Thierry Mora

Thierry Mora contributes to research discovery and scholarly infrastructure.

Researcher · Affiliation not imported · Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging · Verification L1 · Unclaimed author
19 works
0 followers
12 topics
4 close collaborators

Published work

19 published items

preprint · 2022 · arXiv

Inspecting the interaction between HIV and the immune system through genetic turnover

Chronic infections of the human immunodeficiency virus (HIV) create a very complex co-evolutionary process, where the virus tries to escape the continuously adapting host immune system. Quantitative details of this process are largely unknown and could help in disease treatment and vaccine development. Here we study a longitudinal dataset of ten HIV-infected people, where both the B-cell receptors and the virus are deeply sequenced. We focus on simple measures of turnover, which quantify how much the composition of the viral strains and the immune repertoire change between time points. At the single-patient level, the viral-host turnover rates do not show any statistically significant correlation; however, they correlate if the information is aggregated across patients. In particular, we identify an anti-correlation: large changes in the viral pool composition come with small changes in the B-cell receptor repertoire. This result seems to contradict the naive expectation that when the virus mutates quickly, the immune repertoire needs to change to keep up. However, we show that the observed anti-correlation naturally emerges and can be understood in terms of simple population-genetics models.
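As a minimal sketch of what a turnover measure between two time points could look like, the Python snippet below computes half the L1 distance between two clone-frequency distributions. This particular measure, the function name `turnover`, and the toy counts are assumptions made for illustration; the abstract does not specify which measures the preprint uses.

```python
from collections import Counter

def turnover(counts_t1, counts_t2):
    """Half the L1 distance between two clone-frequency distributions.

    Returns 0.0 when the composition is unchanged and 1.0 when the two
    time points share no clones. Illustrative measure only.
    """
    n1 = sum(counts_t1.values())
    n2 = sum(counts_t2.values())
    if n1 == 0 or n2 == 0:
        raise ValueError("both samples must contain at least one sequence")
    clones = set(counts_t1) | set(counts_t2)
    return 0.5 * sum(
        abs(counts_t1.get(c, 0) / n1 - counts_t2.get(c, 0) / n2) for c in clones
    )

# Hypothetical clone counts at two time points.
early = Counter({"cloneA": 50, "cloneB": 30, "cloneC": 20})
late = Counter({"cloneA": 20, "cloneB": 30, "cloneD": 50})
print(turnover(early, late))
```

Applying the same function to viral strain counts and to B-cell receptor clone counts would give one turnover value per population and time interval, which could then be compared across patients.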

preprint · 2022 · arXiv

MINIMALIST: Mutual INformatIon Maximization for Amortized Likelihood Inference from Sampled Trajectories

Simulation-based inference enables learning the parameters of a model even when its likelihood cannot be computed in practice. One class of methods uses data simulated with different parameters to infer models of the likelihood-to-evidence ratio, or equivalently the posterior function. Here we frame the inference task as an estimation of an energy function parametrized with an artificial neural network. We present an intuitive approach where the optimal model of the likelihood-to-evidence ratio is found by maximizing the likelihood of simulated data. Within this framework, the connection between the task of simulation-based inference and mutual information maximization is clear, and we show how several known methods of posterior estimation relate to alternative lower bounds to mutual information. These distinct objective functions aim at the same optimal energy form and therefore can be directly benchmarked. We compare their accuracy in the inference of model parameters, focusing on four dynamical systems that encompass common challenges in time series analysis: dynamics driven by multiplicative noise, nonlinear interactions, chaotic behavior, and high-dimensional parameter space.

preprint · 2022 · arXiv

NoisET: Noise learning and Expansion detection of T-cell receptors

High-throughput sequencing of T- and B-cell receptors makes it possible to track immune repertoires across time, in different tissues, in acute and chronic diseases and in healthy individuals. However, quantitative comparison between repertoires is confounded by variability in the read count of each receptor clonotype due to sampling, library preparation, and expression noise. We review methods for accounting for both biological and experimental noise and present an easy-to-use Python package NoisET that implements and generalizes a previously developed Bayesian method. It can be used to learn experimental noise models for repertoire sequencing from replicates, and to detect responding clones following a stimulus. We test the package on different repertoire sequencing technologies and datasets. We review how such approaches have been used to identify responding clonotypes in vaccination and disease data. Availability: NoisET is freely available to use with source code at github.com/statbiophys/NoisET.

preprint · 2021 · arXiv

A Renormalization Group Approach to Connect Discrete- and Continuous-Time Descriptions of Gaussian Processes

Discretization of continuous stochastic processes is needed to numerically simulate them or to infer models from experimental time series. However, depending on the nature of the process, the same discretization scheme, if not accurate enough, may perform very differently for the two tasks. Exact discretizations, which work equally well at any scale, are characterized by the property of invariance under coarse-graining. Motivated by this observation, we build an explicit Renormalization Group approach for Gaussian time series generated by auto-regressive models. We show that the RG fixed points correspond to discretizations of linear SDEs, and only come in the form of first order Markov processes or non-Markovian ones. This fact provides an alternative explanation of why standard delay-vector embedding procedures fail in reconstructing partially observed noise-driven systems. We also suggest a possible effective Markovian discretization for the inference of partially observed underdamped equilibrium processes based on the exploitation of the Einstein relation.
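The invariance under coarse-graining mentioned above can be checked by hand in the simplest case, a first-order autoregressive process x[t+1] = a·x[t] + noise. The sketch below assumes the standard exact discretization of an Ornstein-Uhlenbeck process dx = -γ x dt + σ dW and contrasts it with an Euler scheme; the function names and parameter values are illustrative and not taken from the preprint.

```python
import math

def ou_exact_ar1(gamma, sigma, dt):
    """Exact AR(1) discretization of dx = -gamma*x dt + sigma dW.

    Returns (a, noise_var) such that x[t+dt] = a*x[t] + N(0, noise_var).
    """
    a = math.exp(-gamma * dt)
    noise_var = sigma**2 / (2.0 * gamma) * (1.0 - a**2)
    return a, noise_var

def euler_ar1(gamma, sigma, dt):
    """Euler-Maruyama discretization of the same SDE (first order in dt)."""
    return 1.0 - gamma * dt, sigma**2 * dt

def decimate_ar1(a, noise_var):
    """Keep every second point of an AR(1) series.

    x[t+2] = a^2 x[t] + a*xi[t] + xi[t+1], so the result is again AR(1)
    with coefficient a^2 and noise variance (1 + a^2) * noise_var.
    """
    return a**2, (1.0 + a**2) * noise_var

gamma, sigma, dt = 0.7, 1.3, 0.05  # illustrative values

# Exact scheme: coarse-graining at step dt matches discretizing at 2*dt.
coarse = decimate_ar1(*ou_exact_ar1(gamma, sigma, dt))
direct = ou_exact_ar1(gamma, sigma, 2 * dt)
print("exact:", coarse, direct)

# Euler scheme: the two routes disagree at order dt^2.
print("euler:", decimate_ar1(*euler_ar1(gamma, sigma, dt)),
      euler_ar1(gamma, sigma, 2 * dt))
```

The exact scheme maps onto itself under decimation, which is the fixed-point property in this toy setting, while the Euler scheme does not.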

preprint · 2021 · arXiv

Affinity maturation for an optimal balance between long-term immune coverage and short-term resource constraints

In order to target threatening pathogens, the adaptive immune system performs a continuous reorganization of its lymphocyte repertoire. Following an immune challenge, the B cell repertoire can evolve cells of increased specificity for the encountered strain. This process of affinity maturation generates a memory pool whose diversity and size remain difficult to predict. We assume that the immune system follows a strategy that maximizes the long-term immune coverage and minimizes the short-term metabolic costs associated with affinity maturation. This strategy is defined as an optimal decision process on a finite dimensional phenotypic space, where a pre-existing population of naive cells is sequentially challenged with a neutrally evolving strain. We unveil a trade-off between immune protection against future strains and the necessary reorganization of the repertoire. This plasticity of the repertoire drives the emergence of distinct regimes for the size and diversity of the memory pool, depending on the density of naive cells and on the mutation rate of the strain. The model predicts power-law distributions of clonotype sizes observed in data, and rationalizes antigenic imprinting as a strategy to minimize metabolic costs while keeping good immune protection against future strains.

preprint · 2021 · arXiv

Antigenic waves of virus-immune co-evolution

The evolution of many microbes and pathogens, including circulating viruses such as seasonal influenza, is driven by immune pressure from the host population. In turn, the immune systems of infected populations get updated, chasing viruses even further away. Quantitatively understanding how these dynamics result in observed patterns of rapid pathogen and immune adaptation is instrumental to epidemiological and evolutionary forecasting. Here we present a mathematical theory of co-evolution between immune systems and viruses in a finite-dimensional antigenic space, which describes the cross-reactivity of viral strains and immune systems primed by previous infections. We show the emergence of an antigenic wave that is pushed forward and canalized by cross-reactivity. We obtain analytical results for shape, speed, and angular diffusion of the wave. In particular, we show that viral-immune co-evolution generates a new emergent timescale, the persistence time of the wave's direction in antigenic space, which can be much longer than the coalescence time of the viral population. We compare these dynamics to the observed antigenic turnover of influenza strains, and we discuss how the dimensionality of antigenic space impacts on the predictability of the evolutionary dynamics. Our results provide a concrete and tractable framework to describe pathogen-host co-evolution.

preprint · 2020 · arXiv

Building general Langevin models from discrete data sets

Many living and complex systems exhibit second order emergent dynamics. Limited experimental access to the configurational degrees of freedom results in data that appears to be generated by a non-Markovian process. This poses a challenge in the quantitative reconstruction of the model from experimental data, even in the simple case of equilibrium Langevin dynamics of Hamiltonian systems. We develop a novel Bayesian inference approach to learn the parameters of such stochastic effective models from discrete finite length trajectories. We first discuss the failure of naive inference approaches based on the estimation of derivatives through finite differences, regardless of the time resolution and the length of the sampled trajectories. We then derive, adopting higher order discretization schemes, maximum likelihood estimators for the model parameters that provide excellent results even with moderately long trajectories. We apply our method to second order models of collective motion and show that our results also hold in the presence of interactions.

preprint · 2020 · arXiv

Fierce selection and interference in B-cell repertoire response to chronic HIV-1

During chronic infection, HIV-1 engages in a rapid coevolutionary arms race with the host's adaptive immune system. While it is clear that HIV exerts strong selection on the adaptive immune system, the characteristics of the somatic evolution that shape the immune response are still unknown. Traditional population genetics methods fail to distinguish chronic immune response from healthy repertoire evolution. Here, we infer the evolutionary modes of B-cell repertoires and identify complex dynamics with a constant production of better B-cell receptor mutants that compete, maintaining large clonal diversity and potentially slowing down adaptation. A substantial fraction of mutations that rise to high frequencies in pathogen engaging CDRs of B-cell receptors (BCRs) are beneficial, in contrast to many such changes in structurally relevant frameworks that are deleterious and circulate by hitchhiking. We identify a pattern where BCRs in patients who experience larger viral expansions undergo stronger selection with a rapid turnover of beneficial mutations due to clonal interference in their CDR3 regions. Using population genetics modeling, we show that the extinction of these beneficial mutations can be attributed to the rise of competing beneficial alleles and clonal interference. The picture is of a dynamic repertoire, where better clones may be outcompeted by new mutants before they fix.

preprint · 2020 · arXiv

Immune Fingerprinting through Repertoire Similarity

Immune repertoires provide a unique fingerprint reflecting the immune history of individuals, with potential applications in precision medicine. However, the question of how personal that information is and how it can be used to identify individuals has not been explored. Here, we show that individuals can be uniquely identified from repertoires of just a few thousand lymphocytes. We present "Immprint," a classifier using an information-theoretic measure of repertoire similarity to distinguish pairs of repertoire samples coming from the same versus different individuals. Using published T-cell receptor repertoires and statistical modeling, we tested its ability to identify individuals with great accuracy, including identical twins, by computing false positive and false negative rates $< 10^{-6}$ from samples composed of 10,000 T-cells. We verified through longitudinal datasets and simulations that the method is robust to acute infections and the passage of time. These results emphasize the private and personal nature of repertoire data.

preprint · 2020 · arXiv

Learning the heterogeneous hypermutation landscape of immunoglobulins from high-throughput repertoire data

Somatic hypermutations of immunoglobulin (Ig) genes occurring during affinity maturation drive B-cell receptors' ability to evolve strong binding to their antigenic targets. The landscape of these mutations is highly heterogeneous, with certain regions of the Ig gene being preferentially targeted. However, a rigorous quantification of this bias has been difficult because of phylogenetic correlations between sequences and the interference of selective forces. Here, we present an approach that corrects for these issues, and use it to learn a model of hypermutation preferences from a recently published large IgH repertoire dataset. The obtained model predicts mutation profiles accurately and in a reproducible way, including in the previously uncharacterized Complementarity Determining Region 3, revealing that both the sequence context of the mutation and its absolute position along the gene are important. In addition, we show that hypermutations occurring concomitantly along B-cell lineages tend to co-localize, suggesting a possible mechanism for accelerating affinity maturation.

preprint · 2020 · arXiv

Longitudinal high-throughput TCR repertoire profiling reveals the dynamics of T cell memory formation after mild COVID-19 infection

COVID-19 is a global pandemic caused by the SARS-CoV-2 coronavirus. T cells play a key role in the adaptive antiviral immune response by killing infected cells and facilitating the selection of virus-specific antibodies. However, neither the dynamics and cross-reactivity of the SARS-CoV-2-specific T cell response nor the diversity of resulting immune memory are well understood. In this study we use longitudinal high-throughput T cell receptor (TCR) sequencing to track changes in the T cell repertoire following two mild cases of COVID-19. In both donors we identified CD4+ and CD8+ T cell clones with transient clonal expansion after infection. The antigen specificity of CD8+ TCR sequences to SARS-CoV-2 epitopes was confirmed by both MHC tetramer binding and presence in a large database of SARS-CoV-2 epitope-specific TCRs. We describe characteristic motifs in TCR sequences of COVID-19-reactive clones and show preferential occurrence of these motifs in a publicly available large dataset of repertoires from COVID-19 patients. We show that in both donors the majority of infection-reactive clonotypes acquire memory phenotypes. Certain T cell clones were detected in the memory fraction at the pre-infection timepoint, suggesting participation of pre-existing cross-reactive memory T cells in the immune response to SARS-CoV-2.

preprint · 2020 · arXiv

On generative models of T-cell receptor sequences

T-cell receptors (TCR) are key proteins of the adaptive immune system, generated randomly in each individual, whose diversity underlies our ability to recognize infections and malignancies. Modeling the distribution of TCR sequences is of key importance for immunology and medical applications. Here, we compare two inference methods trained on high-throughput sequencing data: a knowledge-guided approach, which accounts for the details of sequence generation, supplemented by a physics-inspired model of selection; and a knowledge-free Variational Auto-Encoder based on deep artificial neural networks. We show that the knowledge-guided model outperforms the deep network approach at predicting TCR probabilities, while being more interpretable, at a lower computational cost.

preprint · 2020 · arXiv

Population variability in the generation and thymic selection of T-cell repertoires

The diversity of T-cell receptor (TCR) repertoires is achieved by a combination of two intrinsically stochastic steps: random receptor generation by VDJ recombination, and selection based on the recognition of random self-peptides presented on the major histocompatibility complex. These processes lead to a large receptor variability within and between individuals. However, the characterization of the variability is hampered by the limited size of the sampled repertoires. We introduce a new software tool SONIA to facilitate inference of individual-specific computational models for the generation and selection of the TCR beta chain (TRB) from sequenced repertoires of 651 individuals, separating and quantifying the variability of the two processes of generation and selection in the population. We find not only that most of the variability is driven by the VDJ generation process, but also that there is a large degree of consistency between individuals, with the inter-individual variance of repertoires being about 2% of the intra-individual variance. Known viral-specific TCRs follow the same generation and selection statistics as all TCRs.

preprint · 2012 · arXiv

Statistical inference of the generation probability of T-cell receptors from sequence repertoires

Stochastic rearrangement of germline DNA by VDJ recombination is at the origin of immune system diversity. This process is implemented via a series of stochastic molecular events involving gene choices and random nucleotide insertions between, and deletions from, genes. We use large sequence repertoires of the variable CDR3 region of human CD4+ T-cell receptor beta chains to infer the statistical properties of these basic biochemical events. Since any given CDR3 sequence can be produced in multiple ways, the probability distribution of hidden recombination events cannot be inferred directly from the observed sequences; we therefore develop a maximum likelihood inference method to achieve this end. To separate the properties of the molecular rearrangement mechanism from the effects of selection, we focus on non-productive CDR3 sequences in T-cell DNA. We infer the joint distribution of the various generative events that occur when a new T-cell receptor gene is created. We find a rich picture of correlation (and absence thereof), providing insight into the molecular mechanisms involved. The generative event statistics are consistent between individuals, suggesting a universal biochemical process. Our distribution predicts the generation probability of any specific CDR3 sequence by the primitive recombination process, allowing us to quantify the potential diversity of the T-cell repertoire and to understand why some sequences are shared between individuals. We argue that the use of formal statistical inference methods, of the kind presented in this paper, will be essential for quantitative understanding of the generation and evolution of diversity in the adaptive immune system.
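To make the point about sequences being "produced in multiple ways" concrete, here is a toy sketch in Python. It assumes a hypothetical model with a single V and a single J segment, independent deletion and insertion distributions, and uniformly random inserted nucleotides. All germline sequences and probabilities are invented for illustration, and the preprint's actual model is richer (gene choice, D segments, correlations between events). The generation probability of a sequence is the sum over every hidden scenario that produces it.

```python
from itertools import product

# Hypothetical germline segments and event distributions (toy values).
V_GENE, J_GENE = "AC", "CT"
P_DEL_V = {0: 0.6, 1: 0.4}         # bases deleted from the 3' end of V
P_DEL_J = {0: 0.7, 1: 0.3}         # bases deleted from the 5' end of J
P_INS = {0: 0.5, 1: 0.3, 2: 0.2}   # number of inserted bases

def pgen(seq):
    """Sum the probabilities of all (delV, delJ, insertion) scenarios yielding seq."""
    total = 0.0
    for (dv, p_dv), (dj, p_dj) in product(P_DEL_V.items(), P_DEL_J.items()):
        v_part = V_GENE[: len(V_GENE) - dv]
        j_part = J_GENE[dj:]
        n_ins = len(seq) - len(v_part) - len(j_part)
        if n_ins not in P_INS:
            continue  # this scenario cannot produce a sequence of this length
        if seq.startswith(v_part) and seq.endswith(j_part):
            # The inserted bases are fixed by seq; each has probability 1/4.
            total += p_dv * p_dj * P_INS[n_ins] * 0.25 ** n_ins
    return total

# "ACT" arises from three distinct scenarios in this toy model.
print(pgen("ACT"))
```

In this toy setting the sum can be enumerated exhaustively. With many genes and long insertions the scenario space grows quickly, which is one reason a dedicated inference method is needed to learn the event distributions from data.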

preprint · 2012 · arXiv

Transition path sampling algorithm for discrete many-body systems

We propose a new Monte Carlo method for efficiently sampling trajectories with fixed initial and final conditions in a system with discrete degrees of freedom. The method can be applied to any stochastic process with local interactions, including systems that are out of equilibrium. We combine the proposed path-sampling algorithm with thermodynamic integration to calculate transition rates. We demonstrate our method on the well studied 2D Ising model with periodic boundary conditions, and show agreement with other results both for large and small system sizes. The method scales well with the system size, allowing one to simulate systems with many degrees of freedom, and providing complementary information with respect to other algorithms.

preprint · 2010 · arXiv

Are biological systems poised at criticality?

Many of life's most fascinating phenomena emerge from interactions among many elements--many amino acids determine the structure of a single protein, many genes determine the fate of a cell, many neurons are involved in shaping our thoughts and memories. Physicists have long hoped that these collective behaviors could be described using the ideas and methods of statistical mechanics. In the past few years, new, larger scale experiments have made it possible to construct statistical mechanics models of biological systems directly from real data. We review the surprising successes of this "inverse" approach, using examples from families of proteins, networks of neurons, and flocks of birds. Remarkably, in all these cases the models that emerge from the data are poised at a very special point in their parameter space--a critical point. This suggests there may be some deeper theoretical principle behind the behavior of these diverse systems.

preprint · 2010 · arXiv

Limits of sensing temporal concentration changes by single cells

Berg and Purcell [Biophys. J. 20, 193 (1977)] calculated how the accuracy of concentration sensing by single-celled organisms is limited by noise from the small number of counted molecules. Here we generalize their results to the sensing of concentration ramps, which is often the biologically relevant situation (e.g. during bacterial chemotaxis). We calculate lower bounds on the uncertainty of ramp sensing by three measurement devices: a single receptor, an absorbing sphere, and a monitoring sphere. We contrast two strategies, simple linear regression of the input signal versus maximum likelihood estimation, and show that the latter can be twice as accurate as the former. Finally, we consider biological implementations of these two strategies, and identify possible signatures that maximum likelihood estimation is implemented by real biological systems.

preprint · 2009 · arXiv

Maximum entropy models for antibody diversity

Recognition of pathogens relies on families of proteins showing great diversity. Here we construct maximum entropy models of the sequence repertoire, building on recent experiments that provide a nearly exhaustive sampling of the IgM sequences in zebrafish. These models are based solely on pairwise correlations between residue positions, but correctly capture the higher order statistical properties of the repertoire. Exploiting the interpretation of these models as statistical physics problems, we make several predictions for the collective properties of the sequence ensemble: the distribution of sequences obeys Zipf's law, the repertoire decomposes into several clusters, and there is a massive restriction of diversity due to the correlations. These predictions are completely inconsistent with models in which amino acid substitutions are made independently at each site, and are in good agreement with the data. Our results suggest that antibody diversity is not limited by the sequences encoded in the genome, and may reflect rapid adaptation to antigenic challenges. This approach should be applicable to the study of the global properties of other protein families.

preprint · 2008 · arXiv

Thermodynamics of natural images

The scale invariance of natural images suggests an analogy to the statistical mechanics of physical systems at a critical point. Here we examine the distribution of pixels in small image patches and show how to construct the corresponding thermodynamics. We find evidence for criticality in a diverging specific heat, which corresponds to large fluctuations in how "surprising" we find individual images, and in the quantitative form of the entropy vs. energy. The energy landscape derived from our thermodynamic framework identifies special image configurations that have intrinsic error correcting properties, and neurons which could detect these features have a strong resemblance to the cells found in primary visual cortex.