Researcher profile

Aleksandra M. Walczak

Aleksandra M. Walczak contributes to research discovery and scholarly infrastructure.

Researcher

Affiliation not imported

Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
11 works
0 followers
8 topics
4 close collaborators

Published work

11 published items

preprint, 2022, arXiv

MINIMALIST: Mutual INformatIon Maximization for Amortized Likelihood Inference from Sampled Trajectories

Simulation-based inference enables learning the parameters of a model even when its likelihood cannot be computed in practice. One class of methods uses data simulated with different parameters to infer models of the likelihood-to-evidence ratio, or equivalently the posterior function. Here we frame the inference task as an estimation of an energy function parametrized with an artificial neural network. We present an intuitive approach where the optimal model of the likelihood-to-evidence ratio is found by maximizing the likelihood of simulated data. Within this framework, the connection between the task of simulation-based inference and mutual information maximization is clear, and we show how several known methods of posterior estimation relate to alternative lower bounds to mutual information. These distinct objective functions aim at the same optimal energy form and therefore can be directly benchmarked. We compare their accuracy in the inference of model parameters, focusing on four dynamical systems that encompass common challenges in time series analysis: dynamics driven by multiplicative noise, nonlinear interactions, chaotic behavior, and high-dimensional parameter space.
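As an illustration of what a lower bound on mutual information looks like in practice, here is a minimal sketch of the InfoNCE (contrastive) bound. The abstract does not name the bounds it compares, so choosing InfoNCE is an assumption, and the names `infonce_bound` and `scores` are hypothetical. The input is a square matrix in which `scores[i][j]` is the model's score for simulated trajectory `i` paired with parameter set `j`, with the true pairs on the diagonal.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def infonce_bound(scores):
    """InfoNCE lower bound on mutual information, in nats.

    scores[i][j] is a hypothetical model score for trajectory i paired
    with parameter set j; diagonal entries are the true pairs.
    The estimate can never exceed log(N) for an N x N matrix.
    """
    n = len(scores)
    total = 0.0
    for i, row in enumerate(scores):
        # Log-probability of picking the true partner among all candidates.
        total += row[i] - logsumexp(row)
    return total / n + math.log(n)


# An uninformative model scores every pair equally, so the bound is zero.
uninformative = [[0.0, 0.0], [0.0, 0.0]]
# A model that strongly prefers the true pairs approaches log(N).
informative = [[50.0, 0.0], [0.0, 50.0]]
print(infonce_bound(uninformative), infonce_bound(informative))
```

Because the bound saturates at log(N), larger batches are needed to estimate larger amounts of mutual information.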

preprint, 2022, arXiv

NoisET: Noise learning and Expansion detection of T-cell receptors

High-throughput sequencing of T- and B-cell receptors makes it possible to track immune repertoires across time, in different tissues, in acute and chronic diseases and in healthy individuals. However, quantitative comparison between repertoires is confounded by variability in the read count of each receptor clonotype due to sampling, library preparation, and expression noise. We review methods for accounting for both biological and experimental noise and present an easy-to-use Python package NoisET that implements and generalizes a previously developed Bayesian method. It can be used to learn experimental noise models for repertoire sequencing from replicates, and to detect responding clones following a stimulus. We test the package on different repertoire sequencing technologies and datasets. We review how such approaches have been used to identify responding clonotypes in vaccination and disease data. Availability: NoisET is freely available to use with source code at github.com/statbiophys/NoisET.

preprint, 2021, arXiv

Affinity maturation for an optimal balance between long-term immune coverage and short-term resource constraints

In order to target threatening pathogens, the adaptive immune system performs a continuous reorganization of its lymphocyte repertoire. Following an immune challenge, the B cell repertoire can evolve cells of increased specificity for the encountered strain. This process of affinity maturation generates a memory pool whose diversity and size remain difficult to predict. We assume that the immune system follows a strategy that maximizes the long-term immune coverage and minimizes the short-term metabolic costs associated with affinity maturation. This strategy is defined as an optimal decision process on a finite dimensional phenotypic space, where a pre-existing population of naive cells is sequentially challenged with a neutrally evolving strain. We unveil a trade-off between immune protection against future strains and the necessary reorganization of the repertoire. This plasticity of the repertoire drives the emergence of distinct regimes for the size and diversity of the memory pool, depending on the density of naive cells and on the mutation rate of the strain. The model predicts power-law distributions of clonotype sizes observed in data, and rationalizes antigenic imprinting as a strategy to minimize metabolic costs while keeping good immune protection against future strains.

preprint, 2021, arXiv

Antigenic waves of virus-immune co-evolution

The evolution of many microbes and pathogens, including circulating viruses such as seasonal influenza, is driven by immune pressure from the host population. In turn, the immune systems of infected populations get updated, chasing viruses even further away. Quantitatively understanding how these dynamics result in observed patterns of rapid pathogen and immune adaptation is instrumental to epidemiological and evolutionary forecasting. Here we present a mathematical theory of co-evolution between immune systems and viruses in a finite-dimensional antigenic space, which describes the cross-reactivity of viral strains and immune systems primed by previous infections. We show the emergence of an antigenic wave that is pushed forward and canalized by cross-reactivity. We obtain analytical results for shape, speed, and angular diffusion of the wave. In particular, we show that viral-immune co-evolution generates a new emergent timescale, the persistence time of the wave's direction in antigenic space, which can be much longer than the coalescence time of the viral population. We compare these dynamics to the observed antigenic turnover of influenza strains, and we discuss how the dimensionality of antigenic space impacts on the predictability of the evolutionary dynamics. Our results provide a concrete and tractable framework to describe pathogen-host co-evolution.

preprint, 2020, arXiv

Building general Langevin models from discrete data sets

Many living and complex systems exhibit second order emergent dynamics. Limited experimental access to the configurational degrees of freedom results in data that appears to be generated by a non-Markovian process. This poses a challenge in the quantitative reconstruction of the model from experimental data, even in the simple case of equilibrium Langevin dynamics of Hamiltonian systems. We develop a novel Bayesian inference approach to learn the parameters of such stochastic effective models from discrete finite length trajectories. We first discuss the failure of naive inference approaches based on the estimation of derivatives through finite differences, regardless of the time resolution and the length of the sampled trajectories. We then derive, adopting higher order discretization schemes, maximum likelihood estimators for the model parameters that provide excellent results even with moderately long trajectories. We apply our method to second order models of collective motion and show that our results also hold in the presence of interactions.

preprint, 2020, arXiv

Immune Fingerprinting through Repertoire Similarity

Immune repertoires provide a unique fingerprint reflecting the immune history of individuals, with potential applications in precision medicine. However, the question of how personal that information is and how it can be used to identify individuals has not been explored. Here, we show that individuals can be uniquely identified from repertoires of just a few thousand lymphocytes. We present "Immprint," a classifier using an information-theoretic measure of repertoire similarity to distinguish pairs of repertoire samples coming from the same versus different individuals. Using published T-cell receptor repertoires and statistical modeling, we tested its ability to identify individuals with great accuracy, including identical twins, by computing false positive and false negative rates $< 10^{-6}$ from samples composed of 10,000 T-cells. We verified through longitudinal datasets and simulations that the method is robust to acute infections and the passage of time. These results emphasize the private and personal nature of repertoire data.
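The general idea of telling same-donor from different-donor sample pairs can be sketched with a much simpler stand-in statistic: the number of distinct receptor sequences shared by two samples. This is not the information-theoretic measure the abstract refers to. The function names, the toy sequences, and the threshold below are all hypothetical and chosen only for illustration.

```python
def shared_clonotypes(sample_a, sample_b):
    """Count distinct receptor sequences present in both samples."""
    return len(set(sample_a) & set(sample_b))


def same_individual(sample_a, sample_b, threshold):
    """Call two samples 'same donor' if they share at least `threshold`
    distinct sequences. The threshold is a hypothetical tuning parameter
    that would have to be calibrated on real data."""
    return shared_clonotypes(sample_a, sample_b) >= threshold


# Toy amino-acid sequences, made up for this example.
sample_1 = ["CASSLGQ", "CASSPGT", "CASRTGE"]
sample_2 = ["CASSPGT", "CASRTGE", "CAWSVGN"]
print(shared_clonotypes(sample_1, sample_2))
print(same_individual(sample_1, sample_2, threshold=2))
```

Counting raw overlap ignores how likely each sequence is to be shared by chance between unrelated people, which is the kind of information a weighted similarity measure would take into account.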

preprint, 2020, arXiv

Learning the heterogeneous hypermutation landscape of immunoglobulins from high-throughput repertoire data

Somatic hypermutations of immunoglobulin (Ig) genes occurring during affinity maturation drive B-cell receptors' ability to evolve strong binding to their antigenic targets. The landscape of these mutations is highly heterogeneous, with certain regions of the Ig gene being preferentially targeted. However, a rigorous quantification of this bias has been difficult because of phylogenetic correlations between sequences and the interference of selective forces. Here, we present an approach that corrects for these issues, and use it to learn a model of hypermutation preferences from a recently published large IgH repertoire dataset. The obtained model predicts mutation profiles accurately and in a reproducible way, including in the previously uncharacterized Complementarity Determining Region 3, revealing that both the sequence context of the mutation and its absolute position along the gene are important. In addition, we show that hypermutations occurring concomitantly along B-cell lineages tend to co-localize, suggesting a possible mechanism for accelerating affinity maturation.

preprint, 2020, arXiv

Longitudinal high-throughput TCR repertoire profiling reveals the dynamics of T cell memory formation after mild COVID-19 infection

COVID-19 is a global pandemic caused by the SARS-CoV-2 coronavirus. T cells play a key role in the adaptive antiviral immune response by killing infected cells and facilitating the selection of virus-specific antibodies. However, neither the dynamics and cross-reactivity of the SARS-CoV-2-specific T cell response nor the diversity of resulting immune memory are well understood. In this study we use longitudinal high-throughput T cell receptor (TCR) sequencing to track changes in the T cell repertoire following two mild cases of COVID-19. In both donors we identified CD4+ and CD8+ T cell clones with transient clonal expansion after infection. The antigen specificity of CD8+ TCR sequences to SARS-CoV-2 epitopes was confirmed by both MHC tetramer binding and presence in a large database of SARS-CoV-2 epitope-specific TCRs. We describe characteristic motifs in TCR sequences of COVID-19-reactive clones and show preferential occurrence of these motifs in a publicly available large dataset of repertoires from COVID-19 patients. We show that in both donors the majority of infection-reactive clonotypes acquire memory phenotypes. Certain T cell clones were detected in the memory fraction at the pre-infection timepoint, suggesting participation of pre-existing cross-reactive memory T cells in the immune response to SARS-CoV-2.

preprint, 2020, arXiv

On generative models of T-cell receptor sequences

T-cell receptors (TCR) are key proteins of the adaptive immune system, generated randomly in each individual, whose diversity underlies our ability to recognize infections and malignancies. Modeling the distribution of TCR sequences is of key importance for immunology and medical applications. Here, we compare two inference methods trained on high-throughput sequencing data: a knowledge-guided approach, which accounts for the details of sequence generation, supplemented by a physics-inspired model of selection; and a knowledge-free Variational Auto-Encoder based on deep artificial neural networks. We show that the knowledge-guided model outperforms the deep network approach at predicting TCR probabilities, while being more interpretable, at a lower computational cost.

preprint, 2020, arXiv

Population variability in the generation and thymic selection of T-cell repertoires

The diversity of T-cell receptor (TCR) repertoires is achieved by a combination of two intrinsically stochastic steps: random receptor generation by VDJ recombination, and selection based on the recognition of random self-peptides presented on the major histocompatibility complex. These processes lead to a large receptor variability within and between individuals. However, the characterization of the variability is hampered by the limited size of the sampled repertoires. We introduce a new software tool SONIA to facilitate inference of individual-specific computational models for the generation and selection of the TCR beta chain (TRB) from sequenced repertoires of 651 individuals, separating and quantifying the variability of the two processes of generation and selection in the population. We find not only that most of the variability is driven by the VDJ generation process, but also that there is a large degree of consistency between individuals, with the inter-individual variance of repertoires being about 2% of the intra-individual variance. Known viral-specific TCRs follow the same generation and selection statistics as all TCRs.

preprint, 2010, arXiv

Telling time with an intrinsically noisy clock

Intracellular transmission of information via chemical and transcriptional networks is thwarted by a physical limitation: the finite copy number of the constituent chemical species introduces unavoidable intrinsic noise. Here we provide a method for solving for the complete probabilistic description of intrinsically noisy oscillatory driving. We derive and numerically verify a number of simple scaling laws. Unlike in the case of measuring a static quantity, response to an oscillatory driving can exhibit a resonant frequency which maximizes information transmission. Further, we show that the optimal regulatory design is dependent on the biophysical constraints (i.e., the allowed copy number and response time). The resulting phase diagram illustrates under what conditions threshold regulation outperforms linear regulation.