Researcher profile

Michael Sørensen

Michael Sørensen contributes to research discovery and scholarly infrastructure.

Researcher | Affiliation not imported | Open to collaborate

Trust snapshot

Quick read

Trust 19 (Unverified) | Verification L1 | Unclaimed author
5 works
0 followers
5 topics
4 close collaborators

Published work

5 published items

Preprint, 2020, arXiv

A generative angular model of protein structure evolution

Recently described stochastic models of protein evolution have demonstrated that the inclusion of structural information in addition to amino acid sequences leads to a more reliable estimation of evolutionary parameters. We present a generative, evolutionary model of protein structure and sequence that is valid on a local length scale. The model concerns the local dependencies between sequence and structure evolution in a pair of homologous proteins. The evolutionary trajectory between the two structures in the protein pair is treated as a random walk in dihedral angle space, which is modelled using a novel angular diffusion process on the two-dimensional torus. Coupling sequence and structure evolution in our model allows for modelling both "smooth" conformational changes and "catastrophic" conformational jumps, conditioned on the amino acid changes. The model has interpretable parameters and is comparatively more realistic than previous stochastic models, providing new insights into the relationship between sequence and structure evolution. For example, using the trained model we were able to identify an apparent sequence-structure evolutionary motif present in a large number of homologous protein pairs. The generative nature of our model enables us to evaluate its validity and its ability to simulate aspects of protein evolution conditioned on an amino acid sequence, a related amino acid sequence, a related structure or any combination thereof.

Preprint, 2020, arXiv

Langevin diffusions on the torus: estimation and applications

We introduce stochastic models for continuous-time evolution of angles and develop their estimation. We focus on studying Langevin diffusions with stationary distributions equal to well-known distributions from directional statistics, since such diffusions can be regarded as toroidal analogues of the Ornstein-Uhlenbeck process. Their likelihood function is a product of transition densities with no analytical expression, but that can be calculated by solving the Fokker-Planck equation numerically through adequate schemes. We propose three approximate likelihoods that are computationally tractable: (i) a likelihood based on the stationary distribution; (ii) toroidal adaptations of the Euler and Shoji-Ozaki pseudo-likelihoods; (iii) a likelihood based on a specific approximation to the transition density of the wrapped normal process. A simulation study compares, in dimensions one and two, the approximate transition densities to the exact ones, and investigates the empirical performance of the approximate likelihoods. Finally, two diffusions are used to model the evolution of the backbone angles of the protein G (PDB identifier 1GB1) during a molecular dynamics simulation. The software package sdetorus implements the estimation methods and applications presented in the paper.
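To make the idea of a toroidal Euler pseudo-likelihood concrete, the following is a minimal one-dimensional sketch, not the paper's implementation and not the sdetorus API. It assumes a Langevin diffusion on the circle with a von Mises stationary distribution, written as dΘ = α sin(μ − Θ) dt + σ dW, and approximates each transition by a wrapped normal with mean Θ + α sin(μ − Θ)Δ and variance σ²Δ. The function names, the truncation level `k_max`, and the parameter values are all illustrative assumptions.

```python
import math
import random


def wrap(theta):
    """Wrap an angle to the interval [-pi, pi)."""
    return (theta + math.pi) % (2.0 * math.pi) - math.pi


def drift(theta, alpha, mu):
    """Drift of the assumed von Mises Langevin diffusion on the circle."""
    return alpha * math.sin(mu - theta)


def simulate_von_mises_diffusion(theta0, alpha, mu, sigma, dt, n_steps, rng):
    """Euler-Maruyama simulation, wrapping the angle after every step."""
    path = [wrap(theta0)]
    for _ in range(n_steps):
        th = path[-1]
        step = drift(th, alpha, mu) * dt + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        path.append(wrap(th + step))
    return path


def wrapped_normal_pdf(theta, mean, sd, k_max=3):
    """Wrapped normal density, truncating the winding sum at |k| <= k_max."""
    total = 0.0
    for k in range(-k_max, k_max + 1):
        z = (theta - mean + 2.0 * math.pi * k) / sd
        total += math.exp(-0.5 * z * z)
    return total / (sd * math.sqrt(2.0 * math.pi))


def euler_log_pseudo_likelihood(path, alpha, mu, sigma, dt):
    """Sum of log wrapped-normal Euler transition densities along the path."""
    sd = sigma * math.sqrt(dt)
    loglik = 0.0
    for current, following in zip(path[:-1], path[1:]):
        mean = current + drift(current, alpha, mu) * dt
        loglik += math.log(wrapped_normal_pdf(following, mean, sd))
    return loglik


if __name__ == "__main__":
    rng = random.Random(1)
    path = simulate_von_mises_diffusion(0.5, 1.0, 0.0, 1.0, 0.01, 5000, rng)
    # Compare the pseudo-likelihood at the simulating alpha and at a distant value.
    print(euler_log_pseudo_likelihood(path, 1.0, 0.0, 1.0, 0.01))
    print(euler_log_pseudo_likelihood(path, 10.0, 0.0, 1.0, 0.01))
```

Maximising `euler_log_pseudo_likelihood` over the parameters would give a pseudo-likelihood estimate. The wrapping of the normal density is what distinguishes this from the usual Euler scheme on the real line.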

Preprint, 2020, arXiv

Prediction-based estimation for diffusion models with high-frequency data

This paper obtains asymptotic results for parametric inference using prediction-based estimating functions when the data are high-frequency observations of a diffusion process with an infinite time horizon. Specifically, the data are observations of a diffusion process at $n$ equidistant time points $\Delta_n i$, and the asymptotic scenario is $\Delta_n \to 0$ and $n\Delta_n \to \infty$. For a useful and tractable class of prediction-based estimating functions, existence of a consistent estimator is proved under standard weak regularity conditions on the diffusion process and the estimating function. Asymptotic normality of the estimator is established under the additional rate condition $n\Delta_n^3 \to 0$. The prediction-based estimating functions are approximate martingale estimating functions to a smaller order than what has previously been studied, and new non-standard asymptotic theory is needed. A Monte Carlo method for calculating the asymptotic variance of the estimators is proposed.
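As a quick illustration of how these rate conditions fit together, suppose the sampling interval decays polynomially, $\Delta_n = n^{-\gamma}$ with $\gamma > 0$. This parametrisation and the symbol $\gamma$ are assumptions made here for illustration and do not come from the abstract. Then $\Delta_n \to 0$ holds automatically, and

$$
n\Delta_n = n^{1-\gamma} \to \infty \iff \gamma < 1,
\qquad
n\Delta_n^3 = n^{1-3\gamma} \to 0 \iff \gamma > \tfrac{1}{3}.
$$

Under this assumption, the consistency scenario needs $0 < \gamma < 1$, and the additional rate condition for asymptotic normality narrows this to $\tfrac{1}{3} < \gamma < 1$. For example, $\Delta_n = n^{-1/2}$ satisfies all three conditions.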

Preprint, 2020, arXiv

Toroidal diffusions and protein structure evolution

This chapter shows how toroidal diffusions are convenient methodological tools for modelling protein evolution in a probabilistic framework. The chapter addresses the construction of ergodic diffusions with stationary distributions equal to well-known directional distributions, which can be regarded as toroidal analogues of the Ornstein-Uhlenbeck process. The important challenges that arise in the estimation of the diffusion parameters require the consideration of tractable approximate likelihoods and, among the several approaches introduced, the one yielding a specific approximation to the transition density of the wrapped normal process is shown to give the best empirical performance on average. This provides the methodological building block for Evolutionary Torus Dynamic Bayesian Network (ETDBN), a hidden Markov model for protein evolution that emits a wrapped normal process and two continuous-time Markov chains per hidden state. The chapter describes the main features of ETDBN, which allows for both "smooth" conformational changes and "catastrophic" conformational jumps, and several empirical benchmarks. The insights into the relationship between sequence and structure evolution that ETDBN provides are illustrated in a case study.

Preprint, 2013, arXiv

Statistical inference for discrete-time samples from affine stochastic delay differential equations

Statistical inference for discrete-time observations of an affine stochastic delay differential equation is considered. The main focus is on maximum pseudo-likelihood estimators, which are easy to calculate in practice. A more general class of prediction-based estimating functions is investigated as well. In particular, the optimal prediction-based estimating function and the asymptotic properties of the estimators are derived. The maximum pseudo-likelihood estimator is a particular case, and an expression is found for the efficiency loss when using the maximum pseudo-likelihood estimator rather than the computationally more involved optimal prediction-based estimator. The distribution of the pseudo-likelihood estimator is investigated in a simulation study. Two examples of affine stochastic delay equations are considered in detail.
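To show what discrete-time samples from such an equation might look like, here is a minimal simulation sketch. The abstract does not state the form of the equation, so the sketch assumes one simple special case with a single point delay, dX(t) = (a X(t) + b X(t − r)) dt + σ dW(t), a constant initial segment on [−r, 0], and an Euler scheme whose step divides the delay. The function name and all parameter values are hypothetical.

```python
import math
import random


def simulate_affine_sdde(a, b, r, sigma, dt, n_steps, x0=0.0, seed=0):
    """Euler scheme for dX = (a*X(t) + b*X(t - r)) dt + sigma dW (assumed form).

    Returns the simulated values at times 0, dt, ..., n_steps * dt.
    """
    rng = random.Random(seed)
    lag = round(r / dt)  # number of grid steps spanned by the delay
    if lag < 1 or abs(lag * dt - r) > 1e-9:
        raise ValueError("dt must divide the delay r")

    # Constant history on [-r, 0]: indices 0..lag hold times -r..0.
    x = [x0] * (lag + 1)
    for _ in range(n_steps):
        current = x[-1]
        delayed = x[-1 - lag]
        noise = sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        x.append(current + (a * current + b * delayed) * dt + noise)
    return x[lag:]  # drop the part of the history before time 0


if __name__ == "__main__":
    path = simulate_affine_sdde(a=-1.0, b=0.5, r=1.0, sigma=0.3, dt=0.01, n_steps=2000)
    # Keep every 10th point to mimic discrete-time observations of the process.
    observations = path[::10]
    print(len(observations))
```

Thinning the fine Euler grid, as in the usage lines, is one common way to mimic discretely observed data before applying an estimator.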