Researcher profile

Terry Lyons

Terry Lyons contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 21 - EmergingVerification L1Unclaimed author
13works
0followers
15topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

13 published item(s)

preprint2022arXiv

An asymptotic radius of convergence for the Loewner equation and simulation of $SLE_k$ traces via splitting

In this paper, we shall study the convergence of Taylor approximations for the backward Loewner differential equation (driven by Brownian motion) near the origin. More concretely, whenever the initial condition of the backward Loewner equation (which lies in the upper half plane) is small and has the form $Z_{0} = \varepsilon i$, we show these approximations exhibit an $O(\varepsilon)$ error provided the time horizon is $\varepsilon^{2+δ}$ for $δ> 0$. Statements of this theorem will be given using both rough path and $L^{2}(\mathbb{P})$ estimates. Furthermore, over the time horizon of $\varepsilon^{2-δ}$, we shall see that "higher degree" terms within the Taylor expansion become larger than "lower degree" terms for small $\varepsilon$. In this sense, the time horizon on which approximations are accurate scales like $\varepsilon^{2}$. This scaling comes naturally from the Loewner equation when growing vector field derivatives are balanced against decaying iterated integrals of the Brownian motion. As well as being of theoretical interest, this scaling may be used as a guiding principle for developing adaptive step size strategies which perform efficiently near the origin. In addition, this result highlights the limitations of using stochastic Taylor methods (such as the Euler-Maruyama and Milstein methods) for approximating $SLE_κ$ traces. Due to the analytically tractable vector fields of the Loewner equation, we will show Ninomiya-Victoir (or Strang) splitting is particularly well suited for SLE simulation. As the singularity at the origin can lead to large numerical errors, we shall employ the adaptive step size proposed by Tom Kennedy to discretize $SLE_κ$ traces using this splitting. We believe that the Ninomiya-Victoir scheme is the first high order numerical method that has been successfully applied to $SLE_κ$ traces.

preprint2022arXiv

ImageSig: A signature transform for ultra-lightweight image recognition

This paper introduces a new lightweight method for image recognition. ImageSig is based on computing signatures and does not require a convolutional structure or an attention-based encoder. It is striking to the authors that it achieves: a) an accuracy for 64 X 64 RGB images that exceeds many of the state-of-the-art methods and simultaneously b) requires orders of magnitude less FLOPS, power and memory footprint. The pretrained model can be as small as 44.2 KB in size. ImageSig shows unprecedented performance on hardware such as Raspberry Pi and Jetson-nano. ImageSig treats images as streams with multiple channels. These streams are parameterized by spatial directions. We contribute to the functionality of signature and rough path theory to stream-like data and vision tasks on static images beyond temporal streams. With very few parameters and small size models, the key advantage is that one could have many of these "detectors" assembled on the same chip; moreover, the feature acquisition can be performed once and shared between different models of different tasks - further accelerating the process. This contributes to energy efficiency and the advancements of embedded AI at the edge.

preprint2021arXiv

A Generalised Signature Method for Multivariate Time Series Feature Extraction

The 'signature method' refers to a collection of feature extraction techniques for multivariate time series, derived from the theory of controlled differential equations. There is a great deal of flexibility as to how this method can be applied. On the one hand, this flexibility allows the method to be tailored to specific problems, but on the other hand, can make precise application challenging. This paper makes two contributions. First, the variations on the signature method are unified into a general approach, the \emph{generalised signature method}, of which previous variations are special cases. A primary aim of this unifying framework is to make the signature method more accessible to any machine learning practitioner, whereas it is now mostly used by specialists. Second, and within this framework, we derive a canonical collection of choices that provide a domain-agnostic starting point. We derive these choices as a result of an extensive empirical study on 26 datasets and go on to show competitive performance against current benchmarks for multivariate time series classification. Finally, to ease practical application, we make our techniques available as part of the open-source [redacted] project.

preprint2021arXiv

Estimating the probability that a given vector is in the convex hull of a random sample

For a $d$-dimensional random vector $X$, let $p_{n, X}(θ)$ be the probability that the convex hull of $n$ independent copies of $X$ contains a given point $θ$. We provide several sharp inequalities regarding $p_{n, X}(θ)$ and $N_X(θ)$ denoting the smallest $n$ for which $p_{n, X}(θ)\ge1/2$. As a main result, we derive the totally general inequality $1/2 \le α_X(θ)N_X(θ)\le 3d + 1$, where $α_X(θ)$ (a.k.a. the Tukey depth) is the minimum probability that $X$ is in a fixed closed halfspace containing the point $θ$. We also show several applications of our general results: one is a moment-based bound on $N_X(\mathbb{E}[X])$, which is an important quantity in randomized approaches to cubature construction or measure reduction problem. Another application is the determination of the canonical convex body included in a random convex polytope given by independent copies of $X$, where our combinatorial approach allows us to generalize existing results in random matrix community significantly.

preprint2021arXiv

Modelling Paralinguistic Properties in Conversational Speech to Detect Bipolar Disorder and Borderline Personality Disorder

Bipolar disorder (BD) and borderline personality disorder (BPD) are two chronic mental health conditions that clinicians find challenging to distinguish based on clinical interviews, due to their overlapping symptoms. In this work, we investigate the automatic detection of these two conditions by modelling both verbal and non-verbal cues in a set of interviews. We propose a new approach of modelling short-term features with visibility-signature transform, and compare it with widely used high-level statistical functions. We demonstrate the superior performance of our proposed signature-based model. Furthermore, we show the role of different sets of features in characterising BD and BPD.

preprint2021arXiv

Signatory: differentiable computations of the signature and logsignature transforms, on both CPU and GPU

Signatory is a library for calculating and performing functionality related to the signature and logsignature transforms. The focus is on machine learning, and as such includes features such as CPU parallelism, GPU support, and backpropagation. To our knowledge it is the first GPU-capable library for these operations. Signatory implements new features not available in previous libraries, such as efficient precomputation strategies. Furthermore, several novel algorithmic improvements are introduced, producing substantial real-world speedups even on the CPU without parallelism. The library operates as a Python wrapper around C++, and is compatible with the PyTorch ecosystem. It may be installed directly via \texttt{pip}. Source code, documentation, examples, benchmarks and tests may be found at \texttt{\url{https://github.com/patrick-kidger/signatory}}. The license is Apache-2.0.

preprint2021arXiv

The shifted ODE method for underdamped Langevin MCMC

In this paper, we consider the underdamped Langevin diffusion (ULD) and propose a numerical approximation using its associated ordinary differential equation (ODE). When used as a Markov Chain Monte Carlo (MCMC) algorithm, we show that the ODE approximation achieves a $2$-Wasserstein error of $\varepsilon$ in $\mathcal{O}\big(d^{\frac{1}{3}}/\varepsilon^{\frac{2}{3}}\big)$ steps under the standard smoothness and strong convexity assumptions on the target distribution. This matches the complexity of the randomized midpoint method proposed by Shen and Lee [NeurIPS 2019] which was shown to be order optimal by Cao, Lu and Wang. However, the main feature of the proposed numerical method is that it can utilize additional smoothness of the target log-density $f$. More concretely, we show that the ODE approximation achieves a $2$-Wasserstein error of $\varepsilon$ in $\mathcal{O}\big(d^{\frac{2}{5}}/\varepsilon^{\frac{2}{5}}\big)$ and $\mathcal{O}\big(\sqrt{d}/\varepsilon^{\frac{1}{3}}\big)$ steps when Lipschitz continuity is assumed for the Hessian and third derivative of $f$. By discretizing this ODE using a third order Runge-Kutta method, we can obtain a practical MCMC method that uses just two additional gradient evaluations per step. In our experiment, where the target comes from a logistic regression, this method shows faster convergence compared to other unadjusted Langevin MCMC algorithms.

preprint2020arXiv

A Data-driven Market Simulator for Small Data Environments

Neural network based data-driven market simulation unveils a new and flexible way of modelling financial time series without imposing assumptions on the underlying stochastic dynamics. Though in this sense generative market simulation is model-free, the concrete modelling choices are nevertheless decisive for the features of the simulated paths. We give a brief overview of currently used generative modelling approaches and performance evaluation metrics for financial time series, and address some of the challenges to achieve good results in the latter. We also contrast some classical approaches of market simulation with simulation based on generative modelling and highlight some advantages and pitfalls of the new approach. While most generative models tend to rely on large amounts of training data, we present here a generative model that works reliably in environments where the amount of available training data is notoriously small. Furthermore, we show how a rough paths perspective combined with a parsimonious Variational Autoencoder framework provides a powerful way for encoding and evaluating financial time series in such environments where available training data is scarce. Finally, we also propose a suitable performance evaluation metric for financial time series and discuss some connections of our Market Generator to deep hedging.

preprint2020arXiv

An optimal polynomial approximation of Brownian motion

In this paper, we will present a strong (or pathwise) approximation of standard Brownian motion by a class of orthogonal polynomials. The coefficients that are obtained from the expansion of Brownian motion in this polynomial basis are independent Gaussian random variables. Therefore it is practical (requires $N$ independent Gaussian coefficients) to generate an approximate sample path of Brownian motion that respects integration of polynomials with degree less than $N$. Moreover, since these orthogonal polynomials appear naturally as eigenfunctions of an integral operator defined by the Brownian bridge covariance function, the proposed approximation is optimal in a certain weighted $L^{2}(\mathbb{P})$ sense. In addition, discretizing Brownian paths as piecewise parabolas gives a locally higher order numerical method for stochastic differential equations (SDEs) when compared to the standard piecewise linear approach. We shall demonstrate these ideas by simulating Inhomogeneous Geometric Brownian Motion (IGBM). This numerical example will also illustrate the deficiencies of the piecewise parabola approximation when compared to a new version of the asymptotically efficient log-ODE (or Castell-Gaines) method.

preprint2020arXiv

Generalised Interpretable Shapelets for Irregular Time Series

The shapelet transform is a form of feature extraction for time series, in which a time series is described by its similarity to each of a collection of `shapelets'. However it has previously suffered from a number of limitations, such as being limited to regularly-spaced fully-observed time series, and having to choose between efficient training and interpretability. Here, we extend the method to continuous time, and in doing so handle the general case of irregularly-sampled partially-observed multivariate time series. Furthermore, we show that a simple regularisation penalty may be used to train efficiently without sacrificing interpretability. The continuous-time formulation additionally allows for learning the length of each shapelet (previously a discrete object) in a differentiable manner. Finally, we demonstrate that the measure of similarity between time series may be generalised to a learnt pseudometric. We validate our method by demonstrating its performance and interpretability on several datasets; for example we discover (purely from data) that the digits 5 and 6 may be distinguished by the chirality of their bottom loop, and that a kind of spectral gap exists in spoken audio classification.

preprint2020arXiv

Numerical method for model-free pricing of exotic derivatives using rough path signatures

We estimate prices of exotic options in a discrete-time model-free setting when the trader has access to market prices of a rich enough class of exotic and vanilla options. This is achieved by estimating an unobservable quantity called "implied expected signature" from such market prices, which are used to price other exotic derivatives. The implied expected signature is an object that characterises the market dynamics.

preprint2020arXiv

Universal Approximation with Deep Narrow Networks

The classical Universal Approximation Theorem holds for neural networks of arbitrary width and bounded depth. Here we consider the natural `dual' scenario for networks of bounded width and arbitrary depth. Precisely, let $n$ be the number of inputs neurons, $m$ be the number of output neurons, and let $ρ$ be any nonaffine continuous function, with a continuous nonzero derivative at some point. Then we show that the class of neural networks of arbitrary depth, width $n + m + 2$, and activation function $ρ$, is dense in $C(K; \mathbb{R}^m)$ for $K \subseteq \mathbb{R}^n$ with $K$ compact. This covers every activation function possible to use in practice, and also includes polynomial activation functions, which is unlike the classical version of the theorem, and provides a qualitative difference between deep narrow networks and shallow wide networks. We then consider several extensions of this result. In particular we consider nowhere differentiable activation functions, density in noncompact domains with respect to the $L^p$-norm, and how the width may be reduced to just $n + m + 1$ for `most' activation functions.