Researcher profile

Sotirios Sabanis

Sotirios Sabanis contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 19 - UnverifiedVerification L1Unclaimed author
5works
0followers
9topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

5 published item(s)

preprint2023arXiv

Kinetic Langevin MCMC Sampling Without Gradient Lipschitz Continuity -- the Strongly Convex Case

In this article we consider sampling from log concave distributions in Hamiltonian setting, without assuming that the objective gradient is globally Lipschitz. We propose two algorithms based on monotone polygonal (tamed) Euler schemes, to sample from a target measure, and provide non-asymptotic 2-Wasserstein distance bounds between the law of the process of each algorithm and the target measure. Finally, we apply these results to bound the excess risk optimization error of the associated optimization problem.

preprint2023arXiv

Taming neural networks with TUSLA: Non-convex learning via adaptive stochastic gradient Langevin algorithms

Artificial neural networks (ANNs) are typically highly nonlinear systems which are finely tuned via the optimization of their associated, non-convex loss functions. In many cases, the gradient of any such loss function has superlinear growth, making the use of the widely-accepted (stochastic) gradient descent methods, which are based on Euler numerical schemes, problematic. We offer a new learning algorithm based on an appropriately constructed variant of the popular stochastic gradient Langevin dynamics (SGLD), which is called tamed unadjusted stochastic Langevin algorithm (TUSLA). We also provide a nonasymptotic analysis of the new algorithm's convergence properties in the context of non-convex learning problems with the use of ANNs. Thus, we provide finite-time guarantees for TUSLA to find approximate minimizers of both empirical and population risks. The roots of the TUSLA algorithm are based on the taming technology for diffusion processes with superlinear coefficients as developed in \citet{tamed-euler, SabanisAoAP} and for MCMC algorithms in \citet{tula}. Numerical experiments are presented which confirm the theoretical findings and illustrate the need for the use of the new algorithm in comparison to vanilla SGLD within the framework of ANNs.

preprint2022arXiv

Existence, uniqueness and approximation of solutions of SDEs with superlinear coefficients in the presence of discontinuities of the drift coefficient

Existence, uniqueness, and $L_p$-approximation results are presented for scalar stochastic differential equations (SDEs) by considering the case where, the drift coefficient has finitely many spatial discontinuities while both coefficients can grow superlinearly (in the space variable). These discontinuities are described by a piecewise local Lipschitz continuity and a piecewise monotone-type condition while the diffusion coefficient is assumed to be locally Lipschitz continuous and non-degenerate at the discontinuity points of the drift coefficient. Moreover, the superlinear nature of the coefficients is dictated by a suitable coercivity condition and a polynomial growth of the (local) Lipschitz constants of the coefficients. Existence and uniqueness of strong solutions of such SDEs are obtained. Furthermore, the classical $L_p$-error rate $1/2$, for a suitable range of values of $p$, is recovered for a tamed Euler scheme which is used for approximating these solutions. To the best of the authors' knowledge, these are the first existence, uniqueness and approximation results for this class of SDEs.

preprint2021arXiv

On stochastic gradient Langevin dynamics with dependent data streams: the fully non-convex case

We consider the problem of sampling from a target distribution, which is \emph {not necessarily logconcave}, in the context of empirical risk minimization and stochastic optimization as presented in Raginsky et al. (2017). Non-asymptotic analysis results are established in the $L^1$-Wasserstein distance for the behaviour of Stochastic Gradient Langevin Dynamics (SGLD) algorithms. We allow the estimation of gradients to be performed even in the presence of \emph{dependent} data streams. Our convergence estimates are sharper and \emph{uniform} in the number of iterations, in contrast to those in previous studies.

preprint2020arXiv

A fully data-driven approach to minimizing CVaR for portfolio of assets via SGLD with discontinuous updating

A new approach in stochastic optimization via the use of stochastic gradient Langevin dynamics (SGLD) algorithms, which is a variant of stochastic gradient decent (SGD) methods, allows us to efficiently approximate global minimizers of possibly complicated, high-dimensional landscapes. With this in mind, we extend here the non-asymptotic analysis of SGLD to the case of discontinuous stochastic gradients. We are thus able to provide theoretical guarantees for the algorithm's convergence in (standard) Wasserstein distances for both convex and non-convex objective functions. We also provide explicit upper estimates of the expected excess risk associated with the approximation of global minimizers of these objective functions. All these findings allow us to devise and present a fully data-driven approach for the optimal allocation of weights for the minimization of CVaR of portfolio of assets with complete theoretical guarantees for its performance. Numerical results illustrate our main findings.