Researcher profile

Steffen Dereich

Steffen Dereich contributes to research discovery and scholarly infrastructure.

Researcher
Affiliation not imported
Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
8 works
0 followers
5 topics
4 close collaborators

Published work

8 published items

preprint, 2026, arXiv

SAD Neural Networks: Divergent Gradient Flows and Asymptotic Optimality via o-minimal Structures

We study gradient flows for loss landscapes of fully connected feedforward neural networks with commonly used continuously differentiable activation functions such as the logistic, hyperbolic tangent, softplus or GELU function. We prove that the gradient flow either converges to a critical point or diverges to infinity while the loss converges to an asymptotic critical value. Moreover, we prove the existence of a threshold $\varepsilon>0$ such that the loss value of any gradient flow initialized at most $\varepsilon$ above the optimal level converges to it. For polynomial target functions and sufficiently big architecture and data set, we prove that the optimal loss value is zero and can only be realized asymptotically. From this setting, we deduce our main result that any gradient flow with sufficiently good initialization diverges to infinity. Our proof heavily relies on the geometry of o-minimal structures. We confirm these theoretical findings with numerical experiments and extend our investigation to more realistic scenarios, where we observe an analogous behavior.

preprint, 2013, arXiv

Random networks with sublinear preferential attachment: The giant component

We study a dynamical random network model in which at every construction step a new vertex is introduced and attached to every existing vertex independently with a probability proportional to a concave function f of its current degree. We give a criterion for the existence of a giant component, which is both necessary and sufficient, and which becomes explicit when f is linear. Otherwise it allows the derivation of explicit necessary and sufficient conditions, which are often fairly close. We give an explicit criterion to decide whether the giant component is robust under random removal of edges. We also determine asymptotically the size of the giant component and the empirical distribution of component sizes in terms of the survival probability and size distribution of a multitype branching random walk associated with f.
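The construction described in this abstract can be sketched in a few lines of Python. This is only an illustration, not the authors' code: the abstract says the attachment probability is proportional to f of the current degree, and the sketch assumes the normalisation f(degree)/n (capped at 1) when the n-th new vertex arrives, and assumes that "degree" counts only edges received from later vertices. The function name `simulate_giant_fraction` and the example choice of f are hypothetical.

```python
import math
import random


def simulate_giant_fraction(n_vertices, f, seed=0):
    """Grow the network and return the largest component's share of vertices."""
    rng = random.Random(seed)
    indegree = [0] * n_vertices        # edges received from later vertices (assumption)
    parent = list(range(n_vertices))   # union-find forest tracking components
    size = [1] * n_vertices

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra == rb:
            return
        if size[ra] < size[rb]:
            ra, rb = rb, ra
        parent[rb] = ra
        size[ra] += size[rb]

    for new in range(1, n_vertices):
        for old in range(new):
            # Assumed normalisation: connect with probability f(degree) / new, capped at 1.
            if rng.random() < min(1.0, f(indegree[old]) / new):
                indegree[old] += 1
                union(new, old)

    return max(size[find(v)] for v in range(n_vertices)) / n_vertices


if __name__ == "__main__":
    # A concave attachment rule chosen purely for illustration.
    print(simulate_giant_fraction(500, lambda k: 0.5 * math.sqrt(k) + 0.5))
```

The double loop is quadratic in the number of vertices, so the sketch is only suitable for small experiments; it makes no attempt to reproduce the criterion for a giant component that the paper derives.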

preprint, 2012, arXiv

Emergence of condensation in Kingman's model of selection and mutation

We describe the onset of condensation in the simple model for the balance between selection and mutation given by Kingman in terms of a scaling limit theorem. Loosely speaking, this shows that the wave moving towards genes of maximal fitness has the shape of a gamma distribution. We conjecture that this wave shape is a universal phenomenon that can also be found in a variety of more complex models, well beyond the genetics context, and provide some further evidence for this.

preprint, 2011, arXiv

Constructive quantization: approximation by empirical measures

In this article, we study the approximation of a probability measure $μ$ on $\mathbb{R}^{d}$ by its empirical measure $\hat{μ}_{N}$ interpreted as a random quantization. As error criterion we consider an averaged $p$-th moment Wasserstein metric. In the case where $2p<d$, we establish refined upper and lower bounds for the error, a high-resolution formula. Moreover, we provide a universal estimate based on moments, a so-called Pierce type estimate. In particular, we show that quantization by empirical measures is of optimal order under weak assumptions.

preprint, 2011, arXiv

Multilevel Monte Carlo algorithms for Lévy-driven SDEs with Gaussian correction

We introduce and analyze multilevel Monte Carlo algorithms for the computation of $\mathbb{E}f(Y)$, where $Y=(Y_t)_{t\in[0,1]}$ is the solution of a multidimensional Lévy-driven stochastic differential equation and $f$ is a real-valued function on the path space. The algorithm relies on approximations obtained by simulating large jumps of the Lévy process individually and applying a Gaussian approximation for the small jump part. Upper bounds are provided for the worst case error over the class of all measurable real functions $f$ that are Lipschitz continuous with respect to the supremum norm. These upper bounds are easily tractable once one knows the behavior of the Lévy measure around zero. In particular, one can derive upper bounds from the Blumenthal-Getoor index of the Lévy process. In the case where the Blumenthal-Getoor index is larger than one, this approach is superior to algorithms that do not apply a Gaussian approximation. If the Lévy process does not incorporate a Wiener process or if the Blumenthal-Getoor index $β$ is larger than $\frac{4}{3}$, then the upper bound is of order $τ^{-({4-β})/({6β})}$ when the runtime $τ$ tends to infinity. In the case where $β$ is in $[1,\frac{4}{3}]$ and the Lévy process has a Gaussian component, we obtain bounds of order $τ^{-β/(6β-4)}$. In particular, the error is at most of order $τ^{-1/6}$.
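The two rate regimes quoted in this abstract can be tabulated with a small helper. The sketch below only encodes the exponents as stated in the abstract; the function name `error_rate_exponent` is hypothetical, and restricting the input to β in [1, 2] is a conservative choice made here, not a condition taken from the paper.

```python
from fractions import Fraction


def error_rate_exponent(beta, has_gaussian_component):
    """Return r such that the stated error bound is of order tau**(-r)."""
    beta = Fraction(beta)
    # Assumption of this sketch: only beta in [1, 2] is handled.
    if not 1 <= beta <= 2:
        raise ValueError("this sketch only covers beta in [1, 2]")
    if not has_gaussian_component or beta > Fraction(4, 3):
        # No Wiener part, or beta > 4/3: order tau^{-(4 - beta) / (6 beta)}.
        return (4 - beta) / (6 * beta)
    # beta in [1, 4/3] with a Gaussian component: order tau^{-beta / (6 beta - 4)}.
    return beta / (6 * beta - 4)


if __name__ == "__main__":
    for b in (Fraction(1), Fraction(4, 3), Fraction(3, 2), Fraction(2)):
        print(b, error_rate_exponent(b, True), error_rate_exponent(b, False))
```

Both formulas give the exponent 1/3 at β = 4/3, so the two regimes match at the boundary, and the exponent at β = 2 is 1/6, which is the worst case named in the last sentence of the abstract.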

preprint, 2011, arXiv

Typical distances in ultrasmall random networks

We show that in preferential attachment models with power-law exponent $τ\in(2,3)$ the distance between randomly chosen vertices in the giant component is asymptotically equal to $(4+o(1))\, \frac{\log\log N}{-\log (τ-2)}$, where $N$ denotes the number of nodes. This is twice the value obtained for several types of configuration models with the same power-law exponent. The extra factor reveals the different structure of typical shortest paths in preferential attachment graphs.
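To get a feel for the size of these distances, the leading term of the formula can be evaluated numerically. The sketch below drops the o(1) correction, so it is an approximation for large N only; the configuration-model value is taken to be half the preferential attachment value, as the abstract states. The function name `typical_distance` is hypothetical.

```python
import math


def typical_distance(n_nodes, tau, preferential_attachment=True):
    """Leading-order typical distance, ignoring the o(1) correction."""
    if not 2 < tau < 3:
        raise ValueError("the formula is stated for tau in (2, 3)")
    if n_nodes <= math.e:
        raise ValueError("n_nodes must exceed e so that log log N is positive")
    base = math.log(math.log(n_nodes)) / (-math.log(tau - 2))
    # Factor 4 for preferential attachment, 2 for the configuration models.
    return (4 if preferential_attachment else 2) * base


if __name__ == "__main__":
    for n in (10**3, 10**6, 10**9):
        print(n, typical_distance(n, 2.5), typical_distance(n, 2.5, False))
```

Because of the double logarithm, increasing N by several orders of magnitude changes the result only slightly, which is what "ultrasmall" refers to.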

preprint, 2010, arXiv

The high resolution vector quantization problem with Orlicz norm distortion

We derive a high-resolution formula for the quantization problem under Orlicz norm distortion. In this setting, the optimal point density solves a variational problem which comprises a function $g:\mathbb{R}_+\to[0,\infty)$ characterizing the quantization complexity of the underlying Orlicz space. Moreover, asymptotically optimal codebooks induce a tight sequence of empirical measures. The set of possible accumulation points is characterized and in most cases it consists of a single element. In that case, we find convergence as in the classical setting.

preprint, 2010, arXiv

Universality of the asymptotics of the one-sided exit problem for integrated processes

We consider the one-sided exit problem for (fractionally) integrated random walks and Lévy processes. We prove that the rate of decrease of the non-exit probability, the so-called survival exponent, is universal in this class of processes. In particular, the survival exponent can be inferred from the (fractionally) integrated Brownian motion. This extends Sinai's result on the survival exponent for the integrated simple random walk to general random walks with some finite exponential moment. Further, we prove existence and monotonicity of the survival exponent of fractionally integrated processes. We show that this exponent is related to a constant appearing in the study of random polynomials.