Researcher profile

Moritz Helias

Moritz Helias contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 21 - EmergingVerification L1Unclaimed author
11works
0followers
5topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

11 published item(s)

preprint2026arXiv

Renormalization group for deep neural networks: Universality of learning and scaling laws

Self-similarity, where observables at different length scales exhibit similar behavior, is ubiquitous in natural systems. Such systems are typically characterized by power-law correlations and universality, and are studied using the powerful framework of the renormalization group (RG). Intriguingly, power laws and weak forms of universality also pervade real-world datasets and deep learning models, motivating the application of RG ideas to the analysis of deep learning. In this work, we develop an RG framework to analyze self-similarity and its breakdown in learning curves for a class of weakly non-linear (non-lazy) neural networks trained on power-law distributed data. Features often neglected in standard treatments -- such as spectrum discreteness and lack of translation invariance -- lead to both quantitative and qualitative departures from conventional perturbative RG. In particular, we find that the concept of scaling intervals naturally replaces that of scaling dimensions. Despite these differences, the framework retains key RG features: it enables the classification of perturbations as relevant or irrelevant, and reveals a form of universality at large data limits, governed by a Gaussian Process-like UV fixed point.

preprint2022arXiv

Origami in N dimensions: How feed-forward networks manufacture linear separability

Neural networks can implement arbitrary functions. But, mechanistically, what are the tools at their disposal to construct the target? For classification tasks, the network must transform the data classes into a linearly separable representation in the final hidden layer. We show that a feed-forward architecture has one primary tool at hand to achieve this separability: progressive folding of the data manifold in unoccupied higher dimensions. The operation of folding provides a useful intuition in low-dimensions that generalizes to high ones. We argue that an alternative method based on shear, requiring very deep architectures, plays only a small role in real-world networks. The folding operation, however, is powerful as long as layers are wider than the data dimensionality, allowing efficient solutions by providing access to arbitrary regions in the distribution, such as data points of one class forming islands within the other classes. We argue that a link exists between the universal approximation property in ReLU networks and the fold-and-cut theorem (Demaine et al., 1998) dealing with physical paper folding. Based on the mechanistic insight, we predict that the progressive generation of separability is necessarily accompanied by neurons showing mixed selectivity and bimodal tuning curves. This is validated in a network trained on the poker hand task, showing the emergence of bimodal tuning curves during training. We hope that our intuitive picture of the data transformation in deep networks can help to provide interpretability, and discuss possible applications to the theory of convolutional networks, loss landscapes, and generalization. TL;DR: Shows that the internal processing of deep networks can be thought of as literal folding operations on the data distribution in the N-dimensional activation space. A link to a well-known theorem in origami theory is provided.

preprint2022arXiv

Self-consistent formulations for stochastic nonlinear neuronal dynamics

Neural dynamics is often investigated with tools from bifurcation theory. However, many neuron models are stochastic, mimicking fluctuations in the input from unknown parts of the brain or the spiking nature of signals. Noise changes the dynamics with respect to the deterministic model; in particular bifurcation theory cannot be applied. We formulate stochastic neuronal dynamics in the Martin-Siggia-Rose de Dominicis-Janssen (MSRDJ) formalism and present the fluctuation expansion of the effective action and the functional renormalization group (fRG) as two systematic ways to incorporate corrections to the mean dynamics and time-dependent statistics due to fluctuations in the presence of nonlinear neuronal gain. To formulate self-consistency equations, we derive a fundamental link between the effective action in the Onsager-Machlup(OM) formalism, which allows the study of phase transitions, and the MSRDJ effective action, which is computationally advantageous. These results in particular allow the derivation of an OM effective action for systems with non-Gaussian noise. This approach naturally leads to effective deterministic equations for the first moment of the stochastic system; they explain how nonlinearities and noise cooperate to produce memory effects. Moreover, the MSRDJ formulation yields an effective linear system that has identical power spectra and linear response. Starting from the better known loopwise approximation, we then discuss the use of the fRG as a method to obtain self-consistency beyond the mean. We present a new efficient truncation scheme for the hierarchy of flow equations for the vertex functions by adapting the Blaizot, Méndez and Wschebor approximation from the derivative expansion to the vertex expansion. The methods are presented by means of the simplest possible example of a stochastic differential equation that has generic features of neuronal dynamics.

preprint2021arXiv

Event-based update of synapses in voltage-based learning rules

Due to the point-like nature of neuronal spiking, efficient neural network simulators often employ event-based simulation schemes for synapses. Yet many types of synaptic plasticity rely on the membrane potential of the postsynaptic cell as a third factor in addition to pre- and postsynaptic spike times. Synapses therefore require continuous information to update their strength which a priori necessitates a continuous update in a time-driven manner. The latter hinders scaling of simulations to realistic cortical network sizes and relevant time scales for learning. Here, we derive two efficient algorithms for archiving postsynaptic membrane potentials, both compatible with modern simulation engines based on event-based synapse updates. We theoretically contrast the two algorithms with a time-driven synapse update scheme to analyze advantages in terms of memory and computations. We further present a reference implementation in the spiking neural network simulator NEST for two prototypical voltage-based plasticity rules: the Clopath rule and the Urbanczik-Senn rule. For both rules, the two event-based algorithms significantly outperform the time-driven scheme. Depending on the amount of data to be stored for plasticity, which heavily differs between the rules, a strong performance increase can be achieved by compressing or sampling of information on membrane potentials. Our results on computational efficiency related to archiving of information provide guidelines for the design of learning rules in order to make them practically usable in large-scale networks.

preprint2021arXiv

Gell-Mann-Low criticality in neural networks

Criticality is deeply related to optimal computational capacity. The lack of a renormalized theory of critical brain dynamics, however, so far limits insights into this form of biological information processing to mean-field results. These methods neglect a key feature of critical systems: the interaction between degrees of freedom across all length scales, which allows for complex nonlinear computation. We present a renormalized theory of a prototypical neural field theory, the stochastic Wilson-Cowan equation. We compute the flow of couplings, which parameterize interactions on increasing length scales. Despite similarities with the Kardar-Parisi-Zhang model, the theory is of a Gell-Mann-Low type, the archetypal form of a renormalizable quantum field theory. Here, nonlinear couplings vanish, flowing towards the Gaussian fixed point, but logarithmically slowly, thus remaining effective on most scales. We show this critical structure of interactions to implement a desirable trade-off between linearity, optimal for information storage, and nonlinearity, required for computation.

preprint2020arXiv

Capacity of the covariance perceptron

The classical perceptron is a simple neural network that performs a binary classification by a linear mapping between static inputs and outputs and application of a threshold. For small inputs, neural networks in a stationary state also perform an effectively linear input-output transformation, but of an entire time series. Choosing the temporal mean of the time series as the feature for classification, the linear transformation of the network with subsequent thresholding is equivalent to the classical perceptron. Here we show that choosing covariances of time series as the feature for classification maps the neural network to what we call a 'covariance perceptron'; a mapping between covariances that is bilinear in terms of weights. By extending Gardner's theory of connections to this bilinear problem, using a replica symmetric mean-field theory, we compute the pattern and information capacities of the covariance perceptron in the infinite-size limit. Closed-form expressions reveal superior pattern capacity in the binary classification task compared to the classical perceptron in the case of a high-dimensional input and low-dimensional output. For less convergent networks, the mean perceptron classifies a larger number of stimuli. However, since covariances span a much larger input and output space than means, the amount of stored information in the covariance perceptron exceeds the classical counterpart. For strongly convergent connectivity it is superior by a factor equal to the number of input neurons. Theoretical calculations are validated numerically for finite size systems using a gradient-based optimization of a soft-margin, as well as numerical solvers for the NP hard quadratically constrained quadratic programming problem, to which training can be mapped.

preprint2019arXiv

Conditions for wave trains in spiking neural networks

Spatiotemporal patterns such as traveling waves are frequently observed in recordings of neural activity. The mechanisms underlying the generation of such patterns are largely unknown. Previous studies have investigated the existence and uniqueness of different types of waves or bumps of activity using neural-field models, phenomenological coarse-grained descriptions of neural-network dynamics. But it remains unclear how these insights can be transferred to more biologically realistic networks of spiking neurons, where individual neurons fire irregularly. Here, we employ mean-field theory to reduce a microscopic model of leaky integrate-and-fire (LIF) neurons with distance-dependent connectivity to an effective neural-field model. In contrast to existing phenomenological descriptions, the dynamics in this neural-field model depends on the mean and the variance in the synaptic input, both determining the amplitude and the temporal structure of the resulting effective coupling kernel. For the neural-field model we employ liner stability analysis to derive conditions for the existence of spatial and temporal oscillations and wave trains, that is, temporally and spatially periodic traveling waves. We first prove that wave trains cannot occur in a single homogeneous population of neurons, irrespective of the form of distance dependence of the connection probability. Compatible with the architecture of cortical neural networks, wave trains emerge in two-population networks of excitatory and inhibitory neurons as a combination of delay-induced temporal oscillations and spatial oscillations due to distance-dependent connectivity profiles. Finally, we demonstrate quantitative agreement between predictions of the analytically tractable neural-field model and numerical simulations of both networks of nonlinear rate-based units and networks of LIF neurons.

preprint2019arXiv

Statistical field theory for neural networks

These notes attempt a self-contained introduction into statistical field theory applied to neural networks of rate units and binary spins. The presentation consists of three parts: First, the introduction of fundamental notions of probabilities, moments, cumulants, and their relation by the linked cluster theorem, of which Wick's theorem is the most important special case; followed by the diagrammatic formulation of perturbation theory, reviewed in the statistical setting. Second, dynamics described by stochastic differential equations in the Ito-formulation, treated in the Martin-Siggia-Rose-De Dominicis-Janssen path integral formalism. With concepts from disordered systems, we then study networks with random connectivity and derive their self-consistent dynamic mean-field theory, explaining the statistics of fluctuations and the emergence of different phases with regular and chaotic dynamics. Third, we introduce the effective action, vertex functions, and the loopwise expansion. These tools are illustrated by systematic derivations of self-consistency equations, going beyond the mean-field approximation. These methods are applied to the pairwise maximum entropy (Ising spin) model, including the recently-found diagrammatic derivation of the Thouless-Anderson-Palmer mean field theory.

preprint2013arXiv

A unified view on weakly correlated recurrent networks

The diversity of neuron models used in contemporary theoretical neuroscience to investigate specific properties of covariances raises the question how these models relate to each other. In particular it is hard to distinguish between generic properties and peculiarities due to the abstracted model. Here we present a unified view on pairwise covariances in recurrent networks in the irregular regime. We consider the binary neuron model, the leaky integrate-and-fire model, and the Hawkes process. We show that linear approximation maps each of these models to either of two classes of linear rate models, including the Ornstein-Uhlenbeck process as a special case. The classes differ in the location of additive noise in the rate dynamics, which is on the output side for spiking models and on the input side for the binary model. Both classes allow closed form solutions for the covariance. For output noise it separates into an echo term and a term due to correlated input. The unified framework enables us to transfer results between models. For example, we generalize the binary model and the Hawkes process to the presence of conduction delays and simplify derivations for established results. Our approach is applicable to general network structures and suitable for population averages. The derived averages are exact for fixed out-degree network architectures and approximate for fixed in-degree. We demonstrate how taking into account fluctuations in the linearization procedure increases the accuracy of the effective theory and we explain the class dependent differences between covariances in the time and the frequency domain. Finally we show that the oscillatory instability emerging in networks of integrate-and-fire models with delayed inhibitory feedback is a model-invariant feature: the same structure of poles in the complex frequency plane determines the population power spectra.

preprint2013arXiv

The correlation structure of local cortical networks intrinsically results from recurrent dynamics

The co-occurrence of action potentials of pairs of neurons within short time intervals is known since long. Such synchronous events can appear time-locked to the behavior of an animal and also theoretical considerations argue for a functional role of synchrony. Early theoretical work tried to explain correlated activity by neurons transmitting common fluctuations due to shared inputs. This, however, overestimates correlations. Recently the recurrent connectivity of cortical networks was shown responsible for the observed low baseline correlations. Two different explanations were given: One argues that excitatory and inhibitory population activities closely follow the external inputs to the network, so that their effects on a pair of cells mutually cancel. Another explanation relies on negative recurrent feedback to suppress fluctuations in the population activity, equivalent to small correlations. In a biological neuronal network one expects both, external inputs and recurrence, to affect correlated activity. The present work extends the theoretical framework of correlations to include both contributions and explains their qualitative differences. Moreover the study shows that the arguments of fast tracking and recurrent feedback are not equivalent, only the latter correctly predicts the cell-type specific correlations.

preprint2010arXiv

The perfect integrator driven by Poisson input and its approximation in the diffusion limit

In this note we consider the perfect integrator driven by Poisson process input. We derive its equilibrium and response properties and contrast them to the approximations obtained by applying the diffusion approximation. In particular, the probability density in the vicinity of the threshold differs, which leads to altered response properties of the system in equilibrium.