Source author record

Moritz Helias

Moritz Helias appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Neurons and Cognition · cond-mat.dis-nn · cond-mat.stat-mech · Machine Learning · Biological Physics · math.DS · math.PR · Methodology

Catalog footprint

What is connected

21 works

8 topics

4 close collaborators

preprint · 2026 · arXiv

Renormalization group for deep neural networks: Universality of learning and scaling laws

Self-similarity, where observables at different length scales exhibit similar behavior, is ubiquitous in natural systems. Such systems are typically characterized by power-law correlations and universality, and are studied using the powerful framework of the renormalization group (RG). Intriguingly, power laws and weak forms of universality also pervade real-world datasets and deep learning models, motivating the application of RG ideas to the analysis of deep learning. In this work, we develop an RG framework to analyze self-similarity and its breakdown in learning curves for a class of weakly non-linear (non-lazy) neural networks trained on power-law distributed data. Features often neglected in standard treatments -- such as spectrum discreteness and lack of translation invariance -- lead to both quantitative and qualitative departures from conventional perturbative RG. In particular, we find that the concept of scaling intervals naturally replaces that of scaling dimensions. Despite these differences, the framework retains key RG features: it enables the classification of perturbations as relevant or irrelevant, and reveals a form of universality at large data limits, governed by a Gaussian Process-like UV fixed point.

preprint · 2022 · arXiv

Origami in N dimensions: How feed-forward networks manufacture linear separability

Neural networks can implement arbitrary functions. But, mechanistically, what are the tools at their disposal to construct the target? For classification tasks, the network must transform the data classes into a linearly separable representation in the final hidden layer. We show that a feed-forward architecture has one primary tool at hand to achieve this separability: progressive folding of the data manifold in unoccupied higher dimensions. The operation of folding provides a useful intuition in low-dimensions that generalizes to high ones. We argue that an alternative method based on shear, requiring very deep architectures, plays only a small role in real-world networks. The folding operation, however, is powerful as long as layers are wider than the data dimensionality, allowing efficient solutions by providing access to arbitrary regions in the distribution, such as data points of one class forming islands within the other classes. We argue that a link exists between the universal approximation property in ReLU networks and the fold-and-cut theorem (Demaine et al., 1998) dealing with physical paper folding. Based on the mechanistic insight, we predict that the progressive generation of separability is necessarily accompanied by neurons showing mixed selectivity and bimodal tuning curves. This is validated in a network trained on the poker hand task, showing the emergence of bimodal tuning curves during training. We hope that our intuitive picture of the data transformation in deep networks can help to provide interpretability, and discuss possible applications to the theory of convolutional networks, loss landscapes, and generalization. TL;DR: Shows that the internal processing of deep networks can be thought of as literal folding operations on the data distribution in the N-dimensional activation space. A link to a well-known theorem in origami theory is provided.
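The folding intuition in this abstract can be illustrated with a toy one-dimensional example. The sketch below is not the authors' construction: the data points, labels, and helper names are hypothetical. It assumes only the standard identity |x| = relu(x) + relu(-x), which folds the real line at the origin so that an "island" of one class inside the other becomes separable by a single threshold.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def fold(x):
    """Fold the real line at the origin using two ReLU units: |x| = relu(x) + relu(-x)."""
    return relu(x) + relu(-x)

def separable_by_threshold(values, labels):
    """True if a single threshold on `values` splits class 1 from class 0."""
    a, b = values[labels == 1], values[labels == 0]
    return bool(a.max() < b.min() or b.max() < a.min())

# Hypothetical toy data: class 1 forms an island surrounded by class 0.
x = np.array([-2.0, -1.5, -0.3, 0.0, 0.4, 1.2, 1.8])
labels = np.array([0, 0, 1, 1, 1, 0, 0])

h = fold(x)                          # representation after one fold
predicted = (h < 1.0).astype(int)    # one threshold on the folded axis

print(separable_by_threshold(x, labels))  # before folding
print(separable_by_threshold(h, labels))  # after folding
```

Before the fold no single threshold on x separates the classes; after the fold the island sits on one side of the threshold.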

preprint · 2022 · arXiv

Self-consistent formulations for stochastic nonlinear neuronal dynamics

Neural dynamics is often investigated with tools from bifurcation theory. However, many neuron models are stochastic, mimicking fluctuations in the input from unknown parts of the brain or the spiking nature of signals. Noise changes the dynamics with respect to the deterministic model; in particular bifurcation theory cannot be applied. We formulate stochastic neuronal dynamics in the Martin-Siggia-Rose-De Dominicis-Janssen (MSRDJ) formalism and present the fluctuation expansion of the effective action and the functional renormalization group (fRG) as two systematic ways to incorporate corrections to the mean dynamics and time-dependent statistics due to fluctuations in the presence of nonlinear neuronal gain. To formulate self-consistency equations, we derive a fundamental link between the effective action in the Onsager-Machlup (OM) formalism, which allows the study of phase transitions, and the MSRDJ effective action, which is computationally advantageous. These results in particular allow the derivation of an OM effective action for systems with non-Gaussian noise. This approach naturally leads to effective deterministic equations for the first moment of the stochastic system; they explain how nonlinearities and noise cooperate to produce memory effects. Moreover, the MSRDJ formulation yields an effective linear system that has identical power spectra and linear response. Starting from the better known loopwise approximation, we then discuss the use of the fRG as a method to obtain self-consistency beyond the mean. We present a new efficient truncation scheme for the hierarchy of flow equations for the vertex functions by adapting the Blaizot, Méndez and Wschebor approximation from the derivative expansion to the vertex expansion. The methods are presented by means of the simplest possible example of a stochastic differential equation that has generic features of neuronal dynamics.

preprint · 2021 · arXiv

Event-based update of synapses in voltage-based learning rules

Due to the point-like nature of neuronal spiking, efficient neural network simulators often employ event-based simulation schemes for synapses. Yet many types of synaptic plasticity rely on the membrane potential of the postsynaptic cell as a third factor in addition to pre- and postsynaptic spike times. Synapses therefore require continuous information to update their strength which a priori necessitates a continuous update in a time-driven manner. The latter hinders scaling of simulations to realistic cortical network sizes and relevant time scales for learning. Here, we derive two efficient algorithms for archiving postsynaptic membrane potentials, both compatible with modern simulation engines based on event-based synapse updates. We theoretically contrast the two algorithms with a time-driven synapse update scheme to analyze advantages in terms of memory and computations. We further present a reference implementation in the spiking neural network simulator NEST for two prototypical voltage-based plasticity rules: the Clopath rule and the Urbanczik-Senn rule. For both rules, the two event-based algorithms significantly outperform the time-driven scheme. Depending on the amount of data to be stored for plasticity, which heavily differs between the rules, a strong performance increase can be achieved by compressing or sampling of information on membrane potentials. Our results on computational efficiency related to archiving of information provide guidelines for the design of learning rules in order to make them practically usable in large-scale networks.

preprint · 2021 · arXiv

Gell-Mann-Low criticality in neural networks

Criticality is deeply related to optimal computational capacity. The lack of a renormalized theory of critical brain dynamics, however, so far limits insights into this form of biological information processing to mean-field results. These methods neglect a key feature of critical systems: the interaction between degrees of freedom across all length scales, which allows for complex nonlinear computation. We present a renormalized theory of a prototypical neural field theory, the stochastic Wilson-Cowan equation. We compute the flow of couplings, which parameterize interactions on increasing length scales. Despite similarities with the Kardar-Parisi-Zhang model, the theory is of a Gell-Mann-Low type, the archetypal form of a renormalizable quantum field theory. Here, nonlinear couplings vanish, flowing towards the Gaussian fixed point, but logarithmically slowly, thus remaining effective on most scales. We show this critical structure of interactions to implement a desirable trade-off between linearity, optimal for information storage, and nonlinearity, required for computation.

preprint · 2020 · arXiv

Capacity of the covariance perceptron

The classical perceptron is a simple neural network that performs a binary classification by a linear mapping between static inputs and outputs and application of a threshold. For small inputs, neural networks in a stationary state also perform an effectively linear input-output transformation, but of an entire time series. Choosing the temporal mean of the time series as the feature for classification, the linear transformation of the network with subsequent thresholding is equivalent to the classical perceptron. Here we show that choosing covariances of time series as the feature for classification maps the neural network to what we call a 'covariance perceptron'; a mapping between covariances that is bilinear in terms of weights. By extending Gardner's theory of connections to this bilinear problem, using a replica symmetric mean-field theory, we compute the pattern and information capacities of the covariance perceptron in the infinite-size limit. Closed-form expressions reveal superior pattern capacity in the binary classification task compared to the classical perceptron in the case of a high-dimensional input and low-dimensional output. For less convergent networks, the mean perceptron classifies a larger number of stimuli. However, since covariances span a much larger input and output space than means, the amount of stored information in the covariance perceptron exceeds the classical counterpart. For strongly convergent connectivity it is superior by a factor equal to the number of input neurons. Theoretical calculations are validated numerically for finite size systems using a gradient-based optimization of a soft-margin, as well as numerical solvers for the NP hard quadratically constrained quadratic programming problem, to which training can be mapped.
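The statement that the mapping between covariances is bilinear in the weights can be checked numerically. The following is a minimal sketch under stated assumptions: a static linear readout y = W x with hypothetical random weights and Gaussian inputs, not the network dynamics or the capacity calculation of the paper. For such a map the output covariance is Q = W P Wᵀ, where P is the input covariance.

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_out, n_samples = 5, 2, 1000   # hypothetical sizes

W = rng.normal(size=(n_out, n_in))       # hypothetical readout weights
X = rng.normal(size=(n_in, n_samples))   # rows: input channels, columns: time samples
Y = W @ X                                # linear input-output transformation

P = np.cov(X)               # input covariance, shape (n_in, n_in)
Q = np.cov(Y)               # output covariance, shape (n_out, n_out)
Q_bilinear = W @ P @ W.T    # prediction: each entry is bilinear in the weights

print(np.allclose(Q, Q_bilinear))
```

Because each weight enters twice, doubling W multiplies every entry of Q by four, in contrast to the mean, which would only double.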

preprint · 2019 · arXiv

Conditions for wave trains in spiking neural networks

Spatiotemporal patterns such as traveling waves are frequently observed in recordings of neural activity. The mechanisms underlying the generation of such patterns are largely unknown. Previous studies have investigated the existence and uniqueness of different types of waves or bumps of activity using neural-field models, phenomenological coarse-grained descriptions of neural-network dynamics. But it remains unclear how these insights can be transferred to more biologically realistic networks of spiking neurons, where individual neurons fire irregularly. Here, we employ mean-field theory to reduce a microscopic model of leaky integrate-and-fire (LIF) neurons with distance-dependent connectivity to an effective neural-field model. In contrast to existing phenomenological descriptions, the dynamics in this neural-field model depends on the mean and the variance in the synaptic input, both determining the amplitude and the temporal structure of the resulting effective coupling kernel. For the neural-field model we employ linear stability analysis to derive conditions for the existence of spatial and temporal oscillations and wave trains, that is, temporally and spatially periodic traveling waves. We first prove that wave trains cannot occur in a single homogeneous population of neurons, irrespective of the form of distance dependence of the connection probability. Compatible with the architecture of cortical neural networks, wave trains emerge in two-population networks of excitatory and inhibitory neurons as a combination of delay-induced temporal oscillations and spatial oscillations due to distance-dependent connectivity profiles. Finally, we demonstrate quantitative agreement between predictions of the analytically tractable neural-field model and numerical simulations of both networks of nonlinear rate-based units and networks of LIF neurons.

preprint · 2019 · arXiv

Statistical field theory for neural networks

These notes attempt a self-contained introduction into statistical field theory applied to neural networks of rate units and binary spins. The presentation consists of three parts: First, the introduction of fundamental notions of probabilities, moments, cumulants, and their relation by the linked cluster theorem, of which Wick's theorem is the most important special case; followed by the diagrammatic formulation of perturbation theory, reviewed in the statistical setting. Second, dynamics described by stochastic differential equations in the Ito-formulation, treated in the Martin-Siggia-Rose-De Dominicis-Janssen path integral formalism. With concepts from disordered systems, we then study networks with random connectivity and derive their self-consistent dynamic mean-field theory, explaining the statistics of fluctuations and the emergence of different phases with regular and chaotic dynamics. Third, we introduce the effective action, vertex functions, and the loopwise expansion. These tools are illustrated by systematic derivations of self-consistency equations, going beyond the mean-field approximation. These methods are applied to the pairwise maximum entropy (Ising spin) model, including the recently-found diagrammatic derivation of the Thouless-Anderson-Palmer mean field theory.

preprint · 2016 · arXiv

Functional methods for disordered neural networks

Neural networks of the brain form one of the most complex systems we know. Many qualitative features of the emerging collective phenomena, such as correlated activity, stability, response to inputs, chaotic and regular behavior, can, however, be understood in simple models that are accessible to a treatment in statistical mechanics, or, more precisely, classical statistical field theory. This tutorial presents the fundamentals behind contemporary developments in the theory of neural networks of rate units that are based on methods from statistical mechanics of classical systems with a large number of interacting degrees of freedom. In particular we will focus on a relevant class of systems that have quenched (time independent) disorder. In neural networks, the main source of disorder arises from random synaptic couplings between neurons. These systems are in many respects similar to spin glasses. The tutorial therefore also explains the methods for these disordered systems as far as they are applied in neuroscience. The presentation consists of two parts. In the first part we introduce stochastic differential equations in the Martin-Siggia-Rose-De Dominicis-Janssen path integral formalism. In the second part we employ this language to derive the dynamic mean-field theory for deterministic random networks, the basis of the seminal work by Sompolinsky, Crisanti, Sommers 1988, as well as a recent extension to stochastic dynamics.

preprint · 2015 · arXiv

A reaction diffusion-like formalism for plastic neural networks reveals dissipative solitons at criticality

Self-organized structures in networks with spike-timing dependent plasticity (STDP) are likely to play a central role for information processing in the brain. In the present study we derive a reaction-diffusion-like formalism for plastic feed-forward networks of nonlinear rate neurons with a correlation sensitive learning rule inspired by and being qualitatively similar to STDP. After obtaining equations that describe the change of the spatial shape of the signal from layer to layer, we derive a criterion for the non-linearity necessary to obtain stable dynamics for arbitrary input. We classify the possible scenarios of signal evolution and find that close to the transition to the unstable regime meta-stable solutions appear. The form of these dissipative solitons is determined analytically and the evolution and interaction of several such coexistent objects is investigated.

preprint · 2015 · arXiv

Correlated fluctuations in strongly-coupled binary networks beyond equilibrium

Randomly coupled Ising spins constitute the classical model of collective phenomena in disordered systems, with applications covering ferromagnetism, combinatorial optimization, protein folding, stock market dynamics, and social dynamics. The phase diagram of these systems is obtained in the thermodynamic limit by averaging over the quenched randomness of the couplings. However, many applications require the statistics of activity for a single realization of the possibly asymmetric couplings in finite-sized networks. Examples include reconstruction of couplings from the observed dynamics, learning in the central nervous system by correlation-sensitive synaptic plasticity, and representation of probability distributions for sampling-based inference. The systematic cumulant expansion for kinetic binary (Ising) threshold units with strong, random and asymmetric couplings presented here goes beyond mean-field theory and is applicable outside thermodynamic equilibrium; a system of approximate non-linear equations predicts average activities and pairwise covariances in quantitative agreement with full simulations down to hundreds of units. The linearized theory yields an expansion of the correlation- and response functions in collective eigenmodes, leads to an efficient algorithm solving the inverse problem, and shows that correlations are invariant under scaling of the interaction strengths.

preprint · 2015 · arXiv

Modulated escape from a metastable state driven by colored noise

Many phenomena in nature are described by excitable systems driven by colored noise. The temporal correlations in the fluctuations hinder an analytical treatment. We here present a general method of reduction to a white-noise system, capturing the color of the noise by effective and time-dependent boundary conditions. We apply the formalism to a model of the excitability of neuronal membranes, the leaky integrate-and-fire neuron model, revealing an analytical expression for the linear response of the system valid up to moderate frequencies. The closed form analytical expression enables the characterization of the response properties of such excitable units and the assessment of oscillations emerging in networks thereof.

preprint · 2015 · arXiv

Reduction of colored noise in excitable systems to white noise and dynamic boundary conditions

A recent study on the effect of colored driving noise on the escape from a metastable state derives an analytic expression of the transfer function of the leaky integrate-and-fire neuron model subject to colored noise. Here we present an alternative derivation of the results, taking into account time-dependent boundary conditions explicitly. This systematic approach may facilitate future extensions beyond first order perturbation theory. The analogy of the quantum harmonic oscillator to the LIF neuron model subject to white noise enables a derivation of the well known transfer function simpler than the original approach. We offer a pedagogical presentation including all intermediate steps of the calculations.

preprint · 2015 · arXiv

Scalability of asynchronous networks is limited by one-to-one mapping between effective connectivity and correlations

Network models are routinely downscaled compared to nature in terms of numbers of nodes or edges because of a lack of computational resources, often without explicit mention of the limitations this entails. While reliable methods have long existed to adjust parameters such that the first-order statistics of network dynamics are conserved, here we show that limitations already arise if also second-order statistics are to be maintained. The temporal structure of pairwise averaged correlations in the activity of recurrent networks is determined by the effective population-level connectivity. We first show that in general the converse is also true and explicitly mention degenerate cases when this one-to-one relationship does not hold. The one-to-one correspondence between effective connectivity and the temporal structure of pairwise averaged correlations implies that network scalings should preserve the effective connectivity if pairwise averaged correlations are to be held constant. Changes in effective connectivity can even push a network from a linearly stable to an unstable, oscillatory regime and vice versa. On this basis, we derive conditions for the preservation of both mean population-averaged activities and pairwise averaged correlations under a change in numbers of neurons or synapses in the asynchronous regime typical of cortical networks. We find that mean activities and correlation structure can be maintained by an appropriate scaling of the synaptic weights, but only over a range of numbers of synapses that is limited by the variance of external inputs to the network. Our results therefore show that the reducibility of asynchronous networks is fundamentally limited.

preprint · 2013 · arXiv

A unified view on weakly correlated recurrent networks

The diversity of neuron models used in contemporary theoretical neuroscience to investigate specific properties of covariances raises the question how these models relate to each other. In particular it is hard to distinguish between generic properties and peculiarities due to the abstracted model. Here we present a unified view on pairwise covariances in recurrent networks in the irregular regime. We consider the binary neuron model, the leaky integrate-and-fire model, and the Hawkes process. We show that linear approximation maps each of these models to either of two classes of linear rate models, including the Ornstein-Uhlenbeck process as a special case. The classes differ in the location of additive noise in the rate dynamics, which is on the output side for spiking models and on the input side for the binary model. Both classes allow closed form solutions for the covariance. For output noise it separates into an echo term and a term due to correlated input. The unified framework enables us to transfer results between models. For example, we generalize the binary model and the Hawkes process to the presence of conduction delays and simplify derivations for established results. Our approach is applicable to general network structures and suitable for population averages. The derived averages are exact for fixed out-degree network architectures and approximate for fixed in-degree. We demonstrate how taking into account fluctuations in the linearization procedure increases the accuracy of the effective theory and we explain the class dependent differences between covariances in the time and the frequency domain. Finally we show that the oscillatory instability emerging in networks of integrate-and-fire models with delayed inhibitory feedback is a model-invariant feature: the same structure of poles in the complex frequency plane determines the population power spectra.

preprint · 2013 · arXiv

Echoes in correlated neural systems

Correlations are employed in modern physics to explain microscopic and macroscopic phenomena, like the fractional quantum Hall effect and the Mott insulator state in high temperature superconductors and ultracold atoms. Simultaneously probed neurons in the intact brain reveal correlations between their activity, an important measure to study information processing in the brain that also influences macroscopic signals of neural activity, like the electroencephalogram (EEG). Networks of spiking neurons differ from most physical systems: The interaction between elements is directed, time delayed, mediated by short pulses, and each neuron receives events from thousands of neurons. Even the stationary state of the network cannot be described by equilibrium statistical mechanics. Here we develop a quantitative theory of pairwise correlations in finite sized random networks of spiking neurons. We derive explicit analytic expressions for the population averaged cross correlation functions. Our theory explains why the intuitive mean field description fails, how the echo of single action potentials causes an apparent lag of inhibition with respect to excitation, and how the size of the network can be scaled while maintaining its dynamical state. Finally, we derive a new criterion for the emergence of collective oscillations from the spectrum of the time-evolution propagator.

preprint · 2013 · arXiv

The correlation structure of local cortical networks intrinsically results from recurrent dynamics

The co-occurrence of action potentials of pairs of neurons within short time intervals has long been known. Such synchronous events can appear time-locked to the behavior of an animal, and theoretical considerations also argue for a functional role of synchrony. Early theoretical work tried to explain correlated activity by neurons transmitting common fluctuations due to shared inputs. This, however, overestimates correlations. Recently the recurrent connectivity of cortical networks was shown to be responsible for the observed low baseline correlations. Two different explanations were given: One argues that excitatory and inhibitory population activities closely follow the external inputs to the network, so that their effects on a pair of cells mutually cancel. Another explanation relies on negative recurrent feedback to suppress fluctuations in the population activity, equivalent to small correlations. In a biological neuronal network one expects both external inputs and recurrence to affect correlated activity. The present work extends the theoretical framework of correlations to include both contributions and explains their qualitative differences. Moreover, the study shows that the arguments of fast tracking and recurrent feedback are not equivalent: only the latter correctly predicts the cell-type specific correlations.

preprint · 2012 · arXiv

Decorrelation of neural-network activity by inhibitory feedback

Correlations in spike-train ensembles can seriously impair the encoding of information by their spatio-temporal structure. An inevitable source of correlation in finite neural networks is common presynaptic input to pairs of neurons. Recent theoretical and experimental studies demonstrate that spike correlations in recurrent neural networks are considerably smaller than expected based on the amount of shared presynaptic input. By means of a linear network model and simulations of networks of leaky integrate-and-fire neurons, we show that shared-input correlations are efficiently suppressed by inhibitory feedback. To elucidate the effect of feedback, we compare the responses of the intact recurrent network and systems where the statistics of the feedback channel is perturbed. The suppression of spike-train correlations and population-rate fluctuations by inhibitory feedback can be observed both in purely inhibitory and in excitatory-inhibitory networks. The effect is fully understood by a linear theory and becomes already apparent at the macroscopic level of the population averaged activity. At the microscopic level, shared-input correlations are suppressed by spike-train correlations: In purely inhibitory networks, they are canceled by negative spike-train correlations. In excitatory-inhibitory networks, spike-train correlations are typically positive. Here, the suppression of input correlations is not a result of the mere existence of correlations between excitatory (E) and inhibitory (I) neurons, but a consequence of a particular structure of correlations among the three possible pairings (EE, EI, II).

preprint · 2012 · arXiv

Noise Suppression and Surplus Synchrony by Coincidence Detection

The functional significance of correlations between action potentials of neurons is still a matter of lively debate. In particular it is presently unclear how much synchrony is caused by afferent synchronized events and how much is intrinsic due to the connectivity structure of cortex. The available analytical approaches based on the diffusion approximation do not allow spike synchrony to be modeled, preventing a thorough analysis. Here we theoretically investigate to what extent common synaptic afferents and synchronized inputs each contribute to closely time-locked spiking activity of pairs of neurons. We employ direct simulation and extend earlier analytical methods based on the diffusion approximation to pulse-coupling, allowing us to introduce precisely timed correlations in the spiking activity of the synaptic afferents. We investigate the transmission of correlated synaptic input currents by pairs of integrate-and-fire model neurons, so that the same input covariance can be realized by common inputs or by spiking synchrony. We identify two distinct regimes: In the limit of low correlation linear perturbation theory accurately determines the correlation transmission coefficient, which is typically smaller than unity, but increases sensitively even for weakly synchronous inputs. In the limit of high afferent correlation, in the presence of synchrony a qualitatively new picture arises. As the non-linear neuronal response becomes dominant, the output correlation becomes higher than the total correlation in the input. This transmission coefficient larger than unity is a direct consequence of non-linear neural processing in the presence of noise, elucidating how synchrony-coded signals benefit from these generic properties present in cortical networks.

preprint · 2010 · arXiv

Non-equilibrium dynamics of stochastic point processes with refractoriness

Stochastic point processes with refractoriness appear frequently in the quantitative analysis of physical and biological systems, such as the generation of action potentials by nerve cells, the release and reuptake of vesicles at a synapse, and the counting of particles by detector devices. Here we present an extension of renewal theory to describe ensembles of point processes with time varying input. This is made possible by a representation in terms of occupation numbers of two states: Active and refractory. The dynamics of these occupation numbers follows a distributed delay differential equation. In particular, our theory enables us to uncover the effect of refractoriness on the time-dependent rate of an ensemble of encoding point processes in response to modulation of the input. We present exact solutions that demonstrate generic features, such as stochastic transients and oscillations in the step response as well as resonances, phase jumps and frequency doubling in the transfer of periodic signals. We show that a large class of renewal processes can indeed be regarded as special cases of the model we analyze. Hence our approach represents a widely applicable framework to define and analyze non-stationary renewal processes.
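As a point of reference for the effect of refractoriness, the stationary special case can be sketched in a few lines. This is an illustration only, assuming a Poisson process with a fixed dead time (function and parameter names are hypothetical); it does not implement the time-varying input or the distributed delay differential equation described in the abstract. In this special case each inter-event interval is the dead time plus an exponential waiting time, so the stationary rate is λ / (1 + λ d).

```python
import random

def simulate_dead_time_process(rate, dead_time, n_events, seed=0):
    """Event times of a Poisson process with a fixed dead time after each event."""
    rng = random.Random(seed)
    t = 0.0
    times = []
    for _ in range(n_events):
        # refractory period followed by an exponential waiting time
        t += dead_time + rng.expovariate(rate)
        times.append(t)
    return times

def stationary_rate(rate, dead_time):
    """Stationary event rate: the inverse of the mean interval dead_time + 1/rate."""
    return rate / (1.0 + rate * dead_time)

# Hypothetical parameters: 50 events/s hazard, 5 ms dead time.
times = simulate_dead_time_process(rate=50.0, dead_time=0.005, n_events=20000)
empirical = len(times) / times[-1]

print(empirical, stationary_rate(50.0, 0.005))
```

The empirical rate should lie close to the analytical value, which is lower than the hazard rate of 50 events per second because part of the time is spent in the refractory state.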

preprint · 2010 · arXiv

The perfect integrator driven by Poisson input and its approximation in the diffusion limit

In this note we consider the perfect integrator driven by Poisson process input. We derive its equilibrium and response properties and contrast them to the approximations obtained by applying the diffusion approximation. In particular, the probability density in the vicinity of the threshold differs, which leads to altered response properties of the system in equilibrium.
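A perfect (non-leaky) integrator is simple enough to simulate directly. The sketch below is a simplified illustration, not the model of the note: it assumes purely excitatory Poisson input with a fixed jump size, a hard threshold, and reset to zero, and all names and parameter values are hypothetical. Under these assumptions the unit fires on every n-th input event, where n is the number of jumps needed to reach the threshold.

```python
import random

def poisson_times(rate, duration, seed=0):
    """Event times of a homogeneous Poisson process on [0, duration]."""
    rng = random.Random(seed)
    t, times = 0.0, []
    while True:
        t += rng.expovariate(rate)
        if t > duration:
            return times
        times.append(t)

def perfect_integrator(input_times, weight, threshold):
    """Output spike times of a non-leaky integrator with reset to zero."""
    v = 0.0
    out = []
    for t in input_times:
        v += weight          # each input event moves the state by a finite jump
        if v >= threshold:   # threshold crossing emits a spike and resets
            out.append(t)
            v = 0.0
    return out

# Hypothetical parameters: 1000 inputs/s for 10 s, four jumps reach the threshold.
inputs = poisson_times(rate=1000.0, duration=10.0)
outputs = perfect_integrator(inputs, weight=0.25, threshold=1.0)

print(len(inputs), len(outputs))
```

The finite jump size is what distinguishes this picture from the diffusion approximation, in which the input is replaced by a continuous Gaussian process.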
