Source author record

Francesco Alemanno

Francesco Alemanno appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

cond-mat.dis-nn Machine Learning math-ph math.MP Artificial Intelligence Biological Physics math.NA Numerical Analysis

Catalog footprint

What is connected

8works

8topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2026arXiv

A polynomially accelerated fixed-point iteration for vector problems

Fixed-point solvers are ubiquitous in nonlinear PDEs, yet their progress collapses whenever the Jacobian at the solution carries an eigenvalue arbitrarily close to one. We ask whether such stagnation can be removed without storing long histories or solving dense least squares. Under two assumptions -- (A1) the linearised error $e_n$ is dominated by a multiplier $m$ with $|m|<1$ and (A2) residuals shrink monotonically -- we construct a quadratic blend of three iterates whose error polynomial has a double root at $m$. This three-point polynomial accelerator (TPA) cancels the stubborn mode up to $o(\|e_n\|)$, reduces to Aitken's $Δ^2$ process in one dimension, and matches a doubly blended Anderson step with depth $m=2$ when the regularisation vanishes, yet it keeps the Picard memory footprint. The only extra ingredient is a residual-based estimate of $w=(1-m)^{-1}$ obtained from a closed-form regularised least-squares fit that remains stable even when two residuals nearly coincide. Numerical experiments on linear systems with clustered spectra, a $320$-dimensional nonlinear $\tanh$ fixed point, and a $50\times 50$ Poisson discretisation show that TPA reaches the $10^{-8}$ residual tolerance in $32$, $36$, and $244$ map evaluations (respectively). In the same settings SOR requires $663$ steps and Anderson acceleration with depth $m=5$ consumes $52$, $38$, and $955$ evaluations. TPA therefore supplies a parameter-free, constant-memory drop-in accelerator whenever a single contraction factor throttles convergence.

preprint2023arXiv

Hopfield model with planted patterns: a teacher-student self-supervised learning model

While Hopfield networks are known as paradigmatic models for memory storage and retrieval, modern artificial intelligence systems mainly stand on the machine learning paradigm. We show that it is possible to formulate a teacher-student self-supervised learning problem with Boltzmann machines in terms of a suitable generalization of the Hopfield model with structured patterns, where the spin variables are the machine weights and patterns correspond to the training set's examples. We analyze the learning performance by studying the phase diagram in terms of the training set size, the dataset noise and the inference temperature (i.e. the weight regularization). With a small but informative dataset the machine can learn by memorization. With a noisy dataset, an extensive number of examples above a critical threshold is needed. In this regime the memory storage limits of the system becomes an opportunity for the occurrence of a learning regime in which the system can generalize.

preprint2022arXiv

Recurrent neural networks that generalize from examples and optimize by dreaming

The gap between the huge volumes of data needed to train artificial neural networks and the relatively small amount of data needed by their biological counterparts is a central puzzle in machine learning. Here, inspired by biological information-processing, we introduce a generalized Hopfield network where pairwise couplings between neurons are built according to Hebb's prescription for on-line learning and allow also for (suitably stylized) off-line sleeping mechanisms. Moreover, in order to retain a learning framework, here the patterns are not assumed to be available, instead, we let the network experience solely a dataset made of a sample of noisy examples for each pattern. We analyze the model by statistical-mechanics tools and we obtain a quantitative picture of its capabilities as functions of its control parameters: the resulting network is an associative memory for pattern recognition that learns from examples on-line, generalizes and optimizes its storage capacity by off-line sleeping. Remarkably, the sleeping mechanisms always significantly reduce (up to $\approx 90\%$) the dataset size required to correctly generalize, further, there are memory loads that are prohibitive to Hebbian networks without sleeping (no matter the size and quality of the provided examples), but that are easily handled by the present "rested" neural networks.

preprint2021arXiv

Pattern recognition in Deep Boltzmann machines

We consider a multi-layer Sherrington-Kirkpatrick spin-glass as a model for deep restricted Boltzmann machines and we solve for its quenched free energy, in the thermodynamic limit and allowing for a first step of replica symmetry breaking. This result is accomplished rigorously exploiting interpolating techniques and recovering the expression already known for the replica-symmetry case. Further, we drop the restriction constraint by introducing intra-layer connections among spins and we show that the resulting system can be mapped into a modular Hopfield network, which is also addressed rigorously via interpolating techniques up to the first step of replica symmetry breaking.

preprint2020arXiv

Generalized Guerra's interpolation schemes for dense associative neural networks

In this work we develop analytical techniques to investigate a broad class of associative neural networks set in the high-storage regime. These techniques translate the original statistical-mechanical problem into an analytical-mechanical one which implies solving a set of partial differential equations, rather than tackling the canonical probabilistic route. We test the method on the classical Hopfield model - where the cost function includes only two-body interactions (i.e., quadratic terms) - and on the "relativistic" Hopfield model - where the (expansion of the) cost function includes p-body (i.e., of degree p) contributions. Under the replica symmetric assumption, we paint the phase diagrams of these models by obtaining the explicit expression of their free energy as a function of the model parameters (i.e., noise level and memory storage). Further, since for non-pairwise models ergodicity breaking is non necessarily a critical phenomenon, we develop a fluctuation analysis and find that criticality is preserved in the relativistic model.

preprint2019arXiv

Interpolating between boolean and extremely high noisy patterns through Minimal Dense Associative Memories

Recently, Hopfield and Krotov introduced the concept of {\em dense associative memories} [DAM] (close to spin-glasses with $P$-wise interactions in a disordered statistical mechanical jargon): they proved a number of remarkable features these networks share and suggested their use to (partially) explain the success of the new generation of Artificial Intelligence. Thanks to a remarkable ante-litteram analysis by Baldi \& Venkatesh, among these properties, it is known these networks can handle a maximal amount of stored patterns $K$ scaling as $K \sim N^{P-1}$.\\ In this paper, once introduced a {\em minimal dense associative network} as one of the most elementary cost-functions falling in this class of DAM, we sacrifice this high-load regime -namely we force the storage of {\em solely} a linear amount of patterns, i.e. $K = αN$ (with $α>0$)- to prove that, in this regime, these networks can correctly perform pattern recognition even if pattern signal is $O(1)$ and is embedded in a sea of noise $O(\sqrt{N})$, also in the large $N$ limit. To prove this statement, by extremizing the quenched free-energy of the model over its natural order-parameters (the various magnetizations and overlaps), we derived its phase diagram, at the replica symmetric level of description and in the thermodynamic limit: as a sideline, we stress that, to achieve this task, aiming at cross-fertilization among disciplines, we pave two hegemon routes in the statistical mechanics of spin glasses, namely the replica trick and the interpolation technique.\\ Both the approaches reach the same conclusion: there is a not-empty region, in the noise-$T$ vs load-$α$ phase diagram plane, where these networks can actually work in this challenging regime; in particular we obtained a quite high critical (linear) load in the (fast) noiseless case resulting in $\lim_{β\to \infty}α_c(β)=0.65$.

preprint2019arXiv

Neural networks with redundant representation: detecting the undetectable

We consider a three-layer Sejnowski machine and show that features learnt via contrastive divergence have a dual representation as patterns in a dense associative memory of order P=4. The latter is known to be able to Hebbian-store an amount of patterns scaling as N^{P-1}, where N denotes the number of constituting binary neurons interacting P-wisely. We also prove that, by keeping the dense associative network far from the saturation regime (namely, allowing for a number of patterns scaling only linearly with N, while P>2) such a system is able to perform pattern recognition far below the standard signal-to-noise threshold. In particular, a network with P=4 is able to retrieve information whose intensity is O(1) even in the presence of a noise O(\sqrt{N}) in the large N limit. This striking skill stems from a redundancy representation of patterns -- which is afforded given the (relatively) low-load information storage -- and it contributes to explain the impressive abilities in pattern recognition exhibited by new-generation neural networks. The whole theory is developed rigorously, at the replica symmetric level of approximation, and corroborated by signal-to-noise analysis and Monte Carlo simulations.

preprint2018arXiv

Dreaming neural networks: rigorous results

Recently a daily routine for associative neural networks has been proposed: the network Hebbian-learns during the awake state (thus behaving as a standard Hopfield model), then, during its sleep state, optimizing information storage, it consolidates pure patterns and removes spurious ones: this forces the synaptic matrix to collapse to the projector one (ultimately approaching the Kanter-Sompolinksy model). This procedure keeps the learning Hebbian-based (a biological must) but, by taking advantage of a (properly stylized) sleep phase, still reaches the maximal critical capacity (for symmetric interactions). So far this emerging picture (as well as the bulk of papers on unlearning techniques) was supported solely by mathematically-challenging routes, e.g. mainly replica-trick analysis and numerical simulations: here we rely extensively on Guerra's interpolation techniques developed for neural networks and, in particular, we extend the generalized stochastic stability approach to the case. Confining our description within the replica symmetric approximation (where the previous ones lie), the picture painted regarding this generalization (and the previously existing variations on theme) is here entirely confirmed. Further, still relying on Guerra's schemes, we develop a systematic fluctuation analysis to check where ergodicity is broken (an analysis entirely absent in previous investigations). We find that, as long as the network is awake, ergodicity is bounded by the Amit-Gutfreund-Sompolinsky critical line (as it should), but, as the network sleeps, sleeping destroys spin glass states by extending both the retrieval as well as the ergodic region: after an entire sleeping session the solely surviving regions are retrieval and ergodic ones and this allows the network to achieve the perfect retrieval regime (the number of storable patterns equals the number of neurons in the network).

Francesco Alemanno

What is connected

Connect this record

See the researcher in context

Building this map preview

8 published item(s)

A polynomially accelerated fixed-point iteration for vector problems

Hopfield model with planted patterns: a teacher-student self-supervised learning model

Recurrent neural networks that generalize from examples and optimize by dreaming

Pattern recognition in Deep Boltzmann machines

Generalized Guerra's interpolation schemes for dense associative neural networks

Interpolating between boolean and extremely high noisy patterns through Minimal Dense Associative Memories

Neural networks with redundant representation: detecting the undetectable

Dreaming neural networks: rigorous results