Researcher profile

Matteo Marsili

Matteo Marsili contributes to research discovery and scholarly infrastructure.

Researcher. Affiliation not imported. Open to collaborate.

Trust snapshot

Quick read

Trust 21 (Emerging), Verification L1, Unclaimed author
7 works
0 followers
12 topics
4 close collaborators

Published work

7 published items

preprint, 2022, arXiv

Quantifying Relevance in Learning and Inference

Learning is a distinctive feature of intelligent behaviour. High-throughput experimental data and Big Data promise to open new windows on complex systems such as cells, the brain or our societies. Yet, the puzzling success of Artificial Intelligence and Machine Learning shows that we still have a poor conceptual understanding of learning. These applications push statistical inference into uncharted territories where data is high-dimensional and scarce, and prior information on "true" models is scant if not totally absent. Here we review recent progress on understanding learning, based on the notion of "relevance". The relevance, as we define it here, quantifies the amount of information that a dataset or the internal representation of a learning machine contains on the generative model of the data. This allows us to define maximally informative samples, on one hand, and optimal learning machines on the other. These are ideal limits of samples and of machines, that contain the maximal amount of information about the unknown generative process, at a given resolution (or level of compression). Both ideal limits exhibit critical features in the statistical sense: Maximally informative samples are characterised by a power-law frequency distribution (statistical criticality) and optimal learning machines by an anomalously large susceptibility. The trade-off between resolution (i.e. compression) and relevance distinguishes the regime of noisy representations from that of lossy compression. These are separated by a special point characterised by Zipf's law statistics. This identifies samples obeying Zipf's law as the most compressed loss-less representations that are optimal in the sense of maximal relevance. Criticality in optimal learning machines manifests in an exponential degeneracy of energy levels, that leads to unusual thermodynamic properties.
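The abstract above gives no formulas, so the following is only a minimal sketch of how resolution and relevance could be computed for a finite sample. It assumes the empirical-frequency definitions: resolution H[s] is the entropy of the observed state frequencies k_s/N, and relevance H[k] is the entropy of the distribution k·m_k/N, where m_k is the number of states observed exactly k times. The function name `resolution_and_relevance` is hypothetical.

```python
from collections import Counter
from math import log


def resolution_and_relevance(sample):
    """Return (H[s], H[k]) in nats for a list of hashable observations.

    Assumed definitions (not stated in the abstract):
      H[s] = -sum_s (k_s / N) * log(k_s / N)
      H[k] = -sum_k (k * m_k / N) * log(k * m_k / N)
    """
    n = len(sample)
    k_s = Counter(sample)        # k_s: number of times state s is observed
    m_k = Counter(k_s.values())  # m_k: number of states observed exactly k times
    h_s = -sum((k / n) * log(k / n) for k in k_s.values())
    h_k = -sum((k * m / n) * log(k * m / n) for k, m in m_k.items())
    return h_s, h_k


if __name__ == "__main__":
    print(resolution_and_relevance(["a", "a", "b", "c"]))
```

For the sample `["a", "a", "b", "c"]` this gives H[s] = 1.5 ln 2 and H[k] = ln 2. A sample in which every observation is distinct has maximal resolution and zero relevance under these definitions, which is one way to see the trade-off the abstract describes.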

preprint, 2021, arXiv

A random energy approach to deep learning

We study a generic ensemble of deep belief networks which is parametrized by the distribution of energy levels of the hidden states of each layer. We show that, within a random energy approach, statistical dependence can propagate from the visible to deep layers only if each layer is tuned close to the critical point during learning. As a consequence, efficiently trained learning machines are characterised by a broad distribution of energy levels. The analysis of Deep Belief Networks and Restricted Boltzmann Machines on different datasets confirms these conclusions.

preprint, 2020, arXiv

Characterising authors on the extent of their paper acceptance: A case study of the Journal of High Energy Physics

New researchers are usually very curious about the recipe that could improve the chances of their paper getting accepted at a reputed venue (journal or conference). In search of such a recipe, we investigate the profiles and peer-review texts of authors whose papers almost always get accepted at a venue (the Journal of High Energy Physics in our current work). We find that authors with a high acceptance rate are likely to have a high number of citations, a high $h$-index, a higher number of collaborators, etc. We notice that they receive relatively lengthy and positive reviews for their papers. In addition, we construct three networks (co-reviewer, co-citation and collaboration) and study the network-centric features and the intra- and inter-category edge interactions. We find that the authors with a high acceptance rate are more 'central' in these networks; the volume of intra- and inter-category interactions is also drastically different for these authors compared to the other authors. Finally, using the above set of features, we train standard machine learning models (random forest, XGBoost) and obtain very high class-wise precision and recall. In a follow-up discussion we also describe how, apart from the author characteristics, the peer-review system might itself play a role in driving the distinction among the different categories, which could lead to potential discrimination and unfairness and calls for further investigation by the system admins.

preprint, 2020, arXiv

Estimating the impact of preventive quarantine with reverse epidemiology

The impact of mitigation or control measures on an epidemic can be estimated by fitting the parameters of a compartmental model to empirical data, and running the model forward with modified parameters that account for a specific measure. This approach has several drawbacks, stemming from biases or lack of availability of data and instability of parameter estimates. Here we take the opposite approach -- that we call reverse epidemiology. Given the data, we reconstruct backward in time an ensemble of networks of contacts, and we assess the impact of measures on that specific realization of the contagion process. This approach is robust because it only depends on parameters that describe the evolution of the disease within one individual (e.g. latency time) and not on parameters that describe the spread of the epidemic in a population. Using this method, we assess the impact of preventive quarantine on the ongoing outbreak of Covid-19 in Italy. This gives an estimate of how many infections could have been avoided had preventive quarantine been enforced at a given time.
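For contrast, the conventional forward approach mentioned at the start of the abstract above (run a compartmental model, then rerun it with modified parameters) might look like the sketch below. This is not the reverse-epidemiology method of the paper. The SIR model, the Euler integration, the function name `simulate_sir` and all parameter values are assumptions chosen purely for illustration, not fitted to any data.

```python
def simulate_sir(beta, gamma, i0=1e-4, days=300, dt=0.1):
    """Integrate a basic SIR model with explicit Euler steps.

    beta: transmission rate per day, gamma: recovery rate per day.
    Returns the final (susceptible, infected, recovered) population fractions.
    """
    s, i, r = 1.0 - i0, i0, 0.0
    for _ in range(int(days / dt)):
        new_infections = beta * s * i * dt
        new_recoveries = gamma * i * dt
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
    return s, i, r


if __name__ == "__main__":
    # Hypothetical parameter values, not estimates for any real outbreak.
    baseline = simulate_sir(beta=0.30, gamma=0.10)
    with_measure = simulate_sir(beta=0.15, gamma=0.10)  # measure modelled as a lower beta
    avoided = (1.0 - baseline[0]) - (1.0 - with_measure[0])
    print(f"Fraction of the population spared infection: {avoided:.3f}")
```

The result depends directly on the population-level parameter `beta`, which is exactly the kind of parameter the abstract describes as hard to estimate reliably.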

preprint, 2020, arXiv

Optimal Work Extraction and the Minimum Description Length Principle

We discuss work extraction from classical information engines (e.g., Szilárd) with $N$ particles, $q$ partitions, and arbitrary initial non-equilibrium states. In particular, we focus on their optimal behaviour, which includes the measurement of a set of quantities $Φ$ with a feedback protocol that extracts the maximal average amount of work. We show that the optimal non-equilibrium state to which the engine should be driven before the measurement is given by the normalised maximum-likelihood probability distribution of a statistical model that admits $Φ$ as sufficient statistics. Furthermore, we show that the minimax universal code redundancy $\mathcal{R}^*$ associated with this model provides an upper bound to the work that the demon can extract on average from the cycle, in units of $k_{\rm B}T$. We also find that, in the limit of large $N$, the maximum average extracted work cannot exceed $H[Φ]/2$, i.e. one half times the Shannon entropy of the measurement. Our results establish a connection between optimal work extraction in stochastic thermodynamics and optimal universal data compression, providing design principles for optimal information engines. In particular, they suggest that: (i) optimal coding is thermodynamically efficient, and (ii) it is essential to drive the system into a critical state in order to achieve optimal performance.
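To make the normalised maximum-likelihood (NML) distribution and its redundancy concrete, here is a minimal sketch for one simple case. It assumes a Bernoulli model over binary sequences of length n, whose sufficient statistic is the number of ones; this choice of model and the function names are illustrative assumptions, not the engine analysed in the paper. The quantity returned by `nml_complexity_bernoulli` is the logarithm of the NML normalisation, which for this model is the minimax redundancy in nats.

```python
from math import comb, exp, log


def nml_complexity_bernoulli(n):
    """Log of the NML normaliser for a Bernoulli model on n binary symbols (nats).

    The normaliser sums, over all sequences, the likelihood evaluated at that
    sequence's own maximum-likelihood parameter p = k / n, where k is the
    number of ones. Sequences with the same k are grouped with comb(n, k).
    """
    total = 0.0
    for k in range(n + 1):
        p = k / n
        # Python evaluates 0.0 ** 0 as 1.0, which handles k = 0 and k = n.
        total += comb(n, k) * (p ** k) * ((1 - p) ** (n - k))
    return log(total)


def nml_sequence_probability(k, n):
    """NML probability of one specific sequence of length n containing k ones."""
    p = k / n
    return (p ** k) * ((1 - p) ** (n - k)) / exp(nml_complexity_bernoulli(n))


if __name__ == "__main__":
    for n in (1, 2, 10, 100):
        print(n, nml_complexity_bernoulli(n))
```

For n = 1 the normaliser equals 2, so the complexity is ln 2; for n = 2 it equals 2.5. Reading these values as a bound on extractable work in units of $k_{\rm B}T$ relies on the result stated in the abstract and is not something this sketch verifies.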

preprint, 2019, arXiv

The peculiar statistical mechanics of Optimal Learning Machines

Optimal Learning Machines (OLM) are systems that extract maximally informative representations of the environment they are in contact with, or of the data they are presented with. It has recently been suggested that these systems are characterised by an exponential distribution of energy levels. In order to understand the peculiar properties of OLM within a broader framework, I consider an ensemble of optimisation problems over functions of many variables, part of which describe a sub-system and the rest account for its interaction with a random environment. The number of states of the sub-system with a given value of the objective function obeys a stretched exponential distribution, with exponent $γ$, and the interaction part is drawn at random from the same distribution, independently for each configuration of the whole system. Systems with $γ=1$ then correspond to OLM, and we find that they sit at the boundary between two regions with markedly different properties. For all $γ>0$ the system exhibits a freezing phase transition. The transition is discontinuous for $γ<1$ and it is continuous for $γ>1$. The region $γ>1$ corresponds to learnable energy landscapes and the behaviour of the sub-system becomes predictable as the size of the environment exceeds a critical threshold. For $γ<1$, instead, the energy landscape is unlearnable and the behaviour of the system becomes more and more unpredictable as the size of the environment increases. Sub-systems with $γ=1$ (OLM) feature a behaviour which is independent of the relative size of the environment. This is consistent with the expectation that efficient representations should be largely independent of the level of detail of the description of the environment.

preprint, 2010, arXiv

On information efficiency and financial stability

We study a simple model of an asset market with informed and non-informed agents. In the absence of non-informed agents, the market becomes information efficient when the number of traders with different private information is large enough. Upon introducing non-informed agents, we find that the latter contribute significantly to the trading activity if and only if the market is (nearly) information efficient. This suggests that information efficiency might be a necessary condition for bubble phenomena induced by the behavior of non-informed traders or, conversely, that throwing some sand in the gears of financial markets may curb the occurrence of bubbles.