Source author record

Claudio Zeni

Claudio Zeni appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, unclaimed source record

physics.comp-ph, cond-mat.mtrl-sci, cond-mat.other, Information Theory, Machine Learning, math.IT

Catalog footprint

What is connected

5 works

6 topics

4 close collaborators

Preprint, 2022, arXiv

Compact atomic descriptors enable accurate predictions via linear models

We probe the accuracy of linear ridge regression employing a three-body local density representation derived from the atomic cluster expansion. We benchmark the accuracy of this framework in the prediction of formation energies and atomic forces in molecules and solids. We find that such a simple regression framework performs on par with state-of-the-art machine learning methods which are, in most cases, more complex and more computationally demanding. Subsequently, we look for ways to sparsify the descriptor and further improve the computational efficiency of the method. To this aim, we use both principal component analysis and least absolute shrinkage and selection operator (LASSO) regression for energy fitting on six single-element datasets. Both methods highlight the possibility of constructing a descriptor that is four times smaller than the original with a similar or even improved accuracy. Furthermore, we find that the reduced descriptors share a sizable fraction of their features across the six independent datasets, hinting at the possibility of designing material-agnostic, optimally compressed, and accurate descriptors.
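
For orientation, the two linear fits named in this abstract can be written in their standard textbook form. The symbols are assumptions introduced here, not taken from the paper: $X$ is an $N \times D$ matrix of descriptor features, $y$ the vector of $N$ reference energies, $w$ the $D$ weights, and $\lambda > 0$ a regularisation strength.

```latex
% Standard ridge and LASSO objectives for a linear model y ~ Xw.
% Ridge has a closed-form solution; LASSO does not, but its L1 penalty
% drives some weights exactly to zero, which is what makes it a sparsifier.
\begin{align}
  \hat{w}_{\mathrm{ridge}}
    &= \arg\min_{w}\; \lVert Xw - y \rVert_2^2 + \lambda \lVert w \rVert_2^2
     = \left( X^{\top} X + \lambda I \right)^{-1} X^{\top} y , \\
  \hat{w}_{\mathrm{LASSO}}
    &= \arg\min_{w}\; \lVert Xw - y \rVert_2^2 + \lambda \lVert w \rVert_1 .
\end{align}
```

Features whose LASSO weight is exactly zero can be dropped from the descriptor, which is one way to read the compression the abstract reports.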

Preprint, 2022, arXiv

Data-driven simulation and characterisation of gold nanoparticle melting

The simulation and analysis of the thermal stability of nanoparticles, a stepping stone towards their application in technological devices, require fast and accurate force fields, in conjunction with effective characterisation methods. In this work, we develop efficient, transferable, and interpretable machine learning force fields for gold nanoparticles based on data gathered from Density Functional Theory calculations. We use them to investigate the thermodynamic stability of gold nanoparticles of different sizes (1 to 6 nm), containing up to 6266 atoms, concerning a solid-liquid phase change through molecular dynamics simulations. We predict nanoparticle melting temperatures in good agreement with available experimental data. Furthermore, we characterize the solid-liquid phase change mechanism employing an unsupervised learning scheme to categorize local atomic environments. We thus provide a data-driven definition of liquid atomic arrangements in the inner and surface regions of a nanoparticle and employ it to show that melting initiates at the outer layers.

Preprint, 2022, arXiv

Exploring the robust extrapolation of high-dimensional machine learning potentials

We show that, contrary to popular assumptions, predictions from machine learning potentials built upon high-dimensional atom-density representations almost exclusively occur in regions of the representation space which lie outside the convex hull defined by the training set points. We then propose a perspective to rationalize the domain of robust extrapolation and accurate prediction of atomistic machine learning potentials in terms of the probability density induced by training points in the representation space.
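
The geometric effect behind this claim can be reproduced on synthetic data. The sketch below is not the paper's procedure: it draws Gaussian points instead of atomic descriptors, and it uses a cheap separating-direction certificate rather than an exact hull test, so the reported fractions are lower bounds. The function names `certified_outside` and `fraction_outside` are hypothetical.

```python
import random


def certified_outside(x, train):
    """Return True if x is provably outside the convex hull of train.

    Uses the single direction d = x. Every point p of the hull satisfies
    d.p <= max_i d.x_i, so d.x > max_i d.x_i proves x is outside.
    A False result is inconclusive: x may still lie outside the hull.
    """
    self_proj = sum(a * a for a in x)
    max_proj = max(sum(a * b for a, b in zip(x, t)) for t in train)
    return self_proj > max_proj


def fraction_outside(dim, n_train=200, n_test=200, seed=0):
    """Fraction of test points certified outside the training hull."""
    rng = random.Random(seed)

    def draw():
        return [rng.gauss(0.0, 1.0) for _ in range(dim)]

    train = [draw() for _ in range(n_train)]
    tests = [draw() for _ in range(n_test)]
    return sum(certified_outside(x, train) for x in tests) / n_test


low = fraction_outside(dim=2)    # few points certified outside in 2D
high = fraction_outside(dim=50)  # nearly all certified outside in 50D
print(f"dim=2: {low:.2f}  dim=50: {high:.2f}")
```

Test and training points come from the same distribution here, yet in 50 dimensions almost every test point falls outside the training hull, which is why hull membership alone is a poor criterion for "interpolation" in high-dimensional representations.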

Preprint, 2022, arXiv

Ranking the information content of distance measures

Real-world data typically contain a large number of features that are often heterogeneous in nature, relevance, and also units of measure. When assessing the similarity between data points, one can build various distance measures using subsets of these features. Using the fewest features but still retaining sufficient information about the system is crucial in many statistical learning approaches, particularly when data are sparse. We introduce a statistical test that can assess the relative information retained when using two different distance measures, and determine if they are equivalent, independent, or if one is more informative than the other. This in turn allows finding the most informative distance measure out of a pool of candidates. The approach is applied to find the most relevant policy variables for controlling the Covid-19 epidemic and to find compact yet informative representations of atomic structures, but its potential applications are wide ranging in many branches of science.
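
The abstract does not spell out the statistic, so the following is only a sketch of one rank-based way to compare two distance measures, and treating it as equivalent to the paper's test is an assumption. For each point it finds the nearest neighbour under measure A and records that neighbour's rank under measure B; a low average rank means A's neighbourhoods are largely preserved by B. All names (`rank_of`, `imbalance`, `d_full`, `d_x`) are hypothetical.

```python
import random


def rank_of(j, i, dist, points):
    """Rank (1 = nearest) of point j among the neighbours of point i."""
    d_ij = dist(points[i], points[j])
    closer = sum(
        1
        for k in range(len(points))
        if k != i and k != j and dist(points[i], points[k]) < d_ij
    )
    return closer + 1


def imbalance(dist_a, dist_b, points):
    """Mean rank under B of each point's nearest neighbour under A, times 2/N.

    Values near 0 mean A predicts B's neighbourhoods well;
    values near 1 mean A carries little information about B.
    """
    n = len(points)
    total = 0
    for i in range(n):
        nearest = min(
            (k for k in range(n) if k != i),
            key=lambda k: dist_a(points[i], points[k]),
        )
        total += rank_of(nearest, i, dist_b, points)
    return 2.0 * total / (n * n)


def d_full(p, q):
    """Squared Euclidean distance using both features."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2


def d_x(p, q):
    """Distance using the first feature only."""
    return abs(p[0] - q[0])


rng = random.Random(0)
points = [(rng.random(), rng.random()) for _ in range(200)]

full_to_x = imbalance(d_full, d_x, points)
x_to_full = imbalance(d_x, d_full, points)
print(f"full -> x: {full_to_x:.2f}   x -> full: {x_to_full:.2f}")
```

With two independent features, the full distance predicts neighbourhoods in the reduced one much better than the reverse, so the asymmetry of the two numbers identifies the more informative measure.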

Preprint, 2019, arXiv

Building nonparametric $n$-body force fields using Gaussian process regression

Constructing a classical potential suited to simulate a given atomic system is a remarkably difficult task. This chapter presents a framework under which this problem can be tackled, based on the Bayesian construction of nonparametric force fields of a given order using Gaussian process (GP) priors. The formalism of GP regression is first reviewed, particularly in relation to its application in learning local atomic energies and forces. For accurate regression it is fundamental to incorporate prior knowledge into the GP kernel function. To this end, this chapter details how properties of smoothness, invariance and interaction order of a force field can be encoded into corresponding kernel properties. A range of kernels is then proposed, possessing all the required properties and an adjustable parameter $n$ governing the interaction order modelled. The order $n$ best suited to describe a given system can be found automatically within the Bayesian framework by maximisation of the marginal likelihood. The procedure is first tested on a toy model of known interaction and later applied to two real materials described at the DFT level of accuracy. The models automatically selected for the two materials were found to be in agreement with physical intuition. More generally, it was found that lower order (simpler) models should be chosen when the data are not sufficient to resolve more complex interactions. Low-$n$ GPs can be further sped up by orders of magnitude by constructing the corresponding tabulated force field, here named "MFF".
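
The model selection step mentioned in this abstract relies on the GP marginal likelihood, whose standard form for Gaussian noise is given below. The notation is an assumption made here for illustration: $K_n$ is the $N \times N$ kernel matrix built from an $n$-body kernel on the training inputs $X$, $y$ the vector of $N$ training targets, and $\sigma^2$ the noise variance.

```latex
% Standard log marginal likelihood of GP regression with Gaussian noise.
% The first term rewards data fit, the second penalises model complexity,
% so a higher-order kernel is preferred only when the data support it.
\begin{align}
  \log p(y \mid X, n)
    &= -\tfrac{1}{2}\, y^{\top} \left( K_n + \sigma^2 I \right)^{-1} y
       - \tfrac{1}{2} \log \det \left( K_n + \sigma^2 I \right)
       - \tfrac{N}{2} \log 2\pi , \\
  n^{*} &= \arg\max_{n}\; \log p(y \mid X, n) .
\end{align}
```

The complexity penalty in the second term is consistent with the abstract's observation that lower-order models are selected when data are scarce.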
