Source author record

Massimo Bernaschi

Massimo Bernaschi appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

physics.comp-ph; Distributed, Parallel, and Cluster Computing; physics.flu-dyn; cond-mat.dis-nn; cond-mat.soft; cond-mat.stat-mech; Cryptography and Security; Biological Physics; cond-mat.mes-hall; Data Structures and Algorithms; Machine Learning; Social and Information Networks

Catalog footprint

What is connected

18 works

12 topics

4 close collaborators

preprint 2026 arXiv

Low energy excitations in a long prism geometry: computing the lower critical dimension of the Ising spin glass

We propose a general method for studying systems that display excitations with arbitrarily low energy in their low-temperature phase. We argue that in a rectangular right prism geometry, with longitudinal size much larger than the transverse size, correlations decay exponentially (at all temperatures) along the longitudinal dimension, but the scaling of the correlation length with the transverse size carries crucial information from which the lower critical dimension can be inferred. The method is applied in the particularly demanding context of Ising spin glasses at zero magnetic field. The lower critical dimension and the multifractal spectrum for the correlation function are computed from large-scale numerical simulations. Several technical novelties (such as the unexpectedly crucial performance of Houdayer's cluster method or the convenience of using open - rather than periodic - boundary conditions) allow us to study three-dimensional prisms with transverse dimensions up to $L=24$ and effectively infinite longitudinal dimensions down to low temperatures. The value that we find for the lower critical dimension turns out to be in agreement with expectations from both the Replica Symmetry Breaking theory and the Droplet model for spin glasses. We argue that our novel setting holds promise in clarifying which of the two competing theories more accurately describes three-dimensional spin glasses.

preprint 2022 arXiv

Blocking Techniques for Sparse Matrix Multiplication on Tensor Accelerators

Tensor accelerators have gained popularity because they provide a cheap and efficient solution for speeding up computationally expensive tasks in Deep Learning and, more recently, in other Scientific Computing applications. However, since their features are specifically designed for tensor algebra (typically dense matrix products), it is commonly assumed that they are not suitable for applications with sparse data. To challenge this viewpoint, we discuss methods and present solutions for accelerating sparse matrix multiplication on such architectures. In particular, we present a 1-dimensional blocking algorithm with theoretical guarantees on the density, which builds dense blocks from arbitrary sparse matrices. Experimental results show that, even for unstructured and highly sparse matrices, our block-based solution, which exploits NVIDIA Tensor Cores, is faster than its sparse counterpart. We observed significant speed-ups of up to two orders of magnitude on real-world sparse matrices.

preprint 2021 arXiv

LBcuda: a high-performance CUDA port of LBsoft for simulation of colloidal systems

We present LBcuda, a GPU-accelerated version of LBsoft, our open-source MPI-based software for the simulation of multi-component colloidal flows. We describe the design principles, the optimization and the resulting performance as compared to the CPU version, using both an average-cost GPU and high-end NVIDIA GPU cards (V100 and the latest A100). The results show a substantial acceleration for the fluid solver, reaching up to 200 GLUPS (Giga Lattice Updates Per Second) on a cluster made of 512 NVIDIA A100 cards simulating a grid of eight billion lattice points. These results open attractive prospects for the computational design of new materials based on colloidal particles.

preprint 2021 arXiv

Reducing Bias in Modeling Real-world Password Strength via Deep Learning and Dynamic Dictionaries

Password security hinges on an in-depth understanding of the techniques adopted by attackers. Unfortunately, real-world adversaries resort to pragmatic guessing strategies such as dictionary attacks that are inherently difficult to model in password security studies. In order to be representative of the actual threat, dictionary attacks must be thoughtfully configured and tuned. However, this process requires domain knowledge and expertise that cannot be easily replicated. The consequence of inaccurately calibrating dictionary attacks is the unreliability of password security analyses, impaired by a severe measurement bias. In the present work, we introduce a new generation of dictionary attacks that is consistently more resilient to inadequate configurations. Requiring no supervision or domain knowledge, this technique automatically approximates the advanced guessing strategies adopted by real-world attackers. To achieve this: (1) we use deep neural networks to model the proficiency of adversaries in building attack configurations; (2) we introduce dynamic guessing strategies within dictionary attacks. These mimic experts' ability to adapt their guessing strategies on the fly by incorporating knowledge of their targets. Our techniques enable more robust and sound password strength estimates within dictionary attacks, eventually reducing overestimation in modeling real-world threats in password security. Code available: https://github.com/TheAdamProject/adams

preprint 2021 arXiv

TLBfind: a Thermal Lattice Boltzmann code for concentrated emulsions with FINite-size Droplets

In this paper, we present TLBfind, a GPU code for simulating the hydrodynamics of droplets along with a dynamic temperature field. TLBfind hinges on a two-dimensional multi-component lattice Boltzmann (LB) model simulating a concentrated emulsion with finite-size droplets evolving in a thermal convective state, just above the transition from conduction to convection. The droplet concentration of the emulsion system is tunable, and at the core of the code lies the possibility to measure a large number of physical observables characterising the flow and droplets. Furthermore, TLBfind includes a parallel GPU implementation of the Delaunay triangulation, useful for the detection of droplets' plastic rearrangements, and several types of boundary conditions, supporting simulations of channels with structured rough walls.

preprint 2020 arXiv

Improving Password Guessing via Representation Learning

Learning useful representations from unstructured data is one of the core challenges, as well as a driving force, of modern data-driven approaches. Deep learning has demonstrated the broad advantages of learning and harnessing such representations. In this paper, we introduce a representation learning approach for password guessing based on deep generative models. We show that an abstract password representation naturally offers compelling and versatile properties that can be used to open new directions in the extensively studied, and yet presently active, password guessing field. These properties can establish novel password generation techniques that are neither feasible nor practical with the existing probabilistic and non-probabilistic approaches. Based on these properties, we introduce: (1) a general framework for conditional password guessing that can generate passwords with arbitrary biases; and (2) an Expectation Maximization-inspired framework that can dynamically adapt the estimated password distribution to match the distribution of the attacked password set.

preprint 2020 arXiv

LBsoft: a parallel open-source software for simulation of colloidal systems

We present LBsoft, an open-source software developed mainly to simulate the hydrodynamics of colloidal systems based on the concurrent coupling between lattice Boltzmann methods for the fluid and discrete particle dynamics for the colloids. Such coupling has been developed before but, to the best of our knowledge, no detailed discussion of the programming issues to be faced in order to attain an efficient implementation on parallel architectures has been presented to date. In this paper, we describe in detail the underlying multi-scale models and their coupling procedure, alongside a description of the relevant input variables, to facilitate third-party usage. The code is designed to exploit parallel computing platforms, also taking advantage of the recent AVX-512 instruction set. We focus on LBsoft structure, functionality, parallel implementation, performance and availability, so as to facilitate access to this computational tool for the research community in the field. The capabilities of LBsoft are highlighted for a number of prototypical case studies, such as Pickering emulsions, bicontinuous systems, as well as an original study of the coarsening process in confined bijels under shear.

preprint 2020 arXiv

Towards Exascale Lattice Boltzmann computing

We discuss the state of the art of Lattice Boltzmann (LB) computing, with special focus on prospective LB schemes capable of meeting the forthcoming Exascale challenge. After reviewing the basic notions of LB computing, we discuss current techniques to improve the performance of LB codes on parallel machines and illustrate selected leading-edge applications in the Petascale range. Finally, we put forward a few ideas on how to improve the communication/computation overlap in current large-scale LB simulations, as well as possible strategies towards fault-tolerant LB schemes.

preprint 2019 arXiv

A Performance Study of the 2D Ising Model on GPUs

The simulation of the two-dimensional Ising model is used as a benchmark to show the computational capabilities of Graphics Processing Units (GPUs). The rich programming environment now available on GPUs and flexible hardware capabilities allowed us to quickly experiment with several implementation ideas: a simple stencil-based algorithm, recasting the stencil operations into matrix multiplies to take advantage of Tensor Cores available on NVIDIA GPUs, and a highly optimized multi-spin coding approach. Using the managed memory API available in CUDA allows for simple and efficient distribution of these implementations across a multi-GPU NVIDIA DGX-2 server. We show that even a basic GPU implementation can outperform current results published on TPUs and that the optimized multi-GPU implementation can simulate very large lattices faster than custom FPGA solutions.
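As an illustration of the "simple stencil-based algorithm" mentioned in this abstract, the following is a minimal CPU sketch of a checkerboard Metropolis sweep. It is not the authors' CUDA code: the function name `checkerboard_sweep`, the coupling J = 1, the zero field, the periodic boundaries and the even lattice size are all assumptions made for the example. Sites of one colour have no neighbours of the same colour, which is why a GPU version could update them in parallel; here they are simply visited in a loop.

```python
import math
import random

def checkerboard_sweep(spins, beta, rng):
    """One Metropolis sweep of a periodic n x n Ising lattice (n even, J = 1).

    `spins` is a list of lists holding +1 or -1 and is updated in place.
    """
    n = len(spins)
    for color in (0, 1):                      # update "black" sites, then "white"
        for i in range(n):
            for j in range(n):
                if (i + j) % 2 != color:
                    continue
                # Four-point nearest-neighbour stencil with periodic wrap-around.
                nn = (spins[(i - 1) % n][j] + spins[(i + 1) % n][j]
                      + spins[i][(j - 1) % n] + spins[i][(j + 1) % n])
                delta_e = 2 * spins[i][j] * nn    # energy change if this spin flips
                if delta_e <= 0 or rng.random() < math.exp(-beta * delta_e):
                    spins[i][j] = -spins[i][j]
    return spins

# Usage: a few sweeps of a small random lattice at inverse temperature 0.4.
rng = random.Random(0)
lattice = [[rng.choice((-1, 1)) for _ in range(8)] for _ in range(8)]
for _ in range(10):
    checkerboard_sweep(lattice, 0.4, rng)
```

At very large `beta` an all-up lattice stays all-up, while at `beta = 0` every proposed flip is accepted, which gives two quick sanity checks for the acceptance rule.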

preprint 2019 arXiv

Strong ergodicity breaking in aging of mean field spin glasses

Out-of-equilibrium relaxation processes show aging if they become slower as time passes. Aging processes are ubiquitous and play a fundamental role in the physics of glasses and spin glasses and in other applications (e.g. in algorithms minimizing complex cost/loss functions). The theory of aging in the out-of-equilibrium dynamics of mean-field spin glass models has achieved a fundamental role, thanks to the asymptotic analytic solution found by Cugliandolo and Kurchan. However, this solution is based on assumptions (e.g. the weak ergodicity breaking hypothesis) which have never been put under a strong test until now. In the present work we present the results of an extraordinarily large set of numerical simulations of the prototypical mean-field spin glass models, namely the Sherrington-Kirkpatrick and the Viana-Bray models. Thanks to a very intensive use of GPUs, we have been able to run the latter model for more than $2^{64}$ spin updates and thus safely extrapolate the numerical data both in the thermodynamic limit and in the large-time limit. The measurements of the two-time correlation functions in isothermal aging after a quench from a random initial configuration to a temperature $T<T_c$ provide clear evidence that, at large times, such correlations do not decay to zero as expected by assuming weak ergodicity breaking. We conclude that strong ergodicity breaking takes place in the aging dynamics of mean-field spin glasses which, asymptotically, takes place in a confined configurational space. Theoretical models for the aging dynamics need to be revised accordingly.

preprint 2016 arXiv

Algorithms and Heuristics for Scalable Betweenness Centrality Computation on Multi-GPU Systems

Betweenness Centrality (BC) is steadily growing in popularity as a metric of the influence of a vertex in a graph. The BC score of a vertex is proportional to the number of all-pairs shortest paths passing through it. However, complete and exact BC computation for a large-scale graph is an extraordinary challenge that requires high performance computing techniques to provide results in a reasonable amount of time. Our approach combines bi-dimensional (2-D) decomposition of the graph and multi-level parallelism together with a suitable data-thread mapping that overcomes most of the difficulties caused by the irregularity of the computation on GPUs. Furthermore, we propose novel heuristics which exploit the topology information of the graph in order to reduce time and space requirements of BC computation. Experimental results on synthetic and real-world graphs show that the proposed techniques allow the BC computation of graphs which are too large to fit in the memory of a single computational node, along with a significant reduction of the computing time.
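To make the definition of the BC score concrete, here is a small single-threaded sketch of exact betweenness centrality for an unweighted, undirected graph. It uses the well-known Brandes scheme (one breadth-first search per source, then a backward accumulation); it is not the 2-D decomposed multi-GPU method of the paper, and the function name `betweenness` and the adjacency-dictionary input format are assumptions of this example.

```python
from collections import deque

def betweenness(adj):
    """Exact betweenness centrality; `adj` maps each vertex to its neighbours."""
    bc = {v: 0.0 for v in adj}
    for s in adj:
        # Forward phase: BFS from s, counting shortest paths (sigma).
        dist = {s: 0}
        sigma = {v: 0 for v in adj}
        sigma[s] = 1
        preds = {v: [] for v in adj}
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:             # first time w is reached
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:    # v lies on a shortest path to w
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Backward phase: accumulate dependencies from the farthest vertices.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # In an undirected graph every pair is seen from both endpoints.
    return {v: c / 2 for v, c in bc.items()}

# Usage: in the path 0 - 1 - 2 only the middle vertex lies between a pair.
path = {0: [1], 1: [0, 2], 2: [1]}
scores = betweenness(path)
```

The cost is one BFS per source vertex, which is what makes exact BC expensive on large graphs and motivates distributed approaches such as the one described in the abstract.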

preprint 2016 arXiv

Fluidisation and plastic activity in a model soft-glassy material flowing in micro-channels with rough walls

By means of mesoscopic numerical simulations of a model soft-glassy material, we investigate the role of boundary roughness on the flow behaviour of the material, probing the bulk/wall and global/local rheologies. We show that the roughness reduces the wall slip induced by wettability properties and acts as a source of fluidisation for the material. A direct inspection of the plastic events suggests that their rate of occurrence grows with the fluidity field, reconciling our simulations with kinetic elasto-plastic descriptions of jammed materials. Notwithstanding, we observe qualitative and quantitative differences in the scaling, depending on the distance from the rough wall and on the imposed shear. The impact of roughness on the orientational statistics is also studied.

preprint 2014 arXiv

Highly optimized simulations on single- and multi-GPU systems of 3D Ising spin glass

We present a highly optimized implementation of a Monte Carlo (MC) simulator for the three-dimensional Ising spin-glass model with bimodal disorder, i.e., the 3D Edwards-Anderson model, running on CUDA-enabled GPUs. Multi-GPU systems exchange data by means of the Message Passing Interface (MPI). The chosen MC dynamics is the classic Metropolis one, which is purely dissipative, since the aim was the study of the critical off-equilibrium relaxation of the system. We focused on the following issues: i) the implementation of efficient access patterns for nearest neighbours in a cubic stencil and for lagged-Fibonacci-like pseudo-random number generators (PRNGs); ii) a novel implementation of the asynchronous multispin-coding Metropolis MC step that stores one spin per bit; and iii) a multi-GPU version based on a combination of MPI and CUDA streams. We highlight how cubic stencils and PRNGs are two subjects of very general interest because of their widespread use in many simulation codes. Our code's best performances are ~3 and ~5 psFlip on a GTX Titan with our implementations of MINSTD and MT19937, respectively.
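The idea of storing one spin per bit can be illustrated with a toy example. The sketch below is only a hypothetical illustration of the bit-packing trick, not the paper's CUDA kernel: it uses a one-dimensional ring instead of a cubic lattice, computes an energy rather than performing a Metropolis step, and the function name `ring_energy_bits` and the bit conventions are assumptions. All n bonds are evaluated at once with a rotate and two XORs.

```python
def ring_energy_bits(spins, bonds, n):
    """Energy of an n-site +/-J Ising ring (n >= 3) with one spin per bit.

    Bit i of `spins` is 1 for spin up and 0 for spin down.
    Bit i of `bonds` is 1 if the coupling between sites i and i+1 (mod n)
    is antiferromagnetic (J = -1) and 0 if it is ferromagnetic (J = +1).
    """
    mask = (1 << n) - 1
    # Rotate right by one so that bit i holds the spin of site i+1 (mod n).
    neighbours = ((spins >> 1) | (spins << (n - 1))) & mask
    # A ferromagnetic bond is unsatisfied when the spins differ, an
    # antiferromagnetic one when they agree; XOR with `bonds` covers both.
    unsatisfied = (spins ^ neighbours ^ bonds) & mask
    n_bad = bin(unsatisfied).count("1")
    # Satisfied bonds contribute -1 to the energy, unsatisfied bonds +1.
    return -(n - 2 * n_bad)

# Usage: four aligned spins on a ferromagnetic ring have the lowest energy, -4.
energy = ring_energy_bits(0b1111, 0b0000, 4)
```

Packing spins this way lets a single machine word carry many lattice sites, so bitwise instructions act on all of them simultaneously.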

preprint 2014 arXiv

Mesoscopic simulation study of wall roughness effects in micro-channel flows of dense emulsions

We study the Poiseuille flow of a soft-glassy material above the jamming point, where the material flows like a complex fluid with Herschel-Bulkley rheology. Microscopic plastic rearrangements and the emergence of their spatial correlations induce cooperative flow behavior whose effect is pronounced in the presence of confinement. With the help of lattice Boltzmann numerical simulations of confined dense emulsions, we explore the role of geometrical roughness in providing activation of plastic events close to the boundaries. We also probe the spatial configuration of the fluidity field, a continuum quantity which can be related to the rate of plastic events, thereby allowing us to establish a link between the mesoscopic plastic dynamics of the jammed material and the macroscopic flow behaviour.

preprint 2014 arXiv

Parallel Distributed Breadth First Search on the Kepler Architecture

We present the results obtained by using an evolution of our CUDA-based solution for the exploration, via a Breadth First Search, of large graphs. This latest version makes the most of the features of the Kepler architecture and relies on a combination of techniques to reduce both the number of communications among the GPUs and the amount of exchanged data. The final result is a code that can visit more than 800 billion edges in a second by using a cluster equipped with 4096 Tesla K20X GPUs.

preprint 2013 arXiv

GPU peer-to-peer techniques applied to a cluster interconnect

Modern GPUs support special protocols to exchange data directly across the PCI Express bus. While these protocols could be used to reduce GPU data transmission times, basically by avoiding staging to host memory, they require specific hardware features which are not available on current generation network adapters. In this paper we describe the architectural modifications required to implement peer-to-peer access to NVIDIA Fermi- and Kepler-class GPUs on an FPGA-based cluster interconnect. Besides, the current software implementation, which integrates this feature by minimally extending the RDMA programming model, is discussed, as well as some issues raised while employing it in a higher level API like MPI. Finally, the current limits of the technique are studied by analyzing the performance improvements on low-level benchmarks and on two GPU-accelerated applications, showing when and how they seem to benefit from the GPU peer-to-peer method.

preprint 2010 arXiv

Shear Banding from lattice kinetic models with competing interactions

We present numerical simulations based on a Boltzmann kinetic model with competing interactions, aimed at characterizing the rheological properties of soft-glassy materials. The lattice kinetic model is shown to reproduce typical signatures of driven soft-glassy flows in confined geometries, such as Herschel-Bulkley rheology, shear-banding and hysteresis. This lends further credit to the present lattice kinetic model as a valuable tool for the theoretical/computational investigation of the rheology of driven soft-glassy materials under confinement.

preprint 2009 arXiv

Numerical simulation of conformational variability in biopolymer translocation through wide nanopores

Numerical results on the translocation of long biopolymers through mid-sized and wide pores are presented. The simulations are based on a novel methodology which couples molecular motion to a mesoscopic fluid solvent. Thousands of events of long polymers (up to 8000 monomers) are monitored as they pass through nanopores. Comparison between the different pore sizes shows that wide pores can host a larger number of multiple biopolymer segments, as compared to smaller pores. The simulations provide clear evidence of folding quantization in the translocation process as the biopolymers undertake multi-folded configurations, characterized by a well-defined integer number of folds. Accordingly, the translocation time is no longer represented by a single-exponent power law dependence on the length, as is the case for single-file translocation through narrow pores. The folding quantization increases with the biopolymer length, while the rate of translocated beads at each time step is linearly correlated to the number of resident beads in the pore. Finally, analysis of the statistics over the translocation work unravels the importance of the hydrodynamic interactions in the process.

