Researcher profile

Edoardo di Napoli

Edoardo di Napoli contributes to research discovery and scholarly infrastructure.

Researcher | Affiliation not imported | Open to collaborate

Trust snapshot

Quick read

Trust 19 - Unverified | Verification L1 | Unclaimed author
5 works
0 followers
11 topics
4 close collaborators

Published work

5 published items

preprint | 2022 | arXiv

ChASE -- A Distributed Hybrid CPU-GPU Eigensolver for Large-scale Hermitian Eigenvalue Problems

As modern massively parallel clusters grow larger, with beefier compute nodes, traditional parallel eigensolvers such as direct solvers struggle to keep pace with hardware evolution and to scale efficiently, owing to additional layers of communication and synchronization. This difficulty is especially pronounced when porting traditional libraries to heterogeneous computing architectures equipped with accelerators such as Graphics Processing Units (GPUs). Recently, there have been significant scientific contributions to the development of filter-based subspace eigensolvers for computing partial eigenspectra. The simpler structure of this type of algorithm makes it easier to avoid the communication and synchronization bottlenecks typical of direct solvers. The Chebyshev Accelerated Subspace Eigensolver (ChASE) is a modern subspace eigensolver that computes partial extremal eigenpairs of large-scale Hermitian eigenproblems, accelerated by a filter based on Chebyshev polynomials. In this work, we extend our previous work on ChASE by adding support for distributed hybrid CPU-multi-GPU computing architectures. Our tests show that ChASE achieves very good scaling performance up to 144 nodes with 526 NVIDIA A100 GPUs in total on dense eigenproblems of size up to $360$k.
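The idea behind a Chebyshev-filtered subspace eigensolver can be sketched on a single node with NumPy. The block below is an illustration only, not ChASE's implementation or API: the function names, the default filter degree, the number of extra vectors, and the choice of the largest Ritz value as the lower edge of the damped interval are all assumptions made for this sketch. It applies the three-term Chebyshev recurrence to damp the unwanted part of the spectrum, then orthonormalizes and performs a Rayleigh-Ritz step.

```python
import numpy as np


def chebyshev_filter(A, V, degree, a, b):
    """Apply a degree-`degree` (>= 1) Chebyshev polynomial in A to the block V.

    Eigen-components with eigenvalues inside [a, b] are damped (|T_m| <= 1);
    components with eigenvalues below a are amplified.
    """
    c = 0.5 * (a + b)  # centre of the damped interval
    e = 0.5 * (b - a)  # half-width of the damped interval
    Y_prev = V
    Y = (A @ V - c * V) / e
    for _ in range(2, degree + 1):
        # Three-term recurrence T_{k+1}(t) = 2 t T_k(t) - T_{k-1}(t)
        Y_next = 2.0 * (A @ Y - c * Y) / e - Y_prev
        Y_prev, Y = Y, Y_next
    return Y


def filtered_subspace_iteration(A, nev, extra=4, degree=20, iters=20, seed=0):
    """Approximate the `nev` smallest eigenpairs of a Hermitian matrix A.

    Hypothetical helper for illustration; all parameter defaults are assumptions.
    """
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    V, _ = np.linalg.qr(rng.standard_normal((n, nev + extra)))
    b = np.linalg.norm(A, 2)  # spectral norm: an upper bound on the spectrum
    for _ in range(iters):
        # Rayleigh-Ritz: project A onto span(V) and rotate V to Ritz vectors
        theta, W = np.linalg.eigh(V.conj().T @ A @ V)
        V = V @ W
        a = theta[-1]  # largest Ritz value: lower edge of the damped interval
        V = chebyshev_filter(A, V, degree, a, b)
        V, _ = np.linalg.qr(V)  # re-orthonormalize the filtered block
    theta, W = np.linalg.eigh(V.conj().T @ A @ V)
    return theta[:nev], (V @ W)[:, :nev]


# Usage on a small symmetric test matrix with a known spectrum
rng = np.random.default_rng(42)
Q, _ = np.linalg.qr(rng.standard_normal((100, 100)))
A_demo = (Q * np.linspace(0.0, 50.0, 100)) @ Q.T
A_demo = 0.5 * (A_demo + A_demo.T)
vals_demo, _ = filtered_subspace_iteration(A_demo, nev=3)
print(vals_demo)
```

The filter needs only matrix-block products, which is what makes this family of methods attractive on accelerators; the distribution across nodes and GPUs described in the abstract is outside the scope of this sketch.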

preprint | 2020 | arXiv

A Long-Range Ising Model of a Barabási-Albert Network

Networks that have power-law connectivity, commonly referred to as scale-free networks, are an important class of complex networks. A heterogeneous mean-field approximation has previously been proposed for the Ising model on the Barabási-Albert model of scale-free networks with classical spins on the nodes, wherein it was shown that the critical temperature for such a system scales logarithmically with network size. For finite sizes, there is no criticality for such a system and hence no true phase transition in terms of singular behavior. Further, in the thermodynamic limit, the mean-field prediction of an infinite critical temperature for the system may exclude any true phase transition even then. Nevertheless, with an eye on potential applications of the model to biological systems, which are generally finite, one may still try to find approximations that describe the relevant observables quantitatively. Here we present an alternative, approximate formulation for the description of the Ising model of a Barabási-Albert network. Using the classical definition of magnetization, we show that Ising models on a network can be well approximated by a long-range interacting homogeneous Ising model wherein each node of the network couples to all other spins with a strength determined by the mean degree of the Barabási-Albert network. In such an effective long-range Ising model of a Barabási-Albert network, the critical temperature is directly proportional to the number of preferentially attached links added to grow the network. The proposed model describes the magnetization of the majority of the sites, those with average or smaller-than-average degree, better than the heterogeneous mean-field approximation does. The long-range Ising model is the only homogeneous description of Barabási-Albert networks that we know of.
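The proportionality between the critical temperature and the number of attached links can be seen in a standard Curie-Weiss (all-to-all) mean-field sketch. The normalization below is an assumption chosen for illustration and may differ from the paper's: $J$ is the bare coupling, $m_0$ the number of links added per new node, $M$ the magnetization per spin, and $\langle k \rangle \approx 2 m_0$ the mean degree of a large Barabási-Albert network.

```latex
% Effective homogeneous long-range Hamiltonian (assumed normalization)
H_{\mathrm{LR}} = -\frac{J \langle k \rangle}{2N} \sum_{i \neq j} s_i s_j ,
\qquad \langle k \rangle \approx 2 m_0 .

% Mean-field self-consistency for the magnetization per spin
M = \tanh\!\left( \frac{J \langle k \rangle M}{k_B T} \right) .

% A nonzero solution exists only below the critical temperature
k_B T_c = J \langle k \rangle \approx 2 m_0 J .
```

Under these assumptions $T_c$ grows linearly with $m_0$ and does not depend on the network size $N$, in contrast to the logarithmic size dependence of the heterogeneous mean-field result quoted above.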

preprint | 2020 | arXiv

Solution to the modified Helmholtz equation for arbitrary periodic charge densities

We present a general method for solving, without shape approximation, the modified Helmholtz equation for an arbitrary periodic charge distribution; its solution is known as the Yukawa potential or the screened Coulomb potential. The method is an extension of Weinert's pseudo-charge method [M. Weinert, J. Math. Phys. 22, 2433 (1981)] for solving the Poisson equation for the same class of charge density distributions. The inherent differences between the Poisson and the modified Helmholtz equation lie in their respective radial solutions: polynomial functions for the Poisson equation, and modified spherical Bessel functions for the modified Helmholtz equation. This leads to a definition of a modified pseudo-charge density and modified multipole moments. We show that Weinert's convergence analysis of an absolutely and uniformly convergent Fourier series of the pseudo-charge density carries over to the modified pseudo-charge density. We conclude by illustrating the algorithmic changes necessary to turn an available implementation of the Poisson solver into a solver for the modified Helmholtz equation.
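For orientation, the two equations and their radial solutions can be written side by side. The unit convention (Gaussian units, source term $-4\pi\rho$) and the symbol $\lambda$ for the screening parameter are assumptions made for this sketch and are not taken from the paper.

```latex
% Poisson equation and its radial solutions for angular momentum l
\nabla^2 V(\mathbf{r}) = -4\pi \rho(\mathbf{r}),
\qquad R_l(r) \in \{\, r^{l},\; r^{-(l+1)} \,\}

% Modified Helmholtz equation and its radial solutions
(\nabla^2 - \lambda^2)\, V(\mathbf{r}) = -4\pi \rho(\mathbf{r}),
\qquad R_l(r) \in \{\, i_l(\lambda r),\; k_l(\lambda r) \,\}

% Point charge q at the origin: Yukawa (screened Coulomb) potential
V(r) = q\, \frac{e^{-\lambda r}}{r}

% Small-argument behaviour, up to constant factors
i_l(x) \sim \frac{x^{l}}{(2l+1)!!}, \qquad k_l(x) \propto x^{-(l+1)}
\qquad (x \to 0)
```

The small-argument limits show how the modified spherical Bessel functions $i_l$ and $k_l$ reduce to the polynomial Poisson solutions as the screening vanishes, which is why a Poisson solver built on multipole moments can be adapted by swapping the radial functions.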

preprint | 2020 | arXiv

The LAPW method with eigendecomposition based on the Hari–Zimmermann generalized hyperbolic SVD

In this paper we propose an accurate, highly parallel algorithm for the generalized eigendecomposition of a matrix pair $(H, S)$, given in a factored form $(F^{\ast} J F, G^{\ast} G)$. Matrices $H$ and $S$ are generally complex and Hermitian, and $S$ is positive definite. Matrices of this type emerge from the representation of the Hamiltonian of a quantum mechanical system in terms of an overcomplete set of basis functions. This expansion is part of a class of models within the broad field of Density Functional Theory, which is considered the gold standard in condensed matter physics. The overall algorithm consists of four phases, the second and the fourth being optional, where the last two phases are the computation of the generalized hyperbolic SVD of a complex matrix pair $(F,G)$, according to a given matrix $J$ defining the hyperbolic scalar product. If $J = I$, then these two phases compute the GSVD in parallel very accurately and efficiently.
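To make the problem statement concrete, the sketch below builds $(H, S)$ from factors $(F, G, J)$ and solves $Hx = \lambda Sx$ with the textbook Cholesky reduction. This is a baseline for comparison, not the paper's method: the paper works on the factors through a generalized hyperbolic SVD precisely to avoid forming the products explicitly. The function names and the test dimensions are hypothetical.

```python
import numpy as np


def factored_pair(F, G, J):
    """Form H = F^* J F and S = G^* G explicitly (illustration only)."""
    H = F.conj().T @ J @ F
    S = G.conj().T @ G
    return H, S


def generalized_eigh_via_cholesky(H, S):
    """Solve H x = lambda S x for Hermitian H and positive definite S.

    Reduces to a standard Hermitian problem C y = lambda y with
    C = L^{-1} H L^{-*}, where S = L L^* is the Cholesky factorization.
    """
    L = np.linalg.cholesky(S)
    X = np.linalg.solve(L, H)           # X = L^{-1} H
    C = np.linalg.solve(L, X.conj().T)  # L^{-1} H L^{-*}, since H is Hermitian
    C = 0.5 * (C + C.conj().T)          # remove rounding asymmetry
    w, Y = np.linalg.eigh(C)
    V = np.linalg.solve(L.conj().T, Y)  # back-transform: x = L^{-*} y
    return w, V


# Usage with small random complex factors; J is a diagonal signature matrix
rng = np.random.default_rng(0)
F_demo = rng.standard_normal((10, 6)) + 1j * rng.standard_normal((10, 6))
G_demo = rng.standard_normal((12, 6)) + 1j * rng.standard_normal((12, 6))
J_demo = np.diag([1.0] * 7 + [-1.0] * 3)
H_demo, S_demo = factored_pair(F_demo, G_demo, J_demo)
w_demo, V_demo = generalized_eigh_via_cholesky(H_demo, S_demo)
print(w_demo)
```

Forming $G^{\ast} G$ squares the condition number of $G$, which is one reason an approach that operates directly on the factors can be more accurate.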

preprint | 2017 | arXiv

Accelerating the computation of FLAPW methods on heterogeneous architectures

Legacy codes in computational science and engineering have been very successful in providing essential functionality to researchers. However, they are not capable of exploiting the massive parallelism provided by emerging heterogeneous architectures. The lack of portable performance and scalability puts them at high risk: either they evolve or they are doomed to disappear. One example of legacy code that would benefit heavily from a modern design is FLEUR, a software package for electronic structure calculations. In previous work, the computational bottleneck of FLEUR was partially re-engineered to have a modular design that relies on standard building blocks, namely BLAS and LAPACK. In this paper, we demonstrate how the initial redesign enables portability to heterogeneous architectures. More specifically, we study different approaches to porting the code to architectures consisting of multi-core CPUs equipped with one or more coprocessors such as Nvidia GPUs and Intel Xeon Phis. Our final code attains over 70% of the architectures' peak performance, and outperforms Nvidia's and Intel's libraries. Finally, on JURECA, the supercomputer where FLEUR is often executed, the code takes advantage of the full power of the computing nodes, attaining a $5\times$ speedup over the sole use of the CPUs.