Source author record

Giuliano Taffoni

Giuliano Taffoni appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher | Unclaimed source record

astro-ph.IM; Distributed, Parallel, and Cluster Computing; astro-ph; astro-ph.CO; astro-ph.GA; Performance

Catalog footprint

What is connected

7 works

6 topics

4 close collaborators

preprint | 2026 | arXiv

Accelerating cosmological simulations on GPUs: a step towards sustainability and green-awareness

The increasing complexity and scale of cosmological N-body simulations, driven by astronomical surveys like Euclid, call for a paradigm shift towards more sustainable and energy-efficient high-performance computing (HPC). The rising energy consumption of supercomputing facilities poses a significant environmental and financial challenge. In this work, we build upon a recently developed GPU implementation of pinocchio, a widely used tool for the fast generation of dark matter (DM) halo catalogues, to investigate energy consumption. Using a different resource configuration, we confirm the time-to-solution behavior observed in a companion study, and we use these runs to compare time-to-solution with energy-to-solution. By profiling the code on various HPC platforms with a newly developed implementation of the Power Measurement Toolkit (PMT), we demonstrate an 8x reduction in energy-to-solution and an 8x speed-up in time-to-solution compared to the CPU-only version. Taken together, these gains translate into an overall efficiency improvement of up to 64x. Our results show that the GPU-accelerated pinocchio not only achieves a substantial speed-up, making the generation of large-scale mock catalogues more tractable, but also significantly reduces the energy footprint of the simulations. This work represents a step towards "green-aware" scientific computing in cosmology, proving that performance and sustainability can be achieved simultaneously.
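The relation between the quoted factors can be checked with a few lines of arithmetic. The abstract does not tie the 64x figure to a named metric; the sketch below assumes it refers to the product of energy-to-solution and time-to-solution (often called the energy-delay product). The function name and the absolute CPU and GPU values are hypothetical placeholders chosen only to reproduce the 8x ratios.

```python
def improvement_factors(cpu_time_s, cpu_energy_j, gpu_time_s, gpu_energy_j):
    """Return (speed-up, energy reduction, energy-delay-product gain)."""
    speed_up = cpu_time_s / gpu_time_s
    energy_reduction = cpu_energy_j / gpu_energy_j
    # Assumption: "overall" gain is read as the ratio of energy x time products.
    edp_gain = (cpu_energy_j * cpu_time_s) / (gpu_energy_j * gpu_time_s)
    return speed_up, energy_reduction, edp_gain


# Placeholder values, NOT measurements from the paper.
cpu_time_s, cpu_energy_j = 800.0, 4.0e6
gpu_time_s, gpu_energy_j = 100.0, 5.0e5

speed_up, energy_reduction, edp_gain = improvement_factors(
    cpu_time_s, cpu_energy_j, gpu_time_s, gpu_energy_j
)
print(f"speed-up: {speed_up:g}x")
print(f"energy reduction: {energy_reduction:g}x")
print(f"energy-delay-product gain: {edp_gain:g}x")
```

Under this reading, equal factors for time and energy imply the same average power draw (energy divided by time) in both runs.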

preprint | 2023 | arXiv

High Performance W-stacking for Imaging Radio Astronomy Data: a Parallel and Accelerated Solution

Current and upcoming radio interferometers are expected to produce data volumes of increasing size that need to be processed in order to generate the corresponding sky brightness distributions through imaging. This represents an outstanding computational challenge, especially when large fields of view and/or high-resolution observations are processed. We have investigated the adoption of modern High Performance Computing systems, specifically addressing the gridding, FFT and w-correction steps of imaging, combining parallel and accelerated solutions. We have demonstrated that the code we have developed can support datasets and images of any size compatible with the available hardware, efficiently scaling up to thousands of cores or hundreds of GPUs, keeping the time to solution below one hour even when images of the order of billions or tens of billions of pixels are generated. In addition, portability has been targeted as a primary objective, both in terms of usability on different computing platforms and in terms of performance. The presented results have been obtained on two different state-of-the-art High Performance Computing architectures.
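To give a sense of scale for images "of any size compatible with the available hardware", the sketch below estimates the memory taken by a single square image grid. It assumes one complex double-precision value (16 bytes) per pixel and an even split of the grid across nodes. Both are illustrative assumptions rather than details of the presented code, and the 64 GiB per-node figure is hypothetical.

```python
BYTES_PER_COMPLEX128 = 16  # assumption: one complex double per pixel
GIB = 2**30


def grid_bytes(pixels_per_side, bytes_per_pixel=BYTES_PER_COMPLEX128):
    """Memory footprint of one square grid, in bytes."""
    return pixels_per_side**2 * bytes_per_pixel


def min_nodes(pixels_per_side, node_memory_bytes):
    """Smallest number of nodes whose combined memory holds one grid."""
    total = grid_bytes(pixels_per_side)
    return (total + node_memory_bytes - 1) // node_memory_bytes  # ceiling division


# 32768^2 is about 1.1e9 pixels; 131072^2 is about 1.7e10 pixels.
for side in (32768, 131072):
    size_gib = grid_bytes(side) / GIB
    nodes = min_nodes(side, 64 * GIB)  # hypothetical 64 GiB nodes
    print(f"{side} x {side}: {size_gib:.0f} GiB, at least {nodes} node(s)")
```

A real imaging run needs more than one such grid (for example buffers for visibilities and intermediate planes), so these numbers are lower bounds.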

preprint | 2022 | arXiv

A distributed computing infrastructure for LOFAR Italian community

The LOw-Frequency ARray (LOFAR) is a low-frequency radio interferometer composed of observational stations spread across Europe, and it is the largest precursor of SKA in terms of effective area and generated data rates. In 2018, the Italian community officially joined the LOFAR project and deployed a distributed computing and storage infrastructure dedicated to LOFAR data analysis. The infrastructure is based on four nodes distributed across different Italian locations, and it offers services for pipeline execution, storage of final and intermediate results, and support for the use of the software and infrastructure. As the analysis of LOFAR data requires a very complex computational procedure, a container-based approach has been adopted to distribute software environments to the different computing resources. A science platform approach is used to facilitate interactive access to computational resources. In this paper, we describe the architecture and main features of the infrastructure.

preprint | 2022 | arXiv

Galaxies in the central regions of simulated galaxy clusters

In this paper, we assess the impact of numerical resolution and of the implementation of energy input from AGN feedback models on the inner structure of cluster subhaloes in hydrodynamic simulations. We compare several zoom-in re-simulations of a sub-sample of the cluster-sized haloes studied in Meneghetti et al. (2020), obtained by varying mass resolution, softening length and AGN energy feedback scheme. We study the impact of these different setups on the subhalo abundances, their radial distribution, their density and mass profiles, and the relation between subhalo mass and maximum circular velocity, the latter being a proxy for subhalo compactness. Regardless of the adopted numerical resolution and feedback model, subhaloes with masses Msub < 1e11 Msun/h, the most relevant mass range for galaxy-galaxy strong lensing, have maximum circular velocities ~30% smaller than those measured from the strong lensing observations of Bergamini et al. (2019). We also find that simulations with less effective AGN energy feedback produce massive subhaloes (Msub > 1e11 Msun/h) with higher maximum circular velocity, and that their Vmax - Msub relation approaches the observed one. However, the stellar-mass number count of these objects exceeds the one found in observations, and we find that the compactness of these simulated subhaloes is the result of an extremely over-efficient star formation in their cores, which also leads to larger-than-observed subhalo stellar masses. We conclude that simulations are unable to simultaneously reproduce the observed stellar masses and compactness (or maximum circular velocities) of cluster galaxies. Thus, the discrepancy between theory and observations that emerged from the analysis of Meneghetti et al. (2020) persists. It remains an open question whether such a discrepancy reflects limitations of the current implementation of galaxy formation models or of the LCDM paradigm.
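The maximum circular velocity used above as a compactness proxy follows from the enclosed mass profile as Vmax = max over r of sqrt(G M(<r) / r). The sketch below evaluates it on a tabulated profile. The function names, the unit choice (kpc, Msun, km/s) and the toy Hernquist-like mass profile are assumptions for illustration and are not taken from the paper's analysis.

```python
import math

# Gravitational constant in kpc (km/s)^2 / Msun.
G_KPC_KMS2_PER_MSUN = 4.30091e-6


def circular_velocity(r_kpc, m_enclosed_msun):
    """Circular velocity in km/s at radius r for the given enclosed mass."""
    return math.sqrt(G_KPC_KMS2_PER_MSUN * m_enclosed_msun / r_kpc)


def max_circular_velocity(radii_kpc, enclosed_masses_msun):
    """Return (v_max in km/s, radius in kpc where it is reached)."""
    r_max, m_at_r_max = max(
        zip(radii_kpc, enclosed_masses_msun),
        key=lambda pair: circular_velocity(*pair),
    )
    return circular_velocity(r_max, m_at_r_max), r_max


# Toy profile (hypothetical): M(<r) = M_tot * r^2 / (r + a)^2,
# whose circular velocity peaks analytically at r = a.
m_tot, a = 1.0e11, 10.0
radii = [float(r) for r in range(1, 51)]
masses = [m_tot * r**2 / (r + a) ** 2 for r in radii]

v_max, r_max = max_circular_velocity(radii, masses)
print(f"V_max = {v_max:.1f} km/s at r = {r_max:.0f} kpc")
```

At fixed mass, a more compact subhalo reaches a higher peak velocity, which is why the Vmax - Msub relation is used to compare compactness.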

preprint | 2020 | arXiv

CHIPP: INAF pilot project for HTC, HPC and HPDA

CHIPP (Computing HTC in INAF Pilot Project) is an Italian project funded by the Italian National Institute for Astrophysics (INAF) and promoted by the ICT office of INAF. The main purpose of the CHIPP project is to coordinate the use of, and access to, already existing high-throughput computing, high-performance computing and data processing resources (for small and medium-sized programs) for the INAF community. Today, Tier2/Tier3 systems (1,200 CPU cores) are provided at the INAF institutes in Trieste and Catania, but in the future the project will evolve to include other computing infrastructures. During the last two years, more than 30 programs have been approved, for a total request of 30 million CPU-hours. Most of the programs concern HPC, data reduction and analysis, and machine learning. In this paper, we describe in detail the CHIPP infrastructures and the results of the first two years of activity.
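The figures above can be put side by side with a short calculation. The sketch assumes that a CPU-hour means one core for one hour, that all 1,200 cores were available around the clock for the full two years, and that a year has 8,760 hours. None of these assumptions is stated in the abstract, so the capacity is only a nominal upper bound.

```python
CORES = 1200                  # from the abstract
REQUESTED_CPU_HOURS = 30_000_000  # from the abstract
HOURS_PER_YEAR = 8760         # assumption: 365-day year
YEARS = 2


def nominal_capacity(cores, years, hours_per_year=HOURS_PER_YEAR):
    """Core-hours delivered if every core runs continuously (idealised)."""
    return cores * years * hours_per_year


capacity = nominal_capacity(CORES, YEARS)
ratio = REQUESTED_CPU_HOURS / capacity
print(f"nominal capacity: {capacity:,} core-hours")
print(f"requested / capacity: {ratio:.2f}")
```

Under these assumptions the total request is about 1.4 times the nominal two-year capacity; requested and actually consumed hours may differ.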

preprint | 2020 | arXiv

Performance and energy footprint assessment of FPGAs and GPUs on HPC systems using Astrophysics application

New challenges in Astronomy and Astrophysics (AA) are driving the need for a large number of exceptionally computationally intensive simulations. "Exascale" (and beyond) computational facilities are mandatory to address the size of theoretical problems and of the data coming from the new generation of observational facilities in AA. Currently, the High Performance Computing (HPC) sector is undergoing a profound phase of innovation, in which the primary challenge to the achievement of the "Exascale" is power consumption. The goal of this work is to give some insights into the performance and energy footprint of contemporary architectures for a real astrophysical application in an HPC context. We use a state-of-the-art N-body application that we re-engineered and optimized to fully exploit the heterogeneous underlying hardware. We quantitatively evaluate the impact of computation on energy consumption when running on four different platforms. Two of them represent current HPC systems (Intel-based and equipped with NVIDIA GPUs), one is a micro-cluster based on ARM-MPSoC, and one is a "prototype towards Exascale" equipped with ARM-MPSoCs tightly coupled with FPGAs. We investigate the behavior of the different devices: high-end GPUs excel in terms of time-to-solution, while MPSoC-FPGA systems outperform GPUs in power consumption. Our experience reveals that considering FPGAs for computationally intensive applications seems very promising, as their performance is improving to meet the requirements of scientific applications. This work can be a reference for the development of future platforms for astrophysics applications where computationally intensive calculations are required.
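The distinction drawn above between time-to-solution and power consumption can be illustrated with sampled power traces. Energy-to-solution is the time integral of power, so whether a low-power device also uses less energy depends on how much longer its run takes. The sketch below integrates samples with the trapezoidal rule; the platform labels and all numbers are hypothetical and are not measurements from the paper.

```python
def energy_joules(times_s, power_w):
    """Trapezoidal integral of sampled power (W) over time (s), in joules."""
    if len(times_s) != len(power_w) or len(times_s) < 2:
        raise ValueError("need at least two matching time/power samples")
    return sum(
        0.5 * (p0 + p1) * (t1 - t0)
        for t0, t1, p0, p1 in zip(times_s, times_s[1:], power_w, power_w[1:])
    )


# Hypothetical traces: (sample times in s, power in W).
runs = {
    "gpu_node": ([0, 10, 20], [250, 300, 250]),   # fast, high power
    "fpga_node": ([0, 30, 60], [30, 40, 30]),     # slow, low power
}

for name, (times, power) in runs.items():
    duration = times[-1] - times[0]
    print(f"{name}: {duration} s, {energy_joules(times, power):.0f} J")
```

In this made-up example the slower device takes three times longer yet uses less energy, which shows why the two metrics need to be reported separately.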

preprint | 2003 | arXiv

On the Life and Death of Satellite Haloes

We study the evolution of dark matter satellites orbiting inside more massive haloes using semi-analytical tools coupled with high-resolution N-Body simulations. We select initial satellite sizes, masses, orbital energies, and eccentricities as predicted by hierarchical models of structure formation. Both the satellite and the main halo are described by a Navarro, Frenk & White density profile with various concentrations.
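For reference, the Navarro, Frenk & White (NFW) profile mentioned above has density rho(r) = rho_s / [(r/r_s) (1 + r/r_s)^2], with concentration c = r_vir / r_s. The sketch below normalises the profile to a chosen virial mass and returns the density and enclosed mass. The function names and the parameter values are arbitrary illustrations, not those used in the paper.

```python
import math


def nfw_mu(x):
    """Dimensionless NFW mass function: ln(1 + x) - x / (1 + x)."""
    return math.log(1.0 + x) - x / (1.0 + x)


def nfw_density(r, r_vir, m_vir, c):
    """NFW density at radius r for a halo of virial mass m_vir and concentration c."""
    r_s = r_vir / c
    rho_s = m_vir / (4.0 * math.pi * r_s**3 * nfw_mu(c))
    x = r / r_s
    return rho_s / (x * (1.0 + x) ** 2)


def nfw_enclosed_mass(r, r_vir, m_vir, c):
    """Mass inside radius r; equals m_vir at r = r_vir by construction."""
    r_s = r_vir / c
    return m_vir * nfw_mu(r / r_s) / nfw_mu(c)


# Illustrative halo: 1e12 Msun within 200 kpc, two different concentrations.
for c in (5.0, 20.0):
    inner = nfw_enclosed_mass(20.0, 200.0, 1.0e12, c)
    print(f"c = {c:g}: M(<20 kpc) = {inner:.3e} Msun")
```

A higher concentration places more of the same virial mass at small radii, which is why concentration matters for how a satellite responds to tidal forces.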