Researcher profile

Claudio Gheller

Claudio Gheller contributes to research discovery and scholarly infrastructure.

Researcher, affiliation not imported, open to collaborate

Trust snapshot

Quick read

Trust 21 (Emerging), Verification L1, Unclaimed author
7 works
0 followers
7 topics
4 close collaborators

Published work

7 published items

Preprint, 2023, arXiv

High Performance W-stacking for Imaging Radio Astronomy Data: a Parallel and Accelerated Solution

Current and upcoming radio interferometers are expected to produce ever larger volumes of data that need to be processed in order to generate the corresponding sky brightness distributions through imaging. This represents an outstanding computational challenge, especially when large fields of view and/or high resolution observations are processed. We have investigated the adoption of modern High Performance Computing systems, specifically addressing the gridding, FFT and w-correction steps of imaging, combining parallel and accelerated solutions. We have demonstrated that the code we have developed can support datasets and images of any size compatible with the available hardware, efficiently scaling up to thousands of cores or hundreds of GPUs, keeping the time to solution below one hour even when images on the order of billions or tens of billions of pixels are generated. In addition, portability has been targeted as a primary objective, both in terms of usability on different computing platforms and in terms of performance. The presented results have been obtained on two different state-of-the-art High Performance Computing architectures.

Preprint, 2022, arXiv

A distributed computing infrastructure for LOFAR Italian community

The LOw-Frequency ARray (LOFAR) is a low-frequency radio interferometer composed of observational stations spread across Europe, and it is the largest precursor of SKA in terms of effective area and generated data rates. In 2018, the Italian community officially joined the LOFAR project and deployed a distributed computing and storage infrastructure dedicated to LOFAR data analysis. The infrastructure is based on 4 nodes distributed in different Italian locations, and it offers services for pipeline execution, storage of final and intermediate results, and support for the use of the software and infrastructure. As the analysis of LOFAR data requires a very complex computational procedure, a container-based approach has been adopted to distribute software environments to the different computing resources. A science platform approach is used to facilitate interactive access to computational resources. In this paper, we describe the architecture and main features of the infrastructure.

Preprint, 2021, arXiv

A New View of Observed Galaxies through 3D Modelling and Visualisation

Observational astronomers survey the sky in great detail to gain a better understanding of many types of astronomical phenomena. In particular, the formation and evolution of galaxies, including our own, is a wide field of research. Three-dimensional (spatial 3D) scientific visualisation is typically limited to simulated galaxies, due to the inherently two-dimensional spatial resolution of Earth-based observations. However, with appropriate means of reconstruction, such visualisation can also be used to bring out the inherent 3D structure that exists in 2D observations of known galaxies, providing new views of these galaxies and visually illustrating the spatial relationships within galaxy groups that are not obvious in 2D. We present a novel approach to reconstruct and visualise 3D representations of nearby galaxies based on observational data using the scientific visualisation software Splotch. We apply our approach to a case study of the nearby barred spiral galaxy known as M83, presenting a new perspective of the M83 local group and highlighting the similarities between our reconstructed views of M83 and other known galaxies of similar inclinations.

Preprint, 2020, arXiv

Gadget3 on GPUs with OpenACC

We present preliminary results of a GPU porting of all main Gadget3 modules (gravity computation, SPH density computation, SPH hydrodynamic force, and thermal conduction) using OpenACC directives. Here we assign one GPU to each MPI rank and exploit both the host and accelerator capabilities by overlapping computations on the CPUs and GPUs: while GPUs asynchronously compute interactions between particles within their MPI ranks, CPUs perform tree-walks and MPI communications of neighbouring particles. We profile various portions of the code to understand the origin of our speedup, and find that peak speedup is not achieved because of time-steps with few active particles. We run a hydrodynamic cosmological simulation from the Magneticum project, with $2\cdot10^{7}$ particles, where we find a final total speedup of $\approx 2$. We also present the results of an encouraging scaling test of a preliminary gravity-only OpenACC porting, run in the context of the EuroHack17 event, where the prototype of the porting proved to keep a constant speedup up to $1024$ GPUs.

Preprint, 2020, arXiv

Gyrokinetic Simulations on Many- and Multi-core Architectures with the Global Electromagnetic Particle-In-Cell Code ORB5

Gyrokinetic codes in plasma physics need outstanding computational resources to solve increasingly complex problems, requiring the effective exploitation of cutting-edge HPC architectures. This paper focuses on the enabling of ORB5, a state-of-the-art, first-principles-based gyrokinetic code, on modern parallel hybrid multi-core, multi-GPU systems. ORB5 is a Lagrangian, Particle-In-Cell (PIC), finite element, global, electromagnetic code, originally implementing distributed parallelism through MPI, based on domain decomposition and domain cloning. In order to support multi- and many-core devices, the code has been completely refactored. Data structures have been re-designed to ensure efficient memory access, enhancing data locality. Multi-threading has been introduced through OpenMP on the CPU, and OpenACC has been adopted to support GPU acceleration. MPI can still be used in combination with the two approaches. The performance results obtained using the full production ORB5 code on the Summit system at ORNL, on Piz Daint at CSCS and on the Marconi system at CINECA are presented, showing the effectiveness and performance portability of the adopted solutions: the same source code version was used to produce all results on all architectures.

Preprint, 2020, arXiv

Multi wavelength cross-correlation analysis of the simulated cosmic web

We used magneto-hydrodynamical cosmological simulations to investigate the cross-correlation between different observables (i.e. X-ray emission, Sunyaev-Zeldovich signal at 21 cm, HI temperature decrement, diffuse synchrotron emission and Faraday Rotation) as a probe of the diffuse matter distribution in the cosmic web. We adopt a uniform and simplistic approach to produce synthetic observations at various wavelengths, and we compare the detection chances of different combinations of observables correlated with each other and with the underlying galaxy distribution in the volume. With presently available surveys of galaxies and existing instruments, the best chance of detecting the diffuse gas in the cosmic web outside of halos is by cross-correlating the distribution of galaxies with Sunyaev-Zeldovich observations. We also find that the cross-correlation between the galaxy network and the radio emission or the Faraday Rotation can already be used to limit the amplitude of extragalactic magnetic fields, well outside of the cluster volume usually explored by existing radio observations, and to probe the origin of cosmic magnetism with the future generation of radio surveys.

Preprint, 2010, arXiv

High-performance astrophysical visualization using Splotch

The scientific community is presently witnessing an unprecedented growth in the quality and quantity of data sets coming from simulations and real-world experiments. To effectively access and extract the scientific content of such large-scale data sets (whose sizes are often measured in hundreds or even millions of Gigabytes), appropriate tools are needed. Visual data exploration and discovery is a robust approach for rapidly and intuitively inspecting large-scale data sets, e.g. for identifying new features and patterns or isolating small regions of interest within which to apply time-consuming algorithms. This paper presents a high performance parallelized implementation of Splotch, our previously developed visual data exploration and discovery algorithm for large-scale astrophysical data sets coming from particle-based simulations. Splotch has been improved in order to exploit modern massively parallel architectures, e.g. multicore CPUs and CUDA-enabled GPUs. We present performance and scalability benchmarks on a number of test cases, demonstrating the ability of our high performance parallelized Splotch to efficiently handle large-scale data sets, such as the outputs of the Millennium II simulation, the largest cosmological simulation ever performed.