Source author record

Claudio Gheller

Claudio Gheller appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

astro-ph.IM; astro-ph.CO; Distributed, Parallel, and Cluster Computing; astro-ph.GA; astro-ph; Graphics; Performance; physics.comp-ph; physics.plasm-ph

Catalog footprint

What is connected

17 works

9 topics

4 close collaborators

preprint, 2023, arXiv

High Performance W-stacking for Imaging Radio Astronomy Data: a Parallel and Accelerated Solution

Current and upcoming radio interferometers are expected to produce ever larger volumes of data that must be processed through imaging to generate the corresponding sky brightness distributions. This represents an outstanding computational challenge, especially when large fields of view and/or high-resolution observations are processed. We have investigated the adoption of modern High Performance Computing systems, specifically addressing the gridding, FFT and w-correction steps of imaging and combining parallel and accelerated solutions. We have demonstrated that the code we have developed can support datasets and images of any size compatible with the available hardware, scaling efficiently up to thousands of cores or hundreds of GPUs and keeping the time to solution below one hour even when images on the order of billions or tens of billions of pixels are generated. In addition, portability has been targeted as a primary objective, both in terms of usability on different computing platforms and in terms of performance. The presented results have been obtained on two different state-of-the-art High Performance Computing architectures.
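As a rough illustration of the w-correction step named in the abstract above, the following minimal sketch shows two ingredients of a generic w-stacking scheme: assigning a visibility to a w-plane, and the image-space phase factor applied to that plane after the FFT. It is not the authors' code. The function names, the equally spaced planes and the sign convention of the phase are all assumptions made for illustration.

```python
import cmath
import math


def w_plane_index(w, w_min, w_max, num_planes):
    """Map a visibility's w coordinate to one of num_planes equally
    spaced planes (equal spacing is an assumption of this sketch)."""
    if w_max == w_min:
        return 0
    frac = (w - w_min) / (w_max - w_min)
    # Clamp so that w == w_max falls in the last plane.
    return min(int(frac * num_planes), num_planes - 1)


def w_correction(l, m, w):
    """Image-space phase factor for a plane at w (in wavelengths),
    evaluated at direction cosines (l, m). The sign of the exponent
    depends on the Fourier convention; the one used here is assumed."""
    n = math.sqrt(1.0 - l * l - m * m)
    return cmath.exp(2j * math.pi * w * (n - 1.0))
```

In a full w-stacking pipeline each plane would be gridded and Fourier transformed separately, multiplied pixel by pixel by this factor, and the planes then summed. At the phase centre (l = m = 0) the factor is exactly 1, so the correction only matters away from the centre of the field, which is why it becomes important for wide fields of view.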

preprint, 2022, arXiv

A distributed computing infrastructure for LOFAR Italian community

The LOw-Frequency ARray (LOFAR) is a low-frequency radio interferometer composed of observational stations spread across Europe, and it is the largest precursor of SKA in terms of effective area and generated data rates. In 2018, the Italian community officially joined the LOFAR project and deployed a distributed computing and storage infrastructure dedicated to LOFAR data analysis. The infrastructure is based on 4 nodes distributed across different Italian locations, and it offers services for pipeline execution, storage of final and intermediate results, and support for the use of the software and infrastructure. As the analysis of LOFAR data requires a very complex computational procedure, a container-based approach has been adopted to distribute software environments to the different computing resources. A science platform approach is used to facilitate interactive access to computational resources. In this paper, we describe the architecture and main features of the infrastructure.

preprint, 2021, arXiv

A New View of Observed Galaxies through 3D Modelling and Visualisation

Observational astronomers survey the sky in great detail to gain a better understanding of many types of astronomical phenomena. In particular, the formation and evolution of galaxies, including our own, is a wide field of research. Three dimensional (spatial 3D) scientific visualisation is typically limited to simulated galaxies, due to the inherently two dimensional spatial resolution of Earth-based observations. However, with appropriate means of reconstruction, such visualisation can also be used to bring out the inherent 3D structure that exists in 2D observations of known galaxies, providing new views of these galaxies and visually illustrating the spatial relationships within galaxy groups that are not obvious in 2D. We present a novel approach to reconstruct and visualise 3D representations of nearby galaxies based on observational data using the scientific visualisation software Splotch. We apply our approach to a case study of the nearby barred spiral galaxy known as M83, presenting a new perspective of the M83 local group and highlighting the similarities between our reconstructed views of M83 and other known galaxies of similar inclinations.

preprint, 2020, arXiv

Gadget3 on GPUs with OpenACC

We present preliminary results of a GPU porting of all main Gadget3 modules (gravity computation, SPH density computation, SPH hydrodynamic force, and thermal conduction) using OpenACC directives. Here we assign one GPU to each MPI rank and exploit both the host and accelerator capabilities by overlapping computations on the CPUs and GPUs: while GPUs asynchronously compute interactions between particles within their MPI ranks, CPUs perform tree-walks and MPI communications of neighbouring particles. We profile various portions of the code to understand the origin of our speedup, and find that a peak speedup is not achieved because of time-steps with few active particles. We run a hydrodynamic cosmological simulation from the Magneticum project, with $2\cdot10^{7}$ particles, where we find a final total speedup of $\approx 2$. We also present the results of an encouraging scaling test of a preliminary gravity-only OpenACC porting, run in the context of the EuroHack17 event, where the prototype of the porting proved to keep a constant speedup up to $1024$ GPUs.

preprint, 2020, arXiv

Gyrokinetic Simulations on Many- and Multi-core Architectures with the Global Electromagnetic Particle-In-Cell Code ORB5

Gyrokinetic codes in plasma physics need outstanding computational resources to solve increasingly complex problems, requiring the effective exploitation of cutting-edge HPC architectures. This paper focuses on the enabling of ORB5, a state-of-the-art, first-principles-based gyrokinetic code, on modern parallel hybrid multi-core, multi-GPU systems. ORB5 is a Lagrangian, Particle-In-Cell (PIC), finite element, global, electromagnetic code, originally implementing distributed parallelism through MPI, based on domain decomposition and domain cloning. In order to support multi- and many-core devices, the code has been completely refactored. Data structures have been re-designed to ensure efficient memory access, enhancing data locality. Multi-threading has been introduced through OpenMP on the CPU, and OpenACC has been adopted to support GPU acceleration. MPI can still be used in combination with the two approaches. The performance results obtained using the full production ORB5 code on the Summit system at ORNL, on Piz Daint at CSCS and on the Marconi system at CINECA are presented, showing the effectiveness and performance portability of the adopted solutions: the same source code version was used to produce all results on all architectures.

preprint, 2020, arXiv

Multi wavelength cross-correlation analysis of the simulated cosmic web

We used magneto-hydrodynamical cosmological simulations to investigate the cross-correlation between different observables (i.e. X-ray emission, Sunyaev-Zeldovich signal at 21 cm, HI temperature decrement, diffuse synchrotron emission and Faraday Rotation) as a probe of the diffuse matter distribution in the cosmic web. We adopt a uniform and simplistic approach to produce synthetic observations at various wavelengths, and we compare the detection chances of different combinations of observables correlated with each other and with the underlying galaxy distribution in the volume. With presently available surveys of galaxies and existing instruments, the best chance to detect the diffuse gas in the cosmic web outside of halos is by cross-correlating the distribution of galaxies with Sunyaev-Zeldovich observations. We also find that the cross-correlation between the galaxy network and the radio emission or the Faraday Rotation can already be used to limit the amplitude of extragalactic magnetic fields, well outside of the cluster volume usually explored by existing radio observations, and to probe the origin of cosmic magnetism with the future generation of radio surveys.

preprint, 2016, arXiv

Evolution of cosmic filaments and of their galaxy population from MHD cosmological simulations

Despite containing about half of the total matter in the Universe, the filamentary structure of the cosmic web is difficult to observe at most wavelengths. In this work, we use large unigrid cosmological simulations to investigate how the geometrical, thermodynamical and magnetic properties of cosmological filaments vary with mass and redshift ($z \leq 1$). We find that the average temperature, length, volume and magnetic field of filaments are tightly log-log correlated with the underlying total gravitational mass. This reflects the role of self-gravity in shaping their properties and enables statistical predictions of their observational properties based on their mass. We also focus on the properties of the simulated population of galaxy-sized halos within filaments, and compare their properties to the results obtained from the spectroscopic GAMA survey. Simulated and observed filaments with the same length are found to contain an equal number of galaxies, with a very similar distribution of halo masses. The total number of galaxies within each filament and the total/average stellar mass in galaxies can now also be used to predict the large-scale properties of the gas in the host filaments across tens or hundreds of Mpc in scale. These results are the first steps towards the future use of galaxy catalogues to select the best targets for observations of the warm-hot intergalactic medium.
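A "log-log correlation" of a filament property with mass means that the property follows a power law, so a straight line fits well in logarithmic space. The sketch below shows one generic way to extract such a scaling relation by ordinary least squares on the logarithms. It is only an illustration under assumed inputs: the function name is hypothetical, and no fitting method or fitted value is taken from the paper.

```python
import math


def fit_power_law(masses, values):
    """Least-squares fit of log10(value) = slope * log10(mass) + intercept.

    Returns (slope, intercept). All inputs must be strictly positive.
    """
    xs = [math.log10(m) for m in masses]
    ys = [math.log10(v) for v in values]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Standard closed-form solution for a straight-line fit.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept
```

Given a catalogue of filament masses and, say, mean temperatures, the returned slope is the exponent of the power law, and the scatter of the points around the fitted line indicates how tight the correlation is.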

preprint, 2016, arXiv

On the non-thermal energy content of cosmic structures

1) Background: the budget of non-thermal energy in galaxy clusters is not well constrained, owing to the observational and theoretical difficulties in studying these diluted plasmas on large scales. 2) Method: we use recent cosmological simulations with complex physics in order to connect the emergence of non-thermal energy to the underlying evolution of gas and dark matter. 3) Results: the impact of non-thermal energy (e.g. cosmic rays, magnetic fields and turbulent motions) is found to increase in the outer region of galaxy clusters. Within numerical and theoretical uncertainties, turbulent motions dominate the budget of non-thermal energy in most of the cosmic volume. 4) Conclusion: assessing the distribution of non-thermal energy in galaxy clusters is crucial to perform high-precision cosmology in the future. Constraining the level of non-thermal energy in cluster outskirts will improve our understanding of the acceleration of relativistic particles by cosmic shocks and of the origin of extragalactic magnetic fields.

preprint, 2016, arXiv

Splotch: porting and optimizing for the Xeon Phi

With the increasing size and complexity of data produced by large-scale numerical simulations, it is of primary importance for scientists to be able to exploit all available hardware in heterogeneous High Performance Computing environments for increased throughput and efficiency. We focus on the porting and optimization of Splotch, a scalable visualization algorithm, to utilize the Xeon Phi, Intel's coprocessor based upon the new Many Integrated Core architecture. We discuss steps taken to offload data to the coprocessor and algorithmic modifications to aid faster processing on the many-core architecture and make use of the uniquely wide vector capabilities of the device, with accompanying performance results using multiple Xeon Phi coprocessors. Finally, performance is compared against results achieved with the GPU implementation of Splotch.

preprint, 2015, arXiv

Filaments of the radio cosmic web: opportunities and challenges for SKA

The detection of the diffuse gas component of the cosmic web remains a formidable challenge. In this work we study synchrotron emission from the cosmic web with simulated SKA1 observations, which can represent a fundamental probe of the warm-hot intergalactic medium. We investigate radio emission originating from relativistic electrons accelerated by shocks surrounding cosmic filaments, assuming diffusive shock acceleration and as a function of the (unknown) large-scale magnetic fields. The detection of the brightest parts of large ($>10 \rm Mpc$) filaments of the cosmic web should be within reach of the SKA1-LOW, if the magnetic field is at the level of $\sim 10$ percent of equipartition with the thermal gas, corresponding to $\sim 0.1 ~\rm \mu G$ for the most massive filaments in simulations. In the course of a 2-year survey with SKA1-LOW, this will enable a first detection of the "tip of the iceberg" of the radio cosmic web, and allow for the use of the SKA as a powerful tool to study the origin of cosmic magnetism in large-scale structures. On the other hand, the SKA1-MID and SKA1-SUR seem less suited for this science case at low redshift ($z \leq 0.4$), owing to the missing short baselines and the consequent lack of signal from the large-scale brightness fluctuations associated with the filaments. In this case only very long exposures ($\sim 1000$ hr) may enable the detection of $\sim 1-2$ filaments per field of view in the SKA1-SUR PAF Band1.

preprint, 2015, arXiv

Properties of Cosmological Filaments extracted from Eulerian Simulations

Using a new parallel algorithm implemented within the VisIt framework, we analysed large cosmological grid simulations to study the properties of baryons in filaments. The procedure allows us to build large catalogues with up to $\sim 3 \cdot 10^4$ filaments per simulated volume and to investigate the properties of cosmic filaments for very large volumes at high resolution (up to $300^3 ~\rm Mpc^3$ simulated with $2048^3$ cells). We determined scaling relations for the mass, volume, length and temperature of filaments and compared them to those of galaxy clusters. The longest filaments have a total length of about $200 ~\rm Mpc$ with a mass of several $10^{15} M_{\odot}$. We also investigated the effects of different gas physics. Radiative cooling significantly modifies the thermal properties of the warm-hot intergalactic medium of filaments, mainly by lowering their mean temperature via line cooling. On the other hand, powerful feedback from active galactic nuclei in surrounding halos can heat up the gas in filaments. The impact of shock-accelerated cosmic rays from diffusive shock acceleration on filaments is small, and the ratio between cosmic-ray and gas pressure within filaments is of the order of $\sim 10-20$ percent.

preprint, 2014, arXiv

GPU Accelerated Particle Visualization with Splotch

Splotch is a rendering algorithm for exploration and visual discovery in particle-based datasets coming from astronomical observations or numerical simulations. The strengths of the approach are production of high quality imagery and support for very large-scale datasets through an effective mix of the OpenMP and MPI parallel programming paradigms. This article reports our experiences in re-designing Splotch for exploiting emerging HPC architectures nowadays increasingly populated with GPUs. A performance model is introduced for data transfers, computations and memory access, to guide our re-factoring of Splotch. A number of parallelization issues are discussed, in particular relating to race conditions and workload balancing, towards achieving optimal performances. Our implementation was accomplished by using the CUDA programming paradigm. Our strategy is founded on novel schemes achieving optimized data organisation and classification of particles. We deploy a reference simulation to present performance results on acceleration gains and scalability. We finally outline our vision for future work developments including possibilities for further optimisations and exploitation of emerging technologies.

preprint, 2014, arXiv

Numerical cosmology on the GPU with Enzo and Ramses

A number of scientific numerical codes can currently exploit GPUs with remarkable performance. In astrophysics, Enzo and Ramses are prime examples of such applications. The two codes have been ported to GPUs adopting different strategies and programming models, Enzo adopting CUDA and Ramses using OpenACC. We describe here the different solutions used for the GPU implementation of both cases. Performance benchmarks will be presented for Ramses. The results of the usage of the more mature GPU version of Enzo, adopted for a scientific project within the CHRONOS programme, will be summarised.

preprint, 2014, arXiv

Simulations of cosmic rays in large-scale structures: numerical and physical effects

Non-thermal (relativistic) particles are injected into the cosmos by structure formation shock waves, active galactic nuclei and stellar explosions. We present a suite of unigrid cosmological simulations (up to $2048^3$) using a two-fluid model in the grid code ENZO. The simulations include the dynamical effects of cosmic-ray (CR) protons and cover a range of theoretically motivated acceleration efficiencies. For the bulk of the cosmic volume the modelling of CR processes is rather stable with respect to resolution, provided that a minimum (cell) resolution of $\approx 100 ~\rm kpc/h$ is employed. However, the results for the innermost cluster regions depend on the assumptions for the baryonic physics. Inside clusters, non-radiative runs at high resolution tend to produce an energy density of CRs that is below available upper limits from the FERMI satellite, while the radiative runs are found to produce a higher budget of CRs. We show that weak ($M \leq 3-5$) shocks and shock-reacceleration are crucial to set the level of CRs in the innermost region of clusters, while in the outer regions the level of CR energy is mainly set via direct injection by stronger shocks, and is less sensitive to cooling and feedback from active galactic nuclei and supernovae.

preprint, 2010, arXiv

High-performance astrophysical visualization using Splotch

The scientific community is presently witnessing an unprecedented growth in the quality and quantity of data sets coming from simulations and real-world experiments. To access effectively and extract the scientific content of such large-scale data sets (often sizes are measured in hundreds or even millions of Gigabytes) appropriate tools are needed. Visual data exploration and discovery is a robust approach for rapidly and intuitively inspecting large-scale data sets, e.g. for identifying new features and patterns or isolating small regions of interest within which to apply time-consuming algorithms. This paper presents a high performance parallelized implementation of Splotch, our previously developed visual data exploration and discovery algorithm for large-scale astrophysical data sets coming from particle-based simulations. Splotch has been improved in order to exploit modern massively parallel architectures, e.g. multicore CPUs and CUDA-enabled GPUs. We present performance and scalability benchmarks on a number of test cases, demonstrating the ability of our high performance parallelized Splotch to handle efficiently large-scale data sets, such as the outputs of the Millennium II simulation, the largest cosmological simulation ever performed.

preprint, 2010, arXiv

Massive and refined: a sample of large galaxy clusters simulated at high resolution. I: Thermal gas and shock waves properties

We present a sample of 20 massive galaxy clusters with total virial masses in the range $6 \times 10^{14} M_{\odot} < M_{\rm vir} < 2 \times 10^{15} M_{\odot}$, re-simulated with a customized version of the ENZO 1.5 code employing Adaptive Mesh Refinement. This technique allowed us to obtain unprecedentedly high spatial resolution (25 kpc/h) out to a distance of 3 virial radii from the cluster centers, and makes it possible to focus with the same level of detail on the physical properties of the innermost and of the outermost cluster regions, providing new clues on the role of shock waves and turbulent motions in the ICM across a wide range of scales. In this paper, a first exploratory study of this data set is presented. We report on the thermal properties of galaxy clusters at z=0. Integrated and morphological properties of gas density, gas temperature, gas entropy and baryon fraction distributions are discussed and compared with existing outcomes from both the observational and the numerical literature. Our cluster sample shows an overall good consistency with the results obtained adopting other numerical techniques (e.g. Smoothed Particle Hydrodynamics), yet it provides a more accurate representation of the accretion patterns far outside the cluster cores. We also reconstruct the properties of shock waves within the sample by means of a velocity-based approach, and we study Mach numbers and energy distributions for the various dynamical states in clusters, giving estimates for the injection of cosmic-ray particles at shocks. The present sample is rather unique in the panorama of cosmological simulations of massive galaxy clusters, due to its dynamical range, statistics of objects and number of time outputs. For this reason, we deploy a public repository of the available data, accessible via web portal at http://data.cineca.it.

preprint, 1995, arXiv

COLLISIONAL VERSUS COLLISIONLESS MATTER: A ONE-DIMENSIONAL ANALYSIS OF GRAVITATIONAL CLUSTERING

We present the results of a series of one-dimensional N-body and hydrodynamical simulations which have been used for testing the different clustering properties of baryonic and dark matter in an expanding background. Initial Gaussian random density perturbations with a power-law spectrum $P(k) \propto k^n$ are assumed. We analyse the distribution of density fluctuations and thermodynamical quantities for different spectral indices $n$ and discuss the statistical properties of clustering in the corresponding simulations. At large scales the final distribution of the two components is very similar while at small scales the dark matter presents a lumpiness which is not found in the baryonic matter. The amplitude of density fluctuations in each component depends on the spectral index $n$ and only for $n=-1$ the amplitude of baryonic density fluctuations is larger than that in the dark component. This result is also confirmed by the behaviour of the bias factor, defined as the ratio between the r.m.s of baryonic and dark matter fluctuations at different scales: while for $n=1,\ 3$ it is always less than unity except at very large scales where it tends to one, for $n=-1$ it is above 1.4 at all scales. All simulations show also that there is not an exact correspondence between the positions of largest peaks in dark and baryonic components, as confirmed by a cross-correlation analysis. The final temperatures depend on the initial spectral index: the highest values are obtained for $n=-1$ and are in proximity of high density regions.
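The bias factor described above, the ratio between the r.m.s. of baryonic and dark matter fluctuations at a given scale, can be sketched for a one-dimensional periodic grid as follows. This is a minimal illustration, not the authors' procedure: the periodic top-hat smoothing used to select a scale, and all function names, are assumptions.

```python
import math


def rms(values):
    """Root-mean-square of the fluctuations about the sample mean."""
    mean = sum(values) / len(values)
    return math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))


def smooth(values, width):
    """Periodic moving average over `width` cells (assumed top-hat filter)."""
    n = len(values)
    return [
        sum(values[(i + j) % n] for j in range(width)) / width
        for i in range(n)
    ]


def bias_factor(baryon_delta, dark_delta, width):
    """Ratio of baryonic to dark matter r.m.s. fluctuations at one scale."""
    return rms(smooth(baryon_delta, width)) / rms(smooth(dark_delta, width))
```

Evaluating `bias_factor` for a range of `width` values gives the bias as a function of scale. A value below one means the baryons are smoother than the dark matter at that scale, and a value above one means the opposite.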
