Source author record

Masaki Iwasawa

Masaki Iwasawa appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.


astro-ph.GA astro-ph.IM astro-ph.CO astro-ph astro-ph.EP astro-ph.SR physics.comp-ph

Catalog footprint

What is connected: 7 works, 7 topics, 4 close collaborators


preprint, 2020, arXiv

PeTar: a high-performance N-body code for modeling massive collisional stellar systems

Numerical simulations of massive collisional stellar systems, such as globular clusters (GCs), are very time-consuming. Until now, only a few realistic million-body simulations of GCs with a small fraction of binaries (5%) have been performed using the NBODY6++GPU code. Such models took half a year of computational time on a GPU-based supercomputer. In this work, we develop a new N-body code, PeTar, by combining the Barnes-Hut tree method, the Hermite integrator and slow-down algorithmic regularization (SDAR). The code can accurately handle an arbitrary fraction of multiple systems (e.g. binaries, triples) while maintaining high performance through hybrid parallelization with MPI, OpenMP, SIMD instructions and GPU. A few benchmarks indicate that PeTar and NBODY6++GPU agree very well on the long-term evolution of the global structure, binary orbits and escapers. On a highly configured GPU desktop computer, a million-body simulation with all stars in binaries runs 11 times faster with PeTar than with NBODY6++GPU. Moreover, on the Cray XC50 supercomputer, PeTar scales well as the number of cores increases. The ten-million-body problem, which covers the region of ultra-compact dwarfs and nuclear star clusters, becomes feasible to solve.
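The Barnes-Hut tree named in this abstract rests on one approximation: a sufficiently distant group of particles is replaced by a single point mass at its centre of mass. The sketch below illustrates only that idea and is not PeTar code; the function names, the opening angle `theta = 0.5`, and the choice `G = 1` are assumptions made for illustration.

```python
import math

def direct_accel(pos, sources, G=1.0):
    """Newtonian acceleration at `pos` summed over (mass, (x, y, z)) sources."""
    acc = [0.0, 0.0, 0.0]
    for mass, src in sources:
        d = [src[k] - pos[k] for k in range(3)]
        r2 = sum(x * x for x in d)
        inv_r3 = 1.0 / (r2 * math.sqrt(r2))
        for k in range(3):
            acc[k] += G * mass * d[k] * inv_r3
    return acc

def centre_of_mass(sources):
    """Total mass and mass-weighted mean position of the sources."""
    total = sum(mass for mass, _ in sources)
    com = tuple(sum(mass * p[k] for mass, p in sources) / total for k in range(3))
    return total, com

def monopole_accel(pos, sources, G=1.0):
    """Approximate the whole group by one point mass at its centre of mass."""
    total, com = centre_of_mass(sources)
    return direct_accel(pos, [(total, com)], G)

def cell_is_far(pos, com, cell_size, theta=0.5):
    """Opening criterion: accept the cell if size / distance < theta."""
    return cell_size < theta * math.dist(pos, com)

# Hypothetical clump of four unit masses in a cell of size 1 centred on x = 10.
clump = [(1.0, (9.5, -0.5, 0.0)), (1.0, (9.5, 0.5, 0.0)),
         (1.0, (10.5, -0.5, 0.0)), (1.0, (10.5, 0.5, 0.0))]
target = (0.0, 0.0, 0.0)
exact = direct_accel(target, clump)
approx = monopole_accel(target, clump)
rel_err = abs(approx[0] - exact[0]) / abs(exact[0])
```

A tree code applies the same test recursively: cells that fail the criterion are opened and their children are examined, which is what brings the cost per particle down from O(N) to O(log N).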

preprint, 2019, arXiv

Accelerated FDPS --- Algorithms to Use Accelerators with FDPS

In this paper, we describe the algorithms we implemented in FDPS to make efficient use of accelerator hardware such as GPGPUs. We have developed FDPS to make it possible for many researchers to develop their own high-performance parallel particle-based simulation programs without spending a large amount of time on parallelization and performance tuning. The basic idea of FDPS is to provide a high-performance implementation of parallel algorithms for particle-based simulations in a "generic" form, so that researchers can define their own particle data structure and interparticle interaction functions and supply them to FDPS. FDPS compiled with the user-supplied data type and interaction function provides all necessary functions for parallelization, and using those functions researchers can write their programs as though they were writing a simple non-parallel program. It has been possible to use accelerators with FDPS by writing an interaction function that uses the accelerator. However, the efficiency was limited by the latency and bandwidth of communication between the CPU and the accelerator, and also by the mismatch between the degree of parallelism available in the interaction function and that of the hardware. We have modified the interface of the user-provided interaction function so that accelerators are used more efficiently. We also implemented new techniques which reduce the amount of work on the CPU side and the amount of communication between the CPU and accelerators. We have measured the performance of N-body simulations on a system with an NVIDIA Volta GPGPU using FDPS, and the achieved performance is around 27% of the theoretical peak. We have constructed a detailed performance model, and found that the current implementation can achieve good performance on systems with much smaller memory and communication bandwidth.
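The latency and bandwidth limits described in this abstract can be illustrated with a toy cost model. This is a minimal sketch, not the performance model from the paper: it assumes no overlap between transfer and compute, and every parameter value below is hypothetical.

```python
import math

def offload_time(n_groups, groups_per_call, bytes_per_group, flops_per_group,
                 latency_s, bandwidth_bytes_s, peak_flops):
    """Return (total seconds, compute fraction) for a simple offload model.

    Each accelerator call pays a fixed latency; data volume and arithmetic
    are the same however the interaction groups are batched.
    """
    n_calls = math.ceil(n_groups / groups_per_call)
    latency = n_calls * latency_s
    transfer = n_groups * bytes_per_group / bandwidth_bytes_s
    compute = n_groups * flops_per_group / peak_flops
    total = latency + transfer + compute
    return total, compute / total

# Hypothetical hardware and workload numbers, chosen only for illustration.
params = dict(n_groups=10_000, bytes_per_group=1e5, flops_per_group=1e8,
              latency_s=1e-5, bandwidth_bytes_s=1e10, peak_flops=1e13)

t_single, eff_single = offload_time(groups_per_call=1, **params)
t_batched, eff_batched = offload_time(groups_per_call=100, **params)
```

In this model, sending many interaction groups per call amortizes the per-call latency, while reducing the bytes moved per group is the only way to shrink the transfer term.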

preprint, 2016, arXiv

Implementation and performance of FDPS: A Framework Developing Parallel Particle Simulation Codes

We present the basic idea, implementation, measured performance and performance model of FDPS (Framework for Developing Particle Simulators). FDPS is an application-development framework which helps researchers develop particle-based simulation programs for large-scale distributed-memory parallel supercomputers. A particle-based simulation program for distributed-memory parallel computers needs to perform domain decomposition, redistribution of particles, and gathering of particle information for interaction calculation. Also, even if distributed-memory parallel computers are not used, algorithms such as the Barnes-Hut tree method should be used for long-range interactions in order to reduce the amount of computation. For short-range interactions, some method to limit the calculation to neighbor particles is necessary. FDPS provides all of these necessary functions for efficient parallel execution of particle-based simulations as "templates", which are independent of the actual data structure of particles and the functional form of the interaction. By using FDPS, researchers can write their programs with the amount of work necessary to write a simple, sequential and unoptimized program of O(N^2) calculation cost, and yet the program, once compiled with FDPS, will run efficiently on large-scale parallel supercomputers. A simple gravitational N-body program can be written in around 120 lines. We report the actual performance of these programs and the performance model. The weak scaling performance is very good, and almost linear speedup was obtained up to the full system of the K computer. The minimum calculation time per timestep is in the range of 30 ms (N=10^7) to 300 ms (N=10^9). These times are currently limited by the calculation of the domain decomposition and the communication necessary for the interaction calculation. We discuss how we can overcome these bottlenecks.
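The division of labour described here, where the user supplies a particle type and a pairwise interaction and the framework supplies the loop, can be sketched in a few lines. This is an illustration of the idea only: FDPS itself is a C++ template library, and `Particle`, `gravity_kernel` and `calc_force_all` below are hypothetical names, not its API. The driver is the naive O(N^2) loop that a framework would replace with tree and domain-decomposition machinery.

```python
from dataclasses import dataclass, field

@dataclass
class Particle:
    """User-defined particle type; the driver only relies on `acc`."""
    mass: float
    pos: tuple
    acc: list = field(default_factory=lambda: [0.0, 0.0, 0.0])

def gravity_kernel(pi, pj, eps2=1e-4):
    """User-supplied interaction: softened gravity on pi due to pj (G = 1)."""
    d = [pj.pos[k] - pi.pos[k] for k in range(3)]
    r2 = sum(x * x for x in d) + eps2
    inv_r3 = r2 ** -1.5
    return [pj.mass * x * inv_r3 for x in d]

def calc_force_all(particles, kernel):
    """Generic driver: accumulate kernel(pi, pj) over all distinct pairs."""
    for pi in particles:
        pi.acc = [0.0, 0.0, 0.0]
        for pj in particles:
            if pi is pj:
                continue
            contribution = kernel(pi, pj)
            for k in range(3):
                pi.acc[k] += contribution[k]

bodies = [Particle(1.0, (0.0, 0.0, 0.0)), Particle(2.0, (1.0, 0.0, 0.0))]
calc_force_all(bodies, gravity_kernel)
```

Because the driver never inspects the particle fields beyond `acc`, a different kernel (for example a short-range force with a cutoff) can be passed in without changing `calc_force_all`.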

preprint, 2015, arXiv

GPU-Enabled Particle-Particle Particle-Tree Scheme for Simulating Dense Stellar Cluster System

We describe the implementation and performance of the ${\rm P^3T}$ (Particle-Particle Particle-Tree) scheme for simulating dense stellar systems. In ${\rm P^3T}$, the force experienced by a particle is split into short-range and long-range contributions. Short-range forces are evaluated by direct summation and integrated with the fourth-order Hermite predictor-corrector method with block timesteps. For long-range forces, we use a combination of the Barnes-Hut tree code and the leapfrog integrator. The tree part of our simulation environment is accelerated using graphics processing units (GPUs), whereas the direct summation is carried out on the host CPU. Our code gives excellent performance and accuracy for star cluster simulations with a large number of particles, even when the core size of the star cluster is small.
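The short-range and long-range split can be sketched with a smooth changeover function that moves a pairwise force from one integrator to the other as the separation grows. The smoothstep polynomial and the radii `r_in` and `r_out` below are assumptions chosen for illustration; the paper's actual cutoff function is not given in this abstract.

```python
def changeover(r, r_in, r_out):
    """0 below r_in, 1 above r_out, smoothstep in between (assumed form)."""
    if r <= r_in:
        return 0.0
    if r >= r_out:
        return 1.0
    x = (r - r_in) / (r_out - r_in)
    return x * x * (3.0 - 2.0 * x)

def split_force(m_i, m_j, r, r_in, r_out, G=1.0):
    """Return (short, long) magnitudes whose sum is the full Newtonian force."""
    total = G * m_i * m_j / (r * r)
    k = changeover(r, r_in, r_out)
    return total * (1.0 - k), total * k

# Hypothetical cutoff radii: pairs closer than 0.1 are purely short-range,
# pairs farther apart than 1.0 are purely long-range.
short_near, long_near = split_force(1.0, 1.0, 0.05, r_in=0.1, r_out=1.0)
short_mid, long_mid = split_force(1.0, 1.0, 0.55, r_in=0.1, r_out=1.0)
short_far, long_far = split_force(1.0, 1.0, 2.0, r_in=0.1, r_out=1.0)
```

In a scheme of this kind the short-range part is what the Hermite integrator sees, so only pairs inside `r_out` need small individual timesteps, while the smooth long-range part can be advanced by the tree code with a shared step.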

preprint, 2010, arXiv

Eccentric evolution of SMBH binaries

In recent numerical simulations \citep{matsubayashi07,lockmann08}, it has been found that the eccentricity of supermassive black hole (SMBH) - intermediate-mass black hole (IMBH) binaries grows toward unity through interactions with the stellar background. This increase of eccentricity reduces the merging timescale of the binary through gravitational radiation to a value well below the Hubble time. It also gives a theoretical explanation for the existence of eccentric binaries such as that in OJ287 \citep{lehto96, valtonen08}. In self-consistent N-body simulations, this increase of eccentricity is always observed. On the other hand, the results of scattering experiments between SMBH binaries and field stars \citep{quinlan96} indicated no increase of eccentricity. This discrepancy leaves the high eccentricity of the SMBH binaries in $N$-body simulations unexplained. Here we present a stellar-dynamical mechanism that drives the increase of the eccentricity of an SMBH binary with a large mass ratio. There are two key processes involved. The first is the Kozai mechanism in a non-axisymmetric potential, which effectively randomizes the angular momenta of surrounding stars. The other is the selective ejection of stars with prograde orbits. Through these two mechanisms, field stars extract the orbital angular momentum of the SMBH binary. Our proposed mechanism causes the eccentricity of most SMBH binaries to increase, resulting in a rapid merger through gravitational wave radiation. Our result gives a definite solution to the "last-parsec problem".
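The statement that a growing eccentricity shortens the gravitational-wave merging timescale can be made quantitative with the standard Peters (1964) formulae, which are not part of this abstract. The sketch below uses the circular-orbit timescale together with its high-eccentricity limit, which is accurate only for eccentricities close to 1; the constants are rounded SI values and the binary parameters are hypothetical.

```python
G = 6.674e-11      # gravitational constant, m^3 kg^-1 s^-2 (rounded)
C = 2.998e8        # speed of light, m s^-1 (rounded)
M_SUN = 1.989e30   # solar mass, kg (rounded)
PARSEC = 3.086e16  # parsec, m (rounded)
YEAR = 3.156e7     # year, s (rounded)

def circular_merger_time(m1, m2, a):
    """Peters (1964) merger time in seconds for a circular orbit."""
    return 5.0 * C**5 * a**4 / (256.0 * G**3 * m1 * m2 * (m1 + m2))

def eccentric_merger_time(m1, m2, a, e):
    """High-eccentricity limit of the Peters result (valid for e near 1)."""
    return (768.0 / 425.0) * circular_merger_time(m1, m2, a) * (1.0 - e * e) ** 3.5

# Hypothetical SMBH-IMBH binary, chosen only to show the scaling.
m_smbh, m_imbh = 1e8 * M_SUN, 1e5 * M_SUN
a0 = 0.01 * PARSEC
t_e090_yr = eccentric_merger_time(m_smbh, m_imbh, a0, 0.90) / YEAR
t_e099_yr = eccentric_merger_time(m_smbh, m_imbh, a0, 0.99) / YEAR
```

The factor (1 - e^2)^(7/2) is what makes the effect so strong: at fixed semi-major axis, raising the eccentricity from 0.90 to 0.99 shortens the merger time by a factor of roughly a few thousand.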

preprint, 2010, arXiv

The origin of S-stars and a young stellar disk: distribution of debris stars of a sinking star cluster

Within a distance of 1 pc from the Galactic center (GC), more than 100 young massive stars have been found. The massive stars at 0.1-1 pc from the GC are located in one or two disks, while those within 0.1 pc from the GC, the S-stars, have an isotropic distribution. How these stars formed is not well understood, especially for the S-stars. Here we propose that a young star cluster with an intermediate-mass black hole (IMBH) can form both the disks and the S-stars. We performed a fully self-consistent $N$-body simulation of a star cluster near the GC. Stars that escaped from the tidally disrupted star cluster were carried to the GC due to a 1:1 mean motion resonance with the IMBH formed in the cluster. In the final phase of the evolution, the eccentricity of the IMBH becomes very high. In this phase, stars carried by the 1:1 resonance with the IMBH were dropped from the resonance and their orbits were randomized by a chaotic Kozai mechanism. The mass function of these carried stars is extremely top-heavy within 10". The surface density distribution of young massive stars has a slope of -1.5 within 10" from the GC. The distribution of stars in the most central region is isotropic. These characteristics agree well with those of stars observed within 10" from the GC.

preprint, 2009, arXiv

Trojan Stars in the Galactic Center

We performed, for the first time, a simulation of the spiral-in of a star cluster formed close to the Galactic center (GC) using a fully self-consistent $N$-body model. In our model, the central super-massive black hole (SMBH) is surrounded by stars and the star cluster. Not only are the orbits of stars and the cluster stars integrated self-consistently, but the stellar evolution, collisions and merging of the cluster stars are also included. We found that an intermediate-mass black hole (IMBH) is formed in the star cluster and that stars which escaped from the cluster are captured into a 1:1 mean motion resonance with the IMBH. These "Trojan" stars are brought close to the SMBH by the IMBH, which spirals into the GC due to dynamical friction. Our results show that, once the IMBH is formed, it brings the massive stars to the vicinity of the central SMBH even after the star cluster itself is disrupted. Stars carried by the IMBH form a disk similar to the observed disks, and the core of the cluster including the IMBH has properties similar to those of IRS13E, which is a compact assembly of several young stars.
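A 1:1 mean motion resonance means the star and the IMBH complete one orbit around the SMBH in the same time, which by Kepler's third law requires nearly equal semi-major axes. The sketch below checks that condition in the Keplerian limit; it treats both bodies as test particles around a single central mass, with `G = 1` and hypothetical function names, and it ignores the stellar background included in the simulation.

```python
import math

def kepler_period(a, m_central, G=1.0):
    """Orbital period of a test particle with semi-major axis a."""
    return 2.0 * math.pi * math.sqrt(a**3 / (G * m_central))

def mean_motion_ratio(a_star, a_imbh):
    """n_star / n_imbh for two bodies orbiting the same central mass.

    Mean motion scales as a**-1.5, so the central mass cancels out.
    """
    return (a_imbh / a_star) ** 1.5

def in_one_to_one_resonance(a_star, a_imbh, tolerance=0.05):
    """Crude check: mean motions agree to within the given fraction."""
    return abs(mean_motion_ratio(a_star, a_imbh) - 1.0) < tolerance

trojan_like = in_one_to_one_resonance(1.01, 1.0)
not_resonant = in_one_to_one_resonance(4.0, 1.0)
```

As the IMBH's semi-major axis shrinks through dynamical friction, a star that stays in the resonance must shrink its own orbit in step, which is how the Trojan stars are carried inward.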
