Source author record

Lukas Einkemmer

Lukas Einkemmer appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

math.NA; Numerical Analysis; physics.comp-ph; astro-ph.IM; physics.plasm-ph; Distributed, Parallel, and Cluster Computing; math-ph; math.MP; math.OC; Mathematical Software; Performance; physics.app-ph

Catalog footprint

12 works, 12 topics, 4 close collaborators

preprint, 2023, arXiv

LeXInt: Package for Exponential Integrators employing Leja interpolation

We present publicly available software for exponential integrators that computes the $φ_l(z)$ functions using polynomial interpolation. Interpolation at Leja points has recently been shown to be competitive with the traditionally used Krylov subspace method. The developed framework facilitates easy adaptation into any Python software package for time integration.
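For orientation, the $φ_l(z)$ functions can be written as the series $φ_l(z) = \sum_{k \ge 0} z^k/(k+l)!$, so that $φ_0(z) = e^z$ and $φ_1(z) = (e^z - 1)/z$. The sketch below evaluates a truncated series for scalar arguments only. It is not LeXInt's interface and does not use Leja interpolation; the function name `phi` and the `terms` parameter are assumptions made for this illustration.

```python
import math

def phi(l, z, terms=40):
    """Truncated series for phi_l(z) = sum_{k>=0} z**k / (k + l)!.

    Illustration only: suitable for scalar z of moderate size,
    not for the matrix arguments an exponential integrator needs.
    """
    return sum(z**k / math.factorial(k + l) for k in range(terms))

z = 0.5
phi0 = phi(0, z)  # equals exp(z)
phi1 = phi(1, z)  # equals (exp(z) - 1) / z
phi2 = phi(2, z)  # satisfies the recurrence phi_2 = (phi_1 - 1) / z
```

The recurrence $φ_{l+1}(z) = (φ_l(z) - 1/l!)/z$ links consecutive functions, which gives a quick consistency check on the values.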

preprint, 2022, arXiv

Efficient 6D Vlasov simulation using the dynamical low-rank framework Ensign

Running kinetic simulations using grid-based methods is extremely expensive due to the up to six-dimensional phase space. Recently, it has been shown that dynamical low-rank algorithms can drastically reduce the required computational effort, while still accurately resolving important physical features such as filamentation and Landau damping. In this paper, we propose a new second order projector-splitting dynamical low-rank algorithm for the full six-dimensional Vlasov--Poisson equations. An exponential integrator based Fourier spectral method is employed to obtain a numerical scheme that is CFL condition free but still fully explicit. The resulting method is implemented with the aid of Ensign, a software framework which facilitates the efficient implementation of dynamical low-rank algorithms on modern multi-core CPU as well as GPU based systems. Its usage and features are briefly described in the paper as well. The presented numerical results demonstrate that 6D simulations can be run on a single workstation and highlight the significant speedup that can be obtained using GPUs.
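A rough storage count shows why a low-rank representation helps. The sketch assumes the usual factorization $f(x,v) \approx \sum_{ij} X_i(x) S_{ij} V_j(v)$ with rank $r$ and $n$ grid points per dimension. The values of `n` and `r` are hypothetical and are not taken from the paper.

```python
def full_grid_size(n, dims=6):
    """Number of degrees of freedom on a full tensor-product grid."""
    return n ** dims

def low_rank_size(n, r, dims_x=3, dims_v=3):
    """Degrees of freedom for r spatial factors, an r-by-r core, and r velocity factors."""
    return r * n**dims_x + r * r + r * n**dims_v

n, r = 64, 10                    # hypothetical resolution and rank
full = full_grid_size(n)         # 64**6
compressed = low_rank_size(n, r)
ratio = full / compressed        # storage reduction factor
```

With these illustrative numbers the full grid holds about 6.9e10 values and the low-rank factors about 5.2e6, which is the kind of gap that makes a single workstation plausible.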

preprint, 2022, arXiv

Efficient adaptive step size control for exponential integrators

Traditional step size controllers make the tacit assumption that the cost of a time step is independent of the step size. This is reasonable for explicit and implicit integrators that use direct solvers. In the context of exponential integrators, however, an iterative approach, such as Krylov methods or polynomial interpolation, to compute the action of the required matrix functions is usually employed. In this case, the assumption of constant cost is not valid. This is, in particular, a problem for higher-order exponential integrators, which are able to take relatively large time steps based on accuracy considerations. In this paper, we consider an adaptive step size controller for exponential Rosenbrock methods that determines the step size based on the premise of minimizing computational cost. The largest allowed step size, given by accuracy considerations, merely acts as a constraint. We test this approach on a range of nonlinear partial differential equations. Our results show significant improvements (up to a factor of 4 reduction in the computational cost) over the traditional step size controller for a wide range of tolerances.
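The difference between the two viewpoints can be sketched with a toy model. The first function is the classical accuracy-based controller; the second picks, among admissible steps, the one with the least work per unit of simulated time. This is an illustration of the premise only, not the controller proposed in the paper, and the cost model `toy_cost` and all numerical values are hypothetical.

```python
def accuracy_step(h, err, tol, order, safety=0.9):
    """Classical controller: largest step expected to meet the tolerance."""
    return safety * h * (tol / err) ** (1.0 / (order + 1))

def cost_aware_step(h_acc, cost, candidates=50):
    """Among steps h <= h_acc, pick the one minimizing cost(h) / h."""
    grid = [h_acc * (i + 1) / candidates for i in range(candidates)]
    return min(grid, key=lambda h: cost(h) / h)

def toy_cost(h):
    """Hypothetical iteration count that grows superlinearly with the step size."""
    return 5.0 + 400.0 * h * h

h_acc = accuracy_step(h=0.1, err=1e-6, tol=1e-4, order=3)
h_best = cost_aware_step(h_acc, toy_cost)
h_const = cost_aware_step(h_acc, lambda h: 10.0)  # constant cost per step
```

When the cost per step is constant, the search returns the accuracy-limited step, which recovers the traditional behavior. With the superlinear toy cost, a smaller step is cheaper overall.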

preprint, 2022, arXiv

Exponential Integrators for Resistive Magnetohydrodynamics: Matrix-free Leja Interpolation and Efficient Adaptive Time Stepping

We propose a novel algorithm for the temporal integration of the resistive magnetohydrodynamics (MHD) equations. The approach is based on exponential Rosenbrock schemes in combination with Leja interpolation. It naturally preserves Gauss's law for magnetism and is unencumbered by the stability constraints observed for explicit methods. Remarkable progress has been achieved in designing exponential integrators and computing the required matrix functions efficiently. However, employing them in MHD simulations of realistic physical scenarios requires a matrix-free implementation. We show how an efficient algorithm based on Leja interpolation that only uses the right-hand side of the differential equation (i.e. matrix free), can be constructed. We further demonstrate that it outperforms Krylov-based exponential integrators as well as explicit and implicit methods using test models of magnetic reconnection and the Kelvin--Helmholtz instability. Furthermore, an adaptive step-size strategy that gives excellent and predictable performance, particularly in the lenient- to intermediate-tolerance regime that is often of importance in practical applications, is employed.
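Leja points are built greedily: each new point maximizes the product of distances to the points already chosen. The sketch below does this on a discretized interval. The interval $[-2, 2]$, the choice of the right endpoint as starting point, and the grid resolution are assumptions for the illustration and are not taken from the paper.

```python
import math

def leja_points(count, a=-2.0, b=2.0, grid_size=4001):
    """Greedy Leja sequence on a uniform candidate grid over [a, b]."""
    grid = [a + (b - a) * i / (grid_size - 1) for i in range(grid_size)]
    points = [b]  # assumed convention: start at an endpoint of the interval
    while len(points) < count:
        # Next point maximizes the product of distances to all chosen points.
        points.append(max(grid, key=lambda z: math.prod(abs(z - x) for x in points)))
    return points

points = leja_points(4)
```

For this interval the sequence starts with the two endpoints, then the midpoint, and the fourth point lies at distance $2/\sqrt{3}$ from the origin (up to the grid spacing). Because the sequence is nested, the interpolation degree can be raised one point at a time.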

preprint, 2021, arXiv

A $μ$-mode integrator for solving evolution equations in Kronecker form

In this paper, we propose a $μ$-mode integrator for computing the solution of stiff evolution equations. The integrator is based on a $d$-dimensional splitting approach and uses exact (usually precomputed) one-dimensional matrix exponentials. We show that the action of the exponentials, i.e. the corresponding batched matrix-vector products, can be implemented efficiently on modern computer systems. We further explain how $μ$-mode products can be used to compute spectral transforms efficiently even if no fast transform is available. We illustrate the performance of the new integrator by solving, among others, three-dimensional linear and nonlinear Schrödinger equations, and we show that the $μ$-mode integrator can significantly outperform numerical methods well established in the field. We also discuss how to efficiently implement this integrator on both multi-core CPUs and GPUs. Finally, the numerical experiments show that using GPUs results in performance improvements between a factor of $10$ and $20$, depending on the problem.
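In two dimensions the idea reduces to a familiar identity: for $U' = A_1 U + U A_2^T$, the solution is $U(τ) = e^{τ A_1} U_0 (e^{τ A_2})^T$, so only small one-dimensional exponentials are needed. The sketch below checks this against the exponential of the full Kronecker-sum matrix. It uses plain Python lists and a truncated Taylor series for the exponential; the helper names and the test matrices are assumptions for the illustration and do not reflect the paper's implementation.

```python
def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def scale(A, s):
    return [[s * x for x in row] for row in A]

def expm(A, terms=30):
    """Truncated Taylor series; adequate only for small matrices of small norm."""
    n = len(A)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, A)]
        result = [[r + t for r, t in zip(rr, tt)] for rr, tt in zip(result, term)]
    return result

def kronecker_sum(A1, A2):
    """Matrix acting on the row-major flattening of U as A1 U + U A2^T."""
    n1, n2 = len(A1), len(A2)
    K = [[0.0] * (n1 * n2) for _ in range(n1 * n2)]
    for i in range(n1):
        for j in range(n2):
            for k in range(n1):
                K[i * n2 + j][k * n2 + j] += A1[i][k]
            for l in range(n2):
                K[i * n2 + j][i * n2 + l] += A2[j][l]
    return K

tau = 0.1
A1 = [[-2.0, 1.0], [1.0, -2.0]]
A2 = [[-1.0, 0.5, 0.0], [0.2, -1.0, 0.5], [0.0, 0.2, -1.0]]
U0 = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]

# Mode-wise application: two small exponentials applied along each dimension.
E1, E2 = expm(scale(A1, tau)), expm(scale(A2, tau))
U_mu = matmul(matmul(E1, U0), transpose(E2))

# Reference: exponential of the full 6-by-6 Kronecker-sum matrix.
u0 = [[x] for row in U0 for x in row]
u_full = matmul(expm(scale(kronecker_sum(A1, A2), tau)), u0)
max_diff = max(abs(U_mu[i][j] - u_full[i * 3 + j][0]) for i in range(2) for j in range(3))
```

The mode-wise version works with a 2-by-2 and a 3-by-3 exponential instead of a 6-by-6 one; in $d$ dimensions with $n$ points each, the gap is between $d$ matrices of size $n \times n$ and one of size $n^d \times n^d$.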

preprint, 2020, arXiv

Semi-Lagrangian Vlasov simulation on GPUs

In this paper, our goal is to efficiently solve the Vlasov equation on GPUs. A semi-Lagrangian discontinuous Galerkin scheme is used for the discretization. Such kinetic computations are extremely expensive due to the high-dimensional phase space. The SLDG code, which is publicly available under the MIT license, abstracts the number of dimensions and uses a shared codebase for both GPU and CPU based simulations. We investigate the performance of the implementation on a range of both Tesla (V100, Titan V, K80) and consumer (GTX 1080 Ti) GPUs. Our implementation is typically able to achieve a performance of approximately 470 GB/s on a single GPU and 1600 GB/s on four V100 GPUs connected via NVLink. This results in a speedup of about a factor of ten (comparing a single GPU with a dual socket Intel Xeon Gold node) and approximately a factor of 35 (comparing a single node with and without GPUs). In addition, we investigate the effect of single precision computation on the performance of the SLDG code and demonstrate that a template based dimension independent implementation can achieve good performance regardless of the dimensionality of the problem.
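The two throughput figures imply a multi-GPU scaling factor. The arithmetic below assumes that the single-GPU figure of 470 GB/s is comparable to the per-GPU rate in the four-GPU run, which the abstract does not state explicitly.

```python
single_gpu_gbs = 470.0   # approximate single-GPU throughput from the abstract
four_gpu_gbs = 1600.0    # throughput on four V100 GPUs from the abstract

scaling = four_gpu_gbs / single_gpu_gbs   # speedup going from one to four GPUs
efficiency = scaling / 4                  # fraction of ideal linear scaling
```

Under that assumption the four-GPU run is roughly 3.4 times faster than a single GPU, or about 85% of ideal scaling.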

preprint, 2019, arXiv

A low-rank projector-splitting integrator for the Vlasov--Maxwell equations with divergence correction

The Vlasov--Maxwell equations are used for the kinetic description of magnetized plasmas. As they are posed in an up to 3+3 dimensional phase space, solving this problem is extremely expensive from a computational point of view. In this paper, we exploit the low-rank structure in the solution of the Vlasov equation. More specifically, we consider the Vlasov--Maxwell system and propose a dynamic low-rank integrator. The key idea is to approximate the dynamics of the system by constraining it to a low-rank manifold. This is accomplished by a projection onto the tangent space. There, the dynamics is represented by the low-rank factors, which are determined by solving lower-dimensional partial differential equations. The proposed scheme performs well in numerical experiments and succeeds in capturing the main features of the plasma dynamics. We demonstrate this good behavior for a range of test problems. The coupling of the Vlasov equation with the Maxwell system, however, introduces additional challenges. In particular, the divergence of the electric field resulting from Maxwell's equations is not consistent with the charge density computed from the Vlasov equation. We propose a correction based on Lagrange multipliers which enforces Gauss' law up to machine precision.

preprint, 2019, arXiv

Performance optimization and modeling of fine-grained irregular communication in UPC

The UPC programming language offers parallelism via logically partitioned shared memory, which typically spans physically disjoint memory sub-systems. One convenient feature of UPC is its ability to automatically execute between-thread data movement, such that the entire content of a shared data array appears to be freely accessible by all the threads. The programmer friendliness, however, can come at the cost of substantial performance penalties. This is especially true when indirectly indexing the elements of a shared array, for which the induced between-thread data communication can be irregular and have a fine-grained pattern. In this paper we study performance enhancement strategies specifically targeting such fine-grained irregular communication in UPC. Starting from explicit thread privatization, continuing with block-wise communication, and arriving at message condensing and consolidation, we obtained considerable performance improvement of UPC programs that originally require fine-grained irregular communication. Besides the performance enhancement strategies, the main contribution of the present paper is to propose performance models for the different scenarios, in the form of quantifiable formulas that hinge on the actual volumes of various data movements plus a small number of easily obtainable hardware characteristic parameters. These performance models help to verify the enhancements obtained, while also providing insightful predictions of similar parallel implementations, not limited to UPC, that also involve between-thread or between-process irregular communication. As a further validation, we also apply our performance modeling methodology and hardware characteristic parameters to an existing UPC code for solving a 2D heat equation on a uniform mesh.
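Why condensing many small messages into few large ones pays off can be seen from a generic latency-bandwidth cost model. The sketch below is that textbook model, not the performance formulas proposed in the paper, and the latency, bandwidth, and message counts are hypothetical values.

```python
def transfer_time(num_messages, total_bytes, latency_s, bandwidth_bytes_per_s):
    """Generic model: a fixed latency per message plus volume divided by bandwidth."""
    return num_messages * latency_s + total_bytes / bandwidth_bytes_per_s

latency = 1e-6        # hypothetical per-message latency in seconds
bandwidth = 1e10      # hypothetical bandwidth in bytes per second
volume = 8 * 10**6    # one million 8-byte array elements

fine_grained = transfer_time(10**6, volume, latency, bandwidth)  # one element per message
condensed = transfer_time(1, volume, latency, bandwidth)         # one consolidated message
```

With these numbers the fine-grained pattern is dominated by latency and takes about one second, while the condensed transfer takes under a millisecond for the same data volume.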

preprint, 2013, arXiv

An almost symmetric Strang splitting scheme for the construction of high order composition methods

In this paper we consider splitting methods for nonlinear ordinary differential equations in which one of the (partial) flows that results from the splitting procedure cannot be computed exactly. Instead, we insert a well-chosen state $y_{\star}$ into the corresponding nonlinearity $b(y)y$, which results in a linear term $b(y_{\star})y$ whose exact flow can be determined efficiently. Therefore, in the spirit of splitting methods, it is still possible for the numerical simulation to satisfy certain properties of the exact flow. However, Strang splitting is no longer symmetric (even though it is still a second order method) and thus high order composition methods are not easily attainable. We will show that an iterated Strang splitting scheme can be constructed which yields a method that is symmetric up to a given order. This method can then be used to attain high order composition schemes. We will illustrate our theoretical results, up to order six, by conducting numerical experiments for a charged particle in an inhomogeneous electric field, a post-Newtonian computation in celestial mechanics, and a nonlinear population model and show that the methods constructed yield superior efficiency as compared to Strang splitting. For the first example we also perform a comparison with the standard fourth order Runge--Kutta methods and find significant gains in efficiency as well as better conservation properties.

preprint, 2013, arXiv

Convergence analysis of Strang splitting for Vlasov-type equations

A rigorous convergence analysis of the Strang splitting algorithm for Vlasov-type equations in the setting of abstract evolution equations is provided. It is shown that under suitable assumptions the convergence is of second order in the time step τ. As an example, it is verified that the Vlasov-Poisson equation in 1+1 dimensions fits into the framework of this analysis. Also, numerical experiments for the latter case are presented.
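Second-order convergence of Strang splitting can be observed on a scalar toy problem, far simpler than the Vlasov setting analyzed in the paper. The sketch splits $y' = -y - y^2$ into two parts whose flows are known in closed form and compares against the exact solution. The test equation, the step counts, and all function names are assumptions chosen for the illustration.

```python
import math

def flow_a(y, h):
    """Exact flow of y' = -y over a time h."""
    return y * math.exp(-h)

def flow_b(y, h):
    """Exact flow of y' = -y**2 over a time h."""
    return y / (1.0 + h * y)

def strang(y0, t_end, steps):
    """Strang splitting: half step of A, full step of B, half step of A."""
    h = t_end / steps
    y = y0
    for _ in range(steps):
        y = flow_a(y, h / 2)
        y = flow_b(y, h)
        y = flow_a(y, h / 2)
    return y

def exact(y0, t):
    """Closed-form solution of y' = -y - y**2 with y(0) = y0."""
    decay = math.exp(-t)
    return y0 * decay / (1.0 + y0 * (1.0 - decay))

errors = [abs(strang(1.0, 1.0, n) - exact(1.0, 1.0)) for n in (20, 40, 80)]
ratios = [errors[i] / errors[i + 1] for i in range(2)]
```

Halving the time step should reduce the error by a factor close to four, which is the signature of a second-order method.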

preprint, 2013, arXiv

Exponential Integrators on Graphic Processing Units

In this paper we revisit stencil methods on GPUs in the context of exponential integrators. We further discuss boundary conditions, in the same context, and show that simple boundary conditions (for example, homogeneous Dirichlet or homogeneous Neumann boundary conditions) do not affect the performance if implemented directly into the CUDA kernel. In addition, we show that stencil methods with position-dependent coefficients can be implemented efficiently as well. As an application, we discuss the implementation of exponential integrators for different classes of problems in a single and multi GPU setup (up to 4 GPUs). We further show that for stencil based methods such parallelization can be done very efficiently, while for some unstructured matrices the parallelization to multiple GPUs is severely limited by the throughput of the PCIe bus.

preprint, 2012, arXiv

Convergence analysis of a discontinuous Galerkin/Strang splitting approximation for the Vlasov--Poisson equations

A rigorous convergence analysis of the Strang splitting algorithm with a discontinuous Galerkin approximation in space for the Vlasov--Poisson equations is provided. It is shown that under suitable assumptions the error is of order $\mathcal{O}(τ^2+h^q +h^q / τ)$, where $τ$ is the size of a time step, $h$ is the cell size, and $q$ the order of the discontinuous Galerkin approximation. In order to investigate the recurrence phenomena for approximations of higher order as well as to compare the algorithm with numerical results already available in the literature a number of numerical simulations are performed.
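One consequence of the $h^q/τ$ term can be read off by balancing it against $τ^2$ for fixed $h$. The short calculation below is added for illustration and is not taken from the abstract; it concerns the error bound only, not necessarily the error observed in practice.

$$\frac{d}{dτ}\left(τ^2 + \frac{h^q}{τ}\right) = 2τ - \frac{h^q}{τ^2} = 0 \quad\Longrightarrow\quad τ_{\star} = \left(\frac{h^q}{2}\right)^{1/3}, \qquad τ_{\star}^2 + \frac{h^q}{τ_{\star}} = 3\left(\frac{h^q}{2}\right)^{2/3}.$$

For a given cell size $h$, the bound therefore stops improving once $τ$ drops below roughly $h^{q/3}$, and at that point it scales like $h^{2q/3}$.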
