Researcher profile

Robert D. Falgout

Robert D. Falgout contributes to research discovery and scholarly infrastructure.

Researcher · Affiliation not imported · Open to collaborate

Trust snapshot

Quick read

Trust 21 (Emerging) · Verification L1 · Unclaimed author
7 works
0 followers
6 topics
4 close collaborators

Published work

7 published items

Preprint · 2022 · arXiv

A New Semi-Structured Algebraic Multigrid Method

Multigrid methods are well suited to large massively parallel computer architectures because they are mathematically optimal and display excellent parallelization properties. Since current architecture trends are favoring regular compute patterns to achieve high performance, the ability to express structure has become much more important. The hypre software library provides high-performance multigrid preconditioners and solvers through conceptual interfaces, including a semi-structured interface that describes matrices primarily in terms of stencils and logically structured grids. This paper presents a new semi-structured algebraic multigrid (SSAMG) method built on this interface. The numerical convergence and performance of a CPU implementation of this method are evaluated for a set of semi-structured problems. SSAMG achieves significantly better setup times than hypre's unstructured AMG solvers and comparable convergence. In addition, the new method is capable of solving more complex problems than hypre's structured solvers.

Preprint · 2022 · arXiv

Multigrid Reduction in Time for Chaotic Dynamical Systems

As CPU clock speeds have stagnated and high performance computers continue to have ever higher core counts, increased parallelism is needed to take advantage of these new architectures. Traditional serial time-marching schemes can be a significant bottleneck, as many types of simulations require large numbers of time-steps which must be computed sequentially. Parallel in Time schemes, such as the Multigrid Reduction in Time (MGRIT) method, remedy this by parallelizing across time-steps, and have shown promising results for parabolic problems. However, chaotic problems have proved more difficult, since chaotic initial value problems (IVPs) are inherently ill-conditioned. MGRIT relies on a hierarchy of successively coarser time-grids to iteratively correct the solution on the finest time-grid, but due to the nature of chaotic systems, small inaccuracies on the coarser levels can be greatly magnified and lead to poor coarse-grid corrections. Here we introduce a modified MGRIT algorithm based on an existing quadratically converging nonlinear extension to the multigrid Full Approximation Scheme (FAS), as well as a novel time-coarsening scheme. Together, these approaches better capture long-term chaotic behavior on coarse grids and greatly improve convergence of MGRIT for chaotic IVPs. Further, we introduce a novel low-memory variant of the algorithm for solving chaotic PDEs with MGRIT which not only solves the IVP, but also provides estimates for the unstable Lyapunov vectors of the system. We provide supporting numerical results for the Lorenz system and demonstrate parallel speedup for the chaotic Kuramoto-Sivashinsky partial differential equation over a significantly longer time-domain than in previous works.

Preprint · 2022 · arXiv

Toward Parallel in Time for Chaotic Dynamical Systems

As CPU clock speeds have stagnated, and high performance computers continue to have ever higher core counts, increased parallelism is needed to take advantage of these new architectures. Traditional serial time-marching schemes are a significant bottleneck, as many types of simulations require large numbers of time-steps which must be computed sequentially. Parallel in Time schemes, such as the Multigrid Reduction in Time (MGRIT) method, remedy this by parallelizing across time-steps, and have shown promising results for parabolic problems. However, chaotic problems have proved more difficult, since chaotic initial value problems are inherently ill-conditioned. MGRIT relies on a hierarchy of successively coarser time-grids to iteratively correct the solution on the finest time-grid, but due to the nature of chaotic systems, subtle inaccuracies on the coarser levels can lead to poor coarse-grid corrections. Here we propose a modification to nonlinear FAS multigrid, as well as a novel time-coarsening scheme, which together better capture long-term behavior on coarse grids and greatly improve convergence of MGRIT for chaotic initial value problems. We provide supporting numerical results for the Lorenz system model problem.
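The ill-conditioning of chaotic initial value problems mentioned in this abstract can be illustrated with a small experiment. The sketch below is not taken from the paper: it integrates the Lorenz system with a classical fourth-order Runge-Kutta step, using the standard parameters (sigma = 10, rho = 28, beta = 8/3), and measures how far two trajectories drift apart when their initial conditions differ by a tiny perturbation. The function names, the initial state (1, 1, 1), the step size, and the perturbation size are all illustrative assumptions.

```python
import math


def lorenz_rhs(state, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Right-hand side of the Lorenz system with the standard parameters."""
    x, y, z = state
    return (sigma * (y - x), x * (rho - z) - y, x * y - beta * z)


def rk4_step(state, dt):
    """Advance the state by one classical fourth-order Runge-Kutta step."""
    def shifted(scale, slope):
        return tuple(s + scale * k for s, k in zip(state, slope))

    k1 = lorenz_rhs(state)
    k2 = lorenz_rhs(shifted(0.5 * dt, k1))
    k3 = lorenz_rhs(shifted(0.5 * dt, k2))
    k4 = lorenz_rhs(shifted(dt, k3))
    return tuple(
        s + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def separation_after(t_end, perturbation=1e-8, dt=0.01):
    """Distance at time t_end between two runs whose initial x differs by `perturbation`.

    The initial state (1, 1, 1) is an arbitrary illustrative choice.
    """
    a = (1.0, 1.0, 1.0)
    b = (1.0 + perturbation, 1.0, 1.0)
    for _ in range(round(t_end / dt)):
        a = rk4_step(a, dt)
        b = rk4_step(b, dt)
    return math.dist(a, b)


if __name__ == "__main__":
    for t in (0.1, 10.0, 30.0):
        print(f"t = {t:5.1f}  separation = {separation_after(t):.3e}")
```

Over a short horizon the two runs stay close, while over a long horizon the gap grows by many orders of magnitude. This is the same mechanism by which small coarse-grid inaccuracies can spoil a coarse-grid correction.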

Preprint · 2021 · arXiv

Optimizing multigrid reduction-in-time (MGRIT) and Parareal coarse-grid operators for linear advection

Parallel-in-time methods, such as multigrid reduction-in-time (MGRIT) and Parareal, provide an attractive option for increasing concurrency when simulating time-dependent PDEs in modern high-performance computing environments. While these techniques have been very successful for parabolic equations, it has often been observed that their performance suffers dramatically when applied to advection-dominated problems or purely hyperbolic PDEs using standard rediscretization approaches on coarse grids. In this paper, we apply MGRIT or Parareal to the constant-coefficient linear advection equation, appealing to existing convergence theory to provide insight into the typically non-scalable or even divergent behavior of these solvers for this problem. To overcome these failings, we replace rediscretization on coarse grids with improved coarse-grid operators that are computed by applying optimization techniques to approximately minimize error estimates from the convergence theory. One of our main findings is that, in order to obtain fast convergence as for parabolic problems, coarse-grid operators should take into account the behavior of the hyperbolic problem by tracking the characteristic curves. Our approach is tested for schemes of various orders using explicit or implicit Runge-Kutta methods combined with upwind-finite-difference spatial discretizations. In all cases, we obtain scalable convergence in just a handful of iterations, with parallel tests also showing significant speed-ups over sequential time-stepping. Our insight of tracking characteristics on coarse grids provides a key idea for solving the long-standing problem of efficient parallel-in-time integration for hyperbolic PDEs.
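For readers unfamiliar with Parareal, the basic iteration can be sketched on the scalar test equation u' = lambda * u. This is a minimal illustration under stated assumptions, not the paper's method: the coarse propagator here is the plain rediscretization (one backward Euler step per time interval) that the paper replaces with optimized operators, the fine propagator is many backward Euler sub-steps, and all function names and parameter values are hypothetical. The default lambda = -1 in the demo corresponds to the decaying, parabolic-like regime where Parareal is known to behave well.

```python
def backward_euler(u, lam, dt, steps=1):
    """Propagate u' = lam * u over an interval dt using `steps` backward Euler sub-steps."""
    h = dt / steps
    for _ in range(steps):
        u = u / (1.0 - lam * h)
    return u


def serial_fine(u0, lam, t_end, intervals, fine_steps=50):
    """Reference solution: apply the fine propagator sequentially over all intervals."""
    dt = t_end / intervals
    u = [u0]
    for n in range(intervals):
        u.append(backward_euler(u[n], lam, dt, fine_steps))
    return u


def parareal(u0, lam, t_end, intervals, iterations, fine_steps=50):
    """Parareal with a rediscretized coarse propagator (illustrative assumption)."""
    dt = t_end / intervals

    def coarse(v):
        return backward_euler(v, lam, dt, 1)

    def fine(v):
        return backward_euler(v, lam, dt, fine_steps)

    # Initial guess: one sequential sweep with the cheap coarse propagator.
    u = [u0]
    for n in range(intervals):
        u.append(coarse(u[n]))

    for _ in range(iterations):
        # These evaluations are independent per interval; a real code runs them in parallel.
        f_old = [fine(v) for v in u[:-1]]
        g_old = [coarse(v) for v in u[:-1]]
        # Sequential coarse sweep with the fine-minus-coarse correction.
        new = [u0]
        for n in range(intervals):
            new.append(coarse(new[n]) + f_old[n] - g_old[n])
        u = new
    return u


def max_error(u, ref):
    """Largest pointwise difference between two time-grid solutions."""
    return max(abs(a - b) for a, b in zip(u, ref))


if __name__ == "__main__":
    ref = serial_fine(1.0, -1.0, 10.0, 20)
    for k in range(5):
        err = max_error(parareal(1.0, -1.0, 10.0, 20, k), ref)
        print(f"iterations = {k}  max error vs. serial fine = {err:.3e}")
```

With as many iterations as time intervals, the iteration reproduces the sequential fine solution up to rounding, so the method is only useful when it converges in far fewer iterations than that. The abstract's point is that plain rediscretization does not achieve this for advection, which motivates the optimized coarse-grid operators.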

Preprint · 2021 · arXiv

Time-periodic steady-state solution of fluid-structure interaction and cardiac flow problems through multigrid-reduction-in-time

In this paper, a time-periodic MGRIT algorithm is proposed as a means to reduce the time-to-solution of numerical algorithms by exploiting the time periodicity inherent to many applications in science and engineering. The time-periodic MGRIT algorithm is applied to a variety of linear and nonlinear single- and multiphysics problems that are periodic-in-time. It is demonstrated that the proposed parallel-in-time algorithm can obtain the same time-periodic steady-state solution as sequential time-stepping. It is shown that the required number of MGRIT iterations can be estimated a priori and that the new MGRIT variant can significantly and consistently reduce the time-to-solution compared to sequential time-stepping, irrespective of the number of dimensions, linear or nonlinear PDE models, single-physics or coupled problems and the employed computing resources. The numerical experiments demonstrate that the time-periodic MGRIT algorithm enables a greater level of parallelism, yielding faster turnaround and thus allowing more complex and more realistic problems to be solved.

Preprint · 2020 · arXiv

Parallel Performance of Algebraic Multigrid Domain Decomposition (AMG-DD)

Algebraic multigrid (AMG) is a widely used scalable solver and preconditioner for large-scale linear systems resulting from the discretization of a wide class of elliptic PDEs. While AMG has optimal computational complexity, the cost of communication has become a significant bottleneck that limits its scalability as processor counts continue to grow on modern machines. This paper examines the design, implementation, and parallel performance of a novel algorithm, Algebraic Multigrid Domain Decomposition (AMG-DD), designed specifically to limit communication. The goal of AMG-DD is to provide a low-communication alternative to standard AMG V-cycles by trading some additional computational overhead for a significant reduction in communication cost. Numerical results show that AMG-DD achieves superior accuracy per communication cost compared to AMG, and speedup over AMG is demonstrated on a large GPU cluster.

Preprint · 2019 · arXiv

Provably Optimal Parallel Transport Sweeps on Semi-Structured Grids

We have found provably optimal algorithms for full-domain discrete-ordinate transport sweeps on a class of grids in 2D and 3D Cartesian geometry that are regular at a coarse level but arbitrary within the coarse blocks. We describe these algorithms and show that they always execute the full eight-octant (or four-quadrant if 2D) sweep in the minimum possible number of stages for a given Px × Py × Pz partitioning. Computational results confirm that our optimal scheduling algorithms execute sweeps in the minimum possible stage count. Observed parallel efficiencies agree well with our performance model. Our PDT transport code has achieved approximately 68% parallel efficiency with > 1.5M parallel threads, relative to 8 threads, on a simple weak-scaling problem with only three energy groups, 10 directions per octant, and 4096 cells/core. We demonstrate similar efficiencies on a much more realistic set of nuclear-reactor test problems, with unstructured meshes that resolve fine geometric details. These results demonstrate that discrete-ordinates transport sweeps can be executed with high efficiency using more than 10^6 parallel processes.