Source author record

Robert M. Kirby

Robert M. Kirby appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

math.NA; Numerical Analysis; Machine Learning; physics.comp-ph; Computational Engineering, Finance, and Science; Applications; Computation; cond-mat.mtrl-sci; Distributed, Parallel, and Cluster Computing; General Literature; math.OC; Mathematical Software; physics.flu-dyn; Quantitative Methods

Catalog footprint

What is connected

20 works

14 topics

4 close collaborators

preprint, 2023, arXiv

A Metalearning Approach for Physics-Informed Neural Networks (PINNs): Application to Parameterized PDEs

Physics-informed neural networks (PINNs) as a means of discretizing partial differential equations (PDEs) are garnering much attention in the Computational Science and Engineering (CS&E) world. At least two challenges exist for PINNs at present: an understanding of accuracy and convergence characteristics with respect to tunable parameters and identification of optimization strategies that make PINNs as efficient as other computational science tools. The cost of PINNs training remains a major challenge of Physics-informed Machine Learning (PiML) - and, in fact, machine learning (ML) in general. This paper is meant to move towards addressing the latter through the study of PINNs on new tasks, for which parameterized PDEs provide a good testbed application as tasks can be easily defined in this context. Following the ML world, we introduce metalearning of PINNs with application to parameterized PDEs. By introducing metalearning and transfer learning concepts, we can greatly accelerate the PINNs optimization process. We present a survey of model-agnostic metalearning, and then discuss our model-aware metalearning applied to PINNs as well as implementation considerations and algorithmic complexity. We then test our approach on various canonical forward parameterized PDEs that have been presented in the emerging PINNs literature.

preprint, 2023, arXiv

Multifidelity Modeling for Physics-Informed Neural Networks (PINNs)

Multifidelity simulation methodologies are often used in an attempt to judiciously combine low-fidelity and high-fidelity simulation results in an accuracy-increasing, cost-saving way. Candidates for this approach are simulation methodologies for which there are fidelity differences connected with significant computational cost differences. Physics-informed Neural Networks (PINNs) are candidates for these types of approaches due to the significant difference in training times required when different fidelities (expressed in terms of architecture width and depth as well as optimization criteria) are employed. In this paper, we propose a particular multifidelity approach applied to PINNs that exploits low-rank structure. We demonstrate that width, depth, and optimization criteria can be used as parameters related to model fidelity, and show numerical justification of cost differences in training due to fidelity parameter choices. We test our multifidelity scheme on various canonical forward PDE models that have been presented in the emerging PINNs literature.

preprint, 2023, arXiv

Weight Matrix Dimensionality Reduction in Deep Learning via Kronecker Multi-layer Architectures

Deep learning using neural networks is an effective technique for generating models of complex data. However, training such models can be expensive when networks have large model capacity resulting from a large number of layers and nodes. For training in such a computationally prohibitive regime, dimensionality reduction techniques ease the computational burden, and allow implementations of more robust networks. We propose a novel type of such dimensionality reduction via a new deep learning architecture based on fast matrix multiplication of a Kronecker product decomposition; in particular our network construction can be viewed as a Kronecker product-induced sparsification of an "extended" fully connected network. Analysis and practical examples show that this architecture allows a neural network to be trained and implemented with a significant reduction in computational time and resources, while achieving a similar error level compared to a traditional feedforward neural network.
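The "fast matrix multiplication of a Kronecker product decomposition" mentioned above can be illustrated with a standard linear algebra identity, $(A \otimes B)\,\mathrm{vec}(X) = \mathrm{vec}(B X A^T)$ with column-major vectorization. The sketch below is only an illustration of that identity, not the architecture proposed in the paper; the function name `kron_matvec` and the matrix sizes are assumptions made for the example.

```python
import numpy as np

def kron_matvec(A, B, x):
    """Compute (A kron B) @ x without forming the Kronecker product.

    A is (m, n), B is (p, q), x has length n * q.
    Uses (A kron B) vec(X) = vec(B X A^T) with column-major vec.
    """
    m, n = A.shape
    p, q = B.shape
    X = x.reshape(q, n, order="F")      # un-vec x into a q-by-n matrix
    Y = B @ X @ A.T                     # p-by-m result
    return Y.reshape(p * m, order="F")  # re-vec, column-major

# Illustrative sizes only (hypothetical, not taken from the paper)
rng = np.random.default_rng(0)
A = rng.standard_normal((3, 4))
B = rng.standard_normal((5, 2))
x = rng.standard_normal(4 * 2)

dense = np.kron(A, B) @ x     # forms the full 15-by-8 matrix
fast = kron_matvec(A, B, x)   # works only with the two small factors
```

The dense route stores and multiplies an $mp \times nq$ matrix, while the factored route touches only the $m \times n$ and $p \times q$ factors, which is where the savings in time and memory come from.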

preprint, 2022, arXiv

A bandit-learning approach to multifidelity approximation

Multifidelity approximation is an important technique in scientific computation and simulation. In this paper, we introduce a bandit-learning approach for leveraging data of varying fidelities to achieve precise estimates of the parameters of interest. Under a linear model assumption, we formulate a multifidelity approximation as a modified stochastic bandit, and analyze the loss for a class of policies that uniformly explore each model before exploiting. Utilizing the estimated conditional mean-squared error, we propose a consistent algorithm, adaptive Explore-Then-Commit (AETC), and establish a corresponding trajectory-wise optimality result. These results are then extended to the case of vector-valued responses, where we demonstrate that the algorithm is efficient without the need to worry about estimating high-dimensional parameters. The main advantage of our approach is that we require neither hierarchical model structure nor a priori knowledge of statistical information (e.g., correlations) about or between models. Instead, the AETC algorithm requires only knowledge of which model is a trusted high-fidelity model, along with (relative) computational cost estimates of querying each model. Numerical experiments are provided at the end to support our theoretical findings.

preprint, 2022, arXiv

Adaptive Self-supervision Algorithms for Physics-informed Neural Networks

Physics-informed neural networks (PINNs) incorporate physical knowledge from the problem domain as a soft constraint on the loss function, but recent work has shown that this can lead to optimization difficulties. Here, we study the impact of the location of the collocation points on the trainability of these models. We find that the vanilla PINN performance can be significantly boosted by adapting the location of the collocation points as training proceeds. Specifically, we propose a novel adaptive collocation scheme which progressively allocates more collocation points (without increasing their number) to areas where the model is making higher errors (based on the gradient of the loss function in the domain). This, coupled with a judicious restarting of the training during any optimization stalls (by simply resampling the collocation points in order to adjust the loss landscape) leads to better estimates for the prediction error. We present results for several problems, including a 2D Poisson and diffusion-advection system with different forcing functions. We find that training vanilla PINNs for these problems can result in up to 70% prediction error in the solution, especially in the regime of low collocation points. In contrast, our adaptive schemes can achieve up to an order of magnitude smaller error, with similar computational complexity as the baseline. Furthermore, we find that the adaptive methods consistently perform on par with or slightly better than the vanilla PINN method, even for large collocation point regimes. The code for all the experiments has been open sourced.
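The idea of reallocating a fixed budget of collocation points toward high-error regions can be sketched as weighted resampling from a candidate pool. This is a minimal illustration under stated assumptions, not the authors' released code: the function `resample_collocation`, the `uniform_frac` blending parameter, and the Gaussian bump used as a stand-in for the per-point error score are all hypothetical.

```python
import numpy as np

def resample_collocation(candidates, scores, n_points, rng, uniform_frac=0.1):
    """Draw n_points distinct candidates, favoring those with high scores.

    scores is any nonnegative per-candidate error indicator (for a PINN
    this could be a residual or loss-gradient magnitude; here it is a
    stand-in supplied by the caller).
    """
    scores = np.abs(scores)
    p = scores / scores.sum()
    # Blend with a uniform distribution so low-error regions keep some points
    p = (1.0 - uniform_frac) * p + uniform_frac / len(candidates)
    idx = rng.choice(len(candidates), size=n_points, replace=False, p=p)
    return candidates[idx]

rng = np.random.default_rng(0)
candidates = np.linspace(0.0, 1.0, 1000)
# Stand-in score: pretend the model error is concentrated near x = 0.8
scores = np.exp(-((candidates - 0.8) / 0.05) ** 2)
points = resample_collocation(candidates, scores, 100, rng)
frac_near_peak = np.mean(np.abs(points - 0.8) < 0.1)
```

The number of collocation points stays at 100 throughout; only their locations change, which mirrors the "without increasing their number" constraint in the abstract.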

preprint, 2022, arXiv

Machine Learning in Heterogeneous Porous Materials

The "Workshop on Machine learning in heterogeneous porous materials" brought together international scientific communities of applied mathematics, porous media, and material sciences with experts in the areas of heterogeneous materials, machine learning (ML) and applied mathematics to identify how ML can advance materials research. Within the scope of ML and materials research, the goal of the workshop was to discuss the state-of-the-art in each community, promote crosstalk and accelerate multi-disciplinary collaborative research, and identify challenges and opportunities. As the end result, four topic areas were identified: (1) ML in predicting materials properties, and discovery and design of novel materials; (2) ML in porous and fractured media and time-dependent phenomena; (3) multi-scale modeling in heterogeneous porous materials via ML; and (4) discovery of materials constitutive laws and new governing equations. This workshop was part of the AmeriMech Symposium series sponsored by the National Academies of Sciences, Engineering and Medicine and the U.S. National Committee on Theoretical and Applied Mechanics.

preprint, 2022, arXiv

Momentum Transformer: Closing the Performance Gap Between Self-attention and Its Linearization

Transformers have achieved remarkable success in sequence modeling and beyond but suffer from quadratic computational and memory complexities with respect to the length of the input sequence. Leveraging techniques including sparse and linear attention and hashing tricks, efficient transformers have been proposed to reduce the quadratic complexity of transformers, but they significantly degrade the accuracy. In response, we first interpret the linear attention and residual connections in computing the attention map as gradient descent steps. We then introduce momentum into these components and propose the momentum transformer, which utilizes momentum to improve the accuracy of linear transformers while maintaining linear memory and computational complexities. Furthermore, we develop an adaptive strategy to compute the momentum value for our model based on the optimal momentum for quadratic optimization. This adaptive momentum eliminates the need to search for the optimal momentum value and further enhances the performance of the momentum transformer. A range of experiments on both autoregressive and non-autoregressive tasks, including image generation and machine translation, demonstrate that the momentum transformer outperforms popular linear transformers in training efficiency and accuracy.
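For readers unfamiliar with linear attention, the sketch below shows where the linear complexity comes from: with a positive feature map $\phi$, the product $\phi(Q)\,(\phi(K)^T V)$ gives the same output as the explicit $N \times N$ form $(\phi(Q)\phi(K)^T)\,V$ but never builds the $N \times N$ matrix. This is plain non-causal linear attention, not the momentum transformer itself; the choice of `elu(x) + 1` as feature map and the tensor sizes are assumptions for illustration.

```python
import numpy as np

def feature_map(x):
    # elu(x) + 1: equals x + 1 for x > 0 and exp(x) otherwise (always positive)
    return np.where(x > 0, x + 1.0, np.exp(x))

def attention_quadratic(Q, K, V):
    # Explicit N-by-N similarity matrix: O(N^2) time and memory
    S = feature_map(Q) @ feature_map(K).T
    return (S @ V) / S.sum(axis=1, keepdims=True)

def attention_linear(Q, K, V):
    # Same output, associated as phi(Q) (phi(K)^T V): O(N) in sequence length
    phi_q, phi_k = feature_map(Q), feature_map(K)
    kv = phi_k.T @ V            # (d, d_v) summary of keys and values
    z = phi_k.sum(axis=0)       # (d,) normalizer summary
    return (phi_q @ kv) / (phi_q @ z)[:, None]

rng = np.random.default_rng(0)
N, d, d_v = 6, 4, 3             # hypothetical sizes
Q = rng.standard_normal((N, d))
K = rng.standard_normal((N, d))
V = rng.standard_normal((N, d_v))
out_quadratic = attention_quadratic(Q, K, V)
out_linear = attention_linear(Q, K, V)
```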

preprint, 2022, arXiv

Numerical Testing of a New Positivity-Preserving Interpolation Algorithm

An important component of a number of computational modeling algorithms is an interpolation method that preserves the positivity of the function being interpolated. This report describes the numerical testing of a new positivity-preserving algorithm that is designed to be used when interpolating from a solution defined on one grid to a different spatial grid. The motivating application for this work was a numerical weather prediction (NWP) code that uses a spectral element mesh discretization for its dynamics core and a Cartesian tensor product mesh for the evaluation of its physics routines. This coupling of spectral element mesh, which uses nonuniformly spaced quadrature/collocation points, and uniformly-spaced Cartesian mesh combined with the desire to maintain positivity when moving between these meshes necessitates our work. This new approach is evaluated against several typical algorithms in use on a range of test problems in one or more space dimensions. The results obtained show that the new method is competitive in terms of observed accuracy while at the same time preserving the underlying positivity of the functions being interpolated.

preprint, 2022, arXiv

Variational Inference for Nonlinear Inverse Problems via Neural Net Kernels: Comparison to Bayesian Neural Networks, Application to Topology Optimization

Inverse problems and, in particular, inferring unknown or latent parameters from data are ubiquitous in engineering simulations. A predominant viewpoint in identifying unknown parameters is Bayesian inference where both prior information about the parameters and the information from the observations via likelihood evaluations are incorporated into the inference process. In this paper, we adopt a similar viewpoint with a slightly different numerical procedure from standard inference approaches to provide insight about the localized behavior of unknown underlying parameters. We present a variational inference approach which mainly incorporates the observation data in a point-wise manner, i.e. we invert a limited number of observation data leveraging the gradient information of the forward map with respect to parameters, and find true individual samples of the latent parameters when the forward map is noise-free and one-to-one. For statistical calculations (as the ultimate goal in simulations), a large number of samples are generated from a trained neural network which serves as a transport map from the prior to posterior latent parameters. Our neural network machinery, developed as part of the inference framework and referred to as Neural Net Kernels (NNK), is based on hierarchical (deep) kernels which provide greater flexibility for training compared to standard neural networks. We showcase the effectiveness of our inference procedure in identifying bimodal and irregular distributions compared to a number of approaches including Markov Chain Monte Carlo sampling approaches and a Bayesian neural network approach.

preprint, 2021, arXiv

Fast Barycentric-Based Evaluation Over Spectral/hp Elements

As the use of spectral/$hp$ element methods, and high-order finite element methods in general, continues to spread, community efforts to create efficient, optimized algorithms associated with fundamental high-order operations have grown. Core tasks such as solution expansion evaluation at quadrature points, stiffness and mass matrix generation, and matrix assembly have received tremendous attention. With the expansion of the types of problems to which high-order methods are applied, and correspondingly the growth in types of numerical tasks accomplished through high-order methods, the number and types of these core operations broaden. This work focuses on solution expansion evaluation at arbitrary points within an element. This operation is core to many postprocessing applications such as evaluation of streamlines and pathlines, as well as to field projection techniques such as mortaring. We expand barycentric interpolation techniques developed on an interval to 2D (triangles and quadrilaterals) and 3D (tetrahedra, prisms, pyramids, and hexahedra) spectral/$hp$ element methods. We provide efficient algorithms for their implementations, and demonstrate their effectiveness using the spectral/$hp$ element library Nektar++.
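As background for the interval case that the paper extends, the classical barycentric formula evaluates the polynomial interpolant through nodes $x_j$ with values $f_j$ as $p(x) = \sum_j \frac{w_j f_j}{x - x_j} \big/ \sum_j \frac{w_j}{x - x_j}$, where $w_j = 1 / \prod_{k \ne j} (x_j - x_k)$. The sketch below covers only this 1D case, not the 2D/3D element algorithms or the Nektar++ implementation; the function names and the test function are assumptions for illustration.

```python
import numpy as np

def barycentric_weights(nodes):
    # w_j = 1 / prod_{k != j} (x_j - x_k)
    diffs = nodes[:, None] - nodes[None, :]
    np.fill_diagonal(diffs, 1.0)   # skip the k == j factor
    return 1.0 / diffs.prod(axis=1)

def barycentric_eval(nodes, values, weights, x):
    # Second barycentric form; return the nodal value on an exact node hit
    d = x - nodes
    hit = d == 0.0
    if hit.any():
        return values[np.argmax(hit)]
    t = weights / d
    return (t @ values) / t.sum()

def f(x):
    # Hypothetical test function: a cubic, reproduced exactly by 6 nodes
    return x**3 - 2.0 * x + 1.0

nodes = np.cos(np.pi * np.arange(6) / 5)   # Chebyshev-Lobatto points on [-1, 1]
values = f(nodes)
weights = barycentric_weights(nodes)
approx = barycentric_eval(nodes, values, weights, 0.3)
```

Once the weights are precomputed, each evaluation costs $O(n)$ operations for $n$ nodes, which is what makes the formula attractive for evaluation at many arbitrary points.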

preprint, 2021, arXiv

GP-HMAT: Scalable, ${O}(n\log(n))$ Gaussian Process Regression with Hierarchical Low-Rank Matrices

A Gaussian process (GP) is a powerful and widely used regression technique. The main building block of a GP regression is the covariance kernel, which characterizes the relationship between pairs in the random field. The optimization to find the optimal kernel, however, requires several large-scale and often unstructured matrix inversions. We tackle this challenge by introducing a hierarchical matrix approach, named HMAT, which effectively decomposes the matrix structure, in a recursive manner, into significantly smaller matrices where a direct approach could be used for inversion. Our matrix partitioning uses a particular aggregation strategy for data points, which promotes the low-rank structure of off-diagonal blocks in the hierarchical kernel matrix. We employ a randomized linear algebra method for matrix reduction on the low-rank off-diagonal blocks without factorizing a large matrix. We provide analytical error and cost estimates for the inversion of the matrix, investigate them empirically with numerical computations, and demonstrate the application of our approach on three numerical examples involving GP regression for engineering problems and a large-scale real dataset. We provide the computer implementation of GP-HMAT, HMAT adapted for GP likelihood and derivative computations, and the implementation of the last numerical example on a real dataset. We demonstrate superior scalability of the HMAT approach compared to the built-in $\backslash$ operator in MATLAB for large-scale linear solves $\mathbf{A}\mathbf{x} = \mathbf{y}$ via a repeatable and verifiable empirical study. An extension to hierarchical semiseparable (HSS) matrices is discussed as future research.

preprint, 2021, arXiv

Kernel optimization for Low-Rank Multi-Fidelity Algorithms

One of the major challenges for low-rank multi-fidelity (MF) approaches is the assumption that low-fidelity (LF) and high-fidelity (HF) models admit "similar" low-rank kernel representations. Low-rank MF methods have traditionally attempted to exploit low-rank representations of linear kernels, which are kernel functions of the form $K(u,v) = v^T u$ for vectors $u$ and $v$. However, such linear kernels may not be able to capture low-rank behavior, and they may admit LF and HF kernels that are not similar. Such a situation renders a naive approach to low-rank MF procedures ineffective. In this paper, we propose a novel approach for the selection of a near-optimal kernel function for use in low-rank MF methods. The proposed framework is a two-step strategy wherein: (1) hyperparameters of a library of kernel functions are optimized, and (2) a particular combination of the optimized kernels is selected, through either a convex mixture (Additive Kernels) or through a data-driven optimization (Adaptive Kernels). The two resulting methods for this generalized framework both utilize only the available inexpensive low-fidelity data and thus no evaluation of the high-fidelity simulation model is needed until a kernel is chosen. These proposed approaches are tested on five non-trivial problems including multi-fidelity surrogate modeling for one- and two-species molecular systems, gravitational many-body problem, associating polymer networks, plasmonic nano-particle arrays, and an incompressible flow in channels with stenosis. The results for these numerical experiments demonstrate the numerical stability and efficiency of both proposed kernel function selection procedures, as well as high accuracy of their resultant predictive models for estimation of quantities of interest. Comparisons against standard linear kernel procedures also demonstrate increased accuracy of the optimized kernel approaches.
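To make the linear kernel $K(u,v) = v^T u$ concrete, the sketch below builds its Gram matrix for a set of snapshot vectors and checks its numerical rank. It is only an illustration of what "low-rank kernel representation" means for a linear kernel, not the kernel selection procedure from the paper; the synthetic data, the function name `linear_gram`, and the rank tolerance are assumptions.

```python
import numpy as np

def linear_gram(U):
    # U holds one snapshot per column; entry (i, j) is K(u_i, u_j) = u_j^T u_i
    return U.T @ U

rng = np.random.default_rng(1)
basis = rng.standard_normal((50, 3))    # 3 hidden directions in R^50 (synthetic)
coeffs = rng.standard_normal((3, 20))
U = basis @ coeffs                      # 20 snapshots lying in a 3-dim subspace

G = linear_gram(U)
s = np.linalg.svd(G, compute_uv=False)
rank = int((s > 1e-8 * s[0]).sum())     # numerical rank with a relative cutoff
```

Here the 20-by-20 Gram matrix has numerical rank 3 because the snapshots were constructed to span a three-dimensional subspace; data without such linear structure would not compress this way, which is the limitation of linear kernels that the abstract points to.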

preprint, 2021, arXiv

Structure-preserving Nonlinear Filtering for Continuous and Discontinuous Galerkin Spectral/hp Element Methods

Finite element simulations have been used to solve various partial differential equations (PDEs) that model physical, chemical, and biological phenomena. The resulting discretized solutions to PDEs often do not satisfy requisite physical properties, such as positivity or monotonicity. Such invalid solutions pose both modeling challenges, since the physical interpretation of simulation results is not possible, and computational challenges, since such properties may be required to advance the scheme. We, therefore, consider the problem of computing solutions that preserve these structural solution properties, which we enforce as additional constraints on the solution. We consider in particular the class of convex constraints, which includes positivity and monotonicity. By embedding such constraints as a postprocessing convex optimization procedure, we can compute solutions that satisfy general types of convex constraints. For certain types of constraints (including positivity and monotonicity), the optimization is a filter, i.e., a norm-decreasing operation. We provide a variety of tests on one-dimensional time-dependent PDEs that demonstrate the method's efficacy, and we empirically show that rates of convergence are unaffected by the inclusion of the constraints.

preprint, 2020, arXiv

Structure-preserving function approximation via convex optimization

Approximations of functions with finite data often do not respect certain "structural" properties of the functions. For example, if a given function is non-negative, a polynomial approximation of the function is not necessarily also non-negative. We propose a formalism and algorithms for preserving certain types of such structure in function approximation. In particular, we consider structure corresponding to a convex constraint on the approximant (for which positivity is one example). The approximation problem then converts into a convex feasibility problem, but the feasible set is relatively complicated so that standard convex feasibility algorithms cannot be directly applied. We propose and discuss different algorithms for solving this problem. One of the features of our machinery is flexibility: relatively complicated constraints, such as simultaneously enforcing positivity, monotonicity, and convexity, are fairly straightforward to implement. We demonstrate the success of our algorithm on several problems in univariate function approximation.

preprint, 2019, arXiv

Nektar++: enhancing the capability and application of high-fidelity spectral/$hp$ element methods

Nektar++ is an open-source framework that provides a flexible, high-performance and scalable platform for the development of solvers for partial differential equations using the high-order spectral/$hp$ element method. In particular, Nektar++ aims to overcome the complex implementation challenges that are often associated with high-order methods, thereby allowing them to be more readily used in a wide range of application areas. In this paper, we present the algorithmic, implementation and application developments associated with our Nektar++ version 5.0 release. We describe some of the key software and performance developments, including our strategies on parallel I/O, on in situ processing, the use of collective operations for exploiting current and emerging hardware, and interfaces to enable multi-solver coupling. Furthermore, we provide details on a newly developed Python interface that enables a more rapid introduction for new users unfamiliar with spectral/$hp$ element methods, C++ and/or Nektar++. This release also incorporates a number of numerical method developments - in particular: the method of moving frames, which provides an additional approach for the simulation of equations on embedded curvilinear manifolds and domains; a means of handling spatially variable polynomial order; and a novel technique for quasi-3D simulations to permit spatially-varying perturbations to the geometry in the homogeneous direction. Finally, we demonstrate the new application-level features provided in this release, namely: a facility for generating high-order curvilinear meshes called NekMesh; a new AcousticSolver for aeroacoustic problems; and our development of a 'thick' strip model for the modelling of fluid-structure interaction problems in the context of vortex-induced vibrations. We conclude by commenting on some directions for future code development and expansion.

preprint, 2016, arXiv

Multi-dimensional filtering: Reducing the dimension through rotation

Over the past few decades there has been a strong effort towards the development of Smoothness-Increasing Accuracy-Conserving (SIAC) filters for Discontinuous Galerkin (DG) methods, designed to increase the smoothness and improve the convergence rate of the DG solution through this post-processor. These advantages can be exploited during flow visualization, for example by applying the SIAC filter to the DG data before streamline computations [Steffan et al., IEEE-TVCG 14(3): 680-692]. However, introducing these filters in engineering applications can be challenging since a tensor product filter grows in support size as the field dimension increases, becoming computationally expensive. As an alternative, [Walfisch et al., JOMP 38(2): 164-184] proposed a univariate filter implemented along the streamline curves. Until now, this technique remained a numerical experiment. In this paper we introduce the SIAC line filter and explore how the orientation, structure and filter size affect the order of accuracy and global errors. We present theoretical error estimates showing how line filtering preserves the properties of traditional tensor product filtering, including smoothness and improvement in the convergence rate. Furthermore, numerical experiments are included, exhibiting how these filters achieve the same accuracy at significantly lower computational costs, becoming an attractive tool for the scientific visualization community.

preprint, 2015, arXiv

A Radial Basis Function (RBF)-Finite Difference Method for the Simulation of Reaction-Diffusion Equations on Stationary Platelets within the Augmented Forcing Method

We present a computational method for solving the coupled problem of chemical transport in a fluid (blood) with binding/unbinding of the chemical to/from cellular (platelet) surfaces in contact with the fluid, and with transport of the chemical on the cellular surfaces. The overall framework is the Augmented Forcing Point Method (AFM) (L. Yao and A.L. Fogelson, Simulations of chemical transport and reaction in a suspension of cells I: An augmented forcing point method for the stationary case, IJNMF (2012) 69, 1736-52) for solving fluid-phase transport in a region outside of a collection of cells suspended in the fluid. We introduce a novel Radial Basis Function-Finite Difference (RBF-FD) method to solve reaction-diffusion equations on the surface of each of a collection of 2D stationary platelets suspended in blood. Parametric RBFs are used to represent the geometry of the platelets and give accurate geometric information needed for the RBF-FD method. Symmetric Hermite-RBF interpolants are used for enforcing the boundary conditions on the fluid-phase chemical concentration, and their use removes a significant limitation of the original AFM. The efficacy of the new methods is shown through a series of numerical experiments; in particular, second order convergence for the coupled problem is demonstrated.

preprint, 2015, arXiv

Augmenting the Immersed Boundary Method with Radial Basis Functions (RBFs) for the Modeling of Platelets in Hemodynamic Flows

We present a new computational method by extending the Immersed Boundary (IB) method with a spectrally-accurate geometric model based on Radial Basis Function (RBF) interpolation of the Lagrangian structures. Our specific motivation is the modeling of platelets in hemodynamic flows, though we anticipate that our method will be useful in other applications as well. The efficacy of our new RBF-IB method is shown through a series of numerical experiments. Specifically, we compare our method with the traditional IB method in terms of convergence and accuracy, computational cost, maximum stable time-step size and volume loss. We conclude that the RBF-IB method has advantages over the traditional Immersed Boundary method, and is well-suited for modeling of platelets in hemodynamic flows.

preprint, 2014, arXiv

A Radial Basis Function (RBF)-Finite Difference (FD) Method for Diffusion and Reaction-Diffusion Equations on Surfaces

In this paper, we present a method based on Radial Basis Function (RBF)-generated Finite Differences (FD) for numerically solving diffusion and reaction-diffusion equations (PDEs) on closed surfaces embedded in $\mathbb{R}^d$. Our method uses a method-of-lines formulation, in which surface derivatives that appear in the PDEs are approximated locally using RBF interpolation. The method requires only scattered nodes representing the surface and normal vectors at those scattered nodes. All computations use only extrinsic coordinates, thereby avoiding coordinate distortions and singularities. We also present an optimization procedure that allows for the stabilization of the discrete differential operators generated by our RBF-FD method by selecting shape parameters for each stencil that correspond to a global target condition number. We show the convergence of our method on two surfaces for different stencil sizes, and present applications to nonlinear PDEs simulated both on implicit/parametric surfaces and more general surfaces represented by point clouds.

preprint, 2013, arXiv

Rethinking Abstractions for Big Data: Why, Where, How, and What

Big data refers to large and complex data sets that, under existing approaches, exceed the capacity and capability of current compute platforms, systems software, analytical tools and human understanding. Numerous lessons on the scalability of big data can already be found in asymptotic analysis of algorithms and from the high-performance computing (HPC) and applications communities. However, scale is only one aspect of current big data trends; fundamentally, current and emerging problems in big data are a result of unprecedented complexity--in the structure of the data and how to analyze it, in dealing with unreliability and redundancy, in addressing the human factors of comprehending complex data sets, in formulating meaningful analyses, and in managing the dense, power-hungry data centers that house big data. The computer science solution to complexity is finding the right abstractions, those that hide as much triviality as possible while revealing the essence of the problem that is being addressed. The "big data challenge" has disrupted computer science by stressing to the very limits the familiar abstractions which define the relevant subfields in data analysis, data management and the underlying parallel systems. As a result, not enough of these challenges are revealed by isolating abstractions in a traditional software stack or standard algorithmic and analytical techniques, and attempts to address complexity either oversimplify or require low-level management of details. The authors believe that the abstractions for big data need to be rethought, and this reorganization needs to evolve and be sustained through continued cross-disciplinary collaboration.
