Source author record

Nathaniel A. Trask

Nathaniel A. Trask appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, unclaimed source record

Machine Learning, math.NA, Numerical Analysis

Catalog footprint

What is connected

3 works

3 topics

4 close collaborators

Preprint, 2021, arXiv

Partition of unity networks: deep hp-approximation

Approximation theorists have established best-in-class optimal approximation rates of deep neural networks by utilizing their ability to simultaneously emulate partitions of unity and monomials. Motivated by this, we propose partition of unity networks (POUnets) which incorporate these elements directly into the architecture. Classification architectures of the type used to learn probability measures are used to build a meshfree partition of space, while polynomial spaces with learnable coefficients are associated to each partition. The resulting hp-element-like approximation allows use of a fast least-squares optimizer, and the resulting architecture size need not scale exponentially with spatial dimension, breaking the curse of dimensionality. An abstract approximation result establishes desirable properties to guide network design. Numerical results for two choices of architecture demonstrate that POUnets yield hp-convergence for smooth functions and consistently outperform MLPs for piecewise polynomial functions with large numbers of discontinuities.
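The structure described in this abstract can be illustrated with a minimal one-dimensional sketch. It is not the authors' implementation. It assumes a softmax over affine scores as the "classification architecture" and monomials as the local polynomial space. The names `partitions`, `design_matrix` and `fit_coefficients` and the random parameters `W` and `b` are hypothetical. The partition parameters are left untrained here; only the least-squares solve for the polynomial coefficients is shown.

```python
import numpy as np

def partitions(x, W, b):
    """Softmax over affine scores: each row is nonnegative and sums to 1."""
    scores = x[:, None] * W[None, :] + b[None, :]    # (n_points, n_parts)
    scores -= scores.max(axis=1, keepdims=True)       # stabilise the exponentials
    e = np.exp(scores)
    return e / e.sum(axis=1, keepdims=True)

def design_matrix(x, phi, degree):
    """Columns are phi_i(x) * x**k for every partition i and power k."""
    monomials = np.vander(x, degree + 1, increasing=True)   # (n_points, degree + 1)
    return (phi[:, :, None] * monomials[:, None, :]).reshape(len(x), -1)

def fit_coefficients(x, y, phi, degree):
    """With the partitions frozen, the coefficients solve a linear least-squares problem."""
    A = design_matrix(x, phi, degree)
    coeffs, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coeffs

rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 200)
W = rng.normal(size=4) * 5.0      # hypothetical, untrained partition parameters
b = rng.normal(size=4)
phi = partitions(x, W, b)

# A quadratic target lies in the approximation space whenever degree >= 2,
# because the partitions sum to one.
y = 1.0 - 2.0 * x + 0.5 * x**2
coeffs = fit_coefficients(x, y, phi, degree=2)
residual = np.max(np.abs(design_matrix(x, phi, 2) @ coeffs - y))
```

Because the partition functions sum to one, any polynomial of degree at most `degree` is reproduced up to round-off regardless of how the partitions are placed. In this sketch that is why `residual` is near machine precision.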

Preprint, 2020, arXiv

A block coordinate descent optimizer for classification problems exploiting convexity

Second-order optimizers hold intriguing potential for deep learning, but suffer from increased cost and sensitivity to the non-convexity of the loss surface as compared to gradient-based approaches. We introduce a coordinate descent method to train deep neural networks for classification tasks that exploits global convexity of the cross-entropy loss in the weights of the linear layer. Our hybrid Newton/Gradient Descent (NGD) method is consistent with the interpretation of hidden layers as providing an adaptive basis and the linear layer as providing an optimal fit of the basis to data. By alternating between a second-order method to find globally optimal parameters for the linear layer and gradient descent to train the hidden layers, we ensure an optimal fit of the adaptive basis to data throughout training. The size of the Hessian in the second-order step scales only with the number of weights in the linear layer and not the depth and width of the hidden layers; furthermore, the approach is applicable to arbitrary hidden layer architecture. Previous work applying this adaptive basis perspective to regression problems demonstrated significant improvements in accuracy at reduced training cost, and this work can be viewed as an extension of this approach to classification problems. We first prove that the resulting Hessian matrix is symmetric semi-definite, and that the Newton step realizes a global minimizer. By studying classification of manufactured two-dimensional point cloud data, we demonstrate both an improvement in validation error and a striking qualitative difference in the basis functions encoded in the hidden layer when trained using NGD. Application to image classification benchmarks for both dense and convolutional architectures reveals improved training accuracy, suggesting possible gains of second-order methods over gradient descent.
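The Newton half of the alternating scheme described above can be sketched as follows. This is an illustration under stated assumptions, not the paper's NGD code: it uses binary rather than multi-class cross-entropy, freezes a random `tanh` hidden layer in place of a trained one, and adds a small ridge term and a backtracking line search as safeguards. All function and variable names are hypothetical. In the full method this update would alternate with gradient-descent steps on the hidden-layer parameters.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def loss(w, Phi, y):
    """Mean binary cross-entropy of a linear layer acting on fixed features Phi."""
    z = Phi @ w
    return np.mean(np.logaddexp(0.0, z) - y * z)

def hessian(w, Phi):
    """Phi^T diag(p(1-p)) Phi / n: symmetric and positive semi-definite."""
    p = sigmoid(Phi @ w)
    return (Phi * (p * (1.0 - p))[:, None]).T @ Phi / len(Phi)

def newton_step(w, Phi, y, ridge=1e-8):
    """One safeguarded Newton update for the linear-layer weights only."""
    grad = Phi.T @ (sigmoid(Phi @ w) - y) / len(y)
    H = hessian(w, Phi) + ridge * np.eye(len(w))    # ridge is an added safeguard
    step = np.linalg.solve(H, grad)
    t, base = 1.0, loss(w, Phi, y)
    while loss(w - t * step, Phi, y) > base and t > 1e-8:   # backtracking
        t *= 0.5
    return w - t * step

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = (X[:, 0] * X[:, 1] > 0).astype(float)            # manufactured 2D labels
hidden = np.tanh(X @ rng.normal(size=(2, 8)) + rng.normal(size=8))
Phi = np.hstack([hidden, np.ones((200, 1))])         # append a bias column

w0 = np.zeros(Phi.shape[1])
w1 = newton_step(w0, Phi, y)
min_eig = np.linalg.eigvalsh(hessian(w0, Phi)).min()
loss_before, loss_after = loss(w0, Phi, y), loss(w1, Phi, y)
```

The Hessian here is 9 by 9, one row per linear-layer weight, and its size does not depend on how the features `Phi` were produced. That matches the scaling the abstract describes.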

Preprint, 2020, arXiv

Asymptotically compatible reproducing kernel collocation and meshfree integration for the peridynamic Navier equation

In this work, we study the reproducing kernel (RK) collocation method for the peridynamic Navier equation. We first apply a linear RK approximation on both displacements and dilatation, then back-substitute dilatation, and solve the peridynamic Navier equation in a pure displacement form. The RK collocation scheme converges to the nonlocal limit and also to the local limit as nonlocal interactions vanish. The stability is shown by comparing the collocation scheme with the standard Galerkin scheme using Fourier analysis. We then apply the RK collocation to the quasi-discrete peridynamic Navier equation and show its convergence to the correct local limit when the ratio between the nonlocal length scale and the discretization parameter is fixed. The analysis is carried out on a special family of rectilinear Cartesian grids for the RK collocation method with a designated kernel with finite support. We assume the Lamé parameters satisfy $\lambda \geq \mu$ to avoid adding extra constraints on the nonlocal kernel. Finally, numerical experiments are conducted to validate the theoretical results.
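The "linear RK approximation" mentioned in this abstract can be illustrated in one dimension. The sketch below covers only the construction of linear reproducing kernel shape functions on a uniform grid, not the collocation scheme or the peridynamic operator. The cubic B-spline kernel, the support size and the function names are assumptions made for the example.

```python
import numpy as np

def cubic_bspline(r):
    """Cubic B-spline kernel with compact support |r| < 1."""
    r = np.abs(r)
    return np.where(r < 0.5, 2/3 - 4*r**2 + 4*r**3,
           np.where(r < 1.0, 4/3 - 4*r + 4*r**2 - 4/3*r**3, 0.0))

def rk_shape_functions(x, nodes, support):
    """Linear RK shape functions Psi_I(x) at a single evaluation point x."""
    d = nodes - x                                   # x_I - x for every node
    w = cubic_bspline(d / support)                  # kernel weights
    H = np.stack([np.ones_like(d), d])              # basis [1, x_I - x], shape (2, n)
    M = (H * w) @ H.T                               # 2x2 moment matrix
    c = np.linalg.solve(M, np.array([1.0, 0.0]))    # M^{-1} H(0)
    return (c @ H) * w                              # Psi_I = H(0)^T M^{-1} H_I w_I

nodes = np.linspace(0.0, 1.0, 11)                   # uniform grid, spacing 0.1
support = 0.25                                      # assumed kernel support
u_nodes = 2.0 * nodes + 1.0                         # nodal values of a linear field

xs = np.linspace(0.0, 1.0, 37)
approx = np.array([rk_shape_functions(x, nodes, support) @ u_nodes for x in xs])
error = np.max(np.abs(approx - (2.0 * xs + 1.0)))
sums = np.array([rk_shape_functions(x, nodes, support).sum() for x in xs])
```

The moment-matrix correction enforces that the shape functions sum to one and reproduce linear fields exactly, so `error` is at round-off level here. The support must cover at least two nodes at every evaluation point for the moment matrix to be invertible.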
