Source author record

Haizhao Yang

Haizhao Yang appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

math.NA; Numerical Analysis; Machine Learning; Computer Vision; Artificial Intelligence; cond-mat.mtrl-sci; Distributed, Parallel, and Cluster Computing; math.OC; math.ST; physics.comp-ph; quant-ph; Statistics Theory

Catalog footprint

What is connected

25 works

12 topics

4 close collaborators


preprint · 2024 · arXiv

A Distributed Block Chebyshev-Davidson Algorithm for Parallel Spectral Clustering

We develop a distributed Block Chebyshev-Davidson algorithm to solve large-scale leading eigenvalue problems for spectral analysis in spectral clustering. First, the efficiency of the Chebyshev-Davidson algorithm relies on the prior knowledge of the eigenvalue spectrum, which could be expensive to estimate. This issue can be lessened by the analytic spectrum estimation of the Laplacian or normalized Laplacian matrices in spectral clustering, making the proposed algorithm very efficient for spectral clustering. Second, to make the proposed algorithm capable of analyzing big data, a distributed and parallel version has been developed with attractive scalability. The speedup by parallel computing is approximately equivalent to $\sqrt{p}$, where $p$ denotes the number of processes. Numerical results will be provided to demonstrate its efficiency in spectral clustering and scalability advantage over existing eigensolvers used for spectral clustering in parallel computing environments.
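The role of the analytic spectrum bound can be illustrated with a scalar Chebyshev filter. This is only a minimal sketch, not the paper's distributed block algorithm: it assumes the wanted eigenvalues are the smallest ones of a symmetric normalized graph Laplacian, whose spectrum is known to lie in $[0, 2]$, and the cutoff `0.5`, the degree `10`, and the function name `chebyshev_filter_scalar` are hypothetical choices made for illustration.

```python
def chebyshev_filter_scalar(lam, degree, a, b):
    """Evaluate the degree-`degree` Chebyshev polynomial at `lam`
    after mapping the damped interval [a, b] affinely onto [-1, 1]."""
    t = (2.0 * lam - a - b) / (b - a)
    t_prev, t_curr = 1.0, t  # T_0(t) and T_1(t)
    if degree == 0:
        return t_prev
    for _ in range(degree - 1):
        # Three-term recurrence: T_{k+1}(t) = 2 t T_k(t) - T_{k-1}(t)
        t_prev, t_curr = t_curr, 2.0 * t * t_curr - t_prev
    return t_curr


UPPER = 2.0   # analytic upper bound for a symmetric normalized Laplacian
CUTOFF = 0.5  # hypothetical lower end of the unwanted part of the spectrum
DEGREE = 10   # hypothetical filter degree

# An eigenvalue below the cutoff is strongly amplified ...
wanted = abs(chebyshev_filter_scalar(0.1, DEGREE, CUTOFF, UPPER))
# ... while eigenvalues inside [CUTOFF, UPPER] stay bounded by 1 in magnitude.
unwanted = max(
    abs(chebyshev_filter_scalar(lam, DEGREE, CUTOFF, UPPER))
    for lam in (0.5, 1.0, 1.5, 2.0)
)
```

In a Chebyshev-Davidson iteration the same recurrence is applied to the matrix acting on a block of vectors rather than to a scalar; the point of the sketch is that the upper end of the interval comes for free, so no spectrum estimation step is needed.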

preprint · 2022 · arXiv

Deep Network Approximation in Terms of Intrinsic Parameters

One of the arguments to explain the success of deep learning is the powerful approximation capacity of deep neural networks. Such capacity is generally accompanied by the explosive growth of the number of parameters, which, in turn, leads to high computational costs. It is of great interest to ask whether we can achieve successful deep learning with a small number of learnable parameters adapting to the target function. From an approximation perspective, this paper shows that the number of parameters that need to be learned can be significantly smaller than people typically expect. First, we theoretically design ReLU networks with a few learnable parameters to achieve an attractive approximation. We prove by construction that, for any Lipschitz continuous function $f$ on $[0,1]^d$ with a Lipschitz constant $λ>0$, a ReLU network with $n+2$ intrinsic parameters (those depending on $f$) can approximate $f$ with an exponentially small error $5λ\sqrt{d}\,2^{-n}$. Such a result is generalized to generic continuous functions. Furthermore, we show that the idea of learning a small number of parameters to achieve a good approximation can be numerically observed. We conduct several experiments to verify that training a small part of parameters can also achieve good results for classification problems if other parameters are pre-specified or pre-trained from a related problem.
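To give a feel for the stated bound $5λ\sqrt{d}\,2^{-n}$, the following sketch computes how large $n$ must be to reach a target error. The function names and the sample values ($λ=1$, $d=4$, target $10^{-3}$) are assumptions for illustration; only the formula comes from the abstract.

```python
import math


def error_bound(lipschitz, dim, n):
    """Upper bound 5 * lambda * sqrt(d) * 2^(-n) quoted in the abstract."""
    return 5.0 * lipschitz * math.sqrt(dim) * 2.0 ** (-n)


def smallest_n(lipschitz, dim, target):
    """Smallest non-negative integer n whose bound does not exceed `target`."""
    n = 0
    while error_bound(lipschitz, dim, n) > target:
        n += 1
    return n


# Hypothetical example: Lipschitz constant 1 on [0,1]^4, target error 1e-3.
n_needed = smallest_n(1.0, 4, 1e-3)
```

Because the bound halves with every increment of $n$, the required $n$ grows only logarithmically as the target error shrinks; by the abstract's statement, the network then has $n+2$ intrinsic parameters.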

preprint · 2022 · arXiv

Deep Nonparametric Estimation of Operators between Infinite Dimensional Spaces

Learning operators between infinite-dimensional spaces is an important learning task arising in a wide range of applications in machine learning, imaging science, mathematical modeling and simulations, etc. This paper studies the nonparametric estimation of Lipschitz operators using deep neural networks. Non-asymptotic upper bounds are derived for the generalization error of the empirical risk minimizer over a properly chosen network class. Under the assumption that the target operator exhibits a low dimensional structure, our error bounds decay as the training sample size increases, with an attractive fast rate depending on the intrinsic dimension in our estimation. Our assumptions cover most scenarios in real applications and our results give rise to fast rates by exploiting low dimensional structures of data in operator estimation. We also investigate the influence of network structures (e.g., network width, depth, and sparsity) on the generalization error of the neural network estimator and propose a general suggestion on the choice of network structures to maximize the learning efficiency quantitatively.

preprint · 2022 · arXiv

Friedrichs Learning: Weak Solutions of Partial Differential Equations via Deep Learning

This paper proposes Friedrichs learning as a novel deep learning methodology that can learn the weak solutions of PDEs via a minimax formulation, which transforms the PDE problem into a minimax optimization problem to identify weak solutions. The name "Friedrichs learning" highlights the close relationship between our learning strategy and Friedrichs theory on symmetric systems of PDEs. The weak solution and the test function in the weak formulation are parameterized as deep neural networks in a mesh-free manner, which are alternately updated to approach the optimal solution networks approximating the weak solution and the optimal test function, respectively. Extensive numerical results indicate that our mesh-free method can provide reasonably good solutions to a wide range of PDEs defined on regular and irregular domains in various dimensions, where classical numerical methods such as finite difference methods and finite element methods may be tedious or difficult to apply.

preprint · 2022 · arXiv

IAE-Net: Integral Autoencoders for Discretization-Invariant Learning

Discretization invariant learning aims at learning in the infinite-dimensional function spaces with the capacity to process heterogeneous discrete representations of functions as inputs and/or outputs of a learning model. This paper proposes a novel deep learning framework based on integral autoencoders (IAE-Net) for discretization invariant learning. The basic building block of IAE-Net consists of an encoder and a decoder as integral transforms with data-driven kernels, and a fully connected neural network between the encoder and decoder. This basic building block is applied in parallel in a wide multi-channel structure, which are repeatedly composed to form a deep and densely connected neural network with skip connections as IAE-Net. IAE-Net is trained with randomized data augmentation that generates training data with heterogeneous structures to facilitate the performance of discretization invariant learning. The proposed IAE-Net is tested with various applications in predictive data science, solving forward and inverse problems in scientific computing, and signal/image processing. Compared with alternatives in the literature, IAE-Net achieves state-of-the-art performance in existing applications and creates a wide range of new applications.

preprint · 2022 · arXiv

Simultaneous Neural Network Approximation for Smooth Functions

We establish in this work approximation results of deep neural networks for smooth functions measured in Sobolev norms, motivated by recent development of numerical solvers for partial differential equations using deep neural networks. Our approximation results are nonasymptotic in the sense that the error bounds are explicitly characterized in terms of both the width and depth of the networks simultaneously with all involved constants explicitly determined. Namely, for $f\in C^s([0,1]^d)$, we show that deep ReLU networks of width $\mathcal{O}(N\log{N})$ and of depth $\mathcal{O}(L\log{L})$ can achieve a nonasymptotic approximation rate of $\mathcal{O}(N^{-2(s-1)/d}L^{-2(s-1)/d})$ with respect to the $\mathcal{W}^{1,p}([0,1]^d)$ norm for $p\in[1,\infty)$. If either the ReLU function or its square is applied as activation functions to construct deep neural networks of width $\mathcal{O}(N\log{N})$ and of depth $\mathcal{O}(L\log{L})$ to approximate $f\in C^s([0,1]^d)$, the approximation rate is $\mathcal{O}(N^{-2(s-n)/d}L^{-2(s-n)/d})$ with respect to the $\mathcal{W}^{n,p}([0,1]^d)$ norm for $p\in[1,\infty)$.

preprint · 2022 · arXiv

The Discovery of Dynamics via Linear Multistep Methods and Deep Learning: Error Estimation

Identifying hidden dynamics from observed data is a significant and challenging task in a wide range of applications. Recently, the combination of linear multistep methods (LMMs) and deep learning has been successfully employed to discover dynamics, whereas a complete convergence analysis of this approach is still under development. In this work, we consider the deep network-based LMMs for the discovery of dynamics. We put forward error estimates for these methods using the approximation property of deep networks. It indicates, for certain families of LMMs, that the $\ell^2$ grid error is bounded by the sum of $O(h^p)$ and the network approximation error, where $h$ is the time step size and $p$ is the local truncation error order. Numerical results of several physically relevant examples are provided to demonstrate our theory.
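The idea of fitting dynamics through an LMM can be sketched with the one-step trapezoidal rule (an order-2 method). This is an illustrative toy, not the paper's setup: the trajectory is sampled from the known solution of $x' = -x$, and two closed-form candidates stand in for the deep network that would normally parameterize the unknown right-hand side. All names here are hypothetical.

```python
import math


def trapezoidal_residuals(xs, h, f):
    """LMM residuals (x_{n+1} - x_n)/h - (f(x_n) + f(x_{n+1}))/2 along a trajectory."""
    return [(x1 - x0) / h - 0.5 * (f(x0) + f(x1)) for x0, x1 in zip(xs, xs[1:])]


def mean_square(values):
    """Mean of squared residuals, used here as the discovery loss."""
    return sum(v * v for v in values) / len(values)


h = 0.1
# Observed states of x' = -x with x(0) = 1, sampled at t_n = n * h.
xs = [math.exp(-n * h) for n in range(51)]

loss_true = mean_square(trapezoidal_residuals(xs, h, lambda x: -x))
loss_wrong = mean_square(trapezoidal_residuals(xs, h, lambda x: -2.0 * x))
```

The true dynamics leave only a small discretization residual, while the wrong candidate produces a much larger loss; minimizing this loss over a network class is what drives the discovery in the approach the abstract describes.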

preprint · 2022 · arXiv

The Lottery Ticket Hypothesis for Self-attention in Convolutional Neural Network

Recently many plug-and-play self-attention modules (SAMs) have been proposed to enhance the model generalization by exploiting the internal information of deep convolutional neural networks (CNNs). In general, previous works ignore where to plug in the SAMs, since they take it for granted that a SAM is connected to each block of the entire CNN backbone, so the computational cost and the number of parameters grow with network depth. However, we empirically find and verify some counterintuitive phenomena that: (a) Connecting the SAMs to all the blocks may not always bring the largest performance boost, and connecting to partial blocks would be even better; (b) Adding the SAMs to a CNN may not always bring a performance boost, and instead it may even harm the performance of the original CNN backbone. Therefore, we articulate and demonstrate the Lottery Ticket Hypothesis for Self-attention Networks: a full self-attention network contains a subnetwork with sparse self-attention connections that can (1) accelerate inference, (2) reduce extra parameter increment, and (3) maintain accuracy. In addition to the empirical evidence, this hypothesis is also supported by our theoretical evidence. Furthermore, we propose a simple yet effective reinforcement-learning-based method to search the ticket, i.e., the connection scheme that satisfies the three above-mentioned conditions. Extensive experiments on widely-used benchmark datasets and popular self-attention networks show the effectiveness of our method. Besides, our experiments illustrate that our searched ticket has the capacity of transferring to some vision tasks, e.g., crowd counting and segmentation.

preprint · 2021 · arXiv

Deep Network Approximation Characterized by Number of Neurons

This paper quantitatively characterizes the approximation power of deep feed-forward neural networks (FNNs) in terms of the number of neurons. It is shown by construction that ReLU FNNs with width $\mathcal{O}\big(\max\{d\lfloor N^{1/d}\rfloor,\, N+1\}\big)$ and depth $\mathcal{O}(L)$ can approximate an arbitrary Hölder continuous function of order $α\in (0,1]$ on $[0,1]^d$ with a nearly tight approximation rate $\mathcal{O}\big(\sqrt{d} N^{-2α/d}L^{-2α/d}\big)$ measured in $L^p$-norm for any $N,L\in \mathbb{N}^+$ and $p\in[1,\infty]$. More generally for an arbitrary continuous function $f$ on $[0,1]^d$ with a modulus of continuity $ω_f(\cdot)$, the constructive approximation rate is $\mathcal{O}\big(\sqrt{d}\,ω_f( N^{-2/d}L^{-2/d})\big)$. We also extend our analysis to $f$ on irregular domains or those localized in an $\varepsilon$-neighborhood of a $d_{\mathcal{M}}$-dimensional smooth manifold $\mathcal{M}\subseteq [0,1]^d$ with $d_{\mathcal{M}}\ll d$. Especially, in the case of an essentially low-dimensional domain, we show an approximation rate $\mathcal{O}\big(ω_f(\tfrac{\varepsilon}{1-δ}\sqrt{\tfrac{d}{d_δ}}+\varepsilon)+\sqrt{d}\,ω_f(\tfrac{\sqrt{d}}{(1-δ)\sqrt{d_δ}}N^{-2/d_δ}L^{-2/d_δ})\big)$ for ReLU FNNs to approximate $f$ in the $\varepsilon$-neighborhood, where $d_δ=\mathcal{O}\big(d_{\mathcal{M}}\tfrac{\ln (d/δ)}{δ^2}\big)$ for any $δ\in(0,1)$ as a relative error for a projection to approximate an isometry when projecting $\mathcal{M}$ to a $d_δ$-dimensional domain.

preprint · 2021 · arXiv

Linear-Scaling Selected Inversion based on Hierarchical Interpolative Factorization for Self Green's Function for Modified Poisson-Boltzmann Equation in Two Dimensions

This paper studies an efficient numerical method for solving modified Poisson-Boltzmann (MPB) equations with the self Green's function as a state equation to describe electrostatic correlations in ionic systems. Previously, the most expensive point of the MPB solver is the evaluation of Green's function. The evaluation of Green's function requires solving high-dimensional partial differential equations, which is the computational bottleneck for solving MPB equations. Numerically, the MPB solver only requires the evaluation of Green's function as the diagonal part of the inverse of the discrete elliptic differential operator of the Debye-Hückel equation. Therefore, we develop a fast algorithm by a coupling of the selected inversion and hierarchical interpolative factorization. By the interpolative factorization, our new selected inverse algorithm achieves linear scaling to compute the diagonal of the inverse of this discrete operator. The accuracy and efficiency of the proposed algorithm will be demonstrated by extensive numerical results for solving MPB equations.

preprint · 2021 · arXiv

Reproducing Activation Function for Deep Learning

We propose reproducing activation functions (RAFs) to improve deep learning accuracy for various applications ranging from computer vision to scientific computing. The idea is to employ several basic functions and their learnable linear combination to construct neuron-wise data-driven activation functions for each neuron. Armed with RAFs, neural networks (NNs) can reproduce traditional approximation tools and, therefore, approximate target functions with a smaller number of parameters than traditional NNs. In NN training, RAFs can generate neural tangent kernels (NTKs) with a better condition number than traditional activation functions, lessening the spectral bias of deep learning. As demonstrated by extensive numerical tests, the proposed RAFs can facilitate the convergence of deep learning optimization for a solution with higher accuracy than existing deep learning solvers for audio/image/video reconstruction, PDEs, and eigenvalue problems. With RAFs, the errors of audio/video reconstruction, PDEs, and eigenvalue problems are decreased by over 14%, 73%, 99%, respectively, compared with baseline, while the performance of image reconstruction increases by 58%.

preprint · 2020 · arXiv

Drop-Activation: Implicit Parameter Reduction and Harmonic Regularization

Overfitting frequently occurs in deep learning. In this paper, we propose a novel regularization method called Drop-Activation to reduce overfitting and improve generalization. The key idea is to drop nonlinear activation functions by setting them to be identity functions randomly during training time. During testing, we use a deterministic network with a new activation function to encode the average effect of dropping activations randomly. Our theoretical analyses support the regularization effect of Drop-Activation as implicit parameter reduction and verify its capability to be used together with Batch Normalization (Ioffe and Szegedy 2015). The experimental results on CIFAR-10, CIFAR-100, SVHN, EMNIST, and ImageNet show that Drop-Activation generally improves the performance of popular neural network architectures for the image classification task. Furthermore, as a regularizer Drop-Activation can be used in harmony with standard training and regularization techniques such as Batch Normalization and Auto Augment (Cubuk et al. 2019). The code is available at https://github.com/LeungSamWai/Drop-Activation.
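The train/test behavior described above can be sketched for a single scalar activation. This is a reading of the abstract, not the released code: it assumes the nonlinearity is ReLU, that it is kept with a probability called `p_keep` here, and that the test-time activation is the expectation of the random training-time one. The exact parameterization in the paper and repository may differ.

```python
import random


def relu(x):
    return x if x > 0.0 else 0.0


def drop_activation_train(x, p_keep, rng):
    """Training: keep ReLU with probability p_keep, otherwise use the identity."""
    return relu(x) if rng.random() < p_keep else x


def drop_activation_test(x, p_keep):
    """Testing: deterministic average of the two training-time branches."""
    return p_keep * relu(x) + (1.0 - p_keep) * x


rng = random.Random(0)  # fixed seed so the sketch is reproducible
x, p_keep = -2.0, 0.95  # hypothetical input and keep probability
samples = [drop_activation_train(x, p_keep, rng) for _ in range(100_000)]
empirical_mean = sum(samples) / len(samples)
```

Under these assumptions the test-time function acts like a leaky ReLU with negative slope `1 - p_keep`, and the empirical mean of the random training outputs approaches the deterministic test output.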

preprint · 2020 · arXiv

Error bounds for deep ReLU networks using the Kolmogorov--Arnold superposition theorem

We prove a theorem concerning the approximation of multivariate functions by deep ReLU networks, for which the curse of the dimensionality is lessened. Our theorem is based on a constructive proof of the Kolmogorov--Arnold superposition theorem, and on a subset of multivariate continuous functions whose outer superposition functions can be efficiently approximated by deep ReLU networks.

preprint · 2020 · arXiv

Int-Deep: A Deep Learning Initialized Iterative Method for Nonlinear Problems

This paper proposes a deep learning initialized iterative method (Int-Deep) for low-dimensional nonlinear partial differential equations (PDEs). The corresponding framework consists of two phases. In the first phase, an expectation minimization problem formulated from a given nonlinear PDE is approximately resolved with mesh-free deep neural networks to parametrize the solution space. In the second phase, a solution ansatz of the finite element method to solve the given PDE is obtained from the approximate solution in the first phase, and the ansatz can serve as a good initial guess such that Newton's method for solving the nonlinear PDE is able to converge to the ground truth solution quickly and with high accuracy. Systematic theoretical analysis is provided to justify the Int-Deep framework for several classes of problems. Numerical results show that Int-Deep outperforms existing purely deep learning-based methods or traditional iterative methods (e.g., Newton's method and the Picard iteration method).

preprint · 2020 · arXiv

Interior Eigensolver for Sparse Hermitian Definite Matrices Based on Zolotarev's Functions

This paper proposes an efficient method for computing selected generalized eigenpairs of a sparse Hermitian definite matrix pencil $(A,B)$. Based on Zolotarev's best rational function approximations of the signum function and conformal mapping techniques, we construct the best rational function approximation of a rectangular function supported on an arbitrary interval via function compositions with partial fraction representations. This new best rational function approximation can be applied to construct spectrum filters of $(A,B)$ with a smaller number of poles than a direct construction without function compositions. Combining fast direct solvers and the shift-invariant generalized minimal residual method, a hybrid fast algorithm is proposed to apply spectral filters efficiently. Compared to the state-of-the-art algorithm FEAST, the proposed rational function approximation is more efficient when sparse matrix factorizations are required to solve multi-shift linear systems in the eigensolver, since the smaller number of matrix factorizations is needed in our method. The efficiency and stability of the proposed method are demonstrated by numerical examples from computational chemistry.

preprint · 2020 · arXiv

Multidimensional Phase Recovery and Interpolative Decomposition Butterfly Factorization

This paper focuses on the fast evaluation of the matvec $g=Kf$ for $K\in \mathbb{C}^{N\times N}$, which is the discretization of a multidimensional oscillatory integral transform $g(x) = \int K(x,ξ) f(ξ)dξ$ with a kernel function $K(x,ξ)=e^{2\pi i Φ(x,ξ)}$, where $Φ(x,ξ)$ is a piecewise smooth phase function with $x$ and $ξ$ in $\mathbb{R}^d$ for $d=2$ or $3$. A new framework is introduced to compute $Kf$ with $O(N\log N)$ time and memory complexity in the case that only indirect access to the phase function $Φ$ is available. This framework consists of two main steps: 1) an $O(N\log N)$ algorithm for recovering the multidimensional phase function $Φ$ from indirect access is proposed; 2) a multidimensional interpolative decomposition butterfly factorization (MIDBF) is designed to evaluate the matvec $Kf$ with an $O(N\log N)$ complexity once $Φ$ is available. Numerical results are provided to demonstrate the effectiveness of the proposed framework.

preprint · 2019 · arXiv

A Hierarchical Butterfly LU Preconditioner for Two-Dimensional Electromagnetic Scattering Problems Involving Open Surfaces

This paper introduces a hierarchical interpolative decomposition butterfly-LU factorization (H-IDBF-LU) preconditioner for solving two-dimensional electric-field integral equations (EFIEs) in electromagnetic scattering problems of perfect electrically conducting objects with open surfaces. H-IDBF-LU leverages the interpolative decomposition butterfly factorization (IDBF) to compress dense blocks of the discretized EFIE operator to expedite its application; this compressed operator also serves as an approximate LU factorization of the EFIE operator leading to an efficient preconditioner in iterative solvers. Both the memory requirement and computational cost of the H-IDBF-LU solver scale as $O(N\log^2 N)$ in one iteration; the total number of iterations required for a reasonably good accuracy scales as $O(1)$ to $O(\log^2N)$ in all of our numerical tests. The efficacy and accuracy of the proposed preconditioned iterative solver are demonstrated via its application to a broad range of scatterers involving up to $100$ million unknowns.

preprint · 2016 · arXiv

Preconditioning orbital minimization method for planewave discretization

We present an efficient preconditioner for the orbital minimization method when the Hamiltonian is discretized using planewaves (i.e., pseudospectral method). This novel preconditioner is based on an approximate Fermi operator projection by pole expansion, combined with the sparsifying preconditioner to efficiently evaluate the pole expansion for a wide range of Hamiltonian operators. Numerical results validate the performance of the new preconditioner for the orbital minimization method, in particular, the iteration number is reduced to $O(1)$ and often only a few iterations are enough for convergence.

preprint · 2016 · arXiv

Statistical Analysis of Synchrosqueezed Transforms

Synchrosqueezed transforms are non-linear processes for a sharpened time-frequency representation of wave-like components. They are efficient tools for identifying and analyzing wave-like components from their superposition. This paper is concerned with the statistical properties of compactly supported synchrosqueezed transforms for wave-like components embedded in a generalized Gaussian random process in multidimensional spaces. Guided by the theoretical analysis of these properties, new numerical implementations are proposed to reduce the noise fluctuations of these transforms on noisy data. A MATLAB package SynLab together with several heavily noisy examples is provided to support these theoretical claims.

preprint · 2015 · arXiv

A Multiscale Butterfly Algorithm for Multidimensional Fourier Integral Operators

This paper presents an efficient multiscale butterfly algorithm for computing Fourier integral operators (FIOs) of the form $(\mathcal{L} f)(x) = \int_{\mathbb{R}^d}a(x,ξ) e^{2π i Φ(x,ξ)}\hat{f}(ξ) dξ$, where $Φ(x,ξ)$ is a phase function, $a(x,ξ)$ is an amplitude function, and $f(x)$ is a given input. The frequency domain is hierarchically decomposed into a union of Cartesian coronas. The integral kernel $a(x,ξ) e^{2π i Φ(x,ξ)}$ in each corona satisfies a special low-rank property that enables the application of a butterfly algorithm on the Cartesian phase-space grid. This leads to an algorithm with quasi-linear operation complexity and linear memory complexity. Different from previous butterfly methods for the FIOs, this new approach is simple and reduces the computational cost by avoiding extra coordinate transformations. Numerical examples in two and three dimensions are provided to demonstrate the practical advantages of the new algorithm.

preprint · 2015 · arXiv

Butterfly Factorization

The paper introduces the butterfly factorization as a data-sparse approximation for the matrices that satisfy a complementary low-rank property. The factorization can be constructed efficiently if either fast algorithms for applying the matrix and its adjoint are available or the entries of the matrix can be sampled individually. For an $N \times N$ matrix, the resulting factorization is a product of $O(\log N)$ sparse matrices, each with $O(N)$ non-zero entries. Hence, it can be applied rapidly in $O(N\log N)$ operations. Numerical results are provided to demonstrate the effectiveness of the butterfly factorization and its construction algorithms.
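A familiar special case of this structure is the radix-2 FFT, which applies the $N \times N$ discrete Fourier matrix as $\log_2 N$ sparse stages (after a permutation), each with $2N$ non-zero entries. The sketch below is that classical algorithm, not the paper's general construction for complementary low-rank matrices; the function names are hypothetical and it assumes $N$ is a power of two.

```python
import cmath


def fft_butterfly(x):
    """Radix-2 FFT as log2(N) sparse butterfly stages.

    Returns the transform and the non-zero count of each stage matrix."""
    n = len(x)
    assert n >= 2 and n & (n - 1) == 0, "length must be a power of two"
    levels = n.bit_length() - 1
    # Bit-reversal permutation of the input.
    a = [x[int(format(i, f"0{levels}b")[::-1], 2)] for i in range(n)]
    nnz_per_stage = []
    size = 2
    while size <= n:
        half = size // 2
        w_step = cmath.exp(-2j * cmath.pi / size)
        nnz = 0
        for start in range(0, n, size):
            w = 1.0 + 0.0j
            for k in range(half):
                u = a[start + k]
                v = w * a[start + k + half]
                a[start + k] = u + v          # row with two non-zeros
                a[start + k + half] = u - v   # row with two non-zeros
                w *= w_step
                nnz += 4
        nnz_per_stage.append(nnz)
        size *= 2
    return a, nnz_per_stage


def naive_dft(x):
    """Dense O(N^2) reference: multiply by the full DFT matrix."""
    n = len(x)
    return [
        sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]


signal = [complex(i % 5, -i) for i in range(16)]  # arbitrary test input
fast, stage_nnz = fft_butterfly(signal)
slow = naive_dft(signal)
max_diff = max(abs(p - q) for p, q in zip(fast, slow))
```

With $N = 16$ there are four stages of 32 non-zeros each, so applying all of them costs $O(N \log N)$ operations instead of the $O(N^2)$ of the dense product, which is the same cost profile the abstract states for the butterfly factorization.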

preprint · 2015 · arXiv

Combining $2D$ synchrosqueezed wave packet transform with optimization for crystal image analysis

We develop a variational optimization method for crystal analysis in atomic resolution images, which uses information from a 2D synchrosqueezed transform (SST) as input. The synchrosqueezed transform is applied to extract initial information from atomic crystal images: crystal defects, rotations and the gradient of elastic deformation. The deformation gradient estimate is then improved outside the identified defect region via a variational approach, to obtain more robust results agreeing better with the physical constraints. The variational model is optimized by a nonlinear projected conjugate gradient method. Both examples of images from computer simulations and imaging experiments are analyzed, with results demonstrating the effectiveness of the proposed method.

preprint · 2015 · arXiv

Crystal image analysis using $2D$ synchrosqueezed transforms

We propose efficient algorithms based on a band-limited version of 2D synchrosqueezed transforms to extract mesoscopic and microscopic information from atomic crystal images. The methods analyze atomic crystal images as an assemblage of non-overlapping segments of 2D general intrinsic mode type functions, which are superpositions of non-linear wave-like components. In particular, crystal defects are interpreted as the irregularity of local energy; crystal rotations are described as the angle deviation of local wave vectors from their references; the gradient of a crystal elastic deformation can be obtained by a linear system generated by local wave vectors. Several numerical examples of synthetic and real crystal images are provided to illustrate the efficiency, robustness, and reliability of our methods.

preprint · 2014 · arXiv

Synchrosqueezed Wave Packet Transforms and Diffeomorphism Based Spectral Analysis for 1D General Mode Decompositions

This paper develops new theory and algorithms for 1D general mode decompositions. First, we introduce the 1D synchrosqueezed wave packet transform and prove that it is able to estimate the instantaneous information of well-separated modes from their superposition accurately. The synchrosqueezed wave packet transform has a better resolution than the synchrosqueezed wavelet transform in the time-frequency domain for separating high frequency modes. Second, we present a new approach based on diffeomorphisms for the spectral analysis of general shape functions. These two methods lead to a framework for general mode decompositions under a weak well-separation condition and a well different condition. Numerical examples of synthetic and real data are provided to demonstrate the fruitful applications of these methods.

preprint · 2013 · arXiv

Synchrosqueezed Curvelet Transform for 2D Mode Decomposition

This paper introduces the synchrosqueezed curvelet transform as an optimal tool for 2D mode decomposition of wavefronts or banded wave-like components. The synchrosqueezed curvelet transform consists of a generalized curvelet transform with application dependent geometric scaling parameters, and a synchrosqueezing technique for a sharpened phase space representation. In the case of a superposition of banded wave-like components with well-separated wave-vectors, it is proved that the synchrosqueezed curvelet transform is capable of recognizing each component and precisely estimating local wave-vectors. A discrete analogue of the continuous transform and several clustering models for decomposition are proposed in detail. Some numerical examples with synthetic and real data are provided to demonstrate the above properties of the proposed transform.
