Source author record

Bao Wang

Bao Wang appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher: unclaimed source record

math.NA; Machine Learning; Biomolecules; Numerical Analysis; eess.SP; physics.chem-ph; Artificial Intelligence; astro-ph.CO; astro-ph.GA; astro-ph.HE; Distributed, Parallel, and Cluster Computing; gr-qc; math-ph; math.MP; math.OC; Networking and Internet Architecture; Neural and Evolutionary Computing; physics.comp-ph

Catalog footprint

What is connected

20 works

18 topics

4 close collaborators


preprint, 2026, arXiv

Investigating the Anisotropy of Dispersion Measure Contribution from the Galactic Halo by Using Fast Radio Bursts

We propose a data-driven approach to reconstruct the all-sky distribution of the dispersion measure contribution from the Galactic halo ($\mathrm{DM_{halo}}$) through a spherical harmonic expansion, enabling an investigation of its possible anisotropies. Based on the NE2001 model and using 92 localized and 574 unlocalized non-repeating fast radio bursts (FRBs) at Galactic latitudes $|b|>15^\circ$, we find a significant dipole anisotropy in $\mathrm{DM_{halo}}$, pointing toward $(l=130^\circ,\, b=+5^\circ)$ with a $1σ$ uncertainty of approximately $28^\circ$. The $\mathrm{DM_{halo}}$ value in this direction is $63\pm9~\mathrm{pc~cm^{-3}}$, exceeding the all-sky mean by about $2.6σ$. This result is not significantly affected by the choice of Galactic ISM models. Furthermore, even when using a refined sample of 62 localized FRBs (excluding CHIME detections, repeaters, and unlocalized events), the dipole anisotropic structure persists, with a direction of $(l=141^\circ,\, b=+51^\circ)$ and a larger 1$σ$ uncertainty of $\sim 44^\circ$. Model comparisons using the Akaike Information Criterion and Bayesian evidence yield consistent preferences, and together they suggest that current FRB data slightly favor the existence of a dipole structure in $\mathrm{DM_{halo}}$. If this feature is not a statistical fluctuation or systematic error, its physical origin requires further investigation. Future FRB samples with larger sizes and more complete sky coverage will be essential to confirm or refute this possible anisotropic structure.
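As a rough illustration of what a spherical harmonic expansion truncated at the dipole looks like, the following sketch may help. The coefficient symbols $a_{\ell m}$, the amplitude $A$, the unit vectors, and the truncation at $\ell = 1$ are notation assumed here for illustration; the paper's own parametrization may differ.

```latex
% Expansion up to the dipole term (\ell \le 1), with colatitude and azimuth
% taken from Galactic coordinates: \theta = 90^\circ - b, \phi = l.
\mathrm{DM_{halo}}(l, b) \;\approx\; \sum_{\ell=0}^{1} \sum_{m=-\ell}^{\ell} a_{\ell m}\, Y_{\ell m}(\theta, \phi)

% For a real-valued field this is equivalent to a mean plus a dipole:
\mathrm{DM_{halo}}(\hat{n}) \;\approx\; \overline{\mathrm{DM}}_{\mathrm{halo}} + A\, \hat{n} \cdot \hat{d}
```

Here $\hat{n}$ is the unit vector toward $(l, b)$ and $\hat{d}$ is the dipole direction, so the field is largest along $\hat{d}$ and smallest in the opposite direction.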

preprint, 2024, arXiv

Observations favor the redshift-evolutionary $L_X$-$L_{UV}$ relation of quasars from copula

Using data from quasars, Hubble parameter measurements, and the Pantheon+ type Ia supernova sample, we compare three different relations between the X-ray luminosity ($L_X$) and ultraviolet luminosity ($L_{UV}$) of quasars. These are the standard relation and two redshift-evolutionary $L_X$-$L_{UV}$ relations, constructed respectively by considering a redshift-dependent correction to the luminosities of quasars and by using the statistical tool called copula. By employing the PAge approximation for a cosmological-model-independent description of the cosmic background evolution and dividing the quasar data into low-redshift and high-redshift parts, we find that the constraints on the PAge parameters from the low-redshift and high-redshift data are consistent with each other when obtained with the redshift-evolutionary relations, but not when the standard relation is considered. If the data are used to constrain the coefficients of the relations and the PAge parameters simultaneously, the observations support the redshift-evolutionary relations at more than $3σ$. The Akaike and Bayesian information criteria indicate strong evidence against the standard relation and mild evidence against the redshift-evolutionary relation constructed by considering a redshift-dependent correction to the luminosities of quasars. This suggests that the redshift-evolutionary $L_X$-$L_{UV}$ relation of quasars constructed from copula is favored by the observations.
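For readers unfamiliar with the two criteria, their standard textbook definitions are given below. The symbols $k$ (number of free parameters), $N$ (number of data points), and $\mathcal{L}_{\max}$ (maximum likelihood) are notation assumed here, not taken from the abstract.

```latex
\mathrm{AIC} = 2k - 2\ln \mathcal{L}_{\max},
\qquad
\mathrm{BIC} = k \ln N - 2\ln \mathcal{L}_{\max}
```

Models are compared through differences such as $\Delta\mathrm{AIC} = \mathrm{AIC}_{\text{model}} - \mathrm{AIC}_{\text{best}}$; a larger difference means stronger evidence against the model in question.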

preprint, 2022, arXiv

GLassoformer: A Query-Sparse Transformer for Post-Fault Power Grid Voltage Prediction

We propose GLassoformer, a novel and efficient transformer architecture leveraging group Lasso regularization to reduce the number of queries of the standard self-attention mechanism. Due to the sparsified queries, GLassoformer is more computationally efficient than the standard transformers. On the power grid post-fault voltage prediction task, GLassoformer shows remarkably better prediction than many existing benchmark algorithms in terms of accuracy and stability.

preprint, 2022, arXiv

Momentum Transformer: Closing the Performance Gap Between Self-attention and Its Linearization

Transformers have achieved remarkable success in sequence modeling and beyond but suffer from quadratic computational and memory complexities with respect to the length of the input sequence. Efficient transformers that leverage techniques such as sparse attention, linear attention, and hashing tricks have been proposed to reduce this quadratic complexity, but they significantly degrade accuracy. In response, we first interpret the linear attention and residual connections in computing the attention map as gradient descent steps. We then introduce momentum into these components and propose the momentum transformer, which utilizes momentum to improve the accuracy of linear transformers while maintaining linear memory and computational complexities. Furthermore, we develop an adaptive strategy to compute the momentum value for our model based on the optimal momentum for quadratic optimization. This adaptive momentum eliminates the need to search for the optimal momentum value and further enhances the performance of the momentum transformer. A range of experiments on both autoregressive and non-autoregressive tasks, including image generation and machine translation, demonstrates that the momentum transformer outperforms popular linear transformers in training efficiency and accuracy.

preprint, 2022, arXiv

Proximal Implicit ODE Solvers for Accelerating Learning Neural ODEs

Learning neural ODEs often requires solving very stiff ODE systems, primarily using explicit adaptive step size ODE solvers. These solvers are computationally expensive, requiring the use of tiny step sizes for numerical stability and accuracy guarantees. This paper considers learning neural ODEs using implicit ODE solvers of different orders leveraging proximal operators. The proximal implicit solver consists of inner-outer iterations: the inner iterations approximate each implicit update step using a fast optimization algorithm, and the outer iterations solve the ODE system over time. The proximal implicit ODE solver guarantees superiority over explicit solvers in numerical stability and computational efficiency. We validate the advantages of proximal implicit solvers over existing popular neural ODE solvers on various challenging benchmark tasks, including learning continuous-depth graph neural networks and continuous normalizing flows.
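The inner-outer structure described above can be sketched on a toy problem. This is a minimal illustration under stated assumptions, not the paper's solver: it uses the stiff gradient flow $y' = -\lambda y$ (that is, $y' = -F'(y)$ with $F(y) = \lambda y^2 / 2$), for which a backward Euler step equals the proximal map $\arg\min_z F(z) + (z - y_n)^2 / (2h)$, and it approximates that map with plain gradient descent. The function names, step sizes, and iteration counts are all hypothetical.

```python
def grad_F(y, lam):
    """Gradient of the toy potential F(y) = 0.5 * lam * y**2."""
    return lam * y


def prox_step(y_n, h, lam, inner_iters=50):
    """Inner iterations: approximate argmin_z F(z) + (z - y_n)**2 / (2h).

    The minimizer is the backward Euler update y_n / (1 + h * lam).
    Plain gradient descent stands in for the "fast optimization algorithm".
    """
    z = y_n
    step = 0.9 / (lam + 1.0 / h)  # below 1/L for this quadratic objective
    for _ in range(inner_iters):
        g = grad_F(z, lam) + (z - y_n) / h
        z -= step * g
    return z


def integrate(y0, h, n_steps, lam, implicit=True):
    """Outer iterations: march the ODE y' = -grad_F(y) forward in time."""
    y = y0
    for _ in range(n_steps):
        if implicit:
            y = prox_step(y, h, lam)
        else:
            y = y - h * grad_F(y, lam)  # explicit (forward) Euler
    return y


LAM = 1000.0  # stiffness parameter (assumed for the demo)
y_implicit = integrate(1.0, 0.01, 20, LAM, implicit=True)
y_explicit = integrate(1.0, 0.01, 20, LAM, implicit=False)
print(y_implicit, y_explicit)
```

With $h\lambda = 10$ the explicit update multiplies the state by $1 - h\lambda = -9$ at every step and blows up, while the proximal update contracts it by $1/(1 + h\lambda)$, which is the stability gap the abstract refers to.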

preprint, 2021, arXiv

A formula for symmetry recursion operators from non-variational symmetries of partial differential equations

An explicit formula to find symmetry recursion operators for partial differential equations (PDEs) is obtained from new results connecting variational integrating factors and non-variational symmetries. The formula is a special case of a general formula that produces a pre-symplectic operator from a non-gradient adjoint-symmetry. These formulas are illustrated by several examples of linear PDEs and integrable nonlinear PDEs. Additionally, a classification of quasilinear second-order PDEs admitting a multiplicative symmetry recursion operator through the first formula is presented.

preprint, 2021, arXiv

Efficient and Reliable Overlay Networks for Decentralized Federated Learning

We propose near-optimal overlay networks based on $d$-regular expander graphs to accelerate decentralized federated learning (DFL) and improve its generalization. In DFL, a massive number of clients are connected by an overlay network, and they solve machine learning problems collaboratively without sharing raw data. Our overlay network design integrates spectral graph theory and the theoretical convergence and generalization bounds for DFL. As such, our proposed overlay networks accelerate convergence, improve generalization, and enhance robustness to client failures in DFL with theoretical guarantees. We also present an efficient algorithm to convert a given graph to a practical overlay network and to maintain the network topology after potential client failures. We numerically verify the advantages of DFL with our proposed networks on various benchmark tasks, ranging from image classification to language modeling, using hundreds of clients.

preprint, 2021, arXiv

Stability and Generalization of the Decentralized Stochastic Gradient Descent

The stability and generalization of stochastic gradient-based methods provide valuable insights into the algorithmic performance of machine learning models. As the main workhorse for deep learning, stochastic gradient descent has been studied extensively. Nevertheless, the community has paid little attention to its decentralized variants. In this paper, we provide a novel formulation of the decentralized stochastic gradient descent. Leveraging this formulation together with (non)convex optimization theory, we establish the first stability and generalization guarantees for the decentralized stochastic gradient descent. Our theoretical results are built on a few common and mild assumptions and reveal, for the first time, that decentralization deteriorates the stability of SGD. We verify our theoretical findings using a variety of decentralized settings and benchmark machine learning models.
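To make the setting concrete, here is a minimal sketch of the commonly used decentralized SGD update, in which each node averages its neighbors' iterates through a mixing matrix and then takes a local stochastic gradient step. It is not the paper's novel formulation. The ring topology, the quadratic local objectives $f_i(x) = (x - c_i)^2 / 2$, the noise level, and all names are assumptions made for illustration.

```python
import random


def decentralized_sgd(W, centers, lr=0.05, steps=500, noise=0.1, seed=0):
    """Run x_i <- sum_j W[i][j] * x_j - lr * g_i on every node i.

    g_i is a noisy gradient of the local objective 0.5 * (x - c_i)**2,
    evaluated at node i's own iterate.
    """
    rng = random.Random(seed)
    n = len(centers)
    x = [0.0] * n
    for _ in range(steps):
        # Gossip step: each node averages over its neighbors.
        mixed = [sum(W[i][j] * x[j] for j in range(n)) for i in range(n)]
        # Local stochastic gradients (Gaussian noise mimics minibatch sampling).
        grads = [(x[i] - centers[i]) + rng.gauss(0.0, noise) for i in range(n)]
        x = [mixed[i] - lr * grads[i] for i in range(n)]
    return x


# Doubly stochastic mixing matrix for a 4-node ring (assumed topology).
W_RING = [
    [0.50, 0.25, 0.00, 0.25],
    [0.25, 0.50, 0.25, 0.00],
    [0.00, 0.25, 0.50, 0.25],
    [0.25, 0.00, 0.25, 0.50],
]
CENTERS = [1.0, 2.0, 3.0, 4.0]

x_final = decentralized_sgd(W_RING, CENTERS)
avg = sum(x_final) / len(x_final)
print(x_final, avg)
```

Because the mixing matrix is doubly stochastic, the network average follows an ordinary noisy SGD recursion toward the global minimizer (the mean of the `CENTERS` values), while individual nodes keep a topology-dependent disagreement around it.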

preprint, 2020, arXiv

Adversarial Defense via Data Dependent Activation Function and Total Variation Minimization

We improve the robustness of deep neural nets (DNNs) to adversarial attacks by using an interpolating function as the output activation. This data-dependent activation remarkably improves both the generalization and robustness of DNNs. On the CIFAR10 benchmark, we raise the robust accuracy of the adversarially trained ResNet20 from $\sim 46\%$ to $\sim 69\%$ under the state-of-the-art Iterative Fast Gradient Sign Method (IFGSM) based adversarial attack. When we combine this data-dependent activation with total variation minimization on adversarial images and training data augmentation, we achieve an improvement in robust accuracy of 38.9$\%$ for ResNet56 under the strongest IFGSM attack. Furthermore, we provide an intuitive explanation of our defense by analyzing the geometry of the feature space.

preprint, 2020, arXiv

Reflections in the Sky: Joint Trajectory and Passive Beamforming Design for Secure UAV Networks with Reconfigurable Intelligent Surface

This paper investigates the problem of secure energy efficiency maximization for a reconfigurable intelligent surface (RIS) assisted uplink wireless communication system, where an unmanned aerial vehicle (UAV) equipped with an RIS works as a mobile relay between the base station (BS) and a group of users. We focus on maximizing the secure energy efficiency of the system by jointly optimizing the UAV's trajectory, the RIS's phase shift, users' association, and transmit power. To tackle this problem, we divide the original problem into three sub-problems and propose an efficient iterative algorithm. In particular, the successive convex approximation (SCA) method is applied to solve the nonconvex UAV trajectory, RIS phase shift, and transmit power optimization sub-problems. We further provide two schemes to simplify the solution of the phase and trajectory sub-problems. Simulation results demonstrate that the proposed algorithm converges fast and that the proposed design can enhance the secure energy efficiency by up to 38\% compared to traditional schemes without any RIS.

preprint, 2020, arXiv

Scheduled Restart Momentum for Accelerated Stochastic Gradient Descent

Stochastic gradient descent (SGD) with constant momentum and its variants such as Adam are the optimization algorithms of choice for training deep neural networks (DNNs). Since DNN training is incredibly computationally expensive, there is great interest in speeding up the convergence. Nesterov accelerated gradient (NAG) improves the convergence rate of gradient descent (GD) for convex optimization using a specially designed momentum; however, it accumulates error when an inexact gradient is used (such as in SGD), slowing convergence at best and diverging at worst. In this paper, we propose Scheduled Restart SGD (SRSGD), a new NAG-style scheme for training DNNs. SRSGD replaces the constant momentum in SGD by the increasing momentum in NAG but stabilizes the iterations by resetting the momentum to zero according to a schedule. Using a variety of models and benchmarks for image classification, we demonstrate that, in training DNNs, SRSGD significantly improves convergence and generalization; for instance in training ResNet200 for ImageNet classification, SRSGD achieves an error rate of 20.93% vs. the benchmark of 22.13%. These improvements become more significant as the network grows deeper. Furthermore, on both CIFAR and ImageNet, SRSGD reaches similar or even better error rates with significantly fewer training epochs compared to the SGD baseline.
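The scheme described above, an increasing NAG-style momentum that is reset to zero on a schedule, can be sketched as follows. This is a hedged illustration rather than the authors' implementation: the momentum formula $\mu_k = j / (j + 3)$ with $j = k \bmod F$ is an assumed form of the NAG momentum, the restart frequency `restart_every` is a hypothetical parameter name, and a deterministic gradient of a toy quadratic stands in for a minibatch gradient.

```python
def srsgd(grad, w0, lr, restart_every, steps):
    """NAG-style iteration with scheduled momentum restarts.

    v_{k+1} = w_k - lr * grad(w_k)
    w_{k+1} = v_{k+1} + mu_k * (v_{k+1} - v_k),  mu_k = j / (j + 3), j = k mod F
    """
    w = w0
    v_prev = w0
    for k in range(steps):
        j = k % restart_every   # position inside the current restart cycle
        mu = j / (j + 3.0)      # grows toward 1, drops to 0 at each restart
        v = w - lr * grad(w)
        w = v + mu * (v - v_prev)
        v_prev = v
    return w


def toy_grad(w):
    """Gradient of the toy objective f(w) = 0.5 * w**2."""
    return w


w_final = srsgd(toy_grad, w0=1.0, lr=0.1, restart_every=40, steps=400)
print(w_final)
```

Resetting `j` to zero discards the accumulated momentum, which is the mechanism the abstract credits with keeping the iteration stable when gradients are inexact.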

preprint, 2020, arXiv

Sparsity Meets Robustness: Channel Pruning for the Feynman-Kac Formalism Principled Robust Deep Neural Nets

Compression of deep neural nets (DNNs) is crucial for adaptation to mobile devices. Though many successful algorithms exist to compress naturally trained DNNs, developing efficient and stable compression algorithms for robustly trained DNNs remains widely open. In this paper, we focus on a co-design of efficient DNN compression algorithms and sparse neural architectures for robust and accurate deep learning. Such a co-design enables us to advance the goal of accommodating both sparsity and robustness. With this objective in mind, we leverage relaxed augmented Lagrangian based algorithms to prune the weights of adversarially trained DNNs at both structured and unstructured levels. Using Feynman-Kac formalism principled robust and sparse DNNs, we can at least double the channel sparsity of the adversarially trained ResNet20 for CIFAR10 classification while improving the natural accuracy by $8.69$\% and the robust accuracy under the benchmark $20$ iterations of IFGSM attack by $5.42$\%. The code is available at https://github.com/BaoWangMath/rvsm-rgsm-admm.

preprint, 2016, arXiv

Accurate, robust and reliable calculations of Poisson-Boltzmann binding energies

The Poisson-Boltzmann (PB) model is one of the most popular implicit solvent models in biophysical modeling and computation. The ability to provide accurate and reliable PB estimates of the electrostatic solvation free energy, $ΔG_{\text{el}}$, and the binding free energy, $ΔΔG_{\text{el}}$, is of tremendous significance to computational biophysics and biochemistry. Recently, it has been warned in the literature (Journal of Chemical Theory and Computation 2013, 9, 3677-3685) that the widely used grid spacing of $0.5$ Å produces unacceptable errors in $ΔΔG_{\text{el}}$ estimation with the solvent excluded surface (SES). In this work, we investigate the grid dependence of our PB solver (MIBPB) with SESs for estimating both electrostatic solvation free energies and electrostatic binding free energies. It is found that the relative absolute error of $ΔG_{\text{el}}$ obtained at the grid spacing of $1.0$ Å compared to $ΔG_{\text{el}}$ at $0.2$ Å, averaged over 153 molecules, is less than 0.2\%. Our results indicate that a grid spacing of $0.6$ Å ensures accuracy and reliability in $ΔΔG_{\text{el}}$ calculation. In fact, a grid spacing of $1.1$ Å appears to deliver adequate accuracy for high throughput screening.

preprint, 2016, arXiv

Accurate, robust and reliable calculations of Poisson-Boltzmann solvation energies

Developing accurate solvers for the Poisson-Boltzmann (PB) model is the first step toward making the PB model suitable for implicit solvent simulation. Reducing the influence of grid size on the performance of the solver helps to increase the speed of the solver and to provide accurate electrostatics analysis for solvated molecules. In this work, we explore an accurate coarse grid PB solver based on the Green's function treatment of the singular charges, the matched interface and boundary (MIB) method for treating the geometric singularities, and posterior electrostatic potential field extension for calculating the reaction field energy. We make our previous PB software, MIBPB, robust, so that it provides almost grid size independent reaction field energy calculation. A large number of numerical tests verify the grid size independence of the MIBPB software. This advantage of the MIBPB software accelerates the PB solver directly through the numerical algorithm rather than through the use of advanced computer architectures. Furthermore, the presented MIBPB software is provided as a free online server.

preprint, 2016, arXiv

Automatic parametrization of implicit solvent models for the blind prediction of solvation free energies

In this work, a systematic protocol is proposed to automatically parametrize implicit solvent models with polar and nonpolar components. The proposed protocol utilizes the classical Poisson model or the Kohn-Sham density functional theory (KSDFT) based polarizable Poisson model for modeling polar solvation free energies. For the nonpolar component, either the standard model of surface area, molecular volume, and van der Waals interactions, or a model with atomic surface areas and molecular volume is employed. Based on the assumption that similar molecules have similar parametrizations, we develop scoring and ranking algorithms to classify solute molecules. Four sets of radius parameters are combined with four sets of charge force fields to arrive at a total of 16 different parametrizations for the Poisson model. A large database with 668 experimental data entries is utilized to validate the proposed protocol. The lowest leave-one-out root mean square (RMS) error for the database is 1.33 kcal/mol. Additionally, five subsets of the database, i.e., SAMPL0-SAMPL4, are employed to further demonstrate that the proposed protocol offers some of the best solvation predictions. The optimal RMS errors are 0.93, 2.82, 1.90, 0.78, and 1.03 kcal/mol for the SAMPL0, SAMPL1, SAMPL2, SAMPL3, and SAMPL4 test sets, respectively. To the best of our knowledge, these results are some of the best.

preprint, 2015, arXiv

Finite Volume Formulation of the MIB Method for Elliptic Interface Problems

The matched interface and boundary (MIB) method has a proven ability to deliver second order accuracy in handling elliptic interface problems with arbitrarily complex interface geometries. However, its collocation formulation requires relatively high solution regularity. The finite volume method (FVM) has merit in dealing with conservation law problems, and its integral formulation works well with relatively low solution regularity. We propose an MIB-FVM to take advantage of both MIB and FVM for solving elliptic interface problems. We construct the proposed method on Cartesian meshes with vertex-centered control volumes. A large number of numerical experiments are designed to validate the present method in both two dimensional (2D) and three dimensional (3D) domains. It is found that the proposed MIB-FVM achieves second order convergence for elliptic interface problems with complex interface geometries in both $L_{\infty}$ and $L_2$ norms.
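Claims of second order convergence are usually checked by refining the mesh and computing the observed order from consecutive errors, $p = \log(e_h / e_{h/2}) / \log 2$. The sketch below shows that calculation on a stand-in problem, a central-difference approximation of a second derivative; it is not the MIB-FVM, and the function names are hypothetical.

```python
import math


def second_derivative_error(h, x=1.0):
    """Error of the central difference for d^2/dx^2 sin(x) at the point x."""
    approx = (math.sin(x + h) - 2.0 * math.sin(x) + math.sin(x - h)) / (h * h)
    exact = -math.sin(x)
    return abs(approx - exact)


def observed_order(err_coarse, err_fine, ratio=2.0):
    """Observed convergence order when the mesh size shrinks by `ratio`."""
    return math.log(err_coarse / err_fine) / math.log(ratio)


order = observed_order(second_derivative_error(0.1), second_derivative_error(0.05))
print(order)
```

An observed order close to 2 under repeated halving of the mesh size is what "second order convergence" means in practice; the same calculation applies whether the errors are measured in the $L_{\infty}$ or the $L_2$ norm.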

preprint, 2015, arXiv

Parameter optimization in differential geometry based solvation models

Differential geometry (DG) based solvation models are a new class of variational implicit solvent approaches that are able to avoid unphysical solvent-solute boundary definitions and associated geometric singularities, and dynamically couple polar and nonpolar interactions in a self-consistent framework. Our earlier study indicates that the DG based nonpolar solvation model outperforms other methods in nonpolar solvation energy predictions. However, the DG based full solvation model has not shown its superiority in solvation analysis, due to its difficulty in parametrization, which must ensure the stability of the solution of strongly coupled nonlinear Laplace-Beltrami and Poisson-Boltzmann equations. In this work, we introduce new parameter learning algorithms based on perturbation and convex optimization theories to stabilize the numerical solution and thus achieve an optimal parametrization of the DG based solvation models. An interesting feature of the present DG based solvation model is that it provides accurate solvation free energy predictions for both polar and nonpolar molecules in a unified formulation. Extensive numerical experiments demonstrate that the present DG based solvation model delivers some of the most accurate predictions of the solvation free energies for a large number of molecules.

preprint, 2014, arXiv

Matched Interface and Boundary Method for Elasticity Interface Problems

Elasticity theory is an important component of continuum mechanics and has widespread applications in science and engineering. Material interfaces are ubiquitous in nature and man-made devices, and often give rise to discontinuous coefficients in the governing elasticity equations. In this work, the matched interface and boundary (MIB) method is developed to address elasticity interface problems. Linear elasticity theory for both isotropic homogeneous and inhomogeneous media is employed. In our approach, Lamé's parameters can have jumps across the interface and are allowed to be position dependent in modeling isotropic inhomogeneous material. Both strong discontinuity, i.e., discontinuous solution, and weak discontinuity, namely, discontinuous derivatives of the solution, are considered in the present study. In the proposed method, fictitious values are utilized so that the standard central finite difference schemes can be employed regardless of the interface. Interface jump conditions are enforced on the interface, which, in turn, accurately determine the fictitious values. We design new MIB schemes to account for complex interface geometries. In particular, the cross derivatives in the elasticity equations are difficult to handle for complex interface geometries. We propose secondary fictitious values and construct geometry based interpolation schemes to overcome this difficulty. Numerous analytical examples are used to validate the accuracy, convergence and robustness of the present MIB method for elasticity interface problems with both small and large curvatures, strong and weak discontinuities, and constant and variable coefficients. Numerical tests indicate second order accuracy in both $L_\infty$ and $L_2$ norms.

preprint, 2014, arXiv

Objective-oriented Persistent Homology

Persistent homology provides a new approach for the topological simplification of big data via measuring the lifetime of intrinsic topological features in a filtration process and has found success in scientific and engineering applications. However, such success is essentially limited to qualitative data characterization, identification and analysis (CIA). In this work, we outline a general protocol to construct objective-oriented persistent homology methods. The minimization of the objective functional leads to a Laplace-Beltrami operator which generates a multiscale representation of the initial data and offers an objective-oriented filtration process. The resulting differential geometry based objective-oriented persistent homology is able to preserve desirable geometric features in the evolutionary filtration and enhances the corresponding topological persistence. The consistency between Laplace-Beltrami flow based filtration and Euclidean distance based filtration is confirmed on the Vietoris-Rips complex in a large number of numerical tests. The convergence and reliability of the present Laplace-Beltrami flow based cubical complex filtration approach are analyzed over various spatial and temporal mesh sizes. The efficiency and robustness of the present method are verified on more than 500 fullerene molecules. It is shown that the proposed persistent homology based quantitative model offers good predictions of total curvature energies for ten types of fullerene isomers. The present work offers the first example of designing objective-oriented persistent homology to enhance or preserve desirable features in the original data during the filtration process and then automatically detect or extract the corresponding topological traits from the data.

preprint, 2014, arXiv

Second order Method for Solving 3D Elasticity Equations with Complex and Sharp Interfaces

Elastic materials are ubiquitous in nature and are indispensable components in man-made devices and equipment. When a device or piece of equipment involves composite or multiple elastic materials, elasticity interface problems come into play. The solution of three dimensional (3D) elasticity interface problems is significantly more difficult than that of elliptic counterparts due to the coupled vector components and cross derivatives in the governing elasticity equation. This work introduces the matched interface and boundary (MIB) method for solving 3D elasticity interface problems. The proposed MIB method utilizes fictitious values on irregular grid points near the material interface to replace function values in the discretization so that the elasticity equation can be discretized using the standard finite difference schemes as if there were no material interface. The interface jump conditions are rigorously enforced on the intersecting points between the interface and the mesh lines. Such an enforcement determines the fictitious values. A number of new techniques are developed to construct efficient MIB schemes for dealing with cross derivatives in coupled governing equations. The proposed method is extensively validated over both weak and strong discontinuity of the solution, both piecewise constant and position-dependent material parameters, both smooth and nonsmooth interface geometries, and both small and large contrasts in the Poisson's ratio and shear modulus across the interface. Numerical experiments indicate that the present MIB method is of second order convergence in both $L_\infty$ and $L_2$ error norms.
