Source author record

Andrea L. Bertozzi

Andrea L. Bertozzi appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Machine Learning; math.NA; Numerical Analysis; math.OC; physics.soc-ph; Social and Information Networks; Artificial Intelligence; Computer Vision; physics.flu-dyn; cond-mat.mtrl-sci; Distributed, Parallel, and Cluster Computing; eess.IV; math.AP; math.DS; Networking and Internet Architecture; Neural and Evolutionary Computing; physics.data-an; Populations and Evolution

Catalog footprint

What is connected

19 works

18 topics

4 close collaborators

preprint · 2022 · arXiv

Dynamics of small particle inertial migration in curved square ducts

Microchannels are well known in microfluidic applications for the control and separation of microdroplets and cells. Often the objects in the flow experience inertial effects, resulting in dynamics that depart from the underlying channel flow dynamics. This paper considers small neutrally buoyant spherical particles suspended in flow through a curved duct having a square cross-section. The particle experiences a combination of inertial lift force induced by the disturbance from the primary flow along the duct, and drag from the secondary vortices in the cross-section, which drive migration of the particle within the cross-section. We construct a simplified model that preserves the core topology of the force field yet depends on a single parameter $κ$, quantifying the relative strength of the two forces. We show that $κ$ is a bifurcation parameter for the dynamical system that describes motion of the particle in the cross-section of the duct. At large values of $κ$ there exists an attracting limit cycle in each of the upper and lower halves of the duct. At small $κ$ we find that particles migrate to one of four stable foci. Between these extremes, there is an intermediate range of $κ$ for which all particles migrate to a single stable focus. Noting that the positions of the limit cycles and foci vary with the value of $κ$, this behavior indicates that, for a suitable particle mixture, duct bend radius might be chosen to segregate particles by size. We evaluate the time and axial distance required to focus particles near the unique stable node, which determines the duct length required for particle segregation.

preprint · 2022 · arXiv

Graph-based Active Learning for Semi-supervised Classification of SAR Data

We present a novel method for classification of Synthetic Aperture Radar (SAR) data by combining ideas from graph-based learning and neural network methods within an active learning framework. Graph-based methods in machine learning are based on a similarity graph constructed from the data. When the data consists of raw images composed of scenes, extraneous information can make the classification task more difficult. In recent years, neural network methods have been shown to provide a promising framework for extracting patterns from SAR images. These methods, however, require ample training data to avoid overfitting. At the same time, such training data are often unavailable for applications of interest, such as automatic target recognition (ATR) and SAR data. We use a Convolutional Neural Network Variational Autoencoder (CNNVAE) to embed SAR data into a feature space, and then construct a similarity graph from the embedded data and apply graph-based semi-supervised learning techniques. The CNNVAE feature embedding and graph construction require no labeled data, which reduces overfitting and improves the generalization performance of graph learning at low label rates. Furthermore, the method easily incorporates a human-in-the-loop for active learning in the data-labeling process. We present promising results and compare them to other standard machine learning methods on the Moving and Stationary Target Acquisition and Recognition (MSTAR) dataset for ATR with small amounts of labeled data.

preprint · 2022 · arXiv

Proximal Implicit ODE Solvers for Accelerating Learning Neural ODEs

Learning neural ODEs often requires solving very stiff ODE systems, primarily using explicit adaptive step size ODE solvers. These solvers are computationally expensive, requiring the use of tiny step sizes for numerical stability and accuracy guarantees. This paper considers learning neural ODEs using implicit ODE solvers of different orders leveraging proximal operators. The proximal implicit solver consists of inner-outer iterations: the inner iterations approximate each implicit update step using a fast optimization algorithm, and the outer iterations solve the ODE system over time. The proximal implicit ODE solver guarantees superiority over explicit solvers in numerical stability and computational efficiency. We validate the advantages of proximal implicit solvers over existing popular neural ODE solvers on various challenging benchmark tasks, including learning continuous-depth graph neural networks and continuous normalizing flows.
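The inner-outer structure described in this abstract can be illustrated on a toy problem. The following is a minimal sketch, not the paper's solver: it assumes the scalar stiff test equation y' = -lam * y, for which one backward Euler step equals the minimizer of a simple quadratic, and it uses plain gradient descent as the inner optimizer. The function names, the step size `h`, and the inner learning rate are all hypothetical choices made for illustration.

```python
def prox_backward_euler_step(y, h, lam, inner_iters=30, inner_lr=0.01):
    """One implicit (backward Euler) step for y' = -lam * y via inner optimization.

    The implicit update z = y - h * lam * z is the minimizer of
    g(z) = 0.5 * lam * z**2 + (z - y)**2 / (2 * h).
    """
    z = y  # warm start the inner iterations at the current state
    for _ in range(inner_iters):
        grad = lam * z + (z - y) / h  # derivative of g at z
        z -= inner_lr * grad          # inner gradient-descent iteration
    return z


def explicit_euler_step(y, h, lam):
    """One explicit Euler step for the same equation, for comparison."""
    return y + h * (-lam * y)


# Hypothetical stiff setting: h * lam = 5, far outside explicit Euler's stability range.
h, lam, n_steps = 0.1, 50.0, 20
y_implicit = 1.0
y_explicit = 1.0
for _ in range(n_steps):  # outer iterations march the solution forward in time
    y_implicit = prox_backward_euler_step(y_implicit, h, lam)
    y_explicit = explicit_euler_step(y_explicit, h, lam)
```

With these illustrative values the explicit iterate is multiplied by -4 at every step and blows up, while the implicit iterate decays toward zero as the true solution does.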

preprint · 2021 · arXiv

Efficient and Reliable Overlay Networks for Decentralized Federated Learning

We propose near-optimal overlay networks based on $d$-regular expander graphs to accelerate decentralized federated learning (DFL) and improve its generalization. In DFL a massive number of clients are connected by an overlay network, and they solve machine learning problems collaboratively without sharing raw data. Our overlay network design integrates spectral graph theory and the theoretical convergence and generalization bounds for DFL. As such, our proposed overlay networks accelerate convergence, improve generalization, and enhance robustness to client failures in DFL with theoretical guarantees. Also, we present an efficient algorithm to convert a given graph to a practical overlay network and to maintain the network topology after potential client failures. We numerically verify the advantages of DFL with our proposed networks on various benchmark tasks, ranging from image classification to language modeling using hundreds of clients.

preprint · 2020 · arXiv

A Hamilton-Jacobi Formulation for Time-Optimal Paths of Rectangular Nonholonomic Vehicles

We address the problem of optimal path planning for a simple nonholonomic vehicle in the presence of obstacles. Most current approaches are either split hierarchically into global path planning and local collision avoidance, or neglect some of the ambient geometry by assuming the car is a point mass. We present a Hamilton-Jacobi formulation of the problem that resolves time-optimal paths and considers the geometry of the vehicle.

preprint · 2020 · arXiv

A Model for Optimal Human Navigation with Stochastic Effects

We present a method for optimal path planning of human walking paths in mountainous terrain, using a control theoretic formulation and a Hamilton-Jacobi-Bellman equation. Previous models for human navigation were entirely deterministic, assuming perfect knowledge of the ambient elevation data and human walking velocity as a function of local slope of the terrain. Our model includes a stochastic component which can account for uncertainty in the problem, and thus includes a Hamilton-Jacobi-Bellman equation with viscosity. We discuss the model in the presence and absence of stochastic effects, and suggest numerical methods for simulating the model. We discuss two different notions of an optimal path when there is uncertainty in the problem. Finally, we compare the optimal paths suggested by the model at different levels of uncertainty, and observe that as the size of the uncertainty tends to zero (and thus the viscosity in the equation tends to zero), the optimal path tends toward the deterministic optimal path.

preprint · 2020 · arXiv

A theory for undercompressive shocks in tears of wine

We revisit the tears of wine problem for thin films in water-ethanol mixtures and present a new model for the climbing dynamics. The new formulation includes a Marangoni stress balanced by both the normal and tangential components of gravity as well as surface tension which lead to distinctly different behavior. The prior literature did not address the wine tears but rather the behavior of the film at earlier stages and the behavior of the meniscus. In the lubrication limit we obtain an equation that is already well-known for rising films in the presence of thermal gradients. Such models can exhibit non-classical shocks that are undercompressive. We present basic theory that allows one to identify the signature of an undercompressive (UC) wave. We observe both compressive and undercompressive waves in new experiments and we argue that, in the case of a pre-coated glass, the famous "wine tears" emerge from a reverse undercompressive shock originating at the meniscus.

preprint · 2020 · arXiv

Adversarial Defense via Data Dependent Activation Function and Total Variation Minimization

We improve the robustness of Deep Neural Nets (DNNs) to adversarial attacks by using an interpolating function as the output activation. This data-dependent activation remarkably improves both the generalization and robustness of DNNs. In the CIFAR10 benchmark, we raise the robust accuracy of the adversarially trained ResNet20 from $\sim 46\%$ to $\sim 69\%$ under the state-of-the-art Iterative Fast Gradient Sign Method (IFGSM) based adversarial attack. When we combine this data-dependent activation with total variation minimization on adversarial images and training data augmentation, we achieve an improvement in robust accuracy by $38.9\%$ for ResNet56 under the strongest IFGSM attack. Furthermore, we provide an intuitive explanation of our defense by analyzing the geometry of the feature space.

preprint · 2020 · arXiv

Efficient Graph-Based Active Learning with Probit Likelihood via Gaussian Approximations

We present a novel adaptation of active learning to graph-based semi-supervised learning (SSL) under non-Gaussian Bayesian models. We present an approximation of non-Gaussian distributions to adapt previously Gaussian-based acquisition functions to these more general cases. We develop an efficient rank-one update for applying "look-ahead" based methods as well as model retraining. We also introduce a novel "model change" acquisition function based on these approximations that further expands the available collection of active learning acquisition functions for such methods.

preprint · 2020 · arXiv

Scheduled Restart Momentum for Accelerated Stochastic Gradient Descent

Stochastic gradient descent (SGD) with constant momentum and its variants such as Adam are the optimization algorithms of choice for training deep neural networks (DNNs). Since DNN training is incredibly computationally expensive, there is great interest in speeding up the convergence. Nesterov accelerated gradient (NAG) improves the convergence rate of gradient descent (GD) for convex optimization using a specially designed momentum; however, it accumulates error when an inexact gradient is used (such as in SGD), slowing convergence at best and diverging at worst. In this paper, we propose Scheduled Restart SGD (SRSGD), a new NAG-style scheme for training DNNs. SRSGD replaces the constant momentum in SGD by the increasing momentum in NAG but stabilizes the iterations by resetting the momentum to zero according to a schedule. Using a variety of models and benchmarks for image classification, we demonstrate that, in training DNNs, SRSGD significantly improves convergence and generalization; for instance in training ResNet200 for ImageNet classification, SRSGD achieves an error rate of 20.93% vs. the benchmark of 22.13%. These improvements become more significant as the network grows deeper. Furthermore, on both CIFAR and ImageNet, SRSGD reaches similar or even better error rates with significantly fewer training epochs compared to the SGD baseline.
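The restart idea in this abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses exact gradients on a small quadratic instead of stochastic gradients on a DNN, and it assumes the common NAG momentum schedule j / (j + 3), with the counter j reset to zero every `restart_every` iterations. The function name, the test objective, and all parameter values are hypothetical.

```python
def srsgd_minimize(grad, w0, step=0.1, restart_every=20, n_iters=200):
    """NAG-style iteration whose momentum counter resets on a fixed schedule."""
    w = list(w0)       # look-ahead point where the gradient is evaluated
    v_prev = list(w0)  # previous gradient-step iterate
    for k in range(n_iters):
        j = k % restart_every  # counter returns to 0 at every scheduled restart
        mu = j / (j + 3.0)     # increasing momentum; exactly 0 right after a restart
        g = grad(w)
        v = [wi - step * gi for wi, gi in zip(w, g)]              # gradient step
        w = [vi + mu * (vi - vpi) for vi, vpi in zip(v, v_prev)]  # momentum step
        v_prev = v
    return v_prev


def quadratic_grad(w):
    # Gradient of the hypothetical objective f(w) = 0.5 * (w[0]**2 + 10 * w[1]**2).
    return [w[0], 10.0 * w[1]]


w_star = srsgd_minimize(quadratic_grad, [5.0, -3.0], step=0.05)
```

Because the momentum coefficient drops to zero at each restart, any error accumulated in the momentum term is discarded on schedule, which is the stabilizing effect the abstract describes.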

preprint · 2020 · arXiv

Sparsity Meets Robustness: Channel Pruning for the Feynman-Kac Formalism Principled Robust Deep Neural Nets

Compression of deep neural nets (DNNs) is crucial for adaptation to mobile devices. Though many successful algorithms exist to compress naturally trained DNNs, developing efficient and stable compression algorithms for robustly trained DNNs remains widely open. In this paper, we focus on a co-design of efficient DNN compression algorithms and sparse neural architectures for robust and accurate deep learning. Such a co-design enables us to advance the goal of accommodating both sparsity and robustness. With this objective in mind, we leverage the relaxed augmented Lagrangian based algorithms to prune the weights of adversarially trained DNNs, at both structured and unstructured levels. Using Feynman-Kac formalism principled robust and sparse DNNs, we can at least double the channel sparsity of the adversarially trained ResNet20 for CIFAR10 classification while improving the natural accuracy by $8.69\%$ and the robust accuracy under the benchmark $20$ iterations of IFGSM attack by $5.42\%$. The code is available at https://github.com/BaoWangMath/rvsm-rgsm-admm.

preprint · 2020 · arXiv

The challenges of modeling and forecasting the spread of COVID-19

We present three data driven model-types for COVID-19 with a minimal number of parameters to provide insights into the spread of the disease that may be used for developing policy responses. The first is exponential growth, widely studied in analysis of early-time data. The second is a self-exciting branching process model which includes a delay in transmission and recovery. It allows for meaningful fit to early time stochastic data. The third is the well-known Susceptible-Infected-Resistant (SIR) model and its cousin, SEIR, with an "Exposed" component. All three models are related quantitatively, and the SIR model is used to illustrate the potential effects of short-term distancing measures in the United States.
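As a rough illustration of the third model type, the sketch below integrates the standard SIR equations with forward Euler and compares two transmission rates. It is a minimal sketch, not the paper's fitted model: the parameter values (`beta`, `gamma`, the initial infected fraction) are hypothetical, and distancing is represented simply as a lower constant `beta` rather than a short-term intervention.

```python
def simulate_sir(beta, gamma, s0, i0, r0, days, dt=0.1):
    """Forward-Euler integration of ds/dt = -beta*s*i, di/dt = beta*s*i - gamma*i,
    dr/dt = gamma*i, with s, i, r given as population fractions.
    Returns one (s, i, r) tuple per day, including day 0."""
    s, i, r = s0, i0, r0
    history = [(s, i, r)]
    steps_per_day = round(1.0 / dt)
    for _ in range(days):
        for _ in range(steps_per_day):
            new_infections = beta * s * i * dt
            new_recoveries = gamma * i * dt
            s -= new_infections
            i += new_infections - new_recoveries
            r += new_recoveries
        history.append((s, i, r))
    return history


# Hypothetical rates: baseline R0 = beta / gamma = 3, distancing halves beta.
baseline = simulate_sir(beta=0.30, gamma=0.10, s0=0.999, i0=0.001, r0=0.0, days=400)
distanced = simulate_sir(beta=0.15, gamma=0.10, s0=0.999, i0=0.001, r0=0.0, days=400)

peak_baseline = max(i for _, i, _ in baseline)
peak_distanced = max(i for _, i, _ in distanced)
```

Comparing `peak_baseline` with `peak_distanced` shows the qualitative effect of a reduced contact rate on the height of the infection curve in this toy setting.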

preprint · 2016 · arXiv

Growth and Containment of a Hierarchical Criminal Network

We model the hierarchical evolution of an organized criminal network via antagonistic recruitment and pursuit processes. Within the recruitment phase, a criminal kingpin enlists new members into the network, who in turn seek out other affiliates. New recruits are linked to established criminals according to a probability distribution that depends on the current network structure. At the same time, law enforcement agents attempt to dismantle the growing organization using pursuit strategies that initiate on the lower level nodes and that unfold as self-avoiding random walks. The global details of the organization are unknown to law enforcement, who must explore the hierarchy node by node. We halt the pursuit when certain local criteria of the network are uncovered, encoding if and when an arrest is made; the criminal network is assumed to be eradicated if the kingpin is arrested. We first analyze recruitment and study the large scale properties of the growing network; later we add pursuit and use numerical simulations to study the eradication probability in the case of three pursuit strategies, the time to first eradication and related costs. Within the context of this model, we find that eradication becomes increasingly costly as the network increases in size and that the optimal way of arresting the kingpin is to intervene at the early stages of network formation. We discuss our results in the context of dark network disruption and their implications on possible law enforcement strategies.

preprint · 2014 · arXiv

A Blob Method for the Aggregation Equation

Motivated by classical vortex blob methods for the Euler equations, we develop a numerical blob method for the aggregation equation. This provides a counterpoint to existing literature on particle methods. By regularizing the velocity field with a mollifier or "blob function", the blob method has a faster rate of convergence and allows a wider range of admissible kernels. In fact, we prove arbitrarily high polynomial rates of convergence to classical solutions, depending on the choice of mollifier. The blob method conserves mass and the corresponding particle system is both energy decreasing for a regularized free energy functional and preserves the Wasserstein gradient flow structure. We consider numerical examples that validate our predicted rate of convergence and illustrate qualitative properties of the method.

preprint · 2014 · arXiv

Multiclass Data Segmentation using Diffuse Interface Methods on Graphs

We present two graph-based algorithms for multiclass segmentation of high-dimensional data. The algorithms use a diffuse interface model based on the Ginzburg-Landau functional, related to total variation compressed sensing and image processing. A multiclass extension is introduced using the Gibbs simplex, with the functional's double-well potential modified to handle the multiclass case. The first algorithm minimizes the functional using a convex splitting numerical scheme. The second algorithm uses a graph adaptation of the classical numerical Merriman-Bence-Osher (MBO) scheme, which alternates between diffusion and thresholding. We demonstrate the performance of both algorithms experimentally on synthetic data, grayscale and color images, and several benchmark data sets such as MNIST, COIL and WebKB. We also make use of fast numerical solvers for finding the eigenvectors and eigenvalues of the graph Laplacian, and take advantage of the sparsity of the matrix. Experiments indicate that the results are competitive with or better than the current state-of-the-art multiclass segmentation algorithms.
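The alternation between diffusion and thresholding can be sketched on a tiny graph. This is a minimal illustration rather than the paper's algorithm: it swaps in explicit Euler steps with the unnormalized graph Laplacian in place of the spectral solvers mentioned in the abstract, and it enforces the known labels by clamping them after every step, which is an assumed simplification of a fidelity term. The function name, the six-node graph, and all parameter values are hypothetical.

```python
def graph_mbo(weights, labels, n_classes, dt=0.1, diffusion_steps=5, n_iters=10):
    """Multiclass MBO-style iteration: diffuse on the graph, then threshold each
    node to the nearest vertex of the simplex (a one-hot class indicator)."""
    n = len(weights)
    degree = [sum(row) for row in weights]

    def clamp(u):
        # Assumed fidelity step: reset labeled nodes to their known class.
        for node, c in labels.items():
            u[node] = [1.0 if k == c else 0.0 for k in range(n_classes)]

    u = [[1.0 / n_classes] * n_classes for _ in range(n)]  # uniform start
    clamp(u)
    for _ in range(n_iters):
        for _ in range(diffusion_steps):
            # Explicit Euler step of du/dt = -L u with L = D - W.
            u = [[u[i][k] - dt * (degree[i] * u[i][k]
                                  - sum(weights[i][j] * u[j][k] for j in range(n)))
                  for k in range(n_classes)] for i in range(n)]
            clamp(u)
        for i in range(n):
            # Thresholding: move each row to the one-hot vector of its largest entry.
            best = max(range(n_classes), key=lambda k: u[i][k])
            u[i] = [1.0 if k == best else 0.0 for k in range(n_classes)]
        clamp(u)
    return [row.index(1.0) for row in u]


# Two triangles (nodes 0-2 and 3-5) joined by one weak edge between nodes 2 and 3.
W = [[0.0] * 6 for _ in range(6)]
for a, b, w in [(0, 1, 1), (0, 2, 1), (1, 2, 1), (3, 4, 1), (3, 5, 1), (4, 5, 1), (2, 3, 0.1)]:
    W[a][b] = W[b][a] = float(w)

segmentation = graph_mbo(W, labels={0: 0, 5: 1}, n_classes=2)
```

With one labeled node per triangle, the labels spread along the strong edges and the weak edge becomes the class boundary.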

preprint · 2013 · arXiv

A Method Based on Total Variation for Network Modularity Optimization using the MBO Scheme

The study of network structure is pervasive in sociology, biology, computer science, and many other disciplines. One of the most important areas of network science is the algorithmic detection of cohesive groups of nodes called "communities". One popular approach to find communities is to maximize a quality function known as modularity to achieve some sort of optimal clustering of nodes. In this paper, we interpret the modularity function from a novel perspective: we reformulate modularity optimization as a minimization problem of an energy functional that consists of a total variation term and an $\ell_2$ balance term. By employing numerical techniques from image processing and $\ell_1$ compressive sensing -- such as convex splitting and the Merriman-Bence-Osher (MBO) scheme -- we develop a variational algorithm for the minimization problem. We present our computational results using both synthetic benchmark networks and real data.

preprint · 2012 · arXiv

Characterization of radially symmetric finite time blowup in multidimensional aggregation equations

This paper studies the transport of a mass $μ$ in $\mathbb{R}^d, d \geq 2,$ by a flow field $v= -\nabla K*μ$. We focus on kernels $K=|x|^α/ α$ for $2-d\leq α<2$ for which the smooth densities are known to develop singularities in finite time. For this range we prove the existence for all time of radially symmetric measure solutions that are monotone decreasing as a function of the radius, thus allowing for continuation of the solution past the blowup time. The monotone constraint on the data is consistent with the typical blowup profiles observed in recent numerical studies of these singularities. We prove monotonicity is preserved for all time, even after blowup, in contrast to the case $α>2$ where radially symmetric solutions are known to lose monotonicity. In the case of the Newtonian potential ($α=2-d$), under the assumption of radial symmetry the equation can be transformed into the inviscid Burgers equation on a half line. This enables us to prove preservation of monotonicity using the classical theory of conservation laws. In the case $2 -d < α< 2$ and at the critical exponent $p$ we exhibit initial data in $L^p$ for which the solution immediately develops a Dirac mass singularity. This extends recent work on the local ill-posedness of solutions at the critical exponent.

preprint · 2012 · arXiv

Development of Knife-Edge Ridges on Ion-Bombarded Surfaces

We demonstrate in both laboratory and numerical experiments that ion bombardment of a modestly sloped surface can create knife-edge like ridges with extremely high slopes. Small pre-fabricated pits expand under ion bombardment, and the collision of two such pits creates knife-edge ridges. Both laboratory and numerical experiments show that the pit propagation speed and the precise shape of the knife edge ridges are universal, independent of initial conditions, as has been predicted theoretically. These observations suggest a novel method of fabrication in which a surface is pre-patterned so that it dynamically evolves to a desired target pattern made of knife-edge ridges.

preprint · 2012 · arXiv

Multislice Modularity Optimization in Community Detection and Image Segmentation

Because networks can be used to represent many complex systems, they have attracted considerable attention in physics, computer science, sociology, and many other disciplines. One of the most important areas of network science is the algorithmic detection of cohesive groups (i.e., "communities") of nodes. In this paper, we algorithmically detect communities in social networks and image data by optimizing multislice modularity. A key advantage of modularity optimization is that it does not require prior knowledge of the number or sizes of communities, and it is capable of finding network partitions that are composed of communities of different sizes. By optimizing multislice modularity and subsequently calculating diagnostics on the resulting network partitions, it is thereby possible to obtain information about network structure across multiple system scales. We illustrate this method on data from both social networks and images, and we find that optimization of multislice modularity performs well on these two tasks without the need for extensive problem-specific adaptation. However, improving the computational speed of this method remains a challenging open problem.
