Source author record

Guojun Zhang

Guojun Zhang appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Machine Learning, hep-th, Computer Science and Game Theory, math.OC, Computer Vision, physics.atom-ph

Catalog footprint

What is connected

9 works

6 topics

4 close collaborators


Preprint, 2022, arXiv

Domain Adversarial Training: A Game Perspective

The dominant line of work in domain adaptation has focused on learning invariant representations using domain-adversarial training. In this paper, we interpret this approach from a game-theoretical perspective. Defining optimal solutions in domain-adversarial training as a local Nash equilibrium, we show that gradient descent in domain-adversarial training can violate the asymptotic convergence guarantees of the optimizer, oftentimes hindering the transfer performance. Our analysis leads us to replace gradient descent with high-order ODE solvers (i.e., Runge-Kutta), for which we derive asymptotic convergence guarantees. This family of optimizers is significantly more stable and allows more aggressive learning rates, leading to high performance gains when used as a drop-in replacement over standard optimizers. Our experiments show that in conjunction with state-of-the-art domain-adversarial methods, we achieve up to 3.5% improvement with less than half of the training iterations. Our optimizers are easy to implement, free of additional parameters, and can be plugged into any domain-adversarial framework.
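The contrast between gradient descent and a Runge-Kutta solver can be seen on a toy problem. The sketch below is not the paper's experimental setup: it assumes the two-player game f(x, y) = x * y (x minimizes, y maximizes), a step size of 0.5, and the classical fourth-order Runge-Kutta scheme, all chosen here purely for illustration. The helper names (`field`, `euler_step`, `rk4_step`, `run`) are hypothetical.

```python
import math

def field(z):
    """Gradient-play vector field of the assumed toy game f(x, y) = x * y.

    x descends along df/dx = y and y ascends along df/dy = x.
    """
    x, y = z
    return (-y, x)

def add(z, v, scale):
    """Return z + scale * v for 2-tuples."""
    return (z[0] + scale * v[0], z[1] + scale * v[1])

def euler_step(z, lr):
    """Simultaneous gradient step, i.e. an explicit Euler step on the field."""
    return add(z, field(z), lr)

def rk4_step(z, lr):
    """One classical fourth-order Runge-Kutta step on the same field."""
    k1 = field(z)
    k2 = field(add(z, k1, 0.5 * lr))
    k3 = field(add(z, k2, 0.5 * lr))
    k4 = field(add(z, k3, lr))
    avg = tuple((a + 2 * b + 2 * c + d) / 6 for a, b, c, d in zip(k1, k2, k3, k4))
    return add(z, avg, lr)

def run(step, z0, lr, n_steps):
    """Iterate `step` and return the final distance to the equilibrium (0, 0)."""
    z = z0
    for _ in range(n_steps):
        z = step(z, lr)
    return math.hypot(*z)

if __name__ == "__main__":
    start = (1.0, 0.0)  # distance 1 from the equilibrium
    print("Euler:", run(euler_step, start, lr=0.5, n_steps=200))
    print("RK4:  ", run(rk4_step, start, lr=0.5, n_steps=200))
```

On this toy game the Euler iterates spiral away from the equilibrium while the RK4 iterates drift slowly toward it. That only illustrates the kind of stability gap the abstract describes; it says nothing about the reported gains on real domain-adaptation benchmarks.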

Preprint, 2022, arXiv

Mitigating Data Heterogeneity in Federated Learning with Data Augmentation

Federated Learning (FL) is a prominent framework that enables training a centralized model while securing user privacy by fusing local, decentralized models. In this setting, one major obstacle is data heterogeneity, i.e., each client holding data that are not independent and identically distributed (non-IID). This is analogous to the context of Domain Generalization (DG), where each client can be treated as a different domain. However, while many approaches in DG tackle data heterogeneity from the algorithmic perspective, recent evidence suggests that data augmentation can yield equal or better performance. Motivated by this connection, we present federated versions of popular DG algorithms, and show that by applying appropriate data augmentation, we can mitigate data heterogeneity in the federated setting and obtain higher accuracy on unseen clients. Equipped with data augmentation, we can achieve state-of-the-art performance using even the most basic Federated Averaging algorithm, with much sparser communication.

Preprint, 2022, arXiv

Optimality and Stability in Non-Convex Smooth Games

Convergence to a saddle point for convex-concave functions has been studied for decades, while recent years have seen a surge of interest in non-convex (zero-sum) smooth games, motivated by their recent wide applications. How locally optimal points should be defined, and which algorithms can converge to such points, remains an intriguing research challenge. An interesting concept is the local minimax point, which strongly correlates with the widely known gradient descent ascent algorithm. This paper aims to provide a comprehensive analysis of local minimax points, such as their relation with other solution concepts and their optimality conditions. We find that local saddle points can be regarded as a special type of local minimax points, called uniformly local minimax points, under mild continuity assumptions. In (non-convex) quadratic games, we show that local minimax points are (in some sense) equivalent to global minimax points. Finally, we study the stability of gradient algorithms near local minimax points. Although gradient algorithms can converge to local/global minimax points in the non-degenerate case, they often fail in the general case. This implies the necessity of either novel algorithms or concepts beyond saddle points and minimax points in non-convex smooth games.

Preprint, 2022, arXiv

Robust One Round Federated Learning with Predictive Space Bayesian Inference

Making predictions robust is an important challenge. A separate challenge in federated learning (FL) is to reduce the number of communication rounds, particularly since doing so reduces performance in heterogeneous data settings. To tackle both issues, we take a Bayesian perspective on the problem of learning a global model. We show how the global predictive posterior can be approximated using client predictive posteriors. This is unlike other works which aggregate the local model space posteriors into the global model space posterior, and are susceptible to high approximation errors due to the posterior's high dimensional multimodal nature. In contrast, our method performs the aggregation on the predictive posteriors, which are typically easier to approximate owing to the low-dimensionality of the output space. We present an algorithm based on this idea, which performs MCMC sampling at each client to obtain an estimate of the local posterior, and then aggregates these in one round to obtain a global ensemble model. Through empirical evaluation on several classification and regression tasks, we show that despite using one round of communication, the method is competitive with other FL techniques, and outperforms them on heterogeneous settings. The code is publicly available at https://github.com/hasanmohsin/FedPredSpace_1Round.

Preprint, 2020, arXiv

Convergence of Gradient Methods on Bilinear Zero-Sum Games

Min-max formulations have attracted great attention in the ML community due to the rise of deep generative models and adversarial methods, while understanding the dynamics of gradient algorithms for solving such formulations has remained a grand challenge. As a first step, we restrict to bilinear zero-sum games and give a systematic analysis of popular gradient updates, for both simultaneous and alternating versions. We provide exact conditions for their convergence and find the optimal parameter setup and convergence rates. In particular, our results offer formal evidence that alternating updates converge "better" than simultaneous ones.
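The difference between simultaneous and alternating updates can be sketched on the simplest bilinear zero-sum game. The example below assumes the scalar game f(x, y) = x * y, plain gradient descent ascent, and a step size of 0.1; these choices and the function names are illustrative only and do not reproduce the paper's exact conditions or rates.

```python
import math

def simultaneous_step(x, y, lr):
    """Both players update from the same (old) iterate."""
    return x - lr * y, y + lr * x

def alternating_step(x, y, lr):
    """x updates first; y then responds to the already-updated x."""
    x_new = x - lr * y
    y_new = y + lr * x_new
    return x_new, y_new

def distance_after(step, lr=0.1, n_steps=1000, start=(1.0, 1.0)):
    """Run `step` repeatedly and return the distance to the equilibrium (0, 0)."""
    x, y = start
    for _ in range(n_steps):
        x, y = step(x, y, lr)
    return math.hypot(x, y)

if __name__ == "__main__":
    print("start:       ", math.hypot(1.0, 1.0))
    print("simultaneous:", distance_after(simultaneous_step))
    print("alternating: ", distance_after(alternating_step))
```

In this setting the simultaneous iterates move steadily away from the equilibrium, because each step multiplies the squared distance by 1 + lr**2. The alternating iterates stay on a bounded orbit, because the quantity x**2 - lr*x*y + y**2 is conserved. Plain alternating updates therefore do not converge here either, but they behave better, which matches the qualitative claim in the abstract.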

Preprint, 2020, arXiv

Deep Learning for Feynman's Path Integral in Strong-Field Time-Dependent Dynamics

Feynman's path integral approach sums over all possible spatio-temporal paths to reproduce the quantum wave function and its time evolution, and it has enormous potential to reveal quantum processes from a classical viewpoint. However, completely characterizing the quantum wave function with infinitely many paths is a formidable challenge, which greatly limits the approach's applications, especially in strong-field physics and attosecond science. Instead of brute-force tracking of every path one by one, we propose a deep-learning-performed strong-field Feynman's formulation with a pre-classification scheme, which can directly predict the final results from data on the initial conditions alone, so as to tackle tasks that are insurmountable for existing strong-field methods and to explore new physics. Our results build a bridge between deep learning and strong-field physics through Feynman's path integral, which would boost applications of deep learning to the study of ultrafast time-dependent dynamics in strong-field physics and attosecond science, and shed new light on the quantum-classical correspondence.

Preprint, 2016, arXiv

Minimal Basis in Four Dimensions and Scalar Blocks

We find a construction that expresses any tree-level $n$-particle $\mathrm{N}^{k-2}\mathrm{MHV}$ color-ordered partial amplitude in gauge theory as a linear combination of a basis of dimension $\left\langle {n-3 \atop k-2} \right\rangle$. Here $\left\langle {p \atop q} \right\rangle$ denotes the $(p,q)$ Eulerian number. The coefficients of the expansion are independent of the helicities of the particles. This basis is a four-dimensional refinement of the $(n-3)!$-element BCJ basis, which is valid in any number of dimensions. The construction uses a new kind of object, which we call scalar blocks. Here we initiate the study of these objects. Scalar blocks provide an "$\mathrm{N}^{k-2}\mathrm{MHV}$ sector" decomposition of a bi-adjoint scalar amplitude in four dimensions. As a byproduct of the construction, we also find an intrinsically four-dimensional version of KLT relations for gravity amplitudes.
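The basis dimensions quoted in the abstract can be tabulated with a few lines of code. The sketch below assumes the standard combinatorial convention, in which the $(p,q)$ Eulerian number counts permutations of $p$ elements with exactly $q$ descents and obeys the usual two-term recurrence. Whether the paper indexes its Eulerian numbers in exactly this way is an assumption, and `basis_dimension` is a hypothetical helper name.

```python
from functools import lru_cache
from math import factorial

@lru_cache(maxsize=None)
def eulerian(p: int, q: int) -> int:
    """Eulerian number <p, q>: permutations of p elements with q descents.

    Uses the standard recurrence
        <p, q> = (q + 1) * <p-1, q> + (p - q) * <p-1, q-1>,
    with <0, 0> = 1 and zero outside the range 0 <= q < max(p, 1).
    """
    if q < 0 or q >= max(p, 1):
        return 0
    if p == 0:
        return 1
    return (q + 1) * eulerian(p - 1, q) + (p - q) * eulerian(p - 1, q - 1)

def basis_dimension(n: int, k: int) -> int:
    """Dimension <n-3, k-2> quoted for the N^{k-2}MHV sector (assumed indexing)."""
    return eulerian(n - 3, k - 2)

if __name__ == "__main__":
    n = 6
    dims = {k: basis_dimension(n, k) for k in range(2, n - 1)}
    print("sector dimensions for n = 6:", dims)
    print("sum over sectors:", sum(dims.values()), "vs (n-3)! =", factorial(n - 3))
```

Under this convention each row of Eulerian numbers sums to $p!$, so the sector dimensions for fixed $n$ add up to $(n-3)!$. That is consistent with the basis being described as a refinement of the $(n-3)!$-element BCJ basis.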

Preprint, 2015, arXiv

New Exact Quantization Condition for Toric Calabi-Yau Geometries

We propose a new exact quantization condition for a class of quantum mechanical systems derived from local toric Calabi-Yau three-folds. Our proposal includes all contributions to the energy spectrum which are non-perturbative in the Planck constant, and is much simpler than the available quantization condition in the literature. We check that our proposal is consistent with previous works and implies non-trivial relations among the topological Gopakumar-Vafa invariants of the toric Calabi-Yau geometries. Together with the recent developments, our proposal opens a new avenue in the long investigations at the interface of geometry, topology and quantum mechanics.

Preprint, 2015, arXiv

One-loop Structure of Higher Rank Wilson Loops in AdS/CFT

The half-supersymmetric Wilson loop in $\mathcal{N}=4$ SYM is arguably the central non-local operator in the AdS/CFT correspondence. On the field theory side, the vacuum expectation values of Wilson loops in arbitrary representations of $SU(N)$ are captured to all orders in perturbation theory by a Gaussian matrix model. Of prominent interest are the $k$-symmetric and $k$-antisymmetric representations, whose gravitational description is given in terms of D3- and D5-branes, respectively, with fluxes in their world volumes. At leading order in $N$ and $\lambda$ the agreement in both cases is exact. In this note we explore the structure of the next-to-leading order correction in the matrix model and compare with existing string theory calculations. We find agreement in the functional dependence on $k$ but a mismatch in the numerical coefficients.
