Researcher profile

Xin Guo

Xin Guo contributes to research discovery and scholarly infrastructure.

Researcher. Affiliation not imported. Open to collaborate.

Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
23 works
0 followers
13 topics
4 close collaborators

Published work

23 published items

preprint, 2026, arXiv

Signature Approach for Contextual Bandits with Nonlinear and Path-dependent Rewards

We study contextual bandits with nonlinear and path-dependent rewards through a novel signature-transform-based approach. Leveraging the universal nonlinearity property of signatures, we approximate continuous path-dependent reward functionals by linear functionals in the signature space. This representation enables the use of efficient linear contextual bandit methods while preserving expressive sequential structure. Building on this framework, we propose DisSigUCB, a signature-based disjoint upper confidence bound (UCB) algorithm. Under boundedness and non-degeneracy assumptions, we prove a high-probability data-dependent sublinear regret bound of order \(\tilde{\mathcal O}(\sqrt{(d+m)KT})\), where \(d\) is the context dimension and \(m\) is the signature feature dimension. Synthetic experiments and numerical applications on temperature sensor monitoring, sleep-stage classification, and hospital nurse staffing demonstrate that DisSigUCB consistently outperforms classical linear and kernelized contextual bandit baselines in nonlinear and path-dependent settings.
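
To make the idea of "linear functionals in the signature space" concrete, the sketch below computes the signature of a piecewise-linear path truncated at level 2, using Chen's identity segment by segment. It is an illustration only, not the paper's implementation: the function names `signature_level2` and `signature_features`, the truncation depth, and the example path are all assumptions made here.

```python
from typing import List, Sequence, Tuple

def signature_level2(path: Sequence[Sequence[float]]) -> Tuple[List[float], List[List[float]]]:
    """Return (S1, S2): level-1 and level-2 signature terms of a piecewise-linear path.

    S1[i]    = total increment of coordinate i.
    S2[i][j] = iterated integral of dX^i followed by dX^j.
    """
    d = len(path[0])
    s1 = [0.0] * d
    s2 = [[0.0] * d for _ in range(d)]
    for prev, curr in zip(path, path[1:]):
        delta = [c - p for p, c in zip(prev, curr)]
        # Chen's identity for appending one linear segment:
        # S2 <- S2 + S1 (x) delta + 0.5 * delta (x) delta
        for i in range(d):
            for j in range(d):
                s2[i][j] += s1[i] * delta[j] + 0.5 * delta[i] * delta[j]
        # S1 <- S1 + delta
        s1 = [a + b for a, b in zip(s1, delta)]
    return s1, s2

def signature_features(path: Sequence[Sequence[float]]) -> List[float]:
    """Flatten (1, S1, S2) into one feature vector for a linear model."""
    s1, s2 = signature_level2(path)
    return [1.0] + s1 + [v for row in s2 for v in row]

# Hypothetical example path: move along x first, then along y.
example_path = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)]
features = signature_features(example_path)
```

A linear bandit method would then model the reward of an arm as an inner product between a weight vector and `features`; the off-diagonal level-2 terms are what let such a linear model distinguish "x then y" from "y then x".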

preprint, 2023, arXiv

A General Framework for Learning Mean-Field Games

This paper presents a general mean-field game (GMFG) framework for simultaneous learning and decision-making in stochastic games with a large population. It first establishes the existence of a unique Nash Equilibrium to this GMFG, and demonstrates that naively combining reinforcement learning with the fixed-point approach in classical MFGs yields unstable algorithms. It then proposes value-based and policy-based reinforcement learning algorithms (GMF-V and GMF-P, respectively) with smoothed policies, with analysis of their convergence properties and computational complexities. Experiments on an equilibrium product pricing problem demonstrate that GMF-V-Q and GMF-P-TRPO, two specific instantiations of GMF-V and GMF-P, respectively, with Q-learning and TRPO, are both efficient and robust in the GMFG setting. Moreover, their performance is superior in convergence speed, accuracy, and stability when compared with existing algorithms for multi-agent reinforcement learning in the $N$-player setting.

preprint, 2022, arXiv

Dynamic Programming Principles for Mean-Field Controls with Learning

The dynamic programming principle (DPP) is fundamental for control and optimization, including Markov decision problems (MDPs), reinforcement learning (RL), and more recently mean-field controls (MFCs). However, in the learning framework of MFCs, the DPP has not been rigorously established, despite its critical importance for algorithm design. In this paper, we first present a simple example in MFCs with learning where the DPP fails with a mis-specified Q function, and then propose the correct form of the Q function in an appropriate space for MFCs with learning. This particular form of Q function is different from the classical one and is called the IQ function. In the special case when the transition probability and the reward are independent of the mean-field information, it integrates the classical Q function for single-agent RL over the state-action distribution. In other words, MFCs with learning can be viewed as lifting classical RL by replacing the state-action space with its probability distribution space. This identification of the IQ function enables us to establish precisely the DPP in the learning framework of MFCs. Finally, we illustrate through numerical experiments the time consistency of this IQ function.

preprint, 2022, arXiv

Energetics of quantum vacuum friction. II: Dipole fluctuations and field fluctuations

As a second paper in series with arXiv:2108.01539, we discuss here quantum vacuum friction on an intrinsically dissipative particle. The friction arises not only from the field fluctuations but also from the dipole fluctuations intrinsic to the particle. As a result, the dissipative particle can be out of the nonequilibrium steady state (NESS), where it loses or gains internal energy. Only if the temperature of the particle equals a special NESS temperature will the particle be in NESS. We first derive the NESS conditions which give the NESS temperature of the particle as a function of the radiation temperature and the velocity of the particle. Imposing the NESS conditions, we then obtain an expression for the quantum vacuum friction in NESS. The NESS quantum vacuum friction is shown to be always negative definite, therefore a true drag. The NESS temperature and quantum vacuum friction are calculated numerically for various models. Out of NESS, even though the quantum vacuum frictional force no longer has a definite sign in the rest frame of the radiation, we find the external force needed to keep the particle moving must be in the same direction as the motion of the particle. This then excludes the possibility of making a perpetual motion machine, which could convert the vacuum energy into useful mechanical work. In addition, we find that the deviation of the temperature of the particle from its NESS temperature causes the particle to lose or gain internal energy in such a way that the particle would return to NESS after deviating from it. This enables experimental measurements of the NESS temperature of the particle to serve as a feasible signature for these quantum vacuum frictional effects.

preprint, 2022, arXiv

Escaping Saddle Points Efficiently with Occupation-Time-Adapted Perturbations

Motivated by the super-diffusivity of self-repelling random walk, which has roots in statistical physics, this paper develops a new perturbation mechanism for optimization algorithms. In this mechanism, perturbations are adapted to the history of states via the notion of occupation time. After integrating this mechanism into the framework of perturbed gradient descent (PGD) and perturbed accelerated gradient descent (PAGD), two new algorithms are proposed: perturbed gradient descent adapted to occupation time (PGDOT) and its accelerated version (PAGDOT). PGDOT and PAGDOT are shown to converge to second-order stationary points at least as fast as PGD and PAGD, respectively, and thus they are guaranteed to avoid getting stuck at non-degenerate saddle points. The theoretical analysis is corroborated by empirical studies in which the new algorithms consistently escape saddle points and outperform not only their counterparts, PGD and PAGD, but also other popular alternatives including stochastic gradient descent, Adam, AMSGrad, and RMSProp.
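
For readers unfamiliar with the baseline, the sketch below shows a simplified form of plain perturbed gradient descent (PGD) on a toy function with a saddle at the origin. It is not PGDOT: the occupation-time adaptation described in the abstract is not implemented, the usual termination check is omitted, and the test function, step size, radius, and thresholds are all assumptions chosen for illustration.

```python
import math
import random
from typing import Callable, List, Sequence

def toy_grad(p: Sequence[float]) -> List[float]:
    """Gradient of f(x, y) = x^2 + y^4/4 - y^2/2 (saddle at (0, 0), minima at (0, +-1))."""
    x, y = p
    return [2.0 * x, y ** 3 - y]

def perturbed_gd(grad: Callable[[Sequence[float]], List[float]],
                 start: Sequence[float],
                 eta: float = 0.1,       # step size (assumed)
                 radius: float = 1e-2,   # perturbation ball radius (assumed)
                 g_thresh: float = 1e-3, # "small gradient" threshold (assumed)
                 t_thresh: int = 50,     # min iterations between perturbations (assumed)
                 iters: int = 500,
                 seed: int = 0) -> List[float]:
    rng = random.Random(seed)
    p = list(start)
    last_perturb = -t_thresh
    for t in range(iters):
        g = grad(p)
        # When the gradient is tiny and no recent perturbation was applied,
        # add noise drawn uniformly from a small 2-D ball.
        if math.hypot(g[0], g[1]) <= g_thresh and t - last_perturb >= t_thresh:
            angle = rng.uniform(0.0, 2.0 * math.pi)
            rad = radius * math.sqrt(rng.random())
            p[0] += rad * math.cos(angle)
            p[1] += rad * math.sin(angle)
            last_perturb = t
            g = grad(p)
        # Ordinary gradient step.
        p = [pi - eta * gi for pi, gi in zip(p, g)]
    return p

def plain_gd(grad, start, eta=0.1, iters=500):
    p = list(start)
    for _ in range(iters):
        p = [pi - eta * gi for pi, gi in zip(p, grad(p))]
    return p

stuck = plain_gd(toy_grad, [0.0, 0.0])        # gradient is zero at the saddle
escaped = perturbed_gd(toy_grad, [0.0, 0.0])  # noise lets the iterate leave the saddle
```

Started exactly at the saddle, plain gradient descent never moves, while the perturbed variant drifts toward one of the two minima. The history-dependent perturbations of PGDOT and PAGDOT would replace the uniform noise used here.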

preprint, 2022, arXiv

Logarithmic regret for episodic continuous-time linear-quadratic reinforcement learning over a finite-time horizon

We study finite-time horizon continuous-time linear-quadratic reinforcement learning problems in an episodic setting, where both the state and control coefficients are unknown to the controller. We first propose a least-squares algorithm based on continuous-time observations and controls, and establish a logarithmic regret bound of order $O((\ln M)(\ln\ln M))$, with $M$ being the number of learning episodes. The analysis consists of two parts: perturbation analysis, which exploits the regularity and robustness of the associated Riccati differential equation; and parameter estimation error, which relies on sub-exponential properties of continuous-time least-squares estimators. We further propose a practically implementable least-squares algorithm based on discrete-time observations and piecewise constant controls, which achieves similar logarithmic regret with an additional term depending explicitly on the time stepsizes used in the algorithm.

preprint, 2022, arXiv

Meta-learning with GANs for anomaly detection, with deployment in high-speed rail inspection system

Anomaly detection has been an active research area with a wide range of potential applications. Key challenges for anomaly detection in the AI era with big data include lack of prior knowledge of potential anomaly types, highly complex and noisy backgrounds in input data, scarce abnormal samples, and imbalanced training datasets. In this work, we propose a meta-learning framework for anomaly detection to deal with these issues. Within this framework, we incorporate the idea of generative adversarial networks (GANs) with appropriate choices of loss functions, including the structural similarity index measure (SSIM). Experiments with limited labeled data for high-speed rail inspection demonstrate that our meta-learning framework is sharp and robust in identifying anomalies. Our framework has been deployed in five high-speed railways in China since 2021: it has reduced workload by more than 99.7% and saved 96.7% of inspection time.

preprint, 2022, arXiv

Out-of-distribution Generalization via Partial Feature Decorrelation

Most deep-learning-based image classification methods assume that all samples are generated under an independent and identically distributed (IID) setting. However, out-of-distribution (OOD) generalization is more common in practice, which means an agnostic context distribution shift between training and testing environments. To address this problem, we present a novel Partial Feature Decorrelation Learning (PFDL) algorithm, which jointly optimizes a feature decomposition network and the target image classification model. The feature decomposition network decomposes feature embeddings into the independent and the correlated parts such that the correlations between features will be highlighted. Then, the correlated features help learn a stable feature representation by decorrelating the highlighted correlations while optimizing the image classification model. We verify the correlation modeling ability of the feature decomposition network on a synthetic dataset. The experiments on real-world datasets demonstrate that our method can improve the backbone model's accuracy on OOD image classification datasets.

preprint, 2022, arXiv

Reinforcement learning for linear-convex models with jumps via stability analysis of feedback controls

We study finite-time horizon continuous-time linear-convex reinforcement learning problems in an episodic setting. In this problem, the unknown linear jump-diffusion process is controlled subject to nonsmooth convex costs. We show that the associated linear-convex control problems admit Lipschitz continuous optimal feedback controls and further prove the Lipschitz stability of the feedback controls, i.e., the performance gap between applying feedback controls for an incorrect model and for the true model depends Lipschitz-continuously on the magnitude of perturbations in the model coefficients; the proof relies on a stability analysis of the associated forward-backward stochastic differential equation. We then propose a novel least-squares algorithm which achieves a regret of the order $O(\sqrt{N\ln N})$ on linear-convex learning problems with jumps, where $N$ is the number of learning episodes; the analysis leverages the Lipschitz stability of feedback controls and concentration properties of sub-Weibull random variables. A numerical experiment confirms the convergence and the robustness of the proposed algorithm.

preprint, 2022, arXiv

Transfer Learning for Retinal Vascular Disease Detection: A Pilot Study with Diabetic Retinopathy and Retinopathy of Prematurity

Retinal vascular diseases affect the well-being of the human body and sometimes provide vital signs of otherwise undetected bodily damage. Recently, deep learning techniques have been successfully applied to the detection of diabetic retinopathy (DR). The main obstacle to applying deep learning techniques to detect most other retinal vascular diseases is the limited amount of data available. In this paper, we propose a transfer learning technique that aims to utilize the feature similarities for detecting retinal vascular diseases. We choose the well-studied DR detection as a source task and identify the early detection of retinopathy of prematurity (ROP) as the target task. Our experimental results demonstrate that our DR-pretrained approach dominates, in all metrics, the conventional ImageNet-pretrained transfer learning approach currently adopted in medical image analysis. Moreover, our approach is more robust with respect to the stochasticity in the training process and with respect to reduced training samples. This study suggests the potential of our proposed transfer learning approach for a broad range of retinal vascular diseases or pathologies, where data is limited.

preprint, 2021, arXiv

Edge-Labeling based Directed Gated Graph Network for Few-shot Learning

Existing graph-network-based few-shot learning methods obtain similarity between nodes through a convolutional neural network (CNN). However, the CNN is designed for image data with spatial information rather than node features in vector form. In this paper, we propose an edge-labeling-based directed gated graph network (DGGN) for few-shot learning, which utilizes gated recurrent units to implicitly update the similarity between nodes. DGGN is composed of a gated node aggregation module and an improved gated recurrent unit (GRU) based edge update module. Specifically, the node update module adopts a gate mechanism using the activation of edge features, making the node aggregation process learnable. In addition, improved GRU cells are employed in the edge update procedure to compute the similarity between nodes. Further, this mechanism is beneficial to gradient backpropagation through the GRU sequence across layers. Experimental results on two benchmark datasets show that our DGGN achieves performance comparable to state-of-the-art methods.

preprint, 2020, arXiv

A useful technique for piecewise deterministic Markov decision processes

This paper presents, with justifications, a technique that is useful for the study of piecewise deterministic Markov decision processes (PDMDPs) with general policies and unbounded transition intensities. This technique produces an auxiliary PDMDP from the original one. As will be discussed and clarified, the auxiliary PDMDP possesses certain desired properties, which may not be possessed by the original PDMDP. Moreover, the performance measure of any policy in the original PDMDP can be replicated by the auxiliary PDMDP for a large class of performance criteria. As an application, we apply this technique to risk-sensitive PDMDPs with total cost criteria.

preprint, 2020, arXiv

Accelerating CNN Training by Pruning Activation Gradients

Sparsification is an efficient approach to accelerate CNN inference, but it is challenging to take advantage of sparsity in the training procedure because the gradients involved change dynamically. An important observation is that most of the activation gradients in back-propagation are very close to zero and have only a tiny impact on weight updating. Hence, we consider pruning these very small gradients randomly to accelerate CNN training according to the statistical distribution of activation gradients. Meanwhile, we theoretically analyze the impact of the pruning algorithm on convergence. The proposed approach is evaluated on AlexNet and ResNet-{18, 34, 50, 101, 152} with the CIFAR-{10, 100} and ImageNet datasets. Experimental results show that our training approach achieves speedups of up to $5.92 \times$ at the back-propagation stage with negligible accuracy loss.
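
One way to prune small gradients randomly while keeping the gradient estimate unbiased is sketched below. This is an assumed scheme for illustration and is not confirmed by the abstract as the paper's exact rule: entries with magnitude below a threshold `tau` are either zeroed or promoted to `±tau`, with probabilities chosen so that the expected value is unchanged. The name `stochastic_prune`, the threshold value, and the sample gradients are hypothetical.

```python
import math
import random
from typing import List, Sequence

def stochastic_prune(grads: Sequence[float], tau: float, rng: random.Random) -> List[float]:
    """Randomly prune entries with |g| < tau (tau > 0) so that E[output] == g for every entry."""
    pruned = []
    for g in grads:
        if abs(g) >= tau:
            # Large gradients pass through untouched.
            pruned.append(g)
        elif rng.random() < abs(g) / tau:
            # Keep with probability |g| / tau, rescaled to magnitude tau.
            pruned.append(math.copysign(tau, g))
        else:
            # Otherwise drop to exactly zero, which creates sparsity.
            pruned.append(0.0)
    return pruned

# Hypothetical gradient values: two large entries and many small ones.
rng = random.Random(1)
sample = [0.5, -0.8] + [0.03] * 100_000
result = stochastic_prune(sample, tau=0.1, rng=rng)
small = result[2:]
sparsity = small.count(0.0) / len(small)   # roughly 1 - 0.03 / 0.1
mean_small = sum(small) / len(small)       # close to the original 0.03
```

Because each small entry survives with probability `|g| / tau` at magnitude `tau`, its expectation equals `g`, so the sparse gradient stays unbiased while most small entries become exact zeros.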

preprint, 2020, arXiv

Approximation of Mean Field Games to N-Player Stochastic Games, with Singular Controls

This paper establishes that $N$-player stochastic games with singular controls, either of bounded velocity or of finite variation, can both be approximated by mean field games (MFGs) with singular controls of bounded velocity. More specifically, it shows that (i) the optimal control to an MFG with singular controls of a bounded velocity $θ$ is an $ε_N$-NE to an $N$-player game with singular controls of the bounded velocity, with $ε_N = O(\frac{1}{\sqrt{N}})$, and (ii) the optimal control to this MFG is an $(ε_N + ε_θ)$-NE to an $N$-player game with singular controls of finite variation, where $ε_θ$ is an error term that depends on $θ$. This work generalizes the classical result on the approximation of $N$-player games by MFGs, by allowing for discontinuous controls.

preprint, 2020, arXiv

Audio-video Emotion Recognition in the Wild using Deep Hybrid Networks

This paper presents an audiovisual-based emotion recognition hybrid network. While most of the previous work focuses either on using deep models or hand-engineered features extracted from images, we explore multiple deep models built on both images and audio signals. Specifically, in addition to convolutional neural networks (CNN) and recurrent neural networks (RNN) trained on facial images, the hybrid network also contains one SVM classifier trained on holistic acoustic feature vectors, one long short-term memory network (LSTM) trained on short-term feature sequences extracted from segmented audio clips, and one Inception(v2)-LSTM network trained on image-like maps, which are built based on short-term acoustic feature sequences. Experimental results show that the proposed hybrid network outperforms the baseline method by a large margin.

preprint, 2020, arXiv

Early Detection of Retinopathy of Prematurity (ROP) in Retinal Fundus Images Via Convolutional Neural Networks

Retinopathy of prematurity (ROP) is an abnormal blood vessel development in the retina of a prematurely born infant or an infant with low birth weight. ROP is one of the leading causes of infant blindness globally. Early detection of ROP is critical to slow down and avert the progression to vision impairment caused by ROP. Yet there is limited awareness of ROP even among medical professionals. Consequently, datasets for ROP are limited, if available at all, and are in general extremely imbalanced in terms of the ratio between negative images and positive ones. In this study, we formulate the problem of detecting ROP in retinal fundus images in an optimization framework, and apply state-of-the-art convolutional neural network techniques to solve this problem. Experimental results based on our models achieve 100 percent sensitivity, 96 percent specificity, 98 percent accuracy, and 96 percent precision. In addition, our study shows that as the network gets deeper, more significant features can be extracted for better understanding of ROP.

preprint, 2020, arXiv

Electrodynamic friction of a charged particle passing a conducting plate

The classical electromagnetic friction of a charged particle moving with prescribed constant velocity parallel to a planar imperfectly conducting surface is reinvestigated. As a concrete example, the Drude model is used to describe the conductor. The transverse electric and transverse magnetic contributions have very different character both in the low velocity (nonrelativistic) and high velocity (ultrarelativistic) regimes. Both numerical and analytical results are given. Most remarkably, the transverse magnetic contribution to the friction has a maximum for $|\mathbf{v}|<c$, and persists in the limit of vanishing resistivity for sufficiently high velocities. We also show how Vavilov-Čerenkov radiation can be treated in the same formalism.

preprint, 2020, arXiv

Graph Neural Networks for Image Understanding Based on Multiple Cues: Group Emotion Recognition and Event Recognition as Use Cases

A graph neural network (GNN) for image understanding based on multiple cues is proposed in this paper. Compared to traditional feature and decision fusion approaches that neglect the fact that features can interact and exchange information, the proposed GNN is able to pass information among features extracted from different models. Two image understanding tasks, namely group-level emotion recognition (GER) and event recognition, which are highly semantic and require the interaction of several deep models to synthesize multiple cues, were selected to validate the performance of the proposed method. It is shown through experiments that the proposed method achieves state-of-the-art performance on the selected image understanding tasks. In addition, a new group-level emotion recognition database is introduced and shared in this paper.

preprint, 2020, arXiv

Kernel-based L_2-Boosting with Structure Constraints

Developing efficient kernel methods for regression has been very popular over the past decade. In this paper, utilizing boosting on kernel-based weak learners, we propose a novel kernel-based learning algorithm called kernel-based re-scaled boosting with truncation, dubbed KReBooT. The proposed KReBooT helps control the structure of estimators and produce sparse estimates, and is nearly resistant to overfitting. We conduct both theoretical analysis and numerical simulations to illustrate the power of KReBooT. Theoretically, we prove that KReBooT can achieve the almost optimal numerical convergence rate for nonlinear approximation. Furthermore, using the recently developed integral operator approach and a variant of Talagrand's concentration inequality, we provide fast learning rates for KReBooT, which set a new record for boosting-type algorithms. Numerically, we carry out a series of simulations to show the promising performance of KReBooT in terms of its good generalization, near resistance to overfitting, and structure constraints.

preprint, 2020, arXiv

MFGs for partially reversible investment

This paper analyzes a class of infinite-time-horizon stochastic games with singular controls motivated by the partially reversible investment problem. It provides an explicit solution for the mean-field game (MFG) and presents sensitivity analysis to compare the solution for the MFG with that for the single-agent control problem. It shows that in the MFG, model parameters not only affect the optimal strategies as in the single-agent case, but also influence the equilibrium price. It then establishes that the solution to the MFG is an $ε$-Nash Equilibrium to the corresponding $N$-player game, with $ε=O\left(\frac{1}{\sqrt N}\right)$.

preprint, 2020, arXiv

PAC-Bayesian Generalization Bounds for MultiLayer Perceptrons

We study PAC-Bayesian generalization bounds for Multilayer Perceptrons (MLPs) with the cross-entropy loss. First, we introduce probabilistic explanations for MLPs in two aspects: (i) MLPs formulate a family of Gibbs distributions, and (ii) minimizing the cross-entropy loss for MLPs is equivalent to Bayesian variational inference, which establishes a solid probabilistic foundation for studying PAC-Bayesian bounds on MLPs. Furthermore, based on the Evidence Lower Bound (ELBO), we prove that MLPs with the cross-entropy loss inherently guarantee PAC-Bayesian generalization bounds, and minimizing PAC-Bayesian generalization bounds for MLPs is equivalent to maximizing the ELBO. Finally, we validate the proposed PAC-Bayesian generalization bound on benchmark datasets.

preprint, 2020, arXiv

Prediction and analysis of Coronavirus Disease 2019

In December 2019, a novel coronavirus was found in a seafood wholesale market in Wuhan, China. WHO officially named this coronavirus COVID-19. Since the first patient was hospitalized on December 12, 2019, China has reported a total of 78,824 confirmed COVID-19 cases and 2,788 deaths as of February 28, 2020. Wuhan's cumulative confirmed cases and deaths accounted for 61.1% and 76.5% of those in mainland China, making it the priority center for epidemic prevention and control. Meanwhile, 51 countries and regions outside China have reported 4,879 confirmed cases and 79 deaths as of February 28, 2020. The COVID-19 epidemic does great harm to people's daily life and the country's economic development. This paper adopts three kinds of mathematical models, i.e., the Logistic model, the Bertalanffy model and the Gompertz model. The epidemic trends of SARS were first fitted and analyzed in order to prove the validity of the existing mathematical models. The results were then used to fit and analyze the situation of COVID-19. The prediction results of the three mathematical models differ for different parameters and in different regions. In general, the fitting effect of the Logistic model may be the best among the three models studied in this paper, while the fitting effect of the Gompertz model may be better than that of the Bertalanffy model. According to the current trend, based on the three models, the total number of people expected to be infected is 49852-57447 in Wuhan, 12972-13405 in non-Hubei areas and 80261-85140 in China, respectively. The total death toll is 2502-5108 in Wuhan, 107-125 in non-Hubei areas and 3150-6286 in China, respectively. COVID-19 will probably be over in late April 2020 in Wuhan and before late March 2020 in other areas.
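
The three growth curves named in the abstract can be sketched as follows. The parametrizations are common textbook forms and are an assumption here; the paper may use different ones. The data below is synthetic, generated from the logistic curve itself, and the helper names `sse` and `best_capacity` are hypothetical.

```python
import math
from typing import Callable, Sequence

def logistic(t: float, K: float, a: float, b: float) -> float:
    """Logistic curve: K / (1 + a * exp(-b * t)); K is the final size."""
    return K / (1.0 + a * math.exp(-b * t))

def gompertz(t: float, K: float, b: float, c: float) -> float:
    """Gompertz curve: K * exp(-b * exp(-c * t))."""
    return K * math.exp(-b * math.exp(-c * t))

def bertalanffy(t: float, K: float, b: float, c: float) -> float:
    """von Bertalanffy-type curve: K * (1 - exp(-b * t)) ** c, for t >= 0."""
    return K * (1.0 - math.exp(-b * t)) ** c

def sse(model: Callable[[float], float], ts: Sequence[float], ys: Sequence[float]) -> float:
    """Sum of squared errors between a fitted curve and observations."""
    return sum((model(t) - y) ** 2 for t, y in zip(ts, ys))

def best_capacity(ts, ys, candidates, a, b):
    """Coarse grid search over the logistic final size K (other parameters held fixed)."""
    return min(candidates, key=lambda K: sse(lambda t: logistic(t, K, a, b), ts, ys))

# Synthetic cumulative counts from a logistic curve with K = 100 (not real data).
days = list(range(0, 30))
counts = [logistic(t, 100.0, 20.0, 0.4) for t in days]
K_hat = best_capacity(days, counts, [80.0, 100.0, 120.0], a=20.0, b=0.4)
```

All three curves saturate at `K` as `t` grows, which is why each yields a predicted final epidemic size. In practice all parameters would be fitted jointly with a nonlinear least-squares routine rather than the one-dimensional grid search used here.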

preprint, 2020, arXiv

Self-force on moving electric and magnetic dipoles: dipole radiation, Vavilov-Čerenkov radiation, friction with a conducting surface, and the Einstein-Hopf effect

The classical electromagnetic self-force on an arbitrary time-dependent electric or magnetic dipole moving with constant velocity in vacuum, and in a medium, is considered. Of course, in vacuum there is no net force on such a particle. Rather, because of loss of mass by the particle due to radiation, the self-force precisely cancels this inertial effect, and thus the spectral distribution of the energy radiated by dipole radiation is deduced without any consideration of radiation fields or of radiation reaction, in both the nonrelativistic and relativistic regimes. If the particle is moving in a homogeneous medium faster than the speed of light in the medium, Vavilov-Čerenkov radiation results. This is derived for the different polarization states, in agreement with the earlier results of Frank. The friction experienced by a point (time-independent) dipole moving parallel to an imperfectly conducting surface is examined. Finally, the quantum/thermal Einstein-Hopf effect is rederived. We obtain a closed form for the spectral distribution of the force, and demonstrate that, even if the atom and the blackbody background have independent temperatures, the force is indeed a drag in the case that the imaginary part of the polarizability is proportional to a power of the frequency.