Source author record

Peng Zhao

Peng Zhao appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Catalog footprint

What is connected

48 works

24 topics

4 close collaborators

preprint · 2026 · arXiv

AeroSketch: Near-Optimal Time Matrix Sketch Framework for Persistent, Sliding Window, and Distributed Streams

Many real-world matrix datasets arrive as high-throughput vector streams, making it impractical to store or process them in their entirety. To enable real-time analytics under limited computational, memory, and communication resources, matrix sketching techniques have been developed over recent decades to provide compact approximations of such streaming data. Some algorithms have achieved optimal space and communication complexity. However, these approaches often require frequent time-consuming matrix factorization operations. In particular, under tight approximation error bounds, each matrix factorization computation incurs cubic time complexity, thereby limiting their update efficiency. In this paper, we introduce AeroSketch, a novel matrix sketching framework that leverages recent advances in randomized numerical linear algebra (RandNLA). AeroSketch achieves optimal communication and space costs while delivering near-optimal update time complexity (within logarithmic factors) across persistent, sliding window, and distributed streaming scenarios. Extensive experiments on both synthetic and real-world datasets demonstrate that AeroSketch consistently outperforms state-of-the-art methods in update throughput. In particular, under tight approximation error constraints, AeroSketch reduces the cubic time complexity to the quadratic level. Meanwhile, it maintains comparable approximation quality while retaining optimal communication and space costs.

preprint · 2026 · arXiv

Breaking Coordinate Overfitting: Geometry-Aware WiFi Sensing for Cross-Layout 3D Pose Estimation

WiFi-based 3D human pose estimation offers a low-cost and privacy-preserving alternative to vision-based systems for smart interaction. However, existing approaches rely on visual 3D poses as supervision and directly regress CSI to a camera-based coordinate system. We find that this practice leads to coordinate overfitting: models memorize deployment-specific WiFi transceiver layouts rather than only learning activity-relevant representations, resulting in severe generalization failures. To address this challenge, we present PerceptAlign, the first geometry-conditioned framework for WiFi-based cross-layout pose estimation. PerceptAlign introduces a lightweight coordinate unification procedure that aligns WiFi and vision measurements in a shared 3D space using only two checkerboards and a few photos. Within this unified space, it encodes calibrated transceiver positions into high-dimensional embeddings and fuses them with CSI features, making the model explicitly aware of device geometry as a conditional variable. This design forces the network to disentangle human motion from deployment layouts, enabling robust and, for the first time, layout-invariant WiFi pose estimation. To support systematic evaluation, we construct the largest cross-domain 3D WiFi pose estimation dataset to date, comprising 21 subjects, 5 scenes, 18 actions, and 7 device layouts. Experiments show that PerceptAlign reduces in-domain error by 12.3% and cross-domain error by more than 60% compared to state-of-the-art baselines. These results establish geometry-conditioned learning as a viable path toward scalable and practical WiFi sensing.

preprint · 2026 · arXiv

Dynamic Chunking for Diffusion Language Models

Block discrete diffusion language models factorize a sequence autoregressively over fixed-size positional blocks, decoupling within-block parallel denoising from across-block conditioning. We argue that this rigid partition wastes structure already present in the sequence: blocks defined by position rather than by content separate semantically coherent tokens and group unrelated ones together. We introduce the Dynamic Chunking Diffusion Model (DCDM), which replaces positional blocks with content-defined semantic chunks. At its core is Chunking Attention, a differentiable layer that routes tokens into $K$ clusters parameterized by learnable subspaces and shaped end-to-end by the diffusion objective. The resulting cluster assignments induce a chunk-causal attention mask under which a discrete diffusion denoiser factorizes the sequence likelihood autoregressively over semantic chunks, strictly generalizing block discrete diffusion. On downstream benchmarks at parameter scales up to 1.5B, DCDM consistently improves over both unstructured and positional-block diffusion baselines, with the advantage stable across scales and visible early in training.
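
To make the chunk-causal mask concrete, here is a minimal sketch in Python. It is not the DCDM implementation: the function names are hypothetical, the chunk assignments are given as plain integers instead of being learned, and ordering chunks by their integer id is an assumption made only for illustration. Tokens attend freely within their own chunk and to every earlier chunk.

```python
from typing import List


def chunk_causal_mask(chunk_ids: List[int]) -> List[List[bool]]:
    """mask[i][j] is True when token i may attend to token j.

    Assumption: chunks are ordered by their integer id, so a token sees
    its own chunk (bidirectionally) and every chunk with a smaller id.
    """
    n = len(chunk_ids)
    return [[chunk_ids[j] <= chunk_ids[i] for j in range(n)] for i in range(n)]


def positional_blocks(seq_len: int, block_size: int) -> List[int]:
    """Fixed-size positional blocks expressed as chunk ids."""
    return [pos // block_size for pos in range(seq_len)]


# Positional blocks of size 2 give the usual block-causal mask.
block_mask = chunk_causal_mask(positional_blocks(4, 2))

# Content-defined chunks may interleave positions: tokens 0 and 2 share a chunk.
semantic_mask = chunk_causal_mask([0, 1, 0, 1])
```

Choosing `chunk_ids` as `pos // block_size` reproduces the positional-block mask, which is one way to read the claim that chunking generalizes block discrete diffusion.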

preprint · 2026 · arXiv

Revisiting Weighted Strategy for Non-stationary Parametric Bandits and MDPs

Non-stationary parametric bandits have attracted much attention recently. There are three principled ways to deal with non-stationarity: sliding-window, weighted, and restart strategies. As many non-stationary environments exhibit gradual drifting patterns, the weighted strategy is commonly adopted in real-world applications. However, previous theoretical studies show that its analysis is more involved and the algorithms are either computationally less efficient or statistically suboptimal. This paper revisits the weighted strategy for non-stationary parametric bandits. In linear bandits (LB), we discover that this undesirable feature is due to an inadequate regret analysis, which results in an overly complex algorithm design. We propose a refined analysis framework, which simplifies the derivation and, importantly, produces a simpler weight-based algorithm that is as efficient as window/restart-based algorithms while retaining the same regret as previous studies. Furthermore, our new framework can be used to improve regret bounds of other parametric bandits, including Generalized Linear Bandits (GLB) and Self-Concordant Bandits (SCB). For example, we develop a simple weighted GLB algorithm with an $\tilde{O}(k_\mu^{5/4} c_\mu^{-3/4} d^{3/4} P_T^{1/4}T^{3/4})$ regret, improving the $\tilde{O}(k_\mu^{2} c_\mu^{-1}d^{9/10} P_T^{1/5}T^{4/5})$ bound in prior work, where $k_\mu$ and $c_\mu$ characterize the reward model's nonlinearity, $P_T$ measures the non-stationarity, and $d$ and $T$ denote the dimension and time horizon. Moreover, we extend our framework to non-stationary Markov Decision Processes (MDPs) with function approximation, focusing on Linear Mixture MDP and Multinomial Logit (MNL) Mixture MDP. For both classes, we propose algorithms based on the weighted strategy and establish dynamic regret guarantees using our analysis framework.
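
The weighted strategy can be illustrated with an exponentially discounted least-squares estimate. The sketch below is a generic scalar version written for this page, not the algorithm from the paper: the function name, the discount factor `gamma`, and the regularizer `lam` are all illustrative assumptions, and the confidence bounds a bandit algorithm would need are omitted.

```python
def discounted_ridge(features, rewards, gamma, lam=1.0):
    """Scalar discounted ridge regression (illustrative sketch).

    A round that is k steps old receives weight gamma**k, so gamma = 1.0
    recovers ordinary (unweighted) ridge regression.
    """
    v = lam  # discounted, regularized sum of squared features
    b = 0.0  # discounted sum of feature * reward
    for x, r in zip(features, rewards):
        # The (1 - gamma) * lam term keeps the regularizer fixed at lam.
        v = gamma * v + x * x + (1.0 - gamma) * lam
        b = gamma * b + x * r
    return b / v


# Toy drifting environment: the true parameter flips from +1 to -1 halfway.
features = [1.0] * 200
rewards = [1.0] * 100 + [-1.0] * 100

static_estimate = discounted_ridge(features, rewards, gamma=1.0)
weighted_estimate = discounted_ridge(features, rewards, gamma=0.9)
```

On this toy stream the unweighted estimate averages the two regimes to zero, while the discounted estimate sits close to the current parameter of -1.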

preprint · 2023 · arXiv

Adapting to Online Label Shift with Provable Guarantees

The standard supervised learning paradigm works effectively when training data shares the same distribution as the upcoming testing samples. However, this stationary assumption is often violated in real-world applications, especially when testing data appear in an online fashion. In this paper, we formulate and investigate the problem of online label shift (OLaS): the learner trains an initial model from the labeled offline data and then deploys it to an unlabeled online environment where the underlying label distribution changes over time but the label-conditional density does not. The non-stationary nature of the problem and the lack of supervision make it challenging to tackle. To address the difficulty, we construct a new unbiased risk estimator that utilizes the unlabeled data, which exhibits many benign properties albeit with potential non-convexity. Building upon that, we propose novel online ensemble algorithms to deal with the non-stationarity of the environments. Our approach enjoys optimal dynamic regret, indicating that the performance is competitive with a clairvoyant who knows the online environments in hindsight and then chooses the best decision for each round. The obtained dynamic regret bound scales with the intensity and pattern of label distribution shift, hence exhibiting the adaptivity in the OLaS problem. Extensive experiments are conducted to validate the effectiveness and support our theoretical findings.
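
Why unlabeled data is informative under label shift can be shown with a small worked example. When the label-conditional density is fixed, the classifier's confusion matrix measured offline stays valid online, so the frequencies of its predictions determine the current label distribution through a linear system. The two-class sketch below illustrates that standard idea only; it is not claimed to be the risk estimator from the paper, and the function name and numbers are hypothetical.

```python
def estimate_label_distribution(confusion, predicted_freq):
    """Solve confusion @ q = predicted_freq for a two-class problem.

    confusion[i][j] = P(prediction = i | true label = j), measured on
    labeled offline data. predicted_freq[i] is the fraction of unlabeled
    online samples predicted as class i.
    """
    (a, b), (c, d) = confusion
    det = a * d - b * c
    if abs(det) < 1e-12:
        raise ValueError("confusion matrix is singular")
    # Closed-form inverse of a 2x2 matrix applied to predicted_freq.
    q0 = (d * predicted_freq[0] - b * predicted_freq[1]) / det
    q1 = (-c * predicted_freq[0] + a * predicted_freq[1]) / det
    return [q0, q1]


# Hypothetical classifier: 90% accurate on class 0, 80% accurate on class 1.
confusion = [[0.9, 0.2], [0.1, 0.8]]
# Prediction frequencies generated by a true label distribution of [0.3, 0.7].
predicted_freq = [0.41, 0.59]

estimated = estimate_label_distribution(confusion, predicted_freq)
```

Here `estimated` recovers the label distribution `[0.3, 0.7]` up to floating-point error, without using any online labels.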

preprint · 2023 · arXiv

FedLED: Label-Free Equipment Fault Diagnosis with Vertical Federated Transfer Learning

Intelligent equipment fault diagnosis based on Federated Transfer Learning (FTL) attracts considerable attention from both academia and industry. It allows real-world industrial agents with limited samples to construct a fault diagnosis model without jeopardizing their raw data privacy. Existing approaches, however, can address neither the intense sample heterogeneity caused by different working conditions of practical agents, nor the extreme scarcity of fault labels, which may even be absent entirely, on newly deployed equipment. To address these issues, we present FedLED, the first unsupervised vertical FTL equipment fault diagnosis method, where knowledge of the unlabeled target domain is further exploited for effective unsupervised model transfer. Results of extensive experiments using real equipment monitoring data demonstrate that FedLED clearly outperforms SOTA approaches in terms of both diagnosis accuracy (up to 4.13 times) and generality. We expect our work to inspire further study on label-free equipment fault diagnosis systematically enhanced by target domain knowledge.

preprint · 2022 · arXiv

A Comparative Study of Deep Learning Classification Methods on a Small Environmental Microorganism Image Dataset (EMDS-6): from Convolutional Neural Networks to Visual Transformers

In recent years, deep learning has made remarkable progress in Environmental Microorganism (EM) image classification. However, image classification on small EM datasets has still not produced good results. Therefore, researchers need to spend a lot of time searching for models that classify well and suit their current equipment. To provide reliable references for researchers, we conduct a series of comparison experiments on 21 deep learning models. The experiments include direct classification, imbalanced training, and hyperparameter tuning. During the experiments, we find complementarities among the 21 models, which is the basis for feature fusion related experiments. We also find that data augmentation by geometric deformation hardly improves the performance of the VT series models (ViT, DeiT, BotNet and T2T-ViT). In terms of model performance, Xception has the best classification performance, the ViT model consumes the least time for training, and the ShuffleNet-V2 model has the fewest parameters.

preprint · 2022 · arXiv

Adaptive Bandit Convex Optimization with Heterogeneous Curvature

We consider the problem of adversarial bandit convex optimization, that is, online learning over a sequence of arbitrary convex loss functions with only one function evaluation for each of them. While all previous works assume known and homogeneous curvature on these loss functions, we study a heterogeneous setting where each function has its own curvature that is only revealed after the learner makes a decision. We develop an efficient algorithm that is able to adapt to the curvature on the fly. Specifically, our algorithm not only recovers or even improves existing results for several homogeneous settings, but also leads to surprising results for some heterogeneous settings. For example, while Hazan and Levy (2014) showed that $\widetilde{O}(d^{3/2}\sqrt{T})$ regret is achievable for a sequence of $T$ smooth and strongly convex $d$-dimensional functions, our algorithm reveals that the same is achievable even if $T^{3/4}$ of them are not strongly convex, and sometimes even if a constant fraction of them are not strongly convex. Our approach is inspired by the framework of Bartlett et al. (2007) who studied a similar heterogeneous setting but with stronger gradient feedback. Extending their framework to the bandit feedback setting requires novel ideas such as lifting the feasible domain and using a logarithmically homogeneous self-concordant barrier regularizer.

preprint · 2022 · arXiv

Combating fluctuations in relaxation times of fixed-frequency transmon qubits with microwave-dressed states

With its long coherence time, the fixed-frequency transmon qubit is a promising qubit modality for quantum computing. Currently, diverse qubit architectures that utilize fixed-frequency transmon qubits have been demonstrated with high-fidelity gate performance. Nevertheless, the relaxation times of transmon qubits can have large temporal fluctuations, causing instabilities in gate performance. The fluctuations are often believed to be caused by nearly on-resonance couplings with sparse two-level-system (TLS) defects. To mitigate their impact on qubit coherence and gate performance, one direct approach is to tune the qubits away from these TLSs. In this work, to combat the potential TLS-induced performance fluctuations in a tunable-bus architecture utilizing fixed-frequency transmon qubits, we explore the possibility of using an off-resonance microwave drive to effectively tune the qubit frequency through the ac-Stark shift while implementing universal gate operations on the microwave-dressed qubit. We show that the qubit frequency can be tuned up to 20 MHz through the ac-Stark shift while keeping minimal impacts on the qubit control. Besides passive approaches that aim to remove these TLSs through more careful treatments of device fabrications, this work may offer an active approach towards mitigating the TLS-induced performance fluctuations in fixed-frequency transmon qubit devices.

preprint · 2022 · arXiv

Contrastive Multi-view Hyperbolic Hierarchical Clustering

Hierarchical clustering recursively partitions data at an increasingly finer granularity. In real-world applications, multi-view data have become increasingly important. This raises a less investigated problem, i.e., multi-view hierarchical clustering, to better understand the hierarchical structure of multi-view data. To this end, we propose a novel neural network-based model, namely Contrastive Multi-view Hyperbolic Hierarchical Clustering (CMHHC). It consists of three components, i.e., multi-view alignment learning, aligned feature similarity learning, and continuous hyperbolic hierarchical clustering. First, we align sample-level representations across multiple views in a contrastive way to capture the view-invariance information. Next, we utilize both the manifold and Euclidean similarities to improve the metric property. Then, we embed the representations into a hyperbolic space and optimize the hyperbolic embeddings via a continuous relaxation of hierarchical clustering loss. Finally, a binary clustering tree is decoded from optimized hyperbolic embeddings. Experimental results on five real-world datasets demonstrate the effectiveness of the proposed method and its components.

preprint · 2022 · arXiv

Corralling a Larger Band of Bandits: A Case Study on Switching Regret for Linear Bandits

We consider the problem of combining and learning over a set of adversarial bandit algorithms with the goal of adaptively tracking the best one on the fly. The CORRAL algorithm of Agarwal et al. (2017) and its variants (Foster et al., 2020a) achieve this goal with a regret overhead of order $\widetilde{O}(\sqrt{MT})$ where $M$ is the number of base algorithms and $T$ is the time horizon. The polynomial dependence on $M$, however, prevents one from applying these algorithms to many applications where $M$ is poly$(T)$ or even larger. Motivated by this issue, we propose a new recipe to corral a larger band of bandit algorithms whose regret overhead has only logarithmic dependence on $M$ as long as some conditions are satisfied. As the main example, we apply our recipe to the problem of adversarial linear bandits over a $d$-dimensional $\ell_p$ unit-ball for $p \in (1,2]$. By corralling a large set of $T$ base algorithms, each starting at a different time step, our final algorithm achieves the first optimal switching regret $\widetilde{O}(\sqrt{d S T})$ when competing against a sequence of comparators with $S$ switches (for some known $S$). We further extend our results to linear bandits over a smooth and strongly convex domain as well as unconstrained linear bandits.

preprint · 2022 · arXiv

Dynamic Regret of Online Markov Decision Processes

We investigate online Markov Decision Processes (MDPs) with adversarially changing loss functions and known transitions. We choose dynamic regret as the performance measure, defined as the performance difference between the learner and any sequence of feasible changing policies. The measure is strictly stronger than the standard static regret that benchmarks the learner's performance with a fixed compared policy. We consider three foundational models of online MDPs, including episodic loop-free Stochastic Shortest Path (SSP), episodic SSP, and infinite-horizon MDPs. For these three models, we propose novel online ensemble algorithms and establish their dynamic regret guarantees respectively, in which the results for episodic (loop-free) SSP are provably minimax optimal in terms of time horizon and certain non-stationarity measure. Furthermore, when the online environments encountered by the learner are predictable, we design improved algorithms and achieve better dynamic regret bounds for the episodic (loop-free) SSP; and moreover, we demonstrate impossibility results for the infinite-horizon MDPs.

preprint · 2022 · arXiv

EMDS-6: Environmental Microorganism Image Dataset Sixth Version for Image Denoising, Segmentation, Feature Extraction, Classification and Detection Methods Evaluation

Environmental microorganisms (EMs) are ubiquitous around us and have an important impact on the survival and development of human society. However, the high standards and strict requirements for the preparation of environmental microorganism (EM) data have led to a shortage of related databases, especially databases with ground truth (GT) images. This problem seriously affects the progress of related experiments. Therefore, this study develops the Environmental Microorganism Dataset Sixth Version (EMDS-6), which contains 21 types of EMs. Each type of EM contains 40 original and 40 GT images, in total 1680 EM images. To test the effectiveness of EMDS-6, we choose classic image processing methods such as image denoising, image segmentation and target detection. The experimental results show that EMDS-6 can be used to evaluate the performance of image denoising, image segmentation, image feature extraction, image classification, and object detection methods.

preprint · 2022 · arXiv

High-Dimensional Linear Regression via Implicit Regularization

Many statistical estimators for high-dimensional linear regression are M-estimators, formed through minimizing a data-dependent square loss function plus a regularizer. This work considers a new class of estimators implicitly defined through a discretized gradient dynamic system under overparameterization. We show that under suitable restricted isometry conditions, overparameterization leads to implicit regularization: if we directly apply gradient descent to the residual sum of squares with sufficiently small initial values, then under some proper early stopping rule, the iterates converge to a nearly sparse rate-optimal solution that improves over explicitly regularized approaches. In particular, the resulting estimator does not suffer from extra bias due to explicit penalties, and can achieve the parametric root-n rate when the signal-to-noise ratio is sufficiently high. We also perform simulations to compare our methods with high dimensional linear regression with explicit regularization. Our results illustrate the advantages of using implicit regularization via gradient descent after overparameterization in sparse vector estimation.
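
The mechanism described above (small initialization, plain gradient descent, early stopping) can be seen in a toy setting. The sketch below makes two simplifying assumptions that are not taken from the abstract: the design is orthonormal, so the problem decouples into coordinates and the data reduce to a noisy copy `z` of the signal, and the overparameterization is `beta = u*u - v*v`, which is one common choice. The function name and all constants are illustrative.

```python
def implicit_regularization(z, alpha=1e-3, step=0.1, num_steps=80):
    """Gradient descent on 0.5 * sum((u_j^2 - v_j^2 - z_j)^2) from a small init.

    z plays the role of X^T y / n under an orthonormal design (assumption).
    """
    u = [alpha] * len(z)
    v = [alpha] * len(z)
    for _ in range(num_steps):
        for j, z_j in enumerate(z):
            # Gradient of the loss with respect to beta_j = u_j^2 - v_j^2.
            grad = (u[j] ** 2 - v[j] ** 2) - z_j
            # Chain rule: d/du = 2 * u * grad and d/dv = -2 * v * grad,
            # so both updates are multiplicative.
            u[j], v[j] = (u[j] * (1.0 - 2.0 * step * grad),
                          v[j] * (1.0 + 2.0 * step * grad))
    return [uj ** 2 - vj ** 2 for uj, vj in zip(u, v)]


# Two strong signal coordinates, two small noise coordinates, one exact zero.
z = [1.0, -1.0, 0.05, -0.05, 0.0]

early = implicit_regularization(z, num_steps=80)    # stopped early
late = implicit_regularization(z, num_steps=2000)   # run to convergence
```

Because the updates are multiplicative, coordinates with large `|z_j|` grow exponentially faster than the rest: at 80 steps the two signal coordinates are fitted while the noise coordinates remain near zero, with no explicit penalty involved. Run to convergence, the iterate fits the noise as well, which is why the stopping rule matters.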

preprint · 2022 · arXiv

Improving the Robustness and Generalization of Deep Neural Network with Confidence Threshold Reduction

Deep neural networks are easily attacked by imperceptible perturbations. Presently, adversarial training (AT) is the most effective method to enhance the robustness of the model against adversarial examples. However, because adversarial training solves a min-max problem, in comparison with natural training, robustness and generalization are in conflict, i.e., improving the robustness of the model decreases its generalization. To address this issue, in this paper, a new concept, namely confidence threshold (CT), is introduced, and reducing the confidence threshold, known as confidence threshold reduction (CTR), is proven to improve both the generalization and robustness of the model. Specifically, to reduce the CT for natural training (i.e., for natural training with CTR), we propose a mask-guided divergence loss function (MDL) consisting of a cross-entropy loss term and an orthogonal term. The empirical and theoretical analysis demonstrates that the MDL loss improves the robustness and generalization of the model simultaneously for natural training. However, the model robustness improvement of natural training with CTR is not comparable to that of adversarial training. Therefore, for adversarial training, we propose a standard deviation loss function (STD), which minimizes the difference in the probabilities of the wrong categories, to reduce the CT by being integrated into the loss function of adversarial training. The empirical and theoretical analysis demonstrates that the STD-based loss function can further improve the robustness of the adversarially trained model while keeping the natural accuracy unchanged or slightly improved.

preprint · 2022 · arXiv

No-Regret Learning in Time-Varying Zero-Sum Games

Learning from repeated play in a fixed two-player zero-sum game is a classic problem in game theory and online learning. We consider a variant of this problem where the game payoff matrix changes over time, possibly in an adversarial manner. We first present three performance measures to guide the algorithmic design for this problem: 1) the well-studied individual regret, 2) an extension of duality gap, and 3) a new measure called dynamic Nash Equilibrium regret, which quantifies the cumulative difference between the player's payoff and the minimax game value. Next, we develop a single parameter-free algorithm that simultaneously enjoys favorable guarantees under all these three performance measures. These guarantees are adaptive to different non-stationarity measures of the payoff matrices and, importantly, recover the best known results when the payoff matrix is fixed. Our algorithm is based on a two-layer structure with a meta-algorithm learning over a group of black-box base-learners satisfying a certain property, along with several novel ingredients specifically designed for the time-varying game setting. Empirical results further validate the effectiveness of our algorithm.

preprint · 2022 · arXiv

Optimal Rates of (Locally) Differentially Private Heavy-tailed Multi-Armed Bandits

In this paper we investigate the problem of stochastic multi-armed bandits (MAB) in the (local) differential privacy (DP/LDP) model. Unlike previous results that assume bounded/sub-Gaussian reward distributions, we focus on the setting where each arm's reward distribution only has $(1+v)$-th moment with some $v\in (0, 1]$. In the first part, we study the problem in the central $\epsilon$-DP model. We first provide a near-optimal result by developing a private and robust Upper Confidence Bound (UCB) algorithm. Then, we improve the result via a private and robust version of the Successive Elimination (SE) algorithm. Finally, we establish the lower bound to show that the instance-dependent regret of our improved algorithm is optimal. In the second part, we study the problem in the $\epsilon$-LDP model. We propose an algorithm that can be seen as a locally private and robust version of the SE algorithm, which provably achieves (near) optimal rates for both instance-dependent and instance-independent regret. Our results reveal differences between the problem of private MAB with bounded/sub-Gaussian rewards and heavy-tailed rewards. To achieve these (near) optimal rates, we develop several new hard instances and private robust estimators as byproducts, which might be useful for other related problems. Finally, experiments also support our theoretical findings and show the effectiveness of our algorithms.
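
A private and robust mean estimator can be sketched by combining two standard ingredients: truncation, which bounds the influence of heavy-tailed rewards, and Laplace noise calibrated to the resulting sensitivity. The code below is a generic central-DP illustration under those assumptions, not the estimator from the paper; the function names, the fixed threshold, and the sample data are hypothetical.

```python
import random


def truncated_mean(rewards, threshold):
    """Average after zeroing out rewards whose magnitude exceeds the threshold."""
    kept = [r if abs(r) <= threshold else 0.0 for r in rewards]
    return sum(kept) / len(kept)


def laplace_noise(scale, rng):
    """Laplace(0, scale) sample as the difference of two exponential samples."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_truncated_mean(rewards, threshold, epsilon, rng):
    """epsilon-DP release of the truncated mean via the Laplace mechanism."""
    # Replacing one reward moves the truncated mean by at most 2 * threshold / n.
    sensitivity = 2.0 * threshold / len(rewards)
    return truncated_mean(rewards, threshold) + laplace_noise(sensitivity / epsilon, rng)


rng = random.Random(0)
rewards = [1.0, 2.0, -100.0]  # one heavy-tailed outlier
robust = truncated_mean(rewards, threshold=5.0)
private = private_truncated_mean(rewards, threshold=5.0, epsilon=1.0, rng=rng)
```

The threshold trades bias against noise: a larger value discards fewer rewards but raises the sensitivity, and therefore the scale of the added noise.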

preprint · 2022 · arXiv

Quantum circuit architecture search on a superconducting processor

Variational quantum algorithms (VQAs) have shown strong evidence of provable computational advantages in diverse fields such as finance, machine learning, and chemistry. However, the heuristic ansatz exploited in modern VQAs is incapable of balancing the tradeoff between expressivity and trainability, which may lead to degraded performance when executed on noisy intermediate-scale quantum (NISQ) machines. To address this issue, here we demonstrate the first proof-of-principle experiment of applying an efficient automatic ansatz design technique, i.e., quantum architecture search (QAS), to enhance VQAs on an 8-qubit superconducting quantum processor. In particular, we apply QAS to tailor the hardware-efficient ansatz towards classification tasks. Compared with the heuristic ansatze, the ansatz designed by QAS improves test accuracy from 31% to 98%. We further explain this superior performance by visualizing the loss landscape and analyzing effective parameters of all ansatze. Our work provides concrete guidance for developing variable ansatze to tackle various large-scale quantum learning problems with advantages.

preprint · 2022 · arXiv

Spurious microwave crosstalk in floating superconducting circuits

Crosstalk is a major concern in the implementation of large-scale quantum computation since it can degrade the performance of qubit addressing and cause gate errors. Finding the origin of crosstalk and separating contributions from different channels are essential prerequisites for figuring out crosstalk mitigation schemes. Here, by performing circuit analysis of two coupled floating transmon qubits, we demonstrate that, even if the stray coupling, e.g., between a qubit and the drive line of its nearby qubit, is absent, microwave crosstalk between qubits can still exist due to the presence of a spurious crosstalk channel. This channel arises from free modes, which are supported by the floating structure of transmon qubits, i.e., the two superconducting islands of each qubit with no galvanic connection to the ground. For various geometric layouts of floating transmon qubits, we give the contributions of microwave crosstalk from the spurious channel and show that this channel can become a performance-limiting factor in qubit addressing. This research could provide guidance for suppressing microwave crosstalk between floating superconducting qubits through the design of qubit circuits.

preprint · 2022 · arXiv

Tunable coupling of widely separated superconducting qubits: A possible application towards a modular quantum device

Besides striving to assemble more and more qubits in a single monolithic quantum device, taking a modular design strategy may mitigate numerous engineering challenges for achieving large-scale quantum processors with superconducting qubits. Nevertheless, a major challenge in the modular quantum device is how to realize high-fidelity entanglement operations on qubits housed in different modules while preserving the desired isolation between modules. In this work, we propose a conceptual design of a modular quantum device, where nearby modules are spatially separated by centimeters. In principle, each module can contain tens of superconducting qubits, and can be separately fabricated, characterized, packaged, and replaced. By introducing a bridge module between nearby qubit modules and adopting a coupling scheme that utilizes a tunable bus, tunable coupling of qubits housed in nearby qubit modules could be realized. Given physically reasonable assumptions, we expect that sub-100-ns two-qubit gates could be obtained for qubits housed in nearby modules that are spatially separated by more than two centimeters. In this way, inter-module gate operations could be implemented with gate performance comparable with that of intra-module gate operations. Moreover, with the help of through-silicon via technologies, this long-range coupling scheme may also allow one to implement inter-module couplers in a multi-chip stacked processor. Thus, the tunable longer-range coupling scheme and the proposed modular architecture may provide a promising foundation for solving challenges toward large-scale quantum information processing with superconducting qubits.

preprint · 2022 · arXiv

Video Super Resolution Based on Deep Learning: A Comprehensive Survey

In recent years, deep learning has made great progress in many fields such as image recognition, natural language processing, speech recognition and video super-resolution. In this survey, we comprehensively investigate 33 state-of-the-art video super-resolution (VSR) methods based on deep learning. It is well known that leveraging information within video frames is important for video super-resolution. Thus we propose a taxonomy and classify the methods into six sub-categories according to the ways of utilizing inter-frame information. Moreover, the architectures and implementation details of all the methods are described in detail. Finally, we summarize and compare the performance of the representative VSR methods on some benchmark datasets. We also discuss some challenges, which need to be further addressed by researchers in the community of VSR. To the best of our knowledge, this work is the first systematic review on VSR tasks, and it is expected to make a contribution to the development of recent studies in this area and potentially deepen our understanding of the VSR techniques based on deep learning.

preprint · 2021 · arXiv

Storage Fit Learning with Feature Evolvable Streams

Feature evolvable learning, where old features vanish and new features emerge when learning with streams, has been widely studied in recent years. Conventional methods usually assume that a label will be revealed after prediction at each time step. However, in practice, this assumption may not hold, and no label is given at most time steps. A good solution is to leverage the technique of manifold regularization to utilize the previous similar data to assist the refinement of the online model. Nevertheless, this approach needs to store all previous data, which is impossible in learning with streams that arrive sequentially in large volume. Thus we need a buffer to store part of them. Considering that different devices may have different storage budgets, the learning approaches should be flexible subject to the storage budget limit. In this paper, we propose a new setting: Storage-Fit Feature-Evolvable streaming Learning (SF$^2$EL), which incorporates the issue of rarely-provided labels into feature evolution. Our framework is able to fit its behavior to different storage budgets when learning with feature evolvable streams with unlabeled data. Besides, both theoretical and empirical results validate that our approach can preserve the merit of the original feature evolvable learning, i.e., it can always track the best baseline and thus perform well at any time step.

preprint2020arXiv

A Single Frame and Multi-Frame Joint Network for 360-degree Panorama Video Super-Resolution

Spherical videos, also known as 360° (panorama) videos, can be viewed with various virtual reality devices such as computers and head-mounted displays. They attract a large amount of interest since awesome immersion can be experienced when watching spherical videos. However, capturing, storing and transmitting high-resolution spherical videos are extremely expensive. In this paper, we propose a novel single frame and multi-frame joint network (SMFN) for recovering high-resolution spherical videos from low-resolution inputs. To take advantage of pixel-level inter-frame consistency, deformable convolutions are used to eliminate the motion difference between feature maps of the target frame and its neighboring frames. A mixed attention mechanism is devised to enhance the feature representation capability. The dual learning strategy is applied to constrain the solution space so that a better solution can be found. A novel loss function based on the weighted mean square error is proposed to emphasize the super-resolution of the equatorial regions. This is the first attempt to settle the super-resolution of spherical videos, and we collect a novel dataset from the Internet, MiG Panorama Video, which includes 204 videos. Experimental results on 4 representative video clips demonstrate the efficacy of the proposed method. The dataset and code are available at https://github.com/lovepiano/SMFN_For_360VSR.

preprint2020arXiv

Bandit Convex Optimization in Non-stationary Environments

Bandit Convex Optimization (BCO) is a fundamental framework for modeling sequential decision-making with partial information, where the only feedback available to the player is the one-point or two-point function values. In this paper, we investigate BCO in non-stationary environments and choose the dynamic regret as the performance measure, which is defined as the difference between the cumulative loss incurred by the algorithm and that of any feasible comparator sequence. Let $T$ be the time horizon and $P_T$ be the path-length of the comparator sequence that reflects the non-stationarity of environments. We propose a novel algorithm that achieves $O(T^{3/4}(1+P_T)^{1/2})$ and $O(T^{1/2}(1+P_T)^{1/2})$ dynamic regret respectively for the one-point and two-point feedback models. The latter result is optimal, matching the $\Omega(T^{1/2}(1+P_T)^{1/2})$ lower bound established in this paper. Notably, our algorithm is more adaptive to non-stationary environments since it does not require prior knowledge of the path-length $P_T$ ahead of time, which is generally unknown.

preprint2020arXiv

CDC: Classification Driven Compression for Bandwidth Efficient Edge-Cloud Collaborative Deep Learning

The emerging edge-cloud collaborative Deep Learning (DL) paradigm aims at improving the performance of practical DL implementations in terms of cloud bandwidth consumption, response latency, and data privacy preservation. Focusing on bandwidth efficient edge-cloud collaborative training of DNN-based classifiers, we present CDC, a Classification Driven Compression framework that reduces bandwidth consumption while preserving classification accuracy of edge-cloud collaborative DL. Specifically, to reduce bandwidth consumption, for resource-limited edge servers, we develop a lightweight autoencoder with classification guidance for compression with classification driven feature preservation, which allows edges to upload only the latent code of raw data for accurate global training on the Cloud. Additionally, we design an adjustable quantization scheme that adaptively pursues the tradeoff between bandwidth consumption and classification accuracy under different network conditions, where only fine-tuning is required for rapid compression ratio adjustment. Results of extensive experiments demonstrate that, compared with DNN training with raw data, CDC consumes 14.9 times less bandwidth with an accuracy loss no more than 1.06%, and compared with DNN training with data compressed by AE without guidance, CDC introduces at least 100% lower accuracy loss.

preprint2020arXiv

Improved existence for the characteristic initial value problem with the conformal Einstein field equations

We adapt Luk's analysis of the characteristic initial value problem in General Relativity to the asymptotic characteristic problem for the conformal Einstein field equations to demonstrate the local existence of solutions in a neighbourhood of the set on which the data are given. In particular, we obtain existence of solutions along a narrow rectangle along null infinity which, in turn, corresponds to an infinite domain in the asymptotic region of the physical spacetime. This result generalises work by Kánnár on the local existence of solutions to the characteristic initial value problem by means of Rendall's reduction strategy. In analysing the conformal Einstein equations we make use of the Newman-Penrose formalism and a gauge due to J. Stewart.

preprint2020arXiv

Joint Cyber Risk Assessment of Network Systems with Heterogeneous Components

Cyber risks are the most common risks encountered by a modern network system. However, it is significantly difficult to assess the joint cyber risk owing to the network topology, risk propagation, and heterogeneities of components. In this paper, we propose a novel backward elimination approach for computing the joint cyber risk encountered by different types of components in a network system; moreover, explicit formulas are also presented. Certain specific network topologies including complete, star, and complete bi-partite topologies are studied. The effects of propagation depth and compromise probabilities on the joint cyber risk are analyzed using stochastic comparisons. The variances and correlations of cyber risks are examined by a simulation experiment. It was discovered that both variances and correlations change rapidly when the propagation depth increases from its initial value. Further, numerical examples are also presented.

preprint2020arXiv

Simultaneously exciting two atoms with photon-mediated Raman interaction

We propose an approach to simultaneously excite two atoms by using a cavity-assisted Raman process in combination with cavity photon-mediated interaction. The system consists of a two-level atom and a $\Lambda$-type or V-type three-level atom, which are coupled together with a cavity mode. Having derived the effective Hamiltonian, we find that under certain circumstances a single photon can simultaneously excite two atoms. In addition, multiple photons and even a classical field can also simultaneously excite two atoms. As an example, we show a scheme to realize our proposal in a circuit QED setup, in which artificial atoms are coupled with a cavity. The dynamics and the quantum statistical properties of the process are investigated with experimentally feasible parameters.

preprint2020arXiv

Switchable next-nearest-neighbor coupling for controlled two-qubit operations

In a superconducting quantum processor with nearest neighbor coupling, the dispersive interaction between adjacent qubits can result in an effective next-nearest-neighbor coupling whose strength depends on the state of the intermediary qubit. Here, we theoretically explore the possibility of engineering this next-nearest-neighbor coupling to implement controlled two-qubit operations where the intermediary qubit controls the operation on the next-nearest neighbor pair of qubits. In particular, in a system comprising two types of superconducting qubits with anharmonicities of opposite sign arranged in an -A-B-A- pattern, where the unwanted static ZZ coupling between adjacent qubits could be heavily suppressed, a switchable coupling between the next-nearest-neighbor qubits can be achieved via the intermediary qubit, the qubit state of which functions as an on/off switch for this coupling. Therefore, depending on the adopted activating scheme, various controlled two-qubit operations such as the controlled-iSWAP gate can be realized, potentially enabling circuit depth reductions relative to a standard decomposition approach for implementing generic quantum algorithms.

preprint2018arXiv

Handling Concept Drift via Model Reuse

In many real-world applications, data are often collected in the form of a stream, and thus the distribution usually changes in nature, which is referred to as concept drift in the literature. We propose a novel and effective approach to handle concept drift via model reuse, leveraging previous knowledge by reusing models. Each model is associated with a weight representing its reusability towards current data, and the weight is adaptively adjusted according to the model performance. We provide generalization and regret analysis. Experimental results also validate the superiority of our approach on both synthetic and real-world datasets.

preprint2018arXiv

Universal linear optical operations on discrete phase-coherent spatial modes

Linear optical operations are fundamental and significant for both quantum mechanics and classical technologies. We demonstrate a non-cascaded approach to perform arbitrary unitary and non-unitary linear operations for N-dimensional phase-coherent spatial modes with meticulously designed phase gratings. As implemented on spatial light modulators (SLMs), the unitary transformation matrix has been realized with dimensionalities ranging from 7 to 24 and the corresponding fidelities are from 95.1% to 82.1%. For the non-unitary operators, a matrix is presented for the tomography of a 4-level quantum system with a fidelity of 94.9%. Thus, the linear operator has been successfully implemented with much higher dimensionality than that in previous reports. It should be mentioned that our method is not limited to SLMs and can be easily applied on other devices. Thus we believe that our proposal provides another option to perform linear operation with a simple, fixed, error-tolerant and scalable scheme.

preprint2017arXiv

Distribution-Free One-Pass Learning

In many large-scale machine learning applications, data are accumulated with time, and thus, an appropriate model should be able to update in an online paradigm. Moreover, as the whole data volume is unknown when constructing the model, it is desired to scan each data item only once with a storage independent of the data volume. It is also noteworthy that the underlying distribution may change during the data accumulation procedure. To handle such tasks, in this paper we propose DFOP, a distribution-free one-pass learning approach. This approach works well when distribution change occurs during data accumulation, without requiring prior knowledge about the change. Every data item can be discarded once it has been scanned. Besides, a theoretical guarantee shows that the estimation error, under a mild assumption, decreases until convergence with high probability. The performance of DFOP for both regression and classification is validated in experiments.

preprint2017arXiv

Identifying the tilt angle and correcting the orbital angular momentum spectrum dispersion of misaligned light beam

The axis tilt of a light beam in an optical system introduces dispersion of the orbital angular momentum (OAM) spectrum. To deal with it, a two-step method is proposed and demonstrated. First, the tilt angle of the optical axis is identified with a deduced relation between the tilt angle and the variation of OAM topological charges with different reference axes, which is obtained with the help of a charge coupled device (CCD) camera. In our experiments, the precision of the measured tilt angle is about $10^{-4}$ rad with OAM orders of -3 to 3. Using the measured angle value, the additional phase delay due to axis tilt can be calculated so that the dispersion of the OAM spectrum can be corrected with a simple formula while the optical axis is not aligned. The experimental results indicate that the original OAM spectrum has been successfully extracted for not only the pure OAM state but also the superposed OAM states.

preprint2015arXiv

The Quasi-normal Modes of Charged Scalar Fields in Kerr-Newman black hole and Its Geometric Interpretation

It is well known that there is a geometric correspondence between high-frequency quasi-normal modes (QNMs) and null geodesics (spherical photon orbits). In this paper, we generalize this correspondence to charged scalar fields in Kerr-Newman space-time. In our case, the particle and black hole are both charged, so one should consider non-geodesic orbits. Using the WKB approximation, we find that the real part of the quasi-normal frequency corresponds to the orbital frequency, the imaginary part of the frequency corresponds to the Lyapunov exponent of these orbits, and the eigenvalue of the angular equation corresponds to the Carter constant. From the properties of the imaginary part of the quasi-normal frequency of the charged massless scalar field, we find that the QNMs of the charged massless scalar field still possess zero damping modes in extreme Kerr-Newman spacetime under a certain condition, which is determined in this paper.

preprint2014arXiv

Algebro-Geometric Solutions for the Kadomtsev-Petviashvili Hierarchy

Based on the idea of symmetric constraint, we apply the Gesztesy-Holden method to derive explicit representations of the Baker-Akhiezer function $\psi_1$ of the KP hierarchy, from which we provide theta function representations of algebro-geometric solutions for the whole Kadomtsev-Petviashvili (KP) hierarchy. This provides an approach to obtain some special subclasses of algebro-geometric solutions for the KP hierarchy and other high dimensional hierarchies of equations.

preprint2014arXiv

Cluster algebras from dualities of 2d N=(2,2) quiver gauge theories

We interpret certain Seiberg-like dualities of two-dimensional N=(2,2) quiver gauge theories with unitary groups as cluster mutations in cluster algebras, originally formulated by Fomin and Zelevinsky. In particular, we show how the complexified Fayet-Iliopoulos parameters of the gauge group factors transform under those dualities and observe that they are in fact related to the dual cluster variables of cluster algebras. This implies that there is an underlying cluster algebra structure in the quantum Kähler moduli space of manifolds constructed from the corresponding Kähler quotients. We study the S^2 partition function of the gauge theories, showing that it is invariant under dualities/mutations, up to an overall normalization factor whose physical origin and consequences we spell out in detail. We also present similar dualities in N=(2,2)* quiver gauge theories, which are related to dualities of quantum integrable spin chains.

preprint2014arXiv

New Construction of Algebro-Geometric Solutions to the Modified Kadomtsev-Petviashvili Hierarchy

We extend the Gesztesy-Holden method to the 2+1 dimensional case to obtain a unified construction of the algebro-geometric solutions of the whole modified Kadomtsev-Petviashvili (mKP) hierarchy. Our tools include the relations between solutions of the Gerdjikov-Ivanov (GI) and mKP hierarchies, the Baker-Akhiezer functions in 2+1 dimensions, a special function $\psi_1(P)\psi_2(P^*)$ on $X\times \mathbb{R}^3$ and Dubrovin-type equations for auxiliary divisors.

preprint2013arXiv

Algebro-geometric solutions and their reductions for the Fokas-Lenells hierarchy

This paper is devoted to providing theta function representations of algebro-geometric solutions for the Fokas-Lenells (FL) hierarchy through studying an algebro-geometric initial value problem. Further, we reduce these solutions into $N$-dark solutions through the degeneration of associated Riemann surfaces.

preprint2013arXiv

Central charges and RG flow of strongly-coupled N=2 theory

We calculate the central charges a, c and k_G of a large class of four-dimensional N=2 superconformal field theories arising from compactifying the six-dimensional N=(2,0) theory on a Riemann surface with regular and irregular punctures. We also study the renormalization group flows between the general Argyres-Douglas theories, which all agree with the a-theorem.

preprint2013arXiv

Facile implementation of integrated tempering sampling method to enhance the sampling over a broad range of temperatures

The integrated tempering sampling (ITS) method is an approach to enhance the sampling over a broad range of energies and temperatures in computer simulations. In this paper, a new version of the integrated tempering sampling method is proposed. In the new approach presented here, we obtain parameters such as the set of temperatures and the corresponding weighting factors from the canonical average of potential energies. These parameters can be easily obtained without estimating partition functions. We apply this new approach to study the Lennard-Jones fluid, the ALA-PRO peptide and the single polymer chain systems to validate and benchmark the method.

preprint2013arXiv

Reality problems for the Algebro-Geometric Solutions of Fokas-Lenell hierarchy

In a previous study, we obtained the algebro-geometric solutions and $n$-dark solitons of the Fokas-Lenells (FL) hierarchy using the algebro-geometric method. In this paper, we construct physically relevant classes of solutions for the FL hierarchy by studying the reality conditions for $q=\pm \bar{r}$ based on the idea of Vinnikov's homological basis.

preprint2012arXiv

A 5d/3d duality from relativistic integrable system

We propose and prove a new exact duality between the F-terms of supersymmetric gauge theories in five and three dimensions with adjoint matter fields. The theories are compactified on a circle and are subject to the Omega deformation. In the limit proposed by Nekrasov and Shatashvili, the supersymmetric vacua become isolated and are identified with the eigenstates of a quantum integrable system. The effective twisted superpotentials are the Yang-Yang functional of the relativistic elliptic Calogero-Moser model. We show that they match on-shell by deriving the Bethe ansatz equation from the saddle point of the five-dimensional partition function. We also show that the Chern-Simons terms match and extend our proposal to the elliptic quiver generalizations.

preprint2012arXiv

The Algebro-Geometric Initial Value Problem for the Relativistic Lotka-Volterra Hierarchy and Quasi-Periodic Solutions

We provide a detailed treatment of the relativistic Lotka-Volterra hierarchy and a kind of initial value problem, with special emphasis on the theta function representation of all algebro-geometric solutions. The basic tools involve the hyperelliptic curve $\mathcal{K}_n$ associated with the Burchnall-Chaundy polynomial, Dubrovin-type equations for auxiliary divisors and associated trace formulas. With the help of a fundamental meromorphic function $\tilde{\phi}$ on $\mathcal{K}_p$ and trace formulas, the complex-valued algebro-geometric solutions of the RLV hierarchy are derived.

preprint2012arXiv

The algebro-geometric solutions for Degasperis-Procesi hierarchy

Though completely integrable Camassa-Holm (CH) equation and Degasperis-Procesi (DP) equation are cast in the same peakon family, they possess the second- and third-order Lax operators, respectively. From the viewpoint of algebro-geometrical study, this difference lies in hyper-elliptic and non-hyper-elliptic curves. The non-hyper-elliptic curves lead to great difficulty in the construction of algebro-geometric solutions of the DP equation. In this paper, we derive the DP hierarchy with the help of Lenard recursion operators. Based on the characteristic polynomial of a Lax matrix for the DP hierarchy, we introduce a third order algebraic curve $\mathcal{K}_{r-2}$ with genus $r-2$, from which the associated Baker-Akhiezer functions, meromorphic function and Dubrovin-type equations are established. Furthermore, the theory of algebraic curve is applied to derive explicit representations of the theta function for the Baker-Akhiezer functions and the meromorphic function. In particular, the algebro-geometric solutions are obtained for all equations in the whole DP hierarchy.

preprint2012arXiv

The algebro-geometric solutions for Hunter-Saxton hierarchy

This paper is devoted to providing the theta function representation of algebro-geometric solutions and related crucial quantities for the Hunter-Saxton (HS) hierarchy through studying an algebro-geometric initial value problem. Our main tools include the polynomial recursive formalism to derive the HS hierarchy, the hyperelliptic curve of finite genus, the Baker-Akhiezer functions, the meromorphic function, the Dubrovin-type equations for auxiliary divisors, and the associated trace formulas. With the help of these tools, the explicit representations of the Baker-Akhiezer functions, the meromorphic function, and the algebro-geometric solutions are obtained for the entire HS hierarchy.

preprint2012arXiv

The Algebro-Geometric Solutions for the Ruijsenaars-Toda Hierarchy

We provide a detailed treatment of the Ruijsenaars-Toda (RT) hierarchy, with special emphasis on the theta function representation of all algebro-geometric solutions. The basic tools involve the hyperelliptic curve $\mathcal{K}_p$ associated with the Burchnall-Chaundy polynomial, Dubrovin-type equations for auxiliary divisors and associated trace formulas. With the help of a fundamental meromorphic function $\phi$ and the Baker-Akhiezer vector $\Psi$ on $\mathcal{K}_p$, the complex-valued algebro-geometric solutions of the RT hierarchy are derived.

preprint2011arXiv

Scattering of Giant Holes

We study scalar excitations of high spin operators in N=4 super Yang-Mills theory, which are dual to solitons propagating on a long folded string in AdS_3 x S^1. In the spin chain description of the gauge theory, these are associated to holes in the magnon distribution in the sl(2,R) sector. We compute the all-loop hole S-matrix from the asymptotic Bethe ansatz, and expand in leading orders at weak and strong coupling. The worldsheet S-matrix of solitonic excitations on the GKP string is calculated using semiclassical quantization. We find an exact agreement between the gauge theory and string theory results.

preprint2010arXiv

Active microcantilevers based on piezoresistive ferromagnetic thin films

We report the piezoresistivity in magnetic thin films of FeGa and their use for fabricating self-transducing microcantilevers. The actuation occurs as a consequence of both the ferromagnetic and magnetostrictive properties of FeGa thin films, while the deflection readout is achieved by exploiting the piezoresistivity of these films. This self-sensing, self-actuating micromechanical system involves a very simple bilayer structure, which eliminates the need for the more complex piezoelectric stack that is commonly used in active cantilevers. Thus, it potentially opens opportunities for remotely actuated, cantilever-based sensors.

48 published item(s)

AeroSketch: Near-Optimal Time Matrix Sketch Framework for Persistent, Sliding Window, and Distributed Streams

Breaking Coordinate Overfitting: Geometry-Aware WiFi Sensing for Cross-Layout 3D Pose Estimation

Dynamic Chunking for Diffusion Language Models

Revisiting Weighted Strategy for Non-stationary Parametric Bandits and MDPs

Adapting to Online Label Shift with Provable Guarantees

FedLED: Label-Free Equipment Fault Diagnosis with Vertical Federated Transfer Learning

A Comparative Study of Deep Learning Classification Methods on a Small Environmental Microorganism Image Dataset (EMDS-6): from Convolutional Neural Networks to Visual Transformers

Adaptive Bandit Convex Optimization with Heterogeneous Curvature

Combating fluctuations in relaxation times of fixed-frequency transmon qubits with microwave-dressed states

Contrastive Multi-view Hyperbolic Hierarchical Clustering

Corralling a Larger Band of Bandits: A Case Study on Switching Regret for Linear Bandits

Dynamic Regret of Online Markov Decision Processes

EMDS-6: Environmental Microorganism Image Dataset Sixth Version for Image Denoising, Segmentation, Feature Extraction, Classification and Detection Methods Evaluation

High-Dimensional Linear Regression via Implicit Regularization

Improving the Robustness and Generalization of Deep Neural Network with Confidence Threshold Reduction

No-Regret Learning in Time-Varying Zero-Sum Games

Optimal Rates of (Locally) Differentially Private Heavy-tailed Multi-Armed Bandits

Quantum circuit architecture search on a superconducting processor

Spurious microwave crosstalk in floating superconducting circuits

Tunable coupling of widely separated superconducting qubits: A possible application towards a modular quantum device

Video Super Resolution Based on Deep Learning: A Comprehensive Survey

Storage Fit Learning with Feature Evolvable Streams

A Single Frame and Multi-Frame Joint Network for 360-degree Panorama Video Super-Resolution

Bandit Convex Optimization in Non-stationary Environments

CDC: Classification Driven Compression for Bandwidth Efficient Edge-Cloud Collaborative Deep Learning

Improved existence for the characteristic initial value problem with the conformal Einstein field equations

Joint Cyber Risk Assessment of Network Systems with Heterogeneous Components

Simultaneously exciting two atoms with photon-mediated Raman interaction

Switchable next-nearest-neighbor coupling for controlled two-qubit operations

Handling Concept Drift via Model Reuse

Universal linear optical operations on discrete phase-coherent spatial modes

Distribution-Free One-Pass Learning

Identifying the tilt angle and correcting the orbital angular momentum spectrum dispersion of misaligned light beam

The Quasi-normal Modes of Charged Scalar Fields in Kerr-Newman black hole and Its Geometric Interpretation

Algebro-Geometric Solutions for the Kadomtsev-Petviashvili Hierarchy

Cluster algebras from dualities of 2d N=(2,2) quiver gauge theories

New Construction of Algebro-Geometric Solutions to the Modified Kadomtsev-Petviashvili Hierarchy

Algebro-geometric solutions and their reductions for the Fokas-Lenells hierarchy

Central charges and RG flow of strongly-coupled N=2 theory

Facile implementation of integrated tempering sampling method to enhance the sampling over a broad range of temperatures

Reality problems for the Algebro-Geometric Solutions of Fokas-Lenell hierarchy

A 5d/3d duality from relativistic integrable system

The Algebro-Geometric Initial Value Problem for the Relativistic Lotka-Volterra Hierarchy and Quasi-Periodic Solutions

The algebro-geometric solutions for Degasperis-Procesi hierarchy

The algebro-geometric solutions for Hunter-Saxton hierarchy

The Algebro-Geometric Solutions for the Ruijsenaars-Toda Hierarchy

Scattering of Giant Holes

Active microcantilevers based on piezoresistive ferromagnetic thin films