Source author record

Shaofeng Zou

Shaofeng Zou appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, unclaimed source record

Machine Learning, Information Theory, math.IT, math.ST, Statistics Theory, Computer Vision, eess.IV, eess.SP

Catalog footprint

What is connected

13 works

8 topics

4 close collaborators


preprint, 2024, arXiv

Robust Multi-Hypothesis Testing with Moment Constrained Uncertainty Sets

The problem of robust binary hypothesis testing is studied. Under both hypotheses, the data-generating distributions are assumed to belong to uncertainty sets constructed through moments; in particular, the sets contain distributions whose moments are centered around the empirical moments obtained from training samples. The goal is to design a test that performs well under all distributions in the uncertainty sets, i.e., minimize the worst-case error probability over the uncertainty sets. In the finite-alphabet case, the optimal test is obtained. In the infinite-alphabet case, a tractable approximation to the worst-case error is derived that converges to the optimal value using finite samples from the alphabet. A test is further constructed to generalize to the entire alphabet. An exponentially consistent test for testing batch samples is also proposed. Numerical results are provided to demonstrate the performance of the proposed robust tests.

preprint, 2022, arXiv

Policy Gradient Method For Robust Reinforcement Learning

This paper develops the first policy gradient method with global optimality guarantee and complexity analysis for robust reinforcement learning under model mismatch. Robust reinforcement learning is to learn a policy robust to model mismatch between simulator and real environment. We first develop the robust policy (sub-)gradient, which is applicable for any differentiable parametric policy class. We show that the proposed robust policy gradient method converges to the global optimum asymptotically under direct policy parameterization. We further develop a smoothed robust policy gradient method and show that to achieve an $ε$-global optimum, the complexity is $\mathcal O(ε^{-3})$. We then extend our methodology to the general model-free setting and design the robust actor-critic method with differentiable parametric policy class and value function. We further characterize its asymptotic convergence and sample complexity under the tabular setting. Finally, we provide simulation results to demonstrate the robustness of our methods.

preprint, 2022, arXiv

Quickest Change Detection in Anonymous Heterogeneous Sensor Networks

The problem of quickest change detection (QCD) in anonymous heterogeneous sensor networks is studied. There are $n$ heterogeneous sensors and a fusion center. The sensors are clustered into $K$ groups, and different groups follow different data-generating distributions. At some unknown time, an event occurs in the network and changes the data-generating distribution of the sensors. The goal is to detect the change as quickly as possible, subject to false alarm constraints. The anonymous setting is studied, where at each time step, the fusion center receives $n$ unordered samples, and the fusion center does not know which sensor each sample comes from, and thus does not know its exact distribution. A simple optimality proof is first derived for the mixture likelihood ratio test, which was constructed and proved to be optimal for the non-sequential anonymous setting in (Chen and Wang, 2019). For the QCD problem, a mixture CuSum algorithm is further constructed, and is further shown to be optimal under Lorden's criterion. For large networks, a computationally efficient test is proposed and a novel theoretical characterization of its false alarm rate is developed. Numerical results are provided to validate the theoretical results.
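
For readers unfamiliar with CuSum, the recursion behind it can be sketched as follows. This is a minimal illustration of the classical single-stream CuSum rule with known Gaussian pre- and post-change distributions, not the mixture CuSum of the paper; the function names, the Gaussian model, and the threshold value are assumptions made only for this example.

```python
from typing import Iterable, Optional


def gaussian_llr(x: float, mu0: float, mu1: float, sigma: float) -> float:
    """Log-likelihood ratio log(f1(x) / f0(x)) for N(mu1, sigma^2) vs N(mu0, sigma^2)."""
    return ((x - mu0) ** 2 - (x - mu1) ** 2) / (2.0 * sigma ** 2)


def cusum_stopping_time(
    samples: Iterable[float],
    mu0: float,
    mu1: float,
    sigma: float,
    threshold: float,
) -> Optional[int]:
    """Return the first time (1-indexed) the CuSum statistic reaches the threshold.

    The statistic follows W_t = max(0, W_{t-1} + LLR(x_t)), starting from W_0 = 0.
    Returns None if no alarm is raised on the given samples.
    """
    stat = 0.0
    for t, x in enumerate(samples, start=1):
        stat = max(0.0, stat + gaussian_llr(x, mu0, mu1, sigma))
        if stat >= threshold:
            return t
    return None


# Hypothetical data: the mean shifts from 0 to 1 after the third sample.
alarm = cusum_stopping_time([0, 0, 0, 1, 1, 1, 1, 1], mu0=0.0, mu1=1.0, sigma=1.0, threshold=2.0)
```

A larger threshold lowers the false alarm rate at the cost of a longer detection delay, which is the trade-off that Lorden's criterion formalizes.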

preprint, 2022, arXiv

Robust Constrained Reinforcement Learning

Constrained reinforcement learning is to maximize the expected reward subject to constraints on utilities/costs. However, the training environment may not be the same as the test one, due to, e.g., modeling error, adversarial attack, or non-stationarity, resulting in severe performance degradation and, more importantly, constraint violation. We propose a framework of robust constrained reinforcement learning under model uncertainty, where the MDP is not fixed but lies in some uncertainty set. The goal is to guarantee that constraints on utilities/costs are satisfied for all MDPs in the uncertainty set, and to maximize the worst-case reward performance over the uncertainty set. We design a robust primal-dual approach, and further theoretically develop guarantees on its convergence, complexity and robust feasibility. We then investigate a concrete example of the $δ$-contamination uncertainty set, design an online and model-free algorithm, and theoretically characterize its sample complexity.
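
To make the worst-case objective concrete, the sketch below runs tabular robust value iteration for the reward only. It assumes the contamination set around a nominal kernel $p$ is $\{(1-δ)p + δq\}$ with $q$ an arbitrary distribution, in which case the worst-case expected next value is $(1-δ)\,p^\top V + δ \min_s V(s)$. It is not the paper's primal-dual, model-free algorithm and ignores the constraints entirely; all names and the toy MDP are hypothetical.

```python
from typing import List


def robust_value_iteration(
    P: List[List[List[float]]],  # P[s][a][s2]: nominal transition probabilities
    R: List[List[float]],        # R[s][a]: reward
    gamma: float,
    delta: float,
    iters: int = 500,
) -> List[float]:
    """Robust value iteration under an assumed delta-contamination uncertainty set."""
    n_states = len(P)
    V = [0.0] * n_states
    for _ in range(iters):
        v_min = min(V)  # the adversary places the contaminated mass on the worst state
        new_V = []
        for s in range(n_states):
            q_values = []
            for a in range(len(P[s])):
                nominal = sum(p * v for p, v in zip(P[s][a], V))
                worst_case = (1.0 - delta) * nominal + delta * v_min
                q_values.append(R[s][a] + gamma * worst_case)
            new_V.append(max(q_values))
        V = new_V
    return V


# Toy two-state MDP with one action: each state loops on itself; only state 0 pays reward 1.
P = [[[1.0, 0.0]], [[0.0, 1.0]]]
R = [[1.0], [0.0]]
robust_V = robust_value_iteration(P, R, gamma=0.5, delta=0.2)
nominal_V = robust_value_iteration(P, R, gamma=0.5, delta=0.0)
```

In this toy example the robust value of state 0 is lower than its nominal value, because the adversary can divert a fraction $δ$ of the transition mass to the zero-reward state.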

preprint, 2022, arXiv

Sample and Communication-Efficient Decentralized Actor-Critic Algorithms with Finite-Time Analysis

Actor-critic (AC) algorithms have been widely adopted in decentralized multi-agent systems to learn the optimal joint control policy. However, existing decentralized AC algorithms either do not preserve the privacy of agents or are not sample and communication-efficient. In this work, we develop two decentralized AC and natural AC (NAC) algorithms that are private, and sample and communication-efficient. In both algorithms, agents share noisy information to preserve privacy and adopt mini-batch updates to improve sample and communication efficiency. Particularly for decentralized NAC, we develop a decentralized Markovian SGD algorithm with an adaptive mini-batch size to efficiently compute the natural policy gradient. Under Markovian sampling and linear function approximation, we prove the proposed decentralized AC and NAC algorithms achieve the state-of-the-art sample complexities $\mathcal{O}\big(ε^{-2}\ln(ε^{-1})\big)$ and $\mathcal{O}\big(ε^{-3}\ln(ε^{-1})\big)$, respectively, and the same small communication complexity $\mathcal{O}\big(ε^{-1}\ln(ε^{-1})\big)$. Numerical experiments demonstrate that the proposed algorithms achieve lower sample and communication complexities than the existing decentralized AC algorithm.

preprint, 2020, arXiv

A CNN-Based Blind Denoising Method for Endoscopic Images

The quality of images captured by wireless capsule endoscopy (WCE) is key for doctors to diagnose diseases of the gastrointestinal (GI) tract. However, there exist many low-quality endoscopic images due to the limited illumination and complex environment in the GI tract. After an enhancement process, the severe noise becomes an unacceptable problem. The noise varies with different cameras, GI tract environments and image enhancement, and the noise model is hard to obtain. This paper proposes a convolutional blind denoising network for endoscopic images. We apply the Deep Image Prior (DIP) method to reconstruct a clean image iteratively using a noisy image without a specific noise model and ground truth. Then we design a blind image quality assessment network based on MobileNet to estimate the quality of the reconstructed images. The estimated quality is used to stop the iterative operation in the DIP method. The number of iterations is reduced by about 36% by using transfer learning in our DIP process. Experimental results on endoscopic images and real-world noisy images demonstrate the superiority of our proposed method over the state-of-the-art methods in terms of visual quality and quantitative metrics.

preprint, 2020, arXiv

Finite-sample Analysis of Greedy-GQ with Linear Function Approximation under Markovian Noise

Greedy-GQ is an off-policy two timescale algorithm for optimal control in reinforcement learning. This paper develops the first finite-sample analysis for the Greedy-GQ algorithm with linear function approximation under Markovian noise. Our finite-sample analysis provides theoretical justification for choosing stepsizes for this two timescale algorithm for faster convergence in practice, and suggests a trade-off between the convergence rate and the quality of the obtained policy. Our paper extends the finite-sample analyses of two timescale reinforcement learning algorithms from policy evaluation to optimal control, which is of more practical interest. Specifically, in contrast to existing finite-sample analyses for two timescale methods, e.g., GTD, GTD2 and TDC, where their objective functions are convex, the objective function of the Greedy-GQ algorithm is non-convex. Moreover, the Greedy-GQ algorithm is also not a linear two-timescale stochastic approximation algorithm. Our techniques in this paper provide a general framework for finite-sample analysis of non-convex value-based reinforcement learning algorithms for optimal control.

preprint, 2020, arXiv

Tightening Mutual Information Based Bounds on Generalization Error

An information-theoretic upper bound on the generalization error of supervised learning algorithms is derived. The bound is constructed in terms of the mutual information between each individual training sample and the output of the learning algorithm. The bound is derived under more general conditions on the loss function than in existing studies; nevertheless, it provides a tighter characterization of the generalization error. Examples of learning algorithms are provided to demonstrate the tightness of the bound, and to show that it has a broad range of applicability. Application to noisy and iterative algorithms, e.g., stochastic gradient Langevin dynamics (SGLD), is also studied, where the constructed bound provides a tighter characterization of the generalization error than existing results. Finally, it is demonstrated that, unlike existing bounds, which are difficult to compute and evaluate empirically, the proposed bound can be estimated easily in practice.
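
As a rough guide to what such a bound looks like, the block below writes out the form usually quoted for the sub-Gaussian special case. The notation is an assumption and is not taken from the abstract: $S = (Z_1, \dots, Z_n)$ is the training set drawn i.i.d. from $μ$, $W$ is the output of the learning algorithm, and the loss is assumed $σ$-sub-Gaussian. The abstract indicates that the paper's conditions are more general than this special case.

```latex
% Individual-sample mutual information bound (sub-Gaussian special case, assumed notation):
\left| \operatorname{gen}(\mu, P_{W \mid S}) \right|
  \le \frac{1}{n} \sum_{i=1}^{n} \sqrt{2 \sigma^{2} \, I(W; Z_i)}

% For comparison, the earlier full-sample mutual information bound:
\left| \operatorname{gen}(\mu, P_{W \mid S}) \right|
  \le \sqrt{\frac{2 \sigma^{2}}{n} \, I(W; S)}
```

The first bound can remain finite when $I(W; S)$ is infinite, for instance when $W$ is a deterministic function of continuous data, which is one sense in which the individual-sample form is tighter.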

preprint, 2017, arXiv

Linear-Complexity Exponentially-Consistent Tests for Universal Outlying Sequence Detection

The problem of universal outlying sequence detection is studied, where the goal is to detect outlying sequences among $M$ sequences of samples. A sequence is considered as outlying if the observations therein are generated by a distribution different from those generating the observations in the majority of the sequences. In the universal setting, we are interested in identifying all the outlying sequences without knowing the underlying generating distributions. In this paper, a class of tests based on distribution clustering is proposed. These tests are shown to be exponentially consistent with linear time complexity in $M$. Numerical results demonstrate that our clustering-based tests achieve similar performance to existing tests, while being considerably more computationally efficient.

preprint, 2016, arXiv

A Kernel-Based Nonparametric Test for Anomaly Detection over Line Networks

The nonparametric problem of detecting the existence of an anomalous interval over a one-dimensional line network is studied. Nodes corresponding to an anomalous interval (if one exists) receive samples generated by a distribution q, which is different from the distribution p that generates samples for other nodes. If no anomalous interval exists, then all nodes receive samples generated by p. It is assumed that the distributions p and q are arbitrary and unknown. In order to detect whether an anomalous interval exists, a test is built based on mean embeddings of distributions into a reproducing kernel Hilbert space (RKHS) and the metric of maximum mean discrepancy (MMD). It is shown that as the network size n goes to infinity, if the minimum length of candidate anomalous intervals is larger than a threshold of order O(log n), the proposed test is asymptotically successful, i.e., the probability of detection error approaches zero asymptotically. An efficient algorithm to perform the test with substantial computational complexity reduction is proposed, and is shown to be asymptotically successful if the condition on the minimum length of candidate anomalous intervals is satisfied. Numerical results are provided, which are consistent with the theoretical results.

preprint, 2016, arXiv

Nonparametric Detection of Anomalous Data Streams

A nonparametric anomalous hypothesis testing problem is investigated, in which there are n sequences in total, s of which are anomalous and are to be detected. Each typical sequence contains m independent and identically distributed (i.i.d.) samples drawn from a distribution p, whereas each anomalous sequence contains m i.i.d. samples drawn from a distribution q that is distinct from p. The distributions p and q are assumed to be unknown in advance. Distribution-free tests are constructed using maximum mean discrepancy as the metric, which is based on mean embeddings of distributions into a reproducing kernel Hilbert space. The probability of error is bounded as a function of the sample size m, the number s of anomalous sequences and the number n of sequences. It is then shown that with s known, the constructed test is exponentially consistent if m is greater than a constant factor of log n, for any p and q, whereas with s unknown, m should have an order strictly greater than log n. Furthermore, it is shown that no test can be consistent for arbitrary p and q if m is less than a constant factor of log n, thus the order-level optimality of the proposed test is established. Numerical results are provided to demonstrate that our tests outperform (or perform as well as) the tests based on other competitive approaches under various cases.
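
The maximum mean discrepancy used as the test metric here can be estimated directly from two samples. The following is a minimal sketch of the standard biased (V-statistic) estimate of the squared MMD for scalar samples with a Gaussian kernel. The kernel choice, the bandwidth, and the function names are assumptions for illustration; the detection tests of the paper, which compare sequences against each other and against a threshold, are not reproduced.

```python
import math
from typing import Sequence


def gaussian_kernel(x: float, y: float, bandwidth: float = 1.0) -> float:
    """Gaussian (RBF) kernel k(x, y) = exp(-(x - y)^2 / (2 * bandwidth^2))."""
    return math.exp(-((x - y) ** 2) / (2.0 * bandwidth ** 2))


def mmd_squared(xs: Sequence[float], ys: Sequence[float], bandwidth: float = 1.0) -> float:
    """Biased estimate of MMD^2 between the distributions that generated xs and ys.

    MMD^2 = E[k(x, x')] + E[k(y, y')] - 2 E[k(x, y)], with each expectation
    replaced by an average over all sample pairs (diagonal terms included).
    """
    def mean_kernel(a: Sequence[float], b: Sequence[float]) -> float:
        total = sum(gaussian_kernel(u, v, bandwidth) for u in a for v in b)
        return total / (len(a) * len(b))

    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)


# Hypothetical samples: identical samples give zero, well-separated samples give a large value.
same = mmd_squared([0.0, 1.0, 2.0], [0.0, 1.0, 2.0])
far = mmd_squared([0.0, 0.1], [5.0, 5.1])
```

Because the estimate needs only kernel evaluations on the samples, it requires no knowledge of p or q, which is what makes MMD-based tests distribution-free.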

preprint, 2015, arXiv

Universal Outlying sequence detection For Continuous Observations

The following detection problem is studied, in which there are $M$ sequences of samples out of which one outlier sequence needs to be detected. Each typical sequence contains $n$ independent and identically distributed (i.i.d.) continuous observations from a known distribution $π$, and the outlier sequence contains $n$ i.i.d. observations from an outlier distribution $μ$, which is distinct from $π$, but otherwise unknown. A universal test based on KL divergence is built to approximate the maximum likelihood test, with known $π$ and unknown $μ$. A data-dependent partitions based KL divergence estimator is employed. Such a KL divergence estimator is further shown to converge to its true value exponentially fast when the density ratio satisfies $0<K_1\leq \frac{dμ}{dπ}\leq K_2$, where $K_1$ and $K_2$ are positive constants, and this further implies that the test is exponentially consistent. The performance of the test is compared with that of a recently introduced test for this problem based on the machine learning approach of maximum mean discrepancy (MMD). We identify regimes in which the KL divergence based test is better than the MMD based test.

preprint, 2014, arXiv

An Information Theoretic Approach to Secret Sharing

A novel information theoretic approach is proposed to solve the secret sharing problem, in which a dealer distributes one or multiple secrets among a set of participants such that, for each secret, only qualified sets of users can recover it by pooling their shares together, while non-qualified sets of users obtain no information about the secret even if they pool their shares together. While existing secret sharing systems (implicitly) assume that communications between the dealer and participants are noiseless, this paper takes a more practical assumption that the dealer delivers shares to the participants via a noisy broadcast channel. An information theoretic approach is proposed, which exploits the channel as an additional resource to achieve secret sharing requirements. In this way, secret sharing problems can be reformulated as equivalent secure communication problems via wiretap channels, and can be solved by employing powerful information theoretic security techniques. This approach is first developed for the classic secret sharing problem, in which only one secret is to be shared. This classic problem is shown to be equivalent to a communication problem over a compound wiretap channel. The lower and upper bounds on the secrecy capacity of the compound channel provide the corresponding bounds on the secret sharing rate. The power of the approach is further demonstrated by a more general layered multi-secret sharing problem, which is shown to be equivalent to the degraded broadcast multiple-input multiple-output (MIMO) channel with layered decoding and secrecy constraints. The secrecy capacity region for the degraded MIMO broadcast channel is characterized, which provides the secret sharing capacity region. Furthermore, the secure encoding schemes that achieve the secrecy capacity region provide an information theoretic scheme for sharing the secrets.
