Source author record

Chicheng Zhang

Chicheng Zhang appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.


Machine Learning, Data Structures and Algorithms, cs.CY, Genomics, math.ST, physics.optics, Statistics Theory

Catalog footprint

What is connected

15 works

7 topics

4 close collaborators


preprint, 2026, arXiv

First Thin-Film Lithium Tantalate Polarization Controller Enabling Reset-Free Mrad/s Tracking for Optical Interconnects

The rapid escalation of computing power driven by large-scale artificial intelligence is placing unprecedented demands on the bandwidth, latency, and energy efficiency of data-center interconnects (DCIs). Self-homodyne coherent (SHC) transmission is a promising architecture because it preserves the spectral efficiency of coherent detection while greatly simplifying digital signal processing, but its practical deployment is critically limited by random and often ultrafast state-of-polarization (SOP) fluctuations that induce carrier fading and destabilize coherent reception. Here we report the first integrated polarization controller based on thin-film lithium tantalate (TFLT), enabling reset-free polarization tracking at Mrad/s speeds. The four-stage electro-optic device exhibits polarization-dependent loss (PDL) below 0.3 dB, a half-wave voltage below 2.5 V, high modulation bandwidth, and negligible DC drift. To accommodate the finite tuning range of integrated phase shifters, we develop a finite-boundary gradient-descent (FBGD) control algorithm that ensures reset-free SOP evolution with no phase jump. The implemented adaptive polarization controller (APC) is validated through both standalone polarization-tracking measurements and a dual-polarization 16-QAM SHC 400-Gbps transmission system. Transient polarization disturbances can be tracked at speeds up to 2 Mrad/s, while stable reset-free operation under continuous polarization disturbances is maintained up to 1 Mrad/s. This reset-free performance more than doubles the state of the art, while the pre-FEC bit-error rates remain below the HD-FEC threshold under realistic DCI conditions and lightning-scale polarization disturbances. These results establish TFLT as a new platform for ultrafast, low-power, reset-free, and drift-free polarization control in coherent optical interconnects and beyond.

preprint, 2026, arXiv

Interactive and Hybrid Imitation Learning: Provably Beating Behavior Cloning

Imitation learning (IL) is a paradigm for learning sequential decision-making policies from experts, leveraging offline demonstrations, interactive annotations, or both. Recent advances show that when annotation cost is tallied per trajectory, Behavior Cloning (BC), which relies solely on offline demonstrations, cannot be improved in general, leaving limited conditions for interactive methods such as DAgger to help. We revisit this conclusion and prove that when the annotation cost is measured per state, algorithms using interactive annotations can provably outperform BC. Specifically: (1) we show that Stagger, a one-sample-per-round variant of DAgger, provably beats BC under low-recovery-cost settings; (2) we initiate the study of hybrid IL, where the agent learns from offline demonstrations and interactive annotations. We propose Warm Stagger, whose learning guarantee is not much worse than using either data source alone. Furthermore, motivated by the compounding error and cold start problems in imitation learning practice, we give an MDP example in which Warm Stagger has significantly better annotation cost; (3) experiments on MuJoCo continuous control tasks confirm that, with a modest cost ratio between interactive and offline annotations, interactive and hybrid approaches consistently outperform BC. To the best of our knowledge, our work is the first to highlight the benefit of state-wise interactive annotation and hybrid feedback in imitation learning.

preprint, 2022, arXiv

Active Fairness Auditing

The fast-spreading adoption of machine learning (ML) by companies across industries poses significant regulatory challenges. One such challenge is scalability: how can regulatory bodies efficiently audit these ML models, ensuring that they are fair? In this paper, we initiate the study of query-based auditing algorithms that can estimate the demographic parity of ML models in a query-efficient manner. We propose an optimal deterministic algorithm, as well as a practical randomized, oracle-efficient algorithm with comparable guarantees. Furthermore, we make inroads into understanding the optimal query complexity of randomized active fairness estimation algorithms. Our first exploration of active fairness estimation aims to put AI governance on firmer theoretical foundations.
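As a rough illustration of what "estimating demographic parity from queries" can mean, the sketch below implements only the naive baseline: query a black-box model on randomly drawn points from each of two groups and compare the positive-prediction rates. It is not the deterministic or oracle-efficient algorithm the abstract proposes. The names `estimate_demographic_parity_gap`, `toy_model`, and `toy_sampler`, and the two-group setup, are hypothetical and chosen for illustration.

```python
import random


def estimate_demographic_parity_gap(model, sample_from_group, n_queries_per_group, rng):
    """Estimate |P(model(x)=1 | group 0) - P(model(x)=1 | group 1)|.

    Naive baseline: each query draws a fresh random point from one group
    and asks the black-box model for its prediction.
    """
    rates = []
    for group in (0, 1):
        positives = 0
        for _ in range(n_queries_per_group):
            x = sample_from_group(group, rng)
            positives += 1 if model(x) == 1 else 0
        rates.append(positives / n_queries_per_group)
    return abs(rates[0] - rates[1])


# Hypothetical toy setup (an assumption, not taken from the paper).
def toy_model(x):
    # Black-box classifier: predicts 1 when the feature exceeds 0.5.
    return 1 if x > 0.5 else 0


def toy_sampler(group, rng):
    # Group 0 is uniform on [0, 1); group 1 is shifted to [0.2, 1.2).
    return rng.random() + (0.2 if group == 1 else 0.0)


gap = estimate_demographic_parity_gap(toy_model, toy_sampler, 5000, random.Random(0))
print(f"estimated demographic parity gap: {gap:.3f}")
```

In this toy setup the true positive rates are 0.5 and 0.7, so the estimate should land near 0.2. The query cost of this baseline grows with the inverse square of the target accuracy, which is the kind of cost a query-efficient auditor would aim to reduce.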

preprint, 2022, arXiv

Margin-distancing for safe model explanation

The growing use of machine learning models in consequential settings has highlighted an important and seemingly irreconcilable tension between transparency and vulnerability to gaming. While this has sparked sizable debate in legal literature, there has been comparatively less technical study of this contention. In this work, we propose a clean-cut formulation of this tension and a way to make the tradeoff between transparency and gaming. We identify the source of gaming as being points close to the decision boundary of the model. And we initiate an investigation on how to provide example-based explanations that are expansive and yet consistent with a version space that is sufficiently uncertain with respect to the boundary points' labels. Finally, we furnish our theoretical results with empirical investigations of this tradeoff on real-world datasets.

preprint, 2022, arXiv

Provably Efficient Multi-Task Reinforcement Learning with Model Transfer

We study multi-task reinforcement learning (RL) in tabular episodic Markov decision processes (MDPs). We formulate a heterogeneous multi-player RL problem, in which a group of players concurrently face similar but not necessarily identical MDPs, with a goal of improving their collective performance through inter-player information sharing. We design and analyze an algorithm based on the idea of model transfer, and provide gap-dependent and gap-independent upper and lower bounds that characterize the intrinsic complexity of the problem.

preprint, 2022, arXiv

Thompson Sampling for Robust Transfer in Multi-Task Bandits

We study the problem of online multi-task learning where the tasks are performed within similar but not necessarily identical multi-armed bandit environments. In particular, we study how a learner can improve its overall performance across multiple related tasks through robust transfer of knowledge. While an upper confidence bound (UCB)-based algorithm has recently been shown to achieve nearly-optimal performance guarantees in a setting where all tasks are solved concurrently, it remains unclear whether Thompson sampling (TS) algorithms, which have superior empirical performance in general, share similar theoretical properties. In this work, we present a TS-type algorithm for a more general online multi-task learning protocol, which extends the concurrent setting. We provide its frequentist analysis and prove that it is also nearly-optimal using a novel concentration inequality for multi-task data aggregation at random stopping times. Finally, we evaluate the algorithm on synthetic data and show that the TS-type algorithm enjoys superior empirical performance in comparison with the UCB-based algorithm and a baseline algorithm that performs TS for each individual task without transfer.
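For readers unfamiliar with Thompson sampling, the sketch below shows the textbook single-task version for Bernoulli rewards with Beta(1, 1) priors. It corresponds roughly to the no-transfer baseline the abstract mentions, not to the multi-task TS-type algorithm the paper presents. The function name `thompson_sampling` and the arm means used in the example are assumptions made for illustration.

```python
import random


def thompson_sampling(arm_means, horizon, rng):
    """Run Bernoulli Thompson sampling on one bandit task; return pull counts per arm."""
    n_arms = len(arm_means)
    successes = [0] * n_arms
    failures = [0] * n_arms
    pulls = [0] * n_arms
    for _ in range(horizon):
        # Draw one plausible mean per arm from its Beta posterior.
        samples = [
            rng.betavariate(successes[a] + 1, failures[a] + 1) for a in range(n_arms)
        ]
        # Play the arm whose sampled mean is largest.
        arm = max(range(n_arms), key=lambda a: samples[a])
        reward = 1 if rng.random() < arm_means[arm] else 0
        successes[arm] += reward
        failures[arm] += 1 - reward
        pulls[arm] += 1
    return pulls


# Hypothetical three-armed task; the true means are unknown to the learner.
pulls = thompson_sampling([0.2, 0.5, 0.8], horizon=2000, rng=random.Random(1))
print(pulls)
```

Running this independently on every task ignores whatever the tasks share. The abstract's setting asks how much can be gained by aggregating data across similar tasks without being misled by the ones that differ.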

preprint, 2021, arXiv

Active Online Learning with Hidden Shifting Domains

Online machine learning systems need to adapt to domain shifts. Meanwhile, acquiring a label at every timestep is expensive. We propose a surprisingly simple algorithm that adaptively balances its regret and its number of label queries in settings where the data streams are from a mixture of hidden domains. For online linear regression with oblivious adversaries, we provide a tight tradeoff that depends on the durations and dimensionalities of the hidden domains. Our algorithm can adaptively deal with interleaving spans of inputs from different domains. We also generalize our results to non-linear regression for hypothesis classes with bounded eluder dimension and adaptive adversaries. Experiments on synthetic and realistic datasets demonstrate that our algorithm achieves lower regret than uniform queries and greedy queries with equal labeling budget.

preprint, 2021, arXiv

Attribute-Efficient Learning of Halfspaces with Malicious Noise: Near-Optimal Label Complexity and Noise Tolerance

This paper is concerned with computationally efficient learning of homogeneous sparse halfspaces in $\mathbb{R}^d$ under noise. Though recent works have established attribute-efficient learning algorithms under various types of label noise (e.g. bounded noise), it remains an open question when and how $s$-sparse halfspaces can be efficiently learned under the challenging malicious noise model, where an adversary may corrupt both the unlabeled examples and the labels. We answer this question in the affirmative by designing a computationally efficient active learning algorithm with near-optimal label complexity of $\tilde{O}\big(s \log^4 \frac{d}{\epsilon}\big)$ and noise tolerance $\eta = \Omega(\epsilon)$, where $\epsilon \in (0, 1)$ is the target error rate, under the assumption that the distribution over (uncorrupted) unlabeled examples is isotropic log-concave. Our algorithm can be straightforwardly tailored to the passive learning setting, and we show that the sample complexity is $\tilde{O}\big(\frac{1}{\epsilon} s^2 \log^5 d\big)$, which also enjoys the attribute efficiency. Our main techniques include attribute-efficient paradigms for instance reweighting and for empirical risk minimization, and a new analysis of uniform concentration for unbounded data -- all of them crucially take the structure of the underlying halfspace into account.

preprint, 2020, arXiv

Contextual Bandits with Continuous Actions: Smoothing, Zooming, and Adapting

We study contextual bandit learning with an abstract policy class and continuous action space. We obtain two qualitatively different regret bounds: one competes with a smoothed version of the policy class under no continuity assumptions, while the other requires standard Lipschitz assumptions. Both bounds exhibit data-dependent "zooming" behavior and, with no tuning, yield improved guarantees for benign problems. We also study adapting to unknown smoothness parameters, establishing a price-of-adaptivity and deriving optimal adaptive algorithms that require no additional information.

preprint, 2020, arXiv

Deep Batch Active Learning by Diverse, Uncertain Gradient Lower Bounds

We design a new algorithm for batch active learning with deep neural network models. Our algorithm, Batch Active learning by Diverse Gradient Embeddings (BADGE), samples groups of points that are disparate and high-magnitude when represented in a hallucinated gradient space, a strategy designed to incorporate both predictive uncertainty and sample diversity into every selected batch. Crucially, BADGE trades off between diversity and uncertainty without requiring any hand-tuned hyperparameters. We show that while other approaches sometimes succeed for particular batch sizes or architectures, BADGE consistently performs as well or better, making it a versatile option for practical active learning problems.
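The selection idea in this abstract can be sketched in a few lines. The version below assumes, beyond what the abstract states, that the "hallucinated gradient" is the last-layer cross-entropy gradient computed as if the model's own predicted label were correct, and that diverse, high-magnitude points are picked with k-means++-style seeding. Starting from the largest-norm point is a further simplification of this sketch. The names `gradient_embedding`, `kmeanspp_select`, and the toy `pool` are hypothetical.

```python
import random


def gradient_embedding(probs, features):
    """Last-layer gradient embedding, using the predicted class as a stand-in label.

    probs: softmax outputs (length k); features: penultimate activations (length d).
    Returns a flat vector of length k * d.
    """
    predicted = max(range(len(probs)), key=lambda c: probs[c])
    embedding = []
    for c, p in enumerate(probs):
        residual = p - (1.0 if c == predicted else 0.0)
        embedding.extend(residual * f for f in features)
    return embedding


def squared_distance(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))


def kmeanspp_select(embeddings, batch_size, rng):
    """Pick indices via k-means++-style seeding: far-apart, large-norm points are favored."""
    norms = [sum(a * a for a in e) for e in embeddings]
    first = max(range(len(embeddings)), key=lambda i: norms[i])
    chosen = [first]
    min_dist = [squared_distance(e, embeddings[first]) for e in embeddings]
    while len(chosen) < batch_size and sum(min_dist) > 0:
        # Sample the next point with probability proportional to its squared
        # distance from the nearest already-chosen point.
        nxt = rng.choices(range(len(embeddings)), weights=min_dist, k=1)[0]
        chosen.append(nxt)
        min_dist = [
            min(d, squared_distance(e, embeddings[nxt]))
            for d, e in zip(min_dist, embeddings)
        ]
    return chosen


# Hypothetical unlabeled pool: (softmax output, penultimate features) per point.
pool = [
    ([0.99, 0.01], [1.0, 0.0]),  # confident prediction, tiny gradient
    ([0.55, 0.45], [1.0, 0.0]),  # uncertain prediction
    ([0.40, 0.60], [0.0, 2.0]),  # uncertain, with large activations
]
embeddings = [gradient_embedding(p, f) for p, f in pool]
chosen = kmeanspp_select(embeddings, batch_size=2, rng=random.Random(0))
print(chosen)
```

Because confident predictions give small residuals, their embeddings sit near the origin and are rarely sampled, while the distance-based sampling discourages picking near-duplicates. That is how a single sampling rule can trade off uncertainty and diversity without an explicit weighting parameter.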

preprint, 2016, arXiv

Search Improves Label for Active Learning

We investigate active learning with access to two distinct oracles: Label (which is standard) and Search (which is not). The Search oracle models the situation where a human searches a database to seed or counterexample an existing solution. Search is stronger than Label while being natural to implement in many situations. We show that an algorithm using both oracles can provide exponentially large problem-dependent improvements over Label alone.

preprint, 2016, arXiv

The Extended Littlestone's Dimension for Learning with Mistakes and Abstentions

This paper studies classification with an abstention option in the online setting. In this setting, examples arrive sequentially, the learner is given a hypothesis class $\mathcal H$, and the goal of the learner is to either predict a label on each example or abstain, while ensuring that it does not make more than a pre-specified number of mistakes when it does predict a label. Previous work on this problem has left open two main challenges. First, not much is known about the optimality of algorithms, and in particular, about what an optimal algorithmic strategy is for any individual hypothesis class. Second, while the realizable case has been studied, the more realistic non-realizable scenario is not well-understood. In this paper, we address both challenges. First, we provide a novel measure, called the Extended Littlestone's Dimension, which captures the number of abstentions needed to ensure a certain number of mistakes. Second, we explore the non-realizable case, and provide upper and lower bounds on the number of abstentions required by an algorithm to guarantee a specified number of mistakes.

preprint, 2015, arXiv

Active Learning from Weak and Strong Labelers

An active learner is given a hypothesis class, a large set of unlabeled examples and the ability to interactively query an oracle for the labels of a subset of these examples; the goal of the learner is to learn a hypothesis in the class that fits the data well by making as few label queries as possible. This work addresses active learning with labels obtained from strong and weak labelers, where in addition to the standard active learning setting, we have an extra weak labeler which may occasionally provide incorrect labels. An example is learning to classify medical images where either expensive labels may be obtained from a physician (oracle or strong labeler), or cheaper but occasionally incorrect labels may be obtained from a medical resident (weak labeler). Our goal is to learn a classifier with low error on data labeled by the oracle, while using the weak labeler to reduce the number of label queries made to the oracle. We provide an active learning algorithm for this setting, establish its statistical consistency, and analyze its label complexity to characterize when it can provide label savings over using the strong labeler alone.

preprint, 2015, arXiv

Spectral Learning of Large Structured HMMs for Comparative Epigenomics

We develop a latent variable model and an efficient spectral algorithm motivated by the recent emergence of very large data sets of chromatin marks from multiple human cell types. A natural model for chromatin data in one cell type is a Hidden Markov Model (HMM); we model the relationship between multiple cell types by connecting their hidden states by a fixed tree of known structure. The main challenge with learning parameters of such models is that iterative methods such as EM are very slow, while naive spectral methods result in time and space complexity exponential in the number of cell types. We exploit properties of the tree structure of the hidden states to provide spectral algorithms that are more computationally efficient for current biological datasets. We provide sample complexity bounds for our algorithm and evaluate it experimentally on biological data from nine human cell types. Finally, we show that beyond our specific model, some of our algorithmic ideas can be applied to other graphical models.

preprint, 2014, arXiv

Beyond Disagreement-based Agnostic Active Learning

We study agnostic active learning, where the goal is to learn a classifier in a pre-specified hypothesis class interactively with as few label queries as possible, while making no assumptions on the true function generating the labels. The main algorithms for this problem are disagreement-based active learning, which has a high label requirement, and margin-based active learning, which only applies to fairly restricted settings. A major challenge is to find an algorithm which achieves better label complexity, is consistent in an agnostic setting, and applies to general classification problems. In this paper, we provide such an algorithm. Our solution is based on two novel contributions -- a reduction from consistent active learning to confidence-rated prediction with guaranteed error, and a novel confidence-rated predictor.
