Source author record

Yifei Ma

Yifei Ma appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, unclaimed source record

Artificial Intelligence, Machine Learning, physics.optics, Data Structures and Algorithms, Information Retrieval

Catalog footprint

What is connected

6 works

5 topics

4 close collaborators

Preprint, 2026, arXiv

General in situ feedback control of cascaded liquid crystal spatial light modulators for structured field generation

Cascaded liquid crystal spatial light modulators provide a versatile strategy for the generation of structured light and matter fields, with applications including optical communications, photonic computing, and topological field engineering. However, experimental imperfections, such as temperature-dependent liquid crystal response, variations between individual pixels, and alignment errors, present significant engineering challenges in generating high-quality fields. Moreover, changes in experimental conditions over time mean that calibrating each component once is insufficient for maintaining long-term, high-quality field generation. To address this, we present a general engineering approach based on a bespoke, physically informed, and manifold-constrained gradient-descent scheme that enables in situ feedback control, compensating for such errors in real time without the need to alter the experimental setup. We further demonstrate the correction efficacy of our proposed strategy through experiments in both spatially varying light and matter field generation, including scenarios in which complex vectorial aberrations are artificially introduced into the setup. Together, these demonstrations underscore the practicality of our method and its suitability for deployment in real-world experimental environments, paving the way for robust operation of cascaded architectures for structured field generation.

Preprint, 2026, arXiv

Resolving topological obstructions to vectorial structured field control

The use of structured matter, such as optical retarders, for vectorial control is a well-established and widely employed technique in modern optics, and has driven continued advances in the manipulation of complex, spatially varying vectorial fields. However, achieving arbitrary field conversion typically requires the use of cascaded elements, as intrinsic physical and fabrication constraints fundamentally limit individual devices to a restricted subset of transformations. This results in an overall continuous transformation potentially failing to be continuous at the level of the parameters of the cascade, leading to detrimental engineering consequences such as the introduction of complex, discontinuous aberrations that disrupt important topological properties of the underlying matter field. In this work, we establish a novel mathematical framework for analyzing the topological difficulties that emerge in the decomposition of an overall transformation into individual layers, and for determining the minimal depth required to overcome them. The strategy introduced provides a general pathway for optimizing designs for vectorial field control and matter field generation, with particular significance for the manipulation of topological phases in optical polarization fields, such as Stokes skyrmions, where continuity is of vital importance.

Preprint, 2022, arXiv

Context Uncertainty in Contextual Bandits with Applications to Recommender Systems

Recurrent neural networks have proven effective in modeling sequential user feedback for recommender systems. However, they usually focus solely on item relevance and fail to effectively explore diverse items for users, thereby harming system performance in the long run. To address this problem, we propose a new type of recurrent neural network, dubbed the recurrent exploration network (REN), to jointly perform representation learning and effective exploration in the latent space. REN tries to balance relevance and exploration while taking into account the uncertainty in the representations. Our theoretical analysis shows that REN can preserve the rate-optimal sublinear regret even when there exists uncertainty in the learned representations. Our empirical study demonstrates that REN can achieve satisfactory long-term rewards on both synthetic and real-world recommendation datasets, outperforming state-of-the-art models.

Preprint, 2020, arXiv

Towards Optimal Off-Policy Evaluation for Reinforcement Learning with Marginalized Importance Sampling

Motivated by the many real-world applications of reinforcement learning (RL) that require safe-policy iterations, we consider the problem of off-policy evaluation (OPE) -- the problem of evaluating a new policy using the historical data obtained by different behavior policies -- under the model of nonstationary episodic Markov Decision Processes (MDP) with a long horizon and a large action space. Existing importance sampling (IS) methods often suffer from large variance that depends exponentially on the RL horizon $H$. To solve this problem, we consider a marginalized importance sampling (MIS) estimator that recursively estimates the state marginal distribution for the target policy at every step. MIS achieves a mean-squared error of $$ \frac{1}{n} \sum\nolimits_{t=1}^H\mathbb{E}_μ\left[\frac{d_t^π(s_t)^2}{d_t^μ(s_t)^2} \mathrm{Var}_μ\left[\frac{π_t(a_t|s_t)}{μ_t(a_t|s_t)}\big( V_{t+1}^π(s_{t+1}) + r_t\big) \middle| s_t\right]\right] + \tilde{O}(n^{-1.5}) $$ where $μ$ and $π$ are the logging and target policies, $d_t^μ(s_t)$ and $d_t^π(s_t)$ are the marginal distributions of the state at step $t$, $H$ is the horizon, $n$ is the sample size and $V_{t+1}^π$ is the value function of the MDP under $π$. The result matches the Cramér-Rao lower bound in Jiang and Li (2016) up to a multiplicative factor of $H$. To the best of our knowledge, this is the first OPE estimation error bound with a polynomial dependence on $H$. Besides theory, we show empirical superiority of our method in time-varying, partially observable, and long-horizon RL environments.
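The recursive marginal estimation described in this abstract can be illustrated with a minimal tabular sketch. It assumes finite state and action spaces, a known logging policy, and episodes stored as lists of `(s, a, r, s_next)` tuples. The function name `mis_estimate` and the data layout are illustrative choices, not taken from the paper.

```python
from collections import defaultdict


def mis_estimate(episodes, pi, mu, horizon):
    """Tabular marginalized importance sampling sketch (assumed layout).

    episodes: list of trajectories, each a list of (s, a, r, s_next)
              tuples of length `horizon`.
    pi, mu:   target and logging policies, indexed as policy[t][s][a].
    Returns an estimate of the target policy's expected total reward.
    """
    n = len(episodes)

    # Empirical initial-state distribution, used as the step-0 marginal.
    d = defaultdict(float)
    for ep in episodes:
        d[ep[0][0]] += 1.0 / n

    value = 0.0
    for t in range(horizon):
        counts = defaultdict(int)
        reward_sum = defaultdict(float)
        trans_sum = defaultdict(lambda: defaultdict(float))

        # Per-state, single-step importance-weighted statistics.
        for ep in episodes:
            s, a, r, s_next = ep[t]
            rho = pi[t][s][a] / mu[t][s][a]
            counts[s] += 1
            reward_sum[s] += rho * r
            trans_sum[s][s_next] += rho

        # Accumulate reward under the current marginal, then push the
        # marginal one step forward through the estimated transitions.
        d_next = defaultdict(float)
        for s, mass in d.items():
            if counts[s] == 0:
                continue
            value += mass * reward_sum[s] / counts[s]
            for s_next, w in trans_sum[s].items():
                d_next[s_next] += mass * w / counts[s]
        d = d_next
    return value


# Toy usage: one state, uniform logging policy, target always picks action 1.
toy_mu = [{0: {0: 0.5, 1: 0.5}}]
toy_pi = [{0: {0: 0.0, 1: 1.0}}]
toy_episodes = [[(0, 0, 0.0, 0)], [(0, 1, 1.0, 0)],
                [(0, 0, 0.0, 0)], [(0, 1, 1.0, 0)]]
print(mis_estimate(toy_episodes, toy_pi, toy_mu, horizon=1))  # prints 1.0
```

Only single-step importance ratios appear in this sketch, in contrast to the product of ratios over the whole trajectory used by standard importance sampling.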

Preprint, 2016, arXiv

Active Search for Sparse Signals with Region Sensing

Autonomous systems can be used to search for sparse signals in a large space; e.g., aerial robots can be deployed to localize threats, detect gas leaks, or respond to distress calls. Intuitively, search algorithms may increase efficiency by collecting aggregate measurements summarizing large contiguous regions. However, most existing search methods either ignore the possibility of such region observations (e.g., Bayesian optimization and multi-armed bandits) or make strong assumptions about the sensing mechanism that allow each measurement to arbitrarily encode all signals in the entire environment (e.g., compressive sensing). We propose an algorithm that actively collects data to search for sparse signals using only noisy measurements of the average values on rectangular regions (including single points), based on the greedy maximization of information gain. We analyze our algorithm in 1D and show that it requires $\tilde{O}(\frac{n}{μ^2}+k^2)$ measurements to recover all $k$ signal locations with small Bayes error, where $μ$ and $n$ are the signal strength and the size of the search space, respectively. We also show that active designs can be fundamentally more efficient than passive designs with region sensing, contrasting with the results of Arias-Castro, Candes, and Davenport (2013). We demonstrate the empirical performance of our algorithm on a search problem using satellite image data and in high dimensions.

Preprint, 2012, arXiv

Submodularity in Batch Active Learning and Survey Problems on Gaussian Random Fields

Many real-world datasets can be represented in the form of a graph whose edge weights designate similarities between instances. A discrete Gaussian random field (GRF) model is a finite-dimensional Gaussian process (GP) whose prior covariance is the inverse of a graph Laplacian. Minimizing the trace of the predictive covariance Sigma (V-optimality) on GRFs has proven successful in batch active learning classification problems with budget constraints. However, its worst-case bound has been missing. We show that V-optimality on GRFs, as a function of the batch query set, is submodular, and hence its greedy selection algorithm guarantees a (1-1/e) approximation ratio. Moreover, GRF models satisfy the absence-of-suppressor (AofS) condition. For active survey problems, we propose a similar survey criterion which minimizes 1'(Sigma)1. In practice, the V-optimality criterion performs better than GPs with mutual information gain criteria and allows nonuniform costs for different nodes.
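The greedy selection mentioned in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: a small regularizer `delta` is added to the Laplacian so that it can be inverted, labels are treated as noise-free, all nodes have unit cost, and the helper names `invert` and `greedy_v_optimality` are hypothetical.

```python
def invert(a):
    """Invert a small square matrix by Gauss-Jordan elimination."""
    n = len(a)
    aug = [[float(x) for x in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        # Partial pivoting for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def greedy_v_optimality(laplacian, budget, delta=0.01):
    """Greedily pick nodes that most reduce the trace of the covariance.

    Assumes prior covariance (L + delta * I)^{-1} and noise-free queries.
    Returns the chosen node indices and the remaining trace.
    """
    n = len(laplacian)
    sigma = invert([[laplacian[i][j] + (delta if i == j else 0.0)
                     for j in range(n)] for i in range(n)])
    chosen = []
    for _ in range(budget):
        best, best_gain = None, -1.0
        for v in range(n):
            if v in chosen:
                continue
            # Trace reduction from observing node v exactly.
            gain = sum(sigma[i][v] ** 2 for i in range(n)) / sigma[v][v]
            if gain > best_gain:
                best, best_gain = v, gain
        chosen.append(best)
        # Rank-one Gaussian conditioning update of the covariance.
        col = [sigma[i][best] for i in range(n)]
        pivot = sigma[best][best]
        sigma = [[sigma[i][j] - col[i] * col[j] / pivot for j in range(n)]
                 for i in range(n)]
    return chosen, sum(sigma[i][i] for i in range(n))


# Toy usage: a path graph 0 - 1 - 2 with unit edge weights.
path_laplacian = [[1, -1, 0], [-1, 2, -1], [0, -1, 1]]
print(greedy_v_optimality(path_laplacian, budget=2))
```

On this toy path graph the first query is the middle node, which is the most strongly correlated with the rest of the graph under the assumed prior.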