Source author record

Yifang Chen

Yifang Chen appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Machine Learning Artificial Intelligence physics.optics cond-mat.mes-hall quant-ph

Catalog footprint

What is connected

7works

5topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2022arXiv

A Deep Bayesian Bandits Approach for Anticancer Therapy: Exploration via Functional Prior

Learning personalized cancer treatment with machine learning holds great promise to improve cancer patients' chance of survival. Despite recent advances in machine learning and precision oncology, this approach remains challenging as collecting data in preclinical/clinical studies for modeling multiple treatment efficacies is often an expensive, time-consuming process. Moreover, the randomization in treatment allocation proves to be suboptimal since some participants/samples are not receiving the most appropriate treatments during the trial. To address this challenge, we formulate drug screening study as a "contextual bandit" problem, in which an algorithm selects anticancer therapeutics based on contextual information about cancer cell lines while adapting its treatment strategy to maximize treatment response in an "online" fashion. We propose using a novel deep Bayesian bandits framework that uses functional prior to approximate posterior for drug response prediction based on multi-modal information consisting of genomic features and drug structure. We empirically evaluate our method on three large-scale in vitro pharmacogenomic datasets and show that our approach outperforms several benchmarks in identifying optimal treatment for a given cell line.

preprint2022arXiv

Active Multi-Task Representation Learning

To leverage the power of big data from source tasks and overcome the scarcity of the target task samples, representation learning based on multi-task pretraining has become a standard approach in many applications. However, up until now, choosing which source tasks to include in the multi-task learning has been more art than science. In this paper, we give the first formal study on resource task sampling by leveraging the techniques from active learning. We propose an algorithm that iteratively estimates the relevance of each source task to the target task and samples from each source task based on the estimated relevance. Theoretically, we show that for the linear representation class, to achieve the same error rate, our algorithm can save up to a \textit{number of source tasks} factor in the source task sample complexity, compared with the naive uniform sampling from all source tasks. We also provide experiments on real-world computer vision datasets to illustrate the effectiveness of our proposed method on both linear and convolutional neural network representation classes. We believe our paper serves as an important initial step to bring techniques from active learning to representation learning.

preprint2022arXiv

Reward-Free RL is No Harder Than Reward-Aware RL in Linear Markov Decision Processes

Reward-free reinforcement learning (RL) considers the setting where the agent does not have access to a reward function during exploration, but must propose a near-optimal policy for an arbitrary reward function revealed only after exploring. In the the tabular setting, it is well known that this is a more difficult problem than reward-aware (PAC) RL -- where the agent has access to the reward function during exploration -- with optimal sample complexities in the two settings differing by a factor of $|\mathcal{S}|$, the size of the state space. We show that this separation does not exist in the setting of linear MDPs. We first develop a computationally efficient algorithm for reward-free RL in a $d$-dimensional linear MDP with sample complexity scaling as $\widetilde{\mathcal{O}}(d^2 H^5/ε^2)$. We then show a lower bound with matching dimension-dependence of $Ω(d^2 H^2/ε^2)$, which holds for the reward-aware RL setting. To our knowledge, our approach is the first computationally efficient algorithm to achieve optimal $d$ dependence in linear MDPs, even in the single-reward PAC setting. Our algorithm relies on a novel procedure which efficiently traverses a linear MDP, collecting samples in any given ``feature direction'', and enjoys a sample complexity scaling optimally in the (linear MDP equivalent of the) maximal state visitation probability. We show that this exploration procedure can also be applied to solve the problem of obtaining ``well-conditioned'' covariates in linear MDPs.

preprint2021arXiv

Improved Corruption Robust Algorithms for Episodic Reinforcement Learning

We study episodic reinforcement learning under unknown adversarial corruptions in both the rewards and the transition probabilities of the underlying system. We propose new algorithms which, compared to the existing results in (Lykouris et al., 2020), achieve strictly better regret bounds in terms of total corruptions for the tabular setting. To be specific, firstly, our regret bounds depend on more precise numerical values of total rewards corruptions and transition corruptions, instead of only on the total number of corrupted episodes. Secondly, our regret bounds are the first of their kind in the reinforcement learning setting to have the number of corruptions show up additively with respect to $\min\{\sqrt{T}, \text{PolicyGapComplexity}\}$ rather than multiplicatively. Our results follow from a general algorithmic framework that combines corruption-robust policy elimination meta-algorithms, and plug-in reward-free exploration sub-algorithms. Replacing the meta-algorithm or sub-algorithm may extend the framework to address other corrupted settings with potentially more structure.

preprint2020arXiv

More Practical and Adaptive Algorithms for Online Quantum State Learning

Online quantum state learning is a recently proposed problem by Aaronson et al. (2018), where the learner sequentially predicts $n$-qubit quantum states based on given measurements on states and noisy outcomes. In the previous work, the algorithms are worst-case optimal in general but fail in achieving tighter bounds in certain simpler or more practical cases. In this paper, we develop algorithms to advance the online learning of quantum states. First, we show that Regularized Follow-the-Leader (RFTL) method with Tallis-2 entropy can achieve an $O(\sqrt{MT})$ total loss with perfect hindsight on the first $T$ measurements with maximum rank $M$. This regret bound depends only on the maximum rank $M$ of measurements rather than the number of qubits, which takes advantage of low-rank measurements. Second, we propose a parameter-free algorithm based on a classical adjusting learning rate schedule that can achieve a regret depending on the loss of best states in hindsight, which takes advantage of low noisy outcomes. Besides these more adaptive bounds, we also show that our RFTL with Tallis-2 entropy algorithm can be implemented efficiently on near-term quantum computing devices, which is not achievable in previous works.

preprint2010arXiv

Controlling the Colour of Metals: Intaglio and Bas-Relief Metamaterials

The fabrication of indented ('intaglio') or raised ('bas-relief') sub-wavelength metamaterial patterns on a metal surface provides a mechanism for changing and controlling the colour of the metal without employing any form of chemical surface modification, thin-film coating or diffraction effects. We show that a broad range of colours can be achieved by varying the structural parameters of metamaterial designs to tune absorption resonances. This novel approach to the 'structural colouring' of pure metals offers great versatility and scalability for both aesthetic (e.g. jewellery design) and functional (e.g. sensors, optical modulators) applications. We focus here on visible colour but the concept can equally be applied to the engineering of metallic spectral response in other electromagnetic domains.

preprint2006arXiv

Focusing of Light by a Nano-Hole Array

We demonstrate a conceptually new mechanism for sub-wavelength focusing at optical frequencies based upon the use of nano-hole quasi-periodic arrays in metal screens. Using coherent illumination at 660 nm and scanning near-field optical microscopy, ~290 nm "hot spots", were observed at a distance of ~12.5 mkm from the array. Even smaller hot-spots of about 200 nm in waist were observed closer to the plane of the array.

Yifang Chen

What is connected

Connect this record

See the researcher in context

Building this map preview

7 published item(s)

A Deep Bayesian Bandits Approach for Anticancer Therapy: Exploration via Functional Prior

Active Multi-Task Representation Learning

Reward-Free RL is No Harder Than Reward-Aware RL in Linear Markov Decision Processes

Improved Corruption Robust Algorithms for Episodic Reinforcement Learning

More Practical and Adaptive Algorithms for Online Quantum State Learning

Controlling the Colour of Metals: Intaglio and Bas-Relief Metamaterials

Focusing of Light by a Nano-Hole Array