Source author record

Jinxin Liu

Jinxin Liu appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher | Unclaimed source record

Machine Learning, Methodology, physics.med-ph

Catalog footprint

What is connected

3 works

3 topics

4 close collaborators

preprint, 2023, arXiv

Offline Imitation Learning with Variational Counterfactual Reasoning

In offline imitation learning (IL), an agent aims to learn an optimal expert behavior policy without additional online environment interactions. However, in many real-world scenarios, such as robotic manipulation, the offline dataset is collected from suboptimal behaviors without rewards. Because expert data is scarce, agents tend to simply memorize poor trajectories and are vulnerable to variations in the environment, lacking the ability to generalize to new environments. To automatically generate high-quality expert data and improve the agent's generalization ability, we propose a framework named Offline Imitation Learning with Counterfactual data Augmentation (OILCA), which performs counterfactual inference. In particular, we leverage an identifiable variational autoencoder to generate counterfactual samples for expert data augmentation. We theoretically analyze the influence of the generated expert data and the improvement of generalization. Moreover, we conduct extensive experiments to demonstrate that our approach significantly outperforms various baselines on both the DeepMind Control Suite benchmark for in-distribution performance and the CausalWorld benchmark for out-of-distribution generalization. Our code is available at https://github.com/ZexuSun/OILCA-NeurIPS23.

preprint, 2022, arXiv

DARA: Dynamics-Aware Reward Augmentation in Offline Reinforcement Learning

Offline reinforcement learning algorithms promise to be applicable in settings where a fixed dataset is available and no new experience can be acquired. However, such a formulation is inevitably offline-data-hungry and, in practice, collecting a large offline dataset for one specific task over one specific environment is also costly and laborious. In this paper, we thus 1) formulate the offline dynamics adaptation by using (source) offline data collected from another dynamics to relax the requirement for the extensive (target) offline data, 2) characterize the dynamics shift problem in which prior offline methods do not scale well, and 3) derive a simple dynamics-aware reward augmentation (DARA) framework from both model-free and model-based offline settings. Specifically, DARA emphasizes learning from those source transition pairs that are adaptive for the target environment and mitigates the offline dynamics shift by characterizing state-action-next-state pairs instead of the typical state-action distribution sketched by prior offline RL methods. The experimental evaluation demonstrates that DARA, by augmenting rewards in the source offline dataset, can acquire an adaptive policy for the target environment and yet significantly reduce the requirement of target offline data. With only modest amounts of target offline data, DARA consistently outperforms prior offline RL methods in both simulated and real-world tasks.
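
The idea of augmenting source rewards according to how well each state-action-next-state triple fits the target dynamics can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the abstract: the penalty is taken to be the log-ratio of source to target transition densities, the one-dimensional Gaussian dynamics are a toy stand-in for dynamics that would in practice have to be estimated from data, and the names `augment_rewards` and `eta` are hypothetical.

```python
import math


def gaussian_logpdf(x, mean, std):
    """Log density of a 1-D normal distribution."""
    return -0.5 * math.log(2 * math.pi * std ** 2) - (x - mean) ** 2 / (2 * std ** 2)


def source_logp(s, a, s_next):
    # Toy source dynamics (assumption): s' ~ N(s + a, 0.1^2)
    return gaussian_logpdf(s_next, s + a, 0.1)


def target_logp(s, a, s_next):
    # Toy target dynamics (assumption): s' ~ N(s + 0.5 * a, 0.1^2)
    return gaussian_logpdf(s_next, s + 0.5 * a, 0.1)


def augment_rewards(transitions, eta):
    """Return source transitions with rewards shifted by -eta * dynamics gap.

    The gap is log p_source(s'|s,a) - log p_target(s'|s,a): transitions that
    are more plausible under the source than under the target are penalized,
    and transitions that fit the target dynamics are rewarded.
    """
    augmented = []
    for s, a, r, s_next in transitions:
        gap = source_logp(s, a, s_next) - target_logp(s, a, s_next)
        augmented.append((s, a, r - eta * gap, s_next))
    return augmented


# (state, action, reward, next_state) tuples from a hypothetical source dataset
source_data = [
    (0.0, 1.0, 1.0, 0.5),  # next state matches the target dynamics
    (0.0, 1.0, 1.0, 1.0),  # next state matches the source dynamics only
    (0.0, 0.0, 1.0, 0.05), # zero action: both dynamics agree
]
augmented = augment_rewards(source_data, eta=0.1)
for transition in augmented:
    print(transition)
```

In this toy setting the first transition has its reward raised, the second has it lowered, and the third is left unchanged, which matches the abstract's description of emphasizing source transitions that are adaptive for the target environment.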

preprint, 2020, arXiv

A no-gold-standard technique to objectively evaluate quantitative imaging methods using patient data: Theory

Objective evaluation of quantitative imaging (QI) methods using measurements directly obtained from patient images is highly desirable but hindered by the non-availability of gold standards. To address this issue, statistical techniques have been proposed to objectively evaluate QI methods without a gold standard. These techniques assume that the measured and true values are linearly related by a slope, bias, and normally distributed noise term, where it is assumed that the noise term between the different methods is independent. However, the noise could be correlated since it arises in the process of measuring the same true value. To address this issue, we propose a new no-gold-standard evaluation (NGSE) technique that models this noise as a multivariate normally distributed term, characterized by a covariance matrix. In this manuscript, we derive a maximum-likelihood-based technique that, without any knowledge of the true QI values, estimates the slope, bias, and covariance matrix terms. These are then used to rank the methods on the basis of precision of the measured QI values. Overall, the derivation demonstrates the mathematical premise behind the proposed NGSE technique.
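
The measurement model described in this abstract (measured value = slope × true value + bias + noise, with noise that is jointly normal across methods) can be sketched as follows. This is an illustration under stated assumptions, not the paper's estimator: the slopes, biases, and covariance matrix are treated as already known rather than estimated by maximum likelihood, all numeric values are invented, and the ranking criterion (noise standard deviation divided by the absolute slope) is an assumed figure of merit for precision.

```python
import math
import random


def cholesky(cov):
    """Lower-triangular factor L with L L^T = cov (cov symmetric positive definite)."""
    n = len(cov)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(cov[i][i] - s)
            else:
                L[i][j] = (cov[i][j] - s) / L[j][j]
    return L


def simulate_measurements(true_values, slopes, biases, cov, rng):
    """One row per patient, one column per method, with correlated noise."""
    L = cholesky(cov)
    n = len(slopes)
    rows = []
    for a in true_values:
        z = [rng.gauss(0.0, 1.0) for _ in range(n)]
        # Correlated noise: L z has covariance L L^T = cov
        noise = [sum(L[k][j] * z[j] for j in range(k + 1)) for k in range(n)]
        rows.append([slopes[k] * a + biases[k] + noise[k] for k in range(n)])
    return rows


def rank_by_noise_to_slope(slopes, cov):
    """Rank methods from most to least precise by std_k / |slope_k| (assumed criterion)."""
    ratios = [math.sqrt(cov[k][k]) / abs(slopes[k]) for k in range(len(slopes))]
    order = sorted(range(len(slopes)), key=lambda k: ratios[k])
    return order, ratios


# Hypothetical parameters for three methods
slopes = [1.0, 0.8, 1.2]
biases = [0.0, 0.5, -0.3]
cov = [
    [0.04, 0.01, 0.00],
    [0.01, 0.09, 0.02],
    [0.00, 0.02, 0.01],
]

rng = random.Random(0)
true_values = [rng.uniform(1.0, 5.0) for _ in range(100)]
measurements = simulate_measurements(true_values, slopes, biases, cov, rng)
order, ratios = rank_by_noise_to_slope(slopes, cov)
print("ranking (most precise first):", order)
```

The off-diagonal entries of `cov` are what distinguish this model from the earlier independent-noise assumption; setting them to zero recovers that simpler case.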