Researcher profile

Thommen George Karimpanal

Thommen George Karimpanal contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 19 - UnverifiedVerification L1Unclaimed author
5works
0followers
5topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

5 published item(s)

preprint2022arXiv

Fast Model-based Policy Search for Universal Policy Networks

Adapting an agent's behaviour to new environments has been one of the primary focus areas of physics based reinforcement learning. Although recent approaches such as universal policy networks partially address this issue by enabling the storage of multiple policies trained in simulation on a wide range of dynamic/latent factors, efficiently identifying the most appropriate policy for a given environment remains a challenge. In this work, we propose a Gaussian Process-based prior learned in simulation, that captures the likely performance of a policy when transferred to a previously unseen environment. We integrate this prior with a Bayesian Optimisation-based policy search process to improve the efficiency of identifying the most appropriate policy from the universal policy network. We empirically evaluate our approach in a range of continuous and discrete control environments, and show that it outperforms other competing baselines.

preprint2022arXiv

Intuitive Physics Guided Exploration for Sample Efficient Sim2real Transfer

Physics-based reinforcement learning tasks can benefit from simplified physics simulators as they potentially allow near-optimal policies to be learned in simulation. However, such simulators require the latent factors (e.g. mass, friction coefficient etc.) of the associated objects and other environment-specific factors (e.g. wind speed, air density etc.) to be accurately specified, without which, it could take considerable additional learning effort to adapt the learned simulation policy to the real environment. As such a complete specification can be impractical, in this paper, we instead, focus on learning task-specific estimates of latent factors which allow the approximation of real world trajectories in an ideal simulation environment. Specifically, we propose two new concepts: a) action grouping - the idea that certain types of actions are closely associated with the estimation of certain latent factors, and; b) partial grounding - the idea that simulation of task-specific dynamics may not need precise estimation of all the latent factors. We first introduce intuitive action groupings based on human physics knowledge and experience, which is then used to design novel strategies for interacting with the real environment. Next, we describe how prior knowledge of a task in a given environment can be used to extract the relative importance of different latent factors, and how this can be used to inform partial grounding, which enables efficient learning of the task in any arbitrary environment. We demonstrate our approach in a range of physics based tasks, and show that it achieves superior performance relative to other baselines, using only a limited number of real-world interactions.

preprint2022arXiv

Uncertainty Aware System Identification with Universal Policies

Sim2real transfer is primarily concerned with transferring policies trained in simulation to potentially noisy real world environments. A common problem associated with sim2real transfer is estimating the real-world environmental parameters to ground the simulated environment to. Although existing methods such as Domain Randomisation (DR) can produce robust policies by sampling from a distribution of parameters during training, there is no established method for identifying the parameters of the corresponding distribution for a given real-world setting. In this work, we propose Uncertainty-aware policy search (UncAPS), where we use Universal Policy Network (UPN) to store simulation-trained task-specific policies across the full range of environmental parameters and then subsequently employ robust Bayesian optimisation to craft robust policies for the given environment by combining relevant UPN policies in a DR like fashion. Such policy-driven grounding is expected to be more efficient as it estimates only task-relevant sets of parameters. Further, we also account for the estimation uncertainties in the search process to produce policies that are robust against both aleatoric and epistemic uncertainties. We empirically evaluate our approach in a range of noisy, continuous control environments, and show its improved performance compared to competing baselines.

preprint2020arXiv

Learning Transferable Domain Priors for Safe Exploration in Reinforcement Learning

Prior access to domain knowledge could significantly improve the performance of a reinforcement learning agent. In particular, it could help agents avoid potentially catastrophic exploratory actions, which would otherwise have to be experienced during learning. In this work, we identify consistently undesirable actions in a set of previously learned tasks, and use pseudo-rewards associated with them to learn a prior policy. In addition to enabling safer exploratory behaviors in subsequent tasks in the domain, we show that these priors are transferable to similar environments, and can be learned off-policy and in parallel with the learning of other tasks in the domain. We compare our approach to established, state-of-the-art algorithms in both discrete as well as continuous environments, and demonstrate that it exhibits a safer exploratory behavior while learning to perform arbitrary tasks in the domain. We also present a theoretical analysis to support these results, and briefly discuss the implications and some alternative formulations of this approach, which could also be useful in certain scenarios.

preprint2020arXiv

Neuro-evolutionary Frameworks for Generalized Learning Agents

The recent successes of deep learning and deep reinforcement learning have firmly established their statuses as state-of-the-art artificial learning techniques. However, longstanding drawbacks of these approaches, such as their poor sample efficiencies and limited generalization capabilities point to a need for re-thinking the way such systems are designed and deployed. In this paper, we emphasize how the use of these learning systems, in conjunction with a specific variation of evolutionary algorithms could lead to the emergence of unique characteristics such as the automated acquisition of a variety of desirable behaviors and useful sets of behavior priors. This could pave the way for learning to occur in a generalized and continual manner, with minimal interactions with the environment. We discuss the anticipated improvements from such neuro-evolutionary frameworks, along with the associated challenges, as well as its potential for application to a number of research areas.