Source author record

Bryan Lim

Bryan Lim appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Machine Learning Artificial Intelligence Neural and Evolutionary Computing Robotics Information Retrieval q-fin.PM q-fin.TR

Catalog footprint

What is connected

3works

7topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2022arXiv

Enhancing Cross-Sectional Currency Strategies by Context-Aware Learning to Rank with Self-Attention

The performance of a cross-sectional currency strategy depends crucially on accurately ranking instruments prior to portfolio construction. While this ranking step is traditionally performed using heuristics, or by sorting the outputs produced by pointwise regression or classification techniques, strategies using Learning to Rank algorithms have recently presented themselves as competitive and viable alternatives. Although the rankers at the core of these strategies are learned globally and improve ranking accuracy on average, they ignore the differences between the distributions of asset features over the times when the portfolio is rebalanced. This flaw renders them susceptible to producing sub-optimal rankings, possibly at important periods when accuracy is actually needed the most. For example, this might happen during critical risk-off episodes, which consequently exposes the portfolio to substantial, unwanted drawdowns. We tackle this shortcoming with an analogous idea from information retrieval: that a query's top retrieved documents or the local ranking context provide vital information about the query's own characteristics, which can then be used to refine the initial ranked list. In this work, we use a context-aware Learning-to-rank model that is based on the Transformer architecture to encode top/bottom ranked assets, learn the context and exploit this information to re-rank the initial results. Backtesting on a slate of 31 currencies, our proposed methodology increases the Sharpe ratio by around 30% and significantly enhances various performance metrics. Additionally, this approach also improves the Sharpe ratio when separately conditioning on normal and risk-off market states.

preprint2022arXiv

Learning to Walk Autonomously via Reset-Free Quality-Diversity

Quality-Diversity (QD) algorithms can discover large and complex behavioural repertoires consisting of both diverse and high-performing skills. However, the generation of behavioural repertoires has mainly been limited to simulation environments instead of real-world learning. This is because existing QD algorithms need large numbers of evaluations as well as episodic resets, which require manual human supervision and interventions. This paper proposes Reset-Free Quality-Diversity optimization (RF-QD) as a step towards autonomous learning for robotics in open-ended environments. We build on Dynamics-Aware Quality-Diversity (DA-QD) and introduce a behaviour selection policy that leverages the diversity of the imagined repertoire and environmental information to intelligently select of behaviours that can act as automatic resets. We demonstrate this through a task of learning to walk within defined training zones with obstacles. Our experiments show that we can learn full repertoires of legged locomotion controllers autonomously without manual resets with high sample efficiency in spite of harsh safety constraints. Finally, using an ablation of different target objectives, we show that it is important for RF-QD to have diverse types solutions available for the behaviour selection policy over solutions optimised with a specific objective. Videos and code available at https://sites.google.com/view/rf-qd.

preprint2021arXiv

Dynamics-Aware Quality-Diversity for Efficient Learning of Skill Repertoires

Quality-Diversity (QD) algorithms are powerful exploration algorithms that allow robots to discover large repertoires of diverse and high-performing skills. However, QD algorithms are sample inefficient and require millions of evaluations. In this paper, we propose Dynamics-Aware Quality-Diversity (DA-QD), a framework to improve the sample efficiency of QD algorithms through the use of dynamics models. We also show how DA-QD can then be used for continual acquisition of new skill repertoires. To do so, we incrementally train a deep dynamics model from experience obtained when performing skill discovery using QD. We can then perform QD exploration in imagination with an imagined skill repertoire. We evaluate our approach on three robotic experiments. First, our experiments show DA-QD is 20 times more sample efficient than existing QD approaches for skill discovery. Second, we demonstrate learning an entirely new skill repertoire in imagination to perform zero-shot learning. Finally, we show how DA-QD is useful and effective for solving a long horizon navigation task and for damage adaptation in the real world. Videos and source code are available at: https://sites.google.com/view/da-qd.