Researcher profile

Steven Latré

Trust snapshot

Quick read

Trust 19 - Unverified
Verification L1
Unclaimed author
5 works
0 followers
8 topics
4 close collaborators

Published work

5 published items

Preprint, 2022, arXiv

An Analysis of Discretization Methods for Communication Learning with Multi-Agent Reinforcement Learning

Communication is crucial in multi-agent reinforcement learning when agents are not able to observe the full state of the environment. The most common approach to allow learned communication between agents is the use of a differentiable communication channel that allows gradients to flow between agents as a form of feedback. However, this is challenging when we want to use discrete messages to reduce the message size, since gradients cannot flow through a discrete communication channel. Previous work proposed methods to deal with this problem. However, these methods were tested in different communication learning architectures and environments, making it hard to compare them. In this paper, we compare several state-of-the-art discretization methods as well as two methods that have not been used for communication learning before. We do this comparison in the context of communication learning using gradients from other agents and perform tests on several environments. Our results show that none of the methods is best in all environments. The best choice of discretization method greatly depends on the environment. However, the discretize regularize unit (DRU), the straight-through DRU and the straight-through Gumbel-Softmax show the most consistent results across all the tested environments. Therefore, these methods prove to be the best choice for general use, while the straight-through estimator and the Gumbel-Softmax may provide better results in specific environments but fail completely in others.
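As a rough illustration of one of the methods named in this abstract, the sketch below shows the forward pass of a discretize regularize unit (DRU). It assumes the usual formulation, which the abstract itself does not spell out: during training the message logit is perturbed with Gaussian noise and squashed by a logistic function, so the channel stays differentiable, and at execution time the logit is thresholded to a hard bit. The function names, the default noise level `sigma` and the use of a scalar message are illustrative assumptions, not details taken from the paper.

```python
import math
import random
from typing import Optional


def stable_sigmoid(x: float) -> float:
    """Logistic function written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def dru(logit: float, training: bool, sigma: float = 2.0,
        rng: Optional[random.Random] = None) -> float:
    """Forward pass of a DRU for a single scalar message (assumed formulation).

    Training: add Gaussian noise, then apply the logistic function. The output
    is continuous, so gradients could flow through it in an autodiff framework.
    Execution: threshold the logit at zero to obtain a discrete bit.
    """
    if training:
        source = rng if rng is not None else random
        noisy_logit = logit + source.gauss(0.0, sigma)
        return stable_sigmoid(noisy_logit)
    return 1.0 if logit > 0 else 0.0


if __name__ == "__main__":
    rng = random.Random(0)
    print("training message:", dru(1.5, training=True, rng=rng))
    print("execution message:", dru(1.5, training=False))
```

The noise pushes the sender toward logits far from zero during training, which is what makes the hard threshold at execution time a close match to what the receiver saw while learning. This sketch covers only the forward computation; the gradient flow that the abstract discusses would require an automatic differentiation library.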

Preprint, 2022, arXiv

Learning to Communicate Using Counterfactual Reasoning

Learning to communicate in order to share state information is an active problem in the area of multi-agent reinforcement learning (MARL). The credit assignment problem, the non-stationarity of the communication environment and the creation of influenceable agents are major challenges within this research field that need to be overcome in order to learn a valid communication protocol. This paper introduces the novel multi-agent counterfactual communication learning (MACC) method, which adapts counterfactual reasoning in order to overcome the credit assignment problem for communicating agents. Secondly, the non-stationarity of the communication environment while learning the communication Q-function is overcome by creating the communication Q-function using the action policy of the other agents and the Q-function of the action environment. Additionally, a social loss function is introduced in order to create influenceable agents, which is required to learn a valid communication protocol. Our experiments show that MACC is able to outperform the state-of-the-art baselines in four different scenarios in the Particle environment.

Preprint, 2020, arXiv

HTMRL: Biologically Plausible Reinforcement Learning with Hierarchical Temporal Memory

Building Reinforcement Learning (RL) algorithms that are able to adapt to continuously evolving tasks is an open research challenge. One technology that is known to inherently handle such non-stationary input patterns well is Hierarchical Temporal Memory (HTM), a general and biologically plausible computational model for the human neocortex. As the RL paradigm is inspired by human learning, HTM is a natural framework for an RL algorithm supporting non-stationary environments. In this paper, we present HTMRL, the first strictly HTM-based RL algorithm. We empirically and statistically show that HTMRL scales to many states and actions, and demonstrate that HTM's ability to adapt to changing patterns extends to RL. Specifically, HTMRL performs well on a 10-armed bandit after 750 steps, but only needs a third of that to adapt to the bandit suddenly shuffling its arms. HTMRL is the first iteration of a novel RL approach, with the potential of extending to a capable algorithm for Meta-RL.

Preprint, 2020, arXiv

Neurosciences and 6G: Lessons from and Needs of Communicative Brains

This paper presents the first comprehensive tutorial on a promising research field located at the frontier of two well-established domains, neurosciences and wireless communications, motivated by the ongoing efforts to define what the sixth generation of mobile networks (6G) will be. In particular, this tutorial first provides a novel integrative approach that bridges the gap between these two seemingly disparate fields. Then, we present the state of the art and key challenges of these two topics. In particular, we propose a novel systematization that divides the contributions into two groups, one focused on what neurosciences will offer to 6G in terms of new applications and systems architecture (Neurosciences for Wireless), and the other focused on how wireless communication theory and 6G systems can provide new ways to study the brain (Wireless for Neurosciences). For the first group, we concretely explain how current scientific understanding of the brain would enable new applications for 6G within the context of a new type of service that we dub brain-type communications and that has more stringent requirements than human- and machine-type communication. In this regard, we expose the key requirements of brain-type communication services and we discuss how future wireless networks can be equipped to deal with such services. Meanwhile, for the second group, we thoroughly explore modern communication system paradigms, including the Internet of Bio-nano Things and chaos-based communications, in addition to highlighting how complex systems tools can help bridge 6G and neuroscience applications. Brain-controlled vehicles are then presented as our case study. All in all, this tutorial is expected to provide a largely missing articulation between these two emerging fields while delineating concrete ways to move forward in such an interdisciplinary endeavor.

Preprint, 2020, arXiv

Pre-trained Word Embeddings for Goal-conditional Transfer Learning in Reinforcement Learning

Reinforcement learning (RL) algorithms typically start tabula rasa, without any prior knowledge of the environment and without any prior skills. This, however, often leads to low sample efficiency, requiring a large amount of interaction with the environment. This is especially true in a lifelong learning setting, in which the agent needs to continually extend its capabilities. In this paper, we examine how a pre-trained task-independent language model can make a goal-conditional RL agent more sample efficient. We do this by facilitating transfer learning between different related tasks. We experimentally demonstrate our approach on a set of object navigation tasks.