Source author record

Luis Pineda

Luis Pineda appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Machine Learning Artificial Intelligence Computer Vision eess.IV eess.SY math.OC Multiagent Systems Neural and Evolutionary Computing Robotics Systems and Control

Catalog footprint

What is connected

7works

10topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2023arXiv

Theseus: A Library for Differentiable Nonlinear Optimization

We present Theseus, an efficient application-agnostic open source library for differentiable nonlinear least squares (DNLS) optimization built on PyTorch, providing a common framework for end-to-end structured learning in robotics and vision. Existing DNLS implementations are application specific and do not always incorporate many ingredients important for efficiency. Theseus is application-agnostic, as we illustrate with several example applications that are built using the same underlying differentiable components, such as second-order optimizers, standard costs functions, and Lie groups. For efficiency, Theseus incorporates support for sparse solvers, automatic vectorization, batching, GPU acceleration, and gradient computation with implicit differentiation and direct loss minimization. We do extensive performance evaluation in a set of applications, demonstrating significant efficiency gains and better scalability when these features are incorporated. Project page: https://sites.google.com/view/theseus-ai

preprint2022arXiv

K-level Reasoning for Zero-Shot Coordination in Hanabi

The standard problem setting in cooperative multi-agent settings is self-play (SP), where the goal is to train a team of agents that works well together. However, optimal SP policies commonly contain arbitrary conventions ("handshakes") and are not compatible with other, independently trained agents or humans. This latter desiderata was recently formalized by Hu et al. 2020 as the zero-shot coordination (ZSC) setting and partially addressed with their Other-Play (OP) algorithm, which showed improved ZSC and human-AI performance in the card game Hanabi. OP assumes access to the symmetries of the environment and prevents agents from breaking these in a mutually incompatible way during training. However, as the authors point out, discovering symmetries for a given environment is a computationally hard problem. Instead, we show that through a simple adaption of k-level reasoning (KLR) Costa Gomes et al. 2006, synchronously training all levels, we can obtain competitive ZSC and ad-hoc teamplay performance in Hanabi, including when paired with a human-like proxy bot. We also introduce a new method, synchronous-k-level reasoning with a best response (SyKLRBR), which further improves performance on our synchronous KLR by co-training a best response.

preprint2022arXiv

On learning adaptive acquisition policies for undersampled multi-coil MRI reconstruction

Most current approaches to undersampled multi-coil MRI reconstruction focus on learning the reconstruction model for a fixed, equidistant acquisition trajectory. In this paper, we study the problem of joint learning of the reconstruction model together with acquisition policies. To this end, we extend the End-to-End Variational Network with learnable acquisition policies that can adapt to different data points. We validate our model on a coil-compressed version of the large scale undersampled multi-coil fastMRI dataset using two undersampling factors: $4\times$ and $8\times$. Our experiments show on-par performance with the learnable non-adaptive and handcrafted equidistant strategies at $4\times$, and an observed improvement of more than $2\%$ in SSIM at $8\times$ acceleration, suggesting that potentially-adaptive $k$-space acquisition trajectories can improve reconstructed image quality for larger acceleration factors. However, and perhaps surprisingly, our best performing policies learn to be explicitly non-adaptive.

preprint2021arXiv

Learning Causal State Representations of Partially Observable Environments

Intelligent agents can cope with sensory-rich environments by learning task-agnostic state abstractions. In this paper, we propose an algorithm to approximate causal states, which are the coarsest partition of the joint history of actions and observations in partially-observable Markov decision processes (POMDP). Our method learns approximate causal state representations from RNNs trained to predict subsequent observations given the history. We demonstrate that these learned state representations are useful for learning policies efficiently in reinforcement learning problems with rich observation spaces. We connect causal states with causal feature sets from the causal inference literature, and also provide theoretical guarantees on the optimality of the continuous version of this causal state representation under Lipschitz assumptions by proving equivalence to bisimulation, a relation between behaviorally equivalent systems. This allows for lower bounds on the optimal value function of the learned representation, which is tight given certain assumptions. Finally, we empirically evaluate causal state representations using multiple partially observable tasks and compare with prior methods.

preprint2021arXiv

On the Importance of Hyperparameter Optimization for Model-based Reinforcement Learning

Model-based Reinforcement Learning (MBRL) is a promising framework for learning control in a data-efficient manner. MBRL algorithms can be fairly complex due to the separate dynamics modeling and the subsequent planning algorithm, and as a result, they often possess tens of hyperparameters and architectural choices. For this reason, MBRL typically requires significant human expertise before it can be applied to new problems and domains. To alleviate this problem, we propose to use automatic hyperparameter optimization (HPO). We demonstrate that this problem can be tackled effectively with automated HPO, which we demonstrate to yield significantly improved performance compared to human experts. In addition, we show that tuning of several MBRL hyperparameters dynamically, i.e. during the training itself, further improves the performance compared to using static hyperparameters which are kept fixed for the whole training. Finally, our experiments provide valuable insights into the effects of several hyperparameters, such as plan horizon or learning rate and their influence on the stability of training and resulting rewards.

preprint2020arXiv

Elucidating image-to-set prediction: An analysis of models, losses and datasets

In this paper, we identify an important reproducibility challenge in the image-to-set prediction literature that impedes proper comparisons among published methods, namely, researchers use different evaluation protocols to assess their contributions. To alleviate this issue, we introduce an image-to-set prediction benchmark suite built on top of five public datasets of increasing task complexity that are suitable for multi-label classification (VOC, COCO, NUS-WIDE, ADE20k and Recipe1M). Using the benchmark, we provide an in-depth analysis where we study the key components of current models, namely the choice of the image representation backbone as well as the set predictor design. Our results show that (1) exploiting better image representation backbones leads to higher performance boosts than enhancing set predictors, and (2) modeling both the label co-occurrences and ordering has a slight positive impact in terms of performance, whereas explicit cardinality prediction only helps when training on complex datasets, such as Recipe1M. To facilitate future image-to-set prediction research, we make the code, best models and dataset splits publicly available at: https://github.com/facebookresearch/image-to-set.

preprint2020arXiv

Planning in Stochastic Environments with Goal Uncertainty

We present the Goal Uncertain Stochastic Shortest Path (GUSSP) problem -- a general framework to model path planning and decision making in stochastic environments with goal uncertainty. The framework extends the stochastic shortest path (SSP) model to dynamic environments in which it is impossible to determine the exact goal states ahead of plan execution. GUSSPs introduce flexibility in goal specification by allowing a belief over possible goal configurations. The unique observations at potential goals helps the agent identify the true goal during plan execution. The partial observability is restricted to goals, facilitating the reduction to an SSP with a modified state space. We formally define a GUSSP and discuss its theoretical properties. We then propose an admissible heuristic that reduces the planning time using FLARES -- a start-of-the-art probabilistic planner. We also propose a determinization approach for solving this class of problems. Finally, we present empirical results on a search and rescue mobile robot and three other problem domains in simulation.

Luis Pineda

What is connected

Connect this record

See the researcher in context

Building this map preview

7 published item(s)

Theseus: A Library for Differentiable Nonlinear Optimization

K-level Reasoning for Zero-Shot Coordination in Hanabi

On learning adaptive acquisition policies for undersampled multi-coil MRI reconstruction

Learning Causal State Representations of Partially Observable Environments

On the Importance of Hyperparameter Optimization for Model-based Reinforcement Learning

Elucidating image-to-set prediction: An analysis of models, losses and datasets

Planning in Stochastic Environments with Goal Uncertainty