Source author record

Herbert G. Tanner

Herbert G. Tanner appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Catalog footprint

What is connected

5 works
6 topics
4 close collaborators

Published work

5 published items

Preprint · 2021 · arXiv

A Hybrid PAC Reinforcement Learning Algorithm

This paper offers a new hybrid probably approximately correct (PAC) reinforcement learning (RL) algorithm for Markov decision processes (MDPs) that intelligently maintains favorable features of its parents. The designed algorithm, referred to as the Dyna-Delayed Q-learning (DDQ) algorithm, combines model-free and model-based learning approaches while outperforming both in most cases. The paper includes a PAC analysis of the DDQ algorithm and a derivation of its sample complexity. Numerical results are provided to support the claim regarding the new algorithm's sample efficiency compared to its parents as well as the best known model-free and model-based algorithms in application.

Preprint · 2020 · arXiv

PAC Reinforcement Learning Algorithm for General-Sum Markov Games

This paper presents a theoretical framework for probably approximately correct (PAC) multi-agent reinforcement learning (MARL) algorithms for Markov games. The paper offers an extension to the well-known Nash Q-learning algorithm, using the idea of delayed Q-learning, in order to build a new PAC MARL algorithm for general-sum Markov games. In addition to guiding the design of a provably PAC MARL algorithm, the framework enables checking whether an arbitrary MARL algorithm is PAC. Comparative numerical results demonstrate performance and robustness.

Preprint · 2012 · arXiv

Networked Decision Making for Poisson Processes: Application to nuclear detection

This paper addresses a detection problem where several spatially distributed sensors independently observe a time-inhomogeneous stochastic process. The task is to decide between two hypotheses regarding the statistics of the observed process at the end of a fixed time interval. In the proposed method, each of the sensors transmits once to a fusion center a locally processed summary of its information in the form of a likelihood ratio. The fusion center then combines these messages to arrive at an optimal decision in the Neyman-Pearson framework. The approach is motivated by applications arising in the detection of mobile radioactive sources, and offers a pathway toward the development of novel fixed-interval detection algorithms that combine decentralized processing with optimal centralized decision making.
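
The sensor-to-fusion-center scheme in this abstract can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes each sensor records event times of a Poisson process whose intensity is known under both hypotheses, uses the standard log-likelihood ratio for inhomogeneous Poisson processes, and relies on sensor independence so that the fusion center can simply add the local values. The names `log_likelihood_ratio`, `fuse`, `rate_h0`, `rate_h1`, and `threshold` are hypothetical.

```python
import math


def log_likelihood_ratio(event_times, rate_h0, rate_h1, horizon, steps=1000):
    """Log-likelihood ratio (H1 vs H0) for one sensor's events on [0, horizon].

    rate_h0 and rate_h1 are assumed intensity functions of time (events per
    unit time) under each hypothesis; both must be strictly positive.
    """
    # Event term: sum of log intensity ratios at the observed event times.
    event_term = sum(math.log(rate_h1(t) / rate_h0(t)) for t in event_times)
    # Integral of (rate_h1 - rate_h0) over the interval, by the midpoint rule.
    dt = horizon / steps
    integral = sum(
        (rate_h1((i + 0.5) * dt) - rate_h0((i + 0.5) * dt)) * dt
        for i in range(steps)
    )
    return event_term - integral


def fuse(sensor_llrs, threshold):
    """Fusion center: sum the independent sensors' log-likelihood ratios and
    decide H1 (True) when the total exceeds the threshold."""
    return sum(sensor_llrs) > threshold


# Illustrative (made-up) setup: background-only rate vs. background plus source.
background = lambda t: 2.0
with_source = lambda t: 5.0
quiet_sensor = log_likelihood_ratio([1.0, 4.0], background, with_source, 10.0)
busy_sensor = log_likelihood_ratio(
    [0.2 * k for k in range(1, 50)], background, with_source, 10.0
)
decision = fuse([quiet_sensor, busy_sensor], threshold=0.0)
```

In a Neyman-Pearson test the threshold would be chosen to meet a target false-alarm probability; here it is left as a free parameter because the abstract does not give the procedure for setting it.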

Preprint · 2012 · arXiv

Stochastic receding horizon control of nonlinear stochastic systems with probabilistic state constraints

The paper describes a receding horizon control design framework for continuous-time stochastic nonlinear systems subject to probabilistic state constraints. The intention is to derive solutions that are implementable in real-time on currently available mobile processors. The approach consists of decomposing the problem into designing receding horizon reference paths based on the drift component of the system dynamics, and then implementing a stochastic optimal controller to allow the system to stay close and follow the reference path. In some cases, the stochastic optimal controller can be obtained in closed form; in more general cases, pre-computed numerical solutions can be implemented in real-time without the need for on-line computation. The convergence of the closed loop system is established assuming no constraints on control inputs, and simulation results are provided to corroborate the theoretical predictions.

Preprint · 2012 · arXiv

Symbolic Planning and Control Using Game Theory and Grammatical Inference

This paper presents an approach that brings together game theory with grammatical inference and discrete abstractions in order to synthesize control strategies for hybrid dynamical systems performing tasks in partially unknown but rule-governed adversarial environments. The combined formulation guarantees that a system specification is met if (a) the true model of the environment is in the class of models inferable from a positive presentation, (b) a characteristic sample is observed, and (c) the task specification is satisfiable given the capabilities of the system (agent) and the environment.