Source author record

Daniel A. Braun

Daniel A. Braun appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Artificial Intelligence, Machine Learning, Information Theory, math.IT, Systems and Control, Computer Science and Game Theory, math.OC, math.ST, Statistics Theory, econ.TH, math.CO, math.FA, Robotics

Catalog footprint

19 works, 13 topics, 4 close collaborators

preprint, 2022, arXiv

Mixture-of-Variational-Experts for Continual Learning

One weakness of machine learning algorithms is the poor ability of models to solve new problems without forgetting previously acquired knowledge. The Continual Learning (CL) paradigm has emerged as a protocol to systematically investigate settings where the model sequentially observes samples generated by a series of tasks. In this work, we take a task-agnostic view of continual learning and develop a hierarchical information-theoretic optimality principle that facilitates a trade-off between learning and forgetting. We discuss this principle from a Bayesian perspective and show its connections to previous approaches to CL. Based on this principle, we propose a neural network layer, called the Mixture-of-Variational-Experts layer, that alleviates forgetting by creating a set of information processing paths through the network which is governed by a gating policy. Due to the general formulation based on generic utility functions, we can apply this optimality principle to a large variety of learning problems, including supervised learning, reinforcement learning, and generative modeling. We demonstrate the competitive performance of our method in continual supervised learning and in continual reinforcement learning.

preprint, 2022, arXiv

The classification of preordered spaces in terms of monotones: complexity and optimization

The study of complexity and optimization in decision theory involves both partial and complete characterizations of preferences over decision spaces in terms of real-valued monotones. With this motivation, and following the recent introduction of new classes of monotones, like injective monotones or strict monotone multi-utilities, we present the classification of preordered spaces in terms of both the existence and cardinality of real-valued monotones and the cardinality of the quotient space. In particular, we take advantage of a characterization of real-valued monotones in terms of separating families of increasing sets in order to obtain a more complete classification consisting of classes that are strictly different from each other. As a result, we gain new insight into both complexity and optimization, and clarify their interplay in preordered spaces.

preprint, 2021, arXiv

Representing preorders with injective monotones

We introduce a new class of real-valued monotones in preordered spaces, injective monotones. We show that the class of preorders for which they exist lies in between the class of preorders with strict monotones and preorders with countable multi-utilities, improving upon the known classification of preordered spaces through real-valued monotones. We extend several well-known results for strict monotones (Richter-Peleg functions) to injective monotones, we provide a construction of injective monotones from countable multi-utilities, and relate injective monotones to classic results concerning Debreu denseness and order separability. Along the way, we connect our results to Shannon entropy and the uncertainty preorder, obtaining new insights into how they are related. In particular, we show how injective monotones can be used to generalize some appealing properties of Jaynes' maximum entropy principle, which is considered a basis for statistical inference and serves as a justification for many regularization techniques that appear throughout machine learning and decision theory.
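
The link between a preorder and a real-valued monotone can be illustrated with a small sketch. It assumes that the uncertainty preorder is read as majorization between probability vectors, which the abstract does not spell out; the helper names `majorizes` and `shannon_entropy` are hypothetical. Shannon entropy never increases when moving to a majorizing distribution, yet on its own it cannot tell an incomparable pair from a comparable one.

```python
import math

def majorizes(q, p, tol=1e-12):
    """True if q majorizes p: every partial sum of q, sorted in
    descending order, is at least the matching partial sum of p."""
    q_sorted = sorted(q, reverse=True)
    p_sorted = sorted(p, reverse=True)
    q_sum = p_sum = 0.0
    for q_i, p_i in zip(q_sorted, p_sorted):
        q_sum += q_i
        p_sum += p_i
        if q_sum < p_sum - tol:  # tolerance guards against rounding error
            return False
    return True

def shannon_entropy(p):
    """Shannon entropy in bits; zero-probability outcomes contribute nothing."""
    return -sum(x * math.log2(x) for x in p if x > 0)

# A comparable pair: the peaked distribution majorizes the flatter one,
# so the flatter one has the larger entropy.
peaked = [0.7, 0.2, 0.1]
flat = [0.4, 0.3, 0.3]

# An incomparable pair: neither majorizes the other, although entropy
# still ranks them. One monotone alone does not recover the preorder.
a = [0.5, 0.25, 0.25]
b = [0.4, 0.4, 0.2]
```

Entropy assigns a number to every distribution and therefore orders `a` and `b`, even though the preorder leaves them unranked. This is the kind of gap that richer classes of monotones are meant to address.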

preprint, 2020, arXiv

Hierarchical Expert Networks for Meta-Learning

The goal of meta-learning is to train a model on a variety of learning tasks, such that it can adapt to new problems within only a few iterations. Here we propose a principled information-theoretic model that optimally partitions the underlying problem space such that specialized expert decision-makers solve the resulting sub-problems. To drive this specialization we impose the same kind of information processing constraints both on the partitioning and the expert decision-makers. We argue that this specialization leads to efficient adaptation to new tasks. To demonstrate the generality of our approach we evaluate three meta-learning domains: image classification, regression, and reinforcement learning.

preprint, 2019, arXiv

An Information-theoretic On-line Learning Principle for Specialization in Hierarchical Decision-Making Systems

Information-theoretic bounded rationality describes utility-optimizing decision-makers whose limited information-processing capabilities are formalized by information constraints. One of the consequences of bounded rationality is that resource-limited decision-makers can join together to solve decision-making problems that are beyond the capabilities of each individual. Here, we study an information-theoretic principle that drives division of labor and specialization when decision-makers with information constraints are joined together. We devise an on-line learning rule of this principle that learns a partitioning of the problem space such that it can be solved by specialized linear policies. We demonstrate the approach for decision-making problems whose complexity exceeds the capabilities of individual decision-makers, but can be solved by combining the decision-makers optimally. The strength of the model is that it is abstract and principled, yet has direct applications in classification, regression, reinforcement learning and adaptive control.

preprint, 2016, arXiv

Planning with Information-Processing Constraints and Model Uncertainty in Markov Decision Processes

Information-theoretic principles for learning and acting have been proposed to solve particular classes of Markov Decision Problems. Mathematically, such approaches are governed by a variational free energy principle and allow solving MDP planning problems with information-processing constraints expressed in terms of a Kullback-Leibler divergence with respect to a reference distribution. Here we consider a generalization of such MDP planners by taking model uncertainty into account. As model uncertainty can also be formalized as an information-processing constraint, we can derive a unified solution from a single generalized variational principle. We provide a generalized value iteration scheme together with a convergence proof. As limit cases, this generalized scheme includes standard value iteration with a known model, Bayesian MDP planning, and robust planning. We demonstrate the benefits of this approach in a grid world simulation.
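
The known-model case of such a planner can be sketched as a value iteration in which the usual maximum over actions is replaced by a log-sum-exp under a reference policy. This is an illustration only, not the paper's generalized scheme: it omits model uncertainty entirely, and the function name, argument layout and the inverse-temperature parameter `beta` are assumptions made for the example.

```python
import math

def soft_value_iteration(P, R, prior, beta, gamma=0.9, sweeps=500):
    """KL-regularized value iteration for a known, finite MDP.

    P[s][a][t] : transition probability from state s to t under action a
    R[s][a]    : immediate reward
    prior[s][a]: reference policy that the KL cost is measured against
    beta       : nonzero inverse temperature; a large beta approaches
                 standard value iteration, a small positive beta
                 approaches evaluation of the reference policy
    """
    n_states = len(R)
    V = [0.0] * n_states
    for _ in range(sweeps):
        new_V = []
        for s in range(n_states):
            n_actions = len(R[s])
            q = [R[s][a] + gamma * sum(P[s][a][t] * V[t] for t in range(n_states))
                 for a in range(n_actions)]
            # Shift by the largest Q-value before exponentiating for stability.
            q_max = max(q)
            z = sum(prior[s][a] * math.exp(beta * (q[a] - q_max))
                    for a in range(n_actions))
            new_V.append(q_max + math.log(z) / beta)
        V = new_V
    return V

# One state, two actions with rewards 0 and 1, uniform reference policy.
P = [[[1.0], [1.0]]]
R = [[0.0, 1.0]]
prior = [[0.5, 0.5]]
v_greedy = soft_value_iteration(P, R, prior, beta=100.0)[0]
v_prior = soft_value_iteration(P, R, prior, beta=0.001)[0]
```

In this toy problem the large-`beta` value is close to the standard optimum of 10, while the small-`beta` value is close to 5, the value of following the uniform reference policy.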

preprint, 2015, arXiv

Adaptive information-theoretic bounded rational decision-making with parametric priors

Deviations from rational decision-making due to limited computational resources have been studied in the field of bounded rationality, originally proposed by Herbert Simon. There have been a number of different approaches to model bounded rationality ranging from optimality principles to heuristics. Here we take an information-theoretic approach to bounded rationality, where information-processing costs are measured by the relative entropy between a posterior decision strategy and a given fixed prior strategy. In the case of multiple environments, it can be shown that there is an optimal prior rendering the bounded rationality problem equivalent to the rate distortion problem for lossy compression in information theory. Accordingly, the optimal prior and posterior strategies can be computed by the well-known Blahut-Arimoto algorithm which requires the computation of partition sums over all possible outcomes and cannot be applied straightforwardly to continuous problems. Here we derive a sampling-based alternative update rule for the adaptation of prior behaviors of decision-makers and we show convergence to the optimal prior predicted by rate distortion theory. Importantly, the update rule avoids typical infeasible operations such as the computation of partition sums. We show in simulations a proof of concept for discrete action and environment domains. This approach is not only interesting as a generic computational method, but might also provide a more realistic model of human decision-making processes occurring on a fast and a slow time scale.
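
For reference, the Blahut-Arimoto iteration mentioned above can be sketched for small discrete problems as follows. This is the standard alternating scheme, not the sampling-based update rule derived in the paper; the function name, the utility-matrix layout and the parameter `beta` are assumptions for the example. The normalizer `z` is the partition sum that the paper's rule is designed to avoid.

```python
import math

def blahut_arimoto(p_w, utility, beta, iterations=200):
    """Alternate between posterior and prior updates.

    p_w[w]        : probability of environment w
    utility[w][a] : utility of action a in environment w
    beta          : trade-off between utility and information cost
    Returns the marginal prior p(a) and the posteriors p(a|w).
    """
    n_w, n_a = len(utility), len(utility[0])
    prior = [1.0 / n_a] * n_a
    posterior = [prior[:] for _ in range(n_w)]
    for _ in range(iterations):
        # Posterior step: p(a|w) proportional to p(a) * exp(beta * U(w, a)).
        posterior = []
        for w in range(n_w):
            u_max = max(utility[w])
            weights = [prior[a] * math.exp(beta * (utility[w][a] - u_max))
                       for a in range(n_a)]
            z = sum(weights)  # partition sum over all actions
            posterior.append([x / z for x in weights])
        # Prior step: p(a) is the posterior averaged over environments.
        prior = [sum(p_w[w] * posterior[w][a] for w in range(n_w))
                 for a in range(n_a)]
    return prior, posterior

# Two equally likely environments, each rewarding a different action.
prior, posterior = blahut_arimoto([0.5, 0.5], [[1.0, 0.0], [0.0, 1.0]], beta=2.0)
```

Because the example is symmetric, the prior stays uniform while each posterior leans toward the action that pays off in its own environment.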

preprint, 2015, arXiv

Information-Theoretic Bounded Rationality

Bounded rationality, that is, decision-making and planning under resource limitations, is widely regarded as an important open problem in artificial intelligence, reinforcement learning, computational neuroscience and economics. This paper offers a consolidated presentation of a theory of bounded rationality based on information-theoretic ideas. We provide a conceptual justification for using the free energy functional as the objective function for characterizing bounded-rational decisions. This functional possesses three crucial properties: it controls the size of the solution space; it has Monte Carlo planners that are exact, yet bypass the need for exhaustive search; and it captures model uncertainty arising from lack of evidence or from interacting with other agents having unknown intentions. We discuss the single-step decision-making case, and show how to extend it to sequential decisions using equivalence transformations. This extension yields a very general class of decision problems that encompass classical decision rules (e.g. EXPECTIMAX and MINIMAX) as limit cases, as well as trust- and risk-sensitive planning.

preprint, 2013, arXiv

Abstraction in decision-makers with limited information processing capabilities

A distinctive property of human and animal intelligence is the ability to form abstractions by neglecting irrelevant information, which allows structure to be separated from noise. From an information-theoretic point of view, abstractions are desirable because they allow for very efficient information processing. In artificial systems, abstractions are often implemented through computationally costly formations of groups or clusters. In this work, we establish the relation between the free-energy framework for decision-making and rate-distortion theory, and demonstrate how the application of rate-distortion theory to decision-making leads to the emergence of abstractions. We argue that abstractions are induced by a limit in information-processing capacity.

preprint, 2013, arXiv

Bounded Rational Decision-Making in Changing Environments

A perfectly rational decision-maker chooses the action with the highest utility gain from a set of possible actions. The optimality principles that describe such decision processes do not take into account the computational costs of finding the optimal action. Bounded rational decision-making addresses this problem by explicitly trading off information-processing costs and expected utility. Interestingly, a similar trade-off between energy and entropy arises when describing changes in thermodynamic systems. This similarity has recently been used to describe bounded rational agents. Crucially, this framework assumes that the environment does not change while the decision-maker is computing the optimal policy. When this requirement is not fulfilled, the decision-maker will suffer inefficiencies in utility that arise because the current policy is optimal for an environment in the past. Here we borrow concepts from non-equilibrium thermodynamics to quantify these inefficiencies and illustrate with simulations their relationship with computational resources.

preprint, 2013, arXiv

Generalized Thompson Sampling for Sequential Decision-Making and Causal Inference

Recently, it has been shown how sampling actions from the predictive distribution over the optimal action (sometimes called Thompson sampling) can be applied to solve sequential adaptive control problems when the optimal policy is known for each possible environment. The predictive distribution can then be constructed by a Bayesian superposition of the optimal policies weighted by their posterior probability, which is updated by Bayesian inference and causal calculus. Here we discuss three important features of this approach. First, we discuss to what extent such Thompson sampling can be regarded as a natural consequence of the Bayesian modeling of policy uncertainty. Second, we show how Thompson sampling can be used to study interactions between multiple adaptive agents, thus opening up an avenue of game-theoretic analysis. Third, we show how Thompson sampling can be applied to infer causal relationships when interacting with an environment in a sequential fashion. In summary, our results suggest that Thompson sampling might not merely be a useful heuristic, but a principled method to address problems of adaptive sequential decision-making and causal inference.
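
The basic idea of acting on a sample from the posterior can be sketched with the textbook Bernoulli bandit. This is the familiar Beta-Bernoulli special case, not the generalized scheme with causal interventions discussed in the paper; the function name, the uniform Beta(1, 1) priors and the fixed seed are assumptions for the example.

```python
import random

def thompson_bernoulli(true_means, rounds, seed=0):
    """Run Beta-Bernoulli Thompson sampling and return pulls per arm."""
    rng = random.Random(seed)
    n_arms = len(true_means)
    successes = [0] * n_arms
    failures = [0] * n_arms
    pulls = [0] * n_arms
    for _ in range(rounds):
        # Draw one plausible mean per arm from its Beta posterior,
        # starting from a uniform Beta(1, 1) prior.
        samples = [rng.betavariate(1 + successes[i], 1 + failures[i])
                   for i in range(n_arms)]
        # Act as if the sampled environment were the true one.
        arm = max(range(n_arms), key=lambda i: samples[i])
        reward = 1 if rng.random() < true_means[arm] else 0
        successes[arm] += reward
        failures[arm] += 1 - reward
        pulls[arm] += 1
    return pulls

pulls = thompson_bernoulli([0.2, 0.5, 0.8], rounds=2000)
```

Each arm is chosen with the posterior probability that it is the best one, so exploration fades as the posteriors sharpen and most pulls end up on the arm with the highest mean.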

preprint, 2012, arXiv

A Nonparametric Conjugate Prior Distribution for the Maximizing Argument of a Noisy Function

We propose a novel Bayesian approach to solve stochastic optimization problems that involve finding extrema of noisy, nonlinear functions. Previous work has focused on representing possible functions explicitly, which leads to a two-step procedure of first, doing inference over the function space and second, finding the extrema of these functions. Here we skip the representation step and directly model the distribution over extrema. To this end, we devise a non-parametric conjugate prior based on a kernel regressor. The resulting posterior distribution directly captures the uncertainty over the maximum of the unknown function. We illustrate the effectiveness of our model by optimizing a noisy, high-dimensional, non-convex objective function.

preprint, 2012, arXiv

Free Energy and the Generalized Optimality Equations for Sequential Decision Making

The free energy functional has recently been proposed as a variational principle for bounded rational decision-making, since it instantiates a natural trade-off between utility gains and information processing costs that can be axiomatically derived. Here we apply the free energy principle to general decision trees that include both adversarial and stochastic environments. We derive generalized sequential optimality equations that not only include the Bellman optimality equations as a limit case, but also lead to well-known decision rules such as Expectimax, Minimax and Expectiminimax. We show how these decision rules can be derived from a single free energy principle that assigns a resource parameter to each node in the decision tree. These resource parameters express a concrete computational cost that can be measured as the number of samples that are needed from the distribution that belongs to each node. The free energy principle therefore provides the normative basis for generalized optimality equations that account for both adversarial and stochastic environments.
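
How a single resource parameter can interpolate between these decision rules can be sketched with a one-node operator. It assumes the node value takes the log-sum-exp (certainty-equivalent) form commonly used in this line of work; the function name and the symbol `beta` for the resource parameter are assumptions for the example, and the abstract itself does not state the formula.

```python
import math

def node_value(values, probs, beta):
    """Certainty-equivalent (1/beta) * log(sum_i probs[i] * exp(beta * values[i])).

    beta -> +infinity : approaches max(values)  (maximizing node)
    beta ->  0        : approaches the expectation (chance node)
    beta -> -infinity : approaches min(values)  (adversarial node)
    """
    if beta == 0:
        return sum(p * v for p, v in zip(probs, values))
    # Shift by the dominant value so that every exponent is non-positive.
    pivot = max(values) if beta > 0 else min(values)
    s = sum(p * math.exp(beta * (v - pivot)) for p, v in zip(probs, values))
    return pivot + math.log(s) / beta

values = [1.0, 4.0, 2.0]
probs = [0.2, 0.3, 0.5]
as_max = node_value(values, probs, beta=1000.0)
as_mean = node_value(values, probs, beta=0.0)
as_min = node_value(values, probs, beta=-1000.0)
```

Applying this operator recursively with a separate `beta` at each node of a decision tree would mix maximizing, averaging and minimizing nodes, which is the structure of Expectiminimax.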

preprint, 2012, arXiv

Thermodynamics as a theory of decision-making with information processing costs

Perfectly rational decision-makers maximize expected utility, but crucially ignore the resource costs incurred when determining optimal actions. Here we propose an information-theoretic formalization of bounded rational decision-making where decision-makers trade off expected utility and information processing costs. Such bounded rational decision-makers can be thought of as thermodynamic machines that undergo physical state changes when they compute. Their behavior is governed by a free energy functional that trades off changes in internal energy (as a proxy for utility) and entropic changes representing computational costs induced by changing states. As a result, the bounded rational decision-making problem can be rephrased in terms of well-known concepts from statistical physics. In the limit when computational costs are ignored, the maximum expected utility principle is recovered. We discuss the relation to satisficing decision-making procedures as well as links to existing theoretical frameworks and human decision-making experiments that describe deviations from expected utility theory. Since most of the mathematical machinery can be borrowed from statistical physics, the main contribution is to axiomatically derive and interpret the thermodynamic free energy as a model of bounded rational decision-making.
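
The trade-off described here can be made concrete with a small sketch. It assumes the free energy takes the form of expected utility minus (1/beta) times the relative entropy to a prior policy, whose maximizer is a Boltzmann-like distribution; the function names and the parameter `beta` are assumptions for the example rather than notation taken from the abstract.

```python
import math

def bounded_rational_policy(utilities, prior, beta):
    """Policy p(a) proportional to prior(a) * exp(beta * U(a))."""
    # Subtract the largest utility before exponentiating for stability.
    u_max = max(utilities)
    weights = [p0 * math.exp(beta * (u - u_max))
               for p0, u in zip(prior, utilities)]
    z = sum(weights)
    return [w / z for w in weights]

def free_energy(utilities, prior, policy, beta):
    """Expected utility minus (1/beta) * KL(policy || prior)."""
    expected_utility = sum(p * u for p, u in zip(policy, utilities))
    kl = sum(p * math.log(p / p0) for p, p0 in zip(policy, prior) if p > 0)
    return expected_utility - kl / beta

utilities = [1.0, 2.0, 3.0]
prior = [1 / 3, 1 / 3, 1 / 3]
cheap = bounded_rational_policy(utilities, prior, beta=0.01)     # stays near the prior
balanced = bounded_rational_policy(utilities, prior, beta=1.0)
rational = bounded_rational_policy(utilities, prior, beta=50.0)  # nearly deterministic
```

As `beta` grows, computation becomes cheap relative to utility and the policy concentrates on the best action, which corresponds to the maximum expected utility limit mentioned above.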

preprint, 2011, arXiv

Information, Utility & Bounded Rationality

Perfectly rational decision-makers maximize expected utility, but crucially ignore the resource costs incurred when determining optimal actions. Here we employ an axiomatic framework for bounded rational decision-making based on a thermodynamic interpretation of resource costs as information costs. This leads to a variational "free utility" principle akin to thermodynamical free energy that trades off utility and information costs. We show that bounded optimal control solutions can be derived from this variational principle, which leads in general to stochastic policies. Furthermore, we show that risk-sensitive and robust (minimax) control schemes fall out naturally from this framework if the environment is considered as a bounded rational and perfectly rational opponent, respectively. When resource costs are ignored, the maximum expected utility principle is recovered.

preprint, 2010, arXiv

A Minimum Relative Entropy Controller for Undiscounted Markov Decision Processes

Adaptive control problems are notoriously difficult to solve even in the presence of plant-specific controllers. One way to bypass the intractable computation of the optimal policy is to restate the adaptive control as the minimization of the relative entropy of a controller that ignores the true plant dynamics from an informed controller. The solution is given by the Bayesian control rule, a set of equations characterizing a stochastic adaptive controller for the class of possible plant dynamics. Here, the Bayesian control rule is applied to derive BCR-MDP, a controller to solve undiscounted Markov decision processes with finite state and action spaces and unknown dynamics. In particular, we derive a non-parametric conjugate prior distribution over the policy space that encapsulates the agent's whole relevant history and we present a Gibbs sampler to draw random policies from this distribution. Preliminary results show that BCR-MDP successfully avoids sub-optimal limit cycles due to its built-in mechanism to balance exploration versus exploitation.

preprint, 2010, arXiv

A Minimum Relative Entropy Principle for Learning and Acting

This paper proposes a method to construct an adaptive agent that is universal with respect to a given class of experts, where each expert is an agent that has been designed specifically for a particular environment. This adaptive control problem is formalized as the problem of minimizing the relative entropy of the adaptive agent from the expert that is most suitable for the unknown environment. If the agent is a passive observer, then the optimal solution is the well-known Bayesian predictor. However, if the agent is active, then its past actions need to be treated as causal interventions on the I/O stream rather than normal probability conditions. Here it is shown that the solution to this new variational problem is given by a stochastic controller called the Bayesian control rule, which implements adaptive behavior as a mixture of experts. Furthermore, it is shown that under mild assumptions, the Bayesian control rule converges to the control law of the most suitable expert.

preprint, 2010, arXiv

An axiomatic formalization of bounded rationality based on a utility-information equivalence

Classic decision-theory is based on the maximum expected utility (MEU) principle, but crucially ignores the resource costs incurred when determining optimal decisions. Here we propose an axiomatic framework for bounded decision-making that considers resource costs. Agents are formalized as probability measures over input-output streams. We postulate that any such probability measure can be assigned a corresponding conjugate utility function based on three axioms: utilities should be real-valued, additive and monotonic mappings of probabilities. We show that these axioms enforce a unique conversion law between utility and probability (and thereby, information). Moreover, we show that this relation can be characterized as a variational principle: given a utility function, its conjugate probability measure maximizes a free utility functional. Transformations of probability measures can then be formalized as a change in free utility due to the addition of new constraints expressed by a target utility function. Accordingly, one obtains a criterion to choose a probability measure that trades off the maximization of a target utility function and the cost of the deviation from a reference distribution. We show that optimal control, adaptive estimation and adaptive control problems can be solved this way in a resource-efficient way. When resource costs are ignored, the MEU principle is recovered. Our formalization might thus provide a principled approach to bounded rationality that establishes a close link to information theory.

preprint, 2010, arXiv

Convergence of Bayesian Control Rule

Recently, new approaches to adaptive control have sought to reformulate the problem as a minimization of a relative entropy criterion to obtain tractable solutions. In particular, it has been shown that minimizing the expected deviation from the causal input-output dependencies of the true plant leads to a new promising stochastic control rule called the Bayesian control rule. This work proves the convergence of the Bayesian control rule under two sufficient assumptions: boundedness, which is an ergodicity condition; and consistency, which is an instantiation of the sure-thing principle.
