Source author record

David H. Wolpert

David H. Wolpert appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Catalog footprint

What is connected

18works

24topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2024arXiv

Game Mining: How to Make Money from those about to Play a Game

It is known that a player in a noncooperative game can benefit by publicly restricting his possible moves before play begins. We show that, more generally, a player may benefit by publicly committing to pay an external party an amount that is contingent on the game's outcome. We explore what happens when external parties -- who we call ``game miners'' -- discover this fact and seek to profit from it by entering an outcome-contingent contract with the players. We analyze various structured bargaining games between miners and players for determining such an outcome-contingent contract. These bargaining games include playing the players against one another, as well as allowing the players to pay the miner(s) for exclusivity and first-mover advantage. We establish restrictions on the strategic settings in which a game miner can profit and bounds on the game miner's profit. We also find that game miners can lead to both efficient and inefficient equilibria.

preprint2022arXiv

Combining lower bounds on entropy production in complex systems with multiple interacting components

The past two decades have seen a revolution in statistical physics, generalizing it to apply to systems of arbitrary size, evolving while arbitrarily far from equilibrium. Many of these new results are based on analyzing the dynamics of the entropy of a system that is evolving according to a Markov process. These results comprise a sub-field called ``stochastic thermodynamics''. Some of the most powerful results in stochastic thermodynamics were traditionally concerned with single, monolithic systems, evolving by themselves, ignoring any internal structure of those systems. In this chapter I review how in complex systems, composed of many interacting constituent systems, it is possible to substantially strengthen many of these traditional results of stochastic thermodynamics. This is done by ``mixing and matching'' those traditional results, to each apply to only a subset of the interacting systems, thereby producing a more powerful result at the level of the aggregate, complex system.

preprint2022arXiv

The Implications of the No-Free-Lunch Theorems for Meta-induction

The important recent book by G. Schurz appreciates that the no-free-lunch theorems (NFL) have major implications for the problem of (meta) induction. Here I review the NFL theorems, emphasizing that they do not only concern the case where there is a uniform prior -- they prove that there are "as many priors" (loosely speaking) for which any induction algorithm $A$ out-generalizes some induction algorithm $B$ as vice-versa. Importantly though, in addition to the NFL theorems, there are many {free lunch} theorems. In particular, the NFL theorems can only be used to compare the {marginal} expected performance of an induction algorithm $A$ with the marginal expected performance of an induction algorithm $B$. There is a rich set of free lunches which instead concern the statistical correlations among the generalization errors of induction algorithms. As I describe, the meta-induction algorithms that Schurz advocate as a "solution to Hume's problem" are just an example of such a free lunch based on correlations among the generalization errors of induction algorithms. I end by pointing out that the prior that Schurz advocates, which is uniform over bit frequencies rather than bit patterns, is contradicted by thousands of experiments in statistical physics and by the great success of the maximum entropy procedure in inductive inference.

preprint2020arXiv

What is important about the No Free Lunch theorems?

The No Free Lunch theorems prove that under a uniform distribution over induction problems (search problems or learning problems), all induction algorithms perform equally. As I discuss in this chapter, the importance of the theorems arises by using them to analyze scenarios involving {non-uniform} distributions, and to compare different algorithms, without any assumption about the distribution over problems at all. In particular, the theorems prove that {anti}-cross-validation (choosing among a set of candidate algorithms based on which has {worst} out-of-sample behavior) performs as well as cross-validation, unless one makes an assumption -- which has never been formalized -- about how the distribution over induction problems, on the one hand, is related to the set of algorithms one is choosing among using (anti-)cross validation, on the other. In addition, they establish strong caveats concerning the significance of the many results in the literature which establish the strength of a particular algorithm without assuming a particular distribution. They also motivate a ``dictionary'' between supervised learning and improve blackbox optimization, which allows one to ``translate'' techniques from supervised learning into the domain of blackbox optimization, thereby strengthening blackbox optimization algorithms. In addition to these topics, I also briefly discuss their implications for philosophy of science.

preprint2016arXiv

Reducing the error of Monte Carlo Algorithms by Learning Control Variates

Monte Carlo (MC) sampling algorithms are an extremely widely-used technique to estimate expectations of functions f(x), especially in high dimensions. Control variates are a very powerful technique to reduce the error of such estimates, but in their conventional form rely on having an accurate approximation of f, a priori. Stacked Monte Carlo (StackMC) is a recently introduced technique designed to overcome this limitation by fitting a control variate to the data samples themselves. Done naively, forming a control variate to the data would result in overfitting, typically worsening the MC algorithm's performance. StackMC uses in-sample / out-sample techniques to remove this overfitting. Crucially, it is a post-processing technique, requiring no additional samples, and can be applied to data generated by any MC estimator. Our preliminary experiments demonstrated that StackMC improved the estimates of expectations when it was used to post-process samples produces by a "simple sampling" MC estimator. Here we substantially extend this earlier work. We provide an in-depth analysis of the StackMC algorithm, which we use to construct an improved version of the original algorithm, with lower estimation error. We then perform experiments of StackMC on several additional kinds of MC estimators, demonstrating improved performance when the samples are generated via importance sampling, Latin-hypercube sampling and quasi-Monte Carlo sampling. We also show how to extend StackMC to combine multiple fitting functions, and how to apply it to discrete input spaces x.

preprint2016arXiv

The free energy requirements of biological organisms; implications for evolution

Recent advances in nonequilibrium statistical physics have provided unprecedented insight into the thermodynamics of dynamic processes. The author recently used these advances to extend Landauer's semi-formal reasoning concerning the thermodynamics of bit erasure, to derive the minimal free energy required to implement an arbitrary computation. Here, I extend this analysis, deriving the minimal free energy required by an organism to run a given (stochastic) map $π$ from its sensor inputs to its actuator outputs. I use this result to calculate the input-output map $π$ of an organism that optimally trades off the free energy needed to run $π$ with the phenotypic fitness that results from implementing $π$. I end with a general discussion of the limits imposed on the rate of the terrestrial biosphere's information processing by the flux of sunlight on the Earth.

preprint2015arXiv

Extending Landauer's Bound from Bit Erasure to Arbitrary Computation

Recent analyses have calculated the minimal thermodynamic work required to perform a computation pi when two conditions hold: the output of pi is independent of its input (e.g., as in bit erasure); we use a physical computer C to implement pi that is specially tailored to the environment of C, i.e., to the precise distribution over C's inputs, P_0. First I extend these analyses to calculate the work required even if the output of pi depends on its input, and even if C is not used with the distribution P_0 it was tailored for. Next I show that if C will be re-used, then the minimal work to run it depends only on the logical computation pi, independent of the physical details of C. This establishes a formal identity between the thermodynamics of (re-usable) computers and theoretical computer science. I use this identity to prove that the minimal work required to compute a bit string sigma on a "general purpose computer" rather than a special purpose one, i.e., on a universal Turing machine U, is k_BT ln(2) times the sum of three terms: The Kolmogorov complexity of sigma, log of the Bernoulli measure of the set of strings that compute sigma, and log of the halting probability of U. I also prove that using C with a distribution over environments results in an unavoidable increase in the work required to run the computer, even if it is tailored to the distribution over environments. I end by using these results to relate the free energy flux incident on an organism / robot / biosphere to the maximal amount of computation that the organism / robot / biosphere can do per unit time.

preprint2015arXiv

Optimal high-level descriptions of dynamical systems

To analyze high-dimensional systems, many fields in science and engineering rely on high-level descriptions, sometimes called "macrostates," "coarse-grainings," or "effective theories". Examples of such descriptions include the thermodynamic properties of a large collection of point particles undergoing reversible dynamics, the variables in a macroeconomic model describing the individuals that participate in an economy, and the summary state of a cell composed of a large set of biochemical networks. Often these high-level descriptions are constructed without considering the ultimate reason for needing them in the first place. Here, we formalize and quantify one such purpose: the need to predict observables of interest concerning the high-dimensional system with as high accuracy as possible, while minimizing the computational cost of doing so. The resulting State Space Compression (SSC) framework provides a guide for how to solve for the {optimal} high-level description of a given dynamical system, rather than constructing it based on human intuition alone. In this preliminary report, we introduce SSC, and illustrate it with several information-theoretic quantifications of "accuracy", all with different implications for the optimal compression. We also discuss some other possible applications of SSC beyond the goal of accurate prediction. These include SSC as a measure of the complexity of a dynamical system, and as a way to quantify information flow between the scales of a system.

preprint2015arXiv

Value of information in noncooperative games

In some games, additional information hurts a player, e.g., in games with first-mover advantage, the second-mover is hurt by seeing the first-mover's move. What properties of a game determine whether it has such negative "value of information" for a particular player? Can a game have negative value of information for all players? To answer such questions, we generalize the definition of marginal utility of a good to define the marginal utility of a parameter vector specifying a game. So rather than analyze the global structure of the relationship between a game's parameter vector and player behavior, as in previous work, we focus on the local structure of that relationship. This allows us to prove that generically, every game can have negative marginal value of information, unless one imposes a priori constraints on allowed changes to the game's parameter vector. We demonstrate these and related results numerically, and discuss their implications.

preprint2014arXiv

Predicting the behavior of interacting humans by fusing data from multiple sources

Multi-fidelity methods combine inexpensive low-fidelity simulations with costly but highfidelity simulations to produce an accurate model of a system of interest at minimal cost. They have proven useful in modeling physical systems and have been applied to engineering problems such as wing-design optimization. During human-in-the-loop experimentation, it has become increasingly common to use online platforms, like Mechanical Turk, to run low-fidelity experiments to gather human performance data in an efficient manner. One concern with these experiments is that the results obtained from the online environment generalize poorly to the actual domain of interest. To address this limitation, we extend traditional multi-fidelity approaches to allow us to combine fewer data points from high-fidelity human-in-the-loop experiments with plentiful but less accurate data from low-fidelity experiments to produce accurate models of how humans interact. We present both model-based and model-free methods, and summarize the predictive performance of each method under dierent conditions.

preprint2013arXiv

Estimating Functions of Distributions Defined over Spaces of Unknown Size

We consider Bayesian estimation of information-theoretic quantities from data, using a Dirichlet prior. Acknowledging the uncertainty of the event space size $m$ and the Dirichlet prior's concentration parameter $c$, we treat both as random variables set by a hyperprior. We show that the associated hyperprior, $P(c, m)$, obeys a simple "Irrelevance of Unseen Variables" (IUV) desideratum iff $P(c, m) = P(c) P(m)$. Thus, requiring IUV greatly reduces the number of degrees of freedom of the hyperprior. Some information-theoretic quantities can be expressed multiple ways, in terms of different event spaces, e.g., mutual information. With all hyperpriors (implicitly) used in earlier work, different choices of this event space lead to different posterior expected values of these information-theoretic quantities. We show that there is no such dependence on the choice of event space for a hyperprior that obeys IUV. We also derive a result that allows us to exploit IUV to greatly simplify calculations, like the posterior expected mutual information or posterior expected multi-information. We also use computer experiments to favorably compare an IUV-based estimator of entropy to three alternative methods in common use. We end by discussing how seemingly innocuous changes to the formalization of an estimation problem can substantially affect the resultant estimates of posterior expectations.

preprint2012arXiv

Counter-Factual Reinforcement Learning: How to Model Decision-Makers That Anticipate The Future

This paper introduces a novel framework for modeling interacting humans in a multi-stage game. This "iterated semi network-form game" framework has the following desirable characteristics: (1) Bounded rational players, (2) strategic players (i.e., players account for one another's reward functions when predicting one another's behavior), and (3) computational tractability even on real-world systems. We achieve these benefits by combining concepts from game theory and reinforcement learning. To be precise, we extend the bounded rational "level-K reasoning" model to apply to games over multiple stages. Our extension allows the decomposition of the overall modeling problem into a series of smaller ones, each of which can be solved by standard reinforcement learning algorithms. We call this hybrid approach "level-K reinforcement learning". We investigate these ideas in a cyber battle scenario over a smart power grid and discuss the relationship between the behavior predicted by our model and what one might expect of real human defenders and attackers.

preprint2012arXiv

Predicting the behavior of interacting humans by fusing data from multiple sources

Multi-fidelity methods combine inexpensive low-fidelity simulations with costly but high-fidelity simulations to produce an accurate model of a system of interest at minimal cost. They have proven useful in modeling physical systems and have been applied to engineering problems such as wing-design optimization. During human-in-the-loop experimentation, it has become increasingly common to use online platforms, like Mechanical Turk, to run low-fidelity experiments to gather human performance data in an efficient manner. One concern with these experiments is that the results obtained from the online environment generalize poorly to the actual domain of interest. To address this limitation, we extend traditional multi-fidelity approaches to allow us to combine fewer data points from high-fidelity human-in-the-loop experiments with plentiful but less accurate data from low-fidelity experiments to produce accurate models of how humans interact. We present both model-based and model-free methods, and summarize the predictive performance of each method under different conditions.

preprint2011arXiv

Game theoretic modeling of pilot behavior during mid-air encounters

We show how to combine Bayes nets and game theory to predict the behavior of hybrid systems involving both humans and automated components. We call this novel framework "Semi Network-Form Games," and illustrate it by predicting aircraft pilot behavior in potential near mid-air collisions. At present, at the beginning of such potential collisions, a collision avoidance system in the aircraft cockpit advises the pilots what to do to avoid the collision. However studies of mid-air encounters have found wide variability in pilot responses to avoidance system advisories. In particular, pilots rarely perfectly execute the recommended maneuvers, despite the fact that the collision avoidance system's effectiveness relies on their doing so. Rather pilots decide their actions based on all information available to them (advisory, instrument readings, visual observations). We show how to build this aspect into a semi network-form game model of the encounter and then present computational simulations of the resultant model.

preprint2010arXiv

Hysteresis effects of changing parameters of noncooperative games

We adapt the method used by Jaynes to derive the equilibria of statistical physics to instead derive equilibria of bounded rational game theory. We analyze the dependence of these equilibria on the parameters of the underlying game, focusing on hysteresis effects. In particular, we show that by gradually imposing individual-specific tax rates on the players of the game, and then gradually removing those taxes, the players move from a poor equilibrium to one that is better for all of them.

preprint2010arXiv

What does Newcomb's paradox teach us?

In Newcomb's paradox you choose to receive either the contents of a particular closed box, or the contents of both that closed box and another one. Before you choose, a prediction algorithm deduces your choice, and fills the two boxes based on that deduction. Newcomb's paradox is that game theory appears to provide two conflicting recommendations for what choice you should make in this scenario. We analyze Newcomb's paradox using a recent extension of game theory in which the players set conditional probability distributions in a Bayes net. We show that the two game theory recommendations in Newcomb's scenario have different presumptions for what Bayes net relates your choice and the algorithm's prediction. We resolve the paradox by proving that these two Bayes nets are incompatible. We also show that the accuracy of the algorithm's prediction, the focus of much previous work, is irrelevant. In addition we show that Newcomb's scenario only provides a contradiction between game theory's expected utility and dominance principles if one is sloppy in specifying the underlying Bayes net. We also show that Newcomb's paradox is time-reversal invariant; both the paradox and its resolution are unchanged if the algorithm makes its `prediction' after you make your choice rather than before.

preprint2007arXiv

Parametric Learning and Monte Carlo Optimization

This paper uncovers and explores the close relationship between Monte Carlo Optimization of a parametrized integral (MCO), Parametric machine-Learning (PL), and `blackbox' or `oracle'-based optimization (BO). We make four contributions. First, we prove that MCO is mathematically identical to a broad class of PL problems. This identity potentially provides a new application domain for all broadly applicable PL techniques: MCO. Second, we introduce immediate sampling, a new version of the Probability Collectives (PC) algorithm for blackbox optimization. Immediate sampling transforms the original BO problem into an MCO problem. Accordingly, by combining these first two contributions, we can apply all PL techniques to BO. In our third contribution we validate this way of improving BO by demonstrating that cross-validation and bagging improve immediate sampling. Finally, conventional MC and MCO procedures ignore the relationship between the sample point locations and the associated values of the integrand; only the values of the integrand at those locations are considered. We demonstrate that one can exploit the sample location information using PL techniques, for example by forming a fit of the sample locations to the associated values of the integrand. This provides an additional way to apply PL techniques to improve MCO.

preprint2000arXiv

On the computational capabilities of physical systems part I: the impossibility of infallible computation

In this first of two papers, strong limits on the accuracy of physical computation are established. First it is proven that there cannot be a physical computer C to which one can pose any and all computational tasks concerning the physical universe. Next it is proven that no physical computer C can correctly carry out any computational task in the subset of such tasks that can be posed to C. As a particular example, this means that there cannot be a physical computer that can, for any physical system external to that computer, take the specification of that external system's state as input and then correctly predict its future state before that future state actually occurs. The results also mean that there cannot exist an infallible, general-purpose observation apparatus, and that there cannot be an infallible, general-purpose control apparatus. These results do not rely on systems that are infinite, and/or non-classical, and/or obey chaotic dynamics. They also hold even if one uses an infinitely fast, infinitely dense computer, with computational powers greater than that of a Turing Machine.

David H. Wolpert

What is connected

Connect this record

See the researcher in context

Building this map preview

18 published item(s)

Game Mining: How to Make Money from those about to Play a Game

Combining lower bounds on entropy production in complex systems with multiple interacting components

The Implications of the No-Free-Lunch Theorems for Meta-induction

What is important about the No Free Lunch theorems?

Reducing the error of Monte Carlo Algorithms by Learning Control Variates

The free energy requirements of biological organisms; implications for evolution

Extending Landauer's Bound from Bit Erasure to Arbitrary Computation

Optimal high-level descriptions of dynamical systems

Value of information in noncooperative games

Predicting the behavior of interacting humans by fusing data from multiple sources

Estimating Functions of Distributions Defined over Spaces of Unknown Size

Counter-Factual Reinforcement Learning: How to Model Decision-Makers That Anticipate The Future

Predicting the behavior of interacting humans by fusing data from multiple sources

Game theoretic modeling of pilot behavior during mid-air encounters

Hysteresis effects of changing parameters of noncooperative games

What does Newcomb's paradox teach us?

Parametric Learning and Monte Carlo Optimization

On the computational capabilities of physical systems part I: the impossibility of infallible computation