Source author record

Martin Schmid

Martin Schmid appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.


Artificial Intelligence, Computation and Language, hep-ph, Machine Learning, Computer Science and Game Theory, cond-mat.mes-hall, cond-mat.mtrl-sci, hep-lat, physics.optics

Catalog footprint

What is connected

10 works

9 topics

4 close collaborators


preprint, 2021, arXiv

Solving Common-Payoff Games with Approximate Policy Iteration

For artificially intelligent learning systems to have widespread applicability in real-world settings, it is important that they be able to operate decentrally. Unfortunately, decentralized control is difficult -- computing even an epsilon-optimal joint policy is a NEXP-complete problem. Nevertheless, a recently rediscovered insight -- that a team of agents can coordinate via common knowledge -- has given rise to algorithms capable of finding optimal joint policies in small common-payoff games. The Bayesian action decoder (BAD) leverages this insight and deep reinforcement learning to scale to games as large as two-player Hanabi. However, the approximations it uses to do so prevent it from discovering optimal joint policies even in games small enough to brute force optimal solutions. This work proposes CAPI, a novel algorithm which, like BAD, combines common knowledge with deep reinforcement learning. However, unlike BAD, CAPI prioritizes the propensity to discover optimal joint policies over scalability. While this choice precludes CAPI from scaling to games as large as Hanabi, empirical results demonstrate that, on the games to which CAPI does scale, it is capable of discovering optimal joint policies even when other modern multi-agent reinforcement learning algorithms are unable to do so. Code is available at https://github.com/ssokota/capi.

preprint, 2021, arXiv

Sound Algorithms in Imperfect Information Games

Search has played a fundamental role in computer game research since the very beginning. And while online search has been commonly used in perfect information games such as Chess and Go, online search methods for imperfect information games have only been introduced relatively recently. This paper addresses the question of what is a sound online algorithm in an imperfect information setting of two-player zero-sum games. We argue that the fixed-strategy definitions of exploitability and $ε$-Nash equilibria are ill-suited to measure an online algorithm's worst-case performance. We thus formalize $ε$-soundness, a concept that connects the worst-case performance of an online algorithm to the performance of an $ε$-Nash equilibrium. As $ε$-soundness can be difficult to compute in general, we introduce a consistency framework -- a hierarchy that connects an online algorithm's behavior to a Nash equilibrium. These multiple levels of consistency describe in what sense an online algorithm plays "just like a fixed Nash equilibrium". These notions further illustrate the difference between perfect and imperfect information settings, as the same consistency guarantees have different worst-case online performance in perfect and imperfect information games. The definitions of soundness and the consistency hierarchy finally provide appropriate tools to analyze online algorithms in repeated imperfect information games. We thus inspect some of the previous online algorithms in a new light, bringing new insights into their worst-case performance guarantees.

preprint, 2020, arXiv

The Advantage Regret-Matching Actor-Critic

Regret minimization has played a key role in online learning, equilibrium computation in games, and reinforcement learning (RL). In this paper, we describe a general model-free RL method for no-regret learning based on repeated reconsideration of past behavior. We propose a model-free RL algorithm, the Advantage Regret-Matching Actor-Critic (ARMAC): rather than saving past state-action data, ARMAC saves a buffer of past policies, replaying through them to reconstruct hindsight assessments of past behavior. These retrospective value estimates are used to predict conditional advantages which, combined with regret matching, produce a new policy. In particular, ARMAC learns from sampled trajectories in a centralized training setting, without requiring the application of importance sampling commonly used in Monte Carlo counterfactual regret (CFR) minimization; hence, it does not suffer from excessive variance in large environments. In the single-agent setting, ARMAC shows an interesting form of exploration by keeping past policies intact. In the multiagent setting, ARMAC in self-play approaches Nash equilibria on some partially-observable zero-sum benchmarks. We provide exploitability estimates in the significantly larger game of betting-abstracted no-limit Texas Hold'em.
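As a rough illustration of the regret-matching step the abstract mentions, the sketch below turns a list of per-action advantage estimates into a policy by normalizing their positive parts. It does not reproduce ARMAC itself, which predicts the advantages with learned networks and a buffer of past policies. The function name `regret_matching` and the uniform fallback when no advantage is positive are assumptions made for this example, not details taken from the abstract.

```python
def regret_matching(advantages):
    """Return a policy proportional to the positive part of each advantage."""
    positive = [max(a, 0.0) for a in advantages]
    total = sum(positive)
    if total > 0.0:
        return [p / total for p in positive]
    # Assumed convention: with no positive advantage, play uniformly at random.
    return [1.0 / len(advantages)] * len(advantages)


# Hypothetical advantage estimates for three actions.
policy = regret_matching([2.0, -1.0, 6.0])
print(policy)  # [0.25, 0.0, 0.75]
```

Actions with negative estimated advantage receive zero probability, and the remaining mass is split in proportion to how much each action would have improved on past behavior.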

preprint, 2016, arXiv

Color Change Effect in an Organic-Inorganic Hybrid Material Based on a Porphyrin Diacid

Porphyrinic materials show a range of interesting and useful optical and electrical properties. The less well-known sub-class of porphyrin diacids has been used in this work to construct an ionic hybrid organic-inorganic material in combination with a halogenidometalate anion. The resulting compound, $[H_6TPyP][BiCl_6]_2$ (1) (TPyP = tetra(4-pyridyl)porphyrin) has been obtained via a facile solution-based synthesis in single-crystalline form. The material exhibits a broad photoluminescence emission band between 650 and 850 nm at room temperature. Single crystals of $[H_6TPyP][BiCl_6]_2$ show a photocurrent in the fA range and a much higher dark current in the nA range. They also display an unexpected reversible color change upon wetting with different liquids. This phenomenon has been investigated with optical spectroscopy, SEM, XPS and NEXAFS techniques, showing that a surface-based structural coloration effect is the source of the color change. This stands in contrast to other materials where structural coloration typically has to be introduced through elaborate, multi-step processes or the use of natural templates. Additionally, it underscores the potential of self-assembly of porphyrinic hybrid compounds in the fabrication of materials with unusual optical properties.

preprint, 2016, arXiv

Direct Characterization of Band Bending in GaP/Si(001) Heterostructures with Hard X-ray Photoelectron Spectroscopy

We apply hard X-ray photoelectron spectroscopy (HAXPES) to investigate the electronic structures in ~50-nm thick epitaxial GaP layers grown on Si(001) under different conditions. Depth profiles of the local binding energies for the core levels are obtained by measuring the photoemission spectra at different incident photon energies between 3 and 7 keV and analyzing them with simple numerical models. The obtained depth profiles are in quantitative agreement with the band bending determinations for the same samples in a previous coherent phonon spectroscopic study. Our results demonstrate the applicability of the HAXPES with varying incident photon energy to characterize the electric potential profiles at buried semiconductor heterointerfaces.

preprint, 2016, arXiv

Printable Nanoscopic Metamaterial Absorbers and Images with Diffraction-Limited Resolution

The fabrication of functional metamaterials with extreme feature resolution finds a host of applications in the broad area of surface/light interaction. Non-planar features of such structures can significantly enhance their performance and tunability, but their facile generation remains a challenge. Here, we show that carefully designed out-of-plane nanopillars made of metal-dielectric composites, integrated in a metal-dielectric-nanocomposite configuration, can absorb broadband light very effectively. We further demonstrate that electrohydrodynamic printing in a rapid nanodripping mode is able to generate precise out-of-plane forests of such composite nanopillars with deposition resolutions at the diffraction limit on flat and non-flat substrates. The nanocomposite nature of the printed material allows the fine-tuning of the overall visible light absorption from complete absorption to complete reflection by simply tuning the pillar height. Almost perfect absorption (~95%) over the entire visible spectrum is achieved by a nanopillar forest covering only 6% of the printed area. Adjusting the height of individual pillar groups by design, we demonstrate on-demand control of the gray scale of a micrograph with a spatial resolution of 400 nm. These results constitute a significant step forward in ultra-high resolution facile fabrication of out-of-plane nanostructures, important to a broad palette of light design applications.

preprint, 2016, arXiv

Text Understanding with the Attention Sum Reader Network

Several large cloze-style context-question-answer datasets have been introduced recently: the CNN and Daily Mail news data and the Children's Book Test. Thanks to the size of these datasets, the associated text comprehension task is well suited for deep-learning techniques that currently seem to outperform all alternative approaches. We present a new, simple model that uses attention to directly pick the answer from the context, as opposed to computing the answer using a blended representation of words in the document as is usual in similar models. This makes the model particularly suitable for question-answering problems where the answer is a single word from the document. An ensemble of our models sets a new state of the art on all evaluated datasets.
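The idea of picking the answer directly from the context can be sketched as follows: normalize per-token attention scores with a softmax, then sum the probability mass over every position where the same word occurs. This is only a minimal sketch under stated assumptions. The function name `attention_sum`, the example tokens, and the hand-written scores are hypothetical; in a real model the scores would come from a learned comparison between the question and each context token.

```python
import math
from collections import defaultdict


def attention_sum(tokens, scores):
    """Softmax the per-token scores, then sum probability over repeated words."""
    peak = max(scores)  # subtract the maximum for numerical stability
    weights = [math.exp(s - peak) for s in scores]
    total = sum(weights)
    word_prob = defaultdict(float)
    for token, weight in zip(tokens, weights):
        word_prob[token] += weight / total
    return dict(word_prob)


# Hypothetical context and attention scores (assumed, not model output).
tokens = ["mary", "met", "john", "and", "mary", "left"]
scores = [1.0, 0.0, 1.5, 0.0, 1.0, 0.0]
probs = attention_sum(tokens, scores)
answer = max(probs, key=probs.get)
print(answer)  # mary
```

Here "john" has the highest single-position score, but "mary" wins because its two occurrences pool their attention, which is the effect of summing over positions rather than blending word representations.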

preprint, 2015, arXiv

Improved Deep Learning Baselines for Ubuntu Corpus Dialogs

This paper presents results of our experiments for the next utterance ranking on the Ubuntu Dialog Corpus -- the largest publicly available multi-turn dialog corpus. First, we use an in-house implementation of previously reported models to do an independent evaluation using the same data. Second, we evaluate the performances of various LSTMs, Bi-LSTMs and CNNs on the dataset. Third, we create an ensemble by averaging predictions of multiple models. The ensemble further improves the performance and it achieves a state-of-the-art result for the next utterance ranking on this dataset. Finally, we discuss our future plans using this corpus.

preprint, 2011, arXiv

Integrating out the heaviest quark in N--flavour ChPT

We extend a known method to integrate out the strange quark in three-flavour chiral perturbation theory to the context of an arbitrary number of flavours. As an application, we present the explicit formulae to one-loop accuracy for the heavy quark mass dependency of the low energy constants after decreasing the number of flavours by one while integrating out the heaviest quark in N-flavour chiral perturbation theory.

preprint, 2009, arXiv

Relations between SU(2)- and SU(3)-LECs in chiral perturbation theory

Chiral perturbation theory in the two-flavour sector allows one to analyse Green functions in QCD in the limit where the strange quark mass is considered to be large in comparison to the external momenta and to the light quark masses $m_u$ and $m_d$. In this framework, the low-energy constants of $SU(2)_R \times SU(2)_L$ depend on the value of the heavy quark masses. For the coupling constants which occur at order $p^2$ and $p^4$ in the chiral expansion, we worked out the dependence on the strange quark mass at two-loop accuracy, and provided analogous relations for some of the couplings $c_i$ which are relevant at order $p^6$. This talk comments on the methods used, and illustrates implications of the results obtained.
