Researcher profile

David McAllester

David McAllester contributes to research discovery and scholarly infrastructure.

Researcher · Affiliation not imported · Open to collaborate

Trust snapshot

Quick read

Trust 19 (Unverified) · Verification L1 · Unclaimed author
5 works
0 followers
6 topics
4 close collaborators

Published work

5 published items

Preprint · 2020 · arXiv

Domain-independent Dominance of Adaptive Methods

From a simplified analysis of adaptive methods, we derive AvaGrad, a new optimizer which outperforms SGD on vision tasks when its adaptability is properly tuned. We observe that the power of our method is partially explained by a decoupling of learning rate and adaptability, greatly simplifying hyperparameter search. In light of this observation, we demonstrate that, against conventional wisdom, Adam can also outperform SGD on vision tasks, as long as the coupling between its learning rate and adaptability is taken into account. In practice, AvaGrad matches the best results, as measured by generalization accuracy, delivered by any existing optimizer (SGD or adaptive) across image classification (CIFAR, ImageNet) and character-level language modelling (Penn Treebank) tasks.
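
One way to read the coupling between learning rate and adaptability is through the standard Adam update, writing the step size as \alpha and the adaptability constant as \epsilon. The worked limit below is a sketch under that assumption; it is not taken from the paper and does not reproduce AvaGrad itself.

```latex
% Standard Adam step (bias-corrected moments \hat{m}_t, \hat{v}_t):
\[
  w_{t+1} = w_t - \alpha \, \frac{\hat{m}_t}{\sqrt{\hat{v}_t} + \epsilon}
\]
% Two limits of the adaptability constant \epsilon:
\[
  \epsilon \ll \sqrt{\hat{v}_t}:\quad
  w_{t+1} \approx w_t - \alpha \, \frac{\hat{m}_t}{\sqrt{\hat{v}_t}}
  \qquad\text{(fully adaptive step)}
\]
\[
  \epsilon \gg \sqrt{\hat{v}_t}:\quad
  w_{t+1} \approx w_t - \frac{\alpha}{\epsilon} \, \hat{m}_t
  \qquad\text{(momentum SGD with step size } \alpha/\epsilon \text{)}
\]
```

In the second limit the effective step size is \alpha/\epsilon, so changing \epsilon alone also rescales the step. This suggests why the two hyperparameters would need to be tuned jointly unless the update is renormalized.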

Preprint · 2020 · arXiv

Formal Limitations on the Measurement of Mutual Information

Measuring mutual information from finite data is difficult. Recent work has considered variational methods maximizing a lower bound. In this paper, we prove that serious statistical limitations are inherent to any method of measuring mutual information. More specifically, we show that any distribution-free high-confidence lower bound on mutual information estimated from N samples cannot be larger than O(ln N).
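
The ln N ceiling can be seen concretely in one widely used variational lower bound, InfoNCE. The sketch below is an illustration for that single estimator, not the paper's general theorem. The function names, the critic, and the toy data (Y is an exact copy of a 64-bit uniform X, so the true mutual information is about 44 nats) are assumptions chosen for the example.

```python
import math
import random


def infonce_lower_bound(xs, ys, critic):
    """InfoNCE estimate from N paired samples (xs[i], ys[i]).

    For each i, the positive score critic(xs[i], ys[i]) is compared with
    the log of the mean exponentiated score against every ys[j].
    """
    n = len(xs)
    total = 0.0
    for i in range(n):
        scores = [critic(xs[i], ys[j]) for j in range(n)]
        top = max(scores)  # subtract the max for a stable log-mean-exp
        log_mean_exp = top + math.log(sum(math.exp(s - top) for s in scores) / n)
        total += scores[i] - log_mean_exp
    return total / n


def sharp_critic(x, y):
    """Hypothetical near-ideal critic: large score only for matching pairs."""
    return 50.0 if x == y else 0.0


if __name__ == "__main__":
    random.seed(0)
    for n in (8, 64, 256):
        # 64-bit draws are distinct with overwhelming probability.
        xs = [random.getrandbits(64) for _ in range(n)]
        ys = list(xs)  # Y = X, so the true mutual information is 64 bits
        estimate = infonce_lower_bound(xs, ys, sharp_critic)
        print(f"N={n:4d}  InfoNCE={estimate:.4f}  ln N={math.log(n):.4f}")
```

Because the positive term always appears inside the sum in the denominator, each summand is at most ln N. Even a near-perfect critic therefore reports roughly ln N nats here, far below the true value.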

Preprint · 2020 · arXiv

Isomorphism Revisited

Isomorphism is central to the structure of mathematics and has been formalized in various ways within dependent type theory. All previous treatments have done this by replacing quantification over sets with quantification over groupoids of some form, that is, categories in which every morphism is an isomorphism. Quantification over sets is replaced by quantification over standard groupoids in the groupoid model, by quantification over infinity groupoids in homotopy type theory, and by quantification over morphoids in the morphoid model. Here we give a treatment of isomorphism based on the intuitive notion of sets as collections without internal structure. Quantification over sets remains as quantification over sets. Isomorphism and groupoid structure then emerge from simple but subtle syntactic restrictions on set-theoretic language. This approach more fully unifies the classical ZFC foundations with a rigorous treatment of isomorphism, symmetry, canonicality, functors, and natural transformations. This is all done without reference to category theory.

Preprint · 2020 · arXiv

MathZero, The Classification Problem, and Set-Theoretic Type Theory

AlphaZero learns to play Go, chess and shogi at a superhuman level through self-play given only the rules of the game. This raises the question of whether a similar thing could be done for mathematics: a MathZero. MathZero would require a formal foundation and an objective. We propose the foundation of set-theoretic dependent type theory and an objective defined in terms of the classification problem, the problem of classifying concept instances up to isomorphism. The natural numbers arise as the solution to the classification problem for finite sets. Here we generalize classical Bourbaki set-theoretic isomorphism to set-theoretic dependent type theory. To our knowledge, we give the first isomorphism inference rules for set-theoretic dependent type theory with propositional set-theoretic equality. The presentation is intended to be accessible to mathematicians with no prior exposure to type theory.

Preprint · 2020 · arXiv

On-The-Fly Information Retrieval Augmentation for Language Models

Here we experiment with the use of information retrieval as an augmentation for pre-trained language models. The text corpus used in information retrieval can be viewed as a form of episodic memory which grows over time. By augmenting GPT 2.0 with information retrieval we achieve a zero-shot 15% relative reduction in perplexity on the Gigaword corpus without any re-training. We also validate our IR augmentation on an event co-reference task.
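
For readers unfamiliar with the metric, the sketch below shows how a relative reduction in perplexity is typically computed from per-token log-probabilities. The function names and the log-probability values are made up for illustration; they are not the paper's code or data.

```python
import math


def perplexity(token_log_probs):
    """Perplexity = exp of the mean negative log-likelihood per token (nats)."""
    return math.exp(-sum(token_log_probs) / len(token_log_probs))


def relative_reduction(baseline_ppl, augmented_ppl):
    """Fraction by which perplexity drops relative to the baseline."""
    return (baseline_ppl - augmented_ppl) / baseline_ppl


if __name__ == "__main__":
    # Hypothetical per-token log-probabilities for the same text, scored
    # without and with retrieved passages prepended to the context.
    baseline_lp = [-3.2, -2.9, -3.5, -3.0]
    augmented_lp = [-3.0, -2.8, -3.3, -2.9]

    base = perplexity(baseline_lp)
    aug = perplexity(augmented_lp)
    print(f"baseline={base:.2f}  augmented={aug:.2f}  "
          f"relative reduction={relative_reduction(base, aug):.1%}")
```

Because perplexity is an exponential of the mean log-likelihood, a fixed improvement of d nats per token gives a relative reduction of 1 - exp(-d), whatever the baseline.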