Source author record

Kyle Aitken

Kyle Aitken appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, unclaimed source record

hep-th, cond-mat.str-el, Machine Learning, Computation and Language

Catalog footprint

What is connected

4 works

4 topics

4 close collaborators


preprint, 2022, arXiv

The geometry of integration in text classification RNNs

Despite the widespread application of recurrent neural networks (RNNs) across a variety of tasks, a unified understanding of how RNNs solve these tasks remains elusive. In particular, it is unclear what dynamical patterns arise in trained RNNs, and how those patterns depend on the training dataset or task. This work addresses these questions in the context of a specific natural language processing task: text classification. Using tools from dynamical systems analysis, we study recurrent networks trained on a battery of both natural and synthetic text classification tasks. We find the dynamics of these trained RNNs to be both interpretable and low-dimensional. Specifically, across architectures and datasets, RNNs accumulate evidence for each class as they process the text, using a low-dimensional attractor manifold as the underlying mechanism. Moreover, the dimensionality and geometry of the attractor manifold are determined by the structure of the training dataset; in particular, we describe how simple word-count statistics computed on the training dataset can be used to predict these properties. Our observations span multiple architectures and datasets, reflecting a common mechanism RNNs employ to perform text classification. To the degree that integration of evidence towards a decision is a common computational primitive, this work lays the foundation for using dynamical systems techniques to study the inner workings of RNNs.

preprint, 2020, arXiv

On the asymptotics of wide networks with polynomial activations

We consider an existing conjecture addressing the asymptotic behavior of neural networks in the large width limit. The results that follow from this conjecture include tight bounds on the behavior of wide networks during stochastic gradient descent, and a derivation of their finite-width dynamics. We prove the conjecture for deep networks with polynomial activation functions, greatly extending the validity of these results. Finally, we point out a difference in the asymptotic behavior of networks with analytic (and non-linear) activation functions and those with piecewise-linear activations such as ReLU.

preprint, 2019, arXiv

Generalization of QCD$_3$ Symmetry-Breaking and Flavored Quiver Dualities

We extend the recently proposed symmetry breaking scenario of QCD$_3$ to the so-called "master" $(2+1)$d bosonization duality, which has bosonic and fermionic matter on both ends. Using anomaly arguments, a phase diagram emerges with several novel regions. We then construct $2+1$ dimensional dualities for flavored quivers using node-by-node dualization. Such dualities are applicable to theories which live on domain walls in QCD$_4$-like theories with dynamical quarks. We also derive dualities for quivers based on orthogonal and symplectic gauge groups. Lastly, we support the conjectured dualities using holographic constructions, even though several aspects of this holographic construction remain mostly qualitative.

preprint, 2019, arXiv

New and Old Fermionic Dualities from 3d Bosonization

We construct novel fermion-fermion dualities in $2+1$ dimensions using 3d bosonization dualities. This is achieved by relating two-node quiver theories using both the flavor-bounded and flavor-violated 3d bosonization dualities. Such quivers can be viewed as a generalization of the fermionic particle-vortex duality. A special case of these quivers exhibits a $\mathbb{Z}_2$ symmetry under interchange of the two nodes. Using orbifold techniques, we show that such dualities provide a novel way of deriving known 3d bosonization dualities with adjoint matter, thus unifying the non-Abelian bosonization dualities in an even larger duality web. We then use this construction to derive new dualities involving adjoint matter.
