Source author record

Andrey Gromov

Andrey Gromov appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, unclaimed source record

cond-mat.str-el, hep-th, cond-mat.mes-hall, math-ph, math.MP, cond-mat.stat-mech, Machine Learning, cond-mat.dis-nn, cond-mat.quant-gas, Artificial Intelligence, cond-mat.soft, nlin.PS, quant-ph

Catalog footprint

What is connected

24 works

13 topics

4 close collaborators

preprint, 2026, arXiv

Learning Rate Transfer in Normalized Transformers

The Normalized Transformer, or nGPT (arXiv:2410.01131), achieves impressive training speedups and does not require weight decay or learning rate warmup. However, despite having hyperparameters that explicitly scale with model size, we observe that nGPT does not exhibit learning rate transfer across model dimension and token horizon. To rectify this, we combine numerical experiments with a principled use of alignment exponents (arXiv:2407.05872) to revisit and modify the $μ$P approach to hyperparameter transfer (arXiv:2011.14522). The result is a novel nGPT parameterization we call $ν$GPT. Through extensive empirical validation, we find $ν$GPT exhibits learning rate transfer across width, depth, and token horizon.

preprint, 2026, arXiv

On the origin of neural scaling laws: from random graphs to natural language

Scaling laws have played a major role in the modern AI revolution, providing practitioners predictive power over how the model performance will improve with increasing data, compute, and number of model parameters. This has spurred an intense interest in the origin of neural scaling laws, with a common suggestion being that they arise from power law structure already present in the data. In this paper we study scaling laws for transformers trained to predict random walks (bigrams) on graphs with tunable complexity. We demonstrate that this simplified setting already gives rise to neural scaling laws even in the absence of power law structure in the data correlations. We further consider dialing down the complexity of natural language systematically, by training on sequences sampled from increasingly simplified generative language models, from 4-, 2-, and 1-layer transformer language models down to language bigrams, revealing a monotonic evolution of the scaling exponents. Our results also include scaling laws obtained from training on random walks on random graphs drawn from Erdős-Rényi and scale-free Barabási-Albert ensembles. Finally, we revisit conventional scaling laws for language modeling, demonstrating that several essential results can be reproduced using 2-layer transformers with a context length of 50, provide a critical analysis of various fits used in prior literature, demonstrate an alternative method for obtaining compute optimal curves as compared with current practice in published literature, and provide preliminary evidence that maximal update parameterization may be more parameter efficient than standard parameterization.
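To make the "random walks on graphs" setting concrete, here is a minimal sketch of how such training sequences could be sampled from an Erdős-Rényi graph. It is an illustration only, not the authors' data pipeline: the function names, the graph size, the edge probability, and the choice of uniform transitions between neighbours are all assumptions.

```python
import random

def erdos_renyi_graph(n, p, rng):
    """Return an undirected G(n, p) graph as an adjacency list (dict of sets)."""
    adj = {v: set() for v in range(n)}
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:  # each possible edge is kept independently
                adj[u].add(v)
                adj[v].add(u)
    return adj

def random_walk(adj, length, rng):
    """Sample a walk of `length` nodes, stepping to a uniformly chosen neighbour.

    Uniform transition probabilities are an assumption made for this sketch.
    """
    # Start only from nodes that have at least one neighbour.
    starts = sorted(v for v, nbrs in adj.items() if nbrs)
    if not starts:
        raise ValueError("graph has no edges")
    walk = [rng.choice(starts)]
    while len(walk) < length:
        # A visited node always has a neighbour, so the walk cannot get stuck.
        walk.append(rng.choice(sorted(adj[walk[-1]])))
    return walk

rng = random.Random(0)  # fixed seed for reproducibility
graph = erdos_renyi_graph(n=50, p=0.1, rng=rng)  # hypothetical sizes
walk = random_walk(graph, length=20, rng=rng)
```

Each consecutive pair of nodes in `walk` is a bigram in the sense used above, so a sequence model trained on many such walks is effectively learning the graph's transition structure.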

preprint, 2023, arXiv

Grokking modular arithmetic

We present a simple neural network that can learn modular arithmetic tasks and exhibits a sudden jump in generalization known as "grokking". Concretely, we present (i) fully-connected two-layer networks that exhibit grokking on various modular arithmetic tasks under vanilla gradient descent with the MSE loss function in the absence of any regularization; (ii) evidence that grokking modular arithmetic corresponds to learning specific feature maps whose structure is determined by the task; (iii) analytic expressions for the weights -- and thus for the feature maps -- that solve a large class of modular arithmetic tasks; and (iv) evidence that these feature maps are also found by vanilla gradient descent as well as AdamW, thereby establishing complete interpretability of the representations learnt by the network.
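For readers unfamiliar with the task, the following sketch builds a modular addition dataset of the kind such a network could be trained on. The encoding (two concatenated one-hot vectors as input, a one-hot target suitable for an MSE loss), the modulus, the split fraction, and the function name are assumptions for illustration and are not taken from the paper.

```python
import random

def modular_addition_dataset(p, train_fraction, seed=0):
    """Build all p*p examples of (a + b) mod p and split them into train/test.

    Inputs are two concatenated one-hot vectors of length p (an assumed
    encoding); targets are one-hot vectors of length p.
    """
    pairs = [(a, b) for a in range(p) for b in range(p)]
    random.Random(seed).shuffle(pairs)  # reproducible random split
    n_train = int(train_fraction * len(pairs))

    def encode(a, b):
        x = [0.0] * (2 * p)
        x[a] = 1.0          # one-hot for the first operand
        x[p + b] = 1.0      # one-hot for the second operand
        y = [0.0] * p
        y[(a + b) % p] = 1.0  # one-hot target for the sum modulo p
        return x, y

    data = [encode(a, b) for a, b in pairs]
    return data[:n_train], data[n_train:]

# Hypothetical small modulus; real experiments would likely use a larger one.
train_set, test_set = modular_addition_dataset(p=7, train_fraction=0.5)
```

Grokking would then show up as test accuracy on `test_set` rising sharply long after the network has fit `train_set`; the sketch only prepares the data and makes no claim about training outcomes.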

preprint, 2022, arXiv

AutoInit: Automatic Initialization via Jacobian Tuning

Good initialization is essential for training Deep Neural Networks (DNNs). Oftentimes such initialization is found through a trial and error approach, which has to be applied anew every time an architecture is substantially modified, or inherited from smaller size networks leading to sub-optimal initialization. In this work we introduce a new and cheap algorithm, that allows one to find a good initialization automatically, for general feed-forward DNNs. The algorithm utilizes the Jacobian between adjacent network blocks to tune the network hyperparameters to criticality. We solve the dynamics of the algorithm for fully connected networks with ReLU and derive conditions for its convergence. We then extend the discussion to more general architectures with BatchNorm and residual connections. Finally, we apply our method to ResMLP and VGG architectures, where the automatic one-shot initialization found by our method shows good performance on vision tasks.

preprint, 2022, arXiv

Fracton Matter

We review a burgeoning field of "fractons" -- a class of models where quasi-particles are strictly immobile or display restricted mobility that can be understood through generalized multipolar symmetries and associated conservation laws. Focusing on just a corner of this fast-growing subject, we will demonstrate how one class of such theories -- symmetric tensor and coupled-vector gauge theories -- surprisingly emerge from familiar elasticity of a two-dimensional quantum crystal. The disclination and dislocation crystal defects respectively map onto charges and dipoles of the fracton gauge theory. This fracton-elasticity duality leads to predictions of fractonic phases and quantum phase transitions to their descendants, that are duals of the commensurate crystal, supersolid, smectic, hexatic liquid crystals, as well as amorphous solids, quasi-crystals and elastic membranes. We show how these dual gauge theories provide a field theoretic description of quantum melting transitions through a generalized Higgs mechanism. We demonstrate how they can be equivalently constructed as gauged models with global multipole symmetries. We expect extensions of such gauge-elasticity dualities to generalized elasticity theories provide a route to discovery of new fractonic models and their potential experimental realizations.

preprint, 2022, arXiv

Very high-energy collective states of partons in fractional quantum Hall liquids

The low energy physics of fractional quantum Hall (FQH) states -- a paradigm of strongly correlated topological phases of matter -- to a large extent is captured by weakly interacting quasiparticles known as composite fermions (CFs). In this paper, based on numerical simulations and effective field theory, we argue that some high energy states in the FQH spectra necessitate a different description based on parton quasiparticles. We show that Jain states at filling factor $\nu = n/(2pn \pm 1)$ with integers $n, p \geq 2$, support two kinds of collective modes: in addition to the well-known Girvin-MacDonald-Platzman (GMP) mode, they host a high energy collective mode, which is interpreted as the GMP mode of partons. We elucidate observable signatures of the parton mode in the dynamics following a geometric quench. We construct a microscopic wave function for the parton mode, and demonstrate agreement between its variational energy and exact diagonalization. Using the parton construction, we derive a field theory of the Jain states and show that the previously proposed effective theories follow from our approach. Our results point to partons being "real" quasiparticles which, in a way reminiscent of quarks, only become observable at sufficiently high energies.

preprint, 2020, arXiv

A Duality Between U(1) Haah Code and 3D Smectic A Phase

We describe a duality between multipole gauge theories and spatially ordered phases. Our main example is a duality between the multipole gauge theory description of the U(1) Haah code and smectic A phase in three spatial dimensions. We show how multipole symmetries restrict the mobility of dislocations and disclinations in smectic A phase. Finally, we exhibit a 2D version of the duality.

preprint, 2020, arXiv

Fracton hydrodynamics

We introduce new classes of hydrodynamic theories inspired by the recently discovered fracton phases of quantum matter. Fracton phases are characterized by elementary excitations (fractons) with restricted mobility. The hydrodynamic theories we introduce describe thermalization in systems with fracton-like mobility constraints, including fluids where charge and dipole moment are both locally conserved, and fluids where charge is conserved along every line or plane of a lattice. Each of these fluids is subdiffusive, and constitutes a new universality class of hydrodynamic behavior. There are infinitely many such classes, each with distinct subdiffusive exponents, all of which are captured by our formalism. Our framework naturally explains recent results on dynamics with constrained quantum circuits, as well as recent experiments with ultracold atoms in tilted optical lattices. We identify crisp experimental signatures of these novel hydrodynamics, and explain how they may be realized in near term ultracold atom experiments.

preprint, 2020, arXiv

On duality between Cosserat elasticity and fractons

We present a dual formulation of the Cosserat theory of elasticity. In this theory a local element of an elastic body is described in terms of local displacement and local orientation. Upon the duality transformation these degrees of freedom map onto a coupled theory of a vector-valued one-form gauge field and an ordinary $U(1)$ gauge field. We discuss the degrees of freedom in the corresponding gauge theories, the defect matter and coupling to the curved space.

preprint, 2020, arXiv

Quench dynamics of collective modes in fractional quantum Hall bilayers

We introduce different types of quenches to probe the non-equilibrium dynamics and multiple collective modes of bilayer fractional quantum Hall states. We show that applying an electric field in one layer induces oscillations of a spin-1 degree of freedom, whose frequency matches the long-wavelength limit of the dipole mode. On the other hand, oscillations of the long-wavelength limit of the quadrupole mode, i.e., the spin-2 graviton, as well as the combination of two spin-1 states, can be activated by a sudden change of band mass anisotropy. We construct an effective field theory to describe the quench dynamics of these collective modes. In particular, we derive the dynamics for both the spin-2 and the spin-1 states and demonstrate their excellent agreement with numerics.

preprint, 2020, arXiv

Vortices and Fractons

We discuss a simple and experimentally available realization of fracton physics. We note that superfluid vortices form a Hamiltonian system that conserves total dipole moment and trace of the quadrupole moment of vorticity, thereby establishing a relation to a traceless scalar charge theory in two spatial dimensions. Next we consider the limit where the number of vortices is large and show that emergent vortex hydrodynamics also conserves these moments. Finally, we show that the motion of vortices and of fractons on curved surfaces agree, thereby opening a route to experimental study of the interplay between fracton physics and curved space. Our conclusions also apply to charged particles in a strong magnetic field.
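The conservation laws mentioned above can be checked numerically for the textbook point-vortex model in the unbounded plane. The sketch below is an illustration under that assumption, not the paper's own calculation: it computes the instantaneous rate of change of the dipole moment $\sum_i \Gamma_i \mathbf{r}_i$ and of the trace of the quadrupole moment $\sum_i \Gamma_i |\mathbf{r}_i|^2$. The function names, positions, and circulations are hypothetical.

```python
import math

def vortex_velocities(positions, circulations):
    """Velocity of each point vortex induced by all the others (planar model)."""
    velocities = []
    for i, (xi, yi) in enumerate(positions):
        vx = vy = 0.0
        for j, (xj, yj) in enumerate(positions):
            if i == j:
                continue  # a point vortex does not advect itself
            dx, dy = xi - xj, yi - yj
            r2 = dx * dx + dy * dy
            vx += -circulations[j] * dy / (2 * math.pi * r2)
            vy += circulations[j] * dx / (2 * math.pi * r2)
        velocities.append((vx, vy))
    return velocities

def moment_rates(positions, circulations):
    """Time derivatives of the dipole moment (x, y) and of the quadrupole trace."""
    vel = vortex_velocities(positions, circulations)
    dipole_x = sum(g * vx for g, (vx, _) in zip(circulations, vel))
    dipole_y = sum(g * vy for g, (_, vy) in zip(circulations, vel))
    trace_q = sum(
        2 * g * (x * vx + y * vy)
        for g, (x, y), (vx, vy) in zip(circulations, positions, vel)
    )
    return dipole_x, dipole_y, trace_q

# Arbitrary illustrative configuration with mixed-sign circulations.
positions = [(0.0, 0.0), (1.0, 0.5), (-0.7, 1.2), (0.3, -1.1)]
circulations = [1.0, -2.0, 0.5, 1.5]
rates = moment_rates(positions, circulations)
```

All three entries of `rates` vanish up to floating-point rounding, because the pairwise contributions cancel by antisymmetry under exchange of the two vortices.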

preprint, 2019, arXiv

Anisotropic odd viscosity via time-modulated drive

At equilibrium, the structure and response of ordered phases are typically determined by the spontaneous breaking of spatial symmetries. Out of equilibrium, spatial order itself can become a dynamically emergent concept. In this article, we show that spatially anisotropic viscous coefficients and stresses can be designed in a far-from-equilibrium fluid by applying to its constituents a time-modulated drive. If the drive induces a rotation whose rate is slowed down when the constituents point along specific directions, anisotropic structures and mechanical responses arise at long timescales. We demonstrate that the viscous response of such anisotropic driven fluids can acquire a tensorial, dissipationless component called anisotropic odd (or Hall) viscosity. Classical fluids with internal torques can display additional components of the odd viscosity neglected in previous studies of quantum Hall fluids that assumed angular momentum conservation. We show that these anisotropic and angular momentum-violating odd-viscosity coefficients can change even the bulk flow of an incompressible fluid by acting as a source of vorticity. In addition, shear distortions in the shape of an inclusion result in torques.

preprint, 2019, arXiv

Collective excitations at filling factor 5/2: The view from superspace

We present a microscopic theory of the neutral collective modes supported by the non-Abelian fractional quantum Hall states at filling factor 5/2. The theory is formulated in terms of the trial states describing the Girvin-MacDonald-Platzman (GMP) mode and its fermionic counterpart. These modes are superpartners of each other in a concrete sense, which we elucidate.

preprint, 2019, arXiv

Effective response theory for Floquet topological systems

We present an effective field theory approach to the topological response of Floquet systems with symmetry group $G$. This is achieved by introducing a background $G$ gauge field in the Schwinger-Keldysh formalism, which is suitable for far from equilibrium systems. We carry out this program for chiral topological Floquet systems (anomalous Floquet-Anderson insulators) in two spatial dimensions, and the group cohomology models of topological Floquet unitaries. These response actions serve as many-body topological invariants for topological Floquet unitaries. The effective action approach also leads us to propose novel topological response functions.

preprint, 2016, arXiv

Geometric Defects in Quantum Hall States

We describe a geometric (or gravitational) analogue of the Laughlin quasiholes in the fractional quantum Hall states. Analogously to the quasiholes, these defects can be constructed by inserting an appropriate vertex operator into the conformal block representation of a trial wavefunction. Unlike the quasiholes, however, these defects are extrinsic and do not correspond to true excitations of the quantum fluid. We construct a wavefunction in the presence of such defects and explain how to assign an electric charge and a spin to each defect, and calculate the adiabatic, non-abelian statistics of the defects. The defects turn out to be equivalent to the genons in that their adiabatic exchange statistics can be described in terms of representations of the mapping class group of an appropriate higher genus Riemann surface. We present a general construction that, in principle, allows one to calculate the statistics of $\mathbb Z_n$ genons for any "parent" topological phase. We illustrate the construction on the example of the Laughlin state and perform an explicit calculation of the braiding matrices. In addition to non-abelian statistics, geometric defects possess a universal abelian overall phase, determined by the gravitational anomaly.

preprint, 2016, arXiv

Universal dynamics of a soliton after an interaction quench

We propose a new type of experimentally feasible quantum quench protocol in which a quantum system is prepared in a coherent, localized excited state of a Hamiltonian. During the evolution of this solitonic excitation, the microscopic interaction is suddenly changed. We study the dynamics of solitons after this interaction quench for a wide class of systems using a hydrodynamic approach. We find that the post-quench dynamics is universal at short times, i.e. it does not depend on the microscopic details of the physical system. Numerical support for these results is presented using a generalized nonlinear Schrödinger equation, relevant for the implementation of the proposed protocol with ultracold bosons, as well as for the integrable Calogero model in a harmonic potential. Finally, it is shown that the effects of integrability breaking by a parabolic potential and by a power-law non-linearity do not change the universality of the short-time dynamics.

preprint, 2015, arXiv

Boundary effective action for quantum Hall states

We consider quantum Hall states on a space with boundary, focusing on the aspects of the edge physics which are completely determined by the symmetries of the problem. There are four distinct terms of Chern-Simons type that appear in the low-energy effective action of the state. Two of these protect gapless edge modes. They describe Hall conductance and, with some provisions, thermal Hall conductance. The remaining two, including the Wen-Zee term, which contributes to the Hall viscosity, do not protect gapless edge modes but are instead related to local boundary response fixed by symmetries. We highlight some basic features of this response. It follows that the coefficient of the Wen-Zee term can change across an interface without closing a gap or breaking a symmetry.

preprint, 2015, arXiv

Framing Anomaly in the Effective Theory of Fractional Quantum Hall Effect

We consider the geometric part of the effective action for Fractional Quantum Hall Effect (FQHE). It is shown that accounting for the framing anomaly of the quantum Chern-Simons theory is essential to obtain the correct gravitational linear response functions. In the lowest order in gradients the linear response generating functional includes Chern-Simons, Wen-Zee and gravitational Chern-Simons terms. The latter term has a contribution from the framing anomaly which fixes the value of thermal Hall conductivity and contributes to the Hall viscosity of the FQH states on a sphere. We also discuss the effects of the framing anomaly on linear responses for non-Abelian FQH states.

preprint, 2015, arXiv

Supersymmetric waves in Bose-Fermi mixtures

Interacting Bose-Fermi mixtures possess a fermionic (super)symmetry when bosons and fermions in the mixture have equal masses, and when the interaction strengths are appropriately tuned. This symmetry is spontaneously broken in the ground state of the mixture, leading to a novel Goldstone mode with fermionic statistics and quadratic dispersion. Here we examine the effect of explicit symmetry-breaking perturbations on the Goldstone mode. When the symmetry is not exact and the system is allowed to deviate from the symmetric point, we find that the Goldstone mode acquires an energy gap. We show that the excitations manifest themselves as a non-analyticity of the thermodynamic pressure.

preprint, 2014, arXiv

Density-curvature response and gravitational anomaly

We study constraints imposed by Galilean invariance on linear electromagnetic and elastic responses of two-dimensional gapped systems in a background magnetic field. Exact relations between response functions following from the Ward identities are derived. In addition to viscosity-conductivity relations known in the literature, we find new relations between the density-curvature response and the chiral central charge.

preprint, 2014, arXiv

Electromagnetic and gravitational responses of two-dimensional non-interacting electrons in background magnetic field

We compute electromagnetic, gravitational and mixed linear response functions of two-dimensional free fermions in an external quantizing magnetic field at an integer filling factor. The results are presented in the form of the effective action and as an expansion of currents and stresses in wave-vectors and frequencies of the probing electromagnetic and metric fields. We identify the terms in linear response functions coming from geometric Chern-Simons, Wen-Zee, and gravitational Chern-Simons terms in the effective action. We derive the expressions for Hall conductivity, Hall viscosity and find the current and charge density responses to the spatial curvature as well as stresses caused by inhomogeneous electromagnetic fields.

preprint, 2014, arXiv

Entanglement Entropy in 2D Non-abelian Pure Gauge Theory

We compute the Entanglement Entropy (EE) of a bipartition in 2D pure non-abelian $U(N)$ gauge theory. We obtain a general expression for EE on an arbitrary Riemann surface. We find that due to area-preserving diffeomorphism symmetry EE does not depend on the size of the subsystem, but only on the number of disjoint intervals defining the bipartition. In the strong coupling limit on a torus we find that the scaling of the EE at small temperature is given by $S(T) - S(0) = O\left(\frac{m_{gap}}{T}e^{-\frac{m_{gap}}{T}}\right)$, which is similar to the scaling for the matter fields recently derived in literature. In the large $N$ limit we compute all of the Renyi entropies and identify the Douglas-Kazakov phase transition.

preprint, 2014, arXiv

Thermal Hall Effect and Geometry with Torsion

We formulate a geometric framework that allows one to study momentum and energy transport in non-relativistic systems. It amounts to coupling of the non-relativistic system to the Newton-Cartan geometry with torsion. The approach generalizes Luttinger's classic formulation of thermal transport. In particular, we clarify the geometric meaning of the fields conjugated to energy and energy current. These fields describe the geometric background with non-vanishing temporal torsion. We use the developed formalism to construct the equilibrium partition function of a non-relativistic system coupled to the NC geometry in 2+1 dimensions and to derive various thermodynamic relations.

preprint, 2011, arXiv

Soliton solutions of Calogero model in harmonic potential

A classical Calogero model in an external harmonic potential is known to be integrable for any number of particles. We consider here reductions which play the role of "soliton" solutions of the model. We obtain these solutions both for the model with a finite number of particles and in a hydrodynamic limit. In the latter limit the model is described by hydrodynamic equations on continuous density and velocity fields. Soliton solutions in this case are finite dimensional reductions of the hydrodynamic model and describe the propagation of lumps of density and velocity in the nontrivial background.
