Source author record

Dmitry Krotov

Dmitry Krotov appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

Machine Learning, hep-th, Artificial Intelligence, astro-ph.CO, Biological Physics, Computation and Language, Computer Vision, cond-mat.dis-nn, cond-mat.stat-mech, gr-qc, hep-ph, Molecular Networks, Neural and Evolutionary Computing, Neurons and Cognition, nlin.AO, Social and Information Networks

Catalog footprint

What is connected

7 works

16 topics

4 close collaborators

Preprint, 2026, arXiv

Deep Clustering with Associative Memories

Deep clustering (joint representation learning and latent space clustering) is a well-studied problem, especially in computer vision and text processing, under the deep learning framework. While representation learning is generally differentiable, clustering is an inherently discrete optimization task, requiring various approximations and regularizations to fit in a standard differentiable pipeline. This leaves representation learning and clustering somewhat disjointed. In this work, we propose a novel loss function utilizing energy-based dynamics via Associative Memories to formulate a new deep clustering method, DCAM, which ties together the representation learning and clustering aspects more intricately in a single objective. Our experiments showcase the advantage of DCAM, producing improved clustering quality for various architecture choices (convolutional, residual or fully-connected) and data modalities (images or text).

Preprint, 2026, arXiv

Hyperparameter Transfer for Dense Associative Memories

Dense Associative Memory (DenseAM) is a promising family of AI architectures that is represented by a neural network performing temporal dynamics on an energy landscape. While hyperparameter transfer methods are well-studied for feed-forward networks, these methods have not been developed for settings in which weights are shared across layers and within the layer, which is common in DenseAMs. Additionally, DenseAMs utilize rapidly peaking activation functions that are rarely used in feed-forward architectures. The confluence of these aspects makes DenseAM a challenging framework for using existing methods for hyperparameter transfer. Our work initiates the development of hyperparameter transfer methods for this class of models. We derive explicit prescriptions for how the hyperparameters tuned on small models can be transferred to models trained at scale. We demonstrate excellent agreement between these theoretical findings and empirical results.
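As background for the energy-landscape picture mentioned in this abstract, here is a minimal sketch of a binary Dense Associative Memory, assuming a rectified polynomial energy E = -sum_mu F(xi_mu . sigma) with F(x) = max(x, 0)^n. The function names, the toy patterns and the choice n = 3 are illustrative assumptions. The sketch does not implement the paper's hyperparameter transfer prescriptions.

```python
import numpy as np

def F(x, n):
    # Rectified polynomial: a rapidly growing function of the overlap.
    return np.maximum(x, 0.0) ** n

def energy(memories, state, n=3):
    # E = -sum over stored patterns of F(overlap between pattern and state).
    return -np.sum(F(memories @ state, n))

def relax(memories, state, n=3, sweeps=2):
    # Asynchronous updates: set each spin to whichever sign gives lower energy,
    # so the energy never increases along the trajectory.
    state = np.asarray(state, dtype=float).copy()
    for _ in range(sweeps):
        for i in range(state.size):
            plus, minus = state.copy(), state.copy()
            plus[i], minus[i] = 1.0, -1.0
            lower_is_plus = energy(memories, plus, n) <= energy(memories, minus, n)
            state[i] = 1.0 if lower_is_plus else -1.0
    return state

# Three toy +/-1 patterns (hypothetical data, chosen only for illustration).
memories = np.array([[ 1.0,  1.0,  1.0,  1.0, -1.0, -1.0],
                     [ 1.0, -1.0,  1.0, -1.0,  1.0, -1.0],
                     [-1.0, -1.0,  1.0,  1.0,  1.0,  1.0]])

corrupted = memories[0].copy()
corrupted[0] *= -1.0  # flip one spin of the first pattern

recovered = relax(memories, corrupted)
energy_before = energy(memories, corrupted)
energy_after = energy(memories, recovered)
```

In this toy setting the corrupted state relaxes back to the first stored pattern, and the energy after relaxation is lower than before.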

Preprint, 2026, arXiv

Language Diffusion Models are Associative Memories Capable of Retrieving Unseen Data

When do language diffusion models memorize their training data, and how can their true generative regime be assessed quantitatively? We address these questions by showing that Uniform-based Discrete Diffusion Models (UDDMs) fundamentally behave as Associative Memories (AMs) with emergent creative capabilities. The core idea of an AM is to reliably recover stored data points as memories by establishing distinct basins of attraction around them. Historically, models like Hopfield networks use an explicit energy function to guarantee these stable attractors. We broaden this perspective by leveraging the observation that energy is not strictly necessary, as basins of attraction can also be formed via conditional likelihood maximization. By evaluating token recovery of training and test examples, we identify in UDDMs a sharp memorization-to-generalization transition governed by the size of the training dataset: as it increases, basins around training examples shrink and basins around unseen test examples expand, until both eventually converge to the same level. Crucially, we can detect this transition using only the conditional entropy of predicted token sequences: memorization is characterized by vanishing conditional entropy, while in the generalization regime the conditional entropy of most tokens remains finite. Thus, conditional entropy offers a practical probe for the memorization-to-generalization transition in deployed models.
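The entropy probe described in this abstract can be illustrated with a minimal sketch, assuming the model exposes a probability distribution over the vocabulary at every token position. The function name `token_entropies` and the two toy prediction arrays are hypothetical and are not taken from the paper.

```python
import numpy as np

def token_entropies(probs, eps=1e-12):
    # Shannon entropy (in nats) of each row of a
    # (sequence_length, vocab_size) array of predicted token probabilities.
    probs = np.asarray(probs, dtype=float)
    return -np.sum(probs * np.log(probs + eps), axis=-1)

# Hypothetical predictions for a 3-token sequence over a 4-word vocabulary.
# One-hot rows mimic a memorizing model: every token is fully determined.
memorized = np.array([[1.0, 0.0, 0.0, 0.0],
                      [0.0, 1.0, 0.0, 0.0],
                      [0.0, 0.0, 0.0, 1.0]])

# Spread-out rows mimic a generalizing model: several tokens stay plausible.
generalizing = np.array([[0.70, 0.10, 0.10, 0.10],
                         [0.25, 0.25, 0.25, 0.25],
                         [0.40, 0.40, 0.10, 0.10]])

mem_entropy = token_entropies(memorized)
gen_entropy = token_entropies(generalizing)
```

Here `mem_entropy` is numerically zero at every position, while `gen_entropy` stays finite, matching the qualitative distinction the abstract draws between the two regimes.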

Preprint, 2022, arXiv

Associative Learning for Network Embedding

The network embedding task is to represent each node in a network as a low-dimensional vector while incorporating the topological and structural information. Most existing approaches solve this problem by factorizing a proximity matrix, either directly or implicitly. In this work, we introduce a network embedding method from a new perspective, which leverages Modern Hopfield Networks (MHN) for associative learning. Our network learns associations between the content of each node and that node's neighbors. These associations serve as memories in the MHN. The recurrent dynamics of the network make it possible to recover the masked node, given that node's neighbors. Our proposed method is evaluated on different downstream tasks such as node classification and linkage prediction. The results show competitive performance compared to the common matrix factorization techniques and deep learning based methods.
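The idea of recovering a masked item through recurrent dynamics can be sketched with the softmax retrieval update commonly associated with Modern Hopfield Networks, state <- M^T softmax(beta * M state), where the rows of M are stored patterns. This is a generic illustration under stated assumptions, not the paper's embedding method: the encoding of node content and neighbors is omitted, and the names, toy patterns and beta = 4 are hypothetical.

```python
import numpy as np

def softmax(z):
    z = z - np.max(z)  # subtract the maximum for numerical stability
    e = np.exp(z)
    return e / e.sum()

def hopfield_retrieve(memories, query, beta=4.0, steps=3):
    # Repeatedly replace the state with an attention-weighted
    # average of the stored patterns (rows of `memories`).
    state = np.asarray(query, dtype=float)
    for _ in range(steps):
        attention = softmax(beta * (memories @ state))
        state = memories.T @ attention
    return state

# Three toy stored patterns (hypothetical data).
memories = np.array([[ 1.0,  1.0,  1.0,  1.0, -1.0, -1.0],
                     [ 1.0, -1.0,  1.0, -1.0,  1.0, -1.0],
                     [-1.0, -1.0,  1.0,  1.0,  1.0,  1.0]])

# Mask the first two entries of the second pattern and try to restore them.
query = memories[1].copy()
query[:2] = 0.0

retrieved = hopfield_retrieve(memories, query)
```

With these toy patterns the masked query falls in the basin of the second stored pattern, so `retrieved` is numerically very close to `memories[1]`, including the two masked entries.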

Preprint, 2013, arXiv

Morphogenesis at criticality?

Spatial patterns in the early fruit fly embryo emerge from a network of interactions among transcription factors, the gap genes, driven by maternal inputs. Such networks can exhibit many qualitatively different behaviors, separated by critical surfaces. At criticality, we should observe strong correlations in the fluctuations of different genes around their mean expression levels, a slowing of the dynamics along some but not all directions in the space of possible expression levels, correlations of expression fluctuations over long distances in the embryo, and departures from a Gaussian distribution of these fluctuations. Analysis of recent experiments on the gap genes shows that all these signatures are observed, and that the different signatures are related in ways predicted by theory. While there might be other explanations for these individual phenomena, the confluence of evidence suggests that this genetic network is tuned to criticality.

Preprint, 2011, arXiv

Infrared Sensitivity of Unstable Vacua

We discover that some unstable vacua have long memory. By that we mean that even in theories containing only massive particles, there are correlators and expectation values which grow with time. We examine the cases of instabilities caused by constant electric fields, expanding and contracting universes and, most importantly, the global de Sitter space. In the last case the interaction leads to a remarkable UV/IR mixing and to a large back reaction. This gives reasons to believe that the cosmological constant problem could be resolved by the infrared physics.

Preprint, 2008, arXiv

Quantum Field Theory as Effective BV Theory from Chern-Simons

A general procedure for obtaining explicit expressions for all cohomologies of N. Berkovits's operator is suggested. It is demonstrated that calculation of the BV integral for the classical Chern-Simons-like theory (Witten's OSFT-like theory) reproduces the BV version of a two-dimensional gauge model at the level of effective action. This model contains a gauge field, scalars, fermions and some other fields. We prove that this model is an example of a "singular" point from the perspective of the suggested method for cohomology evaluation. For an arbitrary "regular" point the same technique results in the AKSZ (Alexandrov, Kontsevich, Schwarz, Zaboronsky) version of Chern-Simons theory (BF theory) in accord with [2,3].