Source author record

Andrey Zhmoginov

Andrey Zhmoginov appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Computer Vision Machine Learning physics.atom-ph gr-qc hep-ph Information Theory math.IT Neural and Evolutionary Computing physics.plasm-ph

Catalog footprint

What is connected

7works

9topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2022arXiv

Fine-tuning Image Transformers using Learnable Memory

In this paper we propose augmenting Vision Transformer models with learnable memory tokens. Our approach allows the model to adapt to new tasks, using few parameters, while optionally preserving its capabilities on previously learned tasks. At each layer we introduce a set of learnable embedding vectors that provide contextual information useful for specific datasets. We call these "memory tokens". We show that augmenting a model with just a handful of such tokens per layer significantly improves accuracy when compared to conventional head-only fine-tuning, and performs only slightly below the significantly more expensive full fine-tuning. We then propose an attention-masking approach that enables extension to new downstream tasks, with a computation reuse. In this setup in addition to being parameters efficient, models can execute both old and new tasks as a part of single inference at a small incremental cost.

preprint2022arXiv

HyperTransformer: Model Generation for Supervised and Semi-Supervised Few-Shot Learning

In this work we propose a HyperTransformer, a Transformer-based model for supervised and semi-supervised few-shot learning that generates weights of a convolutional neural network (CNN) directly from support samples. Since the dependence of a small generated CNN model on a specific task is encoded by a high-capacity Transformer model, we effectively decouple the complexity of the large task space from the complexity of individual tasks. Our method is particularly effective for small target CNN architectures where learning a fixed universal task-independent embedding is not optimal and better performance is attained when the information about the task can modulate all model parameters. For larger models we discover that generating the last layer alone allows us to produce competitive or better results than those obtained with state-of-the-art methods while being end-to-end differentiable.

preprint2020arXiv

Image segmentation via Cellular Automata

In this paper, we propose a new approach for building cellular automata to solve real-world segmentation problems. We design and train a cellular automaton that can successfully segment high-resolution images. We consider a colony that densely inhabits the pixel grid, and all cells are governed by a randomized update that uses the current state, the color, and the state of the $3\times 3$ neighborhood. The space of possible rules is defined by a small neural network. The update rule is applied repeatedly in parallel to a large random subset of cells and after convergence is used to produce segmentation masks that are then back-propagated to learn the optimal update rules using standard gradient descent methods. We demonstrate that such models can be learned efficiently with only limited trajectory length and that they show remarkable ability to organize the information to produce a globally consistent segmentation result, using only local information exchange. From a practical perspective, our approach allows us to build very efficient models -- our smallest automaton uses less than 10,000 parameters to solve complex segmentation tasks.

preprint2020arXiv

Information-Bottleneck Approach to Salient Region Discovery

We propose a new method for learning image attention masks in a semi-supervised setting based on the Information Bottleneck principle. Provided with a set of labeled images, the mask generation model is minimizing mutual information between the input and the masked image while maximizing the mutual information between the same masked image and the image label. In contrast with other approaches, our attention model produces a Boolean rather than a continuous mask, entirely concealing the information in masked-out pixels. Using a set of synthetic datasets based on MNIST and CIFAR10 and the SVHN datasets, we demonstrate that our method can successfully attend to features known to define the image class.

preprint2016arXiv

Inverting face embeddings with convolutional neural networks

Deep neural networks have dramatically advanced the state of the art for many areas of machine learning. Recently they have been shown to have a remarkable ability to generate highly complex visual artifacts such as images and text rather than simply recognize them. In this work we use neural networks to effectively invert low-dimensional face embeddings while producing realistically looking consistent images. Our contribution is twofold, first we show that a gradient ascent style approaches can be used to reproduce consistent images, with a help of a guiding image. Second, we demonstrate that we can train a separate neural network to effectively solve the minimization problem in one pass, and generate images in real-time. We then evaluate the loss imposed by using a neural network instead of the gradient descent by comparing the final values of the minimized loss function.

preprint2014arXiv

Antimatter interferometry for gravity measurements

We describe a light-pulse atom interferometer that is suitable for any species of atom and even for electrons and protons as well as their antiparticles, in particular for testing the Einstein equivalence principle with antihydrogen. The design obviates the need for resonant lasers through far-off resonant Bragg beam splitters and makes efficient use of scarce atoms by magnetic confinement and atom recycling. We expect to reach an initial accuracy of better than 1% for the acceleration of free fall of antihydrogen, which can be improved to the part-per million level.

preprint2013arXiv

Nonlinear dynamics of antihydrogen in magnetostatic traps: implications for gravitational measurements

The influence of gravity on antihydrogen dynamics in magnetic traps is studied. The advantages and disadvantages of various techniques for measuring the ratio of the gravitational mass to the inertial mass of antihydrogen are discussed. Theoretical considerations and numerical simulations indicate that stochasticity may be especially important for some experimental techniques in vertically oriented traps.