Researcher profile

Vitaly Vanchurin

Vitaly Vanchurin contributes to research discovery and scholarly infrastructure.

Trust snapshot

Quick read

Trust 21 (Emerging), Verification L1, Unclaimed author
10 works
0 followers
12 topics
4 close collaborators

Published work

10 published items

Preprint, 2022, arXiv

Bio-inspired Machine Learning: programmed death and replication

We analyze algorithmic and computational aspects of biological phenomena, such as replication and programmed death, in the context of machine learning. We use two different measures of neuron efficiency to develop machine learning algorithms for adding neurons to the system (i.e. replication algorithm) and removing neurons from the system (i.e. programmed death algorithm). We argue that the programmed death algorithm can be used for compression of neural networks and the replication algorithm can be used for improving performance of the already trained neural networks. We also show that a combined algorithm of programmed death and replication can improve the learning efficiency of arbitrary machine learning systems. The computational advantages of the bio-inspired algorithms are demonstrated by training feedforward neural networks on the MNIST dataset of handwritten images.
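
To illustrate the two operations named in the abstract above, here is a minimal Python sketch. It is not the authors' algorithm: the abstract does not say which two efficiency measures the paper uses, so the measure below (the L1 norm of a neuron's outgoing weights) is an assumption, as are all function names. Replication is shown with a common function-preserving split (copy the incoming weights, halve the outgoing weights), and biases are omitted for brevity.

```python
from typing import List, Tuple

Matrix = List[List[float]]

def neuron_scores(outgoing: Matrix) -> List[float]:
    """Hypothetical efficiency proxy: L1 norm of each hidden neuron's outgoing weights."""
    return [sum(abs(w) for w in row) for row in outgoing]

def programmed_death(incoming: Matrix, outgoing: Matrix, keep: int) -> Tuple[Matrix, Matrix]:
    """Remove the lowest-scoring hidden neurons, keeping `keep` of them in original order."""
    scores = neuron_scores(outgoing)
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    survivors = sorted(ranked[:keep])
    return [incoming[i] for i in survivors], [outgoing[i] for i in survivors]

def replication(incoming: Matrix, outgoing: Matrix) -> Tuple[Matrix, Matrix]:
    """Duplicate the highest-scoring neuron; both copies get half of its outgoing weights."""
    scores = neuron_scores(outgoing)
    best = max(range(len(scores)), key=lambda i: scores[i])
    half = [w / 2.0 for w in outgoing[best]]
    new_out = [list(row) for row in outgoing]
    new_out[best] = half
    new_out.append(list(half))
    new_in = [list(row) for row in incoming] + [list(incoming[best])]
    return new_in, new_out

def forward(x: List[float], incoming: Matrix, outgoing: Matrix) -> List[float]:
    """One hidden layer with ReLU: incoming[i] feeds hidden neuron i, outgoing[i] leaves it."""
    hidden = [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in incoming]
    n_out = len(outgoing[0])
    return [sum(h * row[j] for h, row in zip(hidden, outgoing)) for j in range(n_out)]

# Toy network: 2 inputs, 3 hidden neurons, 1 output.
incoming = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
outgoing = [[2.0], [0.01], [-1.0]]

pruned_in, pruned_out = programmed_death(incoming, outgoing, keep=2)
grown_in, grown_out = replication(incoming, outgoing)
```

In this toy setup, pruning drops the hidden neuron with the near-zero outgoing weight, while replication adds a fourth neuron without changing the network's output, so later training can let the two copies diverge.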

Preprint, 2022, arXiv

Towards a theory of quantum gravity from neural networks

A neural network is a dynamical system described by two different types of degrees of freedom: fast-changing non-trainable variables (e.g. state of neurons) and slow-changing trainable variables (e.g. weights and biases). We show that the non-equilibrium dynamics of trainable variables can be described by the Madelung equations, if the number of neurons is fixed, and by the Schrödinger equation, if the learning system is capable of adjusting its own parameters such as the number of neurons, step size and mini-batch size. We argue that the Lorentz symmetries and curved space-time can emerge from the interplay between stochastic entropy production and entropy destruction due to learning. We show that the non-equilibrium dynamics of non-trainable variables can be described by the geodesic equation (in the emergent space-time) for localized states of neurons, and by the Einstein equations (with cosmological constant) for the entire network. We conclude that the quantum description of trainable variables and the gravitational description of non-trainable variables are dual in the sense that they provide alternative macroscopic descriptions of the same learning system, defined microscopically as a neural network.

Preprint, 2021, arXiv

Dual Path Integral: a non-perturbative approach to strong coupling

We develop a non-perturbative method for calculating partition functions of strongly coupled quantum mechanical systems with interactions between subsystems described by a path integral of a dual system. The dual path integral is derived starting from non-interacting subsystems at zeroth order and then by introducing couplings of increasing complexity at each order of an iterative procedure. These orders of interactions play the role of a dual time and the full quantum partition function is expressed as a transition amplitude in the dual system. More precisely, it is expressed as a path integral from a deformation-operators dependent initial state at zero time/order to the inverse-temperature dependent final state at later time/order. We provide three examples of strongly coupled systems with first-order, second-order and higher-order interactions and discuss a possible emergence of space-time, quantum field theories and general relativity in the context of the dual path integral.

Preprint, 2021, arXiv

Towards a theory of machine learning

We define a neural network as a septuple consisting of (1) a state vector, (2) an input projection, (3) an output projection, (4) a weight matrix, (5) a bias vector, (6) an activation map and (7) a loss function. We argue that the loss function can be imposed either on the boundary (i.e. input and/or output neurons) or in the bulk (i.e. hidden neurons) for both supervised and unsupervised systems. We apply the principle of maximum entropy to derive a canonical ensemble of the state vectors subject to a constraint imposed on the bulk loss function by a Lagrange multiplier (or an inverse temperature parameter). We show that in equilibrium the canonical partition function must be a product of two factors: a function of the temperature and a function of the bias vector and weight matrix. Consequently, the total Shannon entropy consists of two terms which represent respectively a thermodynamic entropy and a complexity of the neural network. We derive the first and second laws of learning: during learning the total entropy must decrease until the system reaches an equilibrium (i.e. the second law), and the increment in the loss function must be proportional to the increment in the thermodynamic entropy plus the increment in the complexity (i.e. the first law). We calculate the entropy destruction to show that the efficiency of learning is given by the Laplacian of the total free energy which is to be maximized in an optimal neural architecture, and explain why the optimization condition is better satisfied in a deep network with a large number of hidden layers. The key properties of the model are verified numerically by training a supervised feedforward neural network using the method of stochastic gradient descent. We also discuss the possibility that the entire universe on its most fundamental level is a neural network.
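
The septuple in the abstract above can be pictured as a plain data structure. The Python sketch below is only an illustration under stated assumptions: the field names, the encoding of the two projections as index lists, the update rule x <- f(Wx + b), and the squared-error boundary loss are all choices made for this example and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Callable, List

Vector = List[float]
Matrix = List[List[float]]

@dataclass
class NeuralNetwork:
    state: Vector                            # (1) state vector
    input_projection: List[int]              # (2) indices of input neurons (assumed encoding)
    output_projection: List[int]             # (3) indices of output neurons (assumed encoding)
    weights: Matrix                          # (4) weight matrix
    bias: Vector                             # (5) bias vector
    activation: Callable[[float], float]     # (6) activation map, applied per neuron
    loss: Callable[[Vector, Vector], float]  # (7) loss function

    def set_inputs(self, values: Vector) -> None:
        """Write values onto the input (boundary) neurons."""
        for idx, v in zip(self.input_projection, values):
            self.state[idx] = v

    def step(self) -> None:
        """Assumed update rule: x <- f(W x + b), applied to the whole state vector."""
        pre = [sum(w * x for w, x in zip(row, self.state)) + b
               for row, b in zip(self.weights, self.bias)]
        self.state = [self.activation(p) for p in pre]

    def boundary_loss(self, target: Vector) -> float:
        """Loss imposed on the boundary: compare output neurons with a target."""
        outputs = [self.state[i] for i in self.output_projection]
        return self.loss(outputs, target)

def squared_error(y: Vector, t: Vector) -> float:
    return sum((a - b) ** 2 for a, b in zip(y, t))

# Three neurons in a chain: 0 (input) -> 1 (hidden) -> 2 (output).
net = NeuralNetwork(
    state=[0.0, 0.0, 0.0],
    input_projection=[0],
    output_projection=[2],
    weights=[[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],
    bias=[0.0, 0.0, 0.0],
    activation=lambda z: z,  # identity, to keep the arithmetic easy to follow
    loss=squared_error,
)
net.set_inputs([1.0])
net.step()
net.step()
final_loss = net.boundary_loss([1.0])
```

With the identity activation, the input value travels one neuron per step, so after two steps it sits on the output neuron and the boundary loss against a target of 1.0 vanishes. A bulk loss would instead be evaluated on the hidden neuron's state.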

Preprint, 2011, arXiv

Eternal Inflation, Global Time Cutoff Measures, and a Probability Paradox

The definition of probabilities in eternally inflating universes requires a measure to regulate the infinite spacetime volume, and much of the current literature uses a global time cutoff for this purpose. Such measures have been found to lead to paradoxical behavior, and recently Bousso, Freivogel, Leichenauer, and Rosenhaus have argued that, under reasonable assumptions, the only consistent interpretation for such measures is that time must end at the cutoff. Here we argue that there is an alternative, consistent formulation of such measures, in which time extends to infinity. Our formulation begins with a mathematical model of the infinite multiverse, which can be constructed without the use of a measure. Probabilities, which obey all the standard requirements for a probability measure, can then be defined by mathematical limits. They have a peculiar feature, however, which we call time-delay bias: if the outcome of an experiment is reported with a time delay that depends on the outcome, then the observation of the reports will be biased in favor of the shorter time delay. We show how the paradoxes can be resolved in this interpretation of the measure.

Preprint, 2011, arXiv

Geocentric cosmology: a new look at the measure problem

We show that most cutoff measures of the multiverse violate some of the basic properties of probability theory when applied repeatedly to predict the results of local experiments. Starting from minimal assumptions, such as the Markov property, we derive a correspondence between cosmological measures and quantum field theories in one fewer dimension. The correspondence allows us to replace the picture of an infinite multiverse with a finite causally connected region accessible by a given observer in conjunction with a Euclidean theory defined on its past boundary.

Preprint, 2011, arXiv

Towards a kinetic theory of strings

We study the dynamics of strings by means of a distribution function f(A, B, x, t) defined on a 9+1D phase space, where A and B are the correlation vectors of right- and left-moving waves. We derive a transport equation (analogous to the Boltzmann transport equation for particles) that governs the evolution of long strings with Nambu-Goto dynamics as well as reconnections taken into account. We also derive a system of coupled transport equations (analogous to the BBGKY hierarchy for particles) which can simultaneously describe long strings \tilde{f}(A, B, x, t) as well as simple loops \mathring{f}(A, B, x, t) made out of four correlation vectors. The formalism can be used to study non-linear dynamics of fundamental strings, D-brane strings or field theory strings. For example, the complicated semi-scaling behavior of cosmic strings translates into a simple solution of the transport system at small energy densities.

Preprint, 2010, arXiv

How many universes are in the multiverse?

We argue that the total number of distinguishable locally Friedmann universes generated by eternal inflation is proportional to the exponent of the entropy of inflationary perturbations and is limited by e^{e^{3 N}}, where N is the number of e-folds of slow-roll post-eternal inflation. For the simplest models of chaotic inflation, N is approximately equal to the de Sitter entropy at the end of eternal inflation; it can be exponentially large. However, not all of these universes can be observed by a local observer. In the presence of a cosmological constant Λ, the number of distinguishable universes is bounded by e^{|Λ|^{-3/4}}. In the context of the string theory landscape, the overall number of different universes is expected to be exponentially greater than the total number of vacua in the landscape. We discuss the possibility that the strongest constraint on the number of distinguishable universes may be related not to the properties of the multiverse but to the properties of observers.
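
The two bounds quoted in the abstract above are far too large to evaluate directly, but their base-10 logarithms are easy to compute. The Python sketch below does only that arithmetic. The sample inputs are assumptions for illustration and do not come from the abstract: N = 60 e-folds, and |Λ| = 1e-122, which presumes Λ is expressed in Planck units.

```python
import math

def log10_bound_from_efolds(n_efolds: float) -> float:
    """Return log10 of e^{e^{3N}}, i.e. e^{3N} * log10(e)."""
    return math.exp(3.0 * n_efolds) * math.log10(math.e)

def log10_bound_from_lambda(lam: float) -> float:
    """Return log10 of e^{|Λ|^{-3/4}}, i.e. |Λ|^{-3/4} * log10(e)."""
    return abs(lam) ** (-0.75) * math.log10(math.e)

# Assumed sample values (not taken from the abstract).
N_EFOLDS = 60.0
LAMBDA_PLANCK = 1e-122

print(f"log10 of e^(e^(3N)) for N={N_EFOLDS:g}: {log10_bound_from_efolds(N_EFOLDS):.3e}")
print(f"log10 of e^(|L|^(-3/4)) for |L|={LAMBDA_PLANCK:g}: {log10_bound_from_lambda(LAMBDA_PLANCK):.3e}")
```

Under these assumed inputs, the first bound has an exponent on the order of 10^77 and the second on the order of 10^91, which shows how quickly a double exponential outgrows ordinary notation.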

Preprint, 2010, arXiv

Semi-scaling cosmic strings

We develop a model of string dynamics with back-reaction from both scaling and non-scaling loops taken into account. The evolution of a string network is described by the distribution functions of coherence segments and kinks. We derive two non-linear equations which govern the evolution of the two distributions and solve them analytically in the limit of late times. We also show that the correlation function is an exponential, and solve the dynamics for the corresponding spectrum of scaling loops.

Preprint, 2010, arXiv

Towards a non-anthropic solution to the cosmological constant problem

Many probability measures in the multiverse depend exponentially on some observable parameters, giving rise to potential problems such as the youngness bias, the Q-catastrophe, etc. In this paper we explore the possibility that the exponential runaway dependence should be viewed not as a problem, but as a feature that may help us to fix all parameters in the landscape, including the value of the cosmological constant, without using anthropic considerations.