Andrey Y. Lokhov

Topics: physics.soc-ph, Social and Information Networks, cond-mat.stat-mech, Machine Learning, cond-mat.dis-nn, physics.data-an, Populations and Evolution, quant-ph, Applications, Biomolecules, Data Structures and Algorithms, math.CO, math.DS, math.OC, math.PR, nlin.AO, Systems and Control

Catalog footprint

17 works

17 topics

4 close collaborators

preprint, 2026, arXiv

Finite Sample Bounds for Learning with Score Matching

Learning of continuous exponential family distributions with unbounded support remains an important area of research for both theory and applications in high-dimensional statistics. In recent years, score matching has become a widely used method for learning exponential families with continuous variables due to its computational ease when compared against maximum likelihood estimation. However, theoretical understanding of the statistical properties of score matching is still lacking. In this work, we provide a non-asymptotic sample complexity analysis for learning the structure of exponential families of polynomials with score matching. The derived sample bounds show a polynomial dependence on the model dimension. These bounds are the first of their kind, as all prior work has shown only asymptotic bounds on the sample complexity.
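As an illustration of why score matching is computationally easy for polynomial exponential families, here is a minimal sketch for a one-dimensional model with log-density theta1*x^2 + theta2*x^4 (up to a constant). For such a family the empirical score matching objective is quadratic in the parameters, J(theta) = 0.5 * theta^T A theta + b^T theta, so its minimizer solves a linear system and the normalizing constant is never computed. The feature choice and the function name `score_matching_quartic` are assumptions made for this example and are not taken from the paper.

```python
def score_matching_quartic(samples):
    """Score matching estimate for log p(x) = t1*x**2 + t2*x**4 + const.

    Illustrative sketch only: the features (x**2, x**4) are an assumed example.
    """
    n = len(samples)
    m2 = sum(x ** 2 for x in samples) / n
    m4 = sum(x ** 4 for x in samples) / n
    m6 = sum(x ** 6 for x in samples) / n

    # A = E[phi'(x) phi'(x)^T] with phi'(x) = (2x, 4x^3)
    a11, a12, a22 = 4.0 * m2, 8.0 * m4, 16.0 * m6
    # b = E[phi''(x)] with phi''(x) = (2, 12x^2)
    b1, b2 = 2.0, 12.0 * m2

    det = a11 * a22 - a12 * a12
    if det == 0.0:
        raise ValueError("degenerate sample: the 2x2 system is singular")

    # Minimizer of 0.5*t^T A t + b^T t is t = -A^{-1} b (Cramer's rule for 2x2).
    t1 = -(a22 * b1 - a12 * b2) / det
    t2 = -(a11 * b2 - a12 * b1) / det
    return t1, t2


if __name__ == "__main__":
    data = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]
    print(score_matching_quartic(data))
```

A negative estimate for the quartic coefficient corresponds to a normalizable density; nothing in this sketch enforces that constraint.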

preprint, 2023, arXiv

Learning of networked spreading models from noisy and incomplete data

Recent years have seen a lot of progress in algorithms for learning parameters of spreading dynamics from both full and partial data. Some of the remaining challenges include model selection under the scenarios of unknown network structure, noisy data, and missing observations in time, as well as an efficient incorporation of prior information to minimize the number of samples required for accurate learning. Here, we introduce a universal learning method based on a scalable dynamic message-passing technique that addresses these challenges often encountered in real data. The algorithm leverages available prior knowledge on the model and on the data, and reconstructs both network structure and parameters of a spreading model. We show that the computational complexity of the method, which is linear in the key model parameters, makes the algorithm scalable to large network instances.

preprint, 2022, arXiv

High-quality Thermal Gibbs Sampling with Quantum Annealing Hardware

Quantum Annealing (QA) was originally intended for accelerating the solution of combinatorial optimization tasks that have natural encodings as Ising models. However, recent experiments on QA hardware platforms have demonstrated that, in the operating regime corresponding to weak interactions, the QA hardware behaves like a noisy Gibbs sampler at a hardware-specific effective temperature. This work builds on those insights, identifies a class of small hardware-native Ising models that are robust to noise effects, and proposes a procedure for executing these models on QA hardware to maximize Gibbs sampling performance. Experimental results indicate that the proposed protocol results in high-quality Gibbs samples from a hardware-specific effective temperature. Furthermore, we show that this effective temperature can be adjusted by modulating the annealing time and energy scale. The procedure proposed in this work provides an approach to using QA hardware for Ising model sampling, presenting potential new opportunities for applications in machine learning and physics simulation.

preprint, 2022, arXiv

Learning Continuous Exponential Families Beyond Gaussian

We address the problem of learning continuous exponential family distributions with unbounded support. While a lot of progress has been made on learning Gaussian graphical models, we still lack scalable algorithms for reconstructing general continuous exponential families modeling higher-order moments of the data beyond the mean and the covariance. Here, we introduce a computationally efficient method for learning continuous graphical models based on the Interaction Screening approach. Through a series of numerical experiments, we show that our estimator maintains similar requirements in terms of accuracy and sample complexity scalings compared to alternative approaches such as maximization of conditional likelihood, while considerably improving upon the algorithm's run-time.

preprint, 2022, arXiv

Vector Field Visualization of Single-Qubit State Tomography

As the variety of commercially available quantum computers continues to increase, so does the need for tools that can characterize, verify and validate these computers. This work explores using quantum state tomography for characterizing the performance of individual qubits and develops a vector field visualization for presentation of the results. The proposed protocol is demonstrated in simulation and on quantum computing hardware developed by IBM. The results identify qubit performance features that are not reflected in the standard models of this hardware, indicating opportunities to improve the accuracy of these models. The proposed qubit evaluation protocol is provided as free open-source software to streamline the task of replicating the process on other quantum computing devices.

preprint, 2020, arXiv

Data-driven Selection of Coarse-Grained Models of Coupled Oscillators

Systematic discovery of reduced-order closure models for multi-scale processes remains an important open problem in complex dynamical systems. Even when an effective lower-dimensional representation exists, reduced models are difficult to obtain using solely analytical methods. Rigorous methodologies for finding such coarse-grained representations of multi-scale phenomena would enable accelerated computational simulations and provide fundamental insights into the complex dynamics of interest. We focus on a heterogeneous population of oscillators of Kuramoto type as a canonical model of complex dynamics, and develop a data-driven approach for inferring its coarse-grained description. Our method is based on a numerical optimization of the coefficients in a general equation of motion informed by analytical derivations in the thermodynamic limit. We show that certain assumptions are required to obtain an autonomous coarse-grained equation of motion. However, optimizing coefficient values enables coarse-grained models with conceptually disparate functional forms, yet comparable quality of representation, to provide accurate reduced-order descriptions of the underlying system.
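For readers unfamiliar with the microscopic model being coarse-grained, the following is a minimal sketch of the standard all-to-all Kuramoto model written in its mean-field form, d(theta_i)/dt = omega_i + K * r * sin(psi - theta_i), where r and psi are the magnitude and phase of the complex order parameter. It uses a plain forward-Euler step. The function names, step size, and parameter values are assumptions chosen for this example; the sketch does not reproduce the paper's inference procedure.

```python
import cmath
import math


def order_parameter(phases):
    """Return (r, psi) such that r * exp(i*psi) is the mean of exp(i*theta_j)."""
    z = sum(cmath.exp(1j * theta) for theta in phases) / len(phases)
    return abs(z), cmath.phase(z)


def kuramoto_step(phases, freqs, coupling, dt):
    """One forward-Euler step of the mean-field Kuramoto equations."""
    r, psi = order_parameter(phases)
    return [
        theta + dt * (omega + coupling * r * math.sin(psi - theta))
        for theta, omega in zip(phases, freqs)
    ]


def simulate(phases, freqs, coupling, dt, steps):
    """Integrate for a fixed number of steps and return the final phases."""
    for _ in range(steps):
        phases = kuramoto_step(phases, freqs, coupling, dt)
    return phases


if __name__ == "__main__":
    start = [0.0, 0.5, 1.0]
    end = simulate(start, freqs=[0.0, 0.0, 0.0], coupling=1.0, dt=0.01, steps=1000)
    print(order_parameter(start)[0], order_parameter(end)[0])
```

With identical natural frequencies and positive coupling, the order parameter magnitude r grows toward 1 as the phases synchronize. A heterogeneous population would draw `freqs` from a distribution instead.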

preprint, 2020, arXiv

Reducing urban traffic congestion due to localized routing decisions

Balancing traffic flow by influencing drivers' route choices to alleviate congestion is becoming increasingly appealing in urban traffic planning. Here, we introduce a discrete dynamical model comprising users who make their own routing choices on the basis of local information and those who consider routing advice based on localized inducement. We identify the formation of traffic patterns, develop a scalable optimization method for identifying control values used for user guidance, and test the effectiveness of these measures on synthetic and real-world road networks.

preprint, 2019, arXiv

Scalable Influence Estimation Without Sampling

In a diffusion process on a network, how many nodes are expected to be influenced by a set of initial spreaders? This natural problem, often referred to as influence estimation, boils down to computing the marginal probability that a given node is active at a given time when the process starts from a specified initial condition. Among many other applications, this task is crucial for the well-studied problem of influence maximization: finding optimal spreaders in a social network that maximize the influence spread by a certain time horizon. Indeed, influence estimation needs to be called multiple times for comparing candidate seed sets. Unfortunately, in many models of interest an exact computation of marginals is #P-hard. In practice, influence is often estimated using Monte-Carlo sampling methods that require a large number of runs for obtaining a high-fidelity prediction, especially at large times. It is thus desirable to develop analytic techniques as an alternative to sampling methods. Here, we suggest an algorithm for estimating the influence function in the popular independent cascade model based on a scalable dynamic message-passing approach. This method has the computational complexity of a single Monte-Carlo simulation and provides an upper bound on the expected spread on a general graph, yielding an exact answer for treelike networks. We also provide dynamic message-passing equations for a stochastic version of the linear threshold model. The resulting saving of a potentially large sampling factor in the running time compared to simulation-based techniques hence makes it possible to address large-scale problem instances.
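To make the idea of sampling-free influence estimation concrete, here is a minimal sketch of a message-passing recursion for the independent cascade model. Each directed edge (j, i) carries a message: the probability that j is active by time t in a graph where i's influence on j is ignored. The update rule below is reconstructed from the abstract and from standard message-passing reasoning, so it should be read as an assumption rather than the paper's exact equations. The function name and the graph encoding (a dict mapping directed edges to transmission probabilities) are hypothetical.

```python
def ic_dmp_influence(nodes, edges, seeds, horizon):
    """Approximate marginals P(node active by time `horizon`) for independent cascade.

    nodes: iterable of hashable node ids
    edges: dict {(u, v): alpha_uv} of directed transmission probabilities
           (list both directions for an undirected graph)
    seeds: set of initially active nodes
    """
    nodes = list(nodes)
    p0 = {v: 1.0 if v in seeds else 0.0 for v in nodes}
    in_nbrs = {v: [] for v in nodes}
    for (u, v) in edges:
        in_nbrs[v].append(u)

    # Messages start from the initial condition of the sending node.
    msg = {(j, i): p0[j] for (j, i) in edges}
    marg = dict(p0)

    for _ in range(horizon):
        new_marg = {}
        for i in nodes:
            # i stays inactive if it is not a seed and no in-neighbor transmits.
            stay = 1.0 - p0[i]
            for k in in_nbrs[i]:
                stay *= 1.0 - edges[(k, i)] * msg[(k, i)]
            new_marg[i] = 1.0 - stay

        new_msg = {}
        for (j, i) in edges:
            # Same product, but excluding the recipient i (cavity construction).
            stay = 1.0 - p0[j]
            for k in in_nbrs[j]:
                if k != i:
                    stay *= 1.0 - edges[(k, j)] * msg[(k, j)]
            new_msg[(j, i)] = 1.0 - stay

        marg, msg = new_marg, new_msg
    return marg


if __name__ == "__main__":
    path_edges = {(0, 1): 0.5, (1, 0): 0.5, (1, 2): 0.5, (2, 1): 0.5}
    marginals = ic_dmp_influence([0, 1, 2], path_edges, seeds={0}, horizon=2)
    print(marginals, sum(marginals.values()))
```

On the three-node path in the usage example the graph is a tree, so the marginals can be checked by hand: node 1 activates with probability 0.5 and node 2 with probability 0.25. Each iteration touches every edge a bounded number of times per neighbor, which is the sense in which the cost is comparable to one simulation run.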

preprint, 2016, arXiv

Detection of Cyber-Physical Faults and Intrusions from Physical Correlations

Cyber-physical systems are critical infrastructures that are crucial both to the reliable delivery of resources such as energy, and to the stable functioning of automatic and control architectures. These systems are composed of interdependent physical, control and communications networks described by disparate mathematical models, creating scientific challenges that go well beyond the modeling and analysis of the individual networks. A key challenge in cyber-physical defense is fast online detection and localization of faults and intrusions without prior knowledge of the failure type. We describe a set of techniques for the efficient identification of faults from correlations in physical signals, assuming only a minimal amount of available system information. The performance of our detection method is illustrated on data collected from a large building automation system.

preprint, 2016, arXiv

Optimal Deployment of Resources for Maximizing Impact in Spreading Processes

The effective use of limited resources for controlling spreading processes on networks is of prime significance in diverse contexts, ranging from the identification of "influential spreaders" for maximizing information dissemination and targeted interventions in regulatory networks, to the development of mitigation policies for infectious diseases and financial contagion in economic systems. Solutions for these optimization tasks that are based purely on topological arguments are not fully satisfactory; in realistic settings the problem is often characterized by heterogeneous interactions and requires interventions over a finite time window via a restricted set of controllable nodes. The optimal distribution of available resources hence results from an interplay between network topology and spreading dynamics. We show how these problems can be addressed as particular instances of a universal analytical framework based on a scalable dynamic message-passing approach and demonstrate the efficacy of the method on a variety of real-world examples.

preprint, 2016, arXiv

Reconstructing parameters of spreading models from partial observations

Spreading processes are often modelled as stochastic dynamics occurring on top of a given network with edge weights corresponding to the transmission probabilities. Knowledge of accurate transmission probabilities is essential for prediction, optimization, and control of diffusion dynamics. Unfortunately, in most cases the transmission rates are unknown and need to be reconstructed from the spreading data. Moreover, in realistic settings it is impossible to monitor the state of each node at every time, and thus the data is highly incomplete. We introduce an efficient dynamic message-passing algorithm, which is able to reconstruct parameters of the spreading model given only partial information on the activation times of nodes in the network. The method is generalizable to a large class of dynamic models, as well as to the case of temporal graphs.

preprint, 2015, arXiv

Dynamic message-passing equations for models with unidirectional dynamics

Understanding and quantifying the dynamics of disordered out-of-equilibrium models is an important problem in many branches of science. Using the dynamic cavity method on time trajectories, we construct a general procedure for deriving the dynamic message-passing equations for a large class of models with unidirectional dynamics, which includes the zero-temperature random field Ising model, the susceptible-infected-recovered model, and rumor spreading models. We show that unidirectionality of the dynamics is the key ingredient that makes the problem solvable. These equations are applicable to single instances of the corresponding problems with arbitrary initial conditions, and are asymptotically exact for problems defined on locally tree-like graphs. When applied to real-world networks, they generically provide a good analytic approximation of the real dynamics.

preprint, 2015, arXiv

Efficient reconstruction of transmission probabilities in a spreading process from partial observations

The important problem of reconstructing a diffusion network and its transmission probabilities from data has attracted considerable attention in the past several years. A number of recent papers introduced efficient algorithms for the estimation of spreading parameters, based on the maximization of the likelihood of observed cascades, assuming that the full information for all the nodes in the network is available. In this work, we focus on a more realistic and restricted scenario, in which only partial information on the cascades is available: either the set of activation times for a limited number of nodes, or the states of nodes for a subset of observation times. To tackle this problem, we first introduce a framework based on the maximization of the likelihood of the incomplete diffusion trace. However, we argue that the computation of this incomplete likelihood is a computationally hard problem, and show that a fast and robust reconstruction of transmission probabilities in sparse networks can be achieved with a new algorithm based on recently introduced dynamic message-passing equations for the spreading processes. The suggested approach can be easily generalized to a large class of discrete and continuous dynamic models, as well as to the cases of dynamically-changing networks and noisy information.

preprint, 2014, arXiv

A cavity approach to optimization and inverse dynamical problems

In these two lectures we shall discuss how the cavity approach can be used efficiently to study optimization problems with global (topological) constraints and how the same techniques can be generalized to study inverse problems in irreversible dynamical processes. These two classes of problems are formally very similar: they both require an efficient procedure to trace over all trajectories of either auxiliary variables which enforce global constraints, or directly dynamical variables defining the inverse dynamical problems. We will mention three basic examples, namely the Minimum Steiner Tree problem, the inverse threshold linear dynamical problem, and the patient-zero problem in epidemic cascades. All these examples are root problems in optimization and inference over networks. They appear in many modern applications and in a variety of different contexts. Credit for these results should be shared with A. Braunstein, A. Ramezanpour, F. Altarelli, L. Dall'Asta, I. Biazzo and A. Lage-Castellanos.

preprint, 2014, arXiv

Inferring the origin of an epidemic with a dynamic message-passing algorithm

We study the problem of estimating the origin of an epidemic outbreak -- given a contact network and a snapshot of epidemic spread at a certain time, determine the infection source. Finding the source is important in different contexts of computer or social networks. We assume that the epidemic spread follows the most commonly used susceptible-infected-recovered model. We introduce an inference algorithm based on dynamic message-passing equations, and we show that it leads to significant improvement of performance compared to existing approaches. Importantly, this algorithm remains efficient in the case where one knows the state of only a fraction of nodes.

preprint, 2014, arXiv

Topological transition in disordered planar matching: combinatorial arcs expansion

In this paper, we investigate analytically the properties of the disordered Bernoulli model of planar matching. This model is characterized by a topological phase transition, yielding complete planar matching solutions only above a critical density threshold. We develop a combinatorial procedure of arcs expansion that explicitly takes into account the contribution of short arcs, and allows one to obtain an accurate analytical estimation of the critical value by reducing the global constrained problem to a set of local ones. As an application to a toy representation of the RNA secondary structures, we suggest generalized models that incorporate a one-to-one correspondence between the contact matrix and the RNA-type sequence, thus giving sense to the notion of effective non-integer alphabets.

preprint, 2013, arXiv

New phase transition in random planar diagrams and RNA-type matching

We study the planar matching problem, defined by a symmetric random matrix with independent identically distributed entries, taking values 0 and 1. We show that the existence of a perfect planar matching structure is possible only above a certain critical density, $p_{c}$, of allowed contacts (i.e. of '1'). Using a formulation of the problem in terms of Dyck paths and a matrix model of planar contact structures, we provide an analytical estimation for the value of the transition point, $p_{c}$, in the thermodynamic limit. This estimation is close to the critical value, $p_{c} \approx 0.379$, obtained in numerical simulations based on an exact dynamic programming algorithm. We characterize the corresponding critical behavior of the model and discuss the relation of the perfect-imperfect matching transition to the known molten-glass transition in the context of random RNA secondary structure formation. In particular, we provide strong evidence supporting the conjecture that the molten-glass transition at T=0 occurs at $p_{c}$.
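The abstract refers to an exact dynamic programming algorithm without giving details. One way such a check could be done is with an interval recursion of the kind used for RNA secondary structure (Nussinov-style), sketched below. It computes the largest set of non-crossing arcs permitted by a symmetric 0/1 contact matrix, and a perfect planar matching exists when that number equals half the number of vertices. This is an assumed illustration, not necessarily the algorithm used by the authors, and the function names are hypothetical.

```python
def max_planar_matching(contacts):
    """Largest number of non-crossing arcs allowed by a symmetric 0/1 matrix."""
    n = len(contacts)
    if n == 0:
        return 0
    # best[i][j] = maximum number of non-crossing arcs among vertices i..j
    best = [[0] * n for _ in range(n)]
    for span in range(1, n):
        for i in range(n - span):
            j = i + span
            # Option 1: vertex i is left unmatched.
            value = best[i + 1][j]
            # Option 2: vertex i is paired with k, splitting the interval in two.
            for k in range(i + 1, j + 1):
                if contacts[i][k]:
                    inside = best[i + 1][k - 1] if k - 1 >= i + 1 else 0
                    outside = best[k + 1][j] if k + 1 <= j else 0
                    value = max(value, 1 + inside + outside)
            best[i][j] = value
    return best[0][n - 1]


def has_perfect_planar_matching(contacts):
    """True if every vertex can be paired without any two arcs crossing."""
    n = len(contacts)
    return n % 2 == 0 and max_planar_matching(contacts) == n // 2


if __name__ == "__main__":
    nested = [
        [0, 0, 0, 1],
        [0, 0, 1, 0],
        [0, 1, 0, 0],
        [1, 0, 0, 0],
    ]
    print(max_planar_matching(nested), has_perfect_planar_matching(nested))
```

The recursion runs in cubic time in the number of vertices. Estimating a critical density numerically would additionally require sampling random matrices at each density, which this sketch does not do.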
