Source author record

Marc Lelarge

Marc Lelarge appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

math.PR, Machine Learning, Social and Information Networks, Discrete Mathematics, math.CO, physics.soc-ph, cond-mat.dis-nn, Data Structures and Algorithms, Information Theory, math.IT, Computer Science and Game Theory, math-ph, math.MP, math.ST, Networking and Internet Architecture, Statistics Theory, cond-mat.stat-mech, math.OC, math.SP

Catalog footprint

What is connected

32 works

19 topics

4 close collaborators

preprint · 2023 · arXiv

Convergence beyond the over-parameterized regime using Rayleigh quotients

In this paper, we present a new strategy to prove the convergence of deep learning architectures to a zero training (or even testing) loss by gradient flow. Our analysis is centered on the notion of Rayleigh quotients in order to prove Kurdyka-Łojasiewicz inequalities for a broader set of neural network architectures and loss functions. We show that Rayleigh quotients provide a unified view for several convergence analysis techniques in the literature. Our strategy produces a proof of convergence for various examples of parametric learning. In particular, our analysis does not require the number of parameters to tend to infinity, nor the number of samples to be finite, thus extending to test loss minimization and beyond the over-parameterized regime.

preprint · 2022 · arXiv

Impact of Community Structure on Cascades

We study cascades under the threshold model on sparse random graphs with community structure. In this model, individuals adopt the new behavior based on how many neighbors have already chosen it. Specifically, we consider the permanent adoption model wherein individuals that have adopted the new behavior (or opinion) cannot change their state. We present a differential-equation-based tight approximation to the stochastic process of adoption and prove the validity of the mean-field equations. In addition, we characterize both necessary and sufficient conditions for contagion to happen no matter how small the set of initial adopters is. Finally, we study the problem of optimum seeding given budget constraints and propose a gradient-based heuristic seeding strategy. Our algorithm, numerically, dispels commonly held beliefs in the literature that suggest the best seeding strategy is to seed over the vertices with the highest number of neighbors.
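The permanent adoption dynamics described above can be illustrated with a minimal sketch. It assumes a count-based rule (a node adopts once at least `threshold` of its neighbors have adopted) on an arbitrary small graph; the function name, the adjacency-dictionary format and the rule itself are illustrative assumptions, not the paper's model on sparse random graphs with community structure.

```python
def threshold_cascade(adjacency, seeds, threshold):
    """Run permanent-adoption dynamics until no further node adopts.

    adjacency: dict mapping each node to a list of its neighbors.
    seeds: iterable of initial adopters.
    threshold: number of adopting neighbors needed to adopt (assumed rule).
    """
    adopted = set(seeds)
    changed = True
    while changed:
        changed = False
        for node, neighbors in adjacency.items():
            if node in adopted:
                continue
            # Count neighbors that have already adopted.
            if sum(1 for nb in neighbors if nb in adopted) >= threshold:
                adopted.add(node)  # adoption is permanent: nodes are never removed
                changed = True
    return adopted


# Hypothetical example: a path 0 - 1 - 2 - 3.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
full_cascade = threshold_cascade(path, {0}, threshold=1)
no_cascade = threshold_cascade(path, {0}, threshold=2)
```

Because adoption is monotone, the final set does not depend on the order in which nodes are scanned, which is why a simple repeated sweep suffices here.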

preprint · 2022 · arXiv

SiMCa: Sinkhorn Matrix Factorization with Capacity Constraints

For a very broad range of problems, recommendation algorithms have been increasingly used over the past decade. In most of these algorithms, the predictions are built upon user-item affinity scores which are obtained from high-dimensional embeddings of items and users. In more complex scenarios, with geometrical or capacity constraints, prediction based on embeddings may not be sufficient and some additional features should be considered in the design of the algorithm. In this work, we study the recommendation problem in the setting where affinities between users and items are based both on their embeddings in a latent space and on their geographical distance in their underlying Euclidean space (e.g., $\mathbb{R}^2$), together with item capacity constraints. This framework is motivated by some real-world applications, for instance in healthcare: the task is to recommend hospitals to patients based on their location, pathology, and hospital capacities. In these applications, there is somewhat of an asymmetry between users and items: items are viewed as static points, their embeddings, capacities and locations constraining the allocation. Upon the observation of an optimal allocation, user embeddings, item capacities, and their positions in their underlying Euclidean space, our aim is to recover item embeddings in the latent space; doing so, we are then able to use this estimate, e.g., to predict future allocations. We propose an algorithm (SiMCa) based on matrix factorization enhanced with optimal transport steps to model user-item affinities and learn item embeddings from observed data. We then illustrate and discuss the results of such an approach for hospital recommendation on synthetic data.

preprint · 2017 · arXiv

Statistical and computational phase transitions in spiked tensor estimation

We consider tensor factorizations using a generative model and a Bayesian approach. We compute rigorously the mutual information, the Minimum Mean Squared Error (MMSE), and unveil information-theoretic phase transitions. In addition, we study the performance of Approximate Message Passing (AMP) and show that it achieves the MMSE for a large set of parameters, and that factorization is algorithmically "easy" in a much wider region than previously believed. There exists, however, a "hard" region where AMP fails to reach the MMSE and we conjecture that no polynomial algorithm will improve on AMP.

preprint · 2016 · arXiv

Clustering from Sparse Pairwise Measurements

We consider the problem of grouping items into clusters based on few random pairwise comparisons between the items. We introduce three closely related algorithms for this task: a belief propagation algorithm approximating the Bayes optimal solution, and two spectral algorithms based on the non-backtracking and Bethe Hessian operators. For the case of two symmetric clusters, we conjecture that these algorithms are asymptotically optimal in that they detect the clusters as soon as it is information theoretically possible to do so. We substantiate this claim for one of the spectral approaches we introduce.

preprint · 2016 · arXiv

Improving PageRank for Local Community Detection

Community detection is a classical problem in the field of graph mining. While most algorithms work on the entire graph, it is often interesting in practice to recover only the community containing some given set of seed nodes. In this paper, we propose a novel approach to this problem, using some low-dimensional embedding of the graph based on random walks starting from the seed nodes. From this embedding, we propose some simple yet efficient versions of the PageRank algorithm as well as a novel algorithm, called WalkSCAN, that is able to detect multiple communities, possibly overlapping. We provide insights into the performance of these algorithms through the theoretical analysis of a toy network and show that WalkSCAN outperforms existing algorithms on real networks.

preprint · 2015 · arXiv

Clustering and Inference From Pairwise Comparisons

Given a set of pairwise comparisons, the classical ranking problem computes a single ranking that best represents the preferences of all users. In this paper, we study the problem of inferring individual preferences, arising in the context of making personalized recommendations. In particular, we assume that there are $n$ users of $r$ types; users of the same type provide similar pairwise comparisons for $m$ items according to the Bradley-Terry model. We propose an efficient algorithm that accurately estimates the individual preferences for almost all users, if there are $r \max \{m, n\}\log m \log^2 n$ pairwise comparisons per type, which is near optimal in sample complexity when $r$ only grows logarithmically with $m$ or $n$. Our algorithm has three steps: first, for each user, compute the net-win vector, which is a projection of its $\binom{m}{2}$-dimensional vector of pairwise comparisons onto an $m$-dimensional linear subspace; second, cluster the users based on the net-win vectors; third, estimate a single preference for each cluster separately. The net-win vectors are much less noisy than the high-dimensional vectors of pairwise comparisons, and clustering is more accurate after the projection, as confirmed by numerical experiments. Moreover, we show that, when a cluster is only approximately correct, the maximum likelihood estimation for the Bradley-Terry model is still close to the true preference.
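The first step of the three-step procedure above can be sketched as follows. The sketch assumes that coordinate $i$ of a user's net-win vector is the number of comparisons item $i$ wins minus the number it loses; that reading, the function names and the `(winner, loser)` input format are assumptions made for illustration and are not taken from the abstract. The Bradley-Terry win probability $w_i / (w_i + w_j)$ is the standard form of that model.

```python
def net_win_vector(comparisons, num_items):
    """Collapse one user's pairwise comparisons into an m-dimensional vector.

    comparisons: list of (winner, loser) item indices (hypothetical format).
    Returns a list whose i-th entry is wins(i) - losses(i) (assumed definition).
    """
    vector = [0] * num_items
    for winner, loser in comparisons:
        vector[winner] += 1
        vector[loser] -= 1
    return vector


def bradley_terry_win_probability(weight_i, weight_j):
    """Probability that item i is preferred to item j under Bradley-Terry."""
    return weight_i / (weight_i + weight_j)


# Hypothetical user who prefers item 0 to items 1 and 2, and item 2 to item 1.
user_vector = net_win_vector([(0, 1), (0, 2), (2, 1)], num_items=3)
```

The map from comparisons to net-win vectors is linear, which is consistent with the abstract's description of it as a projection onto a lower-dimensional subspace; the clustering and per-cluster estimation steps are not sketched here.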

preprint · 2015 · arXiv

Combinatorial Bandits Revisited

This paper investigates stochastic and adversarial combinatorial multi-armed bandit problems. In the stochastic setting under semi-bandit feedback, we derive a problem-specific regret lower bound, and discuss its scaling with the dimension of the decision space. We propose ESCB, an algorithm that efficiently exploits the structure of the problem, and provide a finite-time analysis of its regret. ESCB has better performance guarantees than existing algorithms, and significantly outperforms these algorithms in practice. In the adversarial setting under bandit feedback, we propose CombEXP, an algorithm with the same regret scaling as state-of-the-art algorithms, but with lower computational complexity for some combinatorial problems.

preprint · 2015 · arXiv

Counting matchings in irregular bipartite graphs and random lifts

We give a sharp lower bound on the number of matchings of a given size in a bipartite graph. When specialized to regular bipartite graphs, our results imply Friedland's Lower Matching Conjecture and Schrijver's theorem proven by Gurvits and Csikvari. Indeed, our work extends the recent work of Csikvari done for regular and bi-regular bipartite graphs. Moreover, our lower bounds are order optimal as they are attained for a sequence of $2$-lifts of the original graph as well as for random $n$-lifts of the original graph when $n$ tends to infinity. We then extend our results to permanents and subpermanent sums. For permanents, we are able to recover the lower bound of Schrijver recently proved by Gurvits using stable polynomials. Our proof is algorithmic and borrows ideas from the theory of local weak convergence of graphs, statistical physics and covers of graphs. We provide new lower bounds for subpermanent sums and obtain new results on the number of matchings in random $n$-lifts, with some implications for the matching measure and the spectral measure of random $n$-lifts as well as for the spectral measure of infinite trees.

preprint · 2015 · arXiv

Non-backtracking spectrum of random graphs: community detection and non-regular Ramanujan graphs

A non-backtracking walk on a graph is a directed path such that no edge is the inverse of its preceding edge. The non-backtracking matrix of a graph is indexed by its directed edges and can be used to count non-backtracking walks of a given length. It has been used recently in the context of community detection and has appeared previously in connection with the Ihara zeta function and in some generalizations of Ramanujan graphs. In this work, we study the largest eigenvalues of the non-backtracking matrix of the Erdős-Rényi random graph and of the Stochastic Block Model in the regime where the number of edges is proportional to the number of vertices. Our results confirm the "spectral redemption" conjecture that community detection can be made on the basis of the leading eigenvectors above the feasibility threshold.
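The definition above can be made concrete with a small sketch that builds the non-backtracking matrix of an undirected simple graph and uses it to count non-backtracking walks. The function names and the edge-list input format are illustrative assumptions; the sketch uses plain Python lists and does not address the spectral analysis studied in the paper.

```python
def non_backtracking_matrix(edges):
    """Return (directed_edges, B) for an undirected simple graph.

    B[i][j] = 1 when directed edge i = (u, v) can be followed by
    directed edge j = (v, w) with w != u, and 0 otherwise.
    """
    directed = []
    for u, v in edges:
        directed.append((u, v))
        directed.append((v, u))
    index = {e: i for i, e in enumerate(directed)}
    size = len(directed)
    B = [[0] * size for _ in range(size)]
    for (u, v) in directed:
        for (x, w) in directed:
            if x == v and w != u:  # continue from v without reversing the edge
                B[index[(u, v)]][index[(x, w)]] = 1
    return directed, B


def count_nb_walks(B, length):
    """Count non-backtracking walks made of `length` directed edges (length >= 1)."""
    size = len(B)
    counts = [1] * size  # one-edge walks, grouped by their last directed edge
    for _ in range(length - 1):
        counts = [sum(counts[i] * B[i][j] for i in range(size)) for j in range(size)]
    return sum(counts)


# Hypothetical example: on a triangle every directed edge has exactly one
# non-backtracking continuation, so the count stays at 6 for every length.
_, triangle_B = non_backtracking_matrix([(0, 1), (1, 2), (2, 0)])
```

On a path with three vertices, by contrast, the count drops to zero once the walk length exceeds the number of edges, since a walk reaching an endpoint cannot turn back.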

preprint · 2015 · arXiv

On rigidity, orientability and cores of random graphs with sliders

Suppose that you add rigid bars between points in the plane, and suppose that a constant fraction $q$ of the points moves freely in the whole plane; the remaining fraction is constrained to move on fixed lines called sliders. When does a giant rigid cluster emerge? Under a genericity condition, the answer only depends on the graph formed by the points (vertices) and the bars (edges). We find for the random graph $G \in \mathcal{G}(n,c/n)$ the threshold value of $c$ for the appearance of a linear-sized rigid component as a function of $q$, generalizing results of Kasiviswanathan et al. We show that this appearance of a giant component undergoes a continuous transition for $q \leq 1/2$ and a discontinuous transition for $q > 1/2$. In our proofs, we introduce a generalized notion of orientability interpolating between 1- and 2-orientability, of cores interpolating between 2-core and 3-core, and of extended cores interpolating between 2+1-core and 3+2-core; we find the precise expressions for the respective thresholds and the sizes of the different cores above the threshold. In particular, this proves a conjecture of Kasiviswanathan et al. about the size of the 3+2-core. We also derive some structural properties of rigidity with sliders (matroid and decomposition into components) which can be of independent interest.

preprint · 2015 · arXiv

Reconstruction in the Labeled Stochastic Block Model

The labeled stochastic block model is a random graph model representing networks with community structure and interactions of multiple types. In its simplest form, it consists of two communities of approximately equal size, and the edges are drawn and labeled at random with probability depending on whether their two endpoints belong to the same community or not. It has been conjectured by Heimlicher et al. (2012) that correlated reconstruction (i.e., identification of a partition correlated with the true partition into the underlying communities) would be feasible if and only if a model parameter exceeds a threshold. We prove one half of this conjecture, i.e., reconstruction is impossible when below the threshold. In the positive direction, we introduce a weighted graph to exploit the label information. With a suitable choice of weight function, we show that when above the threshold by a specific constant, reconstruction is achieved by (1) minimum bisection, (2) a semidefinite relaxation of minimum bisection, and (3) a spectral method combined with removal of edges incident to vertices of high degree. Furthermore, we show that hypothesis testing between the labeled stochastic block model and the labeled Erdős-Rényi random graph model exhibits a phase transition at the conjectured reconstruction threshold.

preprint · 2015 · arXiv

Spectral Detection in the Censored Block Model

We consider the problem of partially recovering hidden binary variables from the observation of (few) censored edge weights, a problem with applications in community detection, correlation clustering and synchronization. We describe two spectral algorithms for this task based on the non-backtracking and the Bethe Hessian operators. These algorithms are shown to be asymptotically optimal for the partial recovery problem, in that they detect the hidden assignment as soon as it is information theoretically possible to do so.

preprint · 2015 · arXiv

Streaming, Memory Limited Matrix Completion with Noise

In this paper, we consider the streaming memory-limited matrix completion problem when the observed entries are noisy versions of a small random fraction of the original entries. We are interested in scenarios where the matrix size is very large so the matrix is very hard to store and manipulate. Here, columns of the observed matrix are presented sequentially and the goal is to complete the missing entries after one pass on the data with limited memory space and limited computational complexity. We propose a streaming algorithm which produces an estimate of the original matrix with a vanishing mean square error, uses memory space scaling linearly with the ambient dimension of the matrix, i.e., the memory required to store the output alone, and performs a number of computations comparable to the number of non-zero entries of the input matrix.

preprint · 2015 · arXiv

The diameter of weighted random graphs

In this paper we study the impact of random exponential edge weights on the distances in a random graph and, in particular, on its diameter. Our main result consists of a precise asymptotic expression for the maximal weight of the shortest weight paths between all vertices (the weighted diameter) of sparse random graphs, when the edge weights are i.i.d. exponential random variables.

preprint · 2015 · arXiv

Universality in polytope phase transitions and message passing algorithms

We consider a class of nonlinear mappings $\mathsf{F}_{A,N}$ in $\mathbb{R}^N$ indexed by symmetric random matrices $A\in\mathbb{R}^{N\times N}$ with independent entries. Within spin glass theory, special cases of these mappings correspond to iterating the TAP equations and were studied by Bolthausen [Comm. Math. Phys. 325 (2014) 333-366]. Within information theory, they are known as "approximate message passing" algorithms. We study the high-dimensional (large $N$) behavior of the iterates of $\mathsf{F}$ for polynomial functions $\mathsf{F}$, and prove that it is universal; that is, it depends only on the first two moments of the entries of $A$, under a sub-Gaussian tail condition. As an application, we prove the universality of a certain phase transition arising in polytope geometry and compressed sensing. This solves, for a broad class of random projections, a conjecture by David Donoho and Jared Tanner.

preprint · 2014 · arXiv

Adaptive Replication in Distributed Content Delivery Networks

We address the problem of content replication in large distributed content delivery networks, composed of a data center assisted by many small servers with limited capabilities and located at the edge of the network. The objective is to optimize the placement of contents on the servers to offload the data center as much as possible. We model the system constituted by the small servers as a loss network, each loss corresponding to a request to the data center. Based on large system / storage behavior, we obtain an asymptotic formula for the optimal replication of contents and propose adaptive schemes related to those encountered in cache networks but reacting here to loss events, and faster algorithms generating virtual events at higher rate while keeping the same target replication. We show through simulations that our adaptive schemes significantly outperform standard replication strategies both in terms of loss rates and adaptation speed.

preprint · 2014 · arXiv

Contagions in Random Networks with Overlapping Communities

We consider a threshold epidemic model on a clustered random graph with overlapping communities. In other words, our epidemic model is such that an individual becomes infected as soon as the proportion of her infected neighbors exceeds the threshold q of the epidemic. In our random graph model, each individual can belong to several communities. The distributions for the community sizes and the number of communities an individual belongs to are arbitrary. We consider the case where the epidemic starts from a single individual, and we prove a phase transition (when the parameter q of the model varies) for the appearance of a cascade, i.e. when the epidemic can be propagated to an infinite part of the population. More precisely, we show that our epidemic is entirely described by a multi-type (and alternating) branching process, and then we apply Sevastyanov's theorem about the phase transition of multi-type Galton-Watson branching processes. In addition, we compute the entries of the matrix whose largest eigenvalue gives the phase transition.

preprint · 2014 · arXiv

Edge Label Inference in Generalized Stochastic Block Models: from Spectral Theory to Impossibility Results

The classical setting of community detection consists of networks exhibiting a clustered structure. To more accurately model real systems we consider a class of networks (i) whose edges may carry labels and (ii) which may lack a clustered structure. Specifically we assume that nodes possess latent attributes drawn from a general compact space and edges between two nodes are randomly generated and labeled according to some unknown distribution as a function of their latent attributes. Our goal is then to infer the edge label distributions from a partially observed network. We propose a computationally efficient spectral algorithm and show it allows for asymptotically correct inference when the average node degree could be as low as logarithmic in the total number of nodes. Conversely, if the average node degree is below a specific constant threshold, we show that no algorithm can achieve better inference than guessing without using the observations. As a byproduct of our analysis, we show that our model provides a general procedure to construct random graph models with a spectrum asymptotic to a pre-specified eigenvalue distribution such as a power-law distribution.

preprint · 2014 · arXiv

Loopy annealing belief propagation for vertex cover and matching: convergence, LP relaxation, correctness and Bethe approximation

For the minimum cardinality vertex cover and maximum cardinality matching problems, the max-product form of belief propagation (BP) is known to perform poorly on general graphs. In this paper, we present an iterative loopy annealing BP (LABP) algorithm which is shown to converge and to solve a Linear Programming relaxation of the vertex cover or matching problem on general graphs. LABP finds (asymptotically) a minimum half-integral vertex cover (hence provides a 2-approximation) and a maximum fractional matching on any graph. We also show that LABP finds (asymptotically) a minimum size vertex cover for any bipartite graph and, as a consequence, computes the matching number of the graph. Our proof relies on some subtle monotonicity arguments for the local iteration. We also show that the Bethe free entropy is concave and that LABP maximizes it. Using loop calculus, we also give an exact (though intractable for general graphs) expression of the partition function for matching in terms of the LABP messages, which can be used to improve mean-field approximations.

preprint · 2014 · arXiv

Streaming, Memory Limited Algorithms for Community Detection

In this paper, we consider sparse networks consisting of a finite number of non-overlapping communities, i.e., disjoint clusters, so that there is higher density within clusters than across clusters. Both the intra- and inter-cluster edge densities vanish when the size of the graph grows large, making the cluster reconstruction problem noisier and hence difficult to solve. We are interested in scenarios where the network size is very large, so that the adjacency matrix of the graph is hard to manipulate and store. The data stream model in which columns of the adjacency matrix are revealed sequentially constitutes a natural framework in this setting. For this model, we develop two novel clustering algorithms that extract the clusters asymptotically accurately. The first algorithm is offline, as it needs to store and keep the assignments of nodes to clusters, and requires a memory that scales linearly with the network size. The second algorithm is online, as it may classify a node when the corresponding column is revealed and then discard this information. This algorithm requires a memory growing sub-linearly with the network size. To construct these efficient streaming memory-limited clustering algorithms, we first address the problem of clustering with partial information, where only a small proportion of the columns of the adjacency matrix is observed, and develop, for this setting, a new spectral algorithm which is of independent interest.

preprint · 2013 · arXiv

Sublinear-Time Algorithms for Monomer-Dimer Systems on Bounded Degree Graphs

For a graph $G$, let $Z(G,λ)$ be the partition function of the monomer-dimer system defined by $\sum_k m_k(G)λ^k$, where $m_k(G)$ is the number of matchings of size $k$ in $G$. We consider graphs of bounded degree and develop a sublinear-time algorithm for estimating $\log Z(G,λ)$ at an arbitrary value $λ>0$ within additive error $εn$ with high probability. The query complexity of our algorithm does not depend on the size of $G$ and is polynomial in $1/ε$, and we also provide a lower bound quadratic in $1/ε$ for this problem. This is the first analysis of a sublinear-time approximation algorithm for a $\#P$-complete problem. Our approach is based on the correlation decay of the Gibbs distribution associated with $Z(G,λ)$. We show that our algorithm approximates the probability for a vertex to be covered by a matching, sampled according to this Gibbs distribution, in a near-optimal sublinear time. We extend our results to approximate the average size and the entropy of such a matching within an additive error with high probability, where again the query complexity is polynomial in $1/ε$ and the lower bound is quadratic in $1/ε$. Our algorithms are simple to implement and of practical use when dealing with massive datasets. Our results extend to other systems where the correlation decay is known to hold, as for the independent set problem up to the critical activity.
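To make the definition of $Z(G,λ)$ concrete, the following sketch computes the matching counts $m_k(G)$ exactly by brute-force recursion and then evaluates the polynomial $\sum_k m_k(G)λ^k$. This is an exponential-time illustration for tiny graphs only, not the sublinear-time algorithm of the paper; the function names and the edge-list format are assumptions.

```python
def matching_counts(edges):
    """Return [m_0, m_1, ...] where m_k is the number of matchings of size k."""
    if not edges:
        return [1]  # only the empty matching
    (u, v), rest = edges[0], edges[1:]
    # Matchings that do not use the edge (u, v).
    without = matching_counts(rest)
    # Matchings that use (u, v): drop every edge touching u or v, then shift by one.
    remaining = [e for e in rest if u not in e and v not in e]
    with_edge = [0] + matching_counts(remaining)
    size = max(len(without), len(with_edge))
    without += [0] * (size - len(without))
    with_edge += [0] * (size - len(with_edge))
    return [a + b for a, b in zip(without, with_edge)]


def partition_function(edges, lam):
    """Evaluate Z(G, lam) = sum_k m_k(G) * lam**k for a small graph."""
    return sum(m_k * lam ** k for k, m_k in enumerate(matching_counts(edges)))


# Hypothetical example: the 4-cycle has one empty matching, four single-edge
# matchings and two perfect matchings.
four_cycle = [(0, 1), (1, 2), (2, 3), (3, 0)]
counts = matching_counts(four_cycle)
```

The recursion branches on whether the first edge belongs to the matching, which is the standard deletion-style identity for matching polynomials.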

preprint · 2012 · arXiv

A new approach to the orientation of random hypergraphs

An $h$-uniform hypergraph $H=(V,E)$ is called $(l,k)$-orientable if there exists an assignment of each hyperedge $e$ to exactly $l$ of its vertices such that no vertex is assigned more than $k$ hyperedges. Let $H_{n,m,h}$ be a hypergraph drawn uniformly at random from the set of all $h$-uniform hypergraphs with $n$ vertices and $m$ edges. In this paper, we determine the threshold of the existence of an $(l,k)$-orientation of $H_{n,m,h}$ for $k \geq 1$ and $h > l \geq 1$, extending recent results motivated by applications such as cuckoo hashing or load balancing with guaranteed maximum load. Our proof combines the local weak convergence of sparse graphs and a careful analysis of a Gibbs measure on spanning subgraphs with degree constraints. It allows us to deal with a much broader class than the uniform hypergraphs.
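The orientability definition above can be checked directly on very small hypergraphs. The sketch below enumerates every way of assigning each hyperedge to `l` of its vertices and tests the load constraint; it is an exhaustive search meant only to illustrate the definition, not the threshold analysis of the paper, and the function name and tuple-based input format are assumptions.

```python
from itertools import combinations, product


def is_orientable(hyperedges, l, k):
    """Return True if the hypergraph admits an (l, k)-orientation.

    hyperedges: list of tuples of distinct vertices (hypothetical format).
    Each hyperedge is assigned to exactly l of its vertices, and no vertex
    may be assigned more than k hyperedges.
    """
    # All admissible choices of l vertices for each hyperedge.
    choices = [list(combinations(edge, l)) for edge in hyperedges]
    for assignment in product(*choices):
        load = {}
        for chosen in assignment:
            for vertex in chosen:
                load[vertex] = load.get(vertex, 0) + 1
        if all(count <= k for count in load.values()):
            return True
    return False


# Hypothetical examples with ordinary graphs (h = 2).
triangle = [(0, 1), (1, 2), (2, 0)]
complete_4 = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
triangle_ok = is_orientable(triangle, l=1, k=1)
```

For `l = 1` and `k = 1` the complete graph on four vertices fails by counting alone, since six edges cannot be assigned to four vertices with at most one edge each.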

preprint · 2012 · arXiv

Community Detection in the Labelled Stochastic Block Model

We consider the problem of community detection from observed interactions between individuals, in the context where multiple types of interaction are possible. We use labelled stochastic block models to represent the observed data, where labels correspond to interaction types. Focusing on a two-community scenario, we conjecture a threshold for the problem of reconstructing the hidden communities in a way that is correlated with the true partition. To substantiate the conjecture, we prove that the given threshold correctly identifies a transition in the behaviour of belief propagation from insensitive to sensitive. We further prove that the same threshold corresponds to the transition in a related inference problem on a tree model from infeasible to feasible. Finally, numerical results using belief propagation for community detection give further support to the conjecture.

preprint · 2012 · arXiv

Convergence of multivariate belief propagation, with applications to cuckoo hashing and load balancing

This paper is motivated by two applications, namely i) generalizations of cuckoo hashing, a computationally simple approach to assigning keys to objects, and ii) load balancing in content distribution networks, where one is interested in determining the impact of content replication on performance. These two problems admit a common abstraction: in both scenarios, performance is characterized by the maximum weight of a generalization of a matching in a bipartite graph, featuring node and edge capacities. Our main result is a law of large numbers characterizing the asymptotic maximum weight matching in the limit of large bipartite random graphs, when the graphs admit a local weak limit that is a tree. This result specializes to the two application scenarios, yielding new results in both contexts. In contrast with previous results, the key novelty is the ability to handle edge capacities with arbitrary integer values. An analysis of belief propagation algorithms (BP) with multivariate belief vectors underlies the proof. In particular, we show convergence of the corresponding BP by exploiting monotonicity of the belief vectors with respect to the so-called upshifted likelihood ratio stochastic order. This auxiliary result can be of independent interest, providing a new set of structural conditions which ensure convergence of BP.

preprint · 2012 · arXiv

Coordination in Network Security Games: a Monotone Comparative Statics Approach

Malicious software, or malware for short, has become a major security threat. While originating in criminal behavior, its impact is also influenced by the decisions of legitimate end users. Getting agents in the Internet, and in networks in general, to invest in and deploy security features and protocols is a challenge, in particular because of economic reasons arising from the presence of network externalities. In this paper, we focus on the question of incentive alignment for agents of a large network towards better security. We start with an economic model for a single agent that determines the optimal amount to invest in protection. The model takes into account the vulnerability of the agent to a security breach and the potential loss if a security breach occurs. We derive conditions on the quality of the protection to ensure that the optimal amount spent on security is an increasing function of the agent's vulnerability and potential loss. We also show that for a large class of risks, only a small fraction of the expected loss should be invested. Building on these results, we study a network of interconnected agents subject to epidemic risks. We derive conditions to ensure that the incentives of all agents are aligned towards better security. When agents are strategic, we show that security investments are always socially inefficient due to the network externalities. Moreover, alignment of incentives typically implies a coordination problem, leading to an equilibrium with a very high price of anarchy.

preprint · 2012 · arXiv

How Clustering Affects Epidemics in Random Networks

Motivated by the analysis of social networks, we study a model of random networks that has both a given degree distribution and a tunable clustering coefficient. We consider two types of growth processes on these graphs: diffusion and the symmetric threshold model. The diffusion process is inspired by epidemic models. It is characterized by an infection probability, each neighbor transmitting the epidemic independently. In the symmetric threshold process, the interactions are still local but the propagation rule is governed by a threshold (that might vary among the different nodes). An interesting example of a symmetric threshold process is the contagion process, which is inspired by a simple coordination game played on the network. Both types of processes have been used to model the spread of new ideas, technologies, viruses or worms, and results have been obtained for random graphs with no clustering. In this paper, we are able to analyze the impact of clustering on the growth processes. While clustering inhibits the diffusion process, its impact on the contagion process is more subtle and depends on the connectivity of the graph: in a low connectivity regime, clustering also inhibits the contagion, while in a high connectivity regime, clustering favors the appearance of global cascades but reduces their size. For both diffusion and symmetric threshold models, we characterize conditions under which global cascades are possible and compute their size explicitly, as a function of the degree distribution and the clustering coefficient. Our results are applied to regular or power-law graphs with exponential cutoff and shed new light on the impact of clustering.

preprint · 2012 · arXiv

Leveraging Side Observations in Stochastic Bandits

This paper considers stochastic bandits with side observations, a model that accounts for both the exploration/exploitation dilemma and relationships between arms. In this setting, after pulling an arm i, the decision maker also observes the rewards for some other actions related to i. We will see that this model is suited to content recommendation in social networks, where users' reactions may be endorsed or not by their friends. We provide efficient algorithms based on upper confidence bounds (UCBs) to leverage this additional information and derive new bounds improving on standard regret guarantees. We also evaluate these policies in the context of movie recommendation in social networks: experiments on real datasets show substantial learning rate speedups ranging from 2.2x to 14x on dense networks.

preprint2012arXiv

Matchings on infinite graphs

Elek and Lippner (2010) showed that the convergence of a sequence of bounded-degree graphs implies the existence of a limit for the proportion of vertices covered by a maximum matching. We provide a characterization of the limiting parameter via a local recursion defined directly on the limit of the graph sequence. Interestingly, the recursion may admit multiple solutions, implying non-trivial long-range dependencies between the covered vertices. We overcome this lack of correlation decay by introducing a perturbative parameter (temperature), which we let progressively go to zero. This allows us to uniquely identify the correct solution. In the important case where the graph limit is a unimodular Galton-Watson tree, the recursion simplifies into a distributional equation that can be solved explicitly, leading to a new asymptotic formula that considerably extends the well-known one by Karp and Sipser for Erdős-Rényi random graphs.

preprint2011arXiv

Diffusion and Cascading Behavior in Random Networks

The spread of new ideas, behaviors or technologies has been extensively studied using epidemic models. Here we consider a model of diffusion where the individuals' behavior is the result of a strategic choice. We study a simple coordination game with binary choice and give a condition for a new action to become widespread in a random network. We also analyze the possible equilibria of this game and identify conditions for the coexistence of both strategies in large connected sets. Finally, we look at how firms can use social networks to promote their goals with limited information. Our results differ strongly from those derived with epidemic models and show that connectivity plays an ambiguous role: while it allows the diffusion to spread, when the network is highly connected, the diffusion is also limited by high-degree nodes, which are very stable.
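The threshold dynamics behind such a coordination game can be sketched as follows. This is an illustrative simulation, not the paper's analysis: it assumes permanent adoption and that a node switches once at least a fraction `q` of its neighbors have adopted. The function name `cascade` and the parameter `q` are hypothetical.

```python
def cascade(adj, seeds, q):
    """Run threshold contagion to its fixed point.

    adj   -- dict mapping each node to a list of its neighbors
    seeds -- initial adopters (they never revert)
    q     -- assumed threshold: a node adopts once at least a fraction q
             of its neighbors have adopted
    Returns the final set of adopters.
    """
    adopted = set(seeds)
    changed = True
    while changed:
        changed = False
        for node, nbrs in adj.items():
            if node in adopted or not nbrs:
                continue
            active = sum(1 for v in nbrs if v in adopted)
            if active >= q * len(nbrs):
                adopted.add(node)
                changed = True
    return adopted


if __name__ == "__main__":
    # Path 0-1-2-3: low degrees let a single seed take over at q = 0.5
    path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
    print(sorted(cascade(path, {0}, q=0.5)))

    # Star with hub 0 and four leaves: one adopting leaf cannot flip the hub
    star = {0: [1, 2, 3, 4], 1: [0], 2: [0], 3: [0], 4: [0]}
    print(sorted(cascade(star, {1}, q=0.3)))
```

The star example shows in miniature why high-degree nodes are stable: a single adopting neighbor represents only a small fraction of the hub's neighborhood, so the hub blocks the cascade.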

preprint2011arXiv

The rank of diluted random graphs

We investigate the rank of the adjacency matrix of large diluted random graphs: for a sequence of graphs $(G_n)_{n\geq0}$ converging locally to a Galton--Watson tree $T$ (GWT), we provide an explicit formula for the asymptotic multiplicity of the eigenvalue 0 in terms of the degree generating function $ϕ_*$ of $T$. In the first part, we show that the adjacency operator associated with $T$ is always self-adjoint; we analyze the associated spectral measure at the root and characterize the distribution of its atomic mass at 0. In the second part, we establish a sufficient condition on $ϕ_*$ for the expectation of this atomic mass to be precisely the normalized limit of the dimension of the kernel of the adjacency matrices of $(G_n)_{n\geq 0}$. Our proofs borrow ideas from analysis of algorithms, functional analysis, random matrix theory and statistical physics.

preprint2010arXiv

Flooding in Weighted Random Graphs

In this paper, we study the impact of edge weights on distances in diluted random graphs. We interpret these weights as delays, and take them as i.i.d. exponential random variables. We analyze the weighted flooding time defined as the minimum time needed to reach all nodes from one uniformly chosen node, and the weighted diameter corresponding to the largest distance between any pair of vertices. Under some regularity conditions on the degree sequence of the random graph, we show that these quantities grow as the logarithm of $n$, when the size of the graph $n$ tends to infinity. We also derive the exact value for the prefactors. These allow us to analyze an asynchronous randomized broadcast algorithm for random regular graphs. Our results show that the asynchronous version of the algorithm performs better than its synchronized version: in the large size limit of the graph, it will reach the whole network faster even if the local dynamics are similar on average.
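The two quantities defined above can be computed on a small example with a short sketch. It assumes a connected undirected graph given as an edge list, draws i.i.d. exponential weights with a hypothetical `rate` parameter, and uses Dijkstra's algorithm for the weighted distances. All function names are illustrative and not taken from the paper.

```python
import heapq
import random


def exponential_weights(edges, rate=1.0, seed=0):
    """Build a weighted adjacency dict with i.i.d. Exp(rate) edge delays."""
    rng = random.Random(seed)
    adj = {}
    for u, v in edges:
        w = rng.expovariate(rate)
        adj.setdefault(u, []).append((v, w))
        adj.setdefault(v, []).append((u, w))
    return adj


def distances_from(adj, source):
    """Dijkstra: weighted distance from source to every reachable node."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in adj[u]:
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def flooding_time(adj, source):
    """Time needed to reach all nodes from source (graph assumed connected)."""
    return max(distances_from(adj, source).values())


def weighted_diameter(adj):
    """Largest weighted distance between any pair of vertices."""
    return max(flooding_time(adj, s) for s in adj)


if __name__ == "__main__":
    n = 8
    ring = [(i, (i + 1) % n) for i in range(n)]  # toy graph, not a random graph
    adj = exponential_weights(ring, rate=1.0, seed=1)
    print(flooding_time(adj, 0), weighted_diameter(adj))
```

By construction the flooding time from any source is at most the weighted diameter. The logarithmic growth and the prefactors stated in the abstract concern large random graphs and are not something this toy example demonstrates.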

32 published item(s)

Convergence beyond the over-parameterized regime using Rayleigh quotients

Impact of Community Structure on Cascades

SiMCa: Sinkhorn Matrix Factorization with Capacity Constraints

Statistical and computational phase transitions in spiked tensor estimation

Clustering from Sparse Pairwise Measurements

Improving PageRank for Local Community Detection

Clustering and Inference From Pairwise Comparisons

Combinatorial Bandits Revisited

Counting matchings in irregular bipartite graphs and random lifts

Non-backtracking spectrum of random graphs: community detection and non-regular Ramanujan graphs

On rigidity, orientability and cores of random graphs with sliders

Reconstruction in the Labeled Stochastic Block Model

Spectral Detection in the Censored Block Model

Streaming, Memory Limited Matrix Completion with Noise

The diameter of weighted random graphs

Universality in polytope phase transitions and message passing algorithms

Adaptive Replication in Distributed Content Delivery Networks

Contagions in Random Networks with Overlapping Communities

Edge Label Inference in Generalized Stochastic Block Models: from Spectral Theory to Impossibility Results

Loopy annealing belief propagation for vertex cover and matching: convergence, LP relaxation, correctness and Bethe approximation

Streaming, Memory Limited Algorithms for Community Detection

Sublinear-Time Algorithms for Monomer-Dimer Systems on Bounded Degree Graphs

A new approach to the orientation of random hypergraphs

Community Detection in the Labelled Stochastic Block Model

Convergence of multivariate belief propagation, with applications to cuckoo hashing and load balancing

Coordination in Network Security Games: a Monotone Comparative Statics Approach

How Clustering Affects Epidemics in Random Networks

Leveraging Side Observations in Stochastic Bandits

Matchings on infinite graphs

Diffusion and Cascading Behavior in Random Networks

The rank of diluted random graphs

Flooding in Weighted Random Graphs