Emilio Leonardi

Topics: Networking and Internet Architecture; Performance; Social and Information Networks; Distributed, Parallel, and Cluster Computing; Information Retrieval; Information Theory; Machine Learning; math.IT; math.PR; physics.soc-ph; Artificial Intelligence; eess.SY; Human-Computer Interaction; math.OC; Multiagent Systems; Systems and Control

Catalog footprint: 20 works, 16 topics, 4 close collaborators

preprint · 2023 · arXiv

Federated Learning under Heterogeneous and Correlated Client Availability

The enormous amount of data produced by mobile and IoT devices has motivated the development of federated learning (FL), a framework allowing such devices (or clients) to collaboratively train machine learning models without sharing their local data. FL algorithms (like FedAvg) iteratively aggregate model updates computed by clients on their own datasets. Clients may exhibit different levels of participation, often correlated over time and with other clients. This paper presents the first convergence analysis for a FedAvg-like FL algorithm under heterogeneous and correlated client availability. Our analysis highlights how correlation adversely affects the algorithm's convergence rate and how the aggregation strategy can alleviate this effect at the cost of steering training toward a biased model. Guided by the theoretical analysis, we propose CA-Fed, a new FL algorithm that tries to balance the conflicting goals of maximizing convergence speed and minimizing model bias. To this purpose, CA-Fed dynamically adapts the weight given to each client and may ignore clients with low availability and large correlation. Our experimental results show that CA-Fed achieves higher time-average accuracy and a lower standard deviation than state-of-the-art AdaFed and F3AST, both on synthetic and real datasets.
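The aggregation step described above can be sketched as a weighted average over the clients that are available in a round. This is a minimal illustration of a generic FedAvg-like update, not the CA-Fed algorithm itself: the function name `aggregate`, the dictionary layout, and the choice to normalise weights over the available clients are all assumptions made for the example. A client with weight zero is simply ignored.

```python
def aggregate(global_model, client_updates, weights):
    """Apply a weighted average of client updates to the global model.

    global_model:   list of floats (current parameters)
    client_updates: dict mapping client id -> list of floats (model deltas)
                    for the clients available in this round (assumed layout)
    weights:        dict mapping client id -> non-negative aggregation weight
    """
    # Normalise over the clients that actually reported an update.
    total = sum(weights[c] for c in client_updates)
    if total <= 0:
        # No usable client this round: keep the model unchanged.
        return list(global_model)

    new_model = list(global_model)
    for client, delta in client_updates.items():
        share = weights[client] / total
        for j, d in enumerate(delta):
            new_model[j] += share * d
    return new_model
```

Setting a client's weight to zero removes its influence on the round, which is one simple way to mimic ignoring clients with low availability.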

preprint · 2022 · arXiv

Bootstrap percolation on the stochastic block model

We analyze the bootstrap percolation process on the stochastic block model (SBM), a natural extension of the Erdős–Rényi random graph that incorporates the community structure observed in many real systems. In the SBM, nodes are partitioned into two subsets, which represent different communities, and pairs of nodes are independently connected with a probability that depends on the communities they belong to. Under mild assumptions on the system parameters, we prove the existence of a sharp phase transition for the final number of active nodes and characterize the sub-critical and the super-critical regimes in terms of the number of initially active nodes, which are selected uniformly at random in each community.

preprint · 2021 · arXiv

Asynchronous semi-anonymous dynamics over large-scale networks

We analyze a class of stochastic processes, referred to as asynchronous and semi-anonymous dynamics (ASD), over directed labeled random networks. These processes are a natural tool to describe general best-response and noisy best-response dynamics in network games where each agent, at random times governed by independent Poisson clocks, can choose among a finite set of actions. The payoff is determined by the relative popularity of different actions among neighbors, while being independent of the specific identities of neighbors. Using a mean-field approach, we prove that, under certain conditions on the network and initial node configuration, the evolution of ASD can be approximated, in the limit of large network sizes, by the solution of a system of non-linear ordinary differential equations. Our framework is very general and applies to a large class of graph ensembles for which the typical random graph locally behaves like a tree. In particular, we will focus on labeled configuration-model random graphs, a generalization of the traditional configuration model which allows different classes of nodes to be mixed together in the network, permitting us, for example, to incorporate a community structure in the system. Our analysis also applies to configuration-model graphs having a power-law degree distribution, an essential feature of many real systems. To demonstrate the power and flexibility of our framework, we consider several examples of dynamics belonging to our class of stochastic processes. Moreover, we illustrate by simulation the applicability of our analysis to realistic scenarios by running our example dynamics over a real social network graph.

preprint · 2021 · arXiv

Content Placement in Networks of Similarity Caches

Similarity caching systems have recently attracted the attention of the scientific community, as they can be profitably used in many application contexts, like multimedia retrieval, advertising, object recognition, recommender systems and online content-match applications. In such systems, a user request for an object $o$, which is not in the cache, can be (partially) satisfied by a similar stored object $o'$, at the cost of a loss of user utility. In this paper we take a first step into the novel area of similarity caching networks, where requests can be forwarded along a path of caches to get the best efficiency-accuracy tradeoff. The offline problem of content placement can be easily shown to be NP-hard, while different polynomial algorithms can be devised to approach the optimal solution in discrete cases. As the content space grows large, we propose a continuous problem formulation whose solution exhibits a simple structure in a class of tree topologies. We verify our findings using synthetic and realistic request traces.
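To make the efficiency-accuracy tradeoff concrete, the sketch below serves a request along a path of caches by picking the stored object that minimises retrieval cost plus approximation cost. Everything specific here is an assumption for illustration: objects are modelled as real numbers, the dissimilarity cost is the absolute difference, each cache has a fixed retrieval cost, and the name `serve_request` is hypothetical. It does not address the placement problem studied in the paper.

```python
def serve_request(request, path):
    """Pick the cheapest way to serve `request` along a path of caches.

    request: float, the requested object (objects are points on a line here)
    path:    list of (retrieval_cost, stored_objects) pairs, ordered from the
             cache nearest the user to the farthest one (assumed layout)

    Returns (total_cost, cache_index, served_object), or None if every cache
    on the path is empty.
    """
    best = None
    for index, (retrieval_cost, stored_objects) in enumerate(path):
        for obj in stored_objects:
            # Approximation cost: how far the served object is from the request.
            cost = retrieval_cost + abs(request - obj)
            if best is None or cost < best[0]:
                best = (cost, index, obj)
    return best
```

With a nearby cache holding only rough matches and a farther cache holding the exact object, the function forwards the request only when the extra retrieval cost is smaller than the utility lost to approximation.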

preprint · 2020 · arXiv

A large deviation approach to super-critical bootstrap percolation on the random graph $G_{n,p}$

We consider the Erdős–Rényi random graph $G_{n,p}$ and we analyze the simple irreversible epidemic process on the graph, known in the literature as bootstrap percolation. We give a quantitative version of some results by Janson et al. (2012), providing a fine asymptotic analysis of the final size $A_n^*$ of active nodes, under a suitable super-critical regime. More specifically, we establish large deviation principles for the sequence of random variables $\{\frac{n- A_n^*}{f(n)}\}_{n\geq 1}$ with explicit rate functions and allowing the scaling function $f$ to vary in the widest possible range.

preprint · 2020 · arXiv

Ranking a set of objects: a graph based least-square approach

We consider the problem of ranking $N$ objects starting from a set of noisy pairwise comparisons provided by a crowd of equal workers. We assume that objects are endowed with intrinsic qualities and that the probability with which an object is preferred to another depends only on the difference between the qualities of the two competitors. We propose a class of non-adaptive ranking algorithms that rely on a least-squares optimization criterion for the estimation of qualities. Such algorithms are shown to be asymptotically optimal (i.e., they require $O(\frac{N}{ε^2}\log \frac{N}{δ})$ comparisons to be $(ε, δ)$-PAC). Numerical results show that our schemes are also very efficient in many non-asymptotic scenarios, exhibiting a performance similar to the maximum-likelihood algorithm. Moreover, we show how they can be extended to adaptive schemes and test them on real-world datasets.

preprint · 2019 · arXiv

Impact of Traffic Characteristics on Request Aggregation in an NDN Router

The paper revisits the performance evaluation of caching in a Named Data Networking (NDN) router where the content store (CS) is supplemented by a pending interest table (PIT). The PIT aggregates requests for a given content that arrive within the download delay and thus brings an additional reduction in upstream bandwidth usage beyond that due to CS hits. We extend prior work on caching with non-zero download delay (non-ZDD) by proposing a novel mathematical framework that is more easily applicable to general traffic models and by considering alternative cache insertion policies. Specifically, we evaluate the use of an LRU filter to improve CS hit rate performance in this non-ZDD context. We also consider the impact of time locality in demand due to finite content lifetimes. The models are used to quantify the impact of the PIT on upstream bandwidth reduction, demonstrating notably that this is significant only for relatively small content catalogues or high average request rate per content. We further explore how the effectiveness of the filter with finite content lifetimes depends on catalogue size and traffic intensity.
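The aggregation effect of the PIT can be illustrated with a few lines of code that count how many requests for a single content are actually forwarded upstream. This is a simplified sketch, not the paper's model: it assumes a constant download delay, ignores the content store entirely (so a request arriving after the download completes is forwarded again), and the function name `upstream_requests` is hypothetical.

```python
def upstream_requests(arrival_times, download_delay):
    """Count requests forwarded upstream for one content under PIT aggregation.

    arrival_times:  iterable of request arrival times for the same content
    download_delay: time between forwarding a request and receiving the content

    Assumption: no content store, so only requests that arrive while a
    download is pending are aggregated.
    """
    forwarded = 0
    pending_until = float("-inf")  # no download in progress initially
    for t in sorted(arrival_times):
        if t >= pending_until:
            # No pending interest: forward upstream and open a PIT entry.
            forwarded += 1
            pending_until = t + download_delay
        # Otherwise the request is aggregated onto the pending interest.
    return forwarded
```

For sparse arrivals almost every request is forwarded, while bursts within one download delay collapse into a single upstream request, which is consistent with the observation that aggregation matters mainly at high request rates per content.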

preprint · 2016 · arXiv

A unified approach to the performance analysis of caching systems

We propose a unified methodology to analyse the performance of caches (both isolated and interconnected), by extending and generalizing a decoupling technique originally known as Che's approximation, which provides very accurate results at low computational cost. We consider several caching policies, taking into account the effects of temporal locality. In the case of interconnected caches, our approach allows us to do better than the Poisson approximation commonly adopted in prior work. Our results, validated against simulations and trace-driven experiments, provide interesting insights into the performance of caching systems.
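For reference, the classical form of Che's approximation for a single LRU cache under independent Poisson requests can be sketched as follows: the characteristic time $T_C$ solves $\sum_i (1 - e^{-\lambda_i T_C}) = C$, and the hit probability of content $i$ is $1 - e^{-\lambda_i T_C}$. The code below covers only this basic case, not the generalisation proposed in the paper; the function names and the bisection solver are choices made for this example.

```python
import math


def che_characteristic_time(rates, cache_size, tol=1e-10):
    """Solve sum_i (1 - exp(-rate_i * T)) = cache_size for T by bisection."""
    if any(lam <= 0 for lam in rates):
        raise ValueError("all request rates must be positive")
    if not 0 < cache_size < len(rates):
        raise ValueError("cache size must be between 0 and the catalogue size")

    def expected_occupancy(t):
        return sum(1.0 - math.exp(-lam * t) for lam in rates)

    # Grow the upper bracket until the cache would be full by then.
    lo, hi = 0.0, 1.0
    while expected_occupancy(hi) < cache_size:
        hi *= 2.0

    # expected_occupancy is increasing in t, so bisection applies.
    while hi - lo > tol * hi:
        mid = 0.5 * (lo + hi)
        if expected_occupancy(mid) < cache_size:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def che_hit_probabilities(rates, cache_size):
    """Per-content LRU hit probabilities under Che's approximation."""
    t_c = che_characteristic_time(rates, cache_size)
    return [1.0 - math.exp(-lam * t_c) for lam in rates]
```

By construction the per-content hit probabilities sum to the cache size, and more popular contents receive higher hit probabilities.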

preprint · 2016 · arXiv

Generalized threshold-based epidemics in random graphs: the power of extreme values

Bootstrap percolation is a well-known activation process in a graph, in which a node becomes active when it has at least $r$ active neighbors. Such a process, originally studied on regular structures, has recently also been investigated in the context of random graphs, where it can serve as a simple model for a wide variety of cascades, such as the spreading of ideas, trends, viral contents, etc. over large social networks. In particular, it has been shown that in $G(n,p)$ the final active set can exhibit a phase transition for a sub-linear number of seeds. In this paper, we propose a unique framework to study similar sub-linear phase transitions for a much broader class of graph models and epidemic processes. Specifically, we consider i) a generalized version of bootstrap percolation in $G(n,p)$ with random activation thresholds and random node-to-node influences; ii) different random graph models, including graphs with given degree sequence and graphs with community structure (block model). The common thread of our work is to show the surprising sensitivity of the critical seed set size to extreme values of distributions, which makes some systems dramatically vulnerable to large-scale outbreaks. We validate our results by running simulations on both synthetic and real graphs.
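The basic activation rule (a node becomes active once it has at least $r$ active neighbors) is easy to simulate. The sketch below covers only the classical process with a fixed threshold on a $G(n,p)$ sample, not the generalised thresholds and influences studied in the paper; the function names and the adjacency-list representation are choices made for this example.

```python
import random
from collections import deque


def gnp_graph(n, p, rng):
    """Sample an Erdős-Rényi graph G(n, p) as a list of adjacency lists."""
    adj = [[] for _ in range(n)]
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                adj[u].append(v)
                adj[v].append(u)
    return adj


def bootstrap_percolation(adj, seeds, r):
    """Return the final active set of bootstrap percolation with threshold r.

    Activation is irreversible: each active node is processed once and adds
    one "mark" to each of its still-inactive neighbors.
    """
    active = set(seeds)
    marks = [0] * len(adj)
    queue = deque(active)
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v in active:
                continue
            marks[v] += 1
            if marks[v] >= r:
                active.add(v)
                queue.append(v)
    return active


if __name__ == "__main__":
    rng = random.Random(0)
    graph = gnp_graph(n=500, p=0.02, rng=rng)
    seeds = rng.sample(range(500), 30)
    final = bootstrap_percolation(graph, seeds, r=2)
    print(f"{len(seeds)} seeds, {len(final)} active nodes at the end")
```

Repeating the run while varying the number of seeds is a simple way to observe the sharp change in the final active set that the abstract refers to as a phase transition.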

preprint · 2016 · arXiv

Modeling LRU caches with Shot Noise request processes

In this paper we analyze Least Recently Used (LRU) caches operating under the Shot Noise requests Model (SNM). The SNM was recently proposed to better capture the main characteristics of today's Video on Demand (VoD) traffic. We investigate the validity of Che's approximation through an asymptotic analysis of the cache eviction time. In particular, we provide a large deviation principle, a law of large numbers and a central limit theorem for the cache eviction time, as the cache size grows large. Finally, we derive upper and lower bounds for the "hit" probability in tandem networks of caches under Che's approximation.

preprint · 2015 · arXiv

Unravelling the Impact of Temporal and Geographical Locality in Content Caching Systems

To assess the performance of caching systems, the definition of a proper process describing the content requests generated by users is required. Starting from the analysis of traces of YouTube video requests collected inside operational networks, we identify the characteristics of real traffic that need to be represented and those that instead can be safely neglected. Based on our observations, we introduce a simple, parsimonious traffic model, named Shot Noise Model (SNM), that allows us to capture temporal and geographical locality of content popularity. The SNM is sufficiently simple to be effectively employed in both analytical and scalable simulative studies of caching systems. We demonstrate this by analytically characterizing the performance of the LRU caching policy under the SNM, for both a single cache and a network of caches. With respect to the standard Independent Reference Model (IRM), some paradigmatic shifts, concerning the impact of various traffic characteristics on cache performance, clearly emerge from our results.

preprint · 2014 · arXiv

De-anonymizing scale-free social networks by percolation graph matching

We address the problem of social network de-anonymization when relationships between people are described by scale-free graphs. In particular, we propose a rigorous, asymptotic mathematical analysis of the network de-anonymization problem while capturing the impact of power-law node degree distribution, which is a fundamental and quite ubiquitous feature of many complex systems such as social networks. By applying bootstrap percolation and a novel graph slicing technique, we prove that large inhomogeneities in the node degree lead to a dramatic reduction of the initial set of nodes that must be known a priori (the seeds) in order to successfully identify all other users. We characterize the size of this set when seeds are selected using different criteria, and we show that their number can be as small as $n^ε$, for any small $ε>0$. Our results are validated through simulation experiments on a real social network graph.

preprint · 2014 · arXiv

Efficient analysis of caching strategies under dynamic content popularity

In this paper we develop a novel technique to analyze both isolated and interconnected caches operating under different caching strategies and realistic traffic conditions. The main strength of our approach is the ability to consider dynamic contents which are constantly added into the system catalogue, and whose popularity evolves over time according to desired profiles. We do so while preserving the simplicity and computational efficiency of models developed under stationary popularity conditions, which are needed to analyze several caching strategies. Our main achievement is to show that the impact of content popularity dynamics on cache performance can be effectively captured into an analytical model based on a fixed content catalogue (i.e., a catalogue whose size and objects' popularity do not change over time).

preprint · 2014 · arXiv

Large deviations of the interference in the Ginibre network model

Under different assumptions on the distribution of the fading random variables, we derive large deviation estimates for the tail of the interference in a wireless network model whose nodes are placed, over a bounded region of the plane, according to the $β$-Ginibre process, $0<β\leq 1$. The family of $β$-Ginibre processes is formed by determinantal point processes, with different degree of repulsiveness, which converge in law to a homogeneous Poisson process, as $β\to 0$. In this sense the Poisson network model may be considered as the limiting uncorrelated case of the $β$-Ginibre network model. Our results indicate the existence of two different regimes. When the fading random variables are bounded or Weibull superexponential, large values of the interference are typically originated by the sum of several equivalent interfering contributions due to nodes in the vicinity of the receiver. In this case, the tail of the interference has, on the log-scale, the same asymptotic behavior for any value of $0<β\le 1$, but it differs (again on a log-scale) from the asymptotic behavior of the tail of the interference in the Poisson network model. When the fading random variables are exponential or subexponential, instead, large values of the interference are typically originated by a single dominating interferer node and, on the log-scale, the asymptotic behavior of the tail of the interference is essentially insensitive to the distribution of the nodes. As a consequence, on the log-scale, the asymptotic behavior of the tail of the interference in any $β$-Ginibre network model, $0<β\le 1$, is the same as in the Poisson network model.

preprint · 2014 · arXiv

The Importance of Being Earnest in Crowdsourcing Systems

This paper presents the first systematic investigation of the potential performance gains for crowdsourcing systems, deriving from available information at the requester about individual worker earnestness (reputation). In particular, we first formalize the optimal task assignment problem when workers' reputation estimates are available, as the maximization of a monotone (submodular) function subject to matroid constraints. Then, since the optimal problem is NP-hard, we propose a simple but efficient greedy heuristic task allocation algorithm. We also propose a simple "maximum a-posteriori" decision rule. Finally, we test and compare different solutions, showing that system performance can greatly benefit from information about workers' reputation. Our main findings are that: i) even largely inaccurate estimates of workers' reputation can be effectively exploited in the task assignment to greatly improve system performance; ii) the performance of the maximum a-posteriori decision rule quickly degrades as worker reputation estimates become inaccurate; iii) when workers' reputation estimates are significantly inaccurate, the best performance can be obtained by combining our proposed task assignment algorithm with the LRA decision rule introduced in the literature.

preprint · 2014 · arXiv

Throughput Optimal Scheduling Policies in Networks of Interacting Queues

This report considers a fairly general model of constrained queuing networks that allows us to represent both MMBP (Markov Modulated Bernoulli Processes) arrivals and time-varying service constraints. We derive a set of sufficient conditions for throughput optimality of scheduling policies that encompass and generalize all the previously obtained results in the field. This leads to the definition of new classes of (non diagonal) throughput optimal scheduling policies. We prove the stability of queues by extending the traditional Lyapunov drift criteria methodology.

preprint · 2013 · arXiv

Analyzing the Performance of LRU Caches under Non-Stationary Traffic Patterns

This work presents, to the best of our knowledge of the literature, the first analytic model to address the performance of an LRU (Least Recently Used) implementing cache under non-stationary traffic conditions, i.e., when the popularity of content evolves with time. We validate the accuracy of the model using Monte Carlo simulations. We show that the model is capable of accurately estimating the cache hit probability, when the popularity of content is non-stationary. We find that there exists a dependency between the performance of an LRU implementing cache and i) the lifetime of content in a system, ii) the volume of requests associated with it, iii) the distribution of content request volumes and iv) the shape of the popularity profile over time.

preprint · 2013 · arXiv

Modeling the interdependency of low-priority congestion control and active queue management

Recently, a negative interplay has been shown to arise when scheduling/AQM techniques and low-priority congestion control protocols are used together: namely, AQM resets the relative level of priority among congestion control protocols. This work explores this issue by (i) studying a fluid model that describes system dynamics of heterogeneous congestion control protocols competing on a bottleneck link governed by AQM and (ii) proposing a system level solution able to reinstate priorities among protocols.

preprint · 2013 · arXiv

Temporal Locality in Today's Content Caching: Why it Matters and How to Model it

The dimensioning of caching systems represents a difficult task in the design of infrastructures for content distribution in the current Internet. This paper addresses the problem of defining a realistic arrival process for the content requests generated by users, due to its critical importance for both analytical and simulative evaluations of the performance of caching systems. First, with the aid of YouTube traces collected inside operational residential networks, we identify the characteristics of real traffic that need to be considered or can be safely neglected in order to accurately predict the performance of a cache. Second, we propose a new parsimonious traffic model, named the Shot Noise Model (SNM), that enables users to natively capture the dynamics of content popularity, whilst still being sufficiently simple to be employed effectively for both analytical and scalable simulative studies of caching systems. Finally, our results show that the SNM presents a much better solution to account for the temporal locality observed in real traffic compared to existing approaches.

preprint · 2010 · arXiv

Information-theoretic Capacity of Clustered Random Networks

We analyze the capacity scaling laws of clustered ad hoc networks in which nodes are distributed according to a doubly stochastic shot-noise Cox process. We identify five different operational regimes, and for each regime we devise a communication strategy that achieves a throughput within a poly-logarithmic factor (in the number of nodes) of the maximum theoretical capacity.
