Researcher profile

Stilian Stoev

Stilian Stoev contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging
12works
0followers
11topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

12 published item(s)

preprint2020arXiv

Distributionally Robust Inference for Extreme Value-at-Risk

Under general multivariate regular variation conditions, the extreme Value-at-Risk of a portfolio can be expressed as an integral of a known kernel with respect to a generally unknown spectral measure supported on the unit simplex. The estimation of the spectral measure is challenging in practice and virtually impossible in high dimensions. This motivates the problem studied in this work, which is to find universal lower and upper bounds of the extreme Value-at-Risk under practically estimable constraints. That is, we study the infimum and supremum of the extreme Value-at-Risk functional, over the infinite dimensional space of all possible spectral measures that meet a finite set of constraints. We focus on extremal coefficient constraints, which are popular and easy to interpret in practice. Our contributions are twofold. Firstly, we show that optimization problems over an infinite dimensional space of spectral measures are in fact dual problems to linear semi-infinite programs (LSIPs) -- linear optimization problems in an Euclidean space with an uncountable set of linear constraints. This allows us to prove that the optimal solutions are in fact attained by discrete spectral measures supported on finitely many atoms. Second, in the case of balanced portfolia, we establish further structural results for the lower bounds as well as closed form solutions for both the lower- and upper-bounds of extreme Value-at-Risk in the special case of a single extremal coefficient constraint. The solutions unveil important connections to the Tawn-Molchanov max-stable models. The results are illustrated with a real data example.

preprint2020arXiv

On the rate of concentration of maxima in Gaussian arrays

Recently in Gao and Stoev (2018) it was established that the concentration of maxima phenomenon is the key to solving the exact sparse support recovery problem in high dimensions. This phenomenon, known also as relative stability, has been little studied in the context of dependence. Here, we obtain bounds on the rate of concentration of maxima in Gaussian triangular arrays. These results are used to establish sufficient conditions for the uniform relative stability of functions of Gaussian arrays, leading to new models that exhibit phase transitions in the exact support recovery problem. Finally, the optimal rate of concentration for Gaussian arrays is studied under more general assumptions than the ones implied by the classic condition of Berman (1964).

preprint2016arXiv

AMON: An Open Source Architecture for Online Monitoring, Statistical Analysis and Forensics of Multi-gigabit Streams

The Internet, as a global system of interconnected networks, carries an extensive array of information resources and services. Key requirements include good quality-of-service and protection of the infrastructure from nefarious activity (e.g. distributed denial of service--DDoS--attacks). Network monitoring is essential to network engineering, capacity planning and prevention / mitigation of threats. We develop an open source architecture, AMON (All-packet MONitor), for online monitoring and analysis of multi-gigabit network streams. It leverages the high-performance packet monitor PF RING and is readily deployable on commodity hardware. AMON examines all packets, partitions traffic into sub-streams by using rapid hashing and computes certain real-time data products. The resulting data structures provide views of the intensity and connectivity structure of network traffic at the time-scale of routing. The proposed integrated framework includes modules for the identification of heavy-hitters as well as for visualization and statistical detection at the time-of-onset of high impact events such as DDoS. This allows operators to quickly visualize and diagnose attacks, and limit offline and time consuming post-mortem analysis. We demonstrate our system in the context of real-world attack incidents, and validate it against state-of-the-art alternatives. AMON has been deployed and is currently processing 10Gbps+ live Internet traffic at Merit Network. It is extensible and allows the addition of further statistical and filtering modules for real-time forensics.

preprint2016arXiv

Inference for Monotone Trends Under Dependence

We focus on the problem estimating a monotone trend function under additive and dependent noise. New point-wise confidence interval estimators under both short- and long-range dependent errors are introduced and studied. These intervals are obtained via the method of inversion of certain discrepancy statistics arising in hypothesis testing problems. The advantage of this approach is that it avoids the estimation of nuisance parameters such as the derivative of the unknown function, which existing methods are forced to deal with. While the methodology is motivated by earlier work in the independent context, the dependence of the errors, especially longrange dependence leads to new challenges, such as the study of convex minorants of drifted fractional Brownian motion that may be of independent interest. We also unravel a new family of universal limit distributions (and tabulate selected quantiles) that can henceforth be used for inference in monotone function problems involving dependence.

preprint2016arXiv

Stochastic integral representations and classification of sum- and max-infinitely divisible processes

Introduced is the notion of minimality for spectral representations of sum- and max-infinitely divisible processes and it is shown that the minimal spectral representation on a Borel space exists and is unique. This fact is used to show that a stationary, stochastically continuous, sum- or max-i.d. random process on $\mathbb{R}^d$ can be generated by a measure-preserving flow on a $σ$-finite Borel measure space and that this flow is unique. This development makes it possible to extend the classification program of Rosiński (Ann. Probab. 23 (1995) 1163-1187) with a unified treatment of both sum- and max-infinitely divisible processes. As a particular case, a characterization of stationary, stochastically continuous, union-infinitely divisible random measurable subsets of $\mathbb{R}^d$ is obtained. Introduced and classified are several new max-i.d. random field models including fields of Penrose type and fields associated to Poisson line processes.

preprint2015arXiv

Probabilities of concurrent extremes

The statistical modelling of spatial extremes has recently made major advances. Much of its focus so far has been on the modelling of the magnitudes of extreme events but little attention has been paid on the timing of extremes. To address this gap, this paper introduces the notion of extremal concurrence. Suppose that one measures precipitation at several synoptic stations over multiple days. We say that extremes are concurrent if the maximum precipitation over time at each station is achieved simultaneously, e.g., on a single day. Under general conditions, we show that the finite sample concurrence probability converges to an asymptotic quantity, deemed extremal concurrence probability. Using Palm calculus, we establish general expressions for the extremal concurrence probability through the max-stable process emerging in the limit of the componentwise maxima of the sample. Explicit forms of the extremal concurrence probabilities are obtained for various max-stable models and several estimators are introduced. In particular, we prove that the pairwise extremal concurrence probability for max-stable vectors is precisely equal to the Kendall's $τ$. The estimators are evaluated by using simulations and applied to study the concurrence patterns of temperature extremes in the United States. The results demonstrate that concurrence probability can provide a powerful new perspective and tools for the analysis of the spatial structure and impact of extremes.

preprint2014arXiv

Hashing Pursuit for Online Identification of Heavy-Hitters in High-Speed Network Streams

Distributed Denial of Service (DDoS) attacks have become more prominent recently, both in frequency of occurrence, as well as magnitude. Such attacks render key Internet resources unavailable and disrupt its normal operation. It is therefore of paramount importance to quickly identify malicious Internet activity. The DDoS threat model includes characteristics such as: (i) heavy-hitters that transmit large volumes of traffic towards "victims", (ii) persistent-hitters that send traffic, not necessarily large, to specific destinations to be used as attack facilitators, (iii) host and port scanning for compiling lists of un-secure servers to be used as attack amplifiers, etc. This conglomeration of problems motivates the development of space/time efficient summaries of data traffic streams that can be used to identify heavy-hitters associated with the above attack vectors. This paper presents a hashing-based framework and fast algorithms that take into account the large-dimensionality of the incoming network stream and can be employed to quickly identify the culprits. The algorithms and data structures proposed provide a synopsis of the network stream that is not taxing to fast-memory, and can be efficiently implemented in hardware due to simple bit-wise operations. The methods are evaluated using real-world Internet data from a large academic network.

preprint2014arXiv

Implicit Extremes and Implicit Max-Stable Laws

Let $X_1,...,X_n$ be iid random vectors and $f\ge 0$ be a non-negative function. Let also $k(n) = {\rm Argmax}_{i=1,...,n} f(X_i)$. We are interested in the distribution of $X_{k(n)}$ and their limit theorems. In other words, what is the distribution the random vector where a function of its components is extreme? This question is motivated by a kind of inverse problem where one wants to determine the extremal behavior of $X$ when only explicitly observing $f(X)$. We shall refer to such types of results as to implicit extremes. It turns out that, as in the usual case of explicit extremes, all limit implicit extreme value laws are implicit max-stable. We characterize the regularly varying implicit max-stable laws in terms of their spectral and stochastic representations. We also establish the asymptotic behavior of implicit order statistics relative to a given homogeneous loss and conclude with several examples drawing connections to prior work involving regular variation on general cones.

preprint2013arXiv

A State-Space Approach for Optimal Traffic Monitoring via Network Flow Sampling

The robustness and integrity of IP networks require efficient tools for traffic monitoring and analysis, which scale well with traffic volume and network size. We address the problem of optimal large-scale flow monitoring of computer networks under resource constraints. We propose a stochastic optimization framework where traffic measurements are done by exploiting the spatial (across network links) and temporal relationship of traffic flows. Specifically, given the network topology, the state-space characterization of network flows and sampling constraints at each monitoring station, we seek an optimal packet sampling strategy that yields the best traffic volume estimation for all flows of the network. The optimal sampling design is the result of a concave minimization problem; then, Kalman filtering is employed to yield a sequence of traffic estimates for each network flow. We evaluate our algorithm using real-world Internet2 data.

preprint2013arXiv

CRPS M-estimation for max-stable models

Max-stable random fields provide canonical models for the dependence of multivariate extremes. Inference with such models has been challenging due to the lack of tractable likelihoods. In contrast, the finite dimensional cumulative distribution functions (CDFs) are often readily available and natural to work with. Motivated by this fact, in this work we develop an M-estimation framework for max-stable models based on the continuous ranked probability score (CRPS) of multivariate CDFs. We start by establishing conditions for the consistency and asymptotic normality of the CRPS-based estimators in a general context. We then implement them in the max-stable setting and provide readily computable expressions for their asymptotic covariance matrices. The resulting point and asymptotic confidence interval estimates are illustrated over popular simulated models. They enjoy accurate coverages and offer an alternative to composite likelihood based methods.

preprint2013arXiv

Efficient Approximation Algorithms for Optimal Large-scale Network Monitoring

The growing amount of applications that generate vast amount of data in short time scales render the problem of partial monitoring, coupled with prediction, a rather fundamental one. We study the aforementioned canonical problem under the context of large-scale monitoring of communication networks. We consider the problem of selecting the "best" subset of links so as to optimally predict the quantity of interest at the remaining ones. This is a well know NP-hard problem, and algorithms seeking the exact solution are prohibitively expensive. We present a number of approximation algorithms that: 1) their computational complexity gains a significant improvement over existing greedy algorithms; 2) exploit the geometry of principal component analysis, which also helps us establish theoretical bounds on the prediction error; 3) are amenable for randomized implementation and execution in parallel or distributed fashion, a process that often yields the exact solution. The new algorithms are demonstrated and evaluated using real-world network data.

preprint2013arXiv

Fast Approximation Algorithms for Near-optimal Large-scale Network Monitoring

We study the problem of optimal traffic prediction and monitoring in large-scale networks. Our goal is to determine which subset of K links to monitor in order to "best" predict the traffic on the remaining links in the network. We consider several optimality criteria. This can be formulated as a combinatorial optimization problem, belonging to the family of subset selection problems. Similar NP-hard problems arise in statistics, machine learning and signal processing. Some include subset selection for regression, variable selection, and sparse approximation. Exact solutions are computationally prohibitive. We present both new heuristics as well as new efficient algorithms implementing the classical greedy heuristic - commonly used to tackle such combinatorial problems. Our approach exploits connections to principal component analysis (PCA), and yields new types of performance lower bounds which do not require submodularity of the objective functions. We show that an ensemble method applied to our new randomized heuristic algorithm, often outperforms the classical greedy heuristic in practice. We evaluate our algorithms under several large-scale networks, including real life networks.