Source author record

Daniele Durante

Daniele Durante appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Methodology Computation Applications Machine Learning math.ST stat.OT Statistics Theory

Catalog footprint

What is connected

11works

7topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2026arXiv

Skew-symmetric approximations of posterior distributions

Popular deterministic approximations of posterior distributions from, e.g. the Laplace method, variational Bayes and expectation-propagation, generally rely on symmetric approximating families, often taken to be Gaussian. This choice facilitates optimization and inference, but typically affects the quality of the overall approximation. In fact, even in basic parametric models, the posterior distribution often displays asymmetries that yield bias and a reduced accuracy when considering symmetric approximations. Recent research has moved towards more flexible approximating families which incorporate skewness. However, current solutions are often model specific, lack a general supporting theory, increase the computational complexity of the optimization problem, and do not provide a broadly applicable solution to incorporate skewness in any symmetric approximation. This article addresses such a gap by introducing a general and provably optimal strategy to perturb any off-the-shelf symmetric approximation of a generic posterior distribution. This novel perturbation scheme is derived without additional optimization steps, and yields a similarly tractable approximation within the class of skew-symmetric densities that provably enhances the finite sample accuracy of the original symmetric counterpart. Furthermore, under suitable assumptions, it improves the convergence rate to the exact posterior by at least a $\sqrt{n}$ factor, in asymptotic regimes. These advancements are illustrated in numerical studies focusing on skewed perturbations of state-of-the-art Gaussian approximations.

preprint2022arXiv

A Class of Conjugate Priors for Multinomial Probit Models which Includes the Multivariate Normal One

Multinomial probit models are routinely-implemented representations for learning how the class probabilities of categorical response data change with p observed predictors. Although several frequentist methods have been developed for estimation, inference and classification within such a class of models, Bayesian inference is still lagging behind. This is due to the apparent absence of a tractable class of conjugate priors, that may facilitate posterior inference on the multinomial probit coefficients. Such an issue has motivated increasing efforts toward the development of effective Markov chain Monte Carlo methods, but state-of-the-art solutions still face severe computational bottlenecks, especially in high dimensions. In this article, we show that the entire class of unified skew-normal (SUN) distributions is conjugate to several multinomial probit models. Leveraging this result and the SUN properties, we improve upon state-of-the-art solutions for posterior inference and classification both in terms of closed-form results for several functionals of interest, and also by developing novel computational methods relying either on independent and identically distributed samples from the exact posterior or on scalable and accurate variational approximations based on blocked partially-factorized representations. As illustrated in simulations and in a gastrointestinal lesions application, the magnitude of the improvements relative to current methods is particularly evident, in practice, when the focus is on high-dimensional studies.

preprint2022arXiv

Extended Stochastic Block Models with Application to Criminal Networks

Reliably learning group structures among nodes in network data is challenging in several applications. We are particularly motivated by studying covert networks that encode relationships among criminals. These data are subject to measurement errors, and exhibit a complex combination of an unknown number of core-periphery, assortative and disassortative structures that may unveil key architectures of the criminal organization. The coexistence of these noisy block patterns limits the reliability of routinely-used community detection algorithms, and requires extensions of model-based solutions to realistically characterize the node partition process, incorporate information from node attributes, and provide improved strategies for estimation and uncertainty quantification. To cover these gaps, we develop a new class of extended stochastic block models (ESBM) that infer groups of nodes having common connectivity patterns via Gibbs-type priors on the partition process. This choice encompasses many realistic priors for criminal networks, covering solutions with fixed, random and infinite number of possible groups, and facilitates the inclusion of node attributes in a principled manner. Among the new alternatives in our class, we focus on the Gnedin process as a realistic prior that allows the number of groups to be finite, random and subject to a reinforcement process coherent with criminal networks. A collapsed Gibbs sampler is proposed for the whole ESBM class, and refined strategies for estimation, prediction, uncertainty quantification and model selection are outlined. The ESBM performance is illustrated in realistic simulations and in an application to an Italian mafia network, where we unveil key complex block structures, mostly hidden from state-of-the-art alternatives.

preprint2022arXiv

Scalable and Accurate Variational Bayes for High-Dimensional Binary Regression Models

Modern methods for Bayesian regression beyond the Gaussian response setting are often computationally impractical or inaccurate in high dimensions. In fact, as discussed in recent literature, bypassing such a trade-off is still an open problem even in routine binary regression models, and there is limited theory on the quality of variational approximations in high-dimensional settings. To address this gap, we study the approximation accuracy of routinely-used mean-field variational Bayes solutions in high-dimensional probit regression with Gaussian priors, obtaining novel and practically relevant results on the pathological behavior of such strategies in uncertainty quantification, point estimation and prediction. Motivated by these results, we further develop a new partially-factorized variational approximation for the posterior of the probit coefficients which leverages a representation with global and local variables but, unlike for classical mean-field assumptions, it avoids a fully factorized approximation, and instead assumes a factorization only for the local variables. We prove that the resulting approximation belongs to a tractable class of unified skew-normal densities that crucially incorporates skewness and, unlike for state-of-the-art mean-field solutions, converges to the exact posterior density as p goes to infinity. To solve the variational optimization problem, we derive a tractable CAVI algorithm that easily scales to p in the tens of thousands, and provably requires a number of iterations converging to 1 as p goes to infinity. Such findings are also illustrated in extensive empirical studies where our novel solution is shown to improve the approximation accuracy of mean-field variational Bayes for any n and p, with the magnitude of these gains being remarkable in those high-dimensional p>n settings where state-of-the-art methods are computationally impractical.

preprint2022arXiv

Scalable computation of predictive probabilities in probit models with Gaussian process priors

Predictive models for binary data are fundamental in various fields, and the growing complexity of modern applications has motivated several flexible specifications for modeling the relationship between the observed predictors and the binary responses. A widely-implemented solution is to express the probability parameter via a probit mapping of a Gaussian process indexed by predictors. However, unlike for continuous settings, there is a lack of closed-form results for predictive distributions in binary models with Gaussian process priors. Markov chain Monte Carlo methods and approximation strategies provide common solutions to this problem, but state-of-the-art algorithms are either computationally intractable or inaccurate in moderate-to-high dimensions. In this article, we aim to cover this gap by deriving closed-form expressions for the predictive probabilities in probit Gaussian processes that rely either on cumulative distribution functions of multivariate Gaussians or on functionals of multivariate truncated normals. To evaluate these quantities we develop novel scalable solutions based on tile-low-rank Monte Carlo methods for computing multivariate Gaussian probabilities, and on mean-field variational approximations of multivariate truncated normals. Closed-form expressions for the marginal likelihood and for the posterior distribution of the Gaussian process are also discussed. As shown in simulated and real-world empirical studies, the proposed methods scale to dimensions where state-of-the-art solutions are impractical.

preprint2020arXiv

Bayesian cumulative shrinkage for infinite factorizations

There is a wide variety of models in which the dimension of the parameter space is unknown. For example, in factor analysis the number of latent factors is typically not known and has to be inferred from the observed data. Although classical shrinkage priors are useful in these contexts, increasing shrinkage priors can provide a more effective option, which progressively penalizes expansions with growing complexity. In this article we propose a novel increasing shrinkage prior, named the cumulative shrinkage process, for the parameters controlling the dimension in over-complete formulations. Our construction has broad applicability, simple interpretation, and is based on a sequence of spike and slab distributions which assign increasing mass to the spike as model complexity grows. Using factor analysis as an illustrative example, we show that this formulation has theoretical and practical advantages over current competitors, including an improved ability to recover the model dimension. An adaptive Markov chain Monte Carlo algorithm is proposed, and the methods are evaluated in simulation studies and applied to personality traits data.

preprint2020arXiv

Tractable Bayesian Density Regression via Logit Stick-Breaking Priors

There is a growing interest in learning how the distribution of a response variable changes with a set of predictors. Bayesian nonparametric dependent mixture models provide a flexible approach to address this goal. However, several formulations require computationally demanding algorithms for posterior inference. Motivated by this issue, we study a class of predictor-dependent infinite mixture models, which relies on a simple representation of the stick-breaking prior via sequential logistic regressions. This formulation maintains the same desirable properties of popular predictor-dependent stick-breaking priors, and leverages a recent Pólya-gamma data augmentation to facilitate the implementation of several computational methods for posterior inference. These routines include Markov chain Monte Carlo via Gibbs sampling, expectation-maximization algorithms, and mean-field variational Bayes for scalable inference, thereby stimulating a wider implementation of Bayesian density regression by practitioners. The algorithms associated with these methods are presented in detail and tested in a toxicology study.

preprint2015arXiv

The Compass for Statistical Researchers

We have hiked many miles alongside several professors as we traversed our statistical path -- a regime switching trail which changed direction following a class on the foundations of our discipline. As we play the game of research in that limbo between student and academic, one thing among Prof. Bernardi's teachings has never been more clear: to draw a route in the research map you not only need to know your destination, but you must also understand where you are and how you arrived there.

preprint2015arXiv

Unifying inference on brain network variations in neurological diseases: The Alzheimer's case

There is growing interest in understanding how the structural interconnections among brain regions change with the occurrence of neurological diseases. Diffusion weighted MRI imaging has allowed researchers to non-invasively estimate a network of structural cortical connections made by white matter tracts, but current statistical methods for relating such networks to the presence or absence of a disease cannot exploit this rich network information. Standard practice considers each edge independently or summarizes the network with a few simple features. We enable dramatic gains in biological insight via a novel unifying methodology for inference on brain network variations associated to the occurrence of neurological diseases. The key of this approach is to define a probabilistic generative mechanism directly on the space of network configurations via dependent mixtures of low-rank factorizations, which efficiently exploit network information and allow the probability mass function for the brain network-valued random variable to vary flexibly across the group of patients characterized by a specific neurological disease and the one comprising age-matched cognitively healthy individuals.

preprint2014arXiv

Bayesian dynamic financial networks with time-varying predictors

We propose a Bayesian nonparametric model including time-varying predictors in dynamic network inference. The model is applied to infer the dependence structure among financial markets during the global financial crisis, estimating effects of verbal and material cooperation efforts. We interestingly learn contagion effects, with increasing influence of verbal relations during the financial crisis and opposite results during the United States housing bubble.

preprint2013arXiv

Locally adaptive factor processes for multivariate time series

In modeling multivariate time series, it is important to allow time-varying smoothness in the mean and covariance process. In particular, there may be certain time intervals exhibiting rapid changes and others in which changes are slow. If such time-varying smoothness is not accounted for, one can obtain misleading inferences and predictions, with over-smoothing across erratic time intervals and under-smoothing across times exhibiting slow variation. This can lead to mis-calibration of predictive intervals, which can be substantially too narrow or wide depending on the time. We propose a locally adaptive factor process for characterizing multivariate mean-covariance changes in continuous time, allowing locally varying smoothness in both the mean and covariance matrix. This process is constructed utilizing latent dictionary functions evolving in time through nested Gaussian processes and linearly related to the observed data with a sparse mapping. Using a differential equation representation, we bypass usual computational bottlenecks in obtaining MCMC and online algorithms for approximate Bayesian inference. The performance is assessed in simulations and illustrated in a financial application.

Daniele Durante

What is connected

Connect this record

See the researcher in context

Building this map preview

11 published item(s)

Skew-symmetric approximations of posterior distributions

A Class of Conjugate Priors for Multinomial Probit Models which Includes the Multivariate Normal One

Extended Stochastic Block Models with Application to Criminal Networks

Scalable and Accurate Variational Bayes for High-Dimensional Binary Regression Models

Scalable computation of predictive probabilities in probit models with Gaussian process priors

Bayesian cumulative shrinkage for infinite factorizations

Tractable Bayesian Density Regression via Logit Stick-Breaking Priors

The Compass for Statistical Researchers

Unifying inference on brain network variations in neurological diseases: The Alzheimer's case

Bayesian dynamic financial networks with time-varying predictors

Locally adaptive factor processes for multivariate time series