Source author record

Botond Szabo

Botond Szabo appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Topics: math.ST, Statistics Theory, Machine Learning, Methodology, Computation, Applications, Information Theory, math.IT

Catalog footprint: 9 works, 8 topics, 4 close collaborators

Preprint, 2026, arXiv

Skew-symmetric approximations of posterior distributions

Popular deterministic approximations of posterior distributions from, e.g. the Laplace method, variational Bayes and expectation-propagation, generally rely on symmetric approximating families, often taken to be Gaussian. This choice facilitates optimization and inference, but typically affects the quality of the overall approximation. In fact, even in basic parametric models, the posterior distribution often displays asymmetries that yield bias and a reduced accuracy when considering symmetric approximations. Recent research has moved towards more flexible approximating families which incorporate skewness. However, current solutions are often model specific, lack a general supporting theory, increase the computational complexity of the optimization problem, and do not provide a broadly applicable solution to incorporate skewness in any symmetric approximation. This article addresses such a gap by introducing a general and provably optimal strategy to perturb any off-the-shelf symmetric approximation of a generic posterior distribution. This novel perturbation scheme is derived without additional optimization steps, and yields a similarly tractable approximation within the class of skew-symmetric densities that provably enhances the finite sample accuracy of the original symmetric counterpart. Furthermore, under suitable assumptions, it improves the convergence rate to the exact posterior by at least a $\sqrt{n}$ factor, in asymptotic regimes. These advancements are illustrated in numerical studies focusing on skewed perturbations of state-of-the-art Gaussian approximations.
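To make the phrase "class of skew-symmetric densities" concrete, here is a minimal sketch of the generic construction: a density f that is symmetric about mu is multiplied by 2 w(x - mu), where w takes values in [0, 1] and satisfies w(t) + w(-t) = 1. The Gaussian base, the skewing function Phi(alpha * t) and the value alpha = 2 are illustrative assumptions (they give the classical skew-normal density); they are not the perturbation derived in the paper.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Gaussian density, used here as the symmetric base approximation."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def normal_cdf(t):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

def skew_symmetric_pdf(x, mu=0.0, sigma=1.0, alpha=2.0):
    """2 * f(x) * w(x - mu), with f symmetric about mu and w(t) + w(-t) = 1.

    The skewing function w(t) = Phi(alpha * t / sigma) is an assumption
    made for illustration only.
    """
    w = normal_cdf(alpha * (x - mu) / sigma)
    return 2.0 * normal_pdf(x, mu, sigma) * w

def integrate(fn, lo=-12.0, hi=12.0, steps=48000):
    """Midpoint rule; accurate enough for these smooth, light-tailed integrands."""
    h = (hi - lo) / steps
    return h * sum(fn(lo + (i + 0.5) * h) for i in range(steps))

# The perturbed function is still a probability density ...
total_mass = integrate(skew_symmetric_pdf)
# ... but its mean has moved away from the symmetric centre mu = 0.
mean = integrate(lambda x: x * skew_symmetric_pdf(x))
```

The factor 2 w(x - mu) leaves the total mass at one for any valid w, which is why a skewed version of a symmetric approximation can stay as tractable as the original.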

Preprint, 2022, arXiv

Analyzing hierarchical multi-view MRI data with StaPLR: An application to Alzheimer's disease classification

Multi-view data refers to a setting where features are divided into feature sets, for example because they correspond to different sources. Stacked penalized logistic regression (StaPLR) is a recently introduced method that can be used for classification and automatically selecting the views that are most important for prediction. We introduce an extension of this method to a setting where the data has a hierarchical multi-view structure. We also introduce a new view importance measure for StaPLR, which allows us to compare the importance of views at any level of the hierarchy. We apply our extended StaPLR algorithm to Alzheimer's disease classification where different MRI measures have been calculated from three scan types: structural MRI, diffusion-weighted MRI, and resting-state fMRI. StaPLR can identify which scan types and which derived MRI measures are most important for classification, and it outperforms elastic net regression in classification performance.

Preprint, 2022, arXiv

Optimal distributed composite testing in high-dimensional Gaussian models with 1-bit communication

In this paper we study the problem of signal detection in Gaussian noise in a distributed setting where the local machines in the star topology can communicate a single bit of information. We derive a lower bound on the Euclidean norm that the signal needs to have in order to be detectable. Moreover, we exhibit optimal distributed testing strategies that attain the lower bound.

Preprint, 2020, arXiv

Distributed function estimation: adaptation using minimal communication

We investigate whether in a distributed setting, adaptive estimation of a smooth function at the optimal rate is possible under minimal communication. It turns out that the answer depends on the risk considered and on the number of servers over which the procedure is distributed. We show that for the $L_\infty$-risk, adaptively obtaining optimal rates under minimal communication is not possible. For the $L_2$-risk, it is possible over a range of regularities that depends on the relation between the number of local servers and the total sample size.

Preprint, 2020, arXiv

Fast Exact Bayesian Inference for Sparse Signals in the Normal Sequence Model

We consider exact algorithms for Bayesian inference with model selection priors (including spike-and-slab priors) in the sparse normal sequence model. Because the best existing exact algorithm becomes numerically unstable for sample sizes over n=500, much attention has been paid to alternative approaches such as approximate algorithms (Gibbs sampling, variational Bayes, etc.), shrinkage priors (e.g. the Horseshoe prior and the Spike-and-Slab LASSO) or empirical Bayesian methods. However, by introducing algorithmic ideas from online sequential prediction, we show that exact calculations are feasible for much larger sample sizes: for general model selection priors we reach n=25000, and for certain spike-and-slab priors we can easily reach n=100000. We further prove a de Finetti-like result for finite sample sizes that characterizes exactly which model selection priors can be expressed as spike-and-slab priors. The computational speed and numerical accuracy of the proposed methods are demonstrated in experiments on simulated data, on a differential gene expression data set, and in a comparison of the effect of multiple hyper-parameter settings in the beta-binomial prior. In our experimental evaluation we compute guaranteed bounds on the numerical accuracy of all new algorithms, which shows that the proposed methods are numerically reliable whereas an alternative based on long division is not.
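As a rough illustration of what "exact" means for a model selection prior, the sketch below computes the posterior over the number of non-zero coordinates without enumerating subsets. It is a naive quadratic-time recursion in plain floating point, suitable only for small n, and it is not the algorithm proposed in the paper. The Gaussian slab with variance `slab_var`, the unit noise variance, the uniform prior on the sparsity level (a beta-binomial prior with both parameters equal to 1) and the toy data are all assumptions.

```python
import math
from itertools import combinations

def normal_pdf(x, var=1.0):
    """Density of N(0, var) at x."""
    return math.exp(-0.5 * x * x / var) / math.sqrt(2.0 * math.pi * var)

def elementary_symmetric(ratios):
    """e[s] = sum over index sets S of size s of prod_{i in S} ratios[i]."""
    e = [1.0] + [0.0] * len(ratios)
    for i, r in enumerate(ratios, start=1):
        for s in range(i, 0, -1):  # descend so each ratio enters a product once
            e[s] += r * e[s - 1]
    return e

def sparsity_posterior(x, prior_s, slab_var=4.0):
    """Posterior over the sparsity level s, given x_i ~ N(theta_i, 1).

    Prior: s ~ prior_s, support S uniform among subsets of size s,
    theta_i ~ N(0, slab_var) for i in S and theta_i = 0 otherwise.
    """
    n = len(x)
    # Marginal likelihood ratio of "slab" versus "spike" for each coordinate.
    ratios = [normal_pdf(xi, 1.0 + slab_var) / normal_pdf(xi) for xi in x]
    e = elementary_symmetric(ratios)
    weights = [prior_s[s] * e[s] / math.comb(n, s) for s in range(n + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def sparsity_posterior_brute_force(x, prior_s, slab_var=4.0):
    """Same quantity by explicit enumeration of all subsets (check only)."""
    n = len(x)
    weights = []
    for s in range(n + 1):
        acc = 0.0
        for subset in combinations(range(n), s):
            like = 1.0
            for i, xi in enumerate(x):
                like *= normal_pdf(xi, 1.0 + slab_var) if i in subset else normal_pdf(xi)
            acc += like
        weights.append(prior_s[s] * acc / math.comb(n, s))
    total = sum(weights)
    return [w / total for w in weights]

x = [0.1, -0.3, 4.2, 0.2, -3.8, 0.5]          # toy data, two clear signals
prior_s = [1.0 / (len(x) + 1)] * (len(x) + 1)  # uniform on {0, ..., n}
post = sparsity_posterior(x, prior_s)
check = sparsity_posterior_brute_force(x, prior_s)
```

The recursion replaces a sum over 2^n subsets by n(n+1)/2 multiply-add steps. Keeping such sums numerically stable for large n is the problem the abstract describes, and this sketch makes no attempt to solve it.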

Preprint, 2020, arXiv

Stacked Penalized Logistic Regression for Selecting Views in Multi-View Learning

In biomedical research, many different types of patient data can be collected, such as various types of omics data and medical imaging modalities. Applying multi-view learning to these different sources of information can increase the accuracy of medical classification models compared with single-view procedures. However, collecting biomedical data can be expensive and/or burdensome for patients, so it is important to reduce the amount of required data collection. It is therefore necessary to develop multi-view learning methods which can accurately identify those views that are most important for prediction. In recent years, several biomedical studies have used an approach known as multi-view stacking (MVS), where a model is trained on each view separately and the resulting predictions are combined through stacking. In these studies, MVS has been shown to increase classification accuracy. However, the MVS framework can also be used for selecting a subset of important views. To study the view selection potential of MVS, we develop a special case called stacked penalized logistic regression (StaPLR). Compared with existing view-selection methods, StaPLR can make use of faster optimization algorithms and is easily parallelized. We show that nonnegativity constraints on the parameters of the function which combines the views play an important role in preventing unimportant views from entering the model. We investigate the performance of StaPLR through simulations, and consider two real data examples. We compare the performance of StaPLR with an existing view selection method called the group lasso and observe that, in terms of view selection, StaPLR is often more conservative and has a consistently lower false positive rate.
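The role of the nonnegativity constraint in the combining function can be illustrated with a toy meta-learner. This is a sketch under stated assumptions, not the StaPLR implementation: `fit_nonneg_meta_learner`, the projected gradient scheme, the L1 penalty value and the hand-made scores (centred per-view predictions, for instance a predicted probability minus 0.5) are all hypothetical.

```python
import math

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

def fit_nonneg_meta_learner(Z, y, l1=0.01, lr=0.5, iters=2000):
    """Logistic meta-learner with an L1 penalty and weights constrained to >= 0.

    Z holds one row per observation and one column per view (the view's
    prediction score); y holds 0/1 labels. Fitted by projected gradient descent.
    """
    n, k = len(Z), len(Z[0])
    b, w = 0.0, [0.0] * k
    for _ in range(iters):
        p = [sigmoid(b + sum(wj * zj for wj, zj in zip(w, row))) for row in Z]
        resid = [pi - yi for pi, yi in zip(p, y)]
        b -= lr * sum(resid) / n  # the intercept is left unpenalised
        for j in range(k):
            # For w_j >= 0 the L1 penalty contributes a constant +l1 to the gradient.
            grad = sum(r * row[j] for r, row in zip(resid, Z)) / n + l1
            w[j] = max(0.0, w[j] - lr * grad)  # projection onto w_j >= 0
    return b, w

# Hypothetical centred scores: view 0 agrees with the labels, view 1 points
# the wrong way and would receive a negative weight without the constraint.
y = [0, 0, 0, 1, 1, 1]
Z = [[-0.4, 0.3], [-0.3, 0.1], [-0.2, 0.2],
     [0.2, -0.2], [0.3, -0.1], [0.4, -0.3]]
intercept, weights = fit_nonneg_meta_learner(Z, y)
```

On this data the projection keeps the weight of the second view at exactly zero, so the view drops out of the combined model, while the first view receives a positive weight.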

Preprint, 2018, arXiv

A Bayesian nonparametric approach to log-concave density estimation

The estimation of a log-concave density on $\mathbb{R}$ is a canonical problem in the area of shape-constrained nonparametric inference. We present a Bayesian nonparametric approach to this problem based on an exponentiated Dirichlet process mixture prior and show that the posterior distribution converges to the log-concave truth at the (near-)minimax rate in Hellinger distance. Our proof proceeds by establishing a general contraction result based on the log-concave maximum likelihood estimator that avoids the need for further metric entropy calculations. We also present two computationally more feasible approximations and a more practical empirical Bayes approach, which are illustrated numerically via simulations.

Preprint, 2016, arXiv

Asymptotic behaviour of the empirical Bayes posteriors associated to maximum marginal likelihood estimator

We consider the asymptotic behaviour of the marginal maximum likelihood empirical Bayes posterior distribution in a general setting. First we characterize the set where the maximum marginal likelihood estimator is located with high probability. Then we provide oracle-type upper and lower bounds for the contraction rates of the empirical Bayes posterior. We also show that the hierarchical Bayes posterior achieves the same contraction rate as the maximum marginal likelihood empirical Bayes posterior. We demonstrate the applicability of our general results for various models and prior distributions by deriving upper and lower bounds for the contraction rates of the corresponding empirical and hierarchical Bayes posterior distributions.

Preprint, 2014, arXiv

Honest Bayesian confidence sets for the L2-norm

We investigate the problem of constructing Bayesian credible sets that are honest and adaptive for the L2-loss over a scale of Sobolev classes with regularity ranging over [D, 2D], for some given D, in the context of the signal-in-white-noise model. We consider a scale of prior distributions indexed by a regularity hyper-parameter and choose the hyper-parameter both by marginal likelihood empirical Bayes and by the hierarchical Bayes method. Next we consider a ball centered around the corresponding posterior mean with prescribed posterior probability. We show by theory and examples that both the empirical Bayes and the hierarchical Bayes credible sets give misleading, overconfident uncertainty quantification for certain oddly behaving truths. Then we construct a new empirical Bayes method based on risk estimation, which provides the correct uncertainty quantification and optimal size.
