Source author record

Roberto Rocci

Roberto Rocci appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

Topics: Methodology, Computation

Catalog footprint

What is connected

4 works

2 topics

4 close collaborators

Preprint, 2026, arXiv

Omitted covariates bias and finite mixtures of regression models for longitudinal responses

Individual-specific, time-constant random effects are often used to model dependence and/or to account for omitted covariates in regression models for longitudinal responses. Longitudinal studies have become widespread in recent years because they make it possible to distinguish between so-called age and cohort effects; these correspond to differences that are observable at the start of the study and persist over time, and to changes in the response driven by the temporal dynamics of the observed covariates. Although there is general agreement on this purpose, the random effect approach has frequently been criticized for its lack of robustness to correlation between observed heterogeneity (the covariates) and unobserved heterogeneity (the random effects). Starting from the so-called correlated effect approach, we argue that the random effect approach may be parametrized to account for potential correlation between observables and unobservables. Specifically, when the random effect distribution is estimated non-parametrically using a discrete distribution on a finite number of locations, a further, more general solution is developed. This is illustrated through a large-scale simulation study and the analysis of a benchmark dataset.
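The bias described in this abstract can be illustrated with a small simulation. The sketch below is not the paper's method: it assumes one common form of the correlated effect idea (adding each individual's covariate mean as an extra regressor, often called the Mundlak device), uses plain least squares instead of a random effect estimator, and relies on a hypothetical data-generating process chosen only for illustration. The names `simulate_panel` and `ols` are introduced here.

```python
import numpy as np

def simulate_panel(n=2000, t=5, beta=1.0, seed=0):
    """Hypothetical balanced panel in which the individual effect
    alpha_i is correlated with the covariate x_it by construction."""
    rng = np.random.default_rng(seed)
    alpha = rng.normal(size=(n, 1))           # unobserved individual effect
    x = alpha + rng.normal(size=(n, t))       # covariate depends on alpha
    y = beta * x + alpha + rng.normal(size=(n, t))
    return x, y

def ols(design, response):
    """Least-squares coefficients for response ~ design."""
    coef, *_ = np.linalg.lstsq(design, response, rcond=None)
    return coef

x, y = simulate_panel()
ones = np.ones(x.size)

# Naive pooled regression: the omitted alpha_i is absorbed by the slope on x.
beta_naive = ols(np.column_stack([ones, x.ravel()]), y.ravel())[1]

# Augmented regression: the individual mean of x enters as an extra covariate
# and captures the part of alpha_i that is correlated with x.
x_bar = np.repeat(x.mean(axis=1), x.shape[1])
beta_cre = ols(np.column_stack([ones, x.ravel(), x_bar]), y.ravel())[1]

print(f"true beta = 1.0, naive = {beta_naive:.3f}, augmented = {beta_cre:.3f}")
```

Under these assumptions the naive slope is pulled away from the true value while the augmented slope stays close to it, which is the kind of robustness issue the abstract refers to.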

Preprint, 2016, arXiv

A data driven equivariant approach to constrained Gaussian mixture modeling

Maximum likelihood estimation of Gaussian mixture models with different class-specific covariance matrices is known to be problematic. This is due to the unboundedness of the likelihood, together with the presence of spurious maximizers. Existing methods to bypass this obstacle rely on the fact that unboundedness is avoided if the eigenvalues of the covariance matrices are bounded away from zero. This can be achieved by imposing constraints on the covariance matrices, i.e. by incorporating a priori information on the covariance structure of the mixture components. The present work introduces a constrained equivariant approach, where the class-conditional covariance matrices are shrunk towards a pre-specified matrix Psi. Data-driven choices of the matrix Psi, when a priori information is not available, and the optimal amount of shrinkage are investigated. The effectiveness of the proposal is evaluated through a simulation study and an empirical example.
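The unboundedness mentioned in this abstract is easy to reproduce numerically. The sketch below is an illustration only, on a hypothetical toy data set and in the univariate case: one mixture component is centred exactly on a single observation and its standard deviation is driven towards zero, which makes the log-likelihood grow without bound. The function names are introduced here and are not taken from the paper.

```python
import numpy as np

def mixture_loglik(x, means, sigmas, weights):
    """Log-likelihood of a univariate Gaussian mixture (log-sum-exp form)."""
    x = np.asarray(x, dtype=float)[:, None]            # shape (n, 1)
    means = np.asarray(means, dtype=float)[None, :]    # shape (1, G)
    sigmas = np.asarray(sigmas, dtype=float)[None, :]
    weights = np.asarray(weights, dtype=float)[None, :]
    log_dens = (-0.5 * np.log(2.0 * np.pi) - np.log(sigmas)
                - 0.5 * ((x - means) / sigmas) ** 2)
    a = np.log(weights) + log_dens
    m = a.max(axis=1, keepdims=True)                   # stabilise the exponentials
    return float(np.sum(m[:, 0] + np.log(np.exp(a - m).sum(axis=1))))

# Hypothetical toy sample; the last point is isolated from the others.
x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0, 5.0])

def degenerate_loglik(sigma_small):
    """Component 1 sits on the isolated point with a tiny standard deviation;
    component 2 is a broad Gaussian fitted to the whole sample."""
    return mixture_loglik(
        x,
        means=[x[-1], x.mean()],
        sigmas=[sigma_small, x.std()],
        weights=[0.5, 0.5],
    )

sigmas = [1e-1, 1e-3, 1e-6, 1e-9]
logliks = [degenerate_loglik(s) for s in sigmas]
for s, ll in zip(sigmas, logliks):
    print(f"sigma = {s:g}: log-likelihood = {ll:.2f}")
```

Bounding the component variances (or, in the multivariate case, the eigenvalues of the covariance matrices) away from zero rules out this degenerate path, which is the motivation for the constraints discussed in the abstract.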

Preprint, 2016, arXiv

Estimation of clusterwise linear regression models with a shrinkage-like approach

Constrained approaches to maximum likelihood estimation in the context of finite mixtures of normals have been presented in the literature. A fully data-dependent constrained method for maximum likelihood estimation of clusterwise linear regression is proposed, which extends previous work on equivariant data-driven estimation of finite mixtures of Gaussians for classification. The method imposes plausible bounds on the component variances, based on a target value estimated from the data, which we take to be the homoscedastic variance. The present work, however, focuses not only on classification recovery but also on how well the model parameters are estimated. In particular, the paper sheds light on the shrinkage-like interpretation of the procedure, where the target is the homoscedastic model: this concerns not only how close the estimated scales are to the target, but also the estimated clusterwise linear regressions and the classification. We show, based on simulation and real-data results, that our approach yields a final model that is the most data-appropriate compromise between the heteroscedastic model and the homoscedastic model.
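The shrinkage-like reading of variance bounds can be sketched in a few lines. This is an illustration under stated assumptions, not the paper's estimator: the exact form of the bounds is not given in the abstract, so the sketch assumes a symmetric multiplicative band `[target / c, target * c]` around the target variance, with `c >= 1` a hypothetical tuning constant. The function `clip_variances` is introduced here.

```python
import numpy as np

def clip_variances(variances, target, c):
    """Force component variances into [target / c, target * c].

    Assumed (hypothetical) bound form: c = 1 collapses every variance onto
    the target, i.e. the homoscedastic model; a very large c leaves the
    variances essentially unconstrained, i.e. the heteroscedastic model.
    """
    if c < 1:
        raise ValueError("c must be at least 1")
    v = np.asarray(variances, dtype=float)
    return np.clip(v, target / c, target * c)

# Hypothetical component variances, one of them nearly degenerate.
raw = [1e-8, 1.0, 50.0]
target = 2.0  # stands in for the homoscedastic variance estimate

homoscedastic = clip_variances(raw, target, c=1.0)
compromise = clip_variances(raw, target, c=4.0)
unconstrained = clip_variances(raw, target, c=1e12)
```

In an EM-type algorithm a step like this would typically be applied to the variance updates at each iteration; choosing `c` from the data is what would make the procedure fully data-dependent.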

Preprint, 2015, arXiv

A pairwise likelihood approach to simultaneous clustering and dimensional reduction of ordinal data

The literature on clustering for continuous data is rich and wide; by contrast, the literature on categorical data is still limited. In some cases, the problem is made more difficult by the presence of noise variables/dimensions that carry no information about the clustering structure and may mask it. The aim of this paper is to propose a model for simultaneous clustering and dimensionality reduction of ordered categorical data that detects the discriminative dimensions and discards the noise ones. Following the underlying response variable approach, the observed variables are treated as a discretization of underlying first-order latent continuous variables distributed as a Gaussian mixture. To separate discriminative from noise dimensions, these variables are taken to be linear combinations of two independent sets of second-order latent variables, only one of which contains information about the cluster structure, while the other contains the noise dimensions. The model specification involves multidimensional integrals that make maximum likelihood estimation cumbersome and, in some cases, infeasible. To overcome this issue, parameter estimation is carried out through an EM-like algorithm that maximizes a pairwise log-likelihood. Applications of the model to real and simulated data show the effectiveness of the proposal.
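For reference, a pairwise log-likelihood of the kind mentioned in this abstract can be written generically as follows. The notation is an assumption introduced here, since the abstract does not give the paper's own symbols: y_{ij} is the ordinal response of unit i on variable j, P is the number of variables, G the number of mixture components with weights \pi_g, \gamma^{(j)}_a are the thresholds that discretize the j-th underlying continuous variable, and \phi_2 is a bivariate normal density.

```latex
p\ell(\theta)
  = \sum_{i=1}^{n} \sum_{j=1}^{P-1} \sum_{k=j+1}^{P}
    \log \Pr\left(y_{ij}, y_{ik}; \theta\right),
\qquad
\Pr\left(y_{ij}=a,\, y_{ik}=b; \theta\right)
  = \sum_{g=1}^{G} \pi_g
    \int_{\gamma^{(j)}_{a-1}}^{\gamma^{(j)}_{a}}
    \int_{\gamma^{(k)}_{b-1}}^{\gamma^{(k)}_{b}}
    \phi_2\left(u, v;\, \mu^{(jk)}_g, \Sigma^{(jk)}_g\right) \, dv \, du .
```

Each term involves only a two-dimensional integral, whatever the value of P, which is why replacing the full likelihood with this pairwise version avoids the multidimensional integrals mentioned in the abstract.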