Source author record

Lourens Waldorp

Lourens Waldorp appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

Topics: Methodology, Machine Learning, stat.OT

Catalog footprint

What is connected

5 works

3 topics

4 close collaborators

preprint, 2022, arXiv

Using Explainable Boosting Machine to Compare Idiographic and Nomothetic Approaches for Ecological Momentary Assessment Data

Previous research on ecological momentary assessment (EMA) data of mental disorders has mainly focused on multivariate regression-based approaches that model each individual separately. This paper goes a step further by exploring the use of non-linear interpretable machine learning (ML) models in classification problems. ML models can improve the prediction of different behaviors by recognizing complicated patterns between variables in data. To evaluate this, the performance of various ensembles of trees is compared to that of linear models using imbalanced synthetic and real-world datasets. After examining the distributions of AUC scores in all cases, non-linear models appear to be superior to baseline linear models. Moreover, apart from personalized approaches, group-level prediction models are also likely to offer enhanced performance. Accordingly, two different nomothetic approaches to integrating data from more than one individual are examined: one using all data directly during training and one based on knowledge distillation. Interestingly, in one of the two real-world datasets, the knowledge distillation method achieves improved AUC scores (mean relative change of +17% compared to personalized), showing how it can benefit EMA data classification and performance.
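The abstract compares models by AUC and reports a mean relative change between approaches. The sketch below shows one way such quantities can be computed; it is not the authors' evaluation code. The function names `auc` and `mean_relative_change` are hypothetical, the per-person averaging is an assumption about how the relative change is aggregated, and the numbers in the usage section are illustrative only.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney form).

    Counts how often a positive case is scored above a negative case;
    ties count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def mean_relative_change(baseline_aucs, candidate_aucs):
    """Average of (candidate - baseline) / baseline over individuals (assumed)."""
    changes = [(c - b) / b for b, c in zip(baseline_aucs, candidate_aucs)]
    return sum(changes) / len(changes)


if __name__ == "__main__":
    # Illustrative values only, not taken from the paper.
    print(auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
    print(mean_relative_change([0.5, 0.8], [0.6, 0.8]))
```

Because AUC depends only on how positive and negative cases are ranked, it does not move with the class proportions the way accuracy does, which suits the imbalanced datasets mentioned above.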

preprint, 2020, arXiv

Interpreting the Ising Model: The Input Matters

The Ising model is a model for pairwise interactions between binary variables that has become popular in the psychological sciences. It was first introduced as a theoretical model for the alignment between positive (+1) and negative (-1) atom spins. In many psychological applications, however, the Ising model is defined on the domain $\{0,1\}$ instead of the classical domain $\{-1,1\}$. While it is possible to transform the parameters of a given Ising model in one domain to obtain a statistically equivalent model in the other domain, the parameters in the two versions of the Ising model lend themselves to different interpretations and imply different dynamics when the Ising model is studied as a dynamical system. In this tutorial paper, we provide an accessible discussion of the interpretation of threshold and interaction parameters in the two domains and show how the dynamics of the Ising model depend on the choice of domain. Finally, we provide a transformation that maps the parameters of an Ising model in one domain to those of a statistically equivalent Ising model in the other domain.
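The equivalence between the two domains can be checked numerically. The sketch below assumes the parameterization $P(s) \propto \exp(\sum_i h_i s_i + \sum_{i<j} J_{ij} s_i s_j)$, which may differ in sign or scaling from the paper's notation. Substituting $s = 2x - 1$ gives $\omega_{ij} = 4 J_{ij}$ and $\tau_i = 2 h_i - 2 \sum_{j \neq i} J_{ij}$ for the $\{0,1\}$ domain. The function names and parameter values are hypothetical.

```python
import itertools
import math


def pm_to_binary(h, J):
    """Map {-1,+1}-domain parameters (h, J) to {0,1}-domain (tau, omega).

    Follows from substituting s = 2x - 1; the leftover constant is absorbed
    by the normalizing constant. J is a symmetric matrix with zero diagonal.
    """
    n = len(h)
    omega = [[4.0 * J[i][j] if i != j else 0.0 for j in range(n)]
             for i in range(n)]
    tau = [2.0 * h[i] - 2.0 * sum(J[i][j] for j in range(n) if j != i)
           for i in range(n)]
    return tau, omega


def distribution(thresholds, interactions, domain):
    """Exact state probabilities by brute-force enumeration of all states."""
    n = len(thresholds)
    weights = []
    for state in itertools.product(domain, repeat=n):
        linear = sum(thresholds[i] * state[i] for i in range(n))
        pairwise = sum(interactions[i][j] * state[i] * state[j]
                       for i in range(n) for j in range(i + 1, n))
        weights.append(math.exp(linear + pairwise))
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    # Illustrative parameter values only.
    h = [0.3, -0.2, 0.1]
    J = [[0.0, 0.5, -0.4], [0.5, 0.0, 0.2], [-0.4, 0.2, 0.0]]
    tau, omega = pm_to_binary(h, J)
    p_pm = distribution(h, J, (-1, 1))
    p_01 = distribution(tau, omega, (0, 1))
    # Largest difference between the two distributions, state by state.
    print(max(abs(a - b) for a, b in zip(p_pm, p_01)))
```

Under this parameterization, a threshold of zero in the $\{-1,1\}$ domain maps to $\tau_i = -2 \sum_{j \neq i} J_{ij}$ in the $\{0,1\}$ domain, so the same numeric threshold does not carry the same meaning in both versions.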

preprint, 2020, arXiv

Moderated Network Models

Pairwise network models such as the Gaussian Graphical Model (GGM) are a powerful and intuitive way to analyze dependencies in multivariate data. A key assumption of the GGM is that each pairwise interaction is independent of the values of all other variables. However, in psychological research this is often implausible. In this paper, we extend the GGM by allowing each pairwise interaction between two variables to be moderated by (a subset of) all other variables in the model, and thereby introduce a Moderated Network Model (MNM). We show how to construct the MNM and propose an L1-regularized nodewise regression approach to estimate it. We provide performance results in a simulation study and show that MNMs outperform the split-sample based methods Network Comparison Test (NCT) and Fused Graphical Lasso (FGL) in detecting moderation effects. Finally, we provide a fully reproducible tutorial on how to estimate MNMs with the R-package mgm and discuss possible issues with model misspecification.
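The moderation idea can be written as a nodewise regression with product terms. The notation below is an illustrative sketch and is not necessarily the paper's; $\beta_{ij}$, $\omega_{ijk}$ and $\lambda_i$ are symbols assumed here. For each node $i$, with $X_{-i}$ denoting all other variables:

$$
\mathbb{E}[X_i \mid X_{-i}] = \beta_{0,i} + \sum_{j \neq i} \beta_{ij} X_j + \sum_{\substack{j < k \\ j,k \neq i}} \omega_{ijk} X_j X_k
$$

Under this form the effect of $X_j$ on $X_i$ is no longer a constant but depends on the values of the moderators:

$$
\frac{\partial\, \mathbb{E}[X_i \mid X_{-i}]}{\partial X_j} = \beta_{ij} + \sum_{k \neq i,j} \omega_{ijk} X_k
$$

An L1-regularized estimate for node $i$ from $n$ observations $x_{t}$ would then minimize

$$
\frac{1}{2n} \sum_{t=1}^{n} \Big( x_{t,i} - \beta_{0,i} - \sum_{j \neq i} \beta_{ij} x_{t,j} - \sum_{\substack{j < k \\ j,k \neq i}} \omega_{ijk} x_{t,j} x_{t,k} \Big)^2 + \lambda_i \Big( \sum_{j \neq i} |\beta_{ij}| + \sum_{\substack{j < k \\ j,k \neq i}} |\omega_{ijk}| \Big)
$$

When every $\omega_{ijk}$ is zero, the expression reduces to the usual nodewise regression for a GGM.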

preprint, 2020, arXiv

Relations between networks, regression, partial correlation, and latent variable model

The Gaussian graphical model (GGM) has become a popular tool for analyzing networks of psychological variables. In a recent paper in this journal, Forbes, Wright, Markon, and Krueger (FWMK) voiced the concern that GGMs estimated from partial correlations wrongfully remove the variance that is shared by their constituents. If true, this concern has grave consequences for the application of GGMs. Indeed, if partial correlations only capture the unique covariances, then data that come from a unidimensional latent variable model (ULVM) should be associated with an empty network (no edges), as there are no unique covariances in a ULVM. We know that this cannot be true, which suggests that FWMK are missing something with their claim. We introduce a connection between the ULVM and the GGM and use that connection to prove that a ULVM is associated with a fully connected network, not an empty one. We then use the relation between GGMs and linear regression to show that the partial correlation indeed does not remove the common variance.
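The claim that a ULVM implies a fully connected network can be illustrated numerically. The sketch below is not the paper's proof. It assumes a one-factor model with unit factor variance, so that the implied covariance is $\Sigma = \lambda\lambda^\top + \Theta$ with diagonal $\Theta$, and it inverts $\Sigma$ with the Sherman-Morrison identity. The function names and the loading values are hypothetical.

```python
import math


def implied_covariance(loadings, uniquenesses):
    """Covariance of a one-factor model: lambda lambda^T + diag(theta)."""
    n = len(loadings)
    return [[loadings[i] * loadings[j] + (uniquenesses[i] if i == j else 0.0)
             for j in range(n)] for i in range(n)]


def precision_and_partials(loadings, uniquenesses):
    """Precision matrix via Sherman-Morrison, and partial correlations.

    K = Theta^-1 - Theta^-1 lambda lambda^T Theta^-1 / (1 + lambda^T Theta^-1 lambda)
    rho_ij = -K_ij / sqrt(K_ii K_jj) for i != j
    """
    n = len(loadings)
    scaled = [l / u for l, u in zip(loadings, uniquenesses)]
    c = sum(l * s for l, s in zip(loadings, scaled))
    precision = [[(1.0 / uniquenesses[i] if i == j else 0.0)
                  - scaled[i] * scaled[j] / (1.0 + c)
                  for j in range(n)] for i in range(n)]
    partials = [[1.0 if i == j else
                 -precision[i][j] / math.sqrt(precision[i][i] * precision[j][j])
                 for j in range(n)] for i in range(n)]
    return precision, partials


if __name__ == "__main__":
    # Illustrative loadings and unique variances only.
    loadings = [0.8, 0.6, 0.7, 0.5]
    uniquenesses = [0.36, 0.64, 0.51, 0.75]
    _, partials = precision_and_partials(loadings, uniquenesses)
    for row in partials:
        print([round(value, 3) for value in row])
```

With all loadings nonzero, every off-diagonal entry of the precision matrix is $-\lambda_i \lambda_j / (\theta_i \theta_j (1 + c))$ with $c = \sum_i \lambda_i^2 / \theta_i$, which is nonzero, so no edge is absent from the network.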

preprint, 2020, arXiv

Reliability of decisions based on tests: Fourier analysis of Boolean decision functions

Items in a test are often used as a basis for making decisions, and such tests are therefore required to have good psychometric properties, like unidimensionality. In many cases the sum score is used in combination with a threshold to decide, for instance, between pass and fail. Here we consider whether such a decision function is appropriate, without a latent variable model, and which properties of a decision function are desirable. We consider reliability (stability) of the decision function, i.e., whether the decision changes upon perturbations, or changes in a fraction of the outcomes of the items (measurement error). We are concerned with the question of whether the sum score is the best way to aggregate the items, and if so, why. We use ideas from test theory, social choice theory, graphical models, computer science and probability theory to answer these questions. We conclude that a weighted sum score has desirable properties: it (i) fits with test theory and is observable (similar to a condition like conditional association), (ii) yields a decision that is stable (reliable), and (iii) satisfies Rousseau's criterion that the input should match the decision. We use Fourier analysis of Boolean functions to investigate whether a decision function is stable and to determine which items (or sets of items) have a disproportionately large influence on the decision. To apply these techniques we invoke ideas from graphical models and use a pseudo-likelihood factorisation of the probability distribution.
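The notion of an item's influence on a decision can be made concrete with a small enumeration. The sketch below is a simplification: it assumes item outcomes are independent and uniformly distributed, whereas the abstract describes a pseudo-likelihood factorisation of the actual distribution. Influence here is the probability that flipping one item changes the decision. The function names, weights and thresholds are hypothetical.

```python
import itertools


def decision(items, weights, threshold):
    """Pass (1) if the weighted sum score reaches the threshold, else fail (0)."""
    score = sum(w * x for w, x in zip(weights, items))
    return 1 if score >= threshold else 0


def influences(weights, threshold):
    """Per-item probability that flipping the item flips the decision.

    Enumerates all 2^n response patterns, each treated as equally likely
    (a simplifying assumption of this sketch).
    """
    n = len(weights)
    counts = [0] * n
    for items in itertools.product((0, 1), repeat=n):
        base = decision(items, weights, threshold)
        for i in range(n):
            flipped = list(items)
            flipped[i] = 1 - flipped[i]
            if decision(flipped, weights, threshold) != base:
                counts[i] += 1
    return [count / 2 ** n for count in counts]


if __name__ == "__main__":
    # Unweighted sum score over three items, pass at two or more correct.
    print(influences([1, 1, 1], 2))
    # One heavily weighted item: the decision reduces to that item alone.
    print(influences([3, 1, 1], 3))
```

With equal weights every item has the same influence, while with weights 3, 1, 1 and threshold 3 the first item decides the outcome alone and the other two have no influence, an extreme case of one item dominating the decision.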
