Source author record

Marcel van Gerven

Marcel van Gerven appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Machine Learning Computer Vision Neurons and Cognition Artificial Intelligence eess.IV Hardware Architecture Methodology Neural and Evolutionary Computing Robotics

Catalog footprint

What is connected

12works

9topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2026arXiv

Decorrelation Speeds Up Vision Transformers

Masked Autoencoder (MAE) pre-training of vision transformers (ViTs) yields strong performance in low-label data regimes but comes with substantial computational costs, making it impractical in time- and resource-constrained industrial settings. We address this by integrating Decorrelated Backpropagation (DBP) into MAE pre-training, an optimization method that iteratively reduces input correlations at each layer to accelerate convergence. Applied selectively to the encoder, DBP achieves faster pre-training without loss of stability. To mimic constrained-data scenarios, we evaluate our approach on ImageNet-1K pre-training and ADE20K fine-tuning using randomly sampled subsets of each dataset. Under this setting, DBP-MAE reduces wall-clock time to baseline performance by 21.1%, lowers carbon emissions by 21.4%, and improves segmentation mIoU by 1.1 points. We observe similar gains when pre-training and fine-tuning on proprietary industrial data, confirming the method's applicability in real-world scenarios. These results demonstrate that DBP can reduce training time and energy use while improving downstream performance for large-scale ViT pre-training. Keywords: Deep learning, Vision transformers, Efficient AI, Decorrelation

preprint2026arXiv

Evolving Layer-Specific Scalar Functions for Hardware-Aware Transformer Adaptation

Vision Transformers (ViTs) achieve state-of-the-art performance on challenging vision tasks, but their deployment on edge devices is severely hindered by the computational complexity and global reduction bottleneck imposed by layer normalization. Recent methods attempt to bypass this by replacing normalization layers with hardware-friendly scalar approximations. However, these homogeneous replacements do not optimally fit to all layers' behaviour and rely on expensive model retraining. In this work, we propose a highly efficient, hardware-aware framework that utilizes genetic programming (GP) to evolve heterogeneous, layer-specific scalar functions directly from pre-trained weights. Coupled with a novel post-training re-alignment strategy, our approach eliminates the need to retrain models from scratch entirely. Our evolved expressions accurately approximate the target normalization behaviours, capturing $91.6\%$ of the variance ($R^2$) compared to only $70.2\%$ for homogeneous baselines, allowing our modified architecture to recover $84.25\%$ Top-1 ImageNet-1K accuracy in only 20 epochs. By preserving this performance while eliminating the global reduction bottleneck, our approach establishes a highly favourable trade-off between arithmetic complexity and off-chip memory traffic, removing a primary barrier to the efficient deployment of ViTs on edge accelerators.

preprint2026arXiv

Neural Co-state Policies: Structuring Hidden States in Recurrent Reinforcement Learning

A key capability of intelligent agents is operating under partial observability: reasoning and acting effectively despite missing or incomplete state observations. While recurrent (memory-based) policies learned via reinforcement learning address this by encoding history into latent state representations, their internal dynamics remain uninterpretable black boxes. This paper establishes a formal link between these hidden states and the Pontryagin minimum principle (PMP) from optimal control. We demonstrate that for standard recurrent architectures, latent representations map directly to PMP co-states, which allows the readout layer to be interpreted as performing Hamiltonian minimization. Because standard reward maximization does not naturally discover this alignment, we introduce a PMP-derived co-state loss to explicitly structure the internal dynamics. Empirically, this approach matches or improves performance on partially observable DMControl tasks, and is robust against zero-shot out-of-distribution sensor masking. By framing recurrent networks as dynamic processes governed by the minimum principle, we provide a principled approach to designing robust continuous control policies.

preprint2022arXiv

How does artificial intelligence contribute to iEEG research?

Artificial intelligence (AI) is a fast-growing field focused on modeling and machine implementation of various cognitive functions with an increasing number of applications in computer vision, text processing, robotics, neurotechnology, bio-inspired computing and others. In this chapter, we describe how AI methods can be applied in the context of intracranial electroencephalography (iEEG) research. IEEG data is unique as it provides extremely high-quality signals recorded directly from brain tissue. Applying advanced AI models to these data carries the potential to further our understanding of many fundamental questions in neuroscience. At the same time, as an invasive technique, iEEG lends itself well to long-term, mobile brain-computer interface applications, particularly for communication in severely paralyzed individuals. We provide a detailed overview of these two research directions in the application of AI techniques to iEEG. That is, (1) the development of computational models that target fundamental questions about the neurobiological nature of cognition (AI-iEEG for neuroscience) and (2) applied research on monitoring and identification of event-driven brain states for the development of clinical brain-computer interface systems (AI-iEEG for neurotechnology). We explain key machine learning concepts, specifics of processing and modeling iEEG data and details of state-of-the-art iEEG-based neurotechnology and brain-computer interfaces.

preprint2021arXiv

Automatic structured variational inference

Stochastic variational inference offers an attractive option as a default method for differentiable probabilistic programming. However, the performance of the variational approach depends on the choice of an appropriate variational family. Here, we introduce automatic structured variational inference (ASVI), a fully automated method for constructing structured variational families, inspired by the closed-form update in conjugate Bayesian models. These convex-update families incorporate the forward pass of the input probabilistic program and can therefore capture complex statistical dependencies. Convex-update families have the same space and time complexity as the input probabilistic program and are therefore tractable for a very large family of models including both continuous and discrete variables. We validate our automatic variational method on a wide range of low- and high-dimensional inference problems. We find that ASVI provides a clear improvement in performance when compared with other popular approaches such as the mean-field approach and inverse autoregressive flows. We provide an open source implementation of ASVI in TensorFlow Probability.

preprint2021arXiv

Automatic variational inference with cascading flows

The automation of probabilistic reasoning is one of the primary aims of machine learning. Recently, the confluence of variational inference and deep learning has led to powerful and flexible automatic inference methods that can be trained by stochastic gradient descent. In particular, normalizing flows are highly parameterized deep models that can fit arbitrarily complex posterior densities. However, normalizing flows struggle in highly structured probabilistic programs as they need to relearn the forward-pass of the program. Automatic structured variational inference (ASVI) remedies this problem by constructing variational programs that embed the forward-pass. Here, we combine the flexibility of normalizing flows and the prior-embedding property of ASVI in a new family of variational programs, which we named cascading flows. A cascading flows program interposes a newly designed highway flow architecture in between the conditional distributions of the prior program such as to steer it toward the observed data. These programs can be constructed automatically from an input probabilistic program and can also be amortized automatically. We evaluate the performance of the new variational programs in a series of structured inference problems. We find that cascading flows have much higher performance than both normalizing flows and ASVI in a large set of structured inference problems.

preprint2020arXiv

End-to-End Pixel-Based Deep Active Inference for Body Perception and Action

We present a pixel-based deep active inference algorithm (PixelAI) inspired by human body perception and action. Our algorithm combines the free-energy principle from neuroscience, rooted in variational inference, with deep convolutional decoders to scale the algorithm to directly deal with raw visual input and provide online adaptive inference. Our approach is validated by studying body perception and action in a simulated and a real Nao robot. Results show that our approach allows the robot to perform 1) dynamical body estimation of its arm using only monocular camera images and 2) autonomous reaching to "imagined" arm poses in the visual space. This suggests that robot and human body perception and action can be efficiently solved by viewing both as an active inference problem guided by ongoing sensory input.

preprint2020arXiv

The Indian Chefs Process

This paper introduces the Indian Chefs Process (ICP), a Bayesian nonparametric prior on the joint space of infinite directed acyclic graphs (DAGs) and orders that generalizes Indian Buffet Processes. As our construction shows, the proposed distribution relies on a latent Beta Process controlling both the orders and outgoing connection probabilities of the nodes, and yields a probability distribution on sparse infinite graphs. The main advantage of the ICP over previously proposed Bayesian nonparametric priors for DAG structures is its greater flexibility. To the best of our knowledge, the ICP is the first Bayesian nonparametric model supporting every possible DAG. We demonstrate the usefulness of the ICP on learning the structure of deep generative sigmoid networks as well as convolutional neural networks.

preprint2020arXiv

Virtual staining for mitosis detection in Breast Histopathology

We propose a virtual staining methodology based on Generative Adversarial Networks to map histopathology images of breast cancer tissue from H&E stain to PHH3 and vice versa. We use the resulting synthetic images to build Convolutional Neural Networks (CNN) for automatic detection of mitotic figures, a strong prognostic biomarker used in routine breast cancer diagnosis and grading. We propose several scenarios, in which CNN trained with synthetically generated histopathology images perform on par with or even better than the same baseline model trained with real images. We discuss the potential of this application to scale the number of training samples without the need for manual annotations.

preprint2016arXiv

Deep disentangled representations for volumetric reconstruction

We introduce a convolutional neural network for inferring a compact disentangled graphical description of objects from 2D images that can be used for volumetric reconstruction. The network comprises an encoder and a twin-tailed decoder. The encoder generates a disentangled graphics code. The first decoder generates a volume, and the second decoder reconstructs the input image using a novel training regime that allows the graphics code to learn a separate representation of the 3D object and a description of its lighting and pose conditions. We demonstrate this method by generating volumes and disentangled graphical descriptions from images and videos of faces and chairs.

preprint2016arXiv

Predicting and visualizing psychological attributions with a deep neural network

Judgments about personality based on facial appearance are strong effectors in social decision making, and are known to have impact on areas from presidential elections to jury decisions. Recent work has shown that it is possible to predict perception of memorability, trustworthiness, intelligence and other attributes in human face images. The most successful of these approaches require face images expertly annotated with key facial landmarks. We demonstrate a Convolutional Neural Network (CNN) model that is able to perform the same task without the need for landmark features, thereby greatly increasing efficiency. The model has high accuracy, surpassing human-level performance in some cases. Furthermore, we use a deconvolutional approach to visualize important features for perception of 22 attributes and demonstrate a new method for separately visualizing positive and negative features.

preprint2014arXiv

Efficient sampling of Gaussian graphical models using conditional Bayes factors

Bayesian estimation of Gaussian graphical models has proven to be challenging because the conjugate prior distribution on the Gaussian precision matrix, the G-Wishart distribution, has a doubly intractable partition function. Recent developments provide a direct way to sample from the G-Wishart distribution, which allows for more efficient algorithms for model selection than previously possible. Still, estimating Gaussian graphical models with more than a handful of variables remains a nearly infeasible task. Here, we propose two novel algorithms that use the direct sampler to more efficiently approximate the posterior distribution of the Gaussian graphical model. The first algorithm uses conditional Bayes factors to compare models in a Metropolis-Hastings framework. The second algorithm is based on a continuous time Markov process. We show that both algorithms are substantially faster than state-of-the-art alternatives. Finally, we show how the algorithms may be used to simultaneously estimate both structural and functional connectivity between subcortical brain regions using resting-state fMRI.

Marcel van Gerven

What is connected

Connect this record

See the researcher in context

Building this map preview

12 published item(s)

Decorrelation Speeds Up Vision Transformers

Evolving Layer-Specific Scalar Functions for Hardware-Aware Transformer Adaptation

Neural Co-state Policies: Structuring Hidden States in Recurrent Reinforcement Learning

How does artificial intelligence contribute to iEEG research?

Automatic structured variational inference

Automatic variational inference with cascading flows

End-to-End Pixel-Based Deep Active Inference for Body Perception and Action

The Indian Chefs Process

Virtual staining for mitosis detection in Breast Histopathology

Deep disentangled representations for volumetric reconstruction

Predicting and visualizing psychological attributions with a deep neural network

Efficient sampling of Gaussian graphical models using conditional Bayes factors