Source author record

Wolfgang Fuhl

Wolfgang Fuhl appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Computer Vision Machine Learning eess.IV Human-Computer Interaction Artificial Intelligence Graphics

Catalog footprint

What is connected

16works

6topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2026arXiv

TEyeD: Over 20 million real-world eye images with Pupil, Eyelid, and Iris 2D and 3D Segmentations, 2D and 3D Landmarks, 3D Eyeball, Gaze Vector, and Eye Movement Types

We present TEyeD, the world's largest unified public data set of eye images taken with head-mounted devices. TEyeD was acquired with seven different head-mounted eye trackers. Among them, two eye trackers were integrated into virtual reality (VR) or augmented reality (AR) devices. The images in TEyeD were obtained from various tasks, including car rides, simulator rides, outdoor sports activities, and daily indoor activities. The data set includes 2D and 3D landmarks, semantic segmentation, 3D eyeball annotation and the gaze vector and eye movement types for all images. Landmarks and semantic segmentation are provided for the pupil, iris and eyelids. Video lengths vary from a few minutes to several hours. With more than 20 million carefully annotated images, TEyeD provides a unique, coherent resource and a valuable foundation for advancing research in the field of computer vision, eye tracking and gaze estimation in modern VR and AR applications. Download: https://es-cloud.cs.uni-tuebingen.de/d/8e2ab8c3fdd444e1a135/?p=%2FTEyeDS&mode=list Alternative Download: https://hctlsrva.edu.sot.tum.de/TEyeDS/

preprint2022arXiv

HPCGen: Hierarchical K-Means Clustering and Level Based Principal Components for Scan Path Genaration

In this paper, we present a new approach for decomposing scan paths and its utility for generating new scan paths. For this purpose, we use the K-Means clustering procedure to the raw gaze data and subsequently iteratively to find more clusters in the found clusters. The found clusters are grouped for each level in the hierarchy, and the most important principal components are computed from the data contained in them. Using this tree hierarchy and the principal components, new scan paths can be generated that match the human behavior of the original data. We show that this generated data is very useful for generating new data for scan path classification but can also be used to generate fake scan paths.

preprint2022arXiv

Technical Report: Combining knowledge from Transfer Learning during training and Wide Resnets

In this report, we combine the idea of Wide ResNets and transfer learning to optimize the architecture of deep neural networks. The first improvement of the architecture is the use of all layers as information source for the last layer. This idea comes from transfer learning, which uses networks pre-trained on other data and extracts different levels of the network as input for the new task. The second improvement is the use of deeper layers instead of deeper sequences of blocks. This idea comes from Wide ResNets. Using both optimizations, both high data augmentation and standard data augmentation can produce better results for different models. Link: https://github.com/wolfgangfuhl/PublicationStuff/tree/master/TechnicalReport1/Supp

preprint2022arXiv

Where and What: Driver Attention-based Object Detection

Human drivers use their attentional mechanisms to focus on critical objects and make decisions while driving. As human attention can be revealed from gaze data, capturing and analyzing gaze information has emerged in recent years to benefit autonomous driving technology. Previous works in this context have primarily aimed at predicting "where" human drivers look at and lack knowledge of "what" objects drivers focus on. Our work bridges the gap between pixel-level and object-level attention prediction. Specifically, we propose to integrate an attention prediction module into a pretrained object detection framework and predict the attention in a grid-based style. Furthermore, critical objects are recognized based on predicted attended-to areas. We evaluate our proposed method on two driver attention datasets, BDD-A and DR(eye)VE. Our framework achieves competitive state-of-the-art performance in the attention prediction on both pixel-level and object-level but is far more efficient (75.3 GFLOPs less) in computation.

preprint2021arXiv

1000 Pupil Segmentations in a Second using Haar Like Features and Statistical Learning

In this paper we present a new approach for pupil segmentation. It can be computed and trained very efficiently, making it ideal for online use for high speed eye trackers as well as for energy saving pupil detection in mobile eye tracking. The approach is inspired by the BORE and CBF algorithms and generalizes the binary comparison by Haar features. Since these features are intrinsically very susceptible to noise and fluctuating light conditions, we combine them with conditional pupil shape probabilities. In addition, we also rank each feature according to its importance in determining the pupil shape. Another advantage of our method is the use of statistical learning, which is very efficient and can even be used online. https://atreus.informatik.uni-tuebingen.de/seafile/d/8e2ab8c3fdd444e1a135/?p=%2FStatsPupil&mode=list

preprint2021arXiv

A Multimodal Eye Movement Dataset and a Multimodal Eye Movement Segmentation Analysis

We present a new dataset with annotated eye movements. The dataset consists of over 800,000 gaze points recorded during a car ride in the real world and in the simulator. In total, the eye movements of 19 subjects were annotated. In this dataset there are several data sources such as the eyelid closure, the pupil center, the optical vector, and a vector into the pupil center starting from the center of the eye corners. These different data sources are analyzed and evaluated individually as well as in combination with respect to their goodness of fit for eye movement classification. These results will help developers of real-time systems and algorithms to find the best data sources for their application. Also, new algorithms can be trained and evaluated on this data set. The data and the Matlab code can be downloaded here https://atreus.informatik.uni-tuebingen.de/seafile/d/8e2ab8c3fdd444e1a135/?p=%2FA%20Multimodal%20Eye%20Movement%20Dataset%20and%20...&mode=list

preprint2021arXiv

Explainable Online Validation of Machine Learning Models for Practical Applications

We present a reformulation of the regression and classification, which aims to validate the result of a machine learning algorithm. Our reformulation simplifies the original problem and validates the result of the machine learning algorithm using the training data. Since the validation of machine learning algorithms must always be explainable, we perform our experiments with the kNN algorithm as well as with an algorithm based on conditional probabilities, which is proposed in this work. For the evaluation of our approach, three publicly available data sets were used and three classification and two regression problems were evaluated. The presented algorithm based on conditional probabilities is also online capable and requires only a fraction of memory compared to the kNN algorithm.

preprint2021arXiv

Fully Convolutional Neural Networks for Raw Eye Tracking Data Segmentation, Generation, and Reconstruction

In this paper, we use fully convolutional neural networks for the semantic segmentation of eye tracking data. We also use these networks for reconstruction, and in conjunction with a variational auto-encoder to generate eye movement data. The first improvement of our approach is that no input window is necessary, due to the use of fully convolutional networks and therefore any input size can be processed directly. The second improvement is that the used and generated data is raw eye tracking data (position X, Y and time) without preprocessing. This is achieved by pre-initializing the filters in the first layer and by building the input tensor along the z axis. We evaluated our approach on three publicly available datasets and compare the results to the state of the art.

preprint2021arXiv

Multi Layer Neural Networks as Replacement for Pooling Operations

Pooling operations, which can be calculated at low cost and serve as a linear or nonlinear transfer function for data reduction, are found in almost every modern neural network. Countless modern approaches have already tackled replacing the common maximum value selection and mean value operations, not to mention providing a function that allows different functions to be selected through changing parameters. Additional neural networks are used to estimate the parameters of these pooling functions.Consequently, pooling layers may require supplementary parameters to increase the complexity of the whole model. In this work, we show that one perceptron can already be used effectively as a pooling operation without increasing the complexity of the model. This kind of pooling allows for the integration of multi-layer neural networks directly into a model as a pooling operation by restructuring the data and, as a result, learnin complex pooling operations. We compare our approach to tensor convolution with strides as a pooling operation and show that our approach is both effective and reduces complexity. The restructuring of the data in combination with multiple perceptrons allows for our approach to be used for upscaling, which can then be utilized for transposed convolutions in semantic segmentation.

preprint2021arXiv

Rotated Ring, Radial and Depth Wise Separable Radial Convolutions

Simple image rotations significantly reduce the accuracy of deep neural networks. Moreover, training with all possible rotations increases the data set, which also increases the training duration. In this work, we address trainable rotation invariant convolutions as well as the construction of nets, since fully connected layers can only be rotation invariant with a one-dimensional input. On the one hand, we show that our approach is rotationally invariant for different models and on different public data sets. We also discuss the influence of purely rotational invariant features on accuracy. The rotationally adaptive convolution models presented in this work are more computationally intensive than normal convolution models. Therefore, we also present a depth wise separable approach with radial convolution. Link to CUDA code https://atreus.informatik.uni-tuebingen.de/seafile/d/8e2ab8c3fdd444e1a135/

preprint2021arXiv

Weight and Gradient Centralization in Deep Neural Networks

Batch normalization is currently the most widely used variant of internal normalization for deep neural networks. Additional work has shown that the normalization of weights and additional conditioning as well as the normalization of gradients further improve the generalization. In this work, we combine several of these methods and thereby increase the generalization of the networks. The advantage of the newer methods compared to the batch normalization is not only increased generalization, but also that these methods only have to be applied during training and, therefore, do not influence the running time during use. Link to CUDA code https://atreus.informatik.uni-tuebingen.de/seafile/d/8e2ab8c3fdd444e1a135/

preprint2020arXiv

Learning to Validate the Quality of Detected Landmarks

We present a new loss function for the validation of image landmarks detected via Convolutional Neural Networks (CNN). The network learns to estimate how accurate its landmark estimation is. This loss function is applicable to all regression-based location estimations and allows the exclusion of unreliable landmarks from further processing. In addition, we formulate a novel batch balancing approach which weights the importance of samples based on their produced loss. This is done by computing a probability distribution mapping on an interval from which samples can be selected using a uniform random selection scheme. We conducted experiments on the 300W, AFLW, and WFLW facial landmark datasets. In the first experiments, the influence of our batch balancing approach is evaluated by comparing it against uniform sampling. In addition, we evaluated the impact of the validation loss on the landmark accuracy based on uniform sampling. The last experiments evaluate the correlation of the validation signal with the landmark accuracy. All experiments were performed for all three datasets.

preprint2020arXiv

Training Decision Trees as Replacement for Convolution Layers

We present an alternative layer to convolution layers in convolutional neural networks (CNNs). Our approach reduces the complexity of convolutions by replacing it with binary decisions. Those binary decisions are used as indexes to conditional distributions where each weight represents a leaf in a decision tree. This means that only the indices to the weights need to be determined once, thus reducing the complexity of convolutions by the depth of the output tensor. Index computation is performed by simple binary decisions that require fewer cycles compared to conventionally used multiplications. In addition, we show how convolutions can be replaced by binary decisions. These binary decisions form indices in the conditional distributions and we show how they are used to replace 2D weight matrices as well as 3D weight tensors. These new layers can be trained like convolution layers in CNNs based on the backpropagation algorithm, for which we provide a formalization. Our results on multiple publicly available data sets show that our approach performs similar to conventional neuronal networks. Beyond the formalized reduction of complexity and the improved qualitative performance, we show the runtime improvement empirically compared to convolution layers.

preprint2016arXiv

PupilNet: Convolutional Neural Networks for Robust Pupil Detection

Real-time, accurate, and robust pupil detection is an essential prerequisite for pervasive video-based eye-tracking. However, automated pupil detection in real-world scenarios has proven to be an intricate challenge due to fast illumination changes, pupil occlusion, non centered and off-axis eye recording, and physiological eye characteristics. In this paper, we propose and evaluate a method based on a novel dual convolutional neural network pipeline. In its first stage the pipeline performs coarse pupil position identification using a convolutional neural network and subregions from a downscaled input image to decrease computational costs. Using subregions derived from a small window around the initial pupil position estimate, the second pipeline stage employs another convolutional neural network to refine this position, resulting in an increased pupil detection rate up to 25% in comparison with the best performing state-of-the-art algorithm. Annotated data sets can be made available upon request.

preprint2015arXiv

Bayesian Identification of Fixations, Saccades, and Smooth Pursuits

Smooth pursuit eye movements provide meaningful insights and information on subject's behavior and health and may, in particular situations, disturb the performance of typical fixation/saccade classification algorithms. Thus, an automatic and efficient algorithm to identify these eye movements is paramount for eye-tracking research involving dynamic stimuli. In this paper, we propose the Bayesian Decision Theory Identification (I-BDT) algorithm, a novel algorithm for ternary classification of eye movements that is able to reliably separate fixations, saccades, and smooth pursuits in an online fashion, even for low-resolution eye trackers. The proposed algorithm is evaluated on four datasets with distinct mixtures of eye movements, including fixations, saccades, as well as straight and circular smooth pursuits; data was collected with a sample rate of 30 Hz from six subjects, totaling 24 evaluation datasets. The algorithm exhibits high and consistent performance across all datasets and movements relative to a manual annotation by a domain expert (recall: μ= 91.42%, σ= 9.52%; precision: μ= 95.60%, σ= 5.29%; specificity μ= 95.41%, σ= 7.02%) and displays a significant improvement when compared to I-VDT, an state-of-the-art algorithm (recall: μ= 87.67%, σ= 14.73%; precision: μ= 89.57%, σ= 8.05%; specificity μ= 92.10%, σ= 11.21%). For algorithm implementation and annotated datasets, please contact the first author.

preprint2015arXiv

ElSe: Ellipse Selection for Robust Pupil Detection in Real-World Environments

Fast and robust pupil detection is an essential prerequisite for video-based eye-tracking in real-world settings. Several algorithms for image-based pupil detection have been proposed, their applicability is mostly limited to laboratory conditions. In realworld scenarios, automated pupil detection has to face various challenges, such as illumination changes, reflections (on glasses), make-up, non-centered eye recording, and physiological eye characteristics. We propose ElSe, a novel algorithm based on ellipse evaluation of a filtered edge image. We aim at a robust, resource-saving approach that can be integrated in embedded architectures e.g. driving. The proposed algorithm was evaluated against four state-of-the-art methods on over 93,000 hand-labeled images from which 55,000 are new images contributed by this work. On average, the proposed method achieved a 14.53% improvement on the detection rate relative to the best state-of-the-art performer. download:ftp://emmapupildata@messor.informatik.unituebingen. de (password:eyedata).

Wolfgang Fuhl

What is connected

Connect this record

See the researcher in context

Building this map preview

16 published item(s)

TEyeD: Over 20 million real-world eye images with Pupil, Eyelid, and Iris 2D and 3D Segmentations, 2D and 3D Landmarks, 3D Eyeball, Gaze Vector, and Eye Movement Types

HPCGen: Hierarchical K-Means Clustering and Level Based Principal Components for Scan Path Genaration

Technical Report: Combining knowledge from Transfer Learning during training and Wide Resnets

Where and What: Driver Attention-based Object Detection

1000 Pupil Segmentations in a Second using Haar Like Features and Statistical Learning

A Multimodal Eye Movement Dataset and a Multimodal Eye Movement Segmentation Analysis

Explainable Online Validation of Machine Learning Models for Practical Applications

Fully Convolutional Neural Networks for Raw Eye Tracking Data Segmentation, Generation, and Reconstruction

Multi Layer Neural Networks as Replacement for Pooling Operations

Rotated Ring, Radial and Depth Wise Separable Radial Convolutions

Weight and Gradient Centralization in Deep Neural Networks

Learning to Validate the Quality of Detected Landmarks

Training Decision Trees as Replacement for Convolution Layers

PupilNet: Convolutional Neural Networks for Robust Pupil Detection

Bayesian Identification of Fixations, Saccades, and Smooth Pursuits

ElSe: Ellipse Selection for Robust Pupil Detection in Real-World Environments