Researcher profile

Matthias Zeppelzauer

Matthias Zeppelzauer contributes to research discovery and scholarly infrastructure.

Published work

5 published items

Preprint, 2022, arXiv

$k$-Anonymity in Practice: How Generalisation and Suppression Affect Machine Learning Classifiers

The protection of private information is a crucial issue in data-driven research and business contexts. Typically, techniques like anonymisation or (selective) deletion are introduced in order to allow data sharing, e.g. in the case of collaborative research endeavours. For use with anonymisation techniques, the $k$-anonymity criterion is one of the most popular, with numerous scientific publications on different algorithms and metrics. Anonymisation techniques often require changing the data and thus necessarily affect the results of machine learning models trained on the underlying data. In this work, we conduct a systematic comparison and detailed investigation into the effects of different $k$-anonymisation algorithms on the results of machine learning models. We investigate a set of popular $k$-anonymisation algorithms with different classifiers and evaluate them on different real-world datasets. Our systematic evaluation shows that with an increasingly strong $k$-anonymity constraint, the classification performance generally degrades, but to varying degrees and strongly depending on the dataset and anonymisation method. Furthermore, Mondrian can be considered as the method with the most appealing properties for subsequent classification.

Preprint, 2022, arXiv

Automatic Sexism Detection with Multilingual Transformer Models

Sexism has become an increasingly major problem on social networks during the last years. The first shared task on sEXism Identification in Social neTworks (EXIST) at IberLEF 2021 is an international competition in the field of Natural Language Processing (NLP) with the aim to automatically identify sexism in social media content by applying machine learning methods. Thereby sexism detection is formulated as a coarse (binary) classification problem and a fine-grained classification task that distinguishes multiple types of sexist content (e.g., dominance, stereotyping, and objectification). This paper presents the contribution of the AIT_FHSTP team at the EXIST2021 benchmark for both tasks. To solve the tasks we applied two multilingual transformer models, one based on multilingual BERT and one based on XLM-R. Our approach uses two different strategies to adapt the transformers to the detection of sexist content: first, unsupervised pre-training with additional data and second, supervised fine-tuning with additional and augmented data. For both tasks our best model is XLM-R with unsupervised pre-training on the EXIST data and additional datasets and fine-tuning on the provided dataset. The best run for the binary classification (task 1) achieves a macro F1-score of 0.7752 and scores 5th rank in the benchmark; for the multiclass classification (task 2) our best submission scores 6th rank with a macro F1-score of 0.5589.

Preprint, 2020, arXiv

Machine Unlearning: Linear Filtration for Logit-based Classifiers

Recently enacted legislation grants individuals certain rights to decide in what fashion their personal data may be used, and in particular a "right to be forgotten". This poses a challenge to machine learning: how to proceed when an individual retracts permission to use data which has been part of the training process of a model? From this question emerges the field of machine unlearning, which could be broadly described as the investigation of how to "delete training data from models". Our work complements this direction of research for the specific setting of class-wide deletion requests for classification models (e.g. deep neural networks). As a first step, we propose linear filtration as an intuitive, computationally efficient sanitization method. Our experiments demonstrate benefits in an adversarial setting over naive deletion schemes.

Preprint, 2020, arXiv

On the Explanation of Machine Learning Predictions in Clinical Gait Analysis

Machine learning (ML) is increasingly used to support decision-making in the healthcare sector. While ML approaches provide promising results with regard to their classification performance, most share a central limitation, namely their black-box character. Motivated by the interest to understand the functioning of ML models, methods from the field of Explainable Artificial Intelligence (XAI) have recently become important. This article investigates the usefulness of XAI methods in clinical gait classification. For this purpose, predictions of state-of-the-art classification methods are explained with an established XAI method, i.e., Layer-wise Relevance Propagation (LRP). We propose to evaluate the obtained explanations with two complementary approaches: a statistical analysis of the underlying data using Statistical Parametric Mapping and a qualitative evaluation by a clinical expert. A gait dataset comprising ground reaction force measurements from 132 patients with different lower-body gait disorders and 62 healthy controls is utilized. We investigate several gait classification tasks, employ multiple classification methods, and analyze the impact of data normalization and different signal components for classification performance and explanation quality. Our experiments show that explanations obtained by LRP exhibit promising statistical properties concerning inter-class discriminativity and are also in line with clinically relevant biomechanical gait characteristics.

Preprint, 2017, arXiv

Automatic Classification of Functional Gait Disorders

This article proposes a comprehensive investigation of the automatic classification of functional gait disorders based solely on ground reaction force (GRF) measurements. The aim of the study is twofold: (1) to investigate the suitability of state-of-the-art GRF parameterization techniques (representations) for the discrimination of functional gait disorders; and (2) to provide a first performance baseline for the automated classification of functional gait disorders for a large-scale dataset. The utilized database comprises GRF measurements from 279 patients with gait disorders (GDs) and data from 161 healthy controls (N). Patients were manually classified into four classes with different functional impairments associated with the "hip", "knee", "ankle", and "calcaneus". Different parameterizations are investigated: GRF parameters, global principal component analysis (PCA)-based representations and a combined representation applying PCA on GRF parameters. The discriminative power of each parameterization for different classes is investigated by linear discriminant analysis (LDA). Based on this analysis, two classification experiments are pursued: (1) distinction between healthy and impaired gait (N vs. GD) and (2) multi-class classification between healthy gait and all four GD classes. Experiments show promising results and reveal, among other things, that several factors, such as imbalanced class cardinalities and varying numbers of measurement sessions per patient, have a strong impact on the classification accuracy and therefore need to be taken into account. The results represent a promising first step towards the automated classification of gait disorders and a first performance baseline for future developments in this direction.