Researcher profile

Kristian Kersting

Kristian Kersting contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 21 - EmergingVerification L1Unclaimed author
36works
0followers
13topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

36 published item(s)

preprint2026arXiv

EmoNet-Voice: A Fine-Grained, Expert-Verified Benchmark for Speech Emotion Detection

Speech emotion recognition (SER) systems are constrained by existing datasets that typically cover only 6-10 basic emotions, lack scale and diversity, and face ethical challenges when collecting sensitive emotional states. We introduce EMONET-VOICE, a comprehensive resource addressing these limitations through two components: (1) EmoNet-Voice Big, a 5,000-hour multilingual pre-training dataset spanning 40 fine-grained emotion categories across 11 voices and 4 languages, and (2) EmoNet-Voice Bench, a rigorously validated benchmark of 4,7k samples with unanimous expert consensus on emotion presence and intensity levels. Using state-of-the-art synthetic voice generation, our privacy-preserving approach enables ethical inclusion of sensitive emotions (e.g., pain, shame) while maintaining controlled experimental conditions. Each sample underwent validation by three psychology experts. We demonstrate that our Empathic Insight models trained on our synthetic data achieve strong real-world dataset generalization, as tested on EmoDB and RAVDESS. Furthermore, our comprehensive evaluation reveals that while high-arousal emotions (e.g., anger: 95% accuracy) are readily detected, the benchmark successfully exposes the difficulty of distinguishing perceptually similar emotions (e.g., sadness vs. distress: 63% discrimination), providing quantifiable metrics for advancing nuanced emotion AI. EMONET-VOICE establishes a new paradigm for large-scale, ethically-sourced, fine-grained SER research.

preprint2026arXiv

Fodor and Pylyshyn's Legacy: Still No Human-like Systematic Compositionality in Neural Networks

Strong meta-learning capabilities for systematic compositionality are emerging as an important skill for navigating the complex and changing tasks of today's world. However, in presenting models for robust adaptation to novel environments, it is important to refrain from making unsupported claims about the performance of meta-learning systems that ultimately do not stand up to scrutiny. While Fodor and Pylyshyn famously posited that neural networks inherently lack this capacity as they are unable to model compositional representations or structure-sensitive operations, and thus are not a viable model of the human mind, Lake and Baroni recently presented meta-learning as a pathway to compositionality. In this position paper, we critically revisit this claim and highlight limitations in the proposed meta-learning framework for compositionality. Our analysis shows that modern neural meta-learning systems can only perform such tasks, if at all, under a very narrow and restricted definition of a meta-learning setup. We therefore claim that `Fodor and Pylyshyn's legacy' persists, and to date, there is no human-like systematic compositionality learned in neural networks.

preprint2026arXiv

Imitation learning for clinical decision support in pediatric ECMO

Pediatric critical care is a dynamic, high-stakes process involving constant monitoring and adjustments in life-saving treatments. Modeling these interventions is crucial for effective decision support. To address the challenges of high complexity and data scarcity in pediatric Extracorporeal Membrane Oxygenation (ECMO), we frame clinical decision-making as learning to act from trajectories, i.e., imitation learning that learns action models from observational data, with a key feature that actions are not directly observed. We consider TabPFN, a recent transformer-based approach for tabular data, and traditional baselines including XGBoost and Multi-Layer Perceptrons(MLPs) on real-world pediatric ECMO data to learn the action models. We find that the TabPFN-based approach consistently outperforms these classical baselines, supporting its use as a strong clinician-behavior baseline for pediatric ECMO decision support.

preprint2026arXiv

Kintsugi: Learning Policies by Repairing Executable Knowledge Bases

Modern embodied agents achieve impressive performance, but their task knowledge is often stored in neural weights, latent state, or prompt-bound memory, making individual policy knowledge difficult to inspect, validate, recombine, and reuse. We introduce \textbf{Kintsugi}, a white-box policy-learning framework that treats embodied policy improvement as verifier-gated construction of a typed executable Knowledge Base (KB). Kintsugi represents task-level policy knowledge as composable typed entries -- predicates, operators, policy schemas, monitors, recovery rules, experience records, and goals -- and improves this artifact through localized typed edits induced from rollout evidence, rather than relying on test-time language-model reasoning. Between rollouts, a tool-constrained agentic editing loop diagnoses trajectory failures, localizes them to editable KB layers, and proposes candidate edits. A deterministic verification gate admits an edit only when the candidate type-checks, the resulting KB executes, and focused validation success or trajectory-health metrics improve without violating protected-regression checks. At inference, the accepted KB is executed by a deterministic symbolic executor with zero LLM calls. Across long-horizon text-agent benchmarks and representative object-centric manipulation settings, Kintsugi achieves strong endpoint performance while preserving inspectability, local editability, and verifier-gated deployment. These results suggest that embodied policy improvement can be organized around executable task knowledge.

preprint2026arXiv

Memory-R1: Enhancing Large Language Model Agents to Manage and Utilize Memories via Reinforcement Learning

Large Language Models (LLMs) have demonstrated impressive capabilities across a wide range of NLP tasks, but they remain fundamentally stateless, constrained by limited context windows that hinder long-horizon reasoning. Recent efforts to address this limitation often augment LLMs with an external memory bank, yet most existing pipelines are static and heuristic-driven, lacking a learned mechanism for deciding what to store, update, or retrieve. We present Memory-R1, a reinforcement learning (RL) framework that equips LLMs with the ability to actively manage and utilize external memory through two specialized agents: a Memory Manager that learns structured operations, including ADD, UPDATE, DELETE, and NOOP; and an Answer Agent that pre-selects and reasons over relevant entries. Both agents are fine-tuned with outcome-driven RL (PPO and GRPO), enabling adaptive memory management with minimal supervision. With only 152 training QA pairs, Memory-R1 outperforms strong baselines and generalizes across diverse question types, three benchmarks (LoCoMo, MSC, LongMemEval), and multiple model scales (3B-14B).

preprint2026arXiv

Neurosymbolic Imitation Learning with Human Guidance: A Privileged Information Approach

Imitation learning is widely used for learning to act in complex environments. While pure neural-based methods handle high dimensional data effectively, they suffer from the requirement of large number of samples and are prone to overfitting. Pure symbolic approaches, while generalize well, do not handle high-dimensional data effectively. We propose a neurosymbolic approach that achieves the best of both worlds, i.e, handling high-dimensional data while achieving generalization. The key advantage of our approach is that it can effectively exploit additional privileged information that is available only during training (in our case, gaze data). Our empirical evaluations demonstrate the effectiveness, efficiency and the generalization capability of our proposed approach.

preprint2026arXiv

Probabilistic Mission Design for Neuro-Symbolic Unmanned Aircraft Systems

Advanced Air Mobility (AAM) is a growing field that demands accurate and trustworthy models of legal concepts and restrictions for navigating Unmanned Aircraft Systems (UAS). In addition, any implementation of AAM needs to face the challenges posed by inherently dynamic and uncertain human-inhabited spaces robustly. Nevertheless, the employment of UAS beyond visual line of sight (BVLOS) is an endearing task that promises to significantly enhance today's logistics and emergency response capabilities. Hence, we propose Probabilistic Mission Design (ProMis), a novel neuro-symbolic approach to navigating UAS within legal frameworks. ProMis is an interpretable and adaptable system architecture that links uncertain geospatial data and noisy perception with declarative, Hybrid Probabilistic Logic Programs (HPLP) to reason over the agent's state space and its legality. To inform planning with legal restrictions and uncertainty in mind, ProMis yields Probabilistic Mission Landscapes (PML). These scalar fields quantify the belief that the HPLP is satisfied across the agent's state space. Extending prior work on ProMis' reasoning capabilities and computational characteristics, we show its integration with potent machine learning models such as Large Language Models (LLM) and Transformer-based vision models. Hence, our experiments underpin the application of ProMis with multi-modal input data and how our method applies to many AAM scenarios.

preprint2024arXiv

Exploiting Cultural Biases via Homoglyphs in Text-to-Image Synthesis

Models for text-to-image synthesis, such as DALL-E~2 and Stable Diffusion, have recently drawn a lot of interest from academia and the general public. These models are capable of producing high-quality images that depict a variety of concepts and styles when conditioned on textual descriptions. However, these models adopt cultural characteristics associated with specific Unicode scripts from their vast amount of training data, which may not be immediately apparent. We show that by simply inserting single non-Latin characters in a textual description, common models reflect cultural stereotypes and biases in their generated images. We analyze this behavior both qualitatively and quantitatively, and identify a model's text encoder as the root cause of the phenomenon. Additionally, malicious users or service providers may try to intentionally bias the image generation to create racist stereotypes by replacing Latin characters with similarly-looking characters from non-Latin scripts, so-called homoglyphs. To mitigate such unnoticed script attacks, we propose a novel homoglyph unlearning method to fine-tune a text encoder, making it robust against homoglyph manipulations.

preprint2022arXiv

Adaptable Adapters

State-of-the-art pretrained NLP models contain a hundred million to trillion parameters. Adapters provide a parameter-efficient alternative for the full finetuning in which we can only finetune lightweight neural network layers on top of pretrained weights. Adapter layers are initialized randomly. However, existing work uses the same adapter architecture -- i.e., the same adapter layer on top of each layer of the pretrained model -- for every dataset, regardless of the properties of the dataset or the amount of available training data. In this work, we introduce adaptable adapters that contain (1) learning different activation functions for different layers and different input data, and (2) a learnable switch to select and only use the beneficial adapter layers. We show that adaptable adapters achieve on-par performances with the standard adapter architecture while using a considerably smaller number of adapter layers. In addition, we show that the selected adapter architecture by adaptable adapters transfers well across different data settings and similar tasks. We propose to use adaptable adapters for designing efficient and effective adapter architectures. The resulting adapters (a) contain about 50% of the learning parameters of the standard adapter and are therefore more efficient at training and inference, and require less storage space, and (b) achieve considerably higher performances in low-data settings.

preprint2022arXiv

Attributions Beyond Neural Networks: The Linear Program Case

Linear Programs (LPs) have been one of the building blocks in machine learning and have championed recent strides in differentiable optimizers for learning systems. While there exist solvers for even high-dimensional LPs, understanding said high-dimensional solutions poses an orthogonal and unresolved problem. We introduce an approach where we consider neural encodings for LPs that justify the application of attribution methods from explainable artificial intelligence (XAI) designed for neural learning systems. The several encoding functions we propose take into account aspects such as feasibility of the decision space, the cost attached to each input, or the distance to special points of interest. We investigate the mathematical consequences of several XAI methods on said neural LP encodings. We empirically show that the attribution methods Saliency and LIME reveal indistinguishable results up to perturbation levels, and we propose the property of Directedness as the main discriminative criterion between Saliency and LIME on one hand, and a perturbation-based Feature Permutation approach on the other hand. Directedness indicates whether an attribution method gives feature attributions with respect to an increase of that feature. We further notice the baseline selection problem beyond the classical computer vision setting for Integrated Gradients.

preprint2022arXiv

Can Machines Help Us Answering Question 16 in Datasheets, and In Turn Reflecting on Inappropriate Content?

Large datasets underlying much of current machine learning raise serious issues concerning inappropriate content such as offensive, insulting, threatening, or might otherwise cause anxiety. This calls for increased dataset documentation, e.g., using datasheets. They, among other topics, encourage to reflect on the composition of the datasets. So far, this documentation, however, is done manually and therefore can be tedious and error-prone, especially for large image datasets. Here we ask the arguably "circular" question of whether a machine can help us reflect on inappropriate content, answering Question 16 in Datasheets. To this end, we propose to use the information stored in pre-trained transformer models to assist us in the documentation process. Specifically, prompt-tuning based on a dataset of socio-moral values steers CLIP to identify potentially inappropriate content, therefore reducing human labor. We then document the inappropriate images found using word clouds, based on captions generated using a vision-language model. The documentations of two popular, large-scale computer vision datasets -- ImageNet and OpenImages -- produced this way suggest that machines can indeed help dataset creators to answer Question 16 on inappropriate image content.

preprint2022arXiv

CLEVA-Compass: A Continual Learning EValuation Assessment Compass to Promote Research Transparency and Comparability

What is the state of the art in continual machine learning? Although a natural question for predominant static benchmarks, the notion to train systems in a lifelong manner entails a plethora of additional challenges with respect to set-up and evaluation. The latter have recently sparked a growing amount of critiques on prominent algorithm-centric perspectives and evaluation protocols being too narrow, resulting in several attempts at constructing guidelines in favor of specific desiderata or arguing against the validity of prevalent assumptions. In this work, we depart from this mindset and argue that the goal of a precise formulation of desiderata is an ill-posed one, as diverse applications may always warrant distinct scenarios. Instead, we introduce the Continual Learning EValuation Assessment Compass: the CLEVA-Compass. The compass provides the visual means to both identify how approaches are practically reported and how works can simultaneously be contextualized in the broader literature landscape. In addition to promoting compact specification in the spirit of recent replication trends, it thus provides an intuitive chart to understand the priorities of individual systems, where they resemble each other, and what elements are missing towards a fair comparison.

preprint2022arXiv

DAFNe: A One-Stage Anchor-Free Approach for Oriented Object Detection

We present DAFNe, a Dense one-stage Anchor-Free deep Network for oriented object detection. As a one-stage model, it performs bounding box predictions on a dense grid over the input image, being architecturally simpler in design, as well as easier to optimize than its two-stage counterparts. Furthermore, as an anchor-free model, it reduces the prediction complexity by refraining from employing bounding box anchors. With DAFNe we introduce an orientation-aware generalization of the center-ness function for arbitrarily oriented bounding boxes to down-weight low-quality predictions and a center-to-corner bounding box prediction strategy that improves object localization performance. Our experiments show that DAFNe outperforms all previous one-stage anchor-free models on DOTA 1.0, DOTA 1.5, and UCAS-AOD and is on par with the best models on HRSC2016.

preprint2022arXiv

Do Multilingual Language Models Capture Differing Moral Norms?

Massively multilingual sentence representations are trained on large corpora of uncurated data, with a very imbalanced proportion of languages included in the training. This may cause the models to grasp cultural values including moral judgments from the high-resource languages and impose them on the low-resource languages. The lack of data in certain languages can also lead to developing random and thus potentially harmful beliefs. Both these issues can negatively influence zero-shot cross-lingual model transfer and potentially lead to harmful outcomes. Therefore, we aim to (1) detect and quantify these issues by comparing different models in different languages, (2) develop methods for improving undesirable properties of the models. Our initial experiments using the multilingual model XLM-R show that indeed multilingual LMs capture moral norms, even with potentially higher human-agreement than monolingual ones. However, it is not yet clear to what extent these moral norms differ between languages.

preprint2022arXiv

Finding Structure and Causality in Linear Programs

Linear Programs (LP) are celebrated widely, particularly so in machine learning where they have allowed for effectively solving probabilistic inference tasks or imposing structure on end-to-end learning systems. Their potential might seem depleted but we propose a foundational, causal perspective that reveals intriguing intra- and inter-structure relations for LP components. We conduct a systematic, empirical investigation on general-, shortest path- and energy system LPs.

preprint2022arXiv

Gradient-based Counterfactual Explanations using Tractable Probabilistic Models

Counterfactual examples are an appealing class of post-hoc explanations for machine learning models. Given input $x$ of class $y_1$, its counterfactual is a contrastive example $x^\prime$ of another class $y_0$. Current approaches primarily solve this task by a complex optimization: define an objective function based on the loss of the counterfactual outcome $y_0$ with hard or soft constraints, then optimize this function as a black-box. This "deep learning" approach, however, is rather slow, sometimes tricky, and may result in unrealistic counterfactual examples. In this work, we propose a novel approach to deal with these problems using only two gradient computations based on tractable probabilistic models. First, we compute an unconstrained counterfactual $u$ of $x$ to induce the counterfactual outcome $y_0$. Then, we adapt $u$ to higher density regions, resulting in $x^{\prime}$. Empirical evidence demonstrates the dominant advantages of our approach.

preprint2022arXiv

Interactive Disentanglement: Learning Concepts by Interacting with their Prototype Representations

Learning visual concepts from raw images without strong supervision is a challenging task. In this work, we show the advantages of prototype representations for understanding and revising the latent space of neural concept learners. For this purpose, we introduce interactive Concept Swapping Networks (iCSNs), a novel framework for learning concept-grounded representations via weak supervision and implicit prototype representations. iCSNs learn to bind conceptual information to specific prototype slots by swapping the latent representations of paired images. This semantically grounded and discrete latent space facilitates human understanding and human-machine interaction. We support this claim by conducting experiments on our novel data set "Elementary Concept Reasoning" (ECR), focusing on visual concepts shared by geometric objects.

preprint2022arXiv

Interactively Providing Explanations for Transformer Language Models

Transformer language models are state of the art in a multitude of NLP tasks. Despite these successes, their opaqueness remains problematic. Recent methods aiming to provide interpretability and explainability to black-box models primarily focus on post-hoc explanations of (sometimes spurious) input-output correlations. Instead, we emphasize using prototype networks directly incorporated into the model architecture and hence explain the reasoning process behind the network's decisions. Our architecture performs on par with several language models and, moreover, enables learning from user interactions. This not only offers a better understanding of language models but uses human capabilities to incorporate knowledge outside of the rigid range of purely data-driven approaches.

preprint2022arXiv

Large Pre-trained Language Models Contain Human-like Biases of What is Right and Wrong to Do

Artificial writing is permeating our lives due to recent advances in large-scale, transformer-based language models (LMs) such as BERT, its variants, GPT-2/3, and others. Using them as pre-trained models and fine-tuning them for specific tasks, researchers have extended state of the art for many NLP tasks and shown that they capture not only linguistic knowledge but also retain general knowledge implicitly present in the data. Unfortunately, LMs trained on unfiltered text corpora suffer from degenerated and biased behaviour. While this is well established, we show that recent LMs also contain human-like biases of what is right and wrong to do, some form of ethical and moral norms of the society -- they bring a "moral direction" to surface. That is, we show that these norms can be captured geometrically by a direction, which can be computed, e.g., by a PCA, in the embedding space, reflecting well the agreement of phrases to social norms implicitly expressed in the training texts and providing a path for attenuating or even preventing toxic degeneration in LMs. Being able to rate the (non-)normativity of arbitrary phrases without explicitly training the LM for this task, we demonstrate the capabilities of the "moral direction" for guiding (even other) LMs towards producing normative text and showcase it on RealToxicityPrompts testbed, preventing the neural toxic degeneration in GPT-2.

preprint2022arXiv

LogicRank: Logic Induced Reranking for Generative Text-to-Image Systems

Text-to-image models have recently achieved remarkable success with seemingly accurate samples in photo-realistic quality. However as state-of-the-art language models still struggle evaluating precise statements consistently, so do language model based image generation processes. In this work we showcase problems of state-of-the-art text-to-image models like DALL-E with generating accurate samples from statements related to the draw bench benchmark. Furthermore we show that CLIP is not able to rerank those generated samples consistently. To this end we propose LogicRank, a neuro-symbolic reasoning framework that can result in a more accurate ranking-system for such precision-demanding settings. LogicRank integrates smoothly into the generation process of text-to-image models and moreover can be used to further fine-tune towards a more logical precise model.

preprint2022arXiv

Machines Explaining Linear Programs

There has been a recent push in making machine learning models more interpretable so that their performance can be trusted. Although successful, these methods have mostly focused on the deep learning methods while the fundamental optimization methods in machine learning such as linear programs (LP) have been left out. Even if LPs can be considered as whitebox or clearbox models, they are not easy to understand in terms of relationships between inputs and outputs. As a linear program only provides the optimal solution to an optimization problem, further explanations are often helpful. In this work, we extend the attribution methods for explaining neural networks to linear programs. These methods explain the model by providing relevance scores for the model inputs, to show the influence of each input on the output. Alongside using classical gradient-based attribution methods we also propose a way to adapt perturbation-based attribution methods to LPs. Our evaluations of several different linear and integer problems showed that attribution methods can generate useful explanations for linear programs. However, we also demonstrate that using a neural attribution method directly might come with some drawbacks, as the properties of these methods on neural networks do not necessarily transfer to linear programs. The methods can also struggle if a linear program has more than one optimal solution, as a solver just returns one possible solution. Our results can hopefully be used as a good starting point for further research in this direction.

preprint2022arXiv

Neuro-Symbolic Verification of Deep Neural Networks

Formal verification has emerged as a powerful approach to ensure the safety and reliability of deep neural networks. However, current verification tools are limited to only a handful of properties that can be expressed as first-order constraints over the inputs and output of a network. While adversarial robustness and fairness fall under this category, many real-world properties (e.g., "an autonomous vehicle has to stop in front of a stop sign") remain outside the scope of existing verification technology. To mitigate this severe practical restriction, we introduce a novel framework for verifying neural networks, named neuro-symbolic verification. The key idea is to use neural networks as part of the otherwise logical specification, enabling the verification of a wide variety of complex, real-world properties, including the one above. Moreover, we demonstrate how neuro-symbolic verification can be implemented on top of existing verification infrastructure for neural networks, making our framework easily accessible to researchers and practitioners alike.

preprint2022arXiv

Plug & Play Attacks: Towards Robust and Flexible Model Inversion Attacks

Model inversion attacks (MIAs) aim to create synthetic images that reflect the class-wise characteristics from a target classifier's private training data by exploiting the model's learned knowledge. Previous research has developed generative MIAs that use generative adversarial networks (GANs) as image priors tailored to a specific target model. This makes the attacks time- and resource-consuming, inflexible, and susceptible to distributional shifts between datasets. To overcome these drawbacks, we present Plug & Play Attacks, which relax the dependency between the target model and image prior, and enable the use of a single GAN to attack a wide range of targets, requiring only minor adjustments to the attack. Moreover, we show that powerful MIAs are possible even with publicly available pre-trained GANs and under strong distributional shifts, for which previous approaches fail to produce meaningful results. Our extensive evaluation confirms the improved robustness and flexibility of Plug & Play Attacks and their ability to create high-quality images revealing sensitive class characteristics.

preprint2022arXiv

Right for the Right Latent Factors: Debiasing Generative Models via Disentanglement

A key assumption of most statistical machine learning methods is that they have access to independent samples from the distribution of data they encounter at test time. As such, these methods often perform poorly in the face of biased data, which breaks this assumption. In particular, machine learning models have been shown to exhibit Clever-Hans-like behaviour, meaning that spurious correlations in the training set are inadvertently learnt. A number of works have been proposed to revise deep classifiers to learn the right correlations. However, generative models have been overlooked so far. We observe that generative models are also prone to Clever-Hans-like behaviour. To counteract this issue, we propose to debias generative models by disentangling their internal representations, which is achieved via human feedback. Our experiments show that this is effective at removing bias even when human feedback covers only a small fraction of the desired distribution. In addition, we achieve strong disentanglement results in a quantitative comparison with recent methods.

preprint2022arXiv

Tearing Apart NOTEARS: Controlling the Graph Prediction via Variance Manipulation

Simulations are ubiquitous in machine learning. Especially in graph learning, simulations of Directed Acyclic Graphs (DAG) are being deployed for evaluating new algorithms. In the literature, it was recently argued that continuous-optimization approaches to structure discovery such as NOTEARS might be exploiting the sortability of the variable's variances in the available data due to their use of least square losses. Specifically, since structure discovery is a key problem in science and beyond, we want to be invariant to the scale being used for measuring our data (e.g. meter versus centimeter should not affect the causal direction inferred by the algorithm). In this work, we further strengthen this initial, negative empirical suggestion by both proving key results in the multivariate case and corroborating with further empirical evidence. In particular, we show that we can control the resulting graph with our targeted variance attacks, even in the case where we can only partially manipulate the variances of the data.

preprint2022arXiv

User-Level Label Leakage from Gradients in Federated Learning

Federated learning enables multiple users to build a joint model by sharing their model updates (gradients), while their raw data remains local on their devices. In contrast to the common belief that this provides privacy benefits, we here add to the very recent results on privacy risks when sharing gradients. Specifically, we investigate Label Leakage from Gradients (LLG), a novel attack to extract the labels of the users' training data from their shared gradients. The attack exploits the direction and magnitude of gradients to determine the presence or absence of any label. LLG is simple yet effective, capable of leaking potential sensitive information represented by labels, and scales well to arbitrary batch sizes and multiple classes. We mathematically and empirically demonstrate the validity of the attack under different settings. Moreover, empirical results show that LLG successfully extracts labels with high accuracy at the early stages of model training. We also discuss different defense mechanisms against such leakage. Our findings suggest that gradient compression is a practical technique to mitigate the attack.

preprint2020arXiv

Alfie: An Interactive Robot with a Moral Compass

This work introduces Alfie, an interactive robot that is capable of answering moral (deontological) questions of a user. The interaction of Alfie is designed in a way in which the user can offer an alternative answer when the user disagrees with the given answer so that Alfie can learn from its interactions. Alfie's answers are based on a sentence embedding model that uses state-of-the-art language models, e.g. Universal Sentence Encoder and BERT. Alfie is implemented on a Furhat Robot, which provides a customizable user interface to design a social robot.

preprint2020arXiv

Conflict and Cooperation: AI Research and Development in terms of the Economy of Conventions

Artificial Intelligence (AI) and its relation with societies is increasingly becoming an interesting object of study from the perspective of sociology and other disciplines. Theories such as the Economy of Conventions (EC) are usually applied in the context of interpersonal relations but there is still a clear lack of studies around how this and other theories can shed light on interactions between human an autonomous systems. This work is focused into studying a preliminary step that is a key enabler for the subsequent interaction between machines and humans: how the processes of researching, designing and developing AI related systems reflect different moral registers, represented by conventions within the EC. Having a better understanding of those conventions guiding the advances in AI is considered as the first and required advance to understand the conventions afterwards reflected by those autonomous systems in the interactions with societies. For this purpose, we develop an iterative tool based on active learning to label a data set from the field of AI and Machine Learning (ML) research and present preliminary results of a supervised classifier trained on these conventions. To further demonstrate the feasibility of the approach, the results are contrasted with a classifier trained on software conventions.

preprint2020arXiv

CryptoSPN: Privacy-preserving Sum-Product Network Inference

AI algorithms, and machine learning (ML) techniques in particular, are increasingly important to individuals' lives, but have caused a range of privacy concerns addressed by, e.g., the European GDPR. Using cryptographic techniques, it is possible to perform inference tasks remotely on sensitive client data in a privacy-preserving way: the server learns nothing about the input data and the model predictions, while the client learns nothing about the ML model (which is often considered intellectual property and might contain traces of sensitive data). While such privacy-preserving solutions are relatively efficient, they are mostly targeted at neural networks, can degrade the predictive accuracy, and usually reveal the network's topology. Furthermore, existing solutions are not readily accessible to ML experts, as prototype implementations are not well-integrated into ML frameworks and require extensive cryptographic knowledge. In this paper, we present CryptoSPN, a framework for privacy-preserving inference of sum-product networks (SPNs). SPNs are a tractable probabilistic graphical model that allows a range of exact inference queries in linear time. Specifically, we show how to efficiently perform SPN inference via secure multi-party computation (SMPC) without accuracy degradation while hiding sensitive client and training information with provable security guarantees. Next to foundations, CryptoSPN encompasses tools to easily transform existing SPNs into privacy-preserving executables. Our empirical results demonstrate that CryptoSPN achieves highly efficient and accurate inference in the order of seconds for medium-sized SPNs.

preprint2020arXiv

Fitted Q-Learning for Relational Domains

We consider the problem of Approximate Dynamic Programming in relational domains. Inspired by the success of fitted Q-learning methods in propositional settings, we develop the first relational fitted Q-learning algorithms by representing the value function and Bellman residuals. When we fit the Q-functions, we show how the two steps of Bellman operator; application and projection steps can be performed using a gradient-boosting technique. Our proposed framework performs reasonably well on standard domains without using domain models and using fewer training trajectories.

preprint2020arXiv

Neural Networks for Relational Data

While deep networks have been enormously successful over the last decade, they rely on flat-feature vector representations, which makes them unsuitable for richly structured domains such as those arising in applications like social network analysis. Such domains rely on relational representations to capture complex relationships between entities and their attributes. Thus, we consider the problem of learning neural networks for relational data. We distinguish ourselves from current approaches that rely on expert hand-coded rules by learning relational random-walk-based features to capture local structural interactions and the resulting network architecture. We further exploit parameter tying of the network weights of the resulting relational neural network, where instances of the same type share parameters. Our experimental results across several standard relational data sets demonstrate the effectiveness of the proposed approach over multiple neural net baselines as well as state-of-the-art statistical relational models.

preprint2020arXiv

Neural-Symbolic Argumentation Mining: an Argument in Favor of Deep Learning and Reasoning

Deep learning is bringing remarkable contributions to the field of argumentation mining, but the existing approaches still need to fill the gap toward performing advanced reasoning tasks. In this position paper, we posit that neural-symbolic and statistical relational learning could play a crucial role in the integration of symbolic and sub-symbolic methods to achieve this goal.

preprint2020arXiv

Padé Activation Units: End-to-end Learning of Flexible Activation Functions in Deep Networks

The performance of deep network learning strongly depends on the choice of the non-linear activation function associated with each neuron. However, deciding on the best activation is non-trivial, and the choice depends on the architecture, hyper-parameters, and even on the dataset. Typically these activations are fixed by hand before training. Here, we demonstrate how to eliminate the reliance on first picking fixed activation functions by using flexible parametric rational functions instead. The resulting Padé Activation Units (PAUs) can both approximate common activation functions and also learn new ones while providing compact representations. Our empirical evidence shows that end-to-end learning deep networks with PAUs can increase the predictive performance. Moreover, PAUs pave the way to approximations with provable robustness. https://github.com/ml-research/pau

preprint2020arXiv

Structured Object-Aware Physics Prediction for Video Modeling and Planning

When humans observe a physical system, they can easily locate objects, understand their interactions, and anticipate future behavior, even in settings with complicated and previously unseen interactions. For computers, however, learning such models from videos in an unsupervised fashion is an unsolved research problem. In this paper, we present STOVE, a novel state-space model for videos, which explicitly reasons about objects and their positions, velocities, and interactions. It is constructed by combining an image model and a dynamics model in compositional manner and improves on previous work by reusing the dynamics model for inference, accelerating and regularizing training. STOVE predicts videos with convincing physical behavior over hundreds of timesteps, outperforms previous unsupervised models, and even approaches the performance of supervised baselines. We further demonstrate the strength of our model as a simulator for sample efficient model-based control in a task with heavily interacting objects.

preprint2020arXiv

TUDataset: A collection of benchmark datasets for learning with graphs

Recently, there has been an increasing interest in (supervised) learning with graph data, especially using graph neural networks. However, the development of meaningful benchmark datasets and standardized evaluation procedures is lagging, consequently hindering advancements in this area. To address this, we introduce the TUDataset for graph classification and regression. The collection consists of over 120 datasets of varying sizes from a wide range of applications. We provide Python-based data loaders, kernel and graph neural network baseline implementations, and evaluation tools. Here, we give an overview of the datasets, standardized evaluation procedures, and provide baseline experiments. All datasets are available at www.graphlearning.io. The experiments are fully reproducible from the code available at www.github.com/chrsmrrs/tudataset.

preprint2019arXiv

Learning to play the Chess Variant Crazyhouse above World Champion Level with Deep Neural Networks and Human Data

Deep neural networks have been successfully applied in learning the board games Go, chess and shogi without prior knowledge by making use of reinforcement learning. Although starting from zero knowledge has been shown to yield impressive results, it is associated with high computationally costs especially for complex games. With this paper, we present CrazyAra which is a neural network based engine solely trained in supervised manner for the chess variant crazyhouse. Crazyhouse is a game with a higher branching factor than chess and there is only limited data of lower quality available compared to AlphaGo. Therefore, we focus on improving efficiency in multiple aspects while relying on low computational resources. These improvements include modifications in the neural network design and training configuration, the introduction of a data normalization step and a more sample efficient Monte-Carlo tree search which has a lower chance to blunder. After training on 569,537 human games for 1.5 days we achieve a move prediction accuracy of 60.4%. During development, versions of CrazyAra played professional human players. Most notably, CrazyAra achieved a four to one win over 2017 crazyhouse world champion Justin Tan (aka LM Jann Lee) who is more than 400 Elo higher rated compared to the average player in our training set. Furthermore, we test the playing strength of CrazyAra on CPU against all participants of the second Crazyhouse Computer Championships 2017, winning against twelve of the thirteen participants. Finally, for CrazyAraFish we continue training our model on generated engine games. In ten long-time control matches playing Stockfish 10, CrazyAraFish wins three games and draws one out of ten matches.