Source author record

Inga Strümke

Inga Strümke appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Machine Learning, Artificial Intelligence, Cryptography and Security, cs.CY, Computer Vision, econ.EM, eess.IV, Quantitative Methods, Robotics, Software Engineering

Catalog footprint

What is connected

12 works

10 topics

4 close collaborators

preprint, 2026, arXiv

Backdoor Channels Hidden in Latent Space: Cryptographic Undetectability in Modern Neural Networks

Recent cryptographic results establish that neural networks can be backdoored such that no efficient algorithm can distinguish them from a clean model. These guarantees, however, have been confined to stylised architectures of limited practical relevance, leaving open whether comparable undetectability extends to modern, end-to-end trained networks. We construct such an attack mechanism for state-of-the-art architectures, closely aligned to the cryptographic notion of undetectability, by identifying backdoor channels as learned latent directions, and show that the question of undetectability reduces to a hypothesis test between two unknown distributions over model parameters, which we conjecture to be intractable in practice. The consequence of this reframing is significant: if exploitable channels within a network's latent space are statistically indistinguishable from naturally learned directions, an attacker need not introduce foreign structure but can instead exploit the geometry the network already possesses. Demonstrating the approach on ResNet and Vision Transformer architectures trained on standard image classification datasets, the attack achieves consistently high success rates with negligible degradation of clean accuracy, and it resists a comprehensive suite of post-training defences, none of which neutralise the backdoor without rendering the model unusable. Our results establish that cryptographic backdoors need not be artefacts requiring exotic architectures or artificial constructions, but can be identified as latent properties inherent to the geometry of learned representations.

preprint, 2026, arXiv

Position: AI Security Policy Should Target Systems, Not Models

We present swarm-attack, an open-source adversarial testing framework in which multiple lightweight LLM agents coordinate through shared memory, parallel exploration, and evolutionary optimization. Together, our results demonstrate that both safety bypass of frontier models and software vulnerability discovery, i.e., the capability class that motivated restricted release of Anthropic's Mythos Preview, are achievable at effectively zero cost using commodity hardware and openly available models. We report two experiments. In the first, five instances of a 1.2 billion parameter model conducted 225 jailbreak attacks each against GPT-4o and Claude Sonnet 4. Against GPT-4o, the swarm achieved an Effective Harm Rate of 45.8%, producing 49 critical-severity breaches; against Claude Sonnet 4, the Effective Harm Rate was 0% despite a 40% technical success rate. In the second experiment, the same models performed combined source code analysis and binary fuzzing against a vulnerable C application with 9 planted CWEs. With a hand-crafted exploit seed corpus, regex pattern detection, and AddressSanitizer-based crash classification, the pipeline recovers 9 of 9 vulnerabilities (100% recall) in approximately four minutes on a consumer MacBook. With those scaffold components disabled, the same model recovers 0 of 9 by crash verification and 2 of 9 by citation. The capability class that motivated restricted release of Anthropic's Mythos Preview is therefore reproducible at effectively zero cost; the important enabler is the system scaffold itself, which compensates for the limited reasoning capacity of small individual models.

preprint, 2026, arXiv

What Does Explainable AI Mean in Practice? Evaluative Requirements from a Longitudinal Clinical Case Study

This paper reports a case study on how explainability requirements were elicited during the development of an AI system for predicting cerebral palsy (CP) risk in infants. Over 18 months, we followed a development team and hospital clinicians as they sought to design explanations that would make the AI system trustworthy. Contrary to the assumption that users need detailed explanations of the inner workings of AI systems, our findings show that clinicians trusted it when it enabled them to evaluate predictions against their own assessments. Our findings show how a simple prediction graph proved effective by supporting clinicians' existing decision-making practices. Drawing on concepts from both Requirements Engineering and Explainable AI, we use the theoretical lens of Evaluative AI to introduce the notion of Evaluative Requirements: system requirements that allow users to scrutinize AI outputs and compare them with their own assessments. Our study demonstrates that such requirements are best discovered through the well-known methods of iterative prototyping and observation, making them essential for building trustworthy AI systems in expert domains.

preprint, 2023, arXiv

Against Algorithmic Exploitation of Human Vulnerabilities

Decisions such as which movie to watch next, which song to listen to, or which product to buy online, are increasingly influenced by recommender systems and user models that incorporate information on users' past behaviours, preferences, and digitally created content. Machine learning models that enable recommendations and that are trained on user data may unintentionally leverage information on human characteristics that are considered vulnerabilities, such as depression, young age, or gambling addiction. The use of algorithmic decisions based on latent vulnerable state representations could be considered manipulative and could have a deteriorating impact on the condition of vulnerable individuals. In this paper, we are concerned with the problem of machine learning models inadvertently modelling vulnerabilities, and want to raise awareness for this issue to be considered in legislation and AI ethics. Hence, we define and describe common vulnerabilities, and illustrate cases where they are likely to play a role in algorithmic decision-making. We propose a set of requirements for methods to detect the potential for vulnerability modelling, detect whether vulnerable groups are treated differently by a model, and detect whether a model has created an internal representation of vulnerability. We conclude that explainable artificial intelligence methods may be necessary for detecting vulnerability exploitation by machine learning-based recommendation systems.

preprint, 2022, arXiv

Explainability for identification of vulnerable groups in machine learning models

If a prediction model identifies vulnerable individuals or groups, the use of that model may become an ethical issue. But can we know that this is what a model does? Machine learning fairness as a field is focused on the just treatment of individuals and groups under information processing with machine learning methods. While considerable attention has been given to mitigating discrimination of protected groups, vulnerable groups have not received the same attention. Unlike protected groups, which can be regarded as always vulnerable, a vulnerable group may be vulnerable in one context but not in another. This raises new challenges on how and when to protect vulnerable individuals and groups under machine learning. Methods from explainable artificial intelligence (XAI), in contrast, do consider more contextual issues and are concerned with answering the question "why was this decision made?". Neither existing fairness nor existing explainability methods allow us to ascertain if a prediction model identifies vulnerability. We discuss this problem and propose approaches for analysing prediction models in this respect.

preprint, 2022, arXiv

Explaining a Deep Reinforcement Learning Docking Agent Using Linear Model Trees with User Adapted Visualization

Deep neural networks (DNNs) can be useful within the marine robotics field, but their utility value is restricted by their black-box nature. Explainable artificial intelligence methods attempt to understand how such black-boxes make their decisions. In this work, linear model trees (LMTs) are used to approximate the DNN controlling an autonomous surface vessel (ASV) in a simulated environment and then run in parallel with the DNN to give explanations in the form of feature attributions in real-time. How well a model can be understood depends not only on the explanation itself, but also on how well it is presented and adapted to the receiver of said explanation. Different end-users may need both different types of explanations, as well as different representations of these. The main contributions of this work are (1) significantly improving both the accuracy and the build time of a greedy approach for building LMTs by introducing ordering of features in the splitting of the tree, (2) giving an overview of the characteristics of the seafarer/operator and the developer as two different end-users of the agent and receiver of the explanations, and (3) suggesting a visualization of the docking agent, the environment, and the feature attributions given by the LMT for when the developer is the end-user of the system, and another visualization for when the seafarer or operator is the end-user, based on their different characteristics.

preprint, 2022, arXiv

Predicting tacrolimus exposure in kidney transplanted patients using machine learning

Tacrolimus is one of the cornerstone immunosuppressive drugs in most transplantation centers worldwide following solid organ transplantation. Therapeutic drug monitoring of tacrolimus is necessary in order to avoid rejection of the transplanted organ or severe side effects. However, finding the right dose for a given patient is challenging, even for experienced clinicians. Consequently, a tool that can accurately estimate the drug exposure for individual dose adaptions would be of high clinical value. In this work, we propose a new technique using machine learning to estimate the tacrolimus exposure in kidney transplant recipients. Our models achieve predictive errors that are at the same level as an established population pharmacokinetic model, but are faster to develop and require less knowledge about the pharmacokinetic properties of the drug.

preprint, 2022, arXiv

Socioeconomic disparities and COVID-19: the causal connections

The analysis of causation is a challenging task that can be approached in various ways. With the increasing use of machine learning based models in computational socioeconomics, explaining these models while taking causal connections into account is a necessity. In this work, we advocate the use of an explanatory framework from cooperative game theory augmented with do-calculus, namely causal Shapley values. Using causal Shapley values, we analyze socioeconomic disparities that have a causal link to the spread of COVID-19 in the USA. We study several phases of the disease spread to show how the causal connections change over time. We perform a causal analysis using random effects models and discuss the correspondence between the two methods to verify our results. We show the distinct advantages that non-linear machine learning models have over linear models when performing a multivariate analysis, especially since the machine learning models can map out non-linear correlations in the data. In addition, the causal Shapley values allow for including the causal structure in the variable importance computed for the machine learning model.

preprint, 2022, arXiv

Visual explanations for polyp detection: How medical doctors assess intrinsic versus extrinsic explanations

Deep learning has in recent years achieved immense success in all areas of computer vision and has the potential of assisting medical doctors in analyzing visual content for disease and other abnormalities. However, the current state of deep learning is very much a black box, making medical professionals highly skeptical about integrating these methods into clinical practice. Several methods have been proposed in order to shine some light onto these black boxes, but there is no consensus on the opinion of the medical doctors that will consume these explanations. This paper presents a study asking medical doctors about their opinion of current state-of-the-art explainable artificial intelligence methods when applied to a gastrointestinal disease detection use case. We compare two different categories of explanation methods, intrinsic and extrinsic, and gauge their opinion of the current value of these explanations. The results indicate that intrinsic explanations are preferred and that explanation.

preprint, 2021, arXiv

Explaining the data or explaining a model? Shapley values that uncover non-linear dependencies

Shapley values have become increasingly popular in the machine learning literature thanks to their attractive axiomatisation, flexibility, and uniqueness in satisfying certain notions of 'fairness'. The flexibility arises from the myriad potential forms of the Shapley value game formulation. Amongst the consequences of this flexibility is that there are now many types of Shapley values being discussed, with such variety being a source of potential misunderstanding. To the best of our knowledge, all existing game formulations in the machine learning and statistics literature fall into a category which we name the model-dependent category of game formulations. In this work, we consider an alternative and novel formulation which leads to the first instance of what we call model-independent Shapley values. These Shapley values use a (non-parametric) measure of non-linear dependence as the characteristic function. The strength of these Shapley values is in their ability to uncover and attribute non-linear dependencies amongst features. We introduce and demonstrate the use of the energy distance correlations, affine-invariant distance correlation, and Hilbert-Schmidt independence criterion as Shapley value characteristic functions. In particular, we demonstrate their potential value for exploratory data analysis and model diagnostics. We conclude with an interesting expository application to a classical medical survey data set.
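
As a rough illustration of the model-independent idea described in this abstract, the sketch below attributes dependence to features by using the squared sample distance correlation between a feature subset and the target as the characteristic function. It is not the authors' implementation. The function names, the choice of plain distance correlation (rather than the affine-invariant variant or the Hilbert-Schmidt independence criterion), the convention that the empty subset has value 0, and the toy data are all assumptions made for this example.

```python
from itertools import combinations
from math import factorial, sqrt


def centred_distances(rows):
    """Double-centred Euclidean distance matrix for a list of equal-length tuples."""
    n = len(rows)
    d = [[sqrt(sum((a - b) ** 2 for a, b in zip(r, s))) for s in rows] for r in rows]
    row_mean = [sum(r) / n for r in d]  # the matrix is symmetric, so column means match
    grand = sum(row_mean) / n
    return [[d[i][j] - row_mean[i] - row_mean[j] + grand for j in range(n)]
            for i in range(n)]


def distance_correlation_sq(x_rows, y_rows):
    """Squared sample distance correlation; returns 0.0 if either side is constant."""
    a, b = centred_distances(x_rows), centred_distances(y_rows)
    n = len(a)
    dcov = sum(a[i][j] * b[i][j] for i in range(n) for j in range(n)) / n ** 2
    dvar_x = sum(v * v for row in a for v in row) / n ** 2
    dvar_y = sum(v * v for row in b for v in row) / n ** 2
    if dvar_x <= 0.0 or dvar_y <= 0.0:
        return 0.0
    return dcov / sqrt(dvar_x * dvar_y)


def shapley_by_dependence(columns, target):
    """Exact Shapley values with v(S) = dCor^2(X_S, y); v(empty set) = 0 by assumption."""
    n_features = len(columns)
    y_rows = [(t,) for t in target]

    def value(subset):
        if not subset:
            return 0.0
        x_rows = list(zip(*(columns[i] for i in sorted(subset))))
        return distance_correlation_sq(x_rows, y_rows)

    phi = []
    for p in range(n_features):
        others = [q for q in range(n_features) if q != p]
        total = 0.0
        for k in range(n_features):
            weight = factorial(k) * factorial(n_features - k - 1) / factorial(n_features)
            for coalition in combinations(others, k):
                s = set(coalition)
                total += weight * (value(s | {p}) - value(s))
        phi.append(total)
    return phi


# Hypothetical toy data: y depends on x1 only through x1**2; x2 is a constant column.
x1 = [-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
x2 = [1.0] * len(x1)
y = [v * v for v in x1]
phi = shapley_by_dependence([x1, x2], y)
```

In this toy sample the Pearson correlation between x1 and y is zero by symmetry, yet x1 receives a positive attribution while the constant column x2 receives none. No model is fitted at any point, which is the sense in which such values describe the data rather than a model. The exact enumeration is exponential in the number of features, so it is only practical for small examples.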

preprint, 2021, arXiv

Shapley values for feature selection: The good, the bad, and the axioms

The Shapley value has become popular in the Explainable AI (XAI) literature, thanks, to a large extent, to a solid theoretical foundation, including four "favourable and fair" axioms for attribution in transferable utility games. The Shapley value is provably the only solution concept satisfying these axioms. In this paper, we introduce the Shapley value and draw attention to its recent uses as a feature selection tool. We call into question this use of the Shapley value, using simple, abstract "toy" counterexamples to illustrate that the axioms may work against the goals of feature selection. From this, we develop a number of insights that are then investigated in concrete simulation settings, with a variety of Shapley value formulations, including SHapley Additive exPlanations (SHAP) and Shapley Additive Global importancE (SAGE).
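
To make the tension between the axioms and feature selection concrete, here is a minimal sketch that computes exact Shapley values for a small transferable utility game. It is not one of the paper's counterexamples. The game `toy_value`, the feature names and the payoffs 0.6 and 0.4 are hypothetical and chosen only for illustration.

```python
from itertools import combinations
from math import factorial


def shapley_values(players, value):
    """Exact Shapley values for a characteristic function value(frozenset) -> float."""
    n = len(players)
    phi = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        for k in range(n):
            # Weight of a coalition of size k that does not contain p.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for coalition in combinations(others, k):
                s = frozenset(coalition)
                total += weight * (value(s | {p}) - value(s))
        phi[p] = total
    return phi


def toy_value(coalition):
    """Hypothetical game: 'a' and 'b' are exact duplicates that share one signal."""
    score = 0.0
    if "a" in coalition or "b" in coalition:
        score += 0.6  # signal carried redundantly by a and b
    if "c" in coalition:
        score += 0.4  # independent, weaker signal carried by c
    return score


phi = shapley_values(["a", "b", "c"], toy_value)
```

Here the symmetry axiom splits the shared payoff of 0.6 evenly, so `a` and `b` each receive 0.3 while `c` receives 0.4. A selection rule that keeps only the top-ranked feature would therefore pick `c`, even though either duplicate alone yields a higher payoff (0.6) than `c` alone (0.4).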

preprint, 2021, arXiv

The social dilemma in artificial intelligence development and why we have to solve it

While the demand for ethical artificial intelligence (AI) systems increases, the number of unethical uses of AI accelerates, even though there is no shortage of ethical guidelines. We argue that a possible underlying cause for this is that AI developers face a social dilemma in AI development ethics, preventing the widespread adoption of ethical best practices. We define the social dilemma for AI development and describe why the current crisis in AI development ethics cannot be solved without relieving AI developers of their social dilemma. We argue that AI development must be professionalised to overcome the social dilemma, and discuss how medicine can be used as a template in this process.
