Source author record

Johannes Schneider

Johannes Schneider appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Machine Learning; Computer Vision; Artificial Intelligence; Cryptography and Security; Data Structures and Algorithms; Computation and Language; Distributed, Parallel, and Cluster Computing; Human-Computer Interaction; Information Retrieval

Catalog footprint

What is connected

14 works

9 topics

4 close collaborators


preprint · 2026 · arXiv

Enhanced Data-Driven Product Development via Gradient Based Optimization and Conformalized Monte Carlo Dropout Uncertainty Estimation

Data-Driven Product Development (DDPD) leverages data to learn the relationship between product design specifications and resulting properties. To discover improved designs, we train a neural network on past experiments and apply Projected Gradient Descent to identify optimal input features that maximize performance. Since many products require simultaneous optimization of multiple correlated properties, our framework employs joint neural networks to capture interdependencies among targets. Furthermore, we integrate uncertainty estimation via Conformalised Monte Carlo Dropout (ConfMC), a novel method combining Nested Conformal Prediction with Monte Carlo dropout to provide model-agnostic, finite-sample coverage guarantees under data exchangeability. Extensive experiments on five real-world datasets show that our method matches state-of-the-art performance while offering adaptive, non-uniform prediction intervals and eliminating the need for retraining when adjusting coverage levels.
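As a rough illustration of the projected-gradient step mentioned in this abstract, the sketch below maximizes a toy quadratic objective over box constraints. The objective, the bounds, the step size, and the names `project_box`, `projected_gradient_ascent` and `toy_grad` are all assumptions made for illustration; the paper applies the idea to a trained neural network, which is not reproduced here.

```python
def project_box(x, lower, upper):
    """Clip each coordinate back into its allowed [lower, upper] range."""
    return [min(max(xi, lo), hi) for xi, lo, hi in zip(x, lower, upper)]


def projected_gradient_ascent(grad, x0, lower, upper, step=0.1, iters=200):
    """Take a gradient step, then project onto the feasible box, repeatedly."""
    x = project_box(x0, lower, upper)
    for _ in range(iters):
        g = grad(x)
        x = project_box([xi + step * gi for xi, gi in zip(x, g)], lower, upper)
    return x


def toy_grad(x):
    # Gradient of the hypothetical objective f(x) = -(x0 - 2)^2 - (x1 + 1)^2,
    # whose unconstrained maximum (2, -1) lies outside the feasible box.
    return [-2.0 * (x[0] - 2.0), -2.0 * (x[1] + 1.0)]


best = projected_gradient_ascent(toy_grad, [0.5, 0.5], [0.0, 0.0], [1.0, 1.0])
print(best)
```

Because the unconstrained optimum is infeasible, the iterate ends on the corner of the box closest to it, which is the behavior the projection step is meant to provide.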

preprint · 2022 · arXiv

Concept-based Adversarial Attacks: Tricking Humans and Classifiers Alike

We propose to generate adversarial samples by modifying activations of upper layers encoding semantically meaningful concepts. The original sample is shifted towards a target sample, yielding an adversarial sample, by using the modified activations to reconstruct the original sample. A human might (and possibly should) notice differences between the original and the adversarial sample. Depending on the attacker-provided constraints, an adversarial sample can exhibit subtle differences or appear like a "forged" sample from another class. Our approach and goal are in stark contrast to common attacks involving perturbations of single pixels that are not recognizable by humans. Our approach is relevant in, e.g., multi-stage processing of inputs, where both humans and machines are involved in decision-making because invisible perturbations will not fool a human. Our evaluation focuses on deep neural networks. We also show the transferability of our adversarial examples among networks.

preprint · 2022 · arXiv

Domain Transformer: Predicting Samples of Unseen, Future Domains

The data distribution commonly evolves over time, leading to problems such as concept drift that often decrease classifier performance. Current techniques are not adequate for this problem because they either require detailed knowledge of the transformation or are not suited for anticipating unseen domains but can only adapt to domains where data samples are available. We seek to predict unseen data (and their labels), allowing us to tackle challenges such as a non-constant data distribution in a proactive manner rather than detecting and reacting to already existing changes that might already have led to errors. To this end, we learn a domain transformer in an unsupervised manner that allows generating data of unseen domains. Our approach first matches independently learned latent representations of two given domains obtained from an auto-encoder using a Cycle-GAN. In turn, a transformation of the original samples can be learned that can be applied iteratively to extrapolate to unseen domains. Our evaluation of CNNs on image data confirms the usefulness of the approach. It also achieves very good results on the well-known problem of unsupervised domain adaptation, where only labels but no samples have to be predicted. Code is available at https://github.com/JohnTailor/DoTra.

preprint · 2022 · arXiv

Explaining Classifiers by Constructing Familiar Concepts

Interpreting a large number of neurons in deep learning is difficult. Our proposed 'CLAssifier-DECoder' architecture (ClaDec) facilitates the understanding of the output of an arbitrary layer of neurons or subsets thereof. It uses a decoder that transforms the incomprehensible representation of the given neurons to a representation that is more similar to the domain a human is familiar with. In an image recognition problem, one can recognize what information (or concepts) a layer maintains by contrasting reconstructed images of ClaDec with those of a conventional auto-encoder (AE) serving as reference. An extension of ClaDec allows trading comprehensibility and fidelity. We evaluate our approach for image classification using convolutional neural networks. We show that reconstructed visualizations using encodings from a classifier capture more relevant classification information than conventional AEs. This holds although AEs contain more information on the original input. Our user study highlights that even non-experts can identify a diverse set of concepts contained in images that are relevant (or irrelevant) for the classifier. We also compare against saliency-based methods that focus on pixel relevance rather than concepts. We show that ClaDec tends to highlight more relevant input areas to classification, though outcomes depend on classifier architecture. Code is at https://github.com/JohnTailor/ClaDec

preprint · 2022 · arXiv

The learning phases in NN: From Fitting the Majority to Fitting a Few

The learning dynamics of deep neural networks are subject to controversy. Using the information bottleneck (IB) theory, separate fitting and compression phases have been put forward but have since been heavily debated. We approach learning dynamics by analyzing a layer's reconstruction ability of the input and prediction performance based on the evolution of parameters during training. We show that, under mild assumptions on the data, there exists a prototyping phase that initially decreases reconstruction loss, followed by a phase that reduces the classification loss of a few samples and thereby increases reconstruction loss. Aside from providing a mathematical analysis of single-layer classification networks, we also assess the behavior using common datasets and architectures from computer vision such as ResNet and VGG.

preprint · 2022 · arXiv

Towards Trustworthy AutoGrading of Short, Multi-lingual, Multi-type Answers

Autograding short textual answers has become much more feasible due to the rise of NLP and the increased availability of question-answer pairs brought about by a shift to online education. Autograding performance is still inferior to human grading. The statistical and black-box nature of state-of-the-art machine learning models makes them untrustworthy, raising ethical concerns and limiting their practical utility. Furthermore, the evaluation of autograding is typically confined to small, monolingual datasets for a specific question type. This study uses a large dataset consisting of about 10 million question-answer pairs from multiple languages covering diverse fields such as math and language, and strong variation in question and answer syntax. We demonstrate the effectiveness of fine-tuning transformer models for autograding for such complex datasets. Our best hyperparameter-tuned model yields an accuracy of about 86.5%, comparable to the state-of-the-art models that are less general and more tuned to a specific type of question, subject, and language. More importantly, we address trust and ethical concerns. By involving humans in the autograding process, we show how to improve the accuracy of automatically graded answers, achieving accuracy equivalent to that of teaching assistants. We also show how teachers can effectively control the type of errors made by the system and how they can validate efficiently that the autograder's performance on individual exams is close to the expected performance.

preprint · 2021 · arXiv

Explaining Neural Networks by Decoding Layer Activations

We present a 'CLAssifier-DECoder' architecture (ClaDec) which facilitates the comprehension of the output of an arbitrary layer in a neural network (NN). It uses a decoder to transform the non-interpretable representation of the given layer to a representation that is more similar to the domain a human is familiar with. In an image recognition problem, one can recognize what information is represented by a layer by contrasting reconstructed images of ClaDec with those of a conventional auto-encoder (AE) serving as reference. We also extend ClaDec to allow the trade-off between human interpretability and fidelity. We evaluate our approach for image classification using Convolutional NNs. We show that reconstructed visualizations using encodings from a classifier capture more relevant information for classification than conventional AEs. Relevant code is available at https://github.com/JohnTailor/ClaDec

preprint · 2020 · arXiv

Human-to-AI Coach: Improving Human Inputs to AI Systems

Humans increasingly interact with artificial intelligence (AI) systems. AI systems are optimized for objectives such as minimum computation or minimum error rate in recognizing and interpreting inputs from humans. In contrast, inputs created by humans are often treated as a given. We investigate how inputs of humans can be altered to reduce misinterpretation by the AI system and to improve efficiency of input generation for the human, while altered inputs should remain as similar as possible to the original inputs. These objectives result in trade-offs that are analyzed for a deep learning system classifying handwritten digits. To create examples that serve as demonstrations for humans to improve, we develop a model based on a conditional convolutional autoencoder (CCAE). Our quantitative and qualitative evaluation shows that on many occasions the generated proposals lead to lower error rates, require less effort to create and differ only modestly from the original samples.

preprint · 2020 · arXiv

Humans learn too: Better Human-AI Interaction using Optimized Human Inputs

Humans rely more and more on systems with AI components. The AI community typically treats human inputs as a given and optimizes AI models only. This thinking is one-sided and it neglects the fact that humans can learn, too. In this work, human inputs are optimized for better interaction with an AI model while keeping the model fixed. The optimized inputs are accompanied by instructions on how to create them. They allow humans to save time and cut on errors, while keeping required changes to original inputs limited. We propose continuous and discrete optimization methods modifying samples in an iterative fashion. Our quantitative and qualitative evaluation including a human study on different hand-generated inputs shows that the generated proposals lead to lower error rates, require less effort to create and differ only modestly from the original samples.

preprint · 2020 · arXiv

Personalization of Deep Learning

We discuss training techniques, objectives and metrics toward personalization of deep learning models. In machine learning, personalization addresses the goal of a trained model to target a particular individual by optimizing one or more performance metrics, while conforming to certain constraints. To personalize, we investigate three methods of "curriculum learning" and two approaches for data grouping, i.e., augmenting the data of an individual by adding similar data identified with an auto-encoder. We show that both "curriculum learning" and "personalized" data augmentation lead to improved performance on data of an individual. Mostly, this comes at the cost of reduced performance on a more general, broader dataset.

preprint · 2016 · arXiv

Obfuscation using Encryption

Protecting source code against reverse engineering and theft is an important problem. The goal is to carry out computations using confidential algorithms on an untrusted party while ensuring confidentiality of algorithms. This problem has been addressed for Boolean circuits known as 'circuit privacy'. Circuits corresponding to real-world programs are impractical. Well-known obfuscation techniques are highly practicable, but provide only limited security, e.g., no piracy protection. In this work, we modify source code yielding programs with adjustable performance and security guarantees ranging from indistinguishability obfuscators to (non-secure) ordinary obfuscation. The idea is to artificially generate 'misleading' statements. Their results are combined with the outcome of a confidential statement using encrypted selector variables. Thus, an attacker must 'guess' the encrypted selector variables to disguise the confidential source code. We evaluated our method using more than ten programmers as well as pattern mining across open source code repositories to gain insights into (micro-)coding patterns that are relevant for generating misleading statements. The evaluation reveals that our approach is effective in that it successfully preserves source code confidentiality.

preprint · 2016 · arXiv

Oblivious Sorting and Queues

We present a deterministic oblivious LIFO (Stack), FIFO, double-ended and double-ended priority queue as well as an oblivious mergesort and quicksort algorithm. Our techniques and ideas include concatenating queues end-to-end, size balancing of multiple arrays, and several multi-level partitionings of an array. Our queues are the first to enable executions of pop and push operations without any change of the data structure (controlled by a parameter). This enables interesting applications in computing on encrypted data such as hiding confidential expressions. Mergesort becomes practical using our LIFO queue, i.e., it improves prior work (STOC '14) by a factor of (more than) 1000 in terms of comparisons for all practically relevant queue sizes. We are the first to present double-ended (priority) and LIFO queues as well as oblivious quicksort which is asymptotically optimal. Aside from theoretical analysis, we also provide an empirical evaluation of all queues.
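To make the notion of an oblivious algorithm concrete, the sketch below uses odd-even transposition sort, a textbook sorting network. It is not one of the algorithms from this abstract (the oblivious mergesort and quicksort are not reproduced here); it only illustrates the defining property that the sequence of positions touched depends on the input length and never on the values. The function names and the `trace` return value are assumptions added for illustration.

```python
def compare_exchange(a, i, j):
    """Put the smaller value at i and the larger at j, always writing both."""
    lo, hi = min(a[i], a[j]), max(a[i], a[j])
    a[i], a[j] = lo, hi


def odd_even_transposition_sort(values):
    """Sort with a fixed, data-independent schedule of compare-exchanges.

    Returns the sorted list and the trace of index pairs that were touched.
    """
    a = list(values)
    n = len(a)
    trace = []
    for rnd in range(n):
        # Even rounds pair (0,1), (2,3), ...; odd rounds pair (1,2), (3,4), ...
        for i in range(rnd % 2, n - 1, 2):
            compare_exchange(a, i, i + 1)
            trace.append((i, i + 1))
    return a, trace


sorted_a, trace_a = odd_even_transposition_sort([5, 1, 4, 2, 3])
sorted_b, trace_b = odd_even_transposition_sort([1, 2, 3, 4, 5])
print(sorted_a, trace_a == trace_b)
```

Two inputs of the same length produce identical traces, so an observer who sees only the access pattern learns nothing about the data. This simple network needs a quadratic number of comparisons, which is why more efficient oblivious sorts are of interest.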

preprint · 2015 · arXiv

The Locality of Distributed Symmetry Breaking

Symmetry breaking problems are among the most well studied in the field of distributed computing and yet the most fundamental questions about their complexity remain open. In this paper we work in the LOCAL model (where the input graph and underlying distributed network are identical) and study the randomized complexity of four fundamental symmetry breaking problems on graphs: computing MISs (maximal independent sets), maximal matchings, vertex colorings, and ruling sets. A small sample of our results includes:
- An MIS algorithm running in $O(\log^2 \Delta + 2^{O(\sqrt{\log\log n})})$ time, where $\Delta$ is the maximum degree. This is the first MIS algorithm to improve on the 1986 algorithms of Luby and Alon, Babai, and Itai, when $\log n \ll \Delta \ll 2^{\sqrt{\log n}}$, and comes close to the $\Omega(\log \Delta)$ lower bound of Kuhn, Moscibroda, and Wattenhofer.
- A maximal matching algorithm running in $O(\log \Delta + \log^4 \log n)$ time. This is the first significant improvement to the 1986 algorithm of Israeli and Itai. Moreover, its dependence on $\Delta$ is provably optimal.
- A method for reducing symmetry breaking problems in low arboricity/degeneracy graphs to low degree graphs. (Roughly speaking, the arboricity or degeneracy of a graph bounds the density of any subgraph.) Corollaries of this reduction include an $O(\sqrt{\log n})$-time maximal matching algorithm for graphs with arboricity up to $2^{\sqrt{\log n}}$ and an $O(\log^{2/3} n)$-time MIS algorithm for graphs with arboricity up to $2^{(\log n)^{1/3}}$.

Each of our algorithms is based on a simple, but powerful technique for reducing a randomized symmetry breaking task to a corresponding deterministic one on a poly$(\log n)$-size graph.
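For readers unfamiliar with the MIS problem, the sketch below simulates the random-priority variant of Luby's classic algorithm, the 1986 baseline that this abstract compares against. It is not the improved algorithm of the paper. The synchronous rounds of the LOCAL model are simulated sequentially, and the function names, the adjacency-set input format, and the fixed seed are assumptions made for illustration.

```python
import random


def luby_mis(adj, seed=0):
    """Random-priority MIS: adj maps each vertex to the set of its neighbors."""
    rng = random.Random(seed)
    active = set(adj)
    mis = set()
    while active:
        # Each active vertex draws a fresh random priority for this round.
        prio = {v: rng.random() for v in active}
        # A vertex joins the MIS if it beats every neighbor that is still active.
        winners = {
            v for v in active
            if all(prio[v] < prio[u] for u in adj[v] if u in active)
        }
        mis |= winners
        # Winners and their neighbors drop out before the next round.
        removed = set(winners)
        for v in winners:
            removed |= adj[v] & active
        active -= removed
    return mis


def is_maximal_independent_set(adj, s):
    """Check that no two members are adjacent and no vertex could be added."""
    independent = all(not (adj[v] & s) for v in s)
    maximal = all(v in s or (adj[v] & s) for v in adj)
    return independent and maximal


# Hypothetical test graph: a cycle on six vertices.
cycle = {i: {(i - 1) % 6, (i + 1) % 6} for i in range(6)}
result = luby_mis(cycle)
print(sorted(result), is_maximal_independent_set(cycle, result))
```

Each round only needs information from direct neighbors, which is what makes the procedure a natural fit for the LOCAL model described in the abstract.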

preprint · 2014 · arXiv

On Randomly Projected Hierarchical Clustering with Guarantees

Hierarchical clustering (HC) algorithms are generally limited to small data instances due to their runtime costs. Here we mitigate this shortcoming and explore fast HC algorithms based on random projections for single (SLC) and average (ALC) linkage clustering as well as for the minimum spanning tree problem (MST). We present a thorough adaptive analysis of our algorithms that improve prior work from $O(N^2)$ by up to a factor of $N/(\log N)^2$ for a dataset of $N$ points in Euclidean space. The algorithms maintain, with arbitrarily high probability, the outcome of hierarchical clustering as well as the worst-case running-time guarantees. We also present parameter-free instances of our algorithms.
