Source author record

Christian Bauckhage

Christian Bauckhage appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Machine Learning; Computer Vision; physics.soc-ph; Social and Information Networks; Artificial Intelligence; Computation and Language; Information Theory; math.IT; quant-ph; Computational Engineering, Finance, and Science; Information Retrieval; Computational Geometry; Data Structures and Algorithms; Digital Libraries; Human-Computer Interaction; Numerical Analysis

Catalog footprint

What is connected

25 works

16 topics

4 close collaborators

preprint, 2026, arXiv

Four Quadrants of Difficulty: A Simple Categorisation and its Limits

Curriculum Learning (CL) aims to improve the outcome of model training by estimating the difficulty of samples and scheduling them accordingly. In NLP, difficulty is commonly approximated using task-agnostic linguistic heuristics or human intuition, implicitly assuming that these signals correlate with what neural models find difficult to learn. We propose a four-quadrant categorisation of difficulty signals -- human vs. model and task-agnostic vs. task-dependent -- and systematically analyse their interactions on a natural language understanding dataset. We find that task-agnostic features behave largely independently and that only task-dependent features align. These findings challenge common CL intuitions and highlight the need for lightweight, task-dependent difficulty estimators that better reflect model learning behaviour.

preprint, 2022, arXiv

Agricultural Plant Cataloging and Establishment of a Data Framework from UAV-based Crop Images by Computer Vision

UAV-based image retrieval in modern agriculture enables gathering large amounts of spatially referenced crop image data. In large-scale experiments, however, UAV images contain a very large number of crops in a complex canopy architecture. Especially for the observation of temporal effects, this greatly complicates the recognition of individual plants across several images and the extraction of relevant information. In this work, we present a hands-on workflow for the automated temporal and spatial identification and individualization of crop images from UAVs, abbreviated as "cataloging", based on comprehensible computer vision methods. We evaluate the workflow on two real-world datasets. One dataset is recorded for observation of Cercospora leaf spot, a fungal disease, in sugar beet over an entire growing cycle. The other one deals with harvest prediction of cauliflower plants. The plant catalog is used to extract single-plant images seen over multiple time points. This yields a large-scale spatio-temporal image dataset that in turn can be used to train further machine learning models including various data layers. The presented approach significantly improves the analysis and interpretation of UAV data in agriculture. Validated against some reference data, our method shows an accuracy that is similar to that of more complex deep learning-based recognition techniques. Our workflow is able to automate plant cataloging and training image extraction, especially for large datasets.

preprint, 2022, arXiv

Dynamic Review-based Recommenders

Just as user preferences change with time, item reviews also reflect those same preference changes. In a nutshell, if one is to sequentially incorporate review content knowledge into recommender systems, one is naturally led to dynamical models of text. In the present work we leverage the known power of reviews to enhance rating predictions in a way that (i) respects the causality of review generation and (ii) includes, in a bidirectional fashion, the ability of ratings to inform language review models and vice-versa, language representations that help predict ratings end-to-end. Moreover, our representations are time-interval aware and thus yield a continuous-time representation of the dynamics. We provide experiments on real-world datasets and show that our methodology is able to outperform several state-of-the-art models. Source code for all models can be found at [1].

preprint, 2022, arXiv

Gradient Flows for L2 Support Vector Machine Training

We explore the merits of training support vector machines for binary classification by solving systems of ordinary differential equations. We thus adopt a continuous-time perspective on a machine learning problem, which may be of interest for implementations on (re)emerging hardware platforms such as analog or quantum computers.
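The continuous-time view can be illustrated with a minimal sketch. It assumes the primal L2-SVM objective with squared hinge loss, f(w, b) = 0.5*||w||^2 + C * sum_i max(0, 1 - y_i (w.x_i + b))^2, and integrates the gradient flow d(w, b)/dt = -grad f with forward Euler steps. The formulation, the toy data, and all names below are assumptions made for illustration; the abstract does not say which formulation or ODE solver the paper uses.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def objective(w, b, X, y, C):
    """0.5*||w||^2 + C * sum of squared hinge losses (assumed L2-SVM primal)."""
    hinge = (max(0.0, 1.0 - yi * (dot(w, xi) + b)) for xi, yi in zip(X, y))
    return 0.5 * dot(w, w) + C * sum(h * h for h in hinge)

def gradient(w, b, X, y, C):
    """Gradient of the objective with respect to w and b."""
    gw, gb = list(w), 0.0  # gradient of the regulariser 0.5*||w||^2 is w
    for xi, yi in zip(X, y):
        slack = 1.0 - yi * (dot(w, xi) + b)
        if slack > 0.0:  # only margin violators contribute
            coef = -2.0 * C * slack * yi
            gw = [g + coef * x for g, x in zip(gw, xi)]
            gb += coef
    return gw, gb

def gradient_flow(X, y, C=1.0, dt=0.005, steps=4000):
    """Forward-Euler integration of d(w, b)/dt = -grad f(w, b)."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(steps):
        gw, gb = gradient(w, b, X, y, C)
        w = [wi - dt * g for wi, g in zip(w, gw)]
        b -= dt * gb
    return w, b

# Hypothetical, linearly separable toy data.
X = [[2.0, 2.0], [3.0, 1.0], [2.5, 3.0], [-2.0, -1.0], [-1.0, -3.0], [-3.0, -2.0]]
y = [1, 1, 1, -1, -1, -1]

w, b = gradient_flow(X, y)
predictions = [1 if dot(w, xi) + b > 0 else -1 for xi in X]
```

Forward Euler with a fixed step is ordinary gradient descent; the point of the ODE view is that other integrators, or analog hardware that evolves the state physically, could take its place.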

preprint, 2022, arXiv

Informed Pre-Training on Prior Knowledge

When training data is scarce, the incorporation of additional prior knowledge can assist the learning process. While it is common to initialize neural networks with weights that have been pre-trained on other large data sets, pre-training on more concise forms of knowledge has largely been overlooked. In this paper, we propose a novel informed machine learning approach: pre-training on prior knowledge. Formal knowledge representations, e.g. graphs or equations, are first transformed into a small and condensed data set of knowledge prototypes. We show that informed pre-training on such knowledge prototypes (i) speeds up the learning process, (ii) improves generalization capabilities in the regime where not enough training data is available, and (iii) increases model robustness. Analyzing which parts of the model are affected most by the prototypes reveals that improvements come from deeper layers that typically represent high-level features. This confirms that informed pre-training can indeed transfer semantic knowledge. This is a novel effect, which shows that knowledge-based pre-training has additional and complementary strengths to existing approaches.

preprint, 2022, arXiv

KPI-BERT: A Joint Named Entity Recognition and Relation Extraction Model for Financial Reports

We present KPI-BERT, a system which employs novel methods of named entity recognition (NER) and relation extraction (RE) to extract and link key performance indicators (KPIs), e.g. "revenue" or "interest expenses", of companies from real-world German financial documents. Specifically, we introduce an end-to-end trainable architecture that is based on Bidirectional Encoder Representations from Transformers (BERT) combining a recurrent neural network (RNN) with conditional label masking to sequentially tag entities before it classifies their relations. Our model also introduces a learnable RNN-based pooling mechanism and incorporates domain expert knowledge by explicitly filtering impossible relations. We achieve a substantially higher prediction performance on a new practical dataset of German financial reports, outperforming several strong baselines including a competing state-of-the-art span-based entity tagging approach.

preprint, 2022, arXiv

Predict better with less training data using a QNN

Over the past decade, machine learning has revolutionized vision-based quality assessment, for which convolutional neural networks (CNNs) have now become the standard. In this paper, we consider a potential next step in this development and describe a quanvolutional neural network (QNN) algorithm that efficiently maps classical image data to quantum states and allows for reliable image analysis. We practically demonstrate how to leverage quantum devices in computer vision and how to introduce quantum convolutions into classical CNNs. Dealing with a real-world use case in industrial quality control, we implement our hybrid QNN model within the PennyLane framework and empirically observe it to achieve better predictions using far less training data than classical CNNs. In other words, we empirically observe a genuine quantum advantage for an industrial application where the advantage is due to superior data encoding.

preprint, 2022, arXiv

QUBOs for Sorting Lists and Building Trees

We show that the fundamental tasks of sorting lists and building search trees or heaps can be modeled as quadratic unconstrained binary optimization problems (QUBOs). The idea is to understand these tasks as permutation problems and to devise QUBOs whose solutions represent appropriate permutation matrices. We discuss how to construct such QUBOs and how to solve them using Hopfield nets or (adiabatic) quantum computing. In short, we show that neurocomputing methods or quantum computers can solve problems usually associated with abstract data structures.
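A small sketch shows how sorting can be phrased as a QUBO over a permutation matrix. The construction here is an assumption chosen for illustration, not necessarily the one in the paper: binary variables z[i][j] say "position i receives input x[j]", the objective rewards putting large values at high positions (by the rearrangement inequality this is maximal for ascending order), and quadratic penalties force every row and column of z to sum to one. An exhaustive search stands in for the Hopfield net or quantum annealer, so it is only feasible for tiny inputs.

```python
from itertools import product

def sorting_qubo(x, penalty):
    """Build Q so that z^T Q z is minimal when z encodes the sorting permutation."""
    n = len(x)
    size = n * n
    Q = [[0.0] * size for _ in range(size)]
    for i in range(n):
        for j in range(n):
            a = i * n + j  # flat index of variable z[i][j]
            # Linear terms sit on the diagonal because z*z = z for binary z.
            Q[a][a] -= (i + 1) * x[j] + 4.0 * penalty
            for k in range(n):
                Q[a][i * n + k] += penalty  # row i must sum to one
                Q[a][k * n + j] += penalty  # column j must sum to one
    return Q

def energy(Q, z):
    size = len(z)
    return sum(Q[a][b] * z[a] * z[b] for a in range(size) for b in range(size))

def qubo_sort(x):
    """Sort x in ascending order by brute-force minimisation of the QUBO."""
    n = len(x)
    # Penalty chosen larger than any possible change in the objective term.
    penalty = 1.0 + n * (n + 1) * sum(abs(v) for v in x)
    Q = sorting_qubo(x, penalty)
    best = min(product((0, 1), repeat=n * n), key=lambda z: energy(Q, z))
    return [sum(best[i * n + j] * x[j] for j in range(n)) for i in range(n)]

result = qubo_sort([3.0, 1.0, 2.0])
```

The search space has 2^(n*n) states, so this demonstrates only the encoding. The interest of the QUBO form is that the same matrix Q could be handed to a heuristic or hardware solver.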

preprint, 2022, arXiv

Towards Bundle Adjustment for Satellite Imaging via Quantum Machine Learning

Given is a set of images, where all images show views of the same area at different points in time and from different viewpoints. The task is the alignment of all images such that relevant information, e.g., poses, changes, and terrain, can be extracted from the fused image. In this work, we focus on quantum methods for keypoint extraction and feature matching, due to the demanding computational complexity of these sub-tasks. To this end, k-medoids clustering, kernel density clustering, nearest neighbor search, and kernel methods are investigated and it is explained how these methods can be re-formulated for quantum annealers and gate-based quantum computers. Experimental results obtained on digital quantum emulation hardware, quantum annealers, and quantum gate computers show that classical systems still deliver superior results. However, the proposed methods are ready for the current and upcoming generations of quantum computing devices which have the potential to outperform classical systems in the near future.

preprint, 2020, arXiv

Learning Syllogism with Euler Neural-Networks

Traditional neural networks represent everything as a vector, and are able to approximate a subset of logical reasoning to a certain degree. As basic logic relations are better represented by topological relations between regions, we propose a novel neural network that represents everything as a ball and is able to learn topological configurations as an Euler diagram, hence the name Euler Neural-Network (ENN). The central vector of a ball can inherit the representation power of a traditional neural network. ENN distinguishes four spatial statuses between balls, namely, being disconnected, being partially overlapped, being part of, and being inverse part of. Within each status, ideal values are defined for efficient reasoning. A novel back-propagation algorithm with six Rectified Spatial Units (ReSU) can optimize an Euler diagram representing logical premises, from which a logical conclusion can be deduced. In contrast to traditional neural networks, ENN can precisely represent all 24 different structures of Syllogism. Two large datasets are created: one, extracted from WordNet-3.0, covers all types of Syllogism reasoning; the other extracts all family relations from DBpedia. Experimental results confirm the superior power of ENN in logical representation and reasoning. Datasets and source code are available upon request.

preprint, 2020, arXiv

Recurrent Point Processes for Dynamic Review Models

Recent progress in recommender system research has shown the importance of including temporal representations to improve interpretability and performance. Here, we incorporate temporal representations in continuous time via a recurrent point process for a dynamical model of reviews. Our goal is to characterize how changes in perception, user interest and seasonal effects affect review text.

preprint, 2015, arXiv

Exploring Human Vision Driven Features for Pedestrian Detection

Motivated by the center-surround mechanism in the human visual attention system, we propose to use average contrast maps for the challenge of pedestrian detection in street scenes, based on the observation that pedestrians indeed exhibit discriminative contrast texture. Our main contributions are, first, to design a local, statistical multi-channel descriptor in order to incorporate both color and gradient information. Second, we introduce a multi-direction and multi-scale contrast scheme based on grid-cells in order to integrate expressive local variations. To address the issue of selecting the most discriminative features for assessment and classification, we perform extensive comparisons w.r.t. statistical descriptors, contrast measurements, and scale structures. This way, we obtain reasonable results under various configurations. Empirical findings from applying our optimized detector on the INRIA and Caltech pedestrian datasets show that our features yield state-of-the-art performance in pedestrian detection.

preprint, 2015, arXiv

Maximum Entropy Models of Shortest Path and Outbreak Distributions in Networks

Properties of networks are often characterized in terms of features such as node degree distributions, average path lengths, diameters, or clustering coefficients. Here, we study shortest path length distributions. On the one hand, average as well as maximum distances can be determined therefrom; on the other hand, they are closely related to the dynamics of network spreading processes. Because of the combinatorial nature of networks, we apply maximum entropy arguments to derive a general, physically plausible model. In particular, we establish the generalized Gamma distribution as a continuous characterization of shortest path length histograms of networks of arbitrary topology. Experimental evaluations corroborate our theoretical results.

preprint, 2015, arXiv

SGPD Volume Maximization for Community Detection

In this note we briefly study the feasibility of community detection in complex networks using peripheral vertices. Our method suggests a novel direction in axiomatizing the problem of clustering in graphs and complex networks by looking at the topological role each vertex plays in the community structure, regardless of the attributes. The promising strength of pseudo-peripheral vertices as a lever for the analysis of complex networks is also demonstrated on real-world data.

preprint, 2014, arXiv

A Comparison of Methods for Player Clustering via Behavioral Telemetry

The analysis of user behavior in digital games has been aided by the introduction of user telemetry in game development, which provides unprecedented access to quantitative data on user behavior from the installed game clients of the entire population of players. Player behavior telemetry datasets can be exceptionally complex, with features recorded for a varying population of users over a temporal segment that can reach years in duration. Categorization of behaviors, whether through descriptive methods (e.g. segmentation) or unsupervised/supervised learning techniques, is valuable for finding patterns in the behavioral data, and developing profiles that are actionable to game developers. There are numerous methods for unsupervised clustering of user behavior, e.g. k-means/c-means, Non-negative Matrix Factorization, or Principal Component Analysis. Although all yield behavior categorizations, interpretation of the resulting categories in terms of actual play behavior can be difficult if not impossible. In this paper, a range of unsupervised techniques are applied together with Archetypal Analysis to develop behavioral clusters from playtime data of 70,014 World of Warcraft players, covering a five-year interval. The techniques are evaluated with respect to their ability to develop actionable behavioral profiles from the dataset.

preprint, 2014, arXiv

A Note on Archetypal Analysis and the Approximation of Convex Hulls

We briefly review the basic ideas behind archetypal analysis for matrix factorization and discuss its behavior in approximating the convex hull of a data sample. We then ask how good such approximations can be and consider different cases. Understanding archetypal analysis as the problem of computing a convexity constrained low-rank approximation of the identity matrix provides estimates for archetypal analysis and the SiVM heuristic.

preprint, 2014, arXiv

Characterizations and Kullback-Leibler Divergence of Gompertz Distributions

In this note, we characterize the Gompertz distribution in terms of extreme value distributions and point out that it implicitly models the interplay of two antagonistic growth processes. In addition, we derive a closed-form expression for the Kullback-Leibler divergence between two Gompertz distributions. Although the latter is rather easy to obtain, it seems not to have been widely reported before.

preprint, 2014, arXiv

Computing the Kullback-Leibler Divergence between two Generalized Gamma Distributions

We derive a closed form solution for the Kullback-Leibler divergence between two generalized gamma distributions. These notes are meant as a reference and provide a guided tour towards a result of practical interest that is rarely explicated in the literature.

preprint, 2014, arXiv

Marginalizing over the PageRank Damping Factor

In this note, we show how to marginalize over the damping parameter of the PageRank equation so as to obtain a parameter-free version known as TotalRank. Our discussion is meant as a reference and intended to provide a guided tour towards an interesting result that has applications in information retrieval and classification.
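The marginalization can be sketched under stated assumptions: a column-stochastic transition matrix P without dangling nodes, a teleportation vector v, and a uniform prior over the damping factor a on [0, 1]. With the series form of PageRank, r(a) = (1 - a) * sum_k a^k P^k v, and the integral of (1 - a) * a^k over [0, 1] being 1 / ((k + 1)(k + 2)), the marginal becomes sum_k P^k v / ((k + 1)(k + 2)). The function names, the toy graph, and the truncation length are illustrative choices, not taken from the note.

```python
def mat_vec(P, v):
    return [sum(p * x for p, x in zip(row, v)) for row in P]

def power_terms(P, v, terms):
    """Return [v, Pv, P^2 v, ...] with `terms` entries."""
    out, cur = [], list(v)
    for _ in range(terms):
        out.append(cur)
        cur = mat_vec(P, cur)
    return out

def pagerank(P, v, alpha, terms=100):
    """Truncated series (1 - alpha) * sum_k alpha^k P^k v."""
    r = [0.0] * len(v)
    for k, pk in enumerate(power_terms(P, v, terms)):
        weight = (1.0 - alpha) * alpha ** k
        r = [ri + weight * x for ri, x in zip(r, pk)]
    return r

def totalrank(P, v, terms=100):
    """Damping factor integrated out: sum_k P^k v / ((k + 1)(k + 2))."""
    r = [0.0] * len(v)
    for k, pk in enumerate(power_terms(P, v, terms)):
        weight = 1.0 / ((k + 1) * (k + 2))
        r = [ri + weight * x for ri, x in zip(r, pk)]
    return r

# Hypothetical 3-node graph; each column of P sums to one.
P = [[0.0, 0.5, 1.0],
     [0.5, 0.0, 0.0],
     [0.5, 0.5, 0.0]]
v = [1.0 / 3.0] * 3

scores = totalrank(P, v)
```

The weights 1 / ((k + 1)(k + 2)) decay only quadratically, so a series truncated after K terms sums to 1 - 1 / (K + 1) rather than 1. Renormalising the result or using more terms are both reasonable in practice.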

preprint, 2014, arXiv

Propagation Kernels

We introduce propagation kernels, a general graph-kernel framework for efficiently measuring the similarity of structured data. Propagation kernels are based on monitoring how information spreads through a set of given graphs. They leverage early-stage distributions from propagation schemes such as random walks to capture structural information encoded in node labels, attributes, and edge information. This has two benefits. First, off-the-shelf propagation schemes can be used to naturally construct kernels for many graph types, including labeled, partially labeled, unlabeled, directed, and attributed graphs. Second, by leveraging existing efficient and informative propagation schemes, propagation kernels can be considerably faster than state-of-the-art approaches without sacrificing predictive performance. We will also show that if the graphs at hand have a regular structure, for instance when modeling image or video data, one can exploit this regularity to scale the kernel computation to large databases of graphs with thousands of nodes. We support our contributions by exhaustive experiments on a number of real-world graphs from a variety of application domains.

preprint, 2014, arXiv

Strong Regularities in Growth and Decline of Popularity of Social Media Services

We analyze general trends and patterns in time series that characterize the dynamics of collective attention to social media services and Web-based businesses. Our study is based on search frequency data available from Google Trends and considers 175 different services. For each service, we collect data from 45 different countries as well as global averages. This way, we obtain more than 8,000 time series which we analyze using diffusion models from the economic sciences. We find that these models accurately characterize the empirical data and our analysis reveals that collective attention to social media grows and subsides in a highly regular and predictable manner. Regularities persist across regions, cultures, and topics and thus hint at general mechanisms that govern the adoption of Web-based services. We discuss several cases in detail to highlight interesting findings. Our methods are of economic interest as they may inform investment decisions and can help assess at what stage of the general life-cycle a Web service is.

preprint, 2013, arXiv

Computing the Kullback-Leibler Divergence between two Weibull Distributions

We derive a closed form solution for the Kullback-Leibler divergence between two Weibull distributions. These notes are meant as reference material and intended to provide a guided tour towards a result that is often mentioned but seldom made explicit in the literature.
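For reference, such a closed form can be written down and checked numerically. The sketch below assumes the shape/scale parameterisation p(x) = (k / l) * (x / l)^(k - 1) * exp(-(x / l)^k) and follows from the standard Weibull moments E[log X] = log l - gamma / k and E[X^a] = l^a * Gamma(1 + a / k), where gamma is the Euler-Mascheroni constant. It is a re-derivation under these assumptions, not a transcription of the note, and the function names are illustrative.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant

def weibull_kl(k1, l1, k2, l2):
    """KL(p1 || p2) for Weibull densities with shapes k1, k2 and scales l1, l2."""
    return (
        math.log(k1 / l1 ** k1)
        - math.log(k2 / l2 ** k2)
        + (k1 - k2) * (math.log(l1) - EULER_GAMMA / k1)
        + (l1 / l2) ** k2 * math.gamma(k2 / k1 + 1.0)
        - 1.0
    )

def weibull_logpdf(x, k, l):
    return math.log(k / l) + (k - 1.0) * math.log(x / l) - (x / l) ** k

def numeric_kl(k1, l1, k2, l2, upper=20.0, steps=200_000):
    """Midpoint-rule approximation of the KL integral, used only as a check."""
    h = upper / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        log_p = weibull_logpdf(x, k1, l1)
        log_q = weibull_logpdf(x, k2, l2)
        total += math.exp(log_p) * (log_p - log_q) * h
    return total

closed_form = weibull_kl(2.0, 1.5, 1.5, 2.0)
approximation = numeric_kl(2.0, 1.5, 1.5, 2.0)
```

Working with log-densities in the numerical check avoids dividing by a density that has underflowed to zero in the tail.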

preprint, 2013, arXiv

Efficient Information Theoretic Clustering on Discrete Lattices

We consider the problem of clustering data that reside on discrete, low dimensional lattices. Canonical examples for this setting are found in image segmentation and key point extraction. Our solution is based on a recent approach to information theoretic clustering where clusters result from an iterative procedure that minimizes a divergence measure. We replace costly processing steps in the original algorithm by means of convolutions. These allow for highly efficient implementations and thus significantly reduce runtime. This paper therefore bridges a gap between machine learning and signal processing.

preprint, 2013, arXiv

GeoDBLP: Geo-Tagging DBLP for Mining the Sociology of Computer Science

Many collective human activities have been shown to exhibit universal patterns. However, the possibility of universal patterns across timing events of researcher migration has barely been explored at global scale. Here, we show that timing events of migration within different countries exhibit remarkable similarities. Specifically, we look at the distribution governing the data of researcher migration inferred from the web. Compiling the data in itself represents a significant advance in the field of quantitative analysis of migration patterns. Official and commercial records are often access restricted, incompatible between countries, and especially not registered across researchers. Instead, we introduce GeoDBLP where we propagate geographical seed locations retrieved from the web across the DBLP database of 1,080,958 authors and 1,894,758 papers. But perhaps more important is that we are able to find statistical patterns and create models that explain the migration of researchers. For instance, we show that the science job market can be treated as a Poisson process with individual propensities to migrate following a log-normal distribution over the researcher's career stage. That is, although jobs enter the market constantly, researchers are generally not "memoryless" but have to care greatly about their next move. The propensity to make k>1 migrations, however, follows a gamma distribution suggesting that migration at later career stages is "memoryless". This aligns well but actually goes beyond scientometric models typically postulated based on small case studies. On a very large, transnational scale, we establish the first general regularities that should have major implications on strategies for education and research worldwide.

preprint, 2012, arXiv

Latent Dirichlet Allocation Uncovers Spectral Characteristics of Drought Stressed Plants

Understanding the adaptation process of plants to drought stress is essential for improving management practices and breeding strategies as well as for engineering viable crops for a sustainable agriculture in the coming decades. Hyper-spectral imaging provides a particularly promising approach to gain such understanding since it allows the non-destructive discovery of spectral characteristics of plants, which are governed primarily by scattering and absorption characteristics of the leaf internal structure and biochemical constituents. Several drought stress indices have been derived using hyper-spectral imaging. However, they are typically based on only a few hyper-spectral images, rely on interpretations of experts, and consider only a few wavelengths. In this study, we present the first data-driven approach to discovering spectral drought stress indices, treating it as an unsupervised labeling problem at massive scale. To make use of short-range dependencies of spectral wavelengths, we develop an online variational Bayes algorithm for latent Dirichlet allocation with a convolved Dirichlet regularizer. This approach scales to massive datasets and, hence, provides a more objective complement to plant physiological practices. The spectral topics found conform to plant physiological knowledge and can be computed in a fraction of the time compared to existing LDA approaches.
