Source author record

Christian Gruhl

Christian Gruhl appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Catalog footprint

What is connected

3works
2topics
4close collaborators

Actions

Connect this record

Log in to claim

Research graph

See the researcher in context

Open full explorer

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

3 published item(s)

preprint2016arXiv

Detecting Novel Processes with CANDIES -- An Holistic Novelty Detection Technique based on Probabilistic Models

In this article, we propose CANDIES (Combined Approach for Novelty Detection in Intelligent Embedded Systems), a new approach to novelty detection in technical systems. We assume that in a technical system several processes interact. If we observe these processes with sensors, we are able to model the observations (samples) with a probabilistic model, where, in an ideal case, the components of the parametric mixture density model we use, correspond to the processes in the real world. Eventually, at run-time, novel processes emerge in the technical systems such as in the case of an unpredictable failure. As a consequence, new kinds of samples are observed that require an adaptation of the model. CANDIES relies on mixtures of Gaussians which can be used for classification purposes, too. New processes may emerge in regions of the models' input spaces where few samples were observed before (low-density regions) or in regions where already many samples were available (high-density regions). The latter case is more difficult, but most existing solutions focus on the former. Novelty detection in low- and high-density regions requires different detection strategies. With CANDIES, we introduce a new technique to detect novel processes in high-density regions by means of a fast online goodness-of-fit test. For detection in low-density regions we combine this approach with a 2SND (Two-Stage-Novelty-Detector) which we presented in preliminary work. The properties of CANDIES are evaluated using artificial data and benchmark data from the field of intrusion detection in computer networks, where the task is to detect new kinds of attacks.

preprint2016arXiv

Towards Automation of Knowledge Understanding: An Approach for Probabilistic Generative Classifiers

After data selection, pre-processing, transformation, and feature extraction, knowledge extraction is not the final step in a data mining process. It is then necessary to understand this knowledge in order to apply it efficiently and effectively. Up to now, there is a lack of appropriate techniques that support this significant step. This is partly due to the fact that the assessment of knowledge is often highly subjective, e.g., regarding aspects such as novelty or usefulness. These aspects depend on the specific knowledge and requirements of the data miner. There are, however, a number of aspects that are objective and for which it is possible to provide appropriate measures. In this article we focus on classification problems and use probabilistic generative classifiers based on mixture density models that are quite common in data mining applications. We define objective measures to assess the informativeness, uniqueness, importance, discrimination, representativity, uncertainty, and distinguishability of rules contained in these classifiers numerically. These measures not only support a data miner in evaluating results of a data mining process based on such classifiers. As we will see in illustrative case studies, they may also be used to improve the data mining process itself or to support the later application of the extracted knowledge.

preprint2016arXiv

Variational Bayesian Inference for Hidden Markov Models With Multivariate Gaussian Output Distributions

Hidden Markov Models (HMM) have been used for several years in many time series analysis or pattern recognitions tasks. HMM are often trained by means of the Baum-Welch algorithm which can be seen as a special variant of an expectation maximization (EM) algorithm. Second-order training techniques such as Variational Bayesian Inference (VI) for probabilistic models regard the parameters of the probabilistic models as random variables and define distributions over these distribution parameters, hence the name of this technique. VI can also bee regarded as a special case of an EM algorithm. In this article, we bring both together and train HMM with multivariate Gaussian output distributions with VI. The article defines the new training technique for HMM. An evaluation based on some case studies and a comparison to related approaches is part of our ongoing work.