Source author record

Stefan Güttel

Stefan Güttel appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Machine Learning, math.NA, math-ph, math.MP, Mathematical Software, Numerical Analysis

Catalog footprint

What is connected

6 works

6 topics

4 close collaborators

preprint · 2026 · arXiv

Fast and explainable clustering in the Manhattan and Tanimoto distance

The CLASSIX algorithm is a fast and explainable approach to data clustering. In its original form, this algorithm exploits the sorting of the data points by their first principal component to truncate the search for nearby data points, with nearness being defined in terms of the Euclidean distance. Here we extend CLASSIX to other distance metrics, including the Manhattan distance and the Tanimoto distance. Instead of principal components, we use an appropriate norm of the data vectors as the sorting criterion, combined with the triangle inequality for search termination. In the case of Tanimoto distance, a provably sharper intersection inequality is used to further boost the performance of the new algorithm. On a real-world chemical fingerprint benchmark, CLASSIX Tanimoto is about 30 times faster than the Taylor-Butina algorithm, and about 80 times faster than DBSCAN, while computing higher-quality clusters in both cases.
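The sorting idea in this abstract can be illustrated with a small radius search in the Manhattan distance. The sketch below is not the CLASSIX implementation; the function names and the `radius` parameter are hypothetical. It sorts points by their 1-norm and relies on the reverse triangle inequality, | ||x||_1 - ||y||_1 | <= ||x - y||_1, to stop scanning as soon as the norm gap exceeds the radius.

```python
def manhattan(x, y):
    """Manhattan (1-norm) distance between two equal-length sequences."""
    return sum(abs(a - b) for a, b in zip(x, y))


def radius_neighbors_manhattan(points, radius):
    """Return, for each point, the sorted indices of all other points
    within `radius` in Manhattan distance.

    Illustrative sketch only: points are sorted by their 1-norm, and the
    inner scan stops once the norm gap exceeds `radius`, since the reverse
    triangle inequality then rules out every later point.
    """
    norms = [sum(abs(a) for a in p) for p in points]
    order = sorted(range(len(points)), key=norms.__getitem__)
    neighbors = [[] for _ in points]
    for pos, i in enumerate(order):
        for q in range(pos + 1, len(order)):
            j = order[q]
            if norms[j] - norms[i] > radius:
                break  # all remaining points have an even larger norm gap
            if manhattan(points[i], points[j]) <= radius:
                neighbors[i].append(j)
                neighbors[j].append(i)
    return [sorted(nb) for nb in neighbors]
```

The early `break` never discards a true neighbor, so the result matches a brute-force search; the saving comes from skipping distance evaluations for points whose norms differ too much.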

preprint · 2022 · arXiv

An efficient aggregation method for the symbolic representation of temporal data

Symbolic representations are a useful tool for the dimension reduction of temporal data, allowing for the efficient storage of and information retrieval from time series. They can also enhance the training of machine learning algorithms on time series data through noise reduction and reduced sensitivity to hyperparameters. The adaptive Brownian bridge-based aggregation (ABBA) method is one such effective and robust symbolic representation, demonstrated to accurately capture important trends and shapes in time series. However, in its current form the method struggles to process very large time series. Here we present a new variant of the ABBA method, called fABBA. This variant utilizes a new aggregation approach tailored to the piecewise representation of time series. By replacing the k-means clustering used in ABBA with a sorting-based aggregation technique, and thereby avoiding repeated sum-of-squares error computations, the computational complexity is significantly reduced. In contrast to the original method, the new approach does not require the number of time series symbols to be specified in advance. Through extensive tests we demonstrate that the new method significantly outperforms ABBA with a considerable reduction in runtime while also outperforming the popular SAX and 1d-SAX representations in terms of reconstruction accuracy. We further demonstrate that fABBA can compress other data types such as images.
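A sorting-based aggregation of the kind mentioned in this abstract might look roughly like the following. This is a sketch under assumptions, not the fABBA code: the function name `aggregate`, the tolerance `alpha`, and the choice of sorting by Euclidean norm are all illustrative. Each piece is treated as a 2D tuple, and groups form greedily around the first unassigned piece in sorted order, so the number of groups (symbols) is a result rather than an input.

```python
import math


def aggregate(pieces, alpha):
    """Greedily group 2D tuples; returns one integer group label per tuple.

    Hypothetical sketch: tuples are sorted by Euclidean norm, the first
    unassigned tuple becomes a group's starting point, and every unassigned
    tuple within distance `alpha` of it joins that group.
    """
    norms = [math.hypot(*p) for p in pieces]
    order = sorted(range(len(pieces)), key=norms.__getitem__)
    labels = [-1] * len(pieces)
    n_groups = 0
    for pos, i in enumerate(order):
        if labels[i] != -1:
            continue  # already absorbed by an earlier starting point
        labels[i] = n_groups
        for q in range(pos + 1, len(order)):
            j = order[q]
            if norms[j] - norms[i] > alpha:
                break  # later tuples cannot be within alpha of tuple i
            if labels[j] == -1 and math.dist(pieces[i], pieces[j]) <= alpha:
                labels[j] = n_groups
        n_groups += 1
    return labels
```

Unlike k-means, this makes a single pass over the sorted tuples and needs no iterative recomputation of cluster centers, which is the kind of saving the abstract attributes to the sorting-based approach.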

preprint · 2022 · arXiv

Model order reduction of layered waveguides via rational Krylov fitting

Rational approximation recently emerged as an efficient numerical tool for the solution of exterior wave propagation problems. Currently, this technique is limited to wave media which are invariant along the main propagation direction. We propose a new model order reduction-based approach for compressing unbounded waveguides with layered inclusions. It is based on the solution of a nonlinear rational least squares problem using the RKFIT method. We show that approximants can be converted into an accurate finite difference representation within a rational Krylov framework. Numerical experiments indicate that RKFIT computes more accurate grids than previous analytic approaches and even works in the presence of pronounced scattering resonances. Spectral adaptation effects allow for finite difference grids with dimensions near or even below the Nyquist limit.

preprint · 2020 · arXiv

ABBA: Adaptive Brownian bridge-based symbolic aggregation of time series

A new symbolic representation of time series, called ABBA, is introduced. It is based on an adaptive polygonal chain approximation of the time series into a sequence of tuples, followed by a mean-based clustering to obtain the symbolic representation. We show that the reconstruction error of this representation can be modelled as a random walk with pinned start and end points, a so-called Brownian bridge. This insight allows us to make ABBA essentially parameter-free, except for the approximation tolerance which must be chosen. Extensive comparisons with the SAX and 1d-SAX representations are included in the form of performance profiles, showing that ABBA is able to better preserve the essential shape information of time series compared to other approaches. Advantages and applications of ABBA are discussed, including its in-built differencing property and use for anomaly detection, and Python implementations are provided.
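The polygonal chain step described in this abstract can be sketched as a greedy compression of a series into (length, increment) tuples. The code below is illustrative and not the authors' implementation: the names `segment_error`, `compress` and `reconstruct` are hypothetical, and the acceptance rule (squared error of a segment at most `(length - 1) * tol**2`) is an assumption that the abstract does not state.

```python
def segment_error(ts, start, end):
    """Sum of squared deviations of ts[start..end] from the straight
    line joining ts[start] and ts[end]."""
    length = end - start
    inc = ts[end] - ts[start]
    return sum(
        (ts[start] + inc * (k / length) - ts[start + k]) ** 2
        for k in range(length + 1)
    )


def compress(ts, tol):
    """Greedy polygonal chain approximation; returns (length, increment) tuples.

    Assumed rule: a segment of a given length is accepted while its squared
    error stays at or below (length - 1) * tol**2.
    """
    pieces = []
    start, n = 0, len(ts)
    while start < n - 1:
        end = start + 1
        # Extend the segment while the next longer one still meets the bound.
        while end + 1 < n and segment_error(ts, start, end + 1) <= (end - start) * tol ** 2:
            end += 1
        pieces.append((end - start, ts[end] - ts[start]))
        start = end
    return pieces


def reconstruct(first_value, pieces):
    """Rebuild a series from its first value and the (length, increment) tuples."""
    out = [first_value]
    for length, inc in pieces:
        base = out[-1]
        out.extend(base + inc * k / length for k in range(1, length + 1))
    return out
```

Because every tuple stores an increment rather than an absolute value, the reconstruction from these tuples starts at the first value and, up to rounding, ends at the last value of the original series.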

preprint · 2020 · arXiv

Time Series Forecasting Using LSTM Networks: A Symbolic Approach

Machine learning methods trained on raw numerical time series data exhibit fundamental limitations such as a high sensitivity to the hyperparameters and even to the initialization of random weights. A combination of a recurrent neural network with a dimension-reducing symbolic representation is proposed and applied for the purpose of time series forecasting. It is shown that the symbolic representation can help to alleviate some of the aforementioned problems and, in addition, might allow for faster training without sacrificing the forecast performance.

preprint · 2015 · arXiv

Near-optimal perfectly matched layers for indefinite Helmholtz problems

A new construction of an absorbing boundary condition for indefinite Helmholtz problems on unbounded domains is presented. This construction is based on a near-best uniform rational interpolant of the inverse square root function on the union of a negative and positive real interval, designed with the help of a classical result by Zolotarev. Using Krein's interpretation of a Stieltjes continued fraction, this interpolant can be converted into a three-term finite difference discretization of a perfectly matched layer (PML) which converges exponentially fast in the number of grid points. The convergence rate is asymptotically optimal for both propagative and evanescent wave modes. Several numerical experiments and illustrations are included.