Source author record

Jacek Tabor

Jacek Tabor appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Machine Learning, Computer Vision, Information Theory, math.IT, Artificial Intelligence, Methodology, Cryptography and Security, eess.IV, math.DS, math.NA, math.ST, Numerical Analysis, physics.comp-ph, Statistics Theory

Catalog footprint

What is connected

36 works

14 topics

4 close collaborators


preprint · 2026 · arXiv

Bayesian Fine-tuning in Projected Subspaces

Low-Rank Adaptation (LoRA) enables parameter-efficient fine-tuning of large models by decomposing weight updates into low-rank matrices, significantly reducing storage and computational overhead. While effective, standard LoRA lacks mechanisms for uncertainty quantification, leading to overconfident and poorly calibrated models. Bayesian variants of LoRA address this limitation, but at the cost of a significantly increased number of trainable parameters, partially offsetting the original efficiency gains. Additionally, these models are harder to train and may suffer from unstable convergence. In this work, we propose a novel framework for parameter-efficient Bayesian fine-tuning, demonstrating that effective uncertainty quantification can be achieved in very low-dimensional parameter spaces. The proposed method achieves strong performance with improved calibration and generalization while maintaining computational efficiency. Our empirical findings show that, with the appropriate projection of the weight space, uncertainty can be effectively modeled in a low-dimensional space, and that weight covariances exhibit low rank.
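As a rough illustration of the low-rank decomposition this abstract starts from (plain LoRA, not the Bayesian method the paper proposes), the sketch below builds a rank-1 update `delta_w = B A` and compares trainable-parameter counts for a full update and a low-rank one. The matrix sizes, the rank, and the helper names `matmul` and `lora_param_counts` are hypothetical choices made for this example.

```python
# Illustrative sketch of the plain LoRA parameterization only.
# Sizes, rank and helper names are assumptions, not taken from the paper.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def lora_param_counts(d_out, d_in, rank):
    """Return (full, low_rank) trainable-parameter counts for one weight matrix."""
    full = d_out * d_in
    low_rank = rank * (d_out + d_in)  # B is d_out x rank, A is rank x d_in
    return full, low_rank

# A rank-1 update of a 3 x 2 weight matrix.
B = [[1.0], [0.0], [2.0]]   # 3 x 1
A = [[0.5, -1.0]]           # 1 x 2
delta_w = matmul(B, A)      # 3 x 2 matrix of rank at most 1

full, low = lora_param_counts(d_out=4096, d_in=4096, rank=8)
print(delta_w)
print(full, low)
```

With these hypothetical sizes the low-rank factors hold 65,536 trainable values against 16,777,216 for the full matrix, which is the efficiency gain that Bayesian variants partly give back when they add parameters for uncertainty.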

preprint · 2026 · arXiv

ProDG: Prototypes for Data-Free Generative Post-Hoc Explainability

Ante-hoc interpretability methods based on prototypes provide highly accurate explanations by utilizing the intuitive "this looks like that" reasoning paradigm. On the other hand, post-hoc models can explain predictions for a single image without relying on an underlying dataset or requiring costly neural network retraining. Recent approaches successfully solve the retraining problem for prototype-based networks. However, they still face a fundamental limitation: they require access to a subset of data (e.g., a test or validation set) to search for and extract the visual prototypes. In this paper, we address this issue and introduce ProDG: Generative Prototypes for Data-Free Post-Hoc Explainability, a novel framework that leverages generative models to synthesize pure, high-fidelity prototypes directly from the frozen model's weights, completely eliminating the dependency on any external data. By establishing this new frontier in Data-Free XAI, ProDG unlocks robust visual interpretability for privacy-sensitive domains, where original data is strictly restricted or fundamentally inaccessible. Project page: https://github.com/piotr310100/ProDG

preprint · 2026 · arXiv

SoftSAE: Dynamic Top-K Selection for Adaptive Sparse Autoencoders

Sparse Autoencoders (SAEs) have become an important tool in mechanistic interpretability, helping to analyze internal representations in both Large Language Models (LLMs) and Vision Transformers (ViTs). By decomposing polysemantic activations into sparse sets of monosemantic features, SAEs aim to translate neural network computations into human-understandable concepts. However, common architectures such as TopK SAEs rely on a fixed sparsity level. They enforce the same number of active features (K) across all inputs, ignoring the varying complexity of real-world data. Natural data often lies on manifolds with varying local intrinsic dimensionality, meaning the number of relevant factors can change significantly across samples. This suggests that a fixed sparsity level is not optimal. Simple inputs may require only a few features, while more complex ones need more expressive representations. Using a constant K can therefore introduce noise in simple cases or miss important structure in more complex ones. To address this issue, we propose SoftSAE, a sparse autoencoder with a Dynamic Top-K selection mechanism. Our method uses a differentiable Soft Top-K operator to learn an input-dependent sparsity level k. This allows the model to adjust the number of active features based on the complexity of each input. As a result, the representation better matches the structure of the data, and the explanation length reflects the amount of information in the input. Experimental results confirm that SoftSAE not only finds meaningful features, but also selects the right number of features for each concept. The source code is available at: https://github.com/St0pien/SoftSAE.
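To make the fixed-sparsity baseline concrete, here is a minimal sketch of a hard Top-K activation that keeps the same number of active features for every input. It illustrates only the fixed-K behaviour the abstract criticises; it is not the differentiable Soft Top-K operator of SoftSAE, and the function name and example values are hypothetical.

```python
# Minimal sketch of a fixed Top-K activation (the baseline, not SoftSAE itself).
# The function name and the example inputs are assumptions for illustration.

def top_k_activation(pre_activations, k):
    """Keep the k largest entries and set every other entry to zero."""
    if k <= 0:
        return [0.0] * len(pre_activations)
    # Indices of the k largest values; ties are broken by position.
    order = sorted(range(len(pre_activations)),
                   key=lambda i: pre_activations[i], reverse=True)
    keep = set(order[:k])
    return [v if i in keep else 0.0 for i, v in enumerate(pre_activations)]

codes = top_k_activation([0.1, 2.0, -0.3, 1.5, 0.7], k=2)
print(codes)
```

Every input is forced to exactly `k` active features here, regardless of how simple or complex it is, which is the limitation an input-dependent sparsity level is meant to remove.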

preprint · 2026 · arXiv

Stop Marginalizing My Dreams: Model Inversion via Laplace Kernel for Continual Learning

Data-free continual learning (DFCIL) relies on model inversion to synthesize pseudo-samples and mitigate catastrophic forgetting. However, existing inversion methods are fundamentally limited by a simplifying assumption: they model feature distributions using diagonal covariance, effectively ignoring correlations that define the geometry of learned representations. As a result, synthesized samples often lack fidelity, limiting knowledge retention. In this work, we show that modeling feature dependencies is a key ingredient for effective DFCIL. We introduce REMIX, a structured covariance modeling framework that enables scalable full-covariance modeling without the prohibitive cost of dense matrix inversion and log-determinant computation. By leveraging a Laplace kernel parameterization, REMIX captures structured feature dependencies using memory that scales linearly with the feature dimensionality, while requiring only an additional logarithmic factor in computation. Modeling these correlations produces more coherent synthetic samples and consistently improves performance across standard DFCIL benchmarks. Our results demonstrate that moving beyond diagonal assumptions is essential for effective and scalable data-free continual learning. Our code is available at https://github.com/pkrukowski1/REMIX-Model-Inversion-via-Laplace-Kernel.

preprint · 2022 · arXiv

Continual Learning with Guarantees via Weight Interval Constraints

We introduce a new training paradigm that enforces interval constraints on neural network parameter space to control forgetting. Contemporary Continual Learning (CL) methods focus on training neural networks efficiently from a stream of data, while reducing the negative impact of catastrophic forgetting, yet they do not provide any firm guarantees that network performance will not deteriorate uncontrollably over time. In this work, we show how to put bounds on forgetting by reformulating continual learning of a model as a continual contraction of its parameter space. To that end, we propose Hyperrectangle Training, a new training methodology where each task is represented by a hyperrectangle in the parameter space, fully contained in the hyperrectangles of the previous tasks. This formulation reduces the NP-hard CL problem back to polynomial time while providing full resilience against forgetting. We validate our claim by developing the InterContiNet (Interval Continual Learning) algorithm, which leverages interval arithmetic to effectively model parameter regions as hyperrectangles. Through experimental results, we show that our approach performs well in a continual learning setup without storing data from previous tasks.
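The nesting condition described above can be sketched in a few lines: each task owns an axis-aligned box in parameter space, a new box must lie inside the previous one, and parameters are kept inside the current box. This is only an illustration of the containment idea under assumed representations (a box as a pair of lower and upper bound lists); it is not the InterContiNet algorithm, and `contains` and `clip_to_box` are hypothetical helper names.

```python
# Sketch of nested hyperrectangles in parameter space (not InterContiNet itself).
# A box is represented as (lower_bounds, upper_bounds); this layout is an assumption.

def contains(outer, inner):
    """True if hyperrectangle `inner` lies fully inside hyperrectangle `outer`."""
    (lo_outer, hi_outer), (lo_inner, hi_inner) = outer, inner
    lower_ok = all(a <= b for a, b in zip(lo_outer, lo_inner))
    upper_ok = all(b <= a for a, b in zip(hi_outer, hi_inner))
    return lower_ok and upper_ok

def clip_to_box(params, box):
    """Project a parameter vector onto a hyperrectangle, coordinate by coordinate."""
    lower, upper = box
    return [min(max(p, lo), hi) for p, lo, hi in zip(params, lower, upper)]

task1_box = ([-1.0, -1.0], [1.0, 1.0])
task2_box = ([-0.5, 0.0], [0.5, 1.0])   # must sit inside task1_box

nested = contains(task1_box, task2_box)
clipped = clip_to_box([0.9, -0.4], task2_box)
print(nested, clipped)
```

Because every later box sits inside all earlier ones, any parameter vector admissible for the latest task is also admissible for every previous task, which is the intuition behind bounding forgetting.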

preprint · 2022 · arXiv

Interpretable Image Classification with Differentiable Prototypes Assignment

We introduce ProtoPool, an interpretable image classification model with a pool of prototypes shared by the classes. The training is more straightforward than in the existing methods because it does not require the pruning stage. It is obtained by introducing a fully differentiable assignment of prototypes to particular classes. Moreover, we introduce a novel focal similarity function to focus the model on the rare foreground features. We show that ProtoPool obtains state-of-the-art accuracy on the CUB-200-2011 and the Stanford Cars datasets, substantially reducing the number of prototypes. We provide a theoretical analysis of the method and a user study to show that our prototypes are more distinctive than those obtained with competitive methods.

preprint · 2022 · arXiv

LIDL: Local Intrinsic Dimension Estimation Using Approximate Likelihood

Most of the existing methods for estimating the local intrinsic dimension of a data distribution do not scale well to high-dimensional data. Many of them rely on a non-parametric nearest neighbors approach which suffers from the curse of dimensionality. We attempt to address that challenge by proposing a novel approach to the problem: Local Intrinsic Dimension estimation using approximate Likelihood (LIDL). Our method relies on an arbitrary density estimation method as its subroutine and hence tries to sidestep the dimensionality challenge by making use of the recent progress in parametric neural methods for likelihood estimation. We carefully investigate the empirical properties of the proposed method, compare them with our theoretical predictions, and show that LIDL yields competitive results on the standard benchmarks for this problem and that it scales to thousands of dimensions. What is more, we anticipate this approach to improve further with the continuing advances in the density estimation literature.

preprint · 2022 · arXiv

ProPaLL: Probabilistic Partial Label Learning

Partial label learning is a type of weakly supervised learning, where each training instance corresponds to a set of candidate labels, among which only one is true. In this paper, we introduce ProPaLL, a novel probabilistic approach to this problem, which has at least three advantages compared to the existing approaches: it simplifies the training process, improves performance, and can be applied to any deep architecture. Experiments conducted on artificial and real-world datasets indicate that ProPaLL outperforms the existing approaches.

preprint · 2022 · arXiv

SLOVA: Uncertainty Estimation Using Single Label One-Vs-All Classifier

Deep neural networks present impressive performance, yet they cannot reliably estimate their predictive confidence, limiting their applicability in high-risk domains. We show that applying a multi-label one-vs-all loss reveals classification ambiguity and reduces model overconfidence. The introduced SLOVA (Single Label One-Vs-All) model redefines typical one-vs-all predictive probabilities to a single label situation, where only one class is the correct answer. The proposed classifier is confident only if a single class has a high probability and other probabilities are negligible. Unlike the typical softmax function, SLOVA naturally detects out-of-distribution samples if the probabilities of all other classes are small. The model is additionally fine-tuned with exponential calibration, which allows us to precisely align the confidence score with model accuracy. We verify our approach on three tasks. First, we demonstrate that SLOVA is competitive with the state-of-the-art on in-distribution calibration. Second, the performance of SLOVA is robust under dataset shifts. Finally, our approach performs extremely well in the detection of out-of-distribution samples. Consequently, SLOVA is a tool that can be used in various applications where uncertainty modeling is required.

preprint · 2021 · arXiv

HyperPocket: Generative Point Cloud Completion

Scanning real-life scenes with modern registration devices typically gives incomplete point cloud representations, mostly due to the limitations of the scanning process and 3D occlusions. Therefore, completing such partial representations remains a fundamental challenge of many computer vision applications. Most of the existing approaches aim to solve this problem by learning to reconstruct individual 3D objects in a synthetic setup of an uncluttered environment, which is far from a real-life scenario. In this work, we reformulate the problem of point cloud completion into an object hallucination task. Thus, we introduce a novel autoencoder-based architecture called HyperPocket that disentangles latent representations and, as a result, enables the generation of multiple variants of the completed 3D point clouds. We split point cloud processing into two disjoint data streams and leverage a hypernetwork paradigm to fill the spaces, dubbed pockets, that are left by the missing object parts. As a result, the generated point clouds are not only smooth but also plausible and geometrically consistent with the scene. Our method offers performance competitive with other state-of-the-art models, and it enables a plethora of novel applications.

preprint · 2021 · arXiv

Kernel Self-Attention in Deep Multiple Instance Learning

Not all supervised learning problems are described by a pair of a fixed-size input tensor and a label. In some cases, especially in medical image analysis, a label corresponds to a bag of instances (e.g. image patches), and to classify such a bag, information from all of the instances must be aggregated. There have been several attempts to create a model working with a bag of instances; however, they assume that there are no dependencies within the bag and that the label is connected to at least one instance. In this work, we introduce the Self-Attention Attention-based MIL Pooling (SA-AbMILP) aggregation operation to account for the dependencies between instances. We conduct several experiments on MNIST, histological, microbiological, and retinal databases to show that SA-AbMILP performs better than other models. Additionally, we investigate kernel variations of Self-Attention and their influence on the results.

preprint · 2020 · arXiv

Adversarial Examples Detection and Analysis with Layer-wise Autoencoders

We present a mechanism for detecting adversarial examples based on data representations taken from the hidden layers of the target network. For this purpose, we train individual autoencoders at intermediate layers of the target network. This allows us to describe the manifold of true data and, in consequence, decide whether a given example has the same characteristics as true data. It also gives us insight into the behavior of adversarial examples and their flow through the layers of a deep neural network. Experimental results show that our method outperforms the state of the art in supervised and unsupervised settings.

preprint · 2020 · arXiv

Finding the Optimal Network Depth in Classification Tasks

We develop a fast end-to-end method for training lightweight neural networks using multiple classifier heads. By allowing the model to determine the importance of each head and rewarding the choice of a single shallow classifier, we are able to detect and remove unneeded components of the network. This operation, which can be seen as finding the optimal depth of the model, significantly reduces the number of parameters and accelerates inference across different hardware processing units, which is not the case for many standard pruning methods. We show the performance of our method on multiple network architectures and datasets, analyze its optimization properties, and conduct ablation studies.

preprint · 2020 · arXiv

Generative models with kernel distance in data space

Generative models dealing with modeling a joint data distribution are generally either autoencoder or GAN based. Both have their pros and cons, generating blurry images or being unstable in training or prone to the mode collapse phenomenon, respectively. The objective of this paper is to construct a model situated between the above architectures, one that does not inherit their main weaknesses. The proposed LCW generator (Latent Cramer-Wold generator) resembles a classical GAN in transforming Gaussian noise into data space. What is of utmost importance, instead of a discriminator, the LCW generator uses kernel distance. No adversarial training is utilized, hence the name generator. It is trained in two phases. First, an autoencoder-based architecture, using kernel measures, is built to model a manifold of data. We propose a Latent Trick mapping a Gaussian to latent in order to get the final model. This results in very competitive FID values.

preprint · 2020 · arXiv

HyperFlow: Representing 3D Objects as Surfaces

In this work, we present HyperFlow - a novel generative model that leverages hypernetworks to create continuous 3D object representations in the form of lightweight surfaces (meshes), directly out of point clouds. Efficient object representations are essential for many computer vision applications, including robotic manipulation and autonomous driving. However, creating those representations is often cumbersome, because it requires processing unordered sets of point clouds. Therefore, it is either computationally expensive, due to additional optimization constraints such as permutation invariance, or leads to quantization losses introduced by binning point clouds into discrete voxels. Inspired by mesh-based representations of objects used in computer graphics, we postulate a fundamentally different approach and represent 3D objects as a family of surfaces. To that end, we devise a generative model that uses a hypernetwork to return the weights of a Continuous Normalizing Flows (CNF) target network. The goal of this target network is to map points from a probability distribution into a 3D mesh. To avoid numerical instability of the CNF on compact support distributions, we propose a new Spherical Log-Normal function which models the density of 3D points around object surfaces, mimicking noise introduced by 3D capturing devices. As a result, we obtain continuous mesh-based object representations that yield better qualitative results than competing approaches, while reducing training time by over an order of magnitude.

preprint · 2020 · arXiv

Molecule Attention Transformer

Designing a single neural network architecture that performs competitively across a range of molecule property prediction tasks remains largely an open challenge, and its solution may unlock a widespread use of deep learning in the drug discovery industry. To move towards this goal, we propose Molecule Attention Transformer (MAT). Our key innovation is to augment the attention mechanism in Transformer using inter-atomic distances and the molecular graph structure. Experiments show that MAT performs competitively on a diverse set of molecular prediction tasks. Most importantly, with a simple self-supervised pretraining, MAT requires tuning of only a few hyperparameter values to achieve state-of-the-art performance on downstream tasks. Finally, we show that attention weights learned by MAT are interpretable from the chemical point of view.

preprint · 2020 · arXiv

SeGMA: Semi-Supervised Gaussian Mixture Auto-Encoder

We propose a semi-supervised generative model, SeGMA, which learns a joint probability distribution of data and their classes and which is implemented in a typical Wasserstein auto-encoder framework. We choose a mixture of Gaussians as a target distribution in latent space, which provides a natural splitting of data into clusters. To connect Gaussian components with correct classes, we use a small amount of labeled data and a Gaussian classifier induced by the target distribution. SeGMA is optimized efficiently due to the use of Cramer-Wold distance as a maximum mean discrepancy penalty, which yields a closed-form expression for a mixture of spherical Gaussian components and thus obviates the need for sampling. While SeGMA preserves all properties of its semi-supervised predecessors and achieves at least as good generative performance on standard benchmark data sets, it presents additional features: (a) interpolation between any pair of points in the latent space produces realistic-looking samples; (b) combining the interpolation property with disentangled class and style variables, SeGMA is able to perform a continuous style transfer from one class to another; (c) it is possible to change the intensity of class characteristics in a data point by moving the latent representation of the data point away from specific Gaussian components.

preprint · 2020 · arXiv

Spatial Graph Convolutional Networks

Graph Convolutional Networks (GCNs) have recently become the primary choice for learning from graph-structured data, superseding hash fingerprints in representing chemical compounds. However, GCNs lack the ability to take into account the ordering of node neighbors, even when there is a geometric interpretation of the graph vertices that provides an order based on their spatial positions. To remedy this issue, we propose Spatial Graph Convolutional Network (SGCN) which uses spatial features to efficiently learn from graphs that can be naturally located in space. Our contribution is threefold: we propose a GCN-inspired architecture which (i) leverages node positions, (ii) is a proper generalization of both GCNs and Convolutional Neural Networks (CNNs), (iii) benefits from augmentation which further improves the performance and assures invariance with respect to the desired properties. Empirically, SGCN outperforms state-of-the-art graph-based methods on image classification and chemical tasks.

preprint · 2020 · arXiv

The Break-Even Point on Optimization Trajectories of Deep Neural Networks

The early phase of training of deep neural networks is critical for their final performance. In this work, we study how the hyperparameters of stochastic gradient descent (SGD) used in the early phase of training affect the rest of the optimization trajectory. We argue for the existence of the "break-even" point on this trajectory, beyond which the curvature of the loss surface and noise in the gradient are implicitly regularized by SGD. In particular, we demonstrate on multiple classification tasks that using a large learning rate in the initial phase of training reduces the variance of the gradient, and improves the conditioning of the covariance of gradients. These effects are beneficial from the optimization perspective and become visible after the break-even point. Complementing prior work, we also show that using a low learning rate results in bad conditioning of the loss surface even for a neural network with batch normalization layers. In short, our work shows that key properties of the loss surface are strongly influenced by SGD in the early phase of training. We argue that studying the impact of the identified effects on generalization is a promising future direction.

preprint · 2019 · arXiv

Cramer-Wold AutoEncoder

We propose a new generative model, Cramer-Wold Autoencoder (CWAE). Following WAE, we directly encourage normality of the latent space. Our paper also uses the recent idea from the Sliced WAE (SWAE) model, which uses one-dimensional projections as a method of verifying closeness of two distributions. The crucial new ingredient is the introduction of a new (Cramer-Wold) metric in the space of densities, which replaces the Wasserstein metric used in SWAE. We show that the Cramer-Wold metric between Gaussian mixtures is given by a simple analytic formula, which results in the removal of sampling necessary to estimate the cost function in WAE and SWAE models. As a consequence, while drastically simplifying the optimization procedure, CWAE produces samples of a matching perceptual quality to other SOTA models.

preprint · 2019 · arXiv

Set Aggregation Network as a Trainable Pooling Layer

Global pooling, such as max- or sum-pooling, is one of the key ingredients in deep neural networks used for processing images, texts, graphs and other types of structured data. Based on the recent DeepSets architecture proposed by Zaheer et al. (NIPS 2017), we introduce a Set Aggregation Network (SAN) as an alternative global pooling layer. In contrast to typical pooling operators, SAN allows embedding a given set of features into a vector representation of arbitrary size. We show that by adjusting the size of the embedding, SAN is capable of preserving the whole information from the input. In experiments, we demonstrate that replacing the global pooling layer with SAN improves classification accuracy. Moreover, it is less prone to overfitting and can be used as a regularizer.

preprint · 2015 · arXiv

Extreme Entropy Machines: Robust information theoretic classification

Most of the existing classification methods are aimed at minimization of empirical risk (through some simple point-based error measured with a loss function) with added regularization. We propose to approach this problem in a more information theoretic way by investigating the applicability of entropy measures as a classification model objective function. We focus on quadratic Renyi's entropy and the connected Cauchy-Schwarz Divergence, which leads to the construction of Extreme Entropy Machines (EEM). The main contribution of this paper is proposing a model based on information theoretic concepts which on the one hand shows a new, entropic perspective on known linear classifiers and on the other leads to the construction of a very robust method competitive with the state of the art non-information theoretic ones (including Support Vector Machines and Extreme Learning Machines). Evaluation on numerous problems spanning from small, simple ones from the UCI repository to large (hundreds of thousands of samples), extremely unbalanced (up to 100:1 classes' ratios) datasets shows wide applicability of the EEM in real life problems and that it scales well.

preprint · 2015 · arXiv

Introduction to Cross-Entropy Clustering The R Package CEC

The R Package CEC performs clustering based on the cross-entropy clustering (CEC) method, which was recently developed with the use of information theory. The main advantage of CEC is that it combines the speed and simplicity of $k$-means with the ability to use various Gaussian mixture models and reduce unnecessary clusters. In this work we present a practical tutorial to CEC based on the R Package CEC. Functions are provided to encompass the whole process of clustering.

preprint · 2015 · arXiv

Maximum Entropy Linear Manifold for Learning Discriminative Low-dimensional Representation

Representation learning is currently a very hot topic in modern machine learning, mostly due to the great success of the deep learning methods. In particular, a low-dimensional representation which discriminates classes can not only enhance the classification procedure, but also make it faster, while, contrary to high-dimensional embeddings, it can be efficiently used for visual based exploratory data analysis. In this paper we propose Maximum Entropy Linear Manifold (MELM), a multidimensional generalization of the Multithreshold Entropy Linear Classifier model, which is able to find a low-dimensional linear data projection maximizing discriminativeness of projected classes. As a result we obtain a linear embedding which can be used for classification, class aware dimensionality reduction and data visualization. MELM provides highly discriminative 2D projections of the data which can be used as a method for constructing robust classifiers. We provide both empirical evaluation as well as some interesting theoretical properties of our objective function, such as scale and affine transformation invariance, connections with PCA and bounding of the expected balanced accuracy error.

preprint · 2015 · arXiv

On rigorous estimates of eigenspaces and eigenvalues of a matrix

We present a method of cones for rigorous estimations of eigenvectors, eigenspaces and eigenvalues of a matrix. The key notion is the cone-domination and is inspired by ideas from hyperbolic dynamical systems. We present theorems which allow one to rigorously locate the spectrum of the matrix and the eigenspaces, also multidimensional ones in the case of eigenvalues of multiplicity greater than one or clusters of close eigenvalues. In the case of an isolated eigenvalue we show that our method gives the same or better estimates than the ones known in the literature.

preprint · 2014 · arXiv

Cluster based RBF Kernel for Support Vector Machines

In the classical Gaussian SVM classification we use the feature space projection transforming points to normal distributions with fixed covariance matrices (identity in the standard RBF and the covariance of the whole dataset in Mahalanobis RBF). In this paper we add additional information to Gaussian SVM by considering local geometry-dependent feature space projection. We emphasize that our approach is in fact an algorithm for the construction of a new Gaussian-type kernel. We show that better (compared to standard RBF and Mahalanobis RBF) classification results are obtained in the simple case when the space is preliminarily divided by k-means into two sets and points are represented as normal distributions with covariances calculated according to the dataset partitioning. We call the constructed method C$_k$RBF, where $k$ stands for the number of clusters used in k-means. We show empirically on nine datasets from the UCI repository that C$_2$RBF increases the stability of the grid search (measured as the probability of finding good parameters).
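The two baseline kernels named in this abstract differ only in the matrix used inside the exponent. The sketch below uses the common textbook form `exp(-gamma * (x-y)^T P (x-y))`, where `P` is the identity for the standard RBF and an inverse covariance for a Mahalanobis-type RBF. The function name, the `gamma` parameter, and the example covariance are assumptions for illustration, and the cluster-based C$_k$RBF construction itself is not reproduced.

```python
import math

# Sketch of Gaussian-type kernels with a fixed precision matrix P.
# This is the common textbook form; names and values are assumptions.

def gaussian_kernel(x, y, precision, gamma=1.0):
    """Return exp(-gamma * (x-y)^T P (x-y)) for a precision (inverse covariance) matrix P."""
    d = [a - b for a, b in zip(x, y)]
    n = len(d)
    quad = sum(d[i] * precision[i][j] * d[j] for i in range(n) for j in range(n))
    return math.exp(-gamma * quad)

identity = [[1.0, 0.0], [0.0, 1.0]]
# Hypothetical dataset covariance diag(4, 1), so its inverse is diag(0.25, 1).
inv_cov = [[0.25, 0.0], [0.0, 1.0]]

k_rbf = gaussian_kernel([0.0, 0.0], [2.0, 0.0], identity)   # standard RBF
k_mah = gaussian_kernel([0.0, 0.0], [2.0, 0.0], inv_cov)    # Mahalanobis-type RBF
print(k_rbf, k_mah)
```

In this example the two points look much closer under the Mahalanobis-type kernel because they differ along the direction in which the hypothetical data are most spread out. A cluster-based variant would, in the same spirit, choose the covariance according to the partition a point belongs to.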

preprint · 2014 · arXiv

Multithreshold Entropy Linear Classifier

Linear classifiers separate the data with a hyperplane. In this paper we focus on a novel method of constructing a multithreshold linear classifier, which separates the data with multiple parallel hyperplanes. The proposed model is based on information theory concepts -- namely Renyi's quadratic entropy and Cauchy-Schwarz divergence. We begin with some general properties, including data scale invariance. Then we prove that our method is a multithreshold large margin classifier, which shows the analogy to the SVM, while at the same time it works with a much broader class of hypotheses. What is also interesting, the proposed method is aimed at the maximization of a balanced quality measure (such as Matthews Correlation Coefficient) as opposed to the very common maximization of accuracy. This feature comes directly from the optimization problem statement and is further confirmed by the experiments on the UCI datasets. It appears that our Multithreshold Entropy Linear Classifier (MELC) obtains similar or higher scores than the ones given by SVM on both synthetic and real data. We show how the proposed approach can be beneficial for cheminformatics in the task of ligand activity prediction, where, besides better classification results, MELC gives some additional insight into the data structure (classes of underrepresented chemical compounds).

preprint · 2013 · arXiv

Optimal Rescaling and the Mahalanobis Distance

One of the basic problems in data analysis lies in choosing the optimal rescaling (change of coordinate system) to study properties of a given data-set $Y$. The classical Mahalanobis approach has its basis in the classical normalization/rescaling formula $Y \ni y \to \Sigma_Y^{-1/2} \cdot (y-\mathrm{m}_Y)$, where $\mathrm{m}_Y$ denotes the mean of $Y$ and $\Sigma_Y$ the covariance matrix. Based on the cross-entropy we generalize this approach and define the parameter which measures the fit of a given affine rescaling of $Y$ compared to the Mahalanobis one. This allows in particular to find an optimal change of coordinate system which satisfies some additional conditions. In particular we show that in the case when we put the origin of the coordinate system in $\mathrm{m}$ the optimal choice is given by the transformation $Y \ni y \to \Sigma^{-1/2} \cdot (y-\mathrm{m})$, where $$ \Sigma=\Sigma_Y\left(\Sigma_Y-\frac{(\mathrm{m}-\mathrm{m}_Y)(\mathrm{m}-\mathrm{m}_Y)^T}{1+\|\mathrm{m}-\mathrm{m}_Y\|_{\Sigma_Y}^2}\right)^{-1}\Sigma_Y. $$
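The closed-form expression for $\Sigma$ can be checked numerically. Assuming that the subscripted norm denotes the squared Mahalanobis norm $v^T \Sigma_Y^{-1} v$ (an interpretation not spelled out in the abstract), the Sherman-Morrison formula reduces the expression to $\Sigma_Y + (\mathrm{m}-\mathrm{m}_Y)(\mathrm{m}-\mathrm{m}_Y)^T$, the second-moment matrix of $Y$ about $\mathrm{m}$. The 2x2 sketch below compares both forms; all helper names and the numbers are hypothetical.

```python
# Numerical check of the closed-form Sigma on a 2 x 2 example.
# Assumption: ||v||^2_{Sigma_Y} means v^T Sigma_Y^{-1} v (squared Mahalanobis norm).

def inv2(m):
    """Inverse of a 2 x 2 matrix given as a list of rows."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def sigma_closed_form(sigma_y, m, m_y):
    """Evaluate Sigma_Y (Sigma_Y - v v^T / (1 + ||v||^2))^{-1} Sigma_Y with v = m - m_Y."""
    v = [a - b for a, b in zip(m, m_y)]
    p = inv2(sigma_y)
    norm_sq = sum(v[i] * p[i][j] * v[j] for i in range(2) for j in range(2))
    inner = [[sigma_y[i][j] - v[i] * v[j] / (1.0 + norm_sq) for j in range(2)]
             for i in range(2)]
    return matmul(matmul(sigma_y, inv2(inner)), sigma_y)

def second_moment_about(sigma_y, m, m_y):
    """Evaluate Sigma_Y + v v^T with v = m - m_Y."""
    v = [a - b for a, b in zip(m, m_y)]
    return [[sigma_y[i][j] + v[i] * v[j] for j in range(2)] for i in range(2)]

sigma_y = [[2.0, 0.5], [0.5, 1.0]]
m_y = [1.0, -1.0]
m = [0.0, 0.5]

lhs = sigma_closed_form(sigma_y, m, m_y)
rhs = second_moment_about(sigma_y, m, m_y)
max_gap = max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))
print(lhs, rhs, max_gap)
```

Setting `m` equal to `m_y` makes the correction term vanish, so both forms return $\Sigma_Y$ and the classical Mahalanobis rescaling is recovered.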

preprint · 2012 · arXiv

Cross-Entropy Clustering

We construct a cross-entropy clustering (CEC) theory which finds the optimal number of clusters by automatically removing groups which carry no information. Moreover, our theory gives a simple and efficient criterion to verify cluster validity. Although CEC can be built on an arbitrary family of densities, in the most important case of Gaussian CEC: (1) the division into clusters is affine invariant; (2) the clustering tends to divide the data into ellipsoid-type shapes; (3) the approach is computationally efficient, as we can apply the Hartigan approach. We also study, with particular attention, clustering based on spherical Gaussian densities and on Gaussian densities with covariance $s \mathrm{I}$. In the latter case we show that, as $s$ converges to zero, we obtain the classical k-means clustering.
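To make the idea of a clustering cost concrete, here is a minimal sketch of the Gaussian CEC cost for one-dimensional data. The abstract does not state the cost function; the form used here, the sum over clusters of $p_i \, (-\ln p_i + \tfrac{1}{2}\ln(2\pi e\,\sigma_i^2))$ with $p_i$ the fraction of points in cluster $i$ and $\sigma_i^2$ its variance, is an assumption taken from the general CEC literature. The function name `gaussian_cec_cost` is hypothetical.

```python
import math


def gaussian_cec_cost(clusters):
    """Assumed Gaussian CEC cost of a partition of 1-D data.

    `clusters` is a list of non-empty lists of floats; every cluster
    must have positive variance, otherwise the logarithm is undefined.
    """
    n = sum(len(c) for c in clusters)
    cost = 0.0
    for c in clusters:
        p = len(c) / n                      # cluster weight
        mean = sum(c) / len(c)
        var = sum((x - mean) ** 2 for x in c) / len(c)
        # -ln p is the cost of naming the cluster; the second term is
        # the cross-entropy of the cluster with respect to its best Gaussian.
        cost += p * (-math.log(p) + 0.5 * math.log(2 * math.pi * math.e * var))
    return cost


# Two well-separated groups: splitting them costs less than merging them.
split_cost = gaussian_cec_cost([[0.0, 1.0], [10.0, 11.0]])
merged_cost = gaussian_cec_cost([[0.0, 1.0, 10.0, 11.0]])
```

Under this cost a group is worth keeping only when the reduction in the Gaussian term outweighs the $-\ln p_i$ penalty, which is one way to read the abstract's remark about removing groups that carry no information.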

preprint2012arXiv

Detection of elliptical shapes via cross-entropy clustering

The problem of finding elliptical shapes in an image is considered. We discuss a solution which uses cross-entropy clustering. The proposed method allows searching for ellipses with predefined sizes and positions in the space. Moreover, it works well for finding ellipsoids in higher dimensions.

preprint2012arXiv

Partition Reduction for Lossy Data Compression Problem

We consider the computational aspects of the lossy data compression problem, where the compression error is determined by a cover of the data space. We propose an algorithm which reduces the number of partitions needed to find the entropy with respect to the compression error. In particular, we show that, in the case of a finite cover, the entropy is attained on some partition. We give an algorithmic construction of such a partition.

preprint2012arXiv

Strict localization of eigenvectors and eigenvalues

In this article we show and implement a simple and efficient method to strictly locate eigenvectors and eigenvalues of a given matrix, based on the modified cone condition. As a consequence we can also effectively localize zeros of complex polynomials.

preprint2012arXiv

The memory centre

Let $x \in \mathbb{R}$ be given. As we know, the number of bits needed to binary-code $x$ with a given accuracy ($h \in \mathbb{R}$) is approximately $ \mathrm{m}_{h}(x) \approx \log_{2}(\max \{1, |\frac{x}{h}|\}). $ We consider the problem of translating the origin to $a$ so that the mean number of bits needed to code a randomly chosen element from a realization of a random variable $X$ is minimal. In other words, we want to find $a \in \mathbb{R}$ at which the function $$ \mathbb{R} \ni a \to \mathrm{E} (\mathrm{m}_{h} (X-a)) $$ attains its minimum.
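The objective can be evaluated directly on a finite sample. The sketch below replaces the expectation with an empirical mean and searches a list of candidate origins; this brute-force search is an illustrative assumption, not the method of the paper, and the names `bits`, `mean_bits` and `memory_centre` are hypothetical.

```python
import math


def bits(x, h):
    """Approximate bit count log2(max(1, |x / h|)) for coding x with accuracy h."""
    return math.log2(max(1.0, abs(x / h)))


def mean_bits(sample, a, h):
    """Empirical mean of bits(x - a, h) over the sample."""
    return sum(bits(x - a, h) for x in sample) / len(sample)


def memory_centre(sample, h, candidates=None):
    """Candidate origin with the smallest empirical mean bit count.

    Assumption: only the sample points (or a user-supplied candidate
    list) are tried, so the result is a heuristic, not a proven minimizer.
    """
    if candidates is None:
        candidates = sample
    return min(candidates, key=lambda a: mean_bits(sample, a, h))


# Three values at 10 and one at 0: placing the origin at 10 is cheaper on average.
sample = [10.0, 10.0, 10.0, 0.0]
best = memory_centre(sample, h=1.0)
```

Because the bit count grows only logarithmically with distance, the minimizer tends to sit where the data are dense rather than at the arithmetic mean, which is 7.5 for this sample.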

preprint2012arXiv

Weighted Approach to Rényi Entropy

Rényi entropy of order α is a general measure of entropy. In this paper we derive estimates for the Rényi entropy of a mixture of sources in terms of the entropy of the single sources. These relations make it possible to compute the Rényi entropy dimension of arbitrary order of a mixture of measures. The key to obtaining these results is our new definition of the weighted Rényi entropy. It is shown that weighted entropy is equal to the classical Rényi entropy.
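For reference, the classical Rényi entropy of order α of a discrete distribution is $\frac{1}{1-\alpha}\log_2 \sum_i p_i^{\alpha}$ for $\alpha \neq 1$, with the Shannon entropy as the limit $\alpha \to 1$. The sketch below implements only this standard definition; the weighted variant introduced in the paper is not stated in the abstract and is not reproduced here. The function name `renyi_entropy` is hypothetical.

```python
import math


def renyi_entropy(p, alpha):
    """Classical Rényi entropy (in bits) of a discrete distribution p."""
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    probs = [pi for pi in p if pi > 0]
    if alpha == 1:
        # Limit case: Shannon entropy.
        return -sum(pi * math.log2(pi) for pi in probs)
    return math.log2(sum(pi ** alpha for pi in probs)) / (1.0 - alpha)


uniform = [0.25, 0.25, 0.25, 0.25]
skewed = [0.5, 0.25, 0.25]
h2_uniform = renyi_entropy(uniform, 2)
h1_skewed = renyi_entropy(skewed, 1)
h2_skewed = renyi_entropy(skewed, 2)
```

For a uniform distribution the value does not depend on α, while for a non-uniform one it does not increase as α grows.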

preprint2011arXiv

Entropy of the Mixture of Sources and Entropy Dimension

We investigate the problem of the entropy of a mixture of sources. We give an estimate of the entropy and entropy dimension of a convex combination of measures. The proof relies on our alternative definition of entropy, which uses measures instead of partitions.

preprint2011arXiv

k-means Approach to the Karhunen-Loeve Transform

We present a simultaneous generalization of the well-known Karhunen-Loeve (PCA) and k-means algorithms. The basic idea lies in approximating the data with k affine subspaces of a given dimension n. In the case n=0 we obtain the classical k-means, while for k=1 we obtain the PCA algorithm. We show that for some data exploration problems this method gives better results than either of the classical approaches.
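The approximation by k affine subspaces can be illustrated with the assignment step alone. The sketch below measures the squared distance from a point to an affine subspace given by a centre and an orthonormal basis, then assigns each point to its nearest subspace. It is one natural reading of the abstract, not the authors' code: the refitting step (recomputing each centre and basis from the assigned points) is omitted, and the names `sq_dist_to_subspace`, `assign` and `total_error` are hypothetical.

```python
def sq_dist_to_subspace(x, centre, basis):
    """Squared distance from x to the affine subspace centre + span(basis).

    `basis` is a list of orthonormal vectors; an empty list means n = 0,
    in which case this is the squared distance to the centre.
    """
    d = [xi - ci for xi, ci in zip(x, centre)]
    total = sum(di * di for di in d)
    for v in basis:
        proj = sum(di * vi for di, vi in zip(d, v))
        total -= proj * proj            # remove the component along v
    return max(total, 0.0)              # guard against rounding below zero


def assign(points, subspaces):
    """Index of the nearest subspace for every point."""
    return [min(range(len(subspaces)),
                key=lambda j: sq_dist_to_subspace(p, *subspaces[j]))
            for p in points]


def total_error(points, subspaces):
    """Sum of squared distances from each point to its nearest subspace."""
    return sum(min(sq_dist_to_subspace(p, c, b) for c, b in subspaces)
               for p in points)


points = [(3.0, 0.1), (5.2, 7.0)]
# n = 1: the x-axis and the vertical line x = 5.
lines = [((0.0, 0.0), [(1.0, 0.0)]), ((5.0, 0.0), [(0.0, 1.0)])]
# n = 0: the same centres with empty bases, i.e. a k-means assignment.
centres = [((0.0, 0.0), []), ((5.0, 0.0), [])]
labels_lines = assign(points, lines)
labels_centres = assign(points, centres)
```

With n = 0 the assignment is the usual nearest-centre rule of k-means, and the first point changes cluster once the subspaces are allowed to be lines.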

Institution

Affiliation not imported yet

This author record came from a source that does not expose affiliation metadata. Once the author claims the profile or we enrich the record from another provider, this section will link to the concrete institution.

Source provenance

Where this author record came from

arxiv, confidence 95%

external id: arxiv:2605.07706:author:3:jacek-tabor

Imported May 20, 2026; synced May 21, 2026

arxiv, confidence 95%

external id: arxiv:2605.06610:author:3:jacek-tabor

Imported May 20, 2026; synced May 21, 2026

arxiv, confidence 95%

external id: arxiv:2605.11804:author:2:jacek-tabor

Imported May 20, 2026; synced May 21, 2026

arxiv, confidence 95%

external id: arxiv:2605.08858:author:3:jacek-tabor

Imported May 20, 2026; synced May 20, 2026

16 works

Przemysław Spurek

Researcher

Przemysław Spurek contributes to research discovery and scholarly infrastructure.

Open to collaborate

9 works

Łukasz Struski

Researcher

Łukasz Struski contributes to research discovery and scholarly infrastructure.

Open to collaborate

9 works

Marek Śmieja

Researcher

Marek Śmieja contributes to research discovery and scholarly infrastructure.

Open to collaborate

4 works

Bartosz Wójcik

Researcher

Bartosz Wójcik contributes to research discovery and scholarly infrastructure.

Open to collaborate

36 published item(s)

Bayesian Fine-tuning in Projected Subspaces

ProDG: Prototypes for Data-Free Generative Post-Hoc Explainability

SoftSAE: Dynamic Top-K Selection for Adaptive Sparse Autoencoders

Stop Marginalizing My Dreams: Model Inversion via Laplace Kernel for Continual Learning

Continual Learning with Guarantees via Weight Interval Constraints

Interpretable Image Classification with Differentiable Prototypes Assignment

LIDL: Local Intrinsic Dimension Estimation Using Approximate Likelihood

ProPaLL: Probabilistic Partial Label Learning

SLOVA: Uncertainty Estimation Using Single Label One-Vs-All Classifier

HyperPocket: Generative Point Cloud Completion

Kernel Self-Attention in Deep Multiple Instance Learning

Adversarial Examples Detection and Analysis with Layer-wise Autoencoders

Finding the Optimal Network Depth in Classification Tasks

Generative models with kernel distance in data space

HyperFlow: Representing 3D Objects as Surfaces

Molecule Attention Transformer

SeGMA: Semi-Supervised Gaussian Mixture Auto-Encoder

Spatial Graph Convolutional Networks

The Break-Even Point on Optimization Trajectories of Deep Neural Networks

Cramer-Wold AutoEncoder

Set Aggregation Network as a Trainable Pooling Layer

Extreme Entropy Machines: Robust information theoretic classification

Introduction to Cross-Entropy Clustering The R Package CEC

Maximum Entropy Linear Manifold for Learning Discriminative Low-dimensional Representation

On rigorous estimates of eigenspaces and eigenvalues of a matrix

Cluster based RBF Kernel for Support Vector Machines

Multithreshold Entropy Linear Classifier

Optimal Rescaling and the Mahalanobis Distance

Cross-Entropy Clustering

Detection of elliptical shapes via cross-entropy clustering

Partition Reduction for Lossy Data Compression Problem

Strict localization of eigenvectors and eigenvalues

The memory centre

Weighted Approach to Rényi Entropy

Entropy of the Mixture of Sources and Entropy Dimension

k-means Approach to the Karhunen-Loeve Transform