Researcher profile

Mariya Ishteva

Mariya Ishteva contributes to research discovery and scholarly infrastructure.

Researcher | Affiliation not imported | Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging
6 works
0 followers
6 topics
4 close collaborators

Published work

6 published items

preprint | 2026 | arXiv

Robust Basis Spline Decoupling for the Compression of Transformer Models

Decoupling is a powerful modeling paradigm for representing multivariate functions as compositions of linear transformations and univariate nonlinear functions. A single-layer decoupling can be viewed as a fully connected neural network with a single hidden layer and flexible activation functions, providing a direct link with neural networks. Because of this, the use of decoupling methods has gained increasing attention in neural network domains, particularly compression, since it enables structured approximations with reduced parameter complexity. Existing tensor-based decoupling methods typically rely on polynomial or piecewise-linear parameterizations of the internal nonlinear functions, which can suffer from numerical instability or limited expressiveness. In this work, we introduce a B-spline-based decoupling framework that generalizes these existing approaches. By exploiting the local support and flexible smoothness control of B-splines, the proposed formulation yields a more numerically stable and expressive representation. We derive a constrained coupled matrix-tensor factorization and propose a robust alternating least-squares algorithm, called R-CMTF-BSD, incorporating normalization and Tikhonov regularization. The proposed method is validated through experiments on synthetic data and transformer model compression. Results on the Vision and Swin Transformer architectures demonstrate that B-spline decoupling enables substantial parameter reduction while maintaining competitive accuracy, making the R-CMTF-BSD algorithm a promising tool for structured neural network compression.
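To make the single-layer decoupled form concrete, here is a minimal pure-Python sketch. It is not the R-CMTF-BSD algorithm: it only evaluates a function of the assumed form f(x) = W g(V^T x), where each g_i is a clamped cubic B-spline. The names `bspline_basis`, `spline` and `decoupled`, the knot vector, and the example matrices are hypothetical choices made for illustration.

```python
DEGREE = 3
# Clamped knot vector on [-1, 1]; gives len(KNOTS) - DEGREE - 1 = 7 basis functions.
KNOTS = [-1.0] * 4 + [-0.5, 0.0, 0.5] + [1.0] * 4


def bspline_basis(i, k, knots, t):
    """Cox-de Boor recursion: value of the i-th B-spline of degree k at t."""
    if k == 0:
        return 1.0 if knots[i] <= t < knots[i + 1] else 0.0
    left = right = 0.0
    d1 = knots[i + k] - knots[i]
    if d1 > 0:  # skip terms with zero-width support (repeated knots)
        left = (t - knots[i]) / d1 * bspline_basis(i, k - 1, knots, t)
    d2 = knots[i + k + 1] - knots[i + 1]
    if d2 > 0:
        right = (knots[i + k + 1] - t) / d2 * bspline_basis(i + 1, k - 1, knots, t)
    return left + right


def spline(coeffs, knots, degree, t):
    """Univariate function g(t) = sum_j coeffs[j] * B_j(t)."""
    return sum(c * bspline_basis(j, degree, knots, t) for j, c in enumerate(coeffs))


def decoupled(x, V, W, coeff_rows, knots, degree):
    """Evaluate f(x) = W g(V^T x); V[n][i] maps input n to branch i, W[m][i] maps branch i to output m."""
    r = len(coeff_rows)
    z = [sum(V[n][i] * x[n] for n in range(len(x))) for i in range(r)]
    g = [spline(coeff_rows[i], knots, degree, z[i]) for i in range(r)]
    return [sum(W[m][i] * g[i] for i in range(r)) for m in range(len(W))]


# Hypothetical example: 2 inputs, 2 branches, 3 outputs.
V = [[0.5, -0.2], [0.1, 0.3]]
W = [[1.0, 2.0], [0.0, -1.0], [3.0, 0.5]]
COEFFS = [[0.0, 0.2, -0.1, 0.4, 0.3, -0.2, 0.1],
          [0.1, 0.0, 0.5, -0.3, 0.2, 0.0, -0.4]]
print(decoupled([0.4, -0.6], V, W, COEFFS, KNOTS, DEGREE))
```

Each basis function is nonzero on only a few knot intervals, so changing one coefficient alters g_i only locally. This is the local-support property the abstract refers to. The half-open intervals in the degree-0 case mean this sketch is valid for branch inputs in [-1, 1).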

preprint | 2016 | arXiv

Modeling Parallel Wiener-Hammerstein Systems Using Tensor Decomposition of Volterra Kernels

Flexibility and user-interpretability in nonlinear system identification can be achieved by means of block-oriented methods. One such block-oriented system structure is the parallel Wiener-Hammerstein system, which is a sum of Wiener-Hammerstein branches, each consisting of a static nonlinearity sandwiched between linear dynamical blocks. Parallel Wiener-Hammerstein models have more descriptive power than their single-branch counterparts, but their identification is a non-trivial task that requires tailored system identification methods. In this work, we tackle the identification problem by performing a tensor decomposition of the Volterra kernels obtained from the nonlinear system. We illustrate how the parallel Wiener-Hammerstein block-structure gives rise to a joint tensor decomposition of the Volterra kernels with block-circulant structured factors. The combination of Volterra kernels and tensor methods is a fruitful way to tackle the parallel Wiener-Hammerstein system identification task. In simulation experiments, we were able to reconstruct the underlying blocks very accurately under noisy conditions.
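The block structure described in this abstract can be simulated in a few lines. The sketch below is illustrative only and does not perform identification or any Volterra-kernel decomposition. It assumes that each linear dynamical block is a causal FIR filter with zero initial conditions; the function names and the two example branches are hypothetical.

```python
def fir(h, u):
    """Causal FIR filter: y[t] = sum_k h[k] * u[t - k], with zero initial conditions."""
    return [sum(h[k] * u[t - k] for k in range(len(h)) if t - k >= 0)
            for t in range(len(u))]


def wh_branch(u, h_in, nonlinearity, h_out):
    """One Wiener-Hammerstein branch: linear block -> static nonlinearity -> linear block."""
    filtered = fir(h_in, u)
    return fir(h_out, [nonlinearity(v) for v in filtered])


def parallel_wh(u, branches):
    """Parallel Wiener-Hammerstein system: sum of the branch outputs."""
    outputs = [wh_branch(u, *branch) for branch in branches]
    return [sum(vals) for vals in zip(*outputs)]


# Hypothetical two-branch system driven by a unit impulse.
BRANCHES = [
    ([1.0, 0.5], lambda v: v ** 2, [1.0, -1.0]),
    ([2.0], lambda v: v ** 3, [0.5]),
]
print(parallel_wh([1.0, 0.0, 0.0, 0.0], BRANCHES))
```

Because the branch outputs are simply added, the individual branches cannot be read off from input-output data directly, which is why the identification task calls for tailored methods.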

preprint | 2016 | arXiv

Weighted tensor decomposition for approximate decoupling of multivariate polynomials

Multivariate polynomials arise in many different disciplines. Representing such a polynomial as a vector of univariate polynomials can offer useful insight, as well as a more intuitive understanding. For this, techniques based on tensor methods are known, but these have only been studied in the exact case. In this paper, we generalize an existing method to the noisy case, by introducing a weight factor in the tensor decomposition. Finally, we apply the proposed weighted decoupling algorithm in the domain of system identification, and observe smaller model errors.
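One way to see why tensor methods apply here: if a polynomial map has the decoupled form f(x) = W g(V^T x), the chain rule gives the Jacobian J(x) = W diag(g_i'(v_i^T x)) V^T, so Jacobians sampled at different points all share the factors W and V. The sketch below only checks this identity numerically against central finite differences. It does not implement the weighted decomposition from the paper, and the two-branch polynomial, the matrices, and the function names are hypothetical.

```python
# Hypothetical decoupled polynomial with 2 inputs, 2 branches, 2 outputs.
V = [[1.0, 2.0], [-1.0, 1.0]]   # V[n][i]: input n -> branch i
W = [[1.0, 0.5], [2.0, -1.0]]   # W[m][i]: branch i -> output m


def g(z):
    """Univariate polynomials, one per branch."""
    return [z[0] ** 2, z[1] ** 3 - z[1]]


def dg(z):
    """Derivatives of the univariate polynomials."""
    return [2 * z[0], 3 * z[1] ** 2 - 1]


def branch_inputs(x):
    return [sum(V[n][i] * x[n] for n in range(2)) for i in range(2)]


def f(x):
    gz = g(branch_inputs(x))
    return [sum(W[m][i] * gz[i] for i in range(2)) for m in range(2)]


def jacobian_factored(x):
    """J[m][n] = sum_i W[m][i] * g_i'(z_i) * V[n][i]."""
    d = dg(branch_inputs(x))
    return [[sum(W[m][i] * d[i] * V[n][i] for i in range(2)) for n in range(2)]
            for m in range(2)]


def jacobian_fd(x, h=1e-6):
    """Central finite-difference approximation of the Jacobian."""
    J = [[0.0, 0.0], [0.0, 0.0]]
    for n in range(2):
        xp, xm = list(x), list(x)
        xp[n] += h
        xm[n] -= h
        fp, fm = f(xp), f(xm)
        for m in range(2):
            J[m][n] = (fp[m] - fm[m]) / (2 * h)
    return J


print(jacobian_factored([0.3, -0.7]))
print(jacobian_fd([0.3, -0.7]))
```

Stacking such Jacobians for several sample points yields a third-order tensor whose slices share W and V. With noisy coefficients the identity holds only approximately, which is the setting the weighted variant addresses.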

preprint | 2014 | arXiv

Factorization approach to structured low-rank approximation with applications

We consider the problem of approximating an affinely structured matrix, for example a Hankel matrix, by a low-rank matrix with the same structure. This problem occurs in system identification, signal processing and computer algebra, among others. We impose the low-rank constraint by modeling the approximation as a product of two factors with reduced dimension. The structure of the low-rank model is enforced by introducing a penalty term in the objective function. The proposed local optimization algorithm is able to solve the weighted structured low-rank approximation problem, as well as to deal with the cases of missing or fixed elements. In contrast to approaches based on kernel representations (in the linear algebraic sense), the proposed algorithm is designed to address the case of small targeted rank. We compare it to existing approaches on numerical examples of system identification, the approximate greatest common divisor problem, and symmetric tensor decomposition, and demonstrate its consistently good performance.
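The idea of combining a low-rank factorization with a structure penalty can be illustrated for the Hankel case. In the sketch below, the product P L is low-rank by construction, and the penalty measures how far it is from the nearest Hankel matrix, obtained by averaging each antidiagonal. This particular penalty form is an assumption made for illustration and is not necessarily the one used in the paper; the function names and example factors are hypothetical, and no optimization is performed.

```python
def hankel(p, rows):
    """Hankel matrix H[i][j] = p[i + j] with the given number of rows."""
    cols = len(p) - rows + 1
    return [[p[i + j] for j in range(cols)] for i in range(rows)]


def matmul(A, B):
    """Plain matrix product of two lists of lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def project_to_hankel(M):
    """Parameter vector of the nearest Hankel matrix: the mean of each antidiagonal."""
    rows, cols = len(M), len(M[0])
    sums = [0.0] * (rows + cols - 1)
    counts = [0] * (rows + cols - 1)
    for i in range(rows):
        for j in range(cols):
            sums[i + j] += M[i][j]
            counts[i + j] += 1
    return [s / c for s, c in zip(sums, counts)]


def structure_penalty(P, L):
    """Squared Frobenius distance between P L and its Hankel projection."""
    M = matmul(P, L)
    H = hankel(project_to_hankel(M), len(M))
    return sum((M[i][j] - H[i][j]) ** 2
               for i in range(len(M)) for j in range(len(M[0])))


# Rank-1 factors built from a geometric sequence give an exactly Hankel product.
P = [[1.0], [0.5], [0.25]]
L = [[2.0, 1.0, 0.5]]
print(structure_penalty(P, L))                   # product is Hankel
print(structure_penalty(P, [[2.0, 1.0, 0.7]]))   # perturbed factor breaks the structure
```

In a full method, a term of this kind would be added to the data-fit objective and minimized over the factors, so that the rank is fixed by the factor sizes while the structure is enforced through the penalty.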

preprint | 2012 | arXiv

A Spectral Algorithm for Latent Junction Trees

Latent variable models are an elegant framework for capturing rich probabilistic dependencies in many applications. However, current approaches typically parametrize these models using conditional probability tables, and learning relies predominantly on local search heuristics such as Expectation Maximization. Using tensor algebra, we propose an alternative parameterization of latent variable models (where the model structures are junction trees) that still allows for computation of marginals among observed variables. While this novel representation leads to a moderate increase in the number of parameters for junction trees of low treewidth, it lets us design a local-minimum-free algorithm for learning this parameterization. The main computation of the algorithm involves only tensor operations and SVDs which can be orders of magnitude faster than EM algorithms for large datasets. To our knowledge, this is the first provably consistent parameter learning technique for a large class of low-treewidth latent graphical models beyond trees. We demonstrate the advantages of our method on synthetic and real datasets.

preprint | 2012 | arXiv

Unfolding Latent Tree Structures using 4th Order Tensors

Discovering the latent structure from many observed variables is an important yet challenging learning task. Existing approaches for discovering latent structures often require the unknown number of hidden states as an input. In this paper, we propose a quartet-based approach which is agnostic to this number. The key contribution is a novel rank characterization of the tensor associated with the marginal distribution of a quartet. This characterization allows us to design a nuclear-norm-based test for resolving quartet relations. We then use the quartet test as a subroutine in a divide-and-conquer algorithm for recovering the latent tree structure. Under mild conditions, the algorithm is consistent and its error probability decays exponentially with increasing sample size. We demonstrate that the proposed approach compares favorably to alternatives. In a real-world stock dataset, it also discovers meaningful groupings of variables, and produces a model that fits the data better.