Source author record

Yahya Sattar

Yahya Sattar appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Machine Learning Information Theory math.IT eess.SY math.OC Systems and Control

Catalog footprint

What is connected

4works

6topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2022arXiv

Finite Sample Identification of Bilinear Dynamical Systems

Bilinear dynamical systems are ubiquitous in many different domains and they can also be used to approximate more general control-affine systems. This motivates the problem of learning bilinear systems from a single trajectory of the system's states and inputs. Under a mild marginal mean-square stability assumption, we identify how much data is needed to estimate the unknown bilinear system up to a desired accuracy with high probability. Our sample complexity and statistical error rates are optimal in terms of the trajectory length, the dimensionality of the system and the input size. Our proof technique relies on an application of martingale small-ball condition. This enables us to correctly capture the properties of the problem, specifically our error rates do not deteriorate with increasing instability. Finally, we show that numerical experiments are well-aligned with our theoretical results.

preprint2020arXiv

Exploring Weight Importance and Hessian Bias in Model Pruning

Model pruning is an essential procedure for building compact and computationally-efficient machine learning models. A key feature of a good pruning algorithm is that it accurately quantifies the relative importance of the model weights. While model pruning has a rich history, we still don't have a full grasp of the pruning mechanics even for relatively simple problems involving linear models or shallow neural nets. In this work, we provide a principled exploration of pruning by building on a natural notion of importance. For linear models, we show that this notion of importance is captured by covariance scaling which connects to the well-known Hessian-based pruning. We then derive asymptotic formulas that allow us to precisely compare the performance of different pruning methods. For neural networks, we demonstrate that the importance can be at odds with larger magnitudes and proper initialization is critical for magnitude-based pruning. Specifically, we identify settings in which weights become more important despite becoming smaller, which in turn leads to a catastrophic failure of magnitude-based pruning. Our results also elucidate that implicit regularization in the form of Hessian structure has a catalytic role in identifying the important weights, which dictate the pruning performance.

preprint2019arXiv

Quickly Finding the Best Linear Model in High Dimensions

We study the problem of finding the best linear model that can minimize least-squares loss given a data-set. While this problem is trivial in the low dimensional regime, it becomes more interesting in high dimensions where the population minimizer is assumed to lie on a manifold such as sparse vectors. We propose projected gradient descent (PGD) algorithm to estimate the population minimizer in the finite sample regime. We establish linear convergence rate and data dependent estimation error bounds for PGD. Our contributions include: 1) The results are established for heavier tailed sub-exponential distributions besides sub-gaussian. 2) We directly analyze the empirical risk minimization and do not require a realizable model that connects input data and labels. 3) Our PGD algorithm is augmented to learn the bias terms which boosts the performance. The numerical experiments validate our theoretical results.

preprint2016arXiv

Accurate Reconstruction of Finite Rate of Innovation Signals on the Sphere

We develop a method for the accurate reconstruction of non-bandlimited finite rate of innovation signals on the sphere. For signals consisting of a finite number of Dirac functions on the sphere, we develop an annihilating filter based method for the accurate recovery of parameters of the Dirac functions using a finite number of observations of the bandlimited signal. In comparison to existing techniques, the proposed method enables more accurate reconstruction primarily due to better conditioning of systems involved in the recovery of parameters. For the recovery of $K$ Diracs on the sphere, the proposed method requires samples of the signal bandlimited in the spherical harmonic~(SH) domain at SH degree equal or greater than $ K + \sqrt{K + \frac{1}{4}} - \frac{1}{2}$. In comparison to the existing state-of-the art technique, the required bandlimit, and consequently the number of samples, of the proposed method is the same or less. We also conduct numerical experiments to demonstrate that the proposed technique is more accurate than the existing methods by a factor of $10^{7}$ or more for $2 \le K\le 20$.