Researcher profile

Puoya Tabaghi

Puoya Tabaghi contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 17 - UnverifiedVerification L1Unclaimed author
4works
0followers
2topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

4 published item(s)

preprint2022arXiv

HyperAid: Denoising in hyperbolic spaces for tree-fitting and hierarchical clustering

The problem of fitting distances by tree-metrics has received significant attention in the theoretical computer science and machine learning communities alike, due to many applications in natural language processing, phylogeny, cancer genomics and a myriad of problem areas that involve hierarchical clustering. Despite the existence of several provably exact algorithms for tree-metric fitting of data that inherently obeys tree-metric constraints, much less is known about how to best fit tree-metrics for data whose structure moderately (or substantially) differs from a tree. For such noisy data, most available algorithms perform poorly and often produce negative edge weights in representative trees. Furthermore, it is currently not known how to choose the most suitable approximation objective for noisy fitting. Our contributions are as follows. First, we propose a new approach to tree-metric denoising (HyperAid) in hyperbolic spaces which transforms the original data into data that is ``more'' tree-like, when evaluated in terms of Gromov's $δ$ hyperbolicity. Second, we perform an ablation study involving two choices for the approximation objective, $\ell_p$ norms and the Dasgupta loss. Third, we integrate HyperAid with schemes for enforcing nonnegative edge-weights. As a result, the HyperAid platform outperforms all other existing methods in the literature, including Neighbor Joining (NJ), TreeRep and T-REX, both on synthetic and real-world data. Synthetic data is represented by edge-augmented trees and shortest-distance metrics while the real-world datasets include Zoo, Iris, Glass, Segmentation and SpamBase; on these datasets, the average improvement with respect to NJ is $125.94\%$.

preprint2022arXiv

Linear Classifiers in Product Space Forms

Embedding methods for product spaces are powerful techniques for low-distortion and low-dimensional representation of complex data structures. Here, we address the new problem of linear classification in product space forms -- products of Euclidean, spherical, and hyperbolic spaces. First, we describe novel formulations for linear classifiers on a Riemannian manifold using geodesics and Riemannian metrics which generalize straight lines and inner products in vector spaces. Second, we prove that linear classifiers in $d$-dimensional space forms of any curvature have the same expressive power, i.e., they can shatter exactly $d+1$ points. Third, we formalize linear classifiers in product space forms, describe the first known perceptron and support vector machine classifiers for such spaces and establish rigorous convergence results for perceptrons. Moreover, we prove that the Vapnik-Chervonenkis dimension of linear classifiers in a product space form of dimension $d$ is \emph{at least} $d+1$. We support our theoretical findings with simulations on several datasets, including synthetic data, image data, and single-cell RNA sequencing (scRNA-seq) data. The results show that classification in low-dimensional product space forms for scRNA-seq data offers, on average, a performance improvement of $\sim15\%$ when compared to that in Euclidean spaces of the same dimension.

preprint2022arXiv

Provably Accurate and Scalable Linear Classifiers in Hyperbolic Spaces

Many high-dimensional practical data sets have hierarchical structures induced by graphs or time series. Such data sets are hard to process in Euclidean spaces and one often seeks low-dimensional embeddings in other space forms to perform the required learning tasks. For hierarchical data, the space of choice is a hyperbolic space because it guarantees low-distortion embeddings for tree-like structures. The geometry of hyperbolic spaces has properties not encountered in Euclidean spaces that pose challenges when trying to rigorously analyze algorithmic solutions. We propose a unified framework for learning scalable and simple hyperbolic linear classifiers with provable performance guarantees. The gist of our approach is to focus on Poincaré ball models and formulate the classification problems using tangent space formalisms. Our results include a new hyperbolic perceptron algorithm as well as an efficient and highly accurate convex optimization setup for hyperbolic support vector machine classifiers. Furthermore, we adapt our approach to accommodate second-order perceptrons, where data is preprocessed based on second-order information (correlation) to accelerate convergence, and strategic perceptrons, where potentially manipulated data arrives in an online manner and decisions are made sequentially. The excellent performance of the Poincaré second-order and strategic perceptrons shows that the proposed framework can be extended to general machine learning problems in hyperbolic spaces. Our experimental results, pertaining to synthetic, single-cell RNA-seq expression measurements, CIFAR10, Fashion-MNIST and mini-ImageNet, establish that all algorithms provably converge and have complexity comparable to those of their Euclidean counterparts. Accompanying codes can be found at: https://github.com/thupchnsky/PoincareLinearClassification.

preprint2020arXiv

Hyperbolic Distance Matrices

Hyperbolic space is a natural setting for mining and visualizing data with hierarchical structure. In order to compute a hyperbolic embedding from comparison or similarity information, one has to solve a hyperbolic distance geometry problem. In this paper, we propose a unified framework to compute hyperbolic embeddings from an arbitrary mix of noisy metric and non-metric data. Our algorithms are based on semidefinite programming and the notion of a hyperbolic distance matrix, in many ways parallel to its famous Euclidean counterpart. A central ingredient we put forward is a semidefinite characterization of the hyperbolic Gramian -- a matrix of Lorentzian inner products. This characterization allows us to formulate a semidefinite relaxation to efficiently compute hyperbolic embeddings in two stages: first, we complete and denoise the observed hyperbolic distance matrix; second, we propose a spectral factorization method to estimate the embedded points from the hyperbolic distance matrix. We show through numerical experiments how the flexibility to mix metric and non-metric constraints allows us to efficiently compute embeddings from arbitrary data.