Researcher profile

Ali Hadian

Ali Hadian contributes to research discovery and scholarly infrastructure.

Researcher | Affiliation not imported | Open to collaborate

Trust snapshot

Quick read

Trust 15 - Unverified | Verification L1 | Unclaimed author
3 works
0 followers
3 topics
4 close collaborators

Published work

3 published items

Preprint | 2021 | arXiv

COAX: Correlation-Aware Indexing on Multidimensional Data with Soft Functional Dependencies

Recent work proposed learned index structures, which learn the distribution of the underlying dataset to improve performance. The initial work on learned indexes has shown that by learning the cumulative distribution function of the data, index structures such as the B-Tree can improve their performance by one order of magnitude while having a smaller memory footprint. In this paper, we present COAX, a learned index for multidimensional data that, instead of learning the distribution of keys, learns the correlations between attributes of the dataset. Our approach is driven by the observation that in many datasets, values of two (or multiple) attributes are correlated. COAX exploits these correlations to reduce the dimensionality of the datasets. More precisely, we learn how to infer one (or multiple) attribute $C_d$ from the remaining attributes and hence no longer need to index attribute $C_d$. This reduces the dimensionality and hence makes the index smaller and more efficient. We theoretically investigate the effectiveness of the proposed technique based on the predictability of the FD attributes. We further show experimentally that by predicting correlated attributes in the data, we can improve the query execution time and reduce the memory overhead of the index. In our experiments, we reduce the execution time by 25% while reducing the memory footprint of the index by four orders of magnitude.
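The idea in this abstract, indexing only one attribute and inferring the correlated one, can be illustrated with a minimal Python sketch. It is not the COAX implementation: the class name `SoftFDIndex`, the least-squares line fit, the fixed error tolerance `eps`, and the separate outlier list are all assumptions made for illustration.

```python
import bisect

def fit_line(xs, ys):
    """Ordinary least-squares fit of y ~ a*x + b (assumes xs are not all equal)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, mean_y - a * mean_x

class SoftFDIndex:
    """Hypothetical two-attribute index that sorts rows on x only.

    Rows whose y lies within eps of the fitted line are "inliers" and are
    reachable through x alone; all other rows go to a small outlier list.
    """

    def __init__(self, rows, eps):
        self.eps = eps
        self.a, self.b = fit_line([r[0] for r in rows], [r[1] for r in rows])
        if self.a == 0:
            raise ValueError("y does not depend on x; nothing to exploit")
        fits = lambda r: abs(r[1] - (self.a * r[0] + self.b)) <= eps
        self.inliers = sorted(r for r in rows if fits(r))
        self.outliers = [r for r in rows if not fits(r)]
        self.xs = [r[0] for r in self.inliers]

    def query_y(self, y_lo, y_hi):
        """Return all rows with y_lo <= y <= y_hi without an index on y."""
        # Translate the y range into an x range, widened by the tolerance.
        x1 = (y_lo - self.eps - self.b) / self.a
        x2 = (y_hi + self.eps - self.b) / self.a
        x_lo, x_hi = sorted((x1, x2))  # handles a negative slope as well
        i = bisect.bisect_left(self.xs, x_lo)
        j = bisect.bisect_right(self.xs, x_hi)
        hits = [r for r in self.inliers[i:j] if y_lo <= r[1] <= y_hi]
        # Outliers break the dependency, so they are checked directly.
        hits += [r for r in self.outliers if y_lo <= r[1] <= y_hi]
        return sorted(hits)
```

In this sketch the answer is always exact; the quality of the fit only affects how many candidate rows must be filtered and how many rows end up in the outlier list.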

Preprint | 2021 | arXiv

Shift-Table: A Low-latency Learned Index for Range Queries using Model Correction

Indexing large-scale databases in main memory is still challenging today. Learned index structures -- in which the core components of classical indexes are replaced with machine learning models -- have recently been suggested to significantly improve performance for read-only range queries. However, a recent benchmark study shows that learned indexes only achieve limited performance improvements for real-world data on modern hardware. More specifically, a learned model cannot learn the micro-level details and fluctuations of data distributions, thus resulting in poor accuracy; or it can fit the data distribution at the cost of training a big model whose parameters cannot fit into cache. As a consequence, querying a learned index on real-world data takes a substantial number of memory lookups, thereby degrading performance. In this paper, we adopt a different approach for modeling a data distribution that complements the model fitting approach of learned indexes. We propose Shift-Table, an algorithmic layer that captures the micro-level data distribution and resolves the local biases of a learned model at the cost of at most one memory lookup. Our suggested model combines the low latency of lookup tables with learned indexes and enables low-latency processing of range queries. Using Shift-Table, we achieve a speedup of 1.5X to 2X on real-world datasets compared to trained and tuned learned indexes.
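The correction-layer idea described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the paper's design: the model is a plain min/max linear interpolation, and the helper names `fit_linear`, `build_shift_table`, and `lower_bound` are hypothetical.

```python
def fit_linear(keys):
    """Return a monotone model mapping a key to a slot in [0, n-1].

    Assumes keys is a non-empty sorted list of numbers.
    """
    n = len(keys)
    lo, hi = keys[0], keys[-1]
    scale = (n - 1) / (hi - lo) if hi > lo else 0.0

    def predict(q):
        return min(max(int((q - lo) * scale), 0), n - 1)

    return predict

def build_shift_table(keys, predict):
    """For each predicted slot p, store the offset from p to the first
    key that the model maps to p (or to the next non-empty slot)."""
    n = len(keys)
    first = [None] * n
    for i, k in enumerate(keys):
        p = predict(k)
        if first[p] is None:
            first[p] = i
    nxt = n  # position just past the last key
    for p in range(n - 1, -1, -1):
        if first[p] is None:
            first[p] = nxt  # empty slot: point at the next non-empty one
        else:
            nxt = first[p]
    return [first[p] - p for p in range(n)]

def lower_bound(keys, predict, shift, q):
    """Index of the first key >= q: model guess, one table read, short scan."""
    p = predict(q)
    i = p + shift[p]  # single correction lookup
    while i < len(keys) and keys[i] < q:
        i += 1
    return i
```

Because the model is monotone, every key mapped to an earlier slot is smaller than the query, so the scan only has to walk over keys that share the query's predicted slot.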

Preprint | 2020 | arXiv

Hands-off Model Integration in Spatial Index Structures

Spatial indexes are crucial for the analysis of the increasing amounts of spatial data, for example generated through IoT applications. The plethora of indexes that has been developed in recent decades has primarily been optimized for disk. With increasing amounts of memory even on commodity machines, however, moving them to main memory is an option. Doing so opens up the opportunity to use additional optimizations that are only amenable to main memory. In this paper we thus explore the opportunity to use light-weight machine learning models to accelerate queries on spatial indexes. We do so by exploring the potential of using interpolation and similar techniques on the R-tree, arguably the most broadly used spatial index. As we show in our experimental analysis, the query execution time can be reduced by up to 60% while simultaneously shrinking the index's memory footprint by over 90%.
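The abstract does not say how interpolation is integrated into the R-tree, so the following Python sketch shows only the generic building block: interpolation search over a sorted array of one-dimensional coordinates. The function name and the idea of applying it to sorted entries are assumptions for illustration, not the paper's method.

```python
def interpolation_search(values, target):
    """Return an index i with values[i] == target, or -1 if absent.

    values must be sorted in ascending order. The probe position is
    estimated by assuming values grow roughly linearly between the
    current bounds, instead of always probing the middle.
    """
    lo, hi = 0, len(values) - 1
    while lo <= hi and values[lo] <= target <= values[hi]:
        if values[hi] == values[lo]:
            # All remaining values are equal, so target equals values[lo].
            return lo
        span = values[hi] - values[lo]
        pos = lo + int((target - values[lo]) * (hi - lo) / span)
        pos = min(pos, hi)  # guard against floating-point overshoot
        if values[pos] == target:
            return pos
        if values[pos] < target:
            lo = pos + 1
        else:
            hi = pos - 1
    return -1
```

On evenly spread data the first probe usually lands close to the target, which is the property a light-weight model inside an index node would try to exploit.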