Researcher profile

Ibrahim Sabek

Ibrahim Sabek contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 17 - UnverifiedVerification L1Unclaimed author
4works
0followers
2topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

4 published item(s)

preprint2026arXiv

Is Quantum Computing Ready for Real-Time Database Optimization?

Database systems encompass several performance-critical optimization tasks, such as join ordering and index tuning. As data volumes grow and workloads become more complex, these problems have become exponentially harder to solve efficiently. Quantum computing, especially quantum annealing, is a promising paradigm that can efficiently explore very large search spaces through quantum tunneling. It can escape local optima by tunneling through energy barriers rather than climbing over them. Earlier works mainly focused on providing an abstract representation (e.g., Quadratic Unconstrained Binary Optimization (QUBO)) for the database optimization problems (e.g., join order) and overlooked the real integration within database systems due to the high overhead of quantum computing services (e.g., a minimum 5s runtime for D-Wave's CQM-Solver). Recently, quantum annealing providers have offered more low-latency solutions, e.g., NL-Solver, which paves the road to actually realizing quantum solutions within DBMSs. However, this raises new systems research challenges in balancing efficiency and solution quality. In this talk, we show that this balance is possible to achieve. As a proof of concept, we present Q2O, the first real Quantum-augmented Query Optimizer. We show the end-to-end workflow: we encode the join order problem as a nonlinear model, a format solvable by the NL-Solver, using actual database statistics; the solution is translated into a plan hint that guides PostgreSQL's optimizer to produce a complete plan. Q2O is capable of handling actual queries in real time.

preprint2022arXiv

The Case for Learned In-Memory Joins

In-memory join is an essential operator in any database engine. It has been extensively investigated in the database literature. In this paper, we study whether exploiting the CDF-based learned models to boost the join performance is practical or not. To the best of our knowledge, we are the first to fill this gap. We investigate the usage of CDF-based partitioning and learned indexes (e.g., Recursive Model Indexes (RMI) and RadixSpline) in the three join categories; indexed nested loop join (INLJ), sort-based joins (SJ) and hash-based joins (HJ). Our study shows that there is a room to improve the performance of INLJ and SJ categories through our proposed optimized learned variants. Our experimental analysis showed that these proposed learned variants of INLJ and SJ consistently outperform the state-of-the-art techniques.

preprint2021arXiv

The Case for Distance-Bounded Spatial Approximations

Spatial approximations have been traditionally used in spatial databases to accelerate the processing of complex geometric operations. However, approximations are typically only used in a first filtering step to determine a set of candidate spatial objects that may fulfill the query condition. To provide accurate results, the exact geometries of the candidate objects are tested against the query condition, which is typically an expensive operation. Nevertheless, many emerging applications (e.g., visualization tools) require interactive responses, while only needing approximate results. Besides, real-world geospatial data is inherently imprecise, which makes exact data processing unnecessary. Given the uncertainty associated with spatial data and the relaxed precision requirements of many applications, this vision paper advocates for approximate spatial data processing techniques that omit exact geometric tests and provide final answers solely on the basis of (fine-grained) approximations. Thanks to recent hardware advances, this vision can be realized today. Furthermore, our approximate techniques employ a distance-based error bound, i.e., a bound on the maximum spatial distance between false (or missing) and exact results which is crucial for meaningful analyses. This bound allows to control the precision of the approximation and trade accuracy for performance.

preprint2020arXiv

The Case for Learned Spatial Indexes

Spatial data is ubiquitous. Massive amounts of data are generated every day from billions of GPS-enabled devices such as cell phones, cars, sensors, and various consumer-based applications such as Uber, Tinder, location-tagged posts in Facebook, Twitter, Instagram, etc. This exponential growth in spatial data has led the research community to focus on building systems and applications that can process spatial data efficiently. In the meantime, recent research has introduced learned index structures. In this work, we use techniques proposed from a state-of-the art learned multi-dimensional index structure (namely, Flood) and apply them to five classical multi-dimensional indexes to be able to answer spatial range queries. By tuning each partitioning technique for optimal performance, we show that (i) machine learned search within a partition is faster by 11.79\% to 39.51\% than binary search when using filtering on one dimension, (ii) the bottleneck for tree structures is index lookup, which could potentially be improved by linearizing the indexed partitions (iii) filtering on one dimension and refining using machine learned indexes is 1.23x to 1.83x times faster than closest competitor which filters on two dimensions, and (iv) learned indexes can have a significant impact on the performance of low selectivity queries while being less effective under higher selectivities.