Source author record

Ahmed Hassan

Ahmed Hassan appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Distributed, Parallel, and Cluster Computing; Data Structures and Algorithms; Databases; Information Retrieval; physics.chem-ph; physics.flu-dyn

Catalog footprint

What is connected

5 works

6 topics

4 close collaborators


Preprint, 2022, arXiv

A Distributed Real-Time Recommender System for Big Data Streams

In today's data-driven world, recommender systems (RS) play a crucial role in supporting the decision-making process. As users become continuously connected to the internet, they become less patient and less tolerant of obsolete recommendations made by an RS, e.g., movie recommendations on Netflix or books to read on Amazon. This, in turn, requires continuous training of the RS to cope with both the online fashion of data and the changing nature of user tastes and interests, known as concept drift. A streaming (online) RS has to address three requirements: continuous training and recommendation, handling concept drift, and the ability to scale. Streaming recommender systems proposed in the literature mostly address the first two requirements and do not consider scalability, because they run the training process on a single machine. Such a machine, no matter how powerful it is, will eventually fail to cope with the volume of the data, a lesson learned from big data processing. To tackle the third requirement, we propose a Splitting and Replication mechanism for building distributed streaming recommender systems. Our mechanism is inspired by the successful shared-nothing architecture that underpins contemporary big data processing systems. We have applied our mechanism to two well-known approaches for online recommender systems, namely, matrix factorization and item-based collaborative filtering. We have implemented our mechanism on top of Apache Flink. We conducted experiments comparing the performance of the baseline (single machine) approach with our distributed approach. Across different data sets, we observed improvements in processing latency, throughput, and accuracy. Our experiments show an online recall improvement of 40% with more than 50% less memory consumption.

Preprint, 2022, arXiv

Technical Report: Bundling Linked Data Structures for Linearizable Range Queries

We present bundled references, a new building block to provide linearizable range query operations for highly concurrent lock-based linked data structures. Bundled references allow range queries to traverse a path through the data structure that is consistent with the target atomic snapshot. We demonstrate our technique with three data structures: a linked list, a skip list, and a binary search tree. Our evaluation reveals that in mixed workloads, our design can improve upon the state-of-the-art techniques by 1.2x-1.8x for a skip list and 1.3x-3.7x for a binary search tree. We also integrate our bundled data structure into the DBx1000 in-memory database, yielding up to 40% gain over the same competitors.
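The idea of traversing a path that is consistent with a snapshot can be illustrated with a small single-threaded sketch. It is not the authors' implementation: locking and concurrency are omitted entirely, and the names `Node`, `BundledList`, `next_at`, and the integer `clock` are hypothetical. The assumption made here is that each link keeps a history of timestamped successors and that a range query follows, at every node, the newest successor recorded at or before its snapshot timestamp.

```python
class Node:
    def __init__(self, key):
        self.key = key
        # History of this node's outgoing link: (timestamp, successor), oldest first.
        self.bundle = []

    def next_at(self, ts):
        """Return the successor that was current at snapshot time ts."""
        for stamp, successor in reversed(self.bundle):
            if stamp <= ts:
                return successor
        return None


class BundledList:
    """Sorted linked list whose links remember their past values (illustrative only)."""

    def __init__(self):
        self.clock = 0
        self.head = Node(float("-inf"))
        self.tail = Node(float("inf"))
        self.head.bundle.append((0, self.tail))

    def _find_pred(self, key):
        # Walk the newest links until the next key is >= key.
        pred = self.head
        while pred.bundle[-1][1].key < key:
            pred = pred.bundle[-1][1]
        return pred

    def insert(self, key):
        pred = self._find_pred(key)
        succ = pred.bundle[-1][1]
        if succ.key == key:
            return False
        self.clock += 1
        node = Node(key)
        node.bundle.append((self.clock, succ))
        pred.bundle.append((self.clock, node))  # old link stays in the history
        return True

    def remove(self, key):
        pred = self._find_pred(key)
        victim = pred.bundle[-1][1]
        if victim.key != key:
            return False
        self.clock += 1
        pred.bundle.append((self.clock, victim.bundle[-1][1]))
        return True

    def snapshot(self):
        return self.clock

    def range_query(self, low, high, ts):
        """Collect keys in [low, high] as the list looked at time ts."""
        result = []
        node = self.head.next_at(ts)
        while node is not None and node is not self.tail and node.key <= high:
            if node.key >= low:
                result.append(node.key)
            node = node.next_at(ts)
        return result
```

For example, after inserting 1, 2, and 3, taking `ts = lst.snapshot()`, then removing 2 and inserting 5, `lst.range_query(0, 10, ts)` still returns `[1, 2, 3]`, while a query at the latest timestamp returns `[1, 3, 5]`.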

Preprint, 2020, arXiv

Adjoint-Based Sensitivity Analysis of Steady Char Burnout

Simulations of pulverised coal combustion rely on various models, required in order to correctly approximate the flow, chemical reactions, and behavior of solid particles. These models, in turn, rely on multiple model parameters, which are determined through experiments or small-scale simulations and contain a certain level of uncertainty. The competing effects of transport, particle physics, and chemistry give rise to various scales and disparate dynamics, making it a very challenging problem to analyse. Therefore, the steady combustion process of a single solid particle is considered as a starting point for this study. As an added complication, the large number of parameters present in such simulations makes a purely forward approach to sensitivity analysis very expensive and almost infeasible. Therefore, the use of adjoint-based algorithms to identify and quantify the underlying sensitivities and uncertainties is proposed. This adjoint framework offers a great advantage in this case, where a large input space is analysed, since a single forward and backward sweep provides sensitivity information with respect to all parameters of interest. In order to investigate the applicability of such methods, both discrete and continuous adjoints are considered and compared with conventional approaches such as finite differences and forward sensitivity analysis. Various quantities of interest are considered, and sensitivities with respect to the relevant combustion parameters are reported for two different freestream compositions, describing air and oxy-atmospheres. This study serves as a benchmark for future research, where unsteady and finally turbulent cases will be considered.
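The claim that one forward and one backward sweep yields sensitivities with respect to all parameters can be seen on a toy problem. The sketch below is not the char burnout model: it assumes a hypothetical 2x2 linear steady state A(p) u = b(p) with quantity of interest J = u0 + u1, and all function names are illustrative. The adjoint variable solves Aᵀλ = ∂J/∂u, after which dJ/dp_i = λᵀ(∂b/∂p_i − (∂A/∂p_i) u) for every parameter, and the result is compared against central finite differences, which need two extra solves per parameter.

```python
def solve2(a, rhs):
    """Solve a 2x2 linear system with Cramer's rule."""
    (a00, a01), (a10, a11) = a
    det = a00 * a11 - a01 * a10
    return [(rhs[0] * a11 - a01 * rhs[1]) / det,
            (a00 * rhs[1] - a10 * rhs[0]) / det]


def system(p):
    """Toy steady-state model: p[0], p[1] enter the operator, p[2] the source."""
    a = [[2.0 + p[0], -1.0], [-1.0, 2.0 + p[1]]]
    b = [p[2], 1.0]
    return a, b


def objective(p):
    a, b = system(p)
    u = solve2(a, b)
    return u[0] + u[1]  # quantity of interest J = c^T u with c = [1, 1]


def adjoint_gradient(p):
    a, b = system(p)
    u = solve2(a, b)                                  # forward sweep
    a_t = [[a[0][0], a[1][0]], [a[0][1], a[1][1]]]
    lam = solve2(a_t, [1.0, 1.0])                     # backward sweep: A^T lam = c
    # dJ/dp_i = lam^T (db/dp_i - dA/dp_i u), evaluated for each parameter:
    return [-lam[0] * u[0],   # dA/dp0 has a single 1 at entry (0, 0)
            -lam[1] * u[1],   # dA/dp1 has a single 1 at entry (1, 1)
            lam[0]]           # db/dp2 = [1, 0]


def finite_difference_gradient(p, h=1e-6):
    grad = []
    for i in range(len(p)):
        hi, lo = list(p), list(p)
        hi[i] += h
        lo[i] -= h
        grad.append((objective(hi) - objective(lo)) / (2 * h))
    return grad
```

Calling `adjoint_gradient([0.5, 0.3, 2.0])` and `finite_difference_gradient([0.5, 0.3, 2.0])` should agree to within the finite-difference error. The adjoint route uses two linear solves regardless of the number of parameters, whereas the finite-difference route uses two solves per parameter.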

Preprint, 2020, arXiv

Bundled References: An Abstraction for Highly-Concurrent Linearizable Range Queries

We present bundled references, a new building block to provide linearizable range query operations for highly concurrent linked data structures. Bundled references allow range queries to traverse a path through the data structure that is consistent with the target atomic snapshot and consists of the minimal number of nodes that must be accessed to preserve linearizability. We implement our technique in a skip list, a binary search tree, and a linked list. Our evaluation reveals that in mixed workloads, our design improves upon the state-of-the-art techniques by 3.9x for a skip list and 2.1x for a binary search tree. We also integrate our bundled data structure into the DBx1000 in-memory database, yielding up to 20% gain over the same competitors.

Preprint, 2011, arXiv

Improving Image Search based on User Created Communities

Tag-based retrieval of multimedia content is a difficult problem, not only because of the shorter length of tags associated with images and videos, but also due to a mismatch in the terminologies used by searcher and content creator. To alleviate this problem, we propose a simple concept-driven probabilistic model for improving text-based rich-media search. While our approach is similar to existing topic-based retrieval and cluster-based language modeling work, there are two important differences: (1) our proposed model considers not only the query-generation likelihood from a cluster, but explicitly accounts for the overall "popularity" of the cluster or underlying concept, and (2) we explore the possibility of inferring the likely concept relevant to rich-media content through the user-created communities that the content belongs to. We implement two methods of concept extraction: a traditional cluster-based approach and the proposed community-based approach. We evaluate these two techniques for how effectively they capture the intended meaning of a term from the content creator and searcher, and their overall value in improving image search. Our results show that concept-driven search, though simple, clearly outperforms plain search. Of the two techniques for concept-driven search, the community-based approach is more successful, as the concepts generated from user communities are found to be more intuitive and appealing.
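One way to read difference (1) is as a score that multiplies a concept's popularity prior by the likelihood of the query under that concept's tag distribution. The sketch below is an assumption about the general shape of such a model, not the paper's formulation: the function `rank_concepts`, the add-one smoothing, the use of community size as the popularity prior, and the sample data are all hypothetical.

```python
import math
from collections import Counter


def rank_concepts(query_terms, concepts):
    """Rank concepts by log P(concept) + sum of log P(term | concept).

    `concepts` maps a concept name to a dict with:
      "tags":  list of tag tokens observed in that community or cluster
      "items": number of items in it (used here as the popularity prior)
    """
    vocab = {tag for c in concepts.values() for tag in c["tags"]}
    total_items = sum(c["items"] for c in concepts.values())
    scores = {}
    for name, c in concepts.items():
        counts = Counter(c["tags"])
        n_tokens = len(c["tags"])
        log_score = math.log(c["items"] / total_items)  # popularity prior
        for term in query_terms:
            # Add-one smoothing so unseen terms do not zero out the score.
            log_score += math.log((counts[term] + 1) / (n_tokens + len(vocab)))
        scores[name] = log_score
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical communities for the ambiguous tag "jaguar".
concepts = {
    "cars": {"tags": ["jaguar", "engine", "car", "speed", "car", "engine"],
             "items": 40},
    "wildlife": {"tags": ["jaguar", "cat", "jungle", "wildlife",
                          "cat", "jaguar", "jungle", "forest"],
                 "items": 60},
}
```

With this data, `rank_concepts(["jaguar"], concepts)` puts `"wildlife"` first, since it is both more popular and uses the tag more often, while `rank_concepts(["jaguar", "engine"], concepts)` puts `"cars"` first because the second term outweighs the popularity prior.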