Source author record

Mohammed J. Zaki

Mohammed J. Zaki appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Computation and Language Machine Learning Artificial Intelligence Databases Information Retrieval Cryptography and Security Data Structures and Algorithms Computer Vision Distributed, Parallel, and Cluster Computing Neural and Evolutionary Computing Neurons and Cognition Social and Information Networks

Catalog footprint

What is connected

16works

12topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2026arXiv

Deep Clustering with Associative Memories

Deep clustering - joint representation learning and latent space clustering - is a well studied problem especially in computer vision and text processing under the deep learning framework. While the representation learning is generally differentiable, clustering is an inherently discrete optimization task, requiring various approximations and regularizations to fit in a standard differentiable pipeline. This leads to a somewhat disjointed representation learning and clustering. In this work, we propose a novel loss function utilizing energy-based dynamics via Associative Memories to formulate a new deep clustering method, DCAM, which ties together the representation learning and clustering aspects more intricately in a single objective. Our experiments showcase the advantage of DCAM, producing improved clustering quality for various architecture choices (convolutional, residual or fully-connected) and data modalities (images or text).

preprint2026arXiv

Language Diffusion Models are Associative Memories Capable of Retrieving Unseen Data

When do language diffusion models memorize their training data, and how to quantitatively assess their true generative regime? We address these questions by showing that Uniform-based Discrete Diffusion Models (UDDMs) fundamentally behave as Associative Memories (AMs) $\textit{with emergent creative capabilities}$. The core idea of an AM is to reliably recover stored data points as $\textit{memories}$ by establishing distinct basins of attraction around them. Historically, models like Hopfield networks use an explicit energy function to guarantee these stable attractors. We broaden this perspective by leveraging the observation that energy is not strictly necessary, as basins of attraction can also be formed via conditional likelihood maximization. By evaluating token recovery of $\textit{training}$ and $\textit{test}$ examples, we identify in UDDMs a sharp memorization-to-generalization transition governed by the size of the training dataset: as it increases, basins around training examples shrink and basins around unseen test examples expand, until both later converge to the same level. Crucially, we can detect this transition using only the conditional entropy of predicted token sequences: memorization is characterized by vanishing conditional entropy, while in the generalization regime the conditional entropy of most tokens remains finite. Thus, conditional entropy offers a practical probe for the memorization-to-generalization transition in deployed models.

preprint2023arXiv

TINKER: A framework for Open source Cyberthreat Intelligence

Threat intelligence on malware attacks and campaigns is increasingly being shared with other security experts for a cost or for free. Other security analysts use this intelligence to inform them of indicators of compromise, attack techniques, and preventative actions. Security analysts prepare threat analysis reports after investigating an attack, an emerging cyber threat, or a recently discovered vulnerability. Collectively known as cyber threat intelligence (CTI), the reports are typically in an unstructured format and, therefore, challenging to integrate seamlessly into existing intrusion detection systems. This paper proposes a framework that uses the aggregated CTI for analysis and defense at scale. The information is extracted and stored in a structured format using knowledge graphs such that the semantics of the threat intelligence can be preserved and shared at scale with other security analysts. Specifically, we propose the first semi-supervised open-source knowledge graph-based framework, TINKER, to capture cyber threat information and its context. Following TINKER, we generate a Cyberthreat Intelligence Knowledge Graph (CTI-KG) and demonstrate the usage using different use cases.

preprint2022arXiv

Associative Learning for Network Embedding

The network embedding task is to represent the node in the network as a low-dimensional vector while incorporating the topological and structural information. Most existing approaches solve this problem by factorizing a proximity matrix, either directly or implicitly. In this work, we introduce a network embedding method from a new perspective, which leverages Modern Hopfield Networks (MHN) for associative learning. Our network learns associations between the content of each node and that node's neighbors. These associations serve as memories in the MHN. The recurrent dynamics of the network make it possible to recover the masked node, given that node's neighbors. Our proposed method is evaluated on different downstream tasks such as node classification and linkage prediction. The results show competitive performance compared to the common matrix factorization techniques and deep learning based methods.

preprint2022arXiv

Global Self-Attention as a Replacement for Graph Convolution

We propose an extension to the transformer neural network architecture for general-purpose graph learning by adding a dedicated pathway for pairwise structural information, called edge channels. The resultant framework - which we call Edge-augmented Graph Transformer (EGT) - can directly accept, process and output structural information of arbitrary form, which is important for effective learning on graph-structured data. Our model exclusively uses global self-attention as an aggregation mechanism rather than static localized convolutional aggregation. This allows for unconstrained long-range dynamic interactions between nodes. Moreover, the edge channels allow the structural information to evolve from layer to layer, and prediction tasks on edges/links can be performed directly from the output embeddings of these channels. We verify the performance of EGT in a wide range of graph-learning experiments on benchmark datasets, in which it outperforms Convolutional/Message-Passing Graph Neural Networks. EGT sets a new state-of-the-art for the quantum-chemical regression task on the OGB-LSC PCQM4Mv2 dataset containing 3.8 million molecular graphs. Our findings indicate that global self-attention based aggregation can serve as a flexible, adaptive and effective replacement of graph convolution for general-purpose graph learning. Therefore, convolutional local neighborhood aggregation is not an essential inductive bias.

preprint2022arXiv

Towards Neural Numeric-To-Text Generation From Temporal Personal Health Data

With an increased interest in the production of personal health technologies designed to track user data (e.g., nutrient intake, step counts), there is now more opportunity than ever to surface meaningful behavioral insights to everyday users in the form of natural language. This knowledge can increase their behavioral awareness and allow them to take action to meet their health goals. It can also bridge the gap between the vast collection of personal health data and the summary generation required to describe an individual's behavioral tendencies. Previous work has focused on rule-based time-series data summarization methods designed to generate natural language summaries of interesting patterns found within temporal personal health data. We examine recurrent, convolutional, and Transformer-based encoder-decoder models to automatically generate natural language summaries from numeric temporal personal health data. We showcase the effectiveness of our models on real user health data logged in MyFitnessPal and show that we can automatically generate high-quality natural language summaries. Our work serves as a first step towards the ambitious goal of automatically generating novel and meaningful temporal summaries from personal health data.

preprint2021arXiv

Personalized Food Recommendation as Constrained Question Answering over a Large-scale Food Knowledge Graph

Food recommendation has become an important means to help guide users to adopt healthy dietary habits. Previous works on food recommendation either i) fail to consider users' explicit requirements, ii) ignore crucial health factors (e.g., allergies and nutrition needs), or iii) do not utilize the rich food knowledge for recommending healthy recipes. To address these limitations, we propose a novel problem formulation for food recommendation, modeling this task as constrained question answering over a large-scale food knowledge base/graph (KBQA). Besides the requirements from the user query, personalized requirements from the user's dietary preferences and health guidelines are handled in a unified way as additional constraints to the QA system. To validate this idea, we create a QA style dataset for personalized food recommendation based on a large-scale food knowledge graph and health guidelines. Furthermore, we propose a KBQA-based personalized food recommendation framework which is equipped with novel techniques for handling negations and numerical comparisons in the queries. Experimental results on the benchmark show that our approach significantly outperforms non-personalized counterparts (average 59.7% absolute improvement across various evaluation metrics), and is able to recommend more relevant and healthier recipes.

preprint2020arXiv

GraphFlow: Exploiting Conversation Flow with Graph Neural Networks for Conversational Machine Comprehension

Conversational machine comprehension (MC) has proven significantly more challenging compared to traditional MC since it requires better utilization of conversation history. However, most existing approaches do not effectively capture conversation history and thus have trouble handling questions involving coreference or ellipsis. Moreover, when reasoning over passage text, most of them simply treat it as a word sequence without exploring rich semantic relationships among words. In this paper, we first propose a simple yet effective graph structure learning technique to dynamically construct a question and conversation history aware context graph at each conversation turn. Then we propose a novel Recurrent Graph Neural Network, and based on that, we introduce a flow mechanism to model the temporal dependencies in a sequence of context graphs. The proposed GraphFlow model can effectively capture conversational flow in a dialog, and shows competitive performance compared to existing state-of-the-art methods on CoQA, QuAC and DoQA benchmarks. In addition, visualization experiments show that our proposed model can offer good interpretability for the reasoning process.

preprint2020arXiv

MALOnt: An Ontology for Malware Threat Intelligence

Malware threat intelligence uncovers deep information about malware, threat actors, and their tactics, Indicators of Compromise(IoC), and vulnerabilities in different platforms from scattered threat sources. This collective information can guide decision making in cyber defense applications utilized by security operation centers(SoCs). In this paper, we introduce an open-source malware ontology - MALOnt that allows the structured extraction of information and knowledge graph generation, especially for threat intelligence. The knowledge graph that uses MALOnt is instantiated from a corpus comprising hundreds of annotated malware threat reports. The knowledge graph enables the analysis, detection, classification, and attribution of cyber threats caused by malware. We also demonstrate the annotation process using MALOnt on exemplar threat intelligence reports. A work in progress, this research is part of a larger effort towards auto-generation of knowledge graphs (KGs)for gathering malware threat intelligence from heterogeneous online resources.

preprint2020arXiv

Personal Health Knowledge Graphs for Patients

Existing patient data analytics platforms fail to incorporate information that has context, is personal, and topical to patients. For a recommendation system to give a suitable response to a query or to derive meaningful insights from patient data, it should consider personal information about the patient's health history, including but not limited to their preferences, locations, and life choices that are currently applicable to them. In this review paper, we critique existing literature in this space and also discuss the various research challenges that come with designing, building, and operationalizing a personal health knowledge graph (PHKG) for patients.

preprint2020arXiv

Reinforcement Learning Based Graph-to-Sequence Model for Natural Question Generation

Natural question generation (QG) aims to generate questions from a passage and an answer. Previous works on QG either (i) ignore the rich structure information hidden in text, (ii) solely rely on cross-entropy loss that leads to issues like exposure bias and inconsistency between train/test measurement, or (iii) fail to fully exploit the answer information. To address these limitations, in this paper, we propose a reinforcement learning (RL) based graph-to-sequence (Graph2Seq) model for QG. Our model consists of a Graph2Seq generator with a novel Bidirectional Gated Graph Neural Network based encoder to embed the passage, and a hybrid evaluator with a mixed objective combining both cross-entropy and RL losses to ensure the generation of syntactically and semantically valid text. We also introduce an effective Deep Alignment Network for incorporating the answer information into the passage at both the word and contextual levels. Our model is end-to-end trainable and achieves new state-of-the-art scores, outperforming existing methods by a significant margin on the standard SQuAD benchmark.

preprint2019arXiv

Natural Question Generation with Reinforcement Learning Based Graph-to-Sequence Model

Natural question generation (QG) aims to generate questions from a passage and an answer. In this paper, we propose a novel reinforcement learning (RL) based graph-to-sequence (Graph2Seq) model for QG. Our model consists of a Graph2Seq generator where a novel Bidirectional Gated Graph Neural Network is proposed to embed the passage, and a hybrid evaluator with a mixed objective combining both cross-entropy and RL losses to ensure the generation of syntactically and semantically valid text. The proposed model outperforms previous state-of-the-art methods by a large margin on the SQuAD dataset.

preprint2015arXiv

Arabesque: A System for Distributed Graph Mining - Extended version

Distributed data processing platforms such as MapReduce and Pregel have substantially simplified the design and deployment of certain classes of distributed graph analytics algorithms. However, these platforms do not represent a good match for distributed graph mining problems, as for example finding frequent subgraphs in a graph. Given an input graph, these problems require exploring a very large number of subgraphs and finding patterns that match some "interestingness" criteria desired by the user. These algorithms are very important for areas such as social net- works, semantic web, and bioinformatics. In this paper, we present Arabesque, the first distributed data processing platform for implementing graph mining algorithms. Arabesque automates the process of exploring a very large number of subgraphs. It defines a high-level filter-process computational model that simplifies the development of scalable graph mining algorithms: Arabesque explores subgraphs and passes them to the application, which must simply compute outputs and decide whether the subgraph should be further extended. We use Arabesque's API to produce distributed solutions to three fundamental graph mining problems: frequent subgraph mining, counting motifs, and finding cliques. Our implementations require a handful of lines of code, scale to trillions of subgraphs, and represent in some cases the first available distributed solutions.

preprint2013arXiv

DAGGER: A Scalable Index for Reachability Queries in Large Dynamic Graphs

With the ubiquity of large-scale graph data in a variety of application domains, querying them effectively is a challenge. In particular, reachability queries are becoming increasingly important, especially for containment, subsumption, and connectivity checks. Whereas many methods have been proposed for static graph reachability, many real-world graphs are constantly evolving, which calls for dynamic indexing. In this paper, we present a fully dynamic reachability index over dynamic graphs. Our method, called DAGGER, is a light-weight index based on interval labeling, that scales to million node graphs and beyond. Our extensive experimental evaluation on real-world and synthetic graphs confirms its effectiveness over baseline methods.

preprint2012arXiv

BitPath -- Label Order Constrained Reachability Queries over Large Graphs

In this paper we focus on the following constrained reachability problem over edge-labeled graphs like RDF -- "given source node x, destination node y, and a sequence of edge labels (a, b, c, d), is there a path between the two nodes such that the edge labels on the path satisfy a regular expression "*a.*b.*c.*d.*". A "*" before "a" allows any other edge label to appear on the path before edge "a". "a.*" forces at least one edge with label "a". ".*" after "a" allows zero or more edge labels after "a" and before "b". Our query processing algorithm uses simple divide-and-conquer and greedy pruning procedures to limit the search space. However, our graph indexing technique -- based on "compressed bit-vectors" -- allows indexing large graphs which otherwise would have been infeasible. We have evaluated our approach on graphs with more than 22 million edges and 6 million nodes -- much larger compared to the datasets used in the contemporary work on path queries.

preprint2012arXiv

Mining Attribute-structure Correlated Patterns in Large Attributed Graphs

In this work, we study the correlation between attribute sets and the occurrence of dense subgraphs in large attributed graphs, a task we call structural correlation pattern mining. A structural correlation pattern is a dense subgraph induced by a particular attribute set. Existing methods are not able to extract relevant knowledge regarding how vertex attributes interact with dense subgraphs. Structural correlation pattern mining combines aspects of frequent itemset and quasi-clique mining problems. We propose statistical significance measures that compare the structural correlation of attribute sets against their expected values using null models. Moreover, we evaluate the interestingness of structural correlation patterns in terms of size and density. An efficient algorithm that combines search and pruning strategies in the identification of the most relevant structural correlation patterns is presented. We apply our method for the analysis of three real-world attributed graphs: a collaboration, a music, and a citation network, verifying that it provides valuable knowledge in a feasible time.

Mohammed J. Zaki

What is connected

Connect this record

See the researcher in context

Building this map preview

16 published item(s)

Deep Clustering with Associative Memories

Language Diffusion Models are Associative Memories Capable of Retrieving Unseen Data

TINKER: A framework for Open source Cyberthreat Intelligence

Associative Learning for Network Embedding

Global Self-Attention as a Replacement for Graph Convolution

Towards Neural Numeric-To-Text Generation From Temporal Personal Health Data

Personalized Food Recommendation as Constrained Question Answering over a Large-scale Food Knowledge Graph

GraphFlow: Exploiting Conversation Flow with Graph Neural Networks for Conversational Machine Comprehension

MALOnt: An Ontology for Malware Threat Intelligence

Personal Health Knowledge Graphs for Patients

Reinforcement Learning Based Graph-to-Sequence Model for Natural Question Generation

Natural Question Generation with Reinforcement Learning Based Graph-to-Sequence Model

Arabesque: A System for Distributed Graph Mining - Extended version

DAGGER: A Scalable Index for Reachability Queries in Large Dynamic Graphs

BitPath -- Label Order Constrained Reachability Queries over Large Graphs

Mining Attribute-structure Correlated Patterns in Large Attributed Graphs