Source author record

Reza Fathi

Reza Fathi appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Artificial Intelligence; Cryptography and Security; Data Structures and Algorithms; Distributed, Parallel, and Cluster Computing

Catalog footprint

What is connected

2 works

4 topics

4 close collaborators

Preprint · 2020 · arXiv

Efficient Distributed Algorithms for the $K$-Nearest Neighbors Problem

The $K$-nearest neighbors problem is a basic problem in machine learning with numerous applications. In this problem, given a (training) set of $n$ labeled data points and a query point $p$, we want to assign a label to $p$ based on the labels of the $K$ points nearest to the query. We study this problem in the $k$-machine model, a model for distributed large-scale data. (The parameter $k$ is the number of machines in the $k$-machine model and is independent of $K$, the number of nearest points.) In this model, we assume that the $n$ points are distributed (in a balanced fashion) among the $k$ machines, and the goal is to quickly compute the answer when a query point is given to one machine. Our main result is a simple randomized algorithm in the $k$-machine model that runs in $O(\log K)$ communication rounds and succeeds with high probability (regardless of the number of machines $k$ and the number of points $n$). The message complexity of the algorithm is small: only $O(k\log K)$ messages. Our bounds are essentially the best possible for comparison-based algorithms, that is, algorithms that use only comparison operations ($\leq, \geq, =$) between elements to determine their ordering. This follows from a lower bound of $\Omega(\log n)$ communication rounds for finding the median of $2n$ elements distributed evenly between two processors, due to Rodeh \cite{rodeh}. We also implemented our algorithm and show that it performs well compared to an algorithm (used in practice) that sends the $K$ nearest points from each machine to a single machine, which then computes the answer.
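The comparison baseline mentioned at the end of the abstract (each machine sends its $K$ nearest points to a single machine, which computes the answer) can be sketched as a single-process simulation. This is only an illustration of that baseline, not the paper's $O(\log K)$-round randomized algorithm. The function names, the Euclidean distance, and the majority-vote labeling rule are assumptions made for the example; the abstract does not specify them.

```python
import heapq
import math
from collections import Counter


def local_k_nearest(points, query, K):
    """Return up to K (distance, label) pairs nearest to `query` on one machine.

    `points` is a list of (coordinates, label) pairs held by that machine.
    Euclidean distance is an assumption made for this sketch.
    """
    return heapq.nsmallest(
        K, ((math.dist(coords, query), label) for coords, label in points)
    )


def knn_label_baseline(machines, query, K):
    """Simulate the baseline: every machine ships its K local nearest
    candidates to one coordinator, which merges them and votes.

    `machines` is a list with one point list per simulated machine.
    Majority vote is an assumed labeling rule.
    """
    candidates = []
    for points in machines:
        # Each machine contributes at most K candidates, so the coordinator
        # receives at most k * K candidates in total.
        candidates.extend(local_k_nearest(points, query, K))

    # The global K nearest points must be among the local K nearest of
    # some machine, so merging the candidates is sufficient.
    nearest = heapq.nsmallest(K, candidates)
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: three machines holding 2-D points with string labels.
machines = [
    [((0.0, 0.0), "a"), ((1.0, 0.0), "a")],
    [((5.0, 5.0), "b"), ((0.0, 1.0), "a")],
    [((6.0, 5.0), "b")],
]
label_near_origin = knn_label_baseline(machines, (0.0, 0.0), K=3)
label_near_five = knn_label_baseline(machines, (5.0, 5.0), K=2)
```

In this simulation the coordinator receives up to $k \cdot K$ candidate points, which is the cost that the abstract's $O(k\log K)$ message bound is contrasted against.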

Preprint · 2016 · arXiv

Anomaly Detection in XML-Structured SOAP Messages Using Tree-Based Association Rule Mining

Web services are software systems designed to support interoperable, dynamic, cross-enterprise interactions. Attacks on Web services can be catastrophic, leading to the disclosure of enterprises' confidential data. As new attack approaches arise every day, anomaly detection systems are invaluable tools in this context. The aim of this work is to target attacks that reside in the Web service layer and in extensible markup language (XML)-structured simple object access protocol (SOAP) messages. After studying the shortcomings of existing solutions, a new approach for detecting anomalies in Web services is outlined. More specifically, the proposed technique shows how to identify anomalies by applying mining methods to XML-structured SOAP messages. The technique also takes advantage of tree-based association rule mining to extract knowledge in the training phase, which is then used in the test phase to detect anomalies. In addition, this novel combination of techniques yields a fairly low false alarm rate while keeping the detection rate reasonably high, as shown by a case study.