Source author record

Ehab Al-Shaer

Ehab Al-Shaer appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Cryptography and Security, Machine Learning, Computation and Language, Computational Complexity, Data Structures and Algorithms

Catalog footprint

What is connected

4 works

5 topics

4 close collaborators

preprint · 2021 · arXiv

V2W-BERT: A Framework for Effective Hierarchical Multiclass Classification of Software Vulnerabilities

Weaknesses in computer systems such as faults, bugs and errors in the architecture, design or implementation of software provide vulnerabilities that can be exploited by attackers to compromise the security of a system. Common Weakness Enumerations (CWE) are a hierarchically designed dictionary of software weaknesses that provide a means to understand software flaws, potential impact of their exploitation, and means to mitigate these flaws. Common Vulnerabilities and Exposures (CVE) are brief low-level descriptions that uniquely identify vulnerabilities in a specific product or protocol. Classifying or mapping of CVEs to CWEs provides a means to understand the impact and mitigate the vulnerabilities. Since manual mapping of CVEs is not a viable option, automated approaches are desirable but challenging. We present a novel Transformer-based learning framework (V2W-BERT) in this paper. By using ideas from natural language processing, link prediction and transfer learning, our method outperforms previous approaches not only for CWE instances with abundant data to train, but also rare CWE classes with little or no data to train. Our approach also shows significant improvements in using historical data to predict links for future instances of CVEs, and therefore, provides a viable approach for practical applications. Using data from MITRE and National Vulnerability Database, we achieve up to 97% prediction accuracy for randomly partitioned data and up to 94% prediction accuracy in temporally partitioned data. We believe that our work will influence the design of better methods and training models, as well as applications to solve increasingly harder problems in cybersecurity.
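The abstract reports accuracy under two evaluation regimes, random and temporal partitioning. The following is a minimal sketch of how those two splits differ, not the authors' evaluation code. The records, the field layout, and the function names `random_partition` and `temporal_partition` are hypothetical placeholders.

```python
import random

# Hypothetical toy records: (cve_id, publication_year, cwe_label).
# Placeholder values for illustration only, not data from the paper.
records = [
    ("CVE-A", 2015, "CWE-X"),
    ("CVE-B", 2016, "CWE-Y"),
    ("CVE-C", 2017, "CWE-X"),
    ("CVE-D", 2018, "CWE-Z"),
    ("CVE-E", 2019, "CWE-Y"),
    ("CVE-F", 2019, "CWE-X"),
    ("CVE-G", 2020, "CWE-Z"),
    ("CVE-H", 2020, "CWE-Y"),
]


def random_partition(items, test_fraction, seed=0):
    """Shuffle the items and hold out a fixed fraction for testing."""
    rng = random.Random(seed)
    shuffled = list(items)
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def temporal_partition(items, cutoff_year):
    """Train on items up to the cutoff year and test on later items."""
    train = [r for r in items if r[1] <= cutoff_year]
    test = [r for r in items if r[1] > cutoff_year]
    return train, test


random_train, random_test = random_partition(records, test_fraction=0.25)
past, future = temporal_partition(records, cutoff_year=2018)
```

In the temporal split the test set contains only records published after the cutoff, which mirrors the abstract's setting of using historical data to predict links for future CVEs.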

preprint · 2020 · arXiv

The Panacea Threat Intelligence and Active Defense Platform

We describe Panacea, a system that supports natural language processing (NLP) components for active defenses against social engineering attacks. We deploy a pipeline of human language technology, including Ask and Framing Detection, Named Entity Recognition, Dialogue Engineering, and Stylometry. Panacea processes modern message formats through a plug-in architecture to accommodate innovative approaches for message analysis, knowledge representation and dialogue generation. The novelty of the Panacea system is that it uses NLP for cyber defense and engages the attacker using bots to elicit evidence to attribute to the attacker and to waste the attacker's time and resources.

preprint · 2020 · arXiv

ThreatZoom: CVE2CWE using Hierarchical Neural Network

The Common Vulnerabilities and Exposures (CVE) represent a standard means for sharing publicly known information security vulnerabilities. One or more CVEs are grouped into Common Weakness Enumeration (CWE) classes for the purpose of understanding the software or configuration flaws and potential impacts enabled by these vulnerabilities and identifying means to detect or prevent exploitation. As the CVE-to-CWE classification is mostly performed manually by domain experts, thousands of critical and new CVEs remain unclassified, yet they are unpatchable. This significantly limits the utility of CVEs and slows down proactive threat mitigation. This paper presents the first automatic tool to classify CVEs to CWEs. ThreatZoom uses a novel learning algorithm that employs an adaptive hierarchical neural network which adjusts its weights based on text analytic scores and classification errors. It automatically estimates the CWE classes corresponding to a CVE instance using both statistical and semantic features extracted from the description of a CVE. The tool is rigorously tested on various datasets provided by MITRE and the National Vulnerability Database (NVD). The accuracy of classifying CVE instances to their correct CWE classes is 92% (fine-grain) and 94% (coarse-grain) for the NVD dataset, and 75% (fine-grain) and 90% (coarse-grain) for the MITRE dataset, despite the small corpus.
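The abstract reports separate fine-grain and coarse-grain accuracies but does not define them. One plausible reading, assumed here, is that a fine-grain prediction must match the exact class while a coarse-grain prediction only needs to fall under the same parent class. The sketch below illustrates that reading on a hypothetical two-level hierarchy; the labels `W1`, `W1.1`, and so on are placeholders, not real CWE identifiers, and this is not the ThreatZoom implementation.

```python
# Hypothetical two-level hierarchy: fine-grained class -> coarse-grained parent.
# The labels are placeholders, not real CWE identifiers.
PARENT = {"W1.1": "W1", "W1.2": "W1", "W2.1": "W2", "W2.2": "W2"}


def fine_accuracy(true_labels, predicted_labels):
    """Fraction of predictions that match the exact fine-grained class."""
    hits = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return hits / len(true_labels)


def coarse_accuracy(true_labels, predicted_labels, parent=PARENT):
    """Fraction of predictions that land under the correct parent class.

    Assumption: a prediction counts as coarse-grain correct when its
    parent equals the parent of the true class.
    """
    hits = sum(parent[t] == parent[p]
               for t, p in zip(true_labels, predicted_labels))
    return hits / len(true_labels)


y_true = ["W1.1", "W1.2", "W2.1", "W2.2"]
y_pred = ["W1.1", "W1.1", "W2.1", "W1.2"]

fine = fine_accuracy(y_true, y_pred)
coarse = coarse_accuracy(y_true, y_pred)
```

Under this definition coarse-grain accuracy can never be lower than fine-grain accuracy, which is consistent with the ordering of the figures quoted in the abstract.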

preprint · 2015 · arXiv

On DDoS Attack Related Minimum Cut Problems

In this paper, we study two important extensions of the classical minimum cut problem, called the Connectivity Preserving Minimum Cut (CPMC) problem and the Threshold Minimum Cut (TMC) problem, which have important applications in large-scale DDoS attacks. In the CPMC problem, a minimum cut is sought to separate a source node from a destination node while preserving the connectivity between the source and its partner node(s). The CPMC problem also has important applications in many other areas such as emergency response, image processing, pattern recognition, and medical sciences. In the TMC problem, a minimum cut is sought to isolate a target node from a threshold number of partner nodes. The TMC problem is an important special case of the network inhibition problem and has important applications in network security. We show that the general CPMC problem cannot be approximated within $\log n$ unless $NP=P$ has quasi-polynomial algorithms. We also show that a special case of the two-group CPMC problem in planar graphs can be solved in polynomial time. A corollary of this result is that the network diversion problem in planar graphs is in $P$, a previously open problem. We show that the threshold minimum node cut (TMNC) problem can be approximated within ratio $O(\sqrt{n})$ and the threshold minimum edge cut (TMEC) problem can be approximated within ratio $O(\log^2{n})$. We also answer another long-standing open problem: the hardness of the network inhibition problem and the network interdiction problem. We show that neither of them can be approximated within any constant ratio unless $NP \nsubseteq \cap_{\delta>0} BPTIME(2^{n^{\delta}})$.
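For orientation, the classical minimum s-t cut that CPMC and TMC extend can be computed through maximum flow. The sketch below uses the standard Edmonds-Karp method (shortest augmenting paths found by breadth-first search). It is not the paper's algorithm for CPMC or TMC, and the function name `min_cut` and the example graph are hypothetical.

```python
from collections import deque


def min_cut(capacity, source, sink):
    """Return (cut_value, cut_edges) for a directed graph.

    capacity[u][v] is the non-negative capacity of edge u -> v.
    """
    # Build residual capacities, adding reverse edges with capacity 0.
    residual = {source: {}, sink: {}}
    for u, neighbours in capacity.items():
        residual.setdefault(u, {})
        for v, c in neighbours.items():
            residual[u][v] = residual[u].get(v, 0) + c
            residual.setdefault(v, {}).setdefault(u, 0)

    def bfs_parents():
        """Map each node reachable in the residual graph to its BFS parent."""
        parent = {source: None}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v, c in residual[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        return parent

    flow = 0
    while True:
        parent = bfs_parents()
        if sink not in parent:
            break  # no augmenting path left: the flow is maximum
        # Find the bottleneck capacity along the path back to the source.
        bottleneck = float("inf")
        v = sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        # Push the bottleneck amount along the path.
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        flow += bottleneck

    # Nodes still reachable from the source form the source side of the cut.
    reachable = set(parent)
    cut_edges = sorted(
        (u, v)
        for u in reachable
        for v in capacity.get(u, {})
        if v not in reachable
    )
    return flow, cut_edges


# Hypothetical example graph.
graph = {
    "s": {"a": 3, "b": 2},
    "a": {"t": 2, "b": 1},
    "b": {"t": 3},
    "t": {},
}
value, edges = min_cut(graph, "s", "t")
```

By the max-flow min-cut theorem the returned flow value equals the total capacity of the returned cut edges. The CPMC and TMC variants add connectivity and threshold constraints that this plain routine does not handle.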