Researcher profile

Eric L. Goodman

Eric L. Goodman contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 13 - UnverifiedVerification L1Unclaimed author
2works
0followers
5topics
3close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

2 published item(s)

preprint2020arXiv

Packet2Vec: Utilizing Word2Vec for Feature Extraction in Packet Data

One of deep learning's attractive benefits is the ability to automatically extract relevant features for a target problem from largely raw data, instead of utilizing human engineered and error prone handcrafted features. While deep learning has shown success in fields such as image classification and natural language processing, its application for feature extraction on raw network packet data for intrusion detection is largely unexplored. In this paper we modify a Word2Vec approach, used for text processing, and apply it to packet data for automatic feature extraction. We call this approach Packet2Vec. For the classification task of benign versus malicious traffic on a 2009 DARPA network data set, we obtain an area under the curve (AUC) of the receiver operating characteristic (ROC) between 0.988-0.996 and an AUC of the Precision/Recall curve between 0.604-0.667.

preprint2020arXiv

Streaming Temporal Graphs: Subgraph Matching

We investigate solutions to subgraph matching within a temporal stream of data. We present a high-level language for describing temporal subgraphs of interest, the Streaming Analytics Language (SAL). SAL programs are translated into C++ code that is run in parallel on a cluster. We call this implementation of SAL the Streaming Analytics Machine (SAM). SAL programs are succinct, requiring about 20 times fewer lines of code than using the SAM library directly, or writing an implementation using Apache Flink. To benchmark SAM we calculate finding temporal triangles within streaming netflow data. Also, we compare SAM to an implementation written for Flink. We find that SAM is able to scale to 128 nodes or 2560 cores, while Apache Flink has max throughput with 32 nodes and degrades thereafter. Apache Flink has an advantage when triangles are rare, with max aggregate throughput for Flink at 32 nodes greater than the max achievable rate of SAM. In our experiments, when triangle occurrence was faster than five per second per node, SAM performed better. Both frameworks may miss results due to latencies in network communication. SAM consistently reported an average of 93.7% of expected results while Flink decreases from 83.7% to 52.1% as we increase to the maximum size of the cluster. Overall, SAM can obtain rates of 91.8 billion netflows per day.