Researcher profile

Abdallah Shami

Abdallah Shami contributes to research discovery and scholarly infrastructure.

Researcher. Affiliation not imported. Open to collaborate.

Trust snapshot

Quick read

Trust 21 (Emerging), Verification L1, unclaimed author
27 works
0 followers
14 topics
4 close collaborators

Published work

27 published item(s)

preprint, 2022, arXiv

A Hybrid Method for Condition Monitoring and Fault Diagnosis of Rolling Bearings With Low System Delay

Vibration-based condition monitoring techniques are commonly used to detect and diagnose failures of rolling bearings. Accuracy and delay in detecting and diagnosing different types of failures are the main performance measures in condition monitoring. Achieving high accuracy with low delay improves system reliability and prevents catastrophic equipment failure. Further, delay is crucial to remote condition monitoring and time-sensitive industrial applications. While most proposed methods focus on accuracy, little attention has been paid to the delay introduced in the condition monitoring process. In this paper, we attempt to bridge this gap and propose a hybrid method for vibration-based condition monitoring and fault diagnosis of rolling bearings that outperforms previous methods in terms of accuracy and delay. Specifically, we address the overall delay in vibration-based condition monitoring systems and introduce the concept of system delay to assess it. Then, we present the proposed method for condition monitoring. It uses Wavelet Packet Transform (WPT) and Fourier analysis to decompose short-duration input segments of the vibration signal into elementary waveforms and obtain their spectral contents. Accordingly, energy concentration in the spectral components (caused by defect-induced transient vibrations) is utilized to extract a small number of features with high discriminative capabilities. A Random Forest (RF) classifier tuned with Bayesian optimization is then used to classify healthy and faulty operating conditions under varying motor speeds. The experimental results show that the proposed method can achieve high accuracy with low system delay.

preprint, 2022, arXiv

A Transfer Learning and Optimized CNN Based Intrusion Detection System for Internet of Vehicles

Modern vehicles, including autonomous vehicles and connected vehicles, are increasingly connected to the external world, which enables various functionalities and services. However, the improved connectivity also increases the attack surface of the Internet of Vehicles (IoV), making it vulnerable to cyber-threats. Due to the lack of authentication and encryption procedures in vehicular networks, Intrusion Detection Systems (IDSs) are essential for protecting modern vehicle systems from network attacks. In this paper, a transfer learning and ensemble learning-based IDS is proposed for IoV systems using convolutional neural networks (CNNs) and hyper-parameter optimization techniques. In the experiments, the proposed IDS has demonstrated over 99.25% detection rates and F1-scores on two well-known public benchmark IoV security datasets: the Car-Hacking dataset and the CICIDS2017 dataset. This shows the effectiveness of the proposed IDS for cyber-attack detection in both intra-vehicle and external vehicular networks.

preprint, 2022, arXiv

An Attention-based ConvLSTM Autoencoder with Dynamic Thresholding for Unsupervised Anomaly Detection in Multivariate Time Series

As a substantial amount of multivariate time series data is being produced by the complex systems in Smart Manufacturing, improved anomaly detection frameworks are needed to reduce the operational risks and the monitoring burden placed on the system operators. However, building such frameworks is challenging, as a sufficiently large amount of defective training data is often not available and frameworks are required to capture both the temporal and contextual dependencies across different time steps while being robust to noise. In this paper, we propose an unsupervised Attention-based Convolutional Long Short-Term Memory (ConvLSTM) Autoencoder with Dynamic Thresholding (ACLAE-DT) framework for anomaly detection and diagnosis in multivariate time series. The framework starts by pre-processing and enriching the data, before constructing feature images to characterize the system statuses across different time steps by capturing the inter-correlations between pairs of time series. Afterwards, the constructed feature images are fed into an attention-based ConvLSTM autoencoder, which aims to encode the constructed feature images and capture the temporal behavior, followed by decoding the compressed knowledge representation to reconstruct the feature images input. The reconstruction errors are then computed and subjected to a statistical-based, dynamic thresholding mechanism to detect and diagnose the anomalies. Evaluation results conducted on real-life manufacturing data demonstrate the performance strengths of the proposed approach over state-of-the-art methods under different experimental settings.
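As a rough illustration of statistical dynamic thresholding on reconstruction errors, the sketch below flags an error when it exceeds the mean plus `k` standard deviations of a trailing window. The abstract does not specify the thresholding rule, so the rule, the function name `dynamic_threshold_flags`, the `window` and `k` parameters, and the error values are all assumptions chosen for illustration, not the ACLAE-DT mechanism itself.

```python
from statistics import mean, stdev

def dynamic_threshold_flags(errors, window=5, k=3.0):
    """Flag errors[i] when it exceeds mean + k * std of the preceding window.

    Hypothetical rule for illustration; the paper's exact mechanism may differ.
    """
    flags = []
    for i, err in enumerate(errors):
        history = errors[max(0, i - window):i]
        if len(history) < 2:
            # Not enough history to estimate a standard deviation yet.
            flags.append(False)
            continue
        threshold = mean(history) + k * stdev(history)
        flags.append(err > threshold)
    return flags

# Made-up reconstruction errors with one obvious spike at index 6.
errors = [0.10, 0.12, 0.11, 0.09, 0.10, 0.11, 0.95, 0.10]
flags = dynamic_threshold_flags(errors)
```

Because the threshold is recomputed from recent history, it adapts as the error level drifts. In this simple form a flagged spike also inflates the window that follows it, which a practical system would probably want to guard against.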

preprint, 2022, arXiv

Intelligent Transportation Systems' Orchestration: Lessons Learned & Potential Opportunities

The growing global deployment of 5G networks has accelerated the digital transformation of businesses and services. This growth has led to the need for new communication technologies that will promote this transformation. 6G is being proposed as the set of technologies and architectures that will achieve this target. Among the main use cases that have emerged for 5G networks and will continue to play a pivotal role in 6G networks is that of Intelligent Transportation Systems (ITSs). With all the projected benefits of developing and deploying efficient and effective ITSs comes a group of unique challenges that need to be addressed. One prominent challenge is ITS orchestration due to the various supporting technologies and heterogeneous networks used to offer the desired ITS applications/services. To that end, this paper focuses on the ITS orchestration challenge in detail by highlighting the related previous works from the literature and listing the lessons learned from current ITS deployment orchestration efforts. It also presents multiple potential data-driven research opportunities in which paradigms such as reinforcement learning and federated learning can be deployed to offer effective and efficient ITS orchestration.

preprint, 2022, arXiv

LCCDE: A Decision-Based Ensemble Framework for Intrusion Detection in The Internet of Vehicles

Modern vehicles, including autonomous vehicles and connected vehicles, have adopted an increasing variety of functionalities through connections and communications with other vehicles, smart devices, and infrastructures. However, the growing connectivity of the Internet of Vehicles (IoV) also increases the vulnerabilities to network attacks. To protect IoV systems against cyber threats, Intrusion Detection Systems (IDSs) that can identify malicious cyber-attacks have been developed using Machine Learning (ML) approaches. To accurately detect various types of attacks in IoV networks, we propose a novel ensemble IDS framework named Leader Class and Confidence Decision Ensemble (LCCDE). It is constructed by determining the best-performing ML model among three advanced ML algorithms (XGBoost, LightGBM, and CatBoost) for every class or type of attack. The class leader models with their prediction confidence values are then utilized to make accurate decisions regarding the detection of various types of cyber-attacks. Experiments on two public IoV security datasets (Car-Hacking and CICIDS2017 datasets) demonstrate the effectiveness of the proposed LCCDE for intrusion detection on both intra-vehicle and external networks.
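The leader-per-class idea described above can be sketched roughly as follows. This is a simplified reading of the abstract, not the published LCCDE algorithm: the per-class F1 scores, the class labels, the tie-breaking rules, and the function names `pick_leaders` and `decide` are all hypothetical.

```python
def pick_leaders(per_class_f1):
    """Map each class to the model with the best (validation) F1 on that class."""
    classes = next(iter(per_class_f1.values())).keys()
    return {c: max(per_class_f1, key=lambda m: per_class_f1[m][c]) for c in classes}

def decide(predictions, leaders):
    """Combine per-model (label, confidence) predictions using the class leaders."""
    labels = {label for label, _ in predictions.values()}
    if len(labels) == 1:
        # All models agree, so there is nothing to arbitrate.
        return labels.pop()
    # Keep only predictions made by the model that leads the class it predicts.
    candidates = [(conf, label)
                  for model, (label, conf) in predictions.items()
                  if leaders[label] == model]
    if candidates:
        return max(candidates)[1]
    # Assumed fallback: trust the single most confident model.
    return max(predictions.values(), key=lambda p: p[1])[0]

# Made-up validation scores for the three learners named in the abstract.
per_class_f1 = {
    "xgboost":  {"normal": 0.99, "dos": 0.95},
    "lightgbm": {"normal": 0.98, "dos": 0.97},
    "catboost": {"normal": 0.97, "dos": 0.96},
}
leaders = pick_leaders(per_class_f1)

# Made-up predictions for one sample: (predicted class, confidence).
predictions = {
    "xgboost":  ("normal", 0.70),
    "lightgbm": ("dos", 0.90),
    "catboost": ("normal", 0.95),
}
decision = decide(predictions, leaders)
```

Here CatBoost is the most confident model but does not lead the class it predicts, so the decision falls to the more confident of the two class leaders.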

preprint, 2022, arXiv

Similarity-Based Predictive Maintenance Framework for Rotating Machinery

Within smart manufacturing, data-driven techniques are commonly adopted for condition monitoring and fault diagnosis of rotating machinery. Classical approaches use supervised learning where a classifier is trained on labeled data to predict or classify different operational states of the machine. However, in most industrial applications, labeled data is limited in terms of its size and type. Hence, it cannot serve the training purpose. In this paper, this problem is tackled by addressing the classification task as a similarity measure to a reference sample rather than a supervised classification task. Similarity-based approaches require a limited amount of labeled data and hence meet the requirements of real-world industrial applications. Accordingly, the paper introduces a similarity-based framework for predictive maintenance (PdM) of rotating machinery. For each operational state of the machine, a reference vibration signal is generated and labeled according to the machine's operational condition. Subsequently, statistical time analysis, fast Fourier transform (FFT), and short-time Fourier transform (STFT) are used to extract features from the captured vibration signals. For each feature type, three similarity metrics, namely structural similarity measure (SSM), cosine similarity, and Euclidean distance, are used to measure the similarity between test signals and reference signals in the feature space. Hence, nine settings in terms of feature type-similarity measure combinations are evaluated. Experimental results confirm the effectiveness of similarity-based approaches in achieving very high accuracy with moderate computational requirements compared to machine learning (ML)-based methods. Further, the results indicate that using FFT features with cosine similarity leads to better performance than the other settings.
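The FFT-plus-cosine-similarity setting can be illustrated with a small sketch. It assumes synthetic sine waves in place of real vibration signals and uses a naive discrete Fourier transform to stay dependency-free; the labels, frequencies, and helper names (`dft_magnitudes`, `sine`) are hypothetical and not taken from the paper.

```python
import cmath
import math

def dft_magnitudes(signal):
    """Magnitude spectrum (first half of the bins) via a naive O(n^2) DFT."""
    n = len(signal)
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(signal)))
            for k in range(n // 2)]

def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def sine(freq, n=64, phase=0.0):
    """Synthetic stand-in for a vibration segment (assumption for this sketch)."""
    return [math.sin(2 * math.pi * freq * t / n + phase) for t in range(n)]

# One labeled reference signal per operational state.
references = {
    "healthy": dft_magnitudes(sine(4)),
    "faulty": dft_magnitudes(sine(12)),
}

# An unlabeled test segment: same dominant frequency as "faulty", shifted phase.
test_features = dft_magnitudes(sine(12, phase=0.7))
scores = {label: cosine_similarity(test_features, ref)
          for label, ref in references.items()}
predicted = max(scores, key=scores.get)
```

Using magnitude spectra makes the comparison insensitive to phase, so the test segment matches its reference even though the two are not aligned in time. No classifier is trained; only one labeled reference per state is needed.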

preprint, 2022, arXiv

Sound Event Classification in an Industrial Environment: Pipe Leakage Detection Use Case

In this work, a multi-stage Machine Learning (ML) pipeline is proposed for pipe leakage detection in an industrial environment. As opposed to other industrial and urban environments, the environment under study includes many interfering background noises, complicating the identification of leaks. Furthermore, the harsh environmental conditions limit the amount of data collected and impose the use of low-complexity algorithms. To address the environment's constraints, the developed ML pipeline applies multiple steps, each addressing the environment's challenges. The proposed ML pipeline first reduces the data dimensionality by feature selection techniques and then incorporates time correlations by extracting time-based features. The resultant features are fed to a low-complexity Support Vector Machine (SVM) that generalizes well to a small amount of data. An extensive experimental procedure was carried out on two datasets, one with background industrial noise and one without, to evaluate the validity of the proposed pipeline. The SVM hyper-parameters and parameters specific to the pipeline steps were tuned as part of the experimental procedure. The best models obtained from the dataset with industrial noise and leaks were applied to datasets without noise and with and without leaks to test their generalizability. The results show that the model produces excellent results with 99% accuracy and F1-scores of 0.93 and 0.9 for the respective datasets.

preprint, 2022, arXiv

Towards Supporting Intelligence in 5G/6G Core Networks: NWDAF Implementation and Initial Analysis

Wireless networks, in the fifth-generation and beyond, must support diverse network applications which will support the numerous and demanding connections of today's and tomorrow's devices. Requirements such as high data rates, low latencies, and reliability are crucial considerations and artificial intelligence is incorporated to achieve these requirements for a large number of connected devices. Specifically, intelligent methods and frameworks for advanced analysis are employed by the 5G Core Network Data Analytics Function (NWDAF) to detect patterns and ascribe detailed action information to accommodate end users and improve network performance. To this end, the work presented in this paper incorporates a functional NWDAF into a 5G network developed using open source software. Furthermore, an analysis of the network data collected by the NWDAF and the valuable insights which can be drawn from it have been presented with detailed Network Function interactions. An example application of such insights used for intelligent network management is outlined. Finally, the expected limitations of 5G networks are discussed as motivation for the development of 6G networks.

preprint, 2021, arXiv

Concept Drift Detection in Federated Networked Systems

As next-generation networks materialize, increasing levels of intelligence are required. Federated Learning has been identified as a key enabling technology of intelligent and distributed networks; however, it is prone to concept drift as with any machine learning application. Concept drift directly affects the model's performance and can result in severe consequences considering the critical and emergency services provided by modern networks. To mitigate the adverse effects of drift, this paper proposes a concept drift detection system leveraging the federated learning updates provided at each iteration of the federated training process. Using dimensionality reduction and clustering techniques, a framework that isolates the system's drifted nodes is presented through experiments using an Intelligent Transportation System as a use case. The presented work demonstrates that the proposed framework is able to detect drifted nodes in a variety of non-iid scenarios at different stages of drift and different levels of system exposure.

preprint, 2021, arXiv

Machine Learning Towards Intelligent Systems: Applications, Challenges, and Opportunities

The emergence of, and continued reliance on, the Internet and related technologies have resulted in the generation of large amounts of data that can be made available for analysis. However, humans do not possess the cognitive capabilities to understand such large amounts of data. Machine learning (ML) provides a mechanism for humans to process large amounts of data, gain insights about the behavior of the data, and make more informed decisions based on the resulting analysis. ML has applications in various fields. This review focuses on some of the fields and applications such as education, healthcare, network security, banking and finance, and social media. Within these fields, there are multiple unique challenges. However, ML can provide solutions to these challenges, as well as create further research opportunities. Accordingly, this work surveys some of the challenges facing the aforementioned fields and presents some of the previous literature works that tackled them. Moreover, it suggests several research opportunities that benefit from the use of ML to address these challenges.

preprint, 2021, arXiv

Making a Case for Federated Learning in the Internet of Vehicles and Intelligent Transportation Systems

With the incoming introduction of 5G networks and the advancement in technologies, such as Network Function Virtualization and Software Defined Networking, new and emerging networking technologies and use cases are taking shape. One such technology is the Internet of Vehicles (IoV), which describes an interconnected system of vehicles and infrastructure. Coupled with recent developments in artificial intelligence and machine learning, the IoV is transformed into an Intelligent Transportation System (ITS). There are, however, several operational considerations that hinder the adoption of ITS systems, including scalability, high availability, and data privacy. To address these challenges, Federated Learning, a collaborative and distributed intelligence technique, is suggested. Through an ITS case study, the ability of a federated model deployed on roadside infrastructure throughout the network to recover from faults by leveraging group intelligence while reducing recovery time and restoring acceptable system performance is highlighted. With a multitude of use cases and benefits, Federated Learning is a key enabler for ITS and is poised to achieve widespread implementation in 5G and beyond networks and applications.

preprint, 2021, arXiv

Mobility Aware Edge Computing Segmentation Towards Localized Orchestration

The current trend in end-user devices' advancements in computing and communication capabilities makes edge computing an attractive solution to pave the way for the coveted ultra-low latency services. The success of the edge computing networking paradigm depends on the proper orchestration of the edge servers. Several edge applications and services are intolerant to latency, especially in 5G and beyond networks, such as intelligent video surveillance, E-health, Internet of Vehicles, and augmented reality applications. The edge devices underwent rapid growth in both capabilities and size to cope with the service demands. Orchestrating them in the cloud was a prominent trend during the past decade. However, the increasing number of edge devices poses a significant burden on the orchestration delay. In addition to the growth in edge devices, the high mobility of users renders traditional orchestration schemes impractical for contemporary edge networks. Proper segmentation of the edge space becomes necessary to adapt these schemes to address these challenges. In this paper, we introduce a segmentation technique employing lax clustering and segregated mobility-based clustering. We then apply latency mapping to these clusters. The proposed scheme's main objective is to create subspaces (segments) that enable light and efficient edge orchestration by reducing the processing time and the core cloud communication overhead. A benchmarking simulation is conducted, with the results showing decreased mobility-related failures and reduced orchestration delay.

preprint, 2021, arXiv

PWPAE: An Ensemble Framework for Concept Drift Adaptation in IoT Data Streams

As the number of Internet of Things (IoT) devices and systems has surged, IoT data analytics techniques have been developed to detect malicious cyber-attacks and secure IoT systems. However, concept drift issues often occur in IoT data analytics, as IoT data often takes the form of dynamic data streams that change over time, causing model degradation and attack detection failure. This is because traditional data analytics models are static models that cannot adapt to data distribution changes. In this paper, we propose a Performance Weighted Probability Averaging Ensemble (PWPAE) framework for drift adaptive IoT anomaly detection through IoT data stream analytics. Experiments on two public datasets show the effectiveness of our proposed PWPAE method compared against state-of-the-art methods.
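A minimal sketch of performance-weighted probability averaging is shown below. The abstract does not state how the weights are derived, so this example assumes each base learner is weighted by a recent accuracy estimate; the function name, the probabilities, and the accuracies are all made up for illustration.

```python
def weighted_probability_average(probas, weights):
    """Average per-class probabilities across learners, weighted by performance."""
    total = sum(weights)
    n_classes = len(probas[0])
    return [sum(w * p[c] for w, p in zip(weights, probas)) / total
            for c in range(n_classes)]

# Class probabilities [normal, attack] from three hypothetical base learners.
probas = [
    [0.9, 0.1],
    [0.4, 0.6],
    [0.3, 0.7],
]
# Assumed weights: each learner's accuracy on recent samples of the stream.
accuracies = [0.95, 0.60, 0.55]

combined = weighted_probability_average(probas, accuracies)
predicted_class = max(range(len(combined)), key=combined.__getitem__)
```

In this example a plain majority vote would pick class 1, but the learner that has performed best recently pulls the weighted average toward class 0. Re-estimating the weights as the stream evolves lets the ensemble shift trust toward whichever learners cope best after a drift.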

preprint, 2020, arXiv

A Machine Learning-Based Migration Strategy for Virtual Network Function Instances

With the growing demand for data connectivity, network service providers are faced with the task of reducing their capital and operational expenses while simultaneously improving network performance and addressing the increased demand. Although Network Function Virtualization (NFV) has been identified as a promising solution, several challenges must be addressed to ensure its feasibility. In this paper, we address the Virtual Network Function (VNF) migration problem by developing the VNF Neural Network for Instance Migration (VNNIM), a migration strategy for VNF instances. The performance of VNNIM is further improved through the optimization of the learning rate hyperparameter through particle swarm optimization. Results show that the VNNIM is very effective in predicting the post-migration server exhibiting a binary accuracy of 99.07% and a delay difference distribution that is centered around a mean of zero when compared to the optimization model. The greatest advantage of VNNIM, however, is its run-time efficiency highlighted through a run-time analysis.

preprint, 2020, arXiv

A Transfer Learning Framework for Anomaly Detection Using Model of Normality

Convolutional Neural Network (CNN) techniques have proven to be very useful in image-based anomaly detection applications. A CNN can be used as a deep feature extractor, with other anomaly detection techniques applied to these features. For this scenario, using transfer learning is common since pretrained models provide deep feature representations that are useful for anomaly detection tasks. Consequently, anomalies can be detected by applying a similarity measure between the extracted features and a defined model of normality. A key factor in such approaches is the decision threshold used for detecting anomalies. While most proposed methods focus on the approach itself, little attention has been paid to decision threshold settings. In this paper, we tackle this problem and propose a well-defined method to set the working-point decision threshold that improves detection accuracy. We introduce a transfer learning framework for anomaly detection based on a similarity measure with a Model of Normality (MoN) and show that with the proposed threshold settings, a significant performance improvement can be achieved. Moreover, the framework has low complexity with relaxed computational requirements.

preprint, 2020, arXiv

Bayesian Optimization with Machine Learning Algorithms Towards Anomaly Detection

Network attacks have become very prevalent, and their rate is growing tremendously. Both organizations and individuals are now concerned about the confidentiality, integrity, and availability of their critical information, which are often impacted by network attacks. To that end, several machine learning-based intrusion detection methods have been developed to secure network infrastructure from such attacks. In this paper, an effective anomaly detection framework is proposed utilizing the Bayesian Optimization technique to tune the parameters of Support Vector Machine with Gaussian Kernel (SVM-RBF), Random Forest (RF), and k-Nearest Neighbor (k-NN) algorithms. The performance of the considered algorithms is evaluated using the ISCX 2012 dataset. Experimental results show the effectiveness of the proposed framework in terms of accuracy rate, precision, low false alarm rate, and recall.

preprint, 2020, arXiv

Depth-Optimized Delay-Aware Tree (DO-DAT) for Virtual Network Function Placement

With the constant increase in demand for data connectivity, network service providers are faced with the task of reducing their capital and operational expenses while ensuring continual improvements to network performance. Although Network Function Virtualization (NFV) has been identified as a solution, several challenges must be addressed to ensure its feasibility. In this paper, we present a machine learning-based solution to the Virtual Network Function (VNF) placement problem. This paper proposes the Depth-Optimized Delay-Aware Tree (DO-DAT) model by using the particle swarm optimization technique to optimize decision tree hyper-parameters. Using the Evolved Packet Core (EPC) as a use case, we evaluate the performance of the model and compare it to a previously proposed model and a heuristic placement strategy.

preprint, 2020, arXiv

Ensemble-based Feature Selection and Classification Model for DNS Typo-squatting Detection

The Domain Name System (DNS) plays an important role in the current IP-based Internet architecture. This is because it performs the domain name to IP resolution. However, the DNS protocol has several security vulnerabilities due to the lack of data integrity and origin authentication within it. This paper focuses on one particular security vulnerability, namely typo-squatting. Typo-squatting refers to the registration of a domain name that is extremely similar to that of an existing popular brand with the goal of redirecting users to malicious/suspicious websites. The danger of typo-squatting is that it can lead to information threat, corporate secret leakage, and can facilitate fraud. This paper builds on our previous work in [1], which only proposed a majority-voting based classifier, by proposing an ensemble-based feature selection and bagging classification model to detect DNS typo-squatting attacks. Experimental results show that the proposed framework achieves high accuracy and precision in identifying the malicious/suspicious typo-squatting domains (a loss of at most 1.5% in accuracy and 5% in precision when compared to the model that used the complete feature set) while having a lower computational complexity due to the smaller feature set (a reduction of more than 50% in feature set size).

preprint, 2020, arXiv

Introducing Virtual Security Functions into Latency-aware Placement for NFV Applications

The shift towards a completely virtualized networking environment is triggered by the emergence of software defined networking and network function virtualization (NFV). These technologies have unlocked immense capabilities for network service providers, enabling them to dynamically adapt to user needs by deploying their network services in real-time through generating Service Function Chains (SFCs). However, NFV still faces challenges that hinder its full potential, including availability guarantees, network security, and other performance requirements. For this reason, the deployment of NFV applications remains critical as it should meet different service level agreements while ensuring the security of the virtualized functions. In this paper, we tackle the challenge of securing these SFCs by introducing virtual security functions (VSFs) into the latency-aware deployment of NFV applications. This work ensures the optimal placement of the SFC components, including the security functions, while considering the performance constraints and the VSFs' operational rules such as functions' alliance, proximity, and anti-affinity. This paper develops a mixed integer linear programming model to optimally place all the requested SFCs while satisfying the above constraints and minimizing the latency of every SFC and the intercommunication delay between the SFC components. The simulations are evaluated against a greedy algorithm on the virtualized Evolved Packet Core use case and have shown promising results in maintaining the security rules while achieving minimum delays.

preprint, 2020, arXiv

Machine Learning for Performance-Aware Virtual Network Function Placement

With the growing demand for data connectivity, network service providers are faced with the task of reducing their capital and operational expenses while simultaneously improving network performance and addressing the increased connectivity demand. Although Network Function Virtualization (NFV) has been identified as a solution, several challenges must be addressed to ensure its feasibility. In this paper, we address the Virtual Network Function (VNF) placement problem by developing a machine learning decision tree model that learns from the effective placement of the various VNF instances forming a Service Function Chain (SFC). The model takes several performance-related features from the network as an input and selects the placement of the various VNF instances on network servers with the objective of minimizing the delay between dependent VNF instances. The benefits of using machine learning are realized by moving away from a complex mathematical modelling of the system and towards a data-based understanding of the system. Using the Evolved Packet Core (EPC) as a use case, we evaluate our model on different data center networks and compare it to the BACON algorithm in terms of the delay between interconnected components and the total delay across the SFC. Furthermore, a time complexity analysis is performed to show the effectiveness of the model in NFV applications.

preprint, 2020, arXiv

Machine Learning Towards Enabling Spectrum-as-a-Service Dynamic Sharing

The growth in wireless broadband users, devices, and novel applications has led to a significant increase in the demand for new radio frequency spectrum. This is expected to grow even further given the projection that the global traffic per year will reach 4.8 zettabytes by 2022. Moreover, it is projected that the number of Internet users will reach 4.8 billion and the number of connected devices will be close to 28.5 billion. However, due to the spectrum being mostly allocated and divided, providing more spectrum to expand existing services or offer new ones has become more challenging. To address this, spectrum sharing has been proposed as a potential solution to improve spectrum utilization efficiency. Adopting effective and efficient spectrum sharing mechanisms is in itself a challenging task given the multitude of levels and techniques that can be integrated to enable it. To that end, this paper provides an overview of the different spectrum sharing levels and techniques that have been proposed in the literature. Moreover, it discusses the potential of adopting dynamic sharing mechanisms by offering a Spectrum-as-a-Service architecture. Furthermore, it describes the potential role of machine learning models in facilitating the automated and efficient dynamic sharing of the spectrum and offering Spectrum-as-a-Service.

preprint, 2020, arXiv

Multi-Component V2X Applications Placement in Edge Computing Environment

Vehicle-to-everything (V2X) services are attracting a lot of attention in the research and industry communities due to their applicability in the landscape of connected and autonomous vehicles. Such applications have stringent performance requirements in terms of complex data processing and low latency communications which are utilized to ensure road safety and improve road conditions. To address these challenges, the placement of V2X applications through leveraging of edge computing paradigm, that distributes the computing capabilities to access points in proximity to the vehicles, presents itself as a viable solution. However, the realistic implementation of the edge enabled V2X applications is hindered by the limited computational power provided at the edge and the nature of V2X applications that are composed of multiple independent V2X basic services. To address these challenges, this work targets the efficient placement of V2X basic services in a highway scenario subject to the delay constraints of V2X applications using them and the limited computational resources at the edge. To that end, this work formulates a binary integer linear programming model that minimizes the delay of V2X applications while satisfying the resource requirements of V2X basic services. To demonstrate the soundness of the approach, simulations with varying vehicle densities were conducted, and the results reported show that it can satisfy the delay requirements of V2X applications.

preprint, 2020, arXiv

Multi-split Optimized Bagging Ensemble Model Selection for Multi-class Educational Data Mining

Predicting students' academic performance has been a research area of interest in recent years with many institutions focusing on improving the students' performance and the education quality. The analysis and prediction of students' performance can be achieved using various data mining techniques. Moreover, such techniques allow instructors to determine possible factors that may affect the students' final marks. To that end, this work analyzes two different undergraduate datasets at two different universities. Furthermore, this work aims to predict the students' performance at two stages of course delivery (20% and 50% respectively). This analysis allows for properly choosing the appropriate machine learning algorithms to use as well as optimize the algorithms' parameters. Furthermore, this work adopts a systematic multi-split approach based on Gini index and p-value. This is done by optimizing a suitable bagging ensemble learner that is built from any combination of six potential base machine learning algorithms. It is shown through experimental results that the posited bagging ensemble models achieve high accuracy for the target group for both datasets.

preprint2020arXiv

Multi-Stage Optimized Machine Learning Framework for Network Intrusion Detection

Cyber-security has garnered significant attention due to the increased dependency of individuals and organizations on the Internet and their concern about the security and privacy of their online activities. Several machine learning (ML)-based network intrusion detection systems (NIDSs) have been developed to protect against malicious online behavior. This paper proposes a novel multi-stage optimized ML-based NIDS framework that reduces computational complexity while maintaining detection performance. This work studies the impact of oversampling techniques on the models' training sample size and determines the minimal suitable training sample size. Furthermore, it compares two feature selection techniques, information gain and correlation-based, and explores their effect on detection performance and time complexity. Moreover, different ML hyper-parameter optimization techniques are investigated to enhance the NIDS's performance. The performance of the proposed framework is evaluated using two recent intrusion detection datasets, the CICIDS 2017 and the UNSW-NB 2015 datasets. Experimental results show that the proposed model significantly reduces the required training sample size (up to 74%) and feature set size (up to 50%). Moreover, the model performance is enhanced with hyper-parameter optimization, with detection accuracies over 99% for both datasets, outperforming recent literature works by 1-2% higher accuracy and 1-2% lower false alarm rate.
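To make the information-gain criterion mentioned above concrete, the following minimal sketch scores categorical features by how much they reduce the entropy of the class label. It assumes discrete feature values and uses a made-up four-row dataset; the feature names and values are hypothetical and are not taken from the CICIDS 2017 or UNSW-NB 2015 datasets, and the paper's actual pipeline may differ.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(feature_values, labels):
    """Entropy of the labels minus their entropy after splitting on the feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


# Hypothetical toy data: "flag" separates the classes perfectly, "proto" not at all.
labels = ["attack", "attack", "normal", "normal"]
features = {
    "flag": ["S0", "S0", "SF", "SF"],
    "proto": ["tcp", "udp", "tcp", "udp"],
}

gains = {name: information_gain(values, labels) for name, values in features.items()}
ranked = sorted(gains, key=gains.get, reverse=True)
print(gains, ranked)
```

Keeping only the top-ranked features is one simple way to shrink the feature set; continuous features would need to be discretized first under this sketch's assumptions.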

preprint2020arXiv

Relationship between Student Engagement and Performance in e-Learning Environment Using Association Rules

The field of e-learning has emerged as a topic of interest in academia due to the increased ease of accessing the Internet using smartphones and wireless devices. One of the challenges facing e-learning platforms is how to keep students motivated and engaged. Moreover, it is also crucial to identify the students who might need help in order to make sure their academic performance does not suffer. To that end, this paper investigates the relationship between student engagement and academic performance. The Apriori association rules algorithm is used to derive a set of rules that relate student engagement to academic performance. Analysis of the experimental results using confidence and lift metrics shows that a positive correlation exists between students' engagement level and their academic performance in a blended e-learning environment. In particular, it is shown that higher engagement often leads to better academic performance. This reinforces previous work that linked engagement and academic performance in traditional classrooms.
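The confidence and lift metrics named above can be computed directly from their standard definitions. The sketch below does this for a single rule on six invented student records; the item names and data are hypothetical, and it evaluates one hand-picked rule without performing the Apriori candidate generation itself.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    if not transactions:
        return 0.0
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimate of P(consequent | antecedent): support(A and C) / support(A)."""
    base = support(transactions, antecedent)
    return support(transactions, antecedent | consequent) / base if base else 0.0


def lift(transactions, antecedent, consequent):
    """Confidence divided by support(C); values above 1 suggest positive association."""
    base = support(transactions, consequent)
    return confidence(transactions, antecedent, consequent) / base if base else 0.0


# Hypothetical records: each set holds one engagement level and one outcome.
records = [
    frozenset({"high_engagement", "good_grade"}),
    frozenset({"high_engagement", "good_grade"}),
    frozenset({"high_engagement", "poor_grade"}),
    frozenset({"low_engagement", "poor_grade"}),
    frozenset({"low_engagement", "good_grade"}),
    frozenset({"low_engagement", "poor_grade"}),
]

rule_from = frozenset({"high_engagement"})
rule_to = frozenset({"good_grade"})
rule_confidence = confidence(records, rule_from, rule_to)
rule_lift = lift(records, rule_from, rule_to)
print(rule_confidence, rule_lift)
```

On this toy data the rule holds for two of the three highly engaged students, and its lift exceeds 1, which is the pattern a positive association between engagement and performance would produce.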

preprint2020arXiv

Systematic Ensemble Model Selection Approach for Educational Data Mining

A plethora of research has focused on predicting students' performance in order to support their development. Many institutions are focused on improving performance and education quality, which can be achieved by utilizing data mining techniques to analyze and predict students' performance and to determine possible factors that may affect their final marks. To address this issue, this work starts by thoroughly exploring and analyzing two different datasets at two separate stages of course delivery (20 percent and 50 percent, respectively) using multiple graphical, statistical, and quantitative techniques. The feature analysis provides insights into the nature of the different features considered and helps in the choice of the machine learning algorithms and their parameters. Furthermore, this work proposes a systematic approach based on the Gini index and p-value to select a suitable ensemble learner from a combination of six potential machine learning algorithms. Experimental results show that the proposed ensemble models achieve high accuracy and a low false positive rate at all stages for both datasets.

preprint2020arXiv

The Need for Advanced Intelligence in NFV Management and Orchestration

With the constant demand for connectivity at an all-time high, Network Service Providers (NSPs) are required to optimize their networks to cope with rising capital and operational expenditures required to meet the growing connectivity demand. A solution to this challenge was presented through Network Function Virtualization (NFV). As network complexity increases and futuristic networks take shape, NSPs are required to incorporate an increasing amount of operational efficiency into their NFV-enabled networks. One such technique is Machine Learning (ML), which has been applied to various entities in NFV-enabled networks, most notably in the NFV Orchestrator. While traditional ML provides tremendous operational efficiencies, including real-time and high-volume data processing, challenges such as privacy, security, scalability, transferability, and concept drift hinder its widespread implementation. Through the adoption of Advanced Intelligence techniques such as Reinforcement Learning and Federated Learning, NSPs can leverage the benefits of traditional ML while simultaneously addressing the major challenges traditionally associated with it. This work presents the benefits of adopting these advanced techniques, provides a list of potential use cases and research topics, and proposes a bottom-up micro-functionality approach to applying these methods of Advanced Intelligence to NFV Management and Orchestration.