Researcher profile

Anwar Haque

Anwar Haque contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 21 - EmergingVerification L1Unclaimed author
10works
0followers
5topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

10 published item(s)

preprint2023arXiv

DRL-GAN: A Hybrid Approach for Binary and Multiclass Network Intrusion Detection

Our increasingly connected world continues to face an ever-growing amount of network-based attacks. Intrusion detection systems (IDS) are an essential security technology for detecting these attacks. Although numerous machine learning-based IDS have been proposed for the detection of malicious network traffic, the majority have difficulty properly detecting and classifying the more uncommon attack types. In this paper, we implement a novel hybrid technique using synthetic data produced by a Generative Adversarial Network (GAN) to use as input for training a Deep Reinforcement Learning (DRL) model. Our GAN model is trained with the NSL-KDD dataset for four attack categories as well as normal network flow. Ultimately, our findings demonstrate that training the DRL on specific synthetic datasets can result in better performance in correctly classifying minority classes over training on the true imbalanced dataset.

preprint2022arXiv

An Empirical Study on Internet Traffic Prediction Using Statistical Rolling Model

Real-world IP network traffic is susceptible to external and internal factors such as new internet service integration, traffic migration, internet application, etc. Due to these factors, the actual internet traffic is non-linear and challenging to analyze using a statistical model for future prediction. In this paper, we investigated and evaluated the performance of different statistical prediction models for real IP network traffic; and showed a significant improvement in prediction using the rolling prediction technique. Initially, a set of best hyper-parameters for the corresponding prediction model is identified by analyzing the traffic characteristics and implementing a grid search algorithm based on the minimum Akaike Information Criterion (AIC). Then, we performed a comparative performance analysis among AutoRegressive Integrated Moving Average (ARIMA), Seasonal ARIMA (SARIMA), SARIMA with eXogenous factors (SARIMAX), and Holt-Winter for single-step prediction. The seasonality of our traffic has been explicitly modeled using SARIMA, which reduces the rolling prediction Mean Average Percentage Error (MAPE) by more than 4% compared to ARIMA (incapable of handling the seasonality). We further improved traffic prediction using SARIMAX to learn different exogenous factors extracted from the original traffic, which yielded the best rolling prediction results with a MAPE of 6.83%. Finally, we applied the exponential smoothing technique to handle the variability in traffic following the Holt-Winter model, which exhibited a better prediction than ARIMA (around 1.5% less MAPE). The rolling prediction technique reduced prediction error using real Internet Service Provider (ISP) traffic data by more than 50\% compared to the standard prediction method.

preprint2022arXiv

Deep Sequence Modeling for Anomalous ISP Traffic Prediction

Internet traffic in the real world is susceptible to various external and internal factors which may abruptly change the normal traffic flow. Those unexpected changes are considered outliers in traffic. However, deep sequence models have been used to predict complex IP traffic, but their comparative performance for anomalous traffic has not been studied extensively. In this paper, we investigated and evaluated the performance of different deep sequence models for anomalous traffic prediction. Several deep sequences models were implemented to predict real traffic without and with outliers and show the significance of outlier detection in real-world traffic prediction. First, two different outlier detection techniques, such as the Three-Sigma rule and Isolation Forest, were applied to identify the anomaly. Second, we adjusted those abnormal data points using the Backward Filling technique before training the model. Finally, the performance of different models was compared for abnormal and adjusted traffic. LSTM_Encoder_Decoder (LSTM_En_De) is the best prediction model in our experiment, reducing the deviation between actual and predicted traffic by more than 11\% after adjusting the outliers. All other models, including Recurrent Neural Network (RNN), Long Short-Term Memory (LSTM), LSTM_En_De with Attention layer (LSTM_En_De_Atn), Gated Recurrent Unit (GRU), show better prediction after replacing the outliers and decreasing prediction error by more than 29%, 24%, 19%, and 10% respectively. Our experimental results indicate that the outliers in the data can significantly impact the quality of the prediction. Thus, outlier detection and mitigation assist the deep sequence model in learning the general trend and making better predictions.

preprint2022arXiv

Transfer Learning Based Efficient Traffic Prediction with Limited Training Data

Efficient prediction of internet traffic is an essential part of Self Organizing Network (SON) for ensuring proactive management. There are many existing solutions for internet traffic prediction with higher accuracy using deep learning. But designing individual predictive models for each service provider in the network is challenging due to data heterogeneity, scarcity, and abnormality. Moreover, the performance of the deep sequence model in network traffic prediction with limited training data has not been studied extensively in the current works. In this paper, we investigated and evaluated the performance of the deep transfer learning technique in traffic prediction with inadequate historical data leveraging the knowledge of our pre-trained model. First, we used a comparatively larger real-world traffic dataset for source domain prediction based on five different deep sequence models: Recurrent Neural Network (RNN), Long Short-Term Memory (LSTM), LSTM Encoder-Decoder (LSTM_En_De), LSTM_En_De with Attention layer (LSTM_En_De_Atn), and Gated Recurrent Unit (GRU). Then, two best-performing models, LSTM_En_De and LSTM_En_De_Atn, from the source domain with an accuracy of 96.06% and 96.05% are considered for the target domain prediction. Finally, four smaller traffic datasets collected for four particular sources and destination pairs are used in the target domain to compare the performance of the standard learning and transfer learning in terms of accuracy and execution time. According to our experimental result, transfer learning helps to reduce the execution time for most cases, while the model's accuracy is improved in transfer learning with a larger training session.

preprint2022arXiv

Wavelet-Based Hybrid Machine Learning Model for Out-of-distribution Internet Traffic Prediction

Efficient prediction of internet traffic is essential for ensuring proactive management of computer networks. Nowadays, machine learning approaches show promising performance in modeling real-world complex traffic. However, most existing works assumed that model training and evaluation data came from identical distribution. But in practice, there is a high probability that the model will deal with data from a slightly or entirely unknown distribution in the deployment phase. This paper investigated and evaluated machine learning performances using eXtreme Gradient Boosting, Light Gradient Boosting Machine, Stochastic Gradient Descent, Gradient Boosting Regressor, CatBoost Regressor, and their stacked ensemble model using data from both identical and out-of distribution. Also, we proposed a hybrid machine learning model integrating wavelet decomposition for improving out-of-distribution prediction as standalone models were unable to generalize very well. Our experimental results show the best performance of the standalone ensemble model with an accuracy of 96.4%, while the hybrid ensemble model improved it by 1% for in-distribution data. But its performance dropped significantly when tested with three different datasets having a distribution shift than the training set. However, our proposed hybrid model considerably reduces the performance gap between identical and out-of-distribution evaluation compared with the standalone model, indicating the decomposition technique's effectiveness in the case of out-of-distribution generalization.

preprint2021arXiv

Appliance Operation Modes Identification Using Cycles Clustering

The increasing cost, energy demand, and environmental issues has led many researchers to find approaches for energy monitoring, and hence energy conservation. The emerging technologies of Internet of Things (IoT) and Machine Learning (ML) deliver techniques that have the potential to efficiently conserve energy and improve the utilization of energy consumption. Smart Home Energy Management Systems (SHEMSs) have the potential to contribute in energy conservation through the application of Demand Response (DR) in the residential sector. In this paper, we propose appliances Operation Modes Identification using Cycles Clustering (OMICC) which is SHEMS fundamental approach that utilizes the sensed residential disaggregated power consumption in supporting DR by providing consumers the opportunity to select lighter appliance operation modes. The cycles of the Single Usage Profile (SUP) of an appliance are extracted and reformed into features in terms of clusters of cycles. These features are then used to identify the operation mode used in every occurrence using K-Nearest Neighbors (KNN). Operation modes identification is considered a basis for many potential smart DR applications within SHEMS towards the consumers or the suppliers

preprint2021arXiv

Towards Network Traffic Monitoring Using Deep Transfer Learning

Network traffic is growing at an outpaced speed globally. The modern network infrastructure makes classic network intrusion detection methods inefficient to classify an inflow of vast network traffic. This paper aims to present a modern approach towards building a network intrusion detection system (NIDS) by using various deep learning methods. To further improve our proposed scheme and make it effective in real-world settings, we use deep transfer learning techniques where we transfer the knowledge learned by our model in a source domain with plentiful computational and data resources to a target domain with sparse availability of both the resources. Our proposed method achieved 98.30% classification accuracy score in the source domain and an improved 98.43% classification accuracy score in the target domain with a boost in the classification speed using UNSW-15 dataset. This study demonstrates that deep transfer learning techniques make it possible to construct large deep learning models to perform network classification, which can be deployed in the real world target domains where they can maintain their classification performance and improve their classification speed despite the limited accessibility of resources.

preprint2020arXiv

Demand Response For Residential Uses: A Data Analytics Approach

In the Smart Grid environment, the advent of intelligent measuring devices facilitates monitoring appliance electricity consumption. This data can be used in applying Demand Response (DR) in residential houses through data analytics, and developing data mining techniques. In this research, we introduce a smart system foundation that is applied to user's disaggregated power consumption data. This system encourages the users to apply DR by changing their behaviour of using heavier operation modes to lighter modes, and by encouraging users to shift their usages to off-peak hours. First, we apply Cross Correlation (XCORR) to detect times of the occurrences when an appliance is being used. We then use The Dynamic Time Warping (DTW) to recognize the operation mode used.

preprint2020arXiv

Machine Learning Towards Enabling Spectrum-as-a-Service Dynamic Sharing

The growth in wireless broadband users, devices, and novel applications has led to a significant increase in the demand for new radio frequency spectrum. This is expected to grow even further given the projection that the global traffic per year will reach 4.8 zettabytes by 2022. Moreover, it is projected that the number of Internet users will reach 4.8 billion and the number of connected devices will be close 28.5 billion devices. However, due to the spectrum being mostly allocated and divided, providing more spectrum to expand existing services or offer new ones has become more challenging. To address this, spectrum sharing has been proposed as a potential solution to improve spectrum utilization efficiency. Adopting effective and efficient spectrum sharing mechanisms is in itself a challenging task given the multitude of levels and techniques that can be integrated to enable it. To that end, this paper provides an overview of the different spectrum sharing levels and techniques that have been proposed in the literature. Moreover, it discusses the potential of adopting dynamic sharing mechanisms by offering Spectrum-as-a-Service architecture. Furthermore, it describes the potential role of machine learning models in facilitating the automated and efficient dynamic sharing of the spectrum and offering Spectrum-as-a-Service.

preprint2020arXiv

Multi-Component V2X Applications Placement in Edge Computing Environment

Vehicle-to-everything (V2X) services are attracting a lot of attention in the research and industry communities due to their applicability in the landscape of connected and autonomous vehicles. Such applications have stringent performance requirements in terms of complex data processing and low latency communications which are utilized to ensure road safety and improve road conditions. To address these challenges, the placement of V2X applications through leveraging of edge computing paradigm, that distributes the computing capabilities to access points in proximity to the vehicles, presents itself as a viable solution. However, the realistic implementation of the edge enabled V2X applications is hindered by the limited computational power provided at the edge and the nature of V2X applications that are composed of multiple independent V2X basic services. To address these challenges, this work targets the efficient placement of V2X basic services in a highway scenario subject to the delay constraints of V2X applications using them and the limited computational resources at the edge. To that end, this work formulates a binary integer linear programming model that minimizes the delay of V2X applications while satisfying the resource requirements of V2X basic services. To demonstrate the soundness of the approach, simulations with varying vehicle densities were conducted, and the results reported show that it can satisfy the delay requirements of V2X applications.