Researcher profile

Sanjay K. Sahay

Sanjay K. Sahay contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 21 - EmergingVerification L1Unclaimed author
9works
0followers
6topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

9 published item(s)

preprint2023arXiv

Gravitational wave: generation and detection techniques

In this paper, we review the theoretical basis for generation of gravitational waves and the detection techniques used to detect a gravitational wave. To materialize this goal in a thorough way we first start with a mathematical background for general relativity from which a clue for gravitational wave was conceived by Einstein. Thereafter we give the classification scheme of gravitational waves such as (i) continuous gravitational waves, (ii) compact binary inspiral gravitational waves and (iii) stochastic gravitational wave. Necessary mathematical insight into gravitational waves from binaries are also dealt with which follows detection of gravitational waves based on the frequency classification. Ground based observatories as well as space borne gravitational wave detectors are discussed in a length. We have provided an overview on the inflationary gravitational waves. In connection to data analysis by matched filtering there are a few highlights on the techniques, e.g. (i) Random noise, (ii) power spectrum, (iii) shot noise, and (iv) Gaussian noise. Optimal detection statistics for a gravitational wave detection is also in the pipeline of the discussion along with detailed necessity of the matched filter and deep learning.

preprint2022arXiv

Deep Reinforcement Learning for Cybersecurity Threat Detection and Protection: A Review

The cybersecurity threat landscape has lately become overly complex. Threat actors leverage weaknesses in the network and endpoint security in a very coordinated manner to perpetuate sophisticated attacks that could bring down the entire network and many critical hosts in the network. Increasingly advanced deep and machine learning-based solutions have been used in threat detection and protection. The application of these techniques has been reviewed well in the scientific literature. Deep Reinforcement Learning has shown great promise in developing AI-based solutions for areas that had earlier required advanced human cognizance. Different techniques and algorithms under deep reinforcement learning have shown great promise in applications ranging from games to industrial processes, where it is claimed to augment systems with general AI capabilities. These algorithms have recently also been used in cybersecurity, especially in threat detection and endpoint protection, where these are showing state-of-the-art results. Unlike supervised machines and deep learning, deep reinforcement learning is used in more diverse ways and is empowering many innovative applications in the threat defense landscape. However, there does not exist any comprehensive review of these unique applications and accomplishments. Therefore, in this paper, we intend to fill this gap and provide a comprehensive review of the different applications of deep reinforcement learning in cybersecurity threat detection and protection.

preprint2021arXiv

Detection of Malicious Android Applications: Classical Machine Learning vs. Deep Neural Network Integrated with Clustering

Today anti-malware community is facing challenges due to the ever-increasing sophistication and volume of malware attacks developed by adversaries. Traditional malware detection mechanisms are not able to cope-up with next-generation malware attacks. Therefore in this paper, we propose effective and efficient Android malware detection models based on machine learning and deep learning integrated with clustering. We performed a comprehensive study of different feature reduction, classification and clustering algorithms over various performance metrics to construct the Android malware detection models. Our experimental results show that malware detection models developed using Random Forest eclipsed deep neural network and other classifiers on the majority of performance metrics. The baseline Random Forest model without any feature reduction achieved the highest AUC of 99.4%. Also, the segregating of vector space using clustering integrated with Random Forest further boosted the AUC to 99.6% in one cluster and direct detection of Android malware in another cluster, thus reducing the curse of dimensionality. Additionally, we found that feature reduction in detection models does improve the model efficiency (training and testing time) many folds without much penalty on the effectiveness of the detection model.

preprint2021arXiv

DRLDO: A novel DRL based De-ObfuscationSystem for Defense against Metamorphic Malware

In this paper, we propose a novel mechanism to normalize metamorphic and obfuscated malware down at the opcode level and hence create an advanced metamorphic malware de-obfuscation and defense system. We name this system DRLDO, for Deep Reinforcement Learning based De-Obfuscator. With the inclusion of the DRLDO as a sub-component, an existing Intrusion Detection System could be augmented with defensive capabilities against 'zero-day' attacks from obfuscated and metamorphic variants of existing malware. This gains importance, not only because there exists no system to date that uses advanced DRL to intelligently and automatically normalize obfuscation down even to the opcode level, but also because the DRLDO system does not mandate any changes to the existing IDS. The DRLDO system does not even mandate the IDS' classifier to be retrained with any new dataset containing obfuscated samples. Hence DRLDO could be easily retrofitted into any existing IDS deployment. We designed, developed, and conducted experiments on the system to evaluate the same against multiple-simultaneous attacks from obfuscations generated from malware samples from a standardized dataset that contains multiple generations of malware. Experimental results prove that DRLDO was able to successfully make the otherwise un-detectable obfuscated variants of the malware detectable by an existing pre-trained malware classifier. The detection probability was raised well above the cut-off mark to 0.6 for the classifier to detect the obfuscated malware unambiguously. Further, the de-obfuscated variants generated by DRLDO achieved a very high correlation (of 0.99) with the base malware. This observation validates that the DRLDO system is actually learning to de-obfuscate and not exploiting a trivial trick.

preprint2021arXiv

Identification of Significant Permissions for Efficient Android Malware Detection

Since Google unveiled Android OS for smartphones, malware are thriving with 3Vs, i.e. volume, velocity, and variety. A recent report indicates that one out of every five business/industry mobile application leaks sensitive personal data. Traditional signature/heuristic-based malware detection systems are unable to cope up with current malware challenges and thus threaten the Android ecosystem. Therefore recently researchers have started exploring machine learning and deep learning based malware detection systems. In this paper, we performed a comprehensive feature analysis to identify the significant Android permissions and propose an efficient Android malware detection system using machine learning and deep neural network. We constructed a set of $16$ permissions ($8\%$ of the total set) derived from variance threshold, auto-encoders, and principal component analysis to build a malware detection engine that consumes less train and test time without significant compromise on the model accuracy. Our experimental results show that the Android malware detection model based on the random forest classifier is most balanced and achieves the highest area under curve score of $97.7\%$, which is better than the current state-of-art systems. We also observed that deep neural networks attain comparable accuracy to the baseline results but with a massive computational penalty.

preprint2021arXiv

Robust Android Malware Detection System against Adversarial Attacks using Q-Learning

The current state-of-the-art Android malware detection systems are based on machine learning and deep learning models. Despite having superior performance, these models are susceptible to adversarial attacks. Therefore in this paper, we developed eight Android malware detection models based on machine learning and deep neural network and investigated their robustness against adversarial attacks. For this purpose, we created new variants of malware using Reinforcement Learning, which will be misclassified as benign by the existing Android malware detection models. We propose two novel attack strategies, namely single policy attack and multiple policy attack using reinforcement learning for white-box and grey-box scenario respectively. Putting ourselves in the adversary's shoes, we designed adversarial attacks on the detection models with the goal of maximizing fooling rate, while making minimum modifications to the Android application and ensuring that the app's functionality and behavior do not change. We achieved an average fooling rate of 44.21% and 53.20% across all the eight detection models with a maximum of five modifications using a single policy attack and multiple policy attack, respectively. The highest fooling rate of 86.09% with five changes was attained against the decision tree-based model using the multiple policy approach. Finally, we propose an adversarial defense strategy that reduces the average fooling rate by threefold to 15.22% against a single policy attack, thereby increasing the robustness of the detection models i.e. the proposed model can effectively detect variants (metamorphic) of malware. The experimental analysis shows that our proposed Android malware detection system using reinforcement learning is more robust against adversarial attacks.

preprint2020arXiv

Assessment of the Relative Importance of different hyper-parameters of LSTM for an IDS

Recurrent deep learning language models like the LSTM are often used to provide advanced cyber-defense for high-value assets. The underlying assumption for using LSTM networks for malware-detection is that the op-code sequence of malware could be treated as a (spoken) language representation. There are differences between any spoken-language (sequence of words/sentences) and the machine-language (sequence of op-codes). In this paper, we demonstrate that due to these inherent differences, an LSTM model with its default configuration as tuned for a spoken-language, may not work well to detect malware (using its op-code sequence) unless the network's essential hyper-parameters are tuned appropriately. In the process, we also determine the relative importance of all the different hyper-parameters of an LSTM network as applied to malware detection using their op-code sequence representations. We experimented with different configurations of LSTM networks, and altered hyper-parameters like the embedding-size, number of hidden layers, number of LSTM-units in a hidden layer, pruning/padding-length of the input-vector, activation-function, and batch-size. We discovered that owing to the enhanced complexity of the malware/machine-language, the performance of an LSTM network configured for an Intrusion Detection System, is very sensitive towards the number-of-hidden-layers, input sequence-length, and the choice of the activation-function. Also, for (spoken) language-modeling, the recurrent architectures by-far outperform their non-recurrent counterparts. Therefore, we also assess how sequential DL architectures like the LSTM compare against their non-sequential counterparts like the MLP-DNN for the purpose of malware-detection.

preprint2020arXiv

Secure and Energy-Efficient Key-Agreement Protocol for Multi-Server Architecture

Authentication schemes are practised globally to verify the legitimacy of users and servers for the exchange of data in different facilities. Generally, the server verifies a user to provide resources for different purposes. But due to the large network system, the authentication process has become complex and therefore, time-to-time different authentication protocols have been proposed for the multi-server architecture. However, most of the protocols are vulnerable to various security attacks and their performance is not efficient. In this paper, we propose a secure and energy-efficient remote user authentication protocol for multi-server systems. The results show that the proposed protocol is comparatively ~44% more efficient and needs ~38% less communication cost. We also demonstrate that with only two-factor authentication, the proposed protocol is more secure from the earlier related authentication schemes.

preprint2020arXiv

Secure Communication Protocol for Smart Transportation Based on Vehicular Cloud

The pioneering concept of connected vehicles has transformed the way of thinking for researchers and entrepreneurs by collecting relevant data from nearby objects. However, this data is useful for a specific vehicle only. Moreover, vehicles get a high amount of data (e.g., traffic, safety, and multimedia infotainment) on the road. Thus, vehicles expect adequate storage devices for this data, but it is infeasible to have a large memory in each vehicle. Hence, the vehicular cloud computing (VCC) framework came into the picture to provide a storage facility by connecting a road-side-unit (RSU) with the vehicular cloud (VC). In this, data should be saved in an encrypted form to preserve security, but there is a challenge to search for information over encrypted data. Next, we understand that many of vehicular communication schemes are inefficient for data transmissions due to its poor performance results and vulnerable to different fundamental security attacks. Accordingly, on-device performance is critical, but data damages and secure on-time connectivity are also significant challenges in a public environment. Therefore, we propose reliable data transmission protocols for cutting-edge architecture to search data from the storage, to resist against various security attacks, and provide better performance results. Thus, the proposed data transmission protocol is useful in diverse smart city applications (business, safety, and entertainment) for the benefits of society.