Source author record

Mohammad Saidur Rahman

Mohammad Saidur Rahman appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Cryptography and Security; Machine Learning; Artificial Intelligence; cs.CY; Databases; Distributed, Parallel, and Cluster Computing

Catalog footprint

What is connected

6 works

6 topics

4 close collaborators

preprint · 2026 · arXiv

FreeMOCA: Memory-Free Continual Learning for Malicious Code Analysis

As over 200 million new malware samples are identified each year, antivirus systems must continuously adapt to the evolving threat landscape. However, retraining solely on new samples leads to catastrophic forgetting and exploitable blind spots, while retraining on the entire dataset incurs substantial computational cost. We propose FreeMOCA, a memory- and compute-efficient continual learning framework for malicious code analysis that preserves prior knowledge via adaptive layer-wise interpolation between consecutive task updates, leveraging the fact that warm-started task optima are connected by low-loss paths in parameter space. We evaluate FreeMOCA in both class-incremental (Class-IL) and domain-incremental (Domain-IL) settings on large-scale Windows (EMBER) and Android (AZ) malware benchmarks. FreeMOCA achieves substantial gains in Class-IL, outperforming 11 baselines on both EMBER and AZ benchmarks. It also significantly reduces forgetting, achieving the best retention across baselines, and improving accuracy by up to 42% and 37% on EMBER and AZ, respectively. These results demonstrate that warm-started interpolation in parameter space provides a scalable and effective alternative to replay for continual malware detection. Code is available at: https://github.com/IQSeC-Lab/FreeMOCA.
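The layer-wise interpolation idea in this abstract can be illustrated with a minimal sketch. The abstract does not say how the per-layer coefficients are chosen, so the adaptive rule below (layers that moved further from the previous task keep more of the old weights) is purely an assumption for illustration, as are the names `interpolate_layerwise` and `base_alpha`. It is not the authors' implementation; see their repository for that.

```python
from typing import Dict, List

Params = Dict[str, List[float]]  # layer name -> flattened weights (hypothetical layout)


def l2_distance(a: List[float], b: List[float]) -> float:
    """Euclidean distance between two equally sized weight vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def interpolate_layerwise(prev: Params, new: Params, base_alpha: float = 0.5) -> Params:
    """Blend the previous-task and new-task weights one layer at a time.

    Assumed rule (not from the paper): the step toward the new weights
    shrinks as the layer's relative drift grows.
    """
    merged: Params = {}
    for name, old_w in prev.items():
        new_w = new[name]
        drift = l2_distance(old_w, new_w)
        # Norm of the old weights; fall back to 1.0 to avoid dividing by zero.
        norm = l2_distance(old_w, [0.0] * len(old_w)) or 1.0
        alpha = base_alpha / (1.0 + drift / norm)
        merged[name] = [(1.0 - alpha) * o + alpha * n for o, n in zip(old_w, new_w)]
    return merged


prev_task = {"fc1": [1.0, 0.0], "fc2": [0.0, 2.0]}
new_task = {"fc1": [1.0, 0.0], "fc2": [0.0, 4.0]}
merged = interpolate_layerwise(prev_task, new_task)
```

Because only the previous and current parameter sets are needed, a scheme of this shape stores no past samples, which is the sense in which the abstract contrasts it with replay.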

preprint · 2026 · arXiv

McNdroid: A Longitudinal Multimodal Benchmark for Robust Drift Detection in Android Malware

Machine learning (ML) in real-world systems must contend with concept drift, adversarial actors, and a spectrum of potential features with varying costs and benefits. Malware naturally exhibits all of these complexities, but for the same reason, it is challenging to curate and organize data to study these factors. We present McNdroid, to our knowledge the largest longitudinal multimodal Android malware benchmark for malware detection and drift analysis. McNdroid spans 2013 to 2025, excluding 2015, and represents each application with three aligned modalities: static features from manifests and smali code, dynamic behavioral features from sandbox execution, and graph-based features from function-call graphs. Using temporally separated splits, we evaluate standard ML and deep-learning detectors across increasing train-test time gaps. Results show clear temporal degradation, while multimodal fusion outperforms the best single modality across long-term temporal gaps. Cross-modal agreement also declines over time, suggesting that drift affects both individual feature spaces and the consistency among modalities. We further analyze modality-specific drift, malware-family evolution, and temporal changes in model explanations. We publicly release McNdroid, benchmark splits, and code to support reproducible research on temporal generalization and robust multimodal learning in security-critical, non-stationary settings.
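A temporally separated evaluation of the kind described here might be organized as in the sketch below. The record layout `(year, app_id)` and the function name `temporal_splits` are hypothetical; the released benchmark splits are the authoritative version.

```python
from typing import Dict, List, Tuple

Sample = Tuple[int, str]  # (year, app identifier): an assumed record layout


def temporal_splits(
    samples: List[Sample], train_year: int
) -> Dict[int, Tuple[List[Sample], List[Sample]]]:
    """Train on one year and test on each later year separately.

    Returns a mapping from the train-test gap (in years) to a
    (train, test) pair, so degradation can be plotted against the gap.
    """
    train = [s for s in samples if s[0] == train_year]
    later_years = sorted({year for year, _ in samples if year > train_year})
    return {
        year - train_year: (train, [s for s in samples if s[0] == year])
        for year in later_years
    }


samples = [(2013, "a"), (2013, "b"), (2014, "c"), (2016, "d")]
splits = temporal_splits(samples, train_year=2013)
gaps = sorted(splits)  # years with no data simply produce no split
```

Keying the splits by gap rather than by calendar year makes it straightforward to compare detectors at the same temporal distance from their training data.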

preprint · 2022 · arXiv

On the Limitations of Continual Learning for Malware Classification

Malicious software (malware) classification offers a unique challenge for continual learning (CL) regimes due to the volume of new samples received on a daily basis and the evolution of malware to exploit new vulnerabilities. On a typical day, antivirus vendors receive hundreds of thousands of unique pieces of software, both malicious and benign, and over the course of the lifetime of a malware classifier, more than a billion samples can easily accumulate. Given the scale of the problem, sequential training using continual learning techniques could provide substantial benefits in reducing training and storage overhead. To date, however, there has been no exploration of CL applied to malware classification tasks. In this paper, we study 11 CL techniques applied to three malware tasks covering common incremental learning scenarios, including task, class, and domain incremental learning (IL). Specifically, using two realistic, large-scale malware datasets, we evaluate the performance of the CL methods on both binary malware classification (Domain-IL) and multi-class malware family classification (Task-IL and Class-IL) tasks. To our surprise, continual learning methods significantly underperformed naive Joint replay of the training data in nearly all settings, in some cases reducing accuracy by more than 70 percentage points. A simple approach of selectively replaying 20% of the stored data achieves better performance, with 50% of the training time compared to Joint replay. Finally, we discuss potential reasons for the unexpectedly poor performance of the CL techniques, with the hope that it spurs further research on developing techniques that are more effective in the malware classification domain.
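As a rough illustration of partial replay, the sketch below mixes a fraction of the stored samples into each new training set. The abstract does not describe how the 20% is selected, so uniform random sampling is used here as a stand-in; the function name `build_replay_set` and its parameters are hypothetical.

```python
import random
from typing import List, Sequence


def build_replay_set(
    stored: Sequence[int], new: Sequence[int], fraction: float = 0.2, seed: int = 0
) -> List[int]:
    """Combine a sampled fraction of stored data with all new data.

    Uniform sampling is an assumption; the paper's selective strategy
    may differ.
    """
    rng = random.Random(seed)  # seeded for reproducible sampling
    k = round(len(stored) * fraction)
    replayed = rng.sample(list(stored), k)
    return replayed + list(new)


stored_ids = list(range(100))  # identifiers of previously seen samples
new_ids = [100, 101]           # identifiers of newly arrived samples
training_ids = build_replay_set(stored_ids, new_ids)
```

Training on this smaller set instead of the full history is what trades a little retained data for a shorter training run.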

preprint · 2020 · arXiv

Optimizing Smart Grid Aggregators and Measuring Degree of Privacy in a Distributed Trust Based Anonymous Aggregation System

A smart grid is an advanced method of supplying electricity to consumers that alleviates the limitations of the existing system. It requires frequent transmission of meter readings from the end user to the supplier, and this frequent data transmission poses privacy risks. Several works have been proposed to solve this problem, but they cannot ensure privacy at the optimal level. This work is based on a distributed trust-based data aggregation system leveraging a secret sharing mechanism. In this work, we show that three aggregators are enough to ensure consumers' privacy in a distributed trust-based system. We leverage the idea of anonymity in our research and show that neither an active attacker nor a passive attacker can breach consumers' privacy. We prove our concept mathematically and with a cryptographic game-based mechanism. We name our proposed system the "Distributed Trust Based Anonymous System (DTBAS)".
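To make the secret-sharing aggregation concrete, here is a minimal sketch using additive secret sharing across three aggregators. The abstract does not name the specific scheme, so the additive construction, the modulus, and the function names are assumptions for illustration only, not the DTBAS protocol itself.

```python
import secrets
from typing import List

MODULUS = 2**61 - 1  # hypothetical modulus, far larger than any total reading


def split_reading(reading: int, n_aggregators: int = 3) -> List[int]:
    """Split one meter reading into additive shares that sum to it mod MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_aggregators - 1)]
    shares.append((reading - sum(shares)) % MODULUS)
    return shares


def aggregate(readings: List[int], n_aggregators: int = 3) -> int:
    """Each aggregator sums only the shares it receives; the supplier
    then combines the per-aggregator totals to recover the overall sum."""
    totals = [0] * n_aggregators
    for reading in readings:
        for i, share in enumerate(split_reading(reading, n_aggregators)):
            totals[i] = (totals[i] + share) % MODULUS
    return sum(totals) % MODULUS


total_consumption = aggregate([120, 75, 300])
```

In this construction no single aggregator sees anything but uniformly random values, so an individual reading is revealed only if all aggregators pool their shares.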

preprint · 2015 · arXiv

Student Satisfaction mining in a typical core course of Computer Science

Student satisfaction plays a vital role in the success of an educational institution. Hence, many institutions continuously improve their services to create a supportive learning environment that satisfies student needs. For this reason, educational institutions collect student satisfaction data to make decisions about institutional quality, but so far quality cannot be determined this way, because student satisfaction is a complex matter influenced by a variety of characteristics of students and institutions. Many studies have examined student satisfaction with college services, programs, student accommodation facilities, student-faculty interaction, consulting hours, and so on. However, there is still no standard method for assessing satisfaction in the case of a core course. In this research, we determine the attributes that most heavily affect student satisfaction in a core course of computer science, as well as the current status of the other attributes.

preprint · 2012 · arXiv

A Coherent Distributed Grid Service for Assimilation and Unification of Heterogeneous Data Source

Grid services are heavily used for handling large distributed computations. They are also very useful for data-intensive applications where data are distributed across different sites. Most of the data grid services used in such situations are meant for homogeneous data sources. In the case of heterogeneous data sources, most of the available grid services are designed in such a way that the sources must have identical schema definitions for smooth operation. But there can be situations where the grid site databases are heterogeneous and their schema definitions differ from the central schema definition. In this paper, we propose a lightweight coherent grid service for heterogeneous data sources that is very easy to install. It can map and convert the central SQL schema into that of the grid members and send queries to retrieve the corresponding results from heterogeneous data sources.