Source author record

Nasir U. Eisty

Nasir U. Eisty appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Software Engineering Computer Vision Cryptography and Security Machine Learning

Catalog footprint

What is connected

7works

4topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2026arXiv

Integrating APK Image and Text Data for Enhanced Threat Detection: A Multimodal Deep Learning Approach to Android Malware

As zero-day Android malware attacks grow more sophisticated, recent research highlights the effectiveness of using image-based representations of malware bytecode to detect previously unseen threats. However, existing studies often overlook how image type and resolution affect detection and ignore valuable textual data in Android Application Packages (APKs), such as permissions and metadata, limiting their ability to fully capture malicious behavior. The integration of multimodality, which combines image and text data, has gained momentum as a promising approach to address these limitations. This paper proposes a multimodal deep learning framework integrating APK images and textual features to enhance Android malware detection. We systematically evaluate various image types and resolutions across different Convolutional Neural Networks (CNN) architectures, including VGG, ResNet-152, MobileNet, DenseNet, EfficientNet-B4, and use LLaMA-2, a large language model, to extract and annotate textual features for improved analysis. The findings demonstrate that RGB images at higher resolutions (e.g., 256x256, 512x512) achieve superior classification performance, while the multimodal integration of image and text using the CLIP model reveals limited potential. Overall, this research highlights the importance of systematically evaluating image attributes and integrating multimodal data to develop effective malware detection for Android systems.

preprint2026arXiv

LLMs in Code Vulnerability Analysis: A Proof of Concept

Context: Traditional software security analysis methods struggle to keep pace with the scale and complexity of modern codebases, requiring intelligent automation to detect, assess, and remediate vulnerabilities more efficiently and accurately. Objective: This paper explores the incorporation of code-specific and general-purpose Large Language Models (LLMs) to automate critical software security tasks, such as identifying vulnerabilities, predicting severity and access complexity, and generating fixes as a proof of concept. Method: We evaluate five pairs of recent LLMs, including both code-based and general-purpose open-source models, on two recognized C/C++ vulnerability datasets, namely Big-Vul and Vul-Repair. Additionally, we compare fine-tuning and prompt-based approaches. Results: The results show that fine-tuning uniformly outperforms both zero-shot and few-shot approaches across all tasks and models. Notably, code-specialized models excel in zero-shot and few-shot settings on complex tasks, while general-purpose models remain nearly as effective. Discrepancies among CodeBLEU, CodeBERTScore, BLEU, and ChrF highlight the inadequacy of current metrics for measuring repair quality. Conclusions: This study contributes to the software security community by investigating the potential of advanced LLMs to improve vulnerability analysis and remediation.

preprint2026arXiv

Multi-Artifact Analysis of Self-Admitted Technical Debt in Scientific Software

Context: Self-admitted technical debt (SATD) occurs when developers acknowledge shortcuts in code. In scientific software (SSW), such debt poses unique risks to the validity and reproducibility of results. Objective: This study aims to identify, categorize, and evaluate scientific debt, a specialized form of SATD in SSW, and assess the extent to which traditional SATD categories capture these domain-specific issues. Method: We conduct a multi-artifact analysis across code comments, commit messages, pull requests, and issue trackers from 23 open-source SSW projects. We construct and validate a curated dataset of scientific debt, develop a multi-source SATD classifier, and conduct a practitioner validation to assess the practical relevance of scientific debt. Results: Our classifier performs strongly across 900,358 artifacts from 23 SSW projects. SATD is most prevalent in pull requests and issue trackers, underscoring the value of multi-artifact analysis. Models trained on traditional SATD often miss scientific debt, emphasizing the need for its explicit detection in SSW. Practitioner validation confirmed that scientific debt is both recognizable and useful in practice. Conclusions: Scientific debt represents a unique form of SATD in SSW that that is not adequately captured by traditional categories and requires specialized identification and management. Our dataset, classification analysis, and practitioner validation results provide the first formal multi-artifact perspective on scientific debt, highlighting the need for tailored SATD detection approaches in SSW.

preprint2026arXiv

Technical Lag as Latent Technical Debt: A Rapid Review

Context: Technical lag accumulates when software systems fail to keep pace with technological advancements, leading to a deterioration in software quality. Objective: This paper aims to consolidate existing research on technical lag, clarify definitions, explore its detection and quantification methods, examine underlying causes and consequences, review current management practices, and lay out a vision as an indicator of passively accumulated technical debt. Method: We conducted a Rapid Review with snowballing to select the appropriate peer-reviewed studies. We leveraged the ACM Digital Library, IEEE Xplore, Scopus, and Springer as our primary source databases. Results: Technical lag accumulates passively, often unnoticed due to inadequate detection metrics and tools. It negatively impacts software quality through outdated dependencies, obsolete APIs, unsupported platforms, and aging infrastructure. Strategies to manage technical lag primarily involve automated dependency updates, continuous integration processes, and regular auditing. Conclusions: Enhancing and extending the current standardized metrics, detection methods, and empirical studies to use technical lag as an indication of accumulated latent debt can greatly improve the process of maintaining large codebases that are heavily dependent on external packages. We have identified the research gaps and outlined a future vision for researchers and practitioners to explore.

preprint2022arXiv

Automatic Transformation of Natural to Unified Modeling Language: A Systematic Review

Context: Processing Software Requirement Specifications (SRS) manually takes a much longer time for requirement analysts in software engineering. Researchers have been working on making an automatic approach to ease this task. Most of the existing approaches require some intervention from an analyst or are challenging to use. Some automatic and semi-automatic approaches were developed based on heuristic rules or machine learning algorithms. However, there are various constraints to the existing approaches of UML generation, such as restriction on ambiguity, length or structure, anaphora, incompleteness, atomicity of input text, requirements of domain ontology, etc. Objective: This study aims to better understand the effectiveness of existing systems and provide a conceptual framework with further improvement guidelines. Method: We performed a systematic literature review (SLR). We conducted our study selection into two phases and selected 70 papers. We conducted quantitative and qualitative analyses by manually extracting information, cross-checking, and validating our findings. Result: We described the existing approaches and revealed the issues observed in these works. We identified and clustered both the limitations and benefits of selected articles. Conclusion: This research upholds the necessity of a common dataset and evaluation framework to extend the research consistently. It also describes the significance of natural language processing obstacles researchers face. In addition, it creates a path forward for future research.

preprint2022arXiv

Software Engineering Approaches for TinyML based IoT Embedded Vision: A Systematic Literature Review

Internet of Things (IoT) has catapulted human ability to control our environments through ubiquitous sensing, communication, computation, and actuation. Over the past few years, IoT has joined forces with Machine Learning (ML) to embed deep intelligence at the far edge. TinyML (Tiny Machine Learning) has enabled the deployment of ML models for embedded vision on extremely lean edge hardware, bringing the power of IoT and ML together. However, TinyML powered embedded vision applications are still in a nascent stage, and they are just starting to scale to widespread real-world IoT deployment. To harness the true potential of IoT and ML, it is necessary to provide product developers with robust, easy-to-use software engineering (SE) frameworks and best practices that are customized for the unique challenges faced in TinyML engineering. Through this systematic literature review, we aggregated the key challenges reported by TinyML developers and identified state-of-art SE approaches in large-scale Computer Vision, Machine Learning, and Embedded Systems that can help address key challenges in TinyML based IoT embedded vision. In summary, our study draws synergies between SE expertise that embedded systems developers and ML developers have independently developed to help address the unique challenges in the engineering of TinyML based IoT embedded vision.

preprint2022arXiv

Testing Research Software: A Survey

Background: Research software plays an important role in solving real-life problems, empowering scientific innovations, and handling emergency situations. Therefore, the correctness and trustworthiness of research software are of absolute importance. Software testing is an important activity for identifying problematic code and helping to produce high-quality software. However, testing of research software is difficult due to the complexity of the underlying science, relatively unknown results from scientific algorithms, and the culture of the research software community. Aims: The goal of this paper is to better understand current testing practices, identify challenges, and provide recommendations on how to improve the testing process for research software development. Method: We surveyed members of the research software developer community to collect information regarding their knowledge about and use of software testing in their projects. Results: We analysed 120 responses and identified that even though research software developers report they have an average level of knowledge about software testing, they still find it difficult due to the numerous challenges involved. However, there are a number of ways, such as proper training, that can improve the testing process for research software. Conclusions: Testing can be challenging for any type of software. This difficulty is especially present in the development of research software, where software engineering activities are typically given less attention. To produce trustworthy results from research software, there is a need for a culture change so that testing is valued and teams devote appropriate effort to writing and executing tests.

Nasir U. Eisty

What is connected

Connect this record

See the researcher in context

Building this map preview

7 published item(s)

Integrating APK Image and Text Data for Enhanced Threat Detection: A Multimodal Deep Learning Approach to Android Malware

LLMs in Code Vulnerability Analysis: A Proof of Concept

Multi-Artifact Analysis of Self-Admitted Technical Debt in Scientific Software

Technical Lag as Latent Technical Debt: A Rapid Review

Automatic Transformation of Natural to Unified Modeling Language: A Systematic Review

Software Engineering Approaches for TinyML based IoT Embedded Vision: A Systematic Literature Review

Testing Research Software: A Survey