Trust snapshot

Quick read

Trust 21 - EmergingVerification L1Unclaimed author
14works
0followers
15topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

14 published item(s)

preprint2026arXiv

Large Language Models for Agentic NetOps and AIOps: Architectures, Evaluation, and Safety

Large language models are increasingly being used to support network operations (NetOps) and artificial intelligence for IT operations (AIOps), including incident investigation, root-cause analysis, configuration synthesis, and limited self-healing. In both NetOps and AIOps, this shift is changing how tasks are managed. Agent-based operations work as workflows, from gathering evidence to taking action, following permissions, policies, and checks, and providing rollback options when necessary. This is crucial because operational decisions can have instant impacts. To make the argument concrete, we organise the relevant literature around the hierarchy of autonomy, tool scope, evidence traces, and assurance contracts. These contracts define what an agent may observe, propose, and execute. They also define the checks that must pass before any action is allowed. A consistent pattern appears across work on telemetry query recommendation, diagnosis, root-cause analysis, configuration synthesis, change planning, and limited self-healing. Operational reliability does not come chiefly from the model itself. It depends on the machinery around the model. We also argue that evaluation should go beyond static question answering. Agentic NetOps and AIOps systems require workflow-centred evaluation, including trace quality, bounded tool use, safe proposal generation, replay in sandboxed environments, and canary trials with rollback-aware scoring. Without these measures, a system may appear robust yet remain too fragile. Finally, we examine security, privacy, and governance risks that become acute when agents sit close to operational control surfaces. Taken together, the survey concludes that progress in intelligent NetOps and AIOps will depend on treating autonomy as a constrained operational control problem, whose outputs must be reliable, auditable, and securely deployable.

preprint2026arXiv

Mining Subscenario Refactoring Opportunities in Behaviour-Driven Software Test Suites: ML Classifiers and LLM-Judge Baselines

Context. Behaviour-Driven Development (BDD) software test suites accumulate duplicated step subsequences. Three published refactoring patterns are available (within-file Background, within-repo reusable-scenario invocation, cross-organisational shared higher-level step), but no prior work automates which recurring subsequences are worth extracting or which mechanism applies. Objective. Rank recurring step subsequences ("slices") by refactoring suitability (extraction-worthy), pre-map each to one of the three patterns, and quantify prevalence across the public BDD ecosystem. Method. Every contiguous L-step window (L in [2, 18]) in a 339-repository / 276-upstream-owner Gherkin corpus is keyed by paraphrase-robust cluster identifiers and counted under three scopes. Sentence-BERT (SBERT) / Uniform Manifold Approximation and Projection (UMAP) / Hierarchical Density-Based Clustering (HDBSCAN) recovers paraphrase-equivalent slices. Three authors label a stratified 200-slice pool against a written rubric. An eXtreme Gradient Boosting (XGBoost) extraction-worthy classifier trained under 5-fold cross-validation is compared with a tuned rule baseline and two open-weight Large Language Model (LLM) judges. Results. The miner produces 5,382,249 slices collapsing to 692,020 recurring patterns. Three-author Fleiss' kappa = 0.56 (extraction-worthy) and 0.79 (mechanism). The classifier reaches out-of-fold F1 = 0.891 (95% CI [0.852, 0.927]), outperforming both the rule baseline (F1 = 0.836, p = 0.017) and the better LLM judge (F1 = 0.728, p < 1e-4). 75.0%, 59.5%, and 11.7% of scenarios carry a within-file Background, within-repo reusable-scenario, or cross-organisational shared-step candidate. Conclusion. Paraphrase-robust subscenario discovery yields a corpus-wide census of BDD refactoring opportunities; pipeline, classifier predictions, labelled pool, and rubric are released under Apache-2.0.

preprint2026arXiv

NOS-Gate: Queue-Aware Streaming IDS for Consumer Gateways under Timing-Controlled Evasion

Timing and burst patterns can leak through encryption, and an adaptive adversary can exploit them. This undermines metadata-only detection in a stand-alone consumer gateway. Therefore, consumer gateways need streaming intrusion detection on encrypted traffic using metadata only, under tight CPU and latency budgets. We present a streaming IDS for stand-alone gateways that instantiates a lightweight two-state unit derived from Network-Optimised Spiking (NOS) dynamics per flow, named NOS-Gate. NOS-Gate scores fixed-length windows of metadata features and, under a $K$-of-$M$ persistence rule, triggers a reversible mitigation that temporarily reduces the flow&#39;s weight under weighted fair queueing (WFQ). We evaluate NOS-Gate under timing-controlled evasion using an executable &#39;worlds&#39; benchmark that specifies benign device processes, auditable attacker budgets, contention structure, and packet-level WFQ replay to quantify queue impact. All methods are calibrated label-free via burn-in quantile thresholding. Across multiple reproducible worlds and malicious episodes, at an achieved $0.1%$ false-positive operating point, NOS-Gate attains 0.952 incident recall versus 0.857 for the best baseline in these runs. Under gating, it reduces p99.9 queueing delay and p99.9 collateral delay with a mean scoring cost of ~ 2.09 μs per flow-window on CPU.

preprint2022arXiv

Comparative Analysis of Time Series Forecasting Approaches for Household Electricity Consumption Prediction

As a result of increasing population and globalization, the demand for energy has greatly risen. Therefore, accurate energy consumption forecasting has become an essential prerequisite for government planning, reducing power wastage and stable operation of the energy management system. In this work we present a comparative analysis of major machine learning models for time series forecasting of household energy consumption. Specifically, we use Weka, a data mining tool to first apply models on hourly and daily household energy consumption datasets available from Kaggle data science community. The models applied are: Multilayer Perceptron, K Nearest Neighbor regression, Support Vector Regression, Linear Regression, and Gaussian Processes. Secondly, we also implemented time series forecasting models, ARIMA and VAR, in python to forecast household energy consumption of selected South Korean households with and without weather data. Our results show that the best methods for the forecasting of energy consumption prediction are Support Vector Regression followed by Multilayer Perceptron and Gaussian Process Regression.

preprint2021arXiv

On the Impact of Device and Behavioral Heterogeneity in Federated Learning

Federated learning (FL) is becoming a popular paradigm for collaborative learning over distributed, private datasets owned by non-trusting entities. FL has seen successful deployment in production environments, and it has been adopted in services such as virtual keyboards, auto-completion, item recommendation, and several IoT applications. However, FL comes with the challenge of performing training over largely heterogeneous datasets, devices, and networks that are out of the control of the centralized FL server. Motivated by this inherent setting, we make a first step towards characterizing the impact of device and behavioral heterogeneity on the trained model. We conduct an extensive empirical study spanning close to 1.5K unique configurations on five popular FL benchmarks. Our analysis shows that these sources of heterogeneity have a major impact on both model performance and fairness, thus sheds light on the importance of considering heterogeneity in FL system design.

preprint2021arXiv

Use of Transfer Learning and Wavelet Transform for Breast Cancer Detection

Breast cancer is one of the most common cause of deaths among women. Mammography is a widely used imaging modality that can be used for cancer detection in its early stages. Deep learning is widely used for the detection of cancerous masses in the images obtained via mammography. The need to improve accuracy remains constant due to the sensitive nature of the datasets so we introduce segmentation and wavelet transform to enhance the important features in the image scans. Our proposed system aids the radiologist in the screening phase of cancer detection by using a combination of segmentation and wavelet transforms as pre-processing augmentation that leads to transfer learning in neural networks. The proposed system with these pre-processing techniques significantly increases the accuracy of detection on Mini-MIAS.

preprint2020arXiv

Amateur Drones Detection: A machine learning approach utilizing the acoustic signals in the presence of strong interference

Owing to small size, sensing capabilities and autonomous nature, the Unmanned Air Vehicles (UAVs) have enormous applications in various areas, e.g., remote sensing, navigation, archaeology, journalism, environmental science, and agriculture. However, the unmonitored deployment of UAVs called the amateur drones (AmDr) can lead to serious security threats and risk to human life and infrastructure. Therefore, timely detection of the AmDr is essential for the protection and security of sensitive organizations, human life and other vital infrastructure. AmDrs can be detected using different techniques based on sound, video, thermal, and radio frequencies. However, the performance of these techniques is limited in sever atmospheric conditions. In this paper, we propose an efficient unsupervise machine learning approach of independent component analysis (ICA) to detect various acoustic signals i.e., sounds of bird, airplanes, thunderstorm, rain, wind and the UAVs in practical scenario. After unmixing the signals, the features like Mel Frequency Cepstral Coefficients (MFCC), the power spectral density (PSD) and the Root Mean Square Value (RMS) of the PSD are extracted by using ICA. The PSD and the RMS of PSD signals are extracted by first passing the signals from octave band filter banks. Based on the above features the signals are classified using Support Vector Machines (SVM) and K Nearest Neighbor (KNN) to detect the presence or absence of AmDr. Unique feature of the proposed technique is the detection of a single or multiple AmDrs at a time in the presence of multiple acoustic interfering signals. The proposed technique is verified through extensive simulations and it is observed that the RMS values of PSD with KNN performs better than the MFCC with KNN and SVM.

preprint2020arXiv

Classification of Arrhythmia by Using Deep Learning with 2-D ECG Spectral Image Representation

The electrocardiogram (ECG) is one of the most extensively employed signals used in the diagnosis and prediction of cardiovascular diseases (CVDs). The ECG signals can capture the heart&#39;s rhythmic irregularities, commonly known as arrhythmias. A careful study of ECG signals is crucial for precise diagnoses of patients&#39; acute and chronic heart conditions. In this study, we propose a two-dimensional (2-D) convolutional neural network (CNN) model for the classification of ECG signals into eight classes; namely, normal beat, premature ventricular contraction beat, paced beat, right bundle branch block beat, left bundle branch block beat, atrial premature contraction beat, ventricular flutter wave beat, and ventricular escape beat. The one-dimensional ECG time series signals are transformed into 2-D spectrograms through short-time Fourier transform. The 2-D CNN model consisting of four convolutional layers and four pooling layers is designed for extracting robust features from the input spectrograms. Our proposed methodology is evaluated on a publicly available MIT-BIH arrhythmia dataset. We achieved a state-of-the-art average classification accuracy of 99.11\%, which is better than those of recently reported results in classifying similar types of arrhythmias. The performance is significant in other indices as well, including sensitivity and specificity, which indicates the success of the proposed method.

preprint2020arXiv

Classification of Chest Diseases using Wavelet Transforms and Transfer Learning

Chest X-ray scan is a most often used modality by radiologists to diagnose many chest related diseases in their initial stages. The proposed system aids the radiologists in making decision about the diseases found in the scans more efficiently. Our system combines the techniques of image processing for feature enhancement and deep learning for classification among diseases. We have used the ChestX-ray14 database in order to train our deep learning model on the 14 different labeled diseases found in it. The proposed research shows the significant improvement in the results by using wavelet transforms as pre-processing technique.

preprint2020arXiv

RDSP: Rapidly Deployable Wireless Ad Hoc System for Post-Disaster Management

In post-disaster scenarios, such as after floods, earthquakes, and in war zones, the cellular communication infrastructure may be destroyed or seriously disrupted. In such emergency scenarios, it becomes very important for first aid responders to communicate with other rescue teams in order to provide feedback to both the central office and the disaster survivors. To address this issue, rapidly deployable systems are required to re-establish connectivity and assist users and first responders in the region of incident. In this work, we describe the design, implementation, and evaluation of a rapidly deployable system for first response applications in post-disaster situations, named RDSP. The proposed system helps early rescue responders and victims by sharing their location information to remotely located servers by utilizing a novel routing scheme. This novel routing scheme consists of the Dynamic ID Assignment (DIA) algorithm and the Minimum Maximum Neighbor (MMN) algorithm. The DIA algorithm is used by relay devices to dynamically select their IDs on the basis of all the available IDs of networks. Whereas, the MMN algorithm is used by the client and relay devices to dynamically select their next neighbor relays for the transmission of messages. The RDSP contains three devices; the client device sends the victim&#39;s location information to the server, the relay device relays information between client and server device, the server device receives messages from the client device to alert the rescue team. We deployed and evaluated our system in the outdoor environment of the university campus. The experimental results show that the RDSP system reduces the message delivery delay and improves the message delivery ratio with lower communication overhead.

preprint2020arXiv

Secure and Robust Machine Learning for Healthcare: A Survey

Recent years have witnessed widespread adoption of machine learning (ML)/deep learning (DL) techniques due to their superior performance for a variety of healthcare applications ranging from the prediction of cardiac arrest from one-dimensional heart signals to computer-aided diagnosis (CADx) using multi-dimensional medical images. Notwithstanding the impressive performance of ML/DL, there are still lingering doubts regarding the robustness of ML/DL in healthcare settings (which is traditionally considered quite challenging due to the myriad security and privacy issues involved), especially in light of recent results that have shown that ML/DL are vulnerable to adversarial attacks. In this paper, we present an overview of various application areas in healthcare that leverage such techniques from security and privacy point of view and present associated challenges. In addition, we present potential methods to ensure secure and privacy-preserving ML for healthcare applications. Finally, we provide insight into the current research challenges and promising directions for future research.

preprint2020arXiv

Single Board Computers (SBC): The Future of Next Generation Pedagogies in Pakistan

ARM processors have taken over the mobile industry from a long time now. Future of data centers and the IT industry is estimated to make use of ARM Processors. Projects like Openstack on ARM are enabling use of ARM in data centers . Single board computers (SBCs) based on ARM processors have become the norm these days. Reason for their popularity lies in their cost effective and power efficient nature. There are hundreds of them available in the market having different sizes, compute power and prices. The reason for their popularity is largely due to the rise of new technology called IoT (Internet of Things) but there is another perspective where they can become handy. Low Price and Power Usage of single board computers makes them top candidate to be used for teaching many courses with hands-on experience in developing countries like Pakistan. Many boards support full Linux distributions and can be used as general-purpose computers while many of them are open hardware based. In this paper, we have reviewed the famous options available and tried to figure out which of them are better for teaching what kind of courses.

preprint2020arXiv

Unsupervised Adversarial Domain Adaptation for Cross-Lingual Speech Emotion Recognition

Cross-lingual speech emotion recognition (SER) is a crucial task for many real-world applications. The performance of SER systems is often degraded by the differences in the distributions of training and test data. These differences become more apparent when training and test data belong to different languages, which cause a significant performance gap between the validation and test scores. It is imperative to build more robust models that can fit in practical applications of SER systems. Therefore, in this paper, we propose a Generative Adversarial Network (GAN)-based model for multilingual SER. Our choice of using GAN is motivated by their great success in learning the underlying data distribution. The proposed model is designed in such a way that can learn language invariant representations without requiring target-language data labels. We evaluate our proposed model on four different language emotional datasets, including an Urdu-language dataset to also incorporate alternative languages for which labelled data is difficult to find and which have not been studied much by the mainstream community. Our results show that our proposed model can significantly improve the baseline cross-lingual SER performance for all the considered datasets including the non-mainstream Urdu language data without requiring any labels.

preprint2019arXiv

Secure Distribution of Protected Content in Information-Centric Networking

The benefits of the ubiquitous caching in ICN are profound, such features make ICN promising for content distribution, but it also introduces a challenge to content protection against the unauthorized access. The protection of a content against unauthorized access requires consumer authentication and involves the conventional end-to-end encryption. However, in information-centric networking (ICN), such end-to-end encryption makes the content caching ineffective since encrypted contents stored in a cache are useless for any consumers except those who know the encryption key. For effective caching of encrypted contents in ICN, we propose a secure distribution of protected content (SDPC) scheme, which ensures that only authenticated consumers can access the content. SDPC is lightweight and allows consumers to verify the originality of the published content by using a symmetric key encryption. Moreover, SDPC naming scheme provides protection against privacy leakage. The security of SDPC was proved with the BAN logic and Scyther tool verification, and simulation results show that SDPC can reduce the content download delay.