Researcher profile

Sheraz Ahmed

Sheraz Ahmed contributes to research discovery and scholarly infrastructure.

Trust snapshot

Quick read

Trust 21 - Emerging | Verification L1 | Unclaimed author
21 works
0 followers
13 topics
4 close collaborators

Published work

21 published items

preprint, 2026, arXiv

E-TCAV: Formalizing Penultimate Proxies for Efficient Concept Based Interpretability

TCAV (Testing with Concept Activation Vectors) is an interpretability method that assesses the alignment between the internal representations of a trained neural network and human-understandable, high-level concepts. Though effective, TCAV suffers from significant computational overhead, inter-layer disagreement of TCAV scores, and statistical instability. This work takes a step toward addressing these challenges by introducing E-TCAV, a framework for efficient approximation of TCAV scores, which is based on extensive investigation into three key aspects of the TCAV methodology: 1) the effect of latent classifiers on the stability of TCAV scores, 2) the inter-layer agreement of TCAV scores, and 3) the use of the penultimate layer as a fast proxy for earlier layers for TCAV computation. To ensure a solid foundation for E-TCAV, we conduct extensive evaluations across four different architectures and five datasets, encompassing problems from both computer vision and natural language domains. Our results show that the layers in the final block of the neural network strongly agree with the penultimate layer in terms of the TCAV scores, and the commonly observed variance of the TCAV scores can be attributed to the choice of the latent classifier. Leveraging this inter-layer agreement and the degeneracy of directional sensitivities at the penultimate layer, E-TCAV guarantees linearly scaling speed-ups with respect to the network's size and the number of evaluation samples, marking a step towards efficient model debugging and real-time concept-guided training.
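For readers unfamiliar with the quantity being approximated, the following is a minimal sketch of the standard TCAV score: the fraction of examples whose directional derivative along a concept activation vector is positive. It does not reproduce E-TCAV's penultimate-layer approximation; the function name `tcav_score` and the toy gradient values are illustrative assumptions, not taken from the paper.

```python
def tcav_score(gradients, cav):
    """Fraction of examples whose directional derivative along the CAV is positive.

    gradients: list of gradient vectors (class logit w.r.t. layer activations),
               one per evaluation example. Values here are hypothetical.
    cav:       concept activation vector for the same layer.
    """
    if not gradients:
        raise ValueError("need at least one gradient vector")
    positive = 0
    for grad in gradients:
        # Directional derivative = dot product of the gradient with the CAV.
        directional = sum(g * v for g, v in zip(grad, cav))
        if directional > 0:
            positive += 1
    return positive / len(gradients)


# Toy data (assumed, for illustration only).
cav = [1.0, 0.0]
grads = [[0.5, 0.2], [-0.1, 0.9], [0.3, -0.4], [0.0, 1.0]]
score = tcav_score(grads, cav)
print(score)
```

In this toy case two of the four directional derivatives are strictly positive, so the score is 0.5.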

preprint, 2022, arXiv

A Novel Approach to Train Diverse Types of Language Models for Health Mention Classification of Tweets

Health mention classification deals with disease detection in a given text containing disease words. However, non-health and figurative use of disease words adds challenges to the task. Recently, adversarial training acting as a means of regularization has gained popularity in many NLP tasks. In this paper, we propose a novel approach to train language models for health mention classification of tweets that involves adversarial training. We generate adversarial examples by adding perturbation to the representations of transformer models for tweet examples at various levels using Gaussian noise. Further, we employ contrastive loss as an additional objective function. We evaluate the proposed method on the extended version of the PHM2017 dataset. Results show that our proposed approach improves the performance of the classifier significantly over the baseline methods. Moreover, our analysis shows that adding noise at earlier layers improves models' performance whereas adding noise at intermediate layers deteriorates models' performance. Finally, adding noise towards the final layers performs better than adding noise at the middle layers.

preprint, 2022, arXiv

ExAID: A Multimodal Explanation Framework for Computer-Aided Diagnosis of Skin Lesions

One principal impediment in the successful deployment of AI-based Computer-Aided Diagnosis (CAD) systems in clinical workflows is their lack of transparent decision making. Although commonly used eXplainable AI methods provide some insight into opaque algorithms, such explanations are usually convoluted and not readily comprehensible except by highly trained experts. The explanation of decisions regarding the malignancy of skin lesions from dermoscopic images demands particular clarity, as the underlying medical problem definition is itself ambiguous. This work presents ExAID (Explainable AI for Dermatology), a novel framework for biomedical image analysis, providing multi-modal concept-based explanations consisting of easy-to-understand textual explanations supplemented by visual maps justifying the predictions. ExAID relies on Concept Activation Vectors to map human concepts to those learnt by arbitrary Deep Learning models in latent space, and Concept Localization Maps to highlight concepts in the input space. This identification of relevant concepts is then used to construct fine-grained textual explanations supplemented by concept-wise location information to provide comprehensive and coherent multi-modal explanations. All information is comprehensively presented in a diagnostic interface for use in clinical routines. An educational mode provides dataset-level explanation statistics and tools for data and model exploration to aid medical research and education. Through rigorous quantitative and qualitative evaluation of ExAID, we show the utility of multi-modal explanations for CAD-assisted scenarios even in case of wrong predictions. We believe that ExAID will provide dermatologists an effective screening tool that they both understand and trust. Moreover, it will be the basis for similar applications in other biomedical imaging fields.

preprint, 2022, arXiv

FiN: A Smart Grid and Power Line Communication Dataset

The increasing complexity of low-voltage networks poses a growing challenge for the reliable and fail-safe operation of electricity grids. The reasons for this include an increasingly decentralized energy generation (photovoltaic systems, wind power, etc.) and the emergence of new types of consumers (e-mobility, domestic electricity storage, etc.). At the same time, the low-voltage grid is largely unmonitored and local power failures are sometimes hard to detect. To overcome this, power line communication (PLC) has emerged as a potential solution for reliable monitoring of the low-voltage grid. In addition to establishing a communication infrastructure, PLC also offers the possibility of evaluating the cables themselves, as well as the connection quality between individual cable distributors based on their Signal-to-Noise Ratio (SNR). The roll-out of a large-scale PLC infrastructure therefore not only ensures communication, but also introduces a tool for monitoring the entire network. To evaluate the potential of this data, we installed 38 PLC modems in three different areas of a German city with a population of about 150,000 as part of the Fühler-im-Netz project. Over a period of 22 months, an SNR spectrum of each connection between adjacent PLC modems was generated every quarter of an hour. The availability of this real-world PLC data opens up new possibilities to react to the increasingly complex challenges in future smart grids. This paper provides a detailed analysis of the data generation and describes how the data was collected during normal operation of the electricity grid. In addition, we present common anomalies, effects, and trends that could be observed in the PLC data at daily, weekly, or seasonal levels. Finally, we discuss potential use cases and highlight the remote inspection of a cable section as an example.

preprint, 2022, arXiv

Improving Health Mentioning Classification of Tweets using Contrastive Adversarial Training

Health mentioning classification (HMC) classifies an input text as a health mention or not. Figurative and non-health mentions of disease words make the classification task challenging. Learning the context of the input text is the key to this problem. The idea is to learn word representation by its surrounding words and utilize emojis in the text to help improve the classification results. In this paper, we improve the word representation of the input text using adversarial training that acts as a regularizer during fine-tuning of the model. We generate adversarial examples by perturbing the embeddings of the model and then train the model on a pair of clean and adversarial examples. Additionally, we utilize contrastive loss that pushes a pair of clean and perturbed examples close to each other and other examples away in the representation space. We train and evaluate the method on an extended version of the publicly available PHM2017 dataset. Experiments show an improvement of 1.0% over the BERT-Large baseline, 0.6% over the RoBERTa-Large baseline, and 5.8% over the state of the art in terms of F1 score. Furthermore, we provide a brief analysis of the results by utilizing the power of explainable AI.

preprint, 2022, arXiv

KENN: Enhancing Deep Neural Networks by Leveraging Knowledge for Time Series Forecasting

End-to-end data-driven machine learning methods often have exorbitant requirements in terms of quality and quantity of training data, which are often impractical to fulfill in real-world applications. This is specifically true in the time series domain, where problems like disaster prediction, anomaly detection, and demand prediction often do not have a large amount of historical data. Moreover, relying purely on past examples for training can be sub-optimal since in doing so we ignore one very important domain, i.e., knowledge, which has its own distinct advantages. In this paper, we propose a novel knowledge fusion architecture, Knowledge Enhanced Neural Network (KENN), for time series forecasting that specifically aims towards combining strengths of both knowledge and data domains while mitigating their individual weaknesses. We show that KENN not only reduces data dependency of the overall framework but also improves performance by producing predictions that are better than the ones produced by purely knowledge-driven and purely data-driven domains. We also compare KENN with state-of-the-art forecasting methods and show that predictions produced by KENN are significantly better even when trained on only 50% of the data.

preprint, 2022, arXiv

Time to Focus: A Comprehensive Benchmark Using Time Series Attribution Methods

In the last decade, neural networks have made a huge impact both in industry and research due to their ability to extract meaningful features from imprecise or complex data, and by achieving superhuman performance in several domains. However, their lack of transparency hampers the use of these networks in safety-critical areas, where transparency is required by law. Recently, several methods have been proposed to uncover this black box by providing interpretations of predictions made by these models. The paper focuses on time series analysis and benchmarks several state-of-the-art attribution methods which compute explanations for convolutional classifiers. The presented experiments involve gradient-based and perturbation-based attribution methods. A detailed analysis shows that perturbation-based approaches are superior concerning the Sensitivity and occlusion game. These methods tend to produce explanations with higher continuity. In contrast, the gradient-based techniques are superior in runtime and Infidelity. In addition, a validation of the dependence of the methods on the trained model, feasible application domains, and individual characteristics is attached. The findings accentuate that choosing the best-suited attribution method is strongly correlated with the desired use case. Neither category of attribution methods nor a single approach has shown outstanding performance across all aspects.

preprint, 2022, arXiv

TimeREISE: Time-series Randomized Evolving Input Sample Explanation

Deep neural networks are among the most successful classifiers across different domains. However, due to their limitations concerning interpretability, their use is limited in safety-critical contexts. The research field of explainable artificial intelligence addresses this problem. However, most of the interpretability methods are aligned to the image modality by design. The paper introduces TimeREISE, a model-agnostic attribution method specifically aligned to success in the context of time series classification. The method shows superior performance compared to existing approaches concerning different well-established measurements. TimeREISE is applicable to any time series classification network; its runtime does not scale in a linear manner concerning the input shape, and it does not rely on prior data knowledge.

preprint, 2022, arXiv

Utilizing Out-Domain Datasets to Enhance Multi-Task Citation Analysis

Citations are generally analyzed using only quantitative measures while excluding qualitative aspects such as sentiment and intent. However, qualitative aspects provide deeper insights into the impact of a scientific research artifact and make it possible to focus on relevant literature free from bias associated with quantitative aspects. Therefore, it is possible to rank and categorize papers based on their sentiment and intent. For this purpose, larger citation sentiment datasets are required. However, from a time and cost perspective, curating a large citation sentiment dataset is a challenging task. Particularly, citation sentiment analysis suffers from both data scarcity and tremendous costs for dataset annotation. To overcome the bottleneck of data scarcity in the citation analysis domain, we explore the impact of out-domain data during training to enhance the model performance. Our results emphasize the use of different scheduling methods based on the use case. We empirically found that a model trained using sequential data scheduling is more suitable for domain-specific use cases. Conversely, shuffled data feeding achieves better performance on a cross-domain task. Based on our findings, we propose an end-to-end trainable multi-task model that covers sentiment and intent analysis and utilizes out-domain datasets to overcome the data scarcity.

preprint, 2021, arXiv

Deep Learning Based Decision Support for Medicine -- A Case Study on Skin Cancer Diagnosis

Early detection of skin cancers like melanoma is crucial to ensure high chances of survival for patients. Clinical application of Deep Learning (DL)-based Decision Support Systems (DSS) for skin cancer screening has the potential to improve the quality of patient care. The majority of work in the medical AI community focuses on a diagnosis setting that is mainly relevant for autonomous operation. Practical decision support should, however, go beyond plain diagnosis and provide explanations. This paper provides an overview of works towards explainable, DL-based decision support in medical applications with the example of skin cancer diagnosis from clinical, dermoscopic and histopathologic images. Analysis reveals that comparably little attention is paid to the explanation of histopathologic skin images and that current work is dominated by visual relevance maps as well as dermoscopic feature identification. We conclude that future work should focus on meeting the stakeholder's cognitive concepts, providing exhaustive explanations that combine global and local approaches and leverage diverse modalities. Moreover, the possibility to intervene and guide models in case of misbehaviour is identified as a major step towards successful deployment of AI as DL-based DSS and beyond.

preprint, 2020, arXiv

Benchmarking adversarial attacks and defenses for time-series data

The adversarial vulnerability of deep networks has spurred the interest of researchers worldwide. Unsurprisingly, like images, adversarial examples also translate to time-series data as they are an inherent weakness of the model itself rather than the modality. Several attempts have been made to defend against these adversarial attacks, particularly for the visual modality. In this paper, we perform detailed benchmarking of well-proven adversarial defense methodologies on time-series data. We restrict ourselves to the $L_{\infty}$ threat model. We also explore the trade-off between smoothness and clean accuracy for regularization-based defenses to better understand the trade-offs that they offer. Our analysis shows that the explored adversarial defenses offer robustness against both strong white-box as well as black-box attacks. This paves the way for future research in the direction of adversarial attacks and defenses, particularly for time-series data.
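As a brief illustration of what the $L_{\infty}$ threat model constrains, the sketch below clips an arbitrary perturbed time series back into an epsilon-ball around the original, so that no single time step changes by more than `epsilon`. The helper names `project_linf` and `linf_distance` and the sample values are assumptions for illustration; the paper's actual attacks and defenses are not reproduced here.

```python
def linf_distance(a, b):
    """Largest absolute per-step difference between two equal-length series."""
    return max(abs(x - y) for x, y in zip(a, b))


def project_linf(original, perturbed, epsilon):
    """Clip each step of `perturbed` to lie within +/- epsilon of `original`."""
    projected = []
    for o, p in zip(original, perturbed):
        delta = max(-epsilon, min(epsilon, p - o))  # clamp the per-step change
        projected.append(o + delta)
    return projected


# Hypothetical series and perturbation budget.
series = [0.0, 1.0, 2.0]
attacked = [0.5, 0.95, 1.0]
safe = project_linf(series, attacked, epsilon=0.1)
print(safe, linf_distance(series, safe))
```

Steps already inside the budget (the second one here) are left unchanged; only the steps that exceed it are pulled back to the boundary.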

preprint, 2020, arXiv

Combining Fine- and Coarse-Grained Classifiers for Diabetic Retinopathy Detection

Visual artefacts of early diabetic retinopathy in retinal fundus images are usually small in size, inconspicuous, and scattered all over the retina. Detecting diabetic retinopathy requires physicians to look at the whole image and fixate on some specific regions to locate potential biomarkers of the disease. Therefore, taking inspiration from ophthalmologists, we propose to combine coarse-grained classifiers that detect discriminating features from the whole images, with a recent breed of fine-grained classifiers that discover and pay particular attention to pathologically significant regions. To evaluate the performance of this proposed ensemble, we used publicly available EyePACS and Messidor datasets. Extensive experimentation for binary, ternary and quaternary classification shows that this ensemble largely outperforms individual image classifiers as well as most of the published works in most training setups for diabetic retinopathy detection. Furthermore, the performance of fine-grained classifiers is found to be notably superior to that of coarse-grained image classifiers, encouraging the development of task-oriented fine-grained classifiers modelled after specialist ophthalmologists.

preprint, 2020, arXiv

Explaining AI-based Decision Support Systems using Concept Localization Maps

Human-centric explainability of AI-based Decision Support Systems (DSS) using visual input modalities is directly related to the reliability and practicality of such algorithms. An otherwise accurate and robust DSS might not enjoy the trust of experts in critical application areas if it is not able to provide reasonable justification of its predictions. This paper introduces Concept Localization Maps (CLMs), a novel approach towards explainable image classifiers employed as DSS. CLMs extend Concept Activation Vectors (CAVs) by locating significant regions corresponding to a learned concept in the latent space of a trained image classifier. They provide qualitative and quantitative assurance of a classifier's ability to learn and focus on similar concepts important for humans during image recognition. To better understand the effectiveness of the proposed method, we generated a new synthetic dataset called Simple Concept DataBase (SCDB) that includes annotations for 10 distinguishable concepts, and made it publicly available. We evaluated our proposed method on SCDB as well as a real-world dataset called CelebA. We achieved localization recall of above 80% for most relevant concepts and average recall above 60% for all concepts using SE-ResNeXt-50 on SCDB. Our results on both datasets show great promise of CLMs for easing acceptance of DSS in practice.

preprint, 2020, arXiv

G1020: A Benchmark Retinal Fundus Image Dataset for Computer-Aided Glaucoma Detection

Scarcity of large publicly available retinal fundus image datasets for automated glaucoma detection has been the bottleneck for successful application of artificial intelligence towards practical Computer-Aided Diagnosis (CAD). The few small datasets that are available to the research community usually suffer from impractical image capturing conditions and stringent inclusion criteria. These shortcomings in the already limited choice of existing datasets make it challenging to mature a CAD system so that it can perform in real-world environments. In this paper we present a large publicly available retinal fundus image dataset for glaucoma classification called G1020. The dataset is curated by conforming to standard practices in routine ophthalmology and it is expected to serve as a standard benchmark dataset for glaucoma detection. This database consists of 1020 high resolution colour fundus images and provides ground truth annotations for glaucoma diagnosis, optic disc and optic cup segmentation, vertical cup-to-disc ratio, size of neuroretinal rim in inferior, superior, nasal and temporal quadrants, and bounding box location for optic disc. We also report baseline results by conducting extensive experiments for automated glaucoma diagnosis and segmentation of optic disc and optic cup.

preprint, 2020, arXiv

On Interpretability of Deep Learning based Skin Lesion Classifiers using Concept Activation Vectors

Deep learning based medical image classifiers have shown remarkable prowess in various application areas like ophthalmology, dermatology, pathology, and radiology. However, the acceptance of these Computer-Aided Diagnosis (CAD) systems in real clinical setups is severely limited primarily because their decision-making process remains largely obscure. This work aims at elucidating a deep learning based medical image classifier by verifying that the model learns and utilizes similar disease-related concepts as described and employed by dermatologists. We used a well-trained and high performing neural network developed by REasoning for COmplex Data (RECOD) Lab for classification of three skin tumours, i.e. Melanocytic Naevi, Melanoma and Seborrheic Keratosis, and performed a detailed analysis on its latent space. Two well established and publicly available skin disease datasets, PH2 and derm7pt, are used for experimentation. Human understandable concepts are mapped to the RECOD image classification model with the help of Concept Activation Vectors (CAVs), introducing a novel training and significance testing paradigm for CAVs. Our results on an independent evaluation set clearly show that the classifier learns and encodes human understandable concepts in its latent representation. Additionally, TCAV scores (Testing with CAVs) suggest that the neural network indeed makes use of disease-related concepts in the correct way when making predictions. We anticipate that this work can not only increase the confidence of medical practitioners in CAD but also serve as a stepping stone for further development of CAV-based neural network interpretation methods.

preprint, 2020, arXiv

ProbAct: A Probabilistic Activation Function for Deep Neural Networks

Activation functions play an important role in training artificial neural networks. The majority of currently used activation functions are deterministic in nature, with their fixed input-output relationship. In this work, we propose a novel probabilistic activation function, called ProbAct. ProbAct is decomposed into a mean and variance and the output value is sampled from the formed distribution, making ProbAct a stochastic activation function. The values of mean and variances can be fixed using known functions or trained for each element. In the trainable ProbAct, the mean and the variance of the activation distribution are trained within the back-propagation framework alongside other parameters. We show that the stochastic perturbation induced through ProbAct acts as a viable generalization technique for feature augmentation. In our experiments, we compare ProbAct with well-known activation functions on classification tasks on different modalities: Images (CIFAR-10, CIFAR-100, and STL-10) and Text (Large Movie Review). We show that ProbAct increases the classification accuracy by 2-3% compared to ReLU or other conventional activation functions on both original datasets and when datasets are reduced to 50% and 25% of the original size. Finally, we show that ProbAct learns an ensemble of models by itself that can be used to estimate the uncertainties associated with the prediction and provides robustness to noisy inputs.
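The idea of sampling an activation from a distribution with a given mean and variance can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the mean is assumed to be ReLU, the standard deviation `sigma` is a fixed (non-trainable) value, and the function name `probact` is hypothetical.

```python
import random


def probact(x, sigma, rng):
    """Sample a stochastic activation: mean(x) + sigma * eps, eps ~ N(0, 1).

    Assumptions (not confirmed by the abstract): the mean is ReLU(x) and
    sigma is a fixed scalar rather than a trained parameter.
    """
    mean = max(0.0, x)  # assumed mean function: ReLU
    return mean + sigma * rng.gauss(0.0, 1.0)


rng = random.Random(0)  # seeded generator so runs are repeatable

# With sigma = 0 the activation reduces to its deterministic mean.
print(probact(-1.0, 0.0, rng), probact(2.0, 0.0, rng))

# With sigma > 0 the outputs scatter around the mean.
samples = [probact(2.0, 0.5, rng) for _ in range(10000)]
print(sum(samples) / len(samples))
```

Setting `sigma` to zero recovers a deterministic activation, which makes it easy to compare the stochastic and conventional variants in the same code path.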

preprint, 2020, arXiv

QuIS: The Question of Intelligent Site Selection

Site selection is one of the most important decisions to be made by companies. Such a decision depends on various factors of sites, including socio-economic, geographical, ecological, as well as specific requirements of companies. The existing approaches for site selection are manual, subjective, and not scalable. The paper presents the new approach QuIS for site selection, which is automatic, scalable, and more effective than existing state-of-the-art methods. It impartially finds suitable sites based on analyzing decisive data of all location factors in both time and space. Another highlight of the proposed method is that the recommendations are supported by explanations, i.e., why something was suggested. To assess the effectiveness of the presented method, a case study on site selection of supermarkets in Germany is performed using real data of more than 200 location factors for 11,162 sites. Evaluation results show that there is a high coverage (86.4%) between the sites of existing supermarkets selected by economists and the sites recommended by the presented method. In addition, the method also recommends many sites (328) where it is beneficial to open a new supermarket. Furthermore, new decisive location factors are revealed, which have an impact on the existence of supermarkets.

preprint, 2020, arXiv

QuViS -- The Question of Visual Site Selection

This paper presents QuViS, which is an interactive platform for visualization and exploratory data analysis of site selection. The aim of QuViS is to support decision makers and experts during the process of site selection. In addition to a visualization engine for exploratory analysis, QuViS is also integrated with our automatic site selection method (QuIS), which recommends different sites automatically based on the location factors selected by economists and experts. To show the potential and highlight the visualization and exploration capabilities of QuViS, a case study on site selection for 1,556 German supermarkets is performed. The real publicly available dataset contains 450 location factors for all 11,162 municipalities in Germany, covering the last 10-15 years. Case study results show that QuViS provides an easy and intuitive way for exploratory analysis of geospatial multidimensional data.

preprint, 2020, arXiv

TSInsight: A local-global attribution framework for interpretability in time-series data

With the rise in the employment of deep learning methods in safety-critical scenarios, interpretability is more essential than ever before. Although many different directions regarding interpretability have been explored for visual modalities, time-series data has been neglected with only a handful of methods tested due to their poor intelligibility. We approach the problem of interpretability in a novel way by proposing TSInsight, where we attach an auto-encoder to the classifier with a sparsity-inducing norm on its output and fine-tune it based on the gradients from the classifier and a reconstruction penalty. TSInsight learns to preserve features that are important for prediction by the classifier and suppresses those that are irrelevant, i.e., it serves as a feature attribution method to boost interpretability. In contrast to most other attribution frameworks, TSInsight is capable of generating both instance-based and model-based explanations. We evaluated TSInsight along with 9 other commonly used attribution methods on 8 different time-series datasets to validate its efficacy. Evaluation results show that TSInsight naturally achieves output space contraction and is therefore an effective tool for the interpretability of deep time-series models.

preprint, 2020, arXiv

TSViz: Demystification of Deep Learning Models for Time-Series Analysis

This paper presents a novel framework for demystification of convolutional deep learning models for time-series analysis. This is a step towards making informed/explainable decisions in the domain of time-series, powered by deep learning. There have been numerous efforts to increase the interpretability of image-centric deep neural network models, where the learned features are more intuitive to visualize. Visualization in the time-series domain is much more complicated as there is no direct interpretation of the filters and inputs as compared to the image modality. In addition, little or no attention has been devoted to the development of such tools in the domain of time-series in the past. TSViz provides possibilities to explore and analyze a network from different dimensions at different levels of abstraction, which includes identification of parts of the input that were responsible for a prediction (including per filter saliency), importance of different filters present in the network for a particular prediction, notion of diversity present in the network through filter clustering, understanding of the main sources of variation learnt by the network through inverse optimization, and analysis of the network's robustness against adversarial noise. As a sanity check for the computed influence values, we demonstrate results regarding pruning of neural networks based on the computed influence information. These representations make it possible to understand the network features so that the acceptability of deep networks for time-series data can be enhanced. This is extremely important in domains like finance, industry 4.0, self-driving cars, health-care, counter-terrorism etc., where reasons for reaching a particular prediction are equally important as the prediction itself. We assess the proposed framework for interpretability with a set of desirable properties essential for any method.

preprint, 2020, arXiv

Two-stage framework for optic disc localization and glaucoma classification in retinal fundus images using deep learning

With the advancement of powerful image processing and machine learning techniques, CAD has become ever more prevalent in all fields of medicine including ophthalmology. Since the optic disc is the most important part of a retinal fundus image for glaucoma detection, this paper proposes a two-stage framework that first detects and localizes the optic disc and then classifies it into healthy or glaucomatous. The first stage is based on RCNN and is responsible for localizing and extracting the optic disc from a retinal fundus image, while the second stage uses Deep CNN to classify the extracted disc into healthy or glaucomatous. In addition to the proposed solution, we also developed a rule-based semi-automatic ground truth generation method that provides necessary annotations for training an RCNN based model for automated disc localization. The proposed method is evaluated on seven publicly available datasets for disc localization and on the ORIGA dataset, which is the largest publicly available dataset for glaucoma classification. The results of automatic localization mark a new state of the art on six datasets with accuracy reaching 100% on four of them. For glaucoma classification we achieved AUC equal to 0.874, which is a 2.7% relative improvement over the state-of-the-art results previously obtained for classification on ORIGA. Once trained on carefully annotated data, Deep Learning based methods for optic disc detection and localization are not only robust, accurate and fully automated but also eliminate the need for dataset-dependent heuristic algorithms. Our empirical evaluation of glaucoma classification on ORIGA reveals that reporting only AUC, for datasets with class imbalance and without pre-defined train and test splits, does not portray a true picture of the classifier's performance and calls for additional performance metrics to substantiate the results.