Source author record

Tushar Sharma

Tushar Sharma appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Software Engineering Machine Learning Artificial Intelligence Digital Libraries

Catalog footprint

What is connected

3works

4topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2026arXiv

DEFault++: Automated Fault Detection, Categorization, and Diagnosis for Transformer Architectures

Transformer models are widely deployed in critical AI applications, yet faults in their attention mechanisms, projections, and other internal components often degrade behavior silently without raising runtime errors. Existing fault diagnosis techniques often target generic deep neural networks and cannot identify which transformer component is responsible for an observed symptom. In this article, we present DEFault++, a hierarchical learning-based diagnostic technique that operates at three level of abstraction: it detects whether a fault is present, classifies it into one of 12 transformer-specific fault categories (covering both attention-internal mechanisms and surrounding architectural components), and identifies the underlying root cause from up to 45 mechanisms. To facilitate both training and evaluation, we construct DEFault-bench, a benchmark of 3,739 labeled instances obtained through systematic mutation testing. These instances are created across seven transformer models and nine downstream tasks using DEForm, a transformer-specific mutation technique we developed for this purpose. DEFault++ measures runtime behavior at the level of individual transformer components. It organizes these measurements through a Fault Propagation Graph (FPG) derived from the transformer architecture. It then produces an interpretable diagnosis using prototype matching combined with supervised contrastive learning. On DEFault-bench, DEFault++ exceeds an AUROC of 0.96 for detection and a Macro-F1 of 0.85 for both categorization and root-cause diagnosis on encoder and decoder architectures. In a developer study with 21 practitioners, the accuracy of choosing correct repair actions increased from 57.1% without support to 83.3% when using DEFault++.

preprint2022arXiv

A Survey on Machine Learning Techniques for Source Code Analysis

The advancements in machine learning techniques have encouraged researchers to apply these techniques to a myriad of software engineering tasks that use source code analysis, such as testing and vulnerability detection. Such a large number of studies hinders the community from understanding the current research landscape. This paper aims to summarize the current knowledge in applied machine learning for source code analysis. We review studies belonging to twelve categories of software engineering tasks and corresponding machine learning techniques, tools, and datasets that have been applied to solve them. To do so, we conducted an extensive literature search and identified 479 primary studies published between 2011 and 2021. We summarize our observations and findings with the help of the identified studies. Our findings suggest that the use of machine learning techniques for source code analysis tasks is consistently increasing. We synthesize commonly used steps and the overall workflow for each task and summarize machine learning techniques employed. We identify a comprehensive list of available datasets and tools useable in this context. Finally, the paper discusses perceived challenges in this area, including the availability of standard datasets, reproducibility and replicability, and hardware resources.

preprint2022arXiv

Empirical Standards for Repository Mining

The purpose of scholarly peer review is to evaluate the quality of scientific manuscripts. However, study after study demonstrates that peer review neither effectively nor reliably assesses research quality. Empirical standards attempt to address this problem by modelling a scientific community's expectations for each kind of empirical study conducted in that community. This should enhance not only the quality of research but also the reliability and predictability of peer review, as scientists adopt the standards in both their researcher and reviewer roles. However, these improvements depend on the quality and adoption of the standards. This tutorial will therefore present the empirical standard for mining software repositories, both to communicate its contents and to get feedback from the attendees. The tutorial will be organized into three parts: (1) brief overview of the empirical standards project; (2) detailed presentation of the repository mining standard; (3) discussion and suggestions for improvement.

Tushar Sharma

What is connected

Connect this record

See the researcher in context

Building this map preview

3 published item(s)

DEFault++: Automated Fault Detection, Categorization, and Diagnosis for Transformer Architectures

A Survey on Machine Learning Techniques for Source Code Analysis

Empirical Standards for Repository Mining