Source author record

Amit Singh

Amit Singh appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Machine Learning Artificial Intelligence Computer Vision Distributed, Parallel, and Cluster Computing Information Retrieval Molecular Networks Software Engineering

Catalog footprint

What is connected

5works

7topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2026arXiv

SpecMap: Hierarchical LLM Agent for Datasheet-to-Code Traceability Link Recovery in Systems Engineering

Establishing precise traceability between embedded systems datasheets and their corresponding code implementations remains a fundamental challenge in systems engineering, particularly for low-level software where manual mapping between specification documents and large code repositories is infeasible. Existing Traceability Link Recovery approaches primarily rely on lexical similarity and information retrieval techniques, which struggle to capture the semantic, structural, and symbol level relationships prevalent in embedded systems software. We present a hierarchical datasheet-to-code mapping methodology that employs large language models for semantic analysis while explicitly structuring the traceability process across multiple abstraction levels. Rather than performing direct specification-to-code matching, the proposed approach progressively narrows the search space through repository-level structure inference, file-level relevance estimation, and fine-grained symbollevel alignment. The method extends beyond function-centric mapping by explicitly covering macros, structs, constants, configuration parameters, and register definitions commonly found in systems-level C/C++ codebases. We evaluate the approach on multiple open-source embedded systems repositories using manually curated datasheet-to-code ground truth. Experimental results show substantial improvements over traditional information-retrieval-based baselines, achieving up to 73.3% file mapping accuracy. We significantly reduce computational overhead, lowering total LLM token consumption by 84% and end-to-end runtime by approximately 80%. This methodology supports automated analysis of large embedded software systems and enables downstream applications such as training data generation for systems-aware machine learning models, standards compliance verification, and large-scale specification coverage analysis.

preprint2022arXiv

NGAME: Negative Mining-aware Mini-batching for Extreme Classification

Extreme Classification (XC) seeks to tag data points with the most relevant subset of labels from an extremely large label set. Performing deep XC with dense, learnt representations for data points and labels has attracted much attention due to its superiority over earlier XC methods that used sparse, hand-crafted features. Negative mining techniques have emerged as a critical component of all deep XC methods that allow them to scale to millions of labels. However, despite recent advances, training deep XC models with large encoder architectures such as transformers remains challenging. This paper identifies that memory overheads of popular negative mining techniques often force mini-batch sizes to remain small and slow training down. In response, this paper introduces NGAME, a light-weight mini-batch creation technique that offers provably accurate in-batch negative samples. This allows training with larger mini-batches offering significantly faster convergence and higher accuracies than existing negative sampling techniques. NGAME was found to be up to 16% more accurate than state-of-the-art methods on a wide array of benchmark datasets for extreme classification, as well as 3% more accurate at retrieving search engine queries in response to a user webpage visit to show personalized ads. In live A/B tests on a popular search engine, NGAME yielded up to 23% gains in click-through-rates.

preprint2021arXiv

Generalize then Adapt: Source-Free Domain Adaptive Semantic Segmentation

Unsupervised domain adaptation (DA) has gained substantial interest in semantic segmentation. However, almost all prior arts assume concurrent access to both labeled source and unlabeled target, making them unsuitable for scenarios demanding source-free adaptation. In this work, we enable source-free DA by partitioning the task into two: a) source-only domain generalization and b) source-free target adaptation. Towards the former, we provide theoretical insights to develop a multi-head framework trained with a virtually extended multi-source dataset, aiming to balance generalization and specificity. Towards the latter, we utilize the multi-head framework to extract reliable target pseudo-labels for self-training. Additionally, we introduce a novel conditional prior-enforcing auto-encoder that discourages spatial irregularities, thereby enhancing the pseudo-label quality. Experiments on the standard GTA5-to-Cityscapes and SYNTHIA-to-Cityscapes benchmarks show our superiority even against the non-source-free prior-arts. Further, we show our compatibility with online adaptation enabling deployment in a sequentially changing environment.

preprint2014arXiv

Analysis of the hierarchical structure of the B. subtilis transcriptional regulatory network

The transcriptional regulation of gene expression is orchestrated by complex networks of interacting genes. Increasing evidence indicates that these transcriptional regulatory networks (TRNs) in bacteria have an inherently hierarchical architecture, although the design principles and the specific advantages offered by this type of organization have not yet been fully elucidated. In this study, we focussed on the hierarchical structure of the TRN of the gram-positive bacterium Bacillus subtilis and performed a comparative analysis with the TRN of the gram-negative bacterium Escherichia coli. Using a graph-theoretic approach, we organized the transcription factors (TFs) and sigma-factors in the TRNs of B. subtilis and E. coli into three hierarchical levels (Top, Middle and Bottom) and studied several structural and functional properties across them. In addition to many similarities, we found also specific differences, explaining the majority of them with variations in the distribution of sigma-factors across the hierarchical levels in the two organisms. We then investigated the control of target metabolic genes by transcriptional regulators to characterize the differential regulation of three distinct metabolic subsystems (catabolism, anabolism and central energy metabolism). These results suggest that the hierarchical architecture that we observed in B. subtilis represents an effective organization of its TRN to achieve flexibility in the response to diverse stimuli.

preprint2012arXiv

Case Tool: Fast Interconnections with New 3-Disjoint Paths MIN Simulation Module

Multi-stage interconnection networks (MIN) can be designed to achieve fault tolerance and collision solving by providing a set of disjoint paths. In this paper, we are discussing the new simulator added to the tool designed for developing fault tolerant MINs. The designed tool is one of its own kind and will help the user in developing 2 and 3-disjoint path networks. The java technology has been used to design the tool and have been tested on different software platform.