Source author record

Ashish Verma

Ashish Verma appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

math.CA Machine Learning Distributed, Parallel, and Cluster Computing Artificial Intelligence Computation and Language Computer Vision Cryptography and Security

Catalog footprint

What is connected

13works

7topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2026arXiv

Multi-RADS Synthetic Radiology Report Dataset and Head-to-Head Benchmarking of 41 Open-Weight and Proprietary Language Models

Background: Reporting and Data Systems (RADS) standardize radiology risk communication but automated RADS assignment from narrative reports is challenging because of guideline complexity, output-format constraints, and limited benchmarking across RADS frameworks and model sizes. Purpose: To create RXL-RADSet, a radiologist-verified synthetic multi-RADS benchmark, and compare validity and accuracy of open-weight small language models (SLMs) with a proprietary model for RADS assignment. Materials and Methods: RXL-RADSet contains 1,600 synthetic radiology reports across 10 RADS (BI-RADS, CAD-RADS, GB-RADS, LI-RADS, Lung-RADS, NI-RADS, O-RADS, PI-RADS, TI-RADS, VI-RADS) and multiple modalities. Reports were generated by LLMs using scenario plans and simulated radiologist styles and underwent two-stage radiologist verification. We evaluated 41 quantized SLMs (12 families, 0.135-32B parameters) and GPT-5.2 under a fixed guided prompt. Primary endpoints were validity and accuracy; a secondary analysis compared guided versus zero-shot prompting. Results: Under guided prompting GPT-5.2 achieved 99.8% validity and 81.1% accuracy (1,600 predictions). Pooled SLMs (65,600 predictions) achieved 96.8% validity and 61.1% accuracy; top SLMs in the 20-32B range reached ~99% validity and mid-to-high 70% accuracy. Performance scaled with model size (inflection between <1B and >=10B) and declined with RADS complexity primarily due to classification difficulty rather than invalid outputs. Guided prompting improved validity (99.2% vs 96.7%) and accuracy (78.5% vs 69.6%) compared with zero-shot. Conclusion: RXL-RADSet provides a radiologist-verified multi-RADS benchmark; large SLMs (20-32B) can approach proprietary-model performance under guided prompting, but gaps remain for higher-complexity schemes.

preprint2022arXiv

Just-in-Time Aggregation for Federated Learning

The increasing number and scale of federated learning (FL) jobs necessitates resource efficient scheduling and management of aggregation to make the economics of cloud-hosted aggregation work. Existing FL research has focused on the design of FL algorithms and optimization, and less on the efficacy of aggregation. Existing FL platforms often employ aggregators that actively wait for model updates. This wastes computational resources on the cloud, especially in large scale FL settings where parties are intermittently available for training. In this paper, we propose a new FL aggregation paradigm -- "just-in-time" (JIT) aggregation that leverages unique properties of FL jobs, especially the periodicity of model updates, to defer aggregation as much as possible and free compute resources for other FL jobs or other datacenter workloads. We describe a novel way to prioritize FL jobs for aggregation, and demonstrate using multiple datasets, models and FL aggregation algorithms that our techniques can reduce resource usage by 60+\% when compared to eager aggregation used in existing FL platforms. We also demonstrate that using JIT aggregation has negligible overhead and impact on the latency of the FL job.

preprint2021arXiv

Adversarial training in communication constrained federated learning

Federated learning enables model training over a distributed corpus of agent data. However, the trained model is vulnerable to adversarial examples, designed to elicit misclassification. We study the feasibility of using adversarial training (AT) in the federated learning setting. Furthermore, we do so assuming a fixed communication budget and non-iid data distribution between participating agents. We observe a significant drop in both natural and adversarial accuracies when AT is used in the federated setting as opposed to centralized training. We attribute this to the number of epochs of AT performed locally at the agents, which in turn effects (i) drift between local models; and (ii) convergence time (measured in number of communication rounds). Towards this end, we propose FedDynAT, a novel algorithm for performing AT in federated setting. Through extensive experimentation we show that FedDynAT significantly improves both natural and adversarial accuracy, as well as model convergence time by reducing the model drift.

preprint2020arXiv

Effective Elastic Scaling of Deep Learning Workloads

The increased use of deep learning (DL) in academia, government and industry has, in turn, led to the popularity of on-premise and cloud-hosted deep learning platforms, whose goals are to enable organizations utilize expensive resources effectively, and to share said resources among multiple teams in a fair and effective manner. In this paper, we examine the elastic scaling of Deep Learning (DL) jobs over large-scale training platforms and propose a novel resource allocation strategy for DL training jobs, resulting in improved job run time performance as well as increased cluster utilization. We begin by analyzing DL workloads and exploit the fact that DL jobs can be run with a range of batch sizes without affecting their final accuracy. We formulate an optimization problem that explores a dynamic batch size allocation to individual DL jobs based on their scaling efficiency, when running on multiple nodes. We design a fast dynamic programming based optimizer to solve this problem in real-time to determine jobs that can be scaled up/down, and use this optimizer in an autoscaler to dynamically change the allocated resources and batch sizes of individual DL jobs. We demonstrate empirically that our elastic scaling algorithm can complete up to $\approx 2 \times$ as many jobs as compared to a strong baseline algorithm that also scales the number of GPUs but does not change the batch size. We also demonstrate that the average completion time with our algorithm is up to $\approx 10 \times$ faster than that of the baseline.

preprint2020arXiv

Finite summation formulas of generalized Kampé de Fériet series

In the present paper, author obtained finite summation formulas for the generalized Kampé de Fériet series. The peculiar outcome for four generalized Lauricella functions and confluent forms of Lauricella series in $n$ variables are also drawn from the finite summation formulas for the generalized Kampé de Fériet series.

preprint2020arXiv

IBM Federated Learning: an Enterprise Framework White Paper V0.1

Federated Learning (FL) is an approach to conduct machine learning without centralizing training data in a single place, for reasons of privacy, confidentiality or data volume. However, solving federated machine learning problems raises issues above and beyond those of centralized machine learning. These issues include setting up communication infrastructure between parties, coordinating the learning process, integrating party results, understanding the characteristics of the training data sets of different participating parties, handling data heterogeneity, and operating with the absence of a verification data set. IBM Federated Learning provides infrastructure and coordination for federated learning. Data scientists can design and run federated learning jobs based on existing, centralized machine learning models and can provide high-level instructions on how to run the federation. The framework applies to both Deep Neural Networks as well as ``traditional'' approaches for the most common machine learning libraries. {\proj} enables data scientists to expand their scope from centralized to federated machine learning, minimizing the learning curve at the outset while also providing the flexibility to deploy to different compute environments and design custom fusion algorithms.

preprint2020arXiv

Improving the affordability of robustness training for DNNs

Projected Gradient Descent (PGD) based adversarial training has become one of the most prominent methods for building robust deep neural network models. However, the computational complexity associated with this approach, due to the maximization of the loss function when finding adversaries, is a longstanding problem and may be prohibitive when using larger and more complex models. In this paper we show that the initial phase of adversarial training is redundant and can be replaced with natural training which significantly improves the computational efficiency. We demonstrate that this efficiency gain can be achieved without any loss in accuracy on natural and adversarial test samples. We support our argument with insights on the nature of the adversaries and their relative strength during the training process. We show that our proposed method can reduce the training time by a factor of up to 2.5 with comparable or better model test accuracy and generalization on various strengths of adversarial attacks.

preprint2020arXiv

Infinite summation formulas of Srivastava's general triple hypergeometric function

In this paper we derive the infinite summation formulas of Srivastava's general triple hypergeometric function. Certain particular cases leading to infinite summation formulas for fourteen Lauricella and three Srivastavaś triple hypergeometric functions are also presented.

preprint2020arXiv

On the incomplete Lauricella matrix functions of several variables

In this article, we introduce incomplete Lauricella matrix functions (ILMFs) of $n$ variables through application of the incomplete Pochhammer matrix symbols. Furthermore there is derivation of certain properties; matrix differential equation, integral formula, recursion formula and differentiation formula of the ILMFs. We also establish the connection between these matrix functions and other peculiar matrix functions; incomplete gamma matrix function, Laguerre, Bessel and modified Bessel matrix functions.

preprint2020arXiv

On the incomplete Srivastava's triple hypergeometric matrix functions

The paper proposes to introduce incomplete Srivastava's triple hypergeometric matrix functions through application of the incomplete Pochhammer matrix symbols. We also derive certain properties such as matrix differential equation, integral formula, reduction formula, recursion formula, recurrence relation and differentiation formula of the incomplete Srivastava's triple hypergeometric matrix functions.

preprint2020arXiv

On the recursion formulas for the matrix special functions of one and two variables

Special matrix functions have recently been investigated for regions of convergence, integral representations and the systems of matrix differential equation that these functions satisfy. In this paper, we find the recursion formulas for the Gauss hypergeometric matrix function. We also give the recursion formulas for the two variable Appell matrix functions

preprint2020arXiv

PoWER-BERT: Accelerating BERT Inference via Progressive Word-vector Elimination

We develop a novel method, called PoWER-BERT, for improving the inference time of the popular BERT model, while maintaining the accuracy. It works by: a) exploiting redundancy pertaining to word-vectors (intermediate encoder outputs) and eliminating the redundant vectors. b) determining which word-vectors to eliminate by developing a strategy for measuring their significance, based on the self-attention mechanism. c) learning how many word-vectors to eliminate by augmenting the BERT model and the loss function. Experiments on the standard GLUE benchmark shows that PoWER-BERT achieves up to 4.5x reduction in inference time over BERT with <1% loss in accuracy. We show that PoWER-BERT offers significantly better trade-off between accuracy and inference time compared to prior methods. We demonstrate that our method attains up to 6.8x reduction in inference time with <1% loss in accuracy when applied over ALBERT, a highly compressed version of BERT. The code for PoWER-BERT is publicly available at https://github.com/IBM/PoWER-BERT.

preprint2020arXiv

Some results on the Kampé de Feŕiet hypergeometric matrix function

In this paper, we obtain recursion formulas for the Kampé de Feŕiet hypergeometric matrix function. We also give finite and infinite summation formulas for the Kampé de Feŕiet hypergeometric matrix function.

Ashish Verma

What is connected

Connect this record

See the researcher in context

Building this map preview

13 published item(s)

Multi-RADS Synthetic Radiology Report Dataset and Head-to-Head Benchmarking of 41 Open-Weight and Proprietary Language Models

Just-in-Time Aggregation for Federated Learning

Adversarial training in communication constrained federated learning

Effective Elastic Scaling of Deep Learning Workloads

Finite summation formulas of generalized Kampé de Fériet series

IBM Federated Learning: an Enterprise Framework White Paper V0.1

Improving the affordability of robustness training for DNNs

Infinite summation formulas of Srivastava's general triple hypergeometric function

On the incomplete Lauricella matrix functions of several variables

On the incomplete Srivastava's triple hypergeometric matrix functions

On the recursion formulas for the matrix special functions of one and two variables

PoWER-BERT: Accelerating BERT Inference via Progressive Word-vector Elimination

Some results on the Kampé de Feŕiet hypergeometric matrix function