Researcher profile

Biresh Kumar Joardar

Biresh Kumar Joardar contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 15 - UnverifiedVerification L1Unclaimed author
3works
0followers
3topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

3 published item(s)

preprint2026arXiv

Building Reliable Arithmetic Multipliers Under NBTI Aging and Process Variations

Hardware aging poses a significant challenge for integrated circuits (ICs), leading to performance degradation and eventual failure. In this work, we focus on the aging of arithmetic multipliers, which are a cornerstone of modern computing systems including in CPUs, GPUs, and FPGAs, as well as AI accelerators like systolic arrays. In particular, AI workloads, which rely predominantly on multiplications, can accelerate Negative Bias Temperature Instability (NBTI) effects in multipliers. This paper presents a novel aging mitigation technique that leverages the signinvariance property of multiplication. By selectively applying 2s complement transformations to inputs, the method redistributes stress across transistors, reducing the effects of NBTI aging. The proposed method is also integrated into systolic arrays, a common AI accelerator, to demonstrate its efficiency in a high-throughput AI accelerator. Experimental evaluations using Cadence tools show better lifetime compared to natural aging (with no mitigation) baseline, while introducing negligible area and delay overheads.

preprint2022arXiv

ReaLPrune: ReRAM Crossbar-aware Lottery Ticket Pruned CNNs

Training machine learning (ML) models at the edge (on-chip training on end user devices) can address many pressing challenges including data privacy/security, increase the accessibility of ML applications to different parts of the world by reducing the dependence on the communication fabric and the cloud infrastructure, and meet the real-time requirements of AR/VR applications. However, existing edge platforms do not have sufficient computing capabilities to support complex ML tasks such as training large CNNs. ReRAM-based architectures offer high-performance yet energy efficient computing platforms for on-chip CNN training/inferencing. However, ReRAM-based architectures are not scalable with the size of the CNN. Larger CNNs have more weights, which requires more ReRAM cells that cannot be integrated in a single chip. Moreover, training larger CNNs on-chip will require higher power, which cannot be afforded by these smaller devices. Pruning is an effective way to solve this problem. However, existing pruning techniques are either targeted for inferencing only, or they are not crossbar-aware. This leads to sub-optimal hardware savings and performance benefits for CNN training on ReRAM-based architectures. In this paper, we address this problem by proposing a novel crossbar-aware pruning strategy, referred as ReaLPrune, which can prune more than 90% of CNN weights. The pruned model can be trained from scratch without any accuracy loss. Experimental results indicate that ReaLPrune reduces hardware requirements by 77.2% and accelerates CNN training by ~20X compared to unpruned CNNs. ReaLPrune also outperforms other crossbar-aware pruning techniques in terms of both performance and hardware savings. In addition, ReaLPrune is equally effective for diverse datasets and more complex CNNs

preprint2021arXiv

ReGraphX: NoC-enabled 3D Heterogeneous ReRAM Architecture for Training Graph Neural Networks

Graph Neural Network (GNN) is a variant of Deep Neural Networks (DNNs) operating on graphs. However, GNNs are more complex compared to traditional DNNs as they simultaneously exhibit features of both DNN and graph applications. As a result, architectures specifically optimized for either DNNs or graph applications are not suited for GNN training. In this work, we propose a 3D heterogeneous manycore architecture for on-chip GNN training to address this problem. The proposed architecture, ReGraphX, involves heterogeneous ReRAM crossbars to fulfill the disparate requirements of both DNN and graph computations simultaneously. The ReRAM-based architecture is complemented with a multicast-enabled 3D NoC to improve the overall achievable performance. We demonstrate that ReGraphX outperforms conventional GPUs by up to 3.5X (on an average 3X) in terms of execution time, while reducing energy consumption by as much as 11X.