Source author record

Krishnendu Chakrabarty

Krishnendu Chakrabarty appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Emerging Technologies; Hardware Architecture; nlin.CD; Cryptography and Security; Distributed, Parallel, and Cluster Computing; Networking and Internet Architecture; Neural and Evolutionary Computing; Other Computer Science

Catalog footprint

What is connected

15 works

8 topics

4 close collaborators


preprint · 2026 · arXiv

H3PIMAP: A Heterogeneity-Aware Multi-Objective DNN Mapping Framework on Electronic-Photonic Processing-in-Memory Architectures

The future of artificial intelligence (AI) acceleration demands a paradigm shift beyond the limitations of purely electronic or photonic architectures. Photonic analog computing delivers unmatched speed and parallelism but struggles with data movement, robustness, and precision, while electronic processing-in-memory (PIM) enables energy-efficient computing by co-locating storage and computation but suffers from endurance and reconfiguration constraints, limiting it to static weight mapping. Neither approach alone achieves the balance needed for adaptive, efficient AI. To break this impasse, we study a hybrid electronic-photonic-PIM computing architecture and introduce H3PIMAP, a heterogeneity-aware mapping framework that seamlessly orchestrates workloads across electronic and optical tiers. By optimizing workload partitioning through a two-stage multi-objective exploration method, H3PIMAP harnesses light speed for high-throughput operations and PIM efficiency for memory-bound tasks. In system-level evaluations, H3PIMAP delivers a 3.32x latency reduction across language and vision models and, on large language models, achieves 77.0% lower latency with 14.6% lower energy at matched quality, outperforming homogeneous and naive mapping strategies. This proposed framework lays the foundation for hybrid AI accelerators, bridging the gap between electronic and photonic computation for next-generation efficiency and scalability.
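As a rough illustration of the multi-objective side of mapping exploration, the sketch below filters candidate mappings down to the Pareto-optimal ones over latency and energy. It is not H3PIMAP's two-stage method, which the abstract does not detail. The `Mapping` type, the candidate names, and all numbers are hypothetical placeholders.

```python
from typing import NamedTuple


class Mapping(NamedTuple):
    """One candidate workload mapping (hypothetical structure)."""
    name: str
    latency_ms: float
    energy_mj: float


def dominates(a: Mapping, b: Mapping) -> bool:
    """True if a is no worse than b on both objectives and better on one."""
    no_worse = a.latency_ms <= b.latency_ms and a.energy_mj <= b.energy_mj
    strictly_better = a.latency_ms < b.latency_ms or a.energy_mj < b.energy_mj
    return no_worse and strictly_better


def pareto_front(candidates: list[Mapping]) -> list[Mapping]:
    """Keep only candidates that no other candidate dominates."""
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates)]


# Placeholder values chosen only to exercise the filter.
candidates = [
    Mapping("all_electronic_pim", 9.0, 4.0),
    Mapping("all_photonic", 3.0, 8.0),
    Mapping("hybrid_a", 4.0, 5.0),
    Mapping("hybrid_b", 5.0, 6.0),  # dominated by hybrid_a
]
front = pareto_front(candidates)
print([m.name for m in front])
```

A quadratic-time filter like this is only practical for small candidate sets; a real exploration flow would likely prune the search space before comparing mappings.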

preprint · 2025 · arXiv

STAMP-2.5D: Structural and Thermal Aware Methodology for Placement in 2.5D Integration

Chiplet-based architectures and advanced packaging have emerged as transformative approaches in semiconductor design. While conventional physical design for 2.5D heterogeneous systems typically prioritizes wirelength reduction through tight chiplet packing, this strategy creates thermal bottlenecks and intensifies coefficient of thermal expansion (CTE) mismatches, compromising long-term reliability. Addressing these challenges requires holistic consideration of thermal performance, mechanical stress, and interconnect efficiency. We introduce STAMP-2.5D, the first automated floorplanning methodology that simultaneously optimizes these critical factors. Our approach employs finite element analysis to simulate temperature distributions and stress profiles across chiplet configurations while minimizing interconnect wirelength. Experimental results demonstrate that our thermal- and structure-aware automated floorplanning approach reduces overall stress by 11% while maintaining excellent thermal performance with a negligible 0.5% temperature increase and simultaneously reducing total wirelength by 11% compared to temperature-only optimization. Additionally, we conduct an exploratory study on the effects of temperature gradients on structural integrity, providing crucial insights for reliability-conscious chiplet design. STAMP-2.5D establishes a robust platform for navigating critical trade-offs in advanced semiconductor packaging.

preprint · 2022 · arXiv

Don't CWEAT It: Toward CWE Analysis Techniques in Early Stages of Hardware Design

To help prevent hardware security vulnerabilities from propagating to later design stages where fixes are costly, it is crucial to identify security concerns as early as possible, such as in RTL designs. In this work, we investigate the practical implications and feasibility of producing a set of security-specific scanners that operate on Verilog source files. The scanners indicate parts of code that might contain one of a set of MITRE's common weakness enumerations (CWEs). We explore the CWE database to characterize the scope and attributes of the CWEs and identify those that are amenable to static analysis. We prototype scanners and evaluate them on 11 open source designs - 4 system-on-chips (SoC) and 7 processor cores - and explore the nature of identified weaknesses. Our analysis reported 53 potential weaknesses in the OpenPiton SoC used in Hack@DAC-21, 11 of which we confirmed as security concerns.

preprint · 2022 · arXiv

LoCI: An Analysis of the Impact of Optical Loss and Crosstalk Noise in Integrated Silicon-Photonic Neural Networks

Compared to electronic accelerators, integrated silicon-photonic neural networks (SP-NNs) promise higher speed and energy efficiency for emerging artificial-intelligence applications. However, a hitherto overlooked problem in SP-NNs is that the underlying silicon photonic devices suffer from intrinsic optical loss and crosstalk noise, the impact of which accumulates as the network scales up. Leveraging precise device-level models, this paper presents the first comprehensive and systematic optical loss and crosstalk modeling framework for SP-NNs. For an SP-NN case study with two hidden layers and 1380 tunable parameters, we show a catastrophic 84% drop in inferencing accuracy due to optical loss and crosstalk noise.

preprint · 2022 · arXiv

ReaLPrune: ReRAM Crossbar-aware Lottery Ticket Pruned CNNs

Training machine learning (ML) models at the edge (on-chip training on end user devices) can address many pressing challenges including data privacy/security, increase the accessibility of ML applications to different parts of the world by reducing the dependence on the communication fabric and the cloud infrastructure, and meet the real-time requirements of AR/VR applications. However, existing edge platforms do not have sufficient computing capabilities to support complex ML tasks such as training large CNNs. ReRAM-based architectures offer high-performance yet energy-efficient computing platforms for on-chip CNN training/inferencing. However, ReRAM-based architectures are not scalable with the size of the CNN. Larger CNNs have more weights, which require more ReRAM cells than can be integrated in a single chip. Moreover, training larger CNNs on-chip will require higher power, which cannot be afforded by these smaller devices. Pruning is an effective way to solve this problem. However, existing pruning techniques are either targeted for inferencing only, or they are not crossbar-aware. This leads to sub-optimal hardware savings and performance benefits for CNN training on ReRAM-based architectures. In this paper, we address this problem by proposing a novel crossbar-aware pruning strategy, referred to as ReaLPrune, which can prune more than 90% of CNN weights. The pruned model can be trained from scratch without any accuracy loss. Experimental results indicate that ReaLPrune reduces hardware requirements by 77.2% and accelerates CNN training by ~20X compared to unpruned CNNs. ReaLPrune also outperforms other crossbar-aware pruning techniques in terms of both performance and hardware savings. In addition, ReaLPrune is equally effective for diverse datasets and more complex CNNs.

preprint · 2021 · arXiv

ReGraphX: NoC-enabled 3D Heterogeneous ReRAM Architecture for Training Graph Neural Networks

A Graph Neural Network (GNN) is a variant of Deep Neural Networks (DNNs) that operates on graphs. However, GNNs are more complex than traditional DNNs, as they simultaneously exhibit features of both DNN and graph applications. As a result, architectures specifically optimized for either DNNs or graph applications are not suited for GNN training. In this work, we propose a 3D heterogeneous manycore architecture for on-chip GNN training to address this problem. The proposed architecture, ReGraphX, involves heterogeneous ReRAM crossbars to fulfill the disparate requirements of both DNN and graph computations simultaneously. The ReRAM-based architecture is complemented with a multicast-enabled 3D NoC to improve the overall achievable performance. We demonstrate that ReGraphX outperforms conventional GPUs by up to 3.5X (3X on average) in terms of execution time, while reducing energy consumption by as much as 11X.

preprint · 2016 · arXiv

Design-Space Exploration and Optimization of an Energy-Efficient and Reliable 3D Small-world Network-on-Chip

A three-dimensional (3D) Network-on-Chip (NoC) enables the design of high performance and low power many-core chips. Existing 3D NoCs are inadequate for meeting the ever-increasing performance requirements of many-core processors since they are simple extensions of regular 2D architectures and they do not fully exploit the advantages provided by 3D integration. Moreover, the anticipated performance gain of a 3D NoC-enabled many-core chip may be compromised due to the potential failures of through-silicon-vias (TSVs) that are predominantly used as vertical interconnects in a 3D IC. To address these problems, we propose a machine-learning-inspired predictive design methodology for energy-efficient and reliable many-core architectures enabled by 3D integration. We demonstrate that a small-world network-based 3D NoC (3D SWNoC) performs significantly better than its 3D MESH-based counterparts. On average, the 3D SWNoC shows 35% energy-delay-product (EDP) improvement over 3D MESH for the PARSEC and SPLASH2 benchmarks considered in this work. To improve the reliability of 3D NoC, we propose a computationally efficient spare-vertical link (sVL) allocation algorithm based on a state-space search formulation. Our results show that the proposed sVL allocation algorithm can significantly improve the reliability as well as the lifetime of 3D SWNoC.

preprint · 2014 · arXiv

Bifurcation and control of chaos in Induction motor drives

The induction motor controlled by Indirect Field Oriented Control (IFOC) is known to have high performance and better stability. This paper reports the dynamical behavior of an IFOC induction motor drive in the light of bifurcation theory. The speed of a high-performance induction motor drive is controlled by the IFOC method. Knowledge of qualitative changes in the behavior of the motor, such as equilibrium points, limit cycles and chaos, under changes of motor parameters and load torque is essential for proper control of the motor. This paper provides a numerical approach to better understand the dynamical behavior of indirect field oriented control of a current-fed induction motor. The focus is on bifurcation analysis of the IFOC motor, with particular emphasis on changes in dynamics and stability under small variations of the Proportional Integral (PI) controller parameters, the load torque, and k, the ratio of the rotor time constant to its estimate. Bifurcation diagrams are computed. This paper also discusses various types of transition to chaos in the induction motor. The results of the bifurcation simulations give useful guidelines for adjusting both motor model and PI controller parameters. It is also important to ensure the desired operation of the motor when the motor shows chaotic behavior. An infinite number of unstable periodic orbits are embedded in a chaotic attractor. Any unstable periodic orbit can be stabilized by a proper control algorithm. The delayed feedback control method for controlling chaos has been implemented in this system.
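To make the idea of delayed feedback control concrete, here is a minimal sketch on the logistic map, used as a stand-in because the abstract does not give the IFOC motor equations. The control term is proportional to the difference between the current state and the one-step-delayed state, so it vanishes once the orbit sits on the fixed point. The map, the parameter `R = 3.8`, the gain 0.7, and the starting offset are all assumptions for illustration; a linearization around the fixed point suggests gains between 0.4 and 1 are stabilizing for this particular map.

```python
def logistic(x: float, r: float) -> float:
    """Logistic map, a standard toy system with unstable fixed points."""
    return r * x * (1.0 - x)


def iterate(x0: float, r: float, gain: float, steps: int) -> float:
    """Iterate the map with delayed feedback gain * (x[n] - x[n-1])."""
    prev, cur = x0, x0  # no control on the first step (difference is zero)
    for _ in range(steps):
        control = gain * (cur - prev)
        prev, cur = cur, logistic(cur, r) + control
    return cur


R = 3.8                      # assumed parameter value
FIXED_POINT = 1.0 - 1.0 / R  # unstable fixed point of the uncontrolled map

# Start slightly off the fixed point, with and without feedback.
controlled = iterate(FIXED_POINT + 0.01, R, gain=0.7, steps=300)
uncontrolled = iterate(FIXED_POINT + 0.01, R, gain=0.0, steps=300)

print(abs(controlled - FIXED_POINT))    # distance with feedback
print(abs(uncontrolled - FIXED_POINT))  # distance without feedback
```

The sketch starts near the target orbit on purpose. In practice the feedback is often switched on only when the trajectory wanders close to the orbit, since the stabilization argument is local.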

preprint · 2014 · arXiv

Indices to detect Hopf bifurcation in Induction motor drives

The loss of stability of an induction motor controlled by Indirect Field Oriented Control (IFOC) is a matter of great concern to operators and design engineers. This paper reports indices to detect and predict stability problems such as system oscillations. Oscillations resulting from loss of stability due to Hopf bifurcation are studied for different parameter values of the IFOC motor using the proposed indices.

preprint · 2014 · arXiv

Stabilization of unstable periodic orbits in dc drives

Electric drives using a dc shunt motor or a permanent magnet dc (PMDC) motor as the prime mover exhibit bifurcation and chaos. The characteristics of dc shunt and PMDC motors are linear in nature. These motors are controlled by the pulse width modulation (PWM) technique with the help of semiconductor switches. These switches are nonlinear elements that introduce nonlinear characteristics in the drive. Any nonlinear system can exhibit bifurcation and chaos. dc shunt or PMDC drives show normal behavior within a certain range of parameter values. It is also observed that these drives show chaos for significantly large ranges of parameter values. In this paper we present a method for controlling chaos applicable to dc shunt and PMDC drives. The results of the numerical investigation are presented.

preprint · 2013 · arXiv

Algorithms for Producing Linear Dilution Gradient with Digital Microfluidics

Digital microfluidic (DMF) biochips are now being extensively used to automate several biochemical laboratory protocols such as clinical analysis, point-of-care diagnostics, and polymerase chain reaction (PCR). In many biological assays, e.g., in bacterial susceptibility tests, samples and reagents are required in multiple concentration (or dilution) factors, satisfying certain "gradient" patterns such as linear, exponential, or parabolic. Dilution gradients are usually prepared with continuous-flow microfluidic devices; however, they suffer from inflexibility, non-programmability, and a large requirement for costly stock solutions. DMF biochips, on the other hand, have been shown to produce a set of random dilution factors more efficiently. However, all existing algorithms fail to optimize the cost or performance when a certain gradient pattern is required. In this work, we present an algorithm to generate any arbitrary linear gradient, on-chip, with minimum wastage, while satisfying a required accuracy in the concentration factor. We present new theoretical results on the number of mix-split operations and waste computation, and prove an upper bound on the storage requirement. The corresponding layout design of the biochip is also proposed. Simulation results on different linear gradients show a significant improvement in sample cost over three earlier algorithms used for the generation of multiple concentrations.
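The arithmetic behind mix-split dilution can be sketched as follows, assuming the common (1:1) model in which two unit droplets are merged and split, so the result has the average of the two concentrations. The code prepares each target of the form k / 2^d independently by scanning the bits of k, which is a simple per-target baseline and not the gradient algorithm of this paper. The function names and the example gradient are illustrative assumptions.

```python
from fractions import Fraction


def mix_sequence(numerator: int, depth: int) -> list[int]:
    """Droplet to mix in at each step (1 = sample, 0 = buffer).

    Targets concentration numerator / 2**depth; bits are scanned from the
    least significant to the most significant.
    """
    if not 0 < numerator < 2 ** depth:
        raise ValueError("numerator must lie strictly between 0 and 2**depth")
    return [(numerator >> i) & 1 for i in range(depth)]


def simulate(sequence: list[int]) -> Fraction:
    """Apply (1:1) mix-split steps, starting from a pure buffer droplet."""
    concentration = Fraction(0)
    for droplet in sequence:
        # Merging two unit droplets and splitting averages the concentrations.
        concentration = (concentration + droplet) / 2
    return concentration


# A linear gradient 1/8, 3/8, 5/8, 7/8 prepared one target at a time.
depth = 3
gradient = [1, 3, 5, 7]
results = [simulate(mix_sequence(k, depth)) for k in gradient]
total_mix_steps = depth * len(gradient)

print(results)
print(total_mix_steps)
```

Preparing every target separately costs `depth` mix-split steps per target and discards intermediate droplets; sharing intermediate droplets across targets is the kind of saving a gradient-aware method would look for.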

preprint · 2007 · arXiv

Design of Fault-Tolerant and Dynamically-Reconfigurable Microfluidic Biochips

Microfluidics-based biochips are soon expected to revolutionize clinical diagnosis, DNA sequencing, and other laboratory procedures involving molecular biology. Most microfluidic biochips are based on the principle of continuous fluid flow and they rely on permanently-etched microchannels, micropumps, and microvalves. We focus here on the automated design of "digital" droplet-based microfluidic biochips. In contrast to continuous-flow systems, digital microfluidics offers dynamic reconfigurability; groups of cells in a microfluidics array can be reconfigured to change their functionality during the concurrent execution of a set of bioassays. We present a simulated annealing-based technique for module placement in such biochips. The placement procedure not only addresses chip area, but it also considers fault tolerance, which allows a microfluidic module to be relocated elsewhere in the system when a single cell is detected to be faulty. Simulation results are presented for a case study involving the polymerase chain reaction.
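A minimal sketch of simulated annealing for module placement is given below, under simplifying assumptions: modules are fixed-size rectangles on a square cell grid, all modules are active at the same time, and the cost is only the bounding-box area. The fault-tolerance term described in the abstract is left out, and the module sizes, grid size, and cooling schedule are hypothetical values.

```python
import math
import random

Size = tuple[int, int]  # (width, height) in cells
Pos = tuple[int, int]   # lower-left corner (x, y)


def overlaps(a_pos: Pos, a_size: Size, b_pos: Pos, b_size: Size) -> bool:
    """True if two axis-aligned rectangles share at least one cell."""
    (ax, ay), (aw, ah) = a_pos, a_size
    (bx, by), (bw, bh) = b_pos, b_size
    return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah


def is_legal(placement: list[Pos], modules: list[Size]) -> bool:
    """A placement is legal when no pair of modules overlaps."""
    n = len(modules)
    return all(not overlaps(placement[i], modules[i], placement[j], modules[j])
               for i in range(n) for j in range(i + 1, n))


def bounding_area(placement: list[Pos], modules: list[Size]) -> int:
    """Area of the smallest rectangle enclosing all modules."""
    x0 = min(x for x, _ in placement)
    y0 = min(y for _, y in placement)
    x1 = max(x + w for (x, _), (w, _) in zip(placement, modules))
    y1 = max(y + h for (_, y), (_, h) in zip(placement, modules))
    return (x1 - x0) * (y1 - y0)


def anneal(modules: list[Size], grid: int = 16, t_start: float = 50.0,
           t_end: float = 0.1, alpha: float = 0.95,
           moves_per_t: int = 200, seed: int = 0):
    """Return (best placement, best area) found by simulated annealing."""
    rng = random.Random(seed)
    # Legal starting point: modules side by side in one row.
    current, x = [], 0
    for w, _ in modules:
        current.append((x, 0))
        x += w
    cost = bounding_area(current, modules)
    best, best_cost = list(current), cost
    t = t_start
    while t > t_end:
        for _ in range(moves_per_t):
            # Move: relocate one random module to a random grid position.
            i = rng.randrange(len(modules))
            w, h = modules[i]
            candidate = list(current)
            candidate[i] = (rng.randint(0, grid - w), rng.randint(0, grid - h))
            if not is_legal(candidate, modules):
                continue  # overlapping placements are rejected outright
            new_cost = bounding_area(candidate, modules)
            # Accept improvements always, worse moves with Boltzmann probability.
            if new_cost <= cost or rng.random() < math.exp((cost - new_cost) / t):
                current, cost = candidate, new_cost
                if cost < best_cost:
                    best, best_cost = list(current), cost
        t *= alpha  # geometric cooling
    return best, best_cost


MODULES = [(4, 2), (3, 3), (2, 5), (4, 4)]  # hypothetical module sizes
best, best_cost = anneal(MODULES)
print(best, best_cost)
```

Rejecting overlapping moves keeps every visited placement legal. An alternative design would allow overlaps and penalize them in the cost, which explores more freely but needs a final legalization step.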

preprint · 2007 · arXiv

Rapid Generation of Thermal-Safe Test Schedules

Overheating has been acknowledged as a major issue in testing complex SOCs. Several power constrained system-level DFT solutions (power constrained test scheduling) have recently been proposed to tackle this problem. However, as shown in this paper, imposing a chip-level maximum power constraint does not necessarily avoid local overheating due to the non-uniform distribution of power across the chip. This paper proposes a new approach for dealing with overheating during test, by embedding thermal awareness into test scheduling. The proposed approach facilitates rapid generation of thermal-safer test schedules without requiring time-consuming thermal simulations. This is achieved by employing a low-complexity test session thermal model used to guide the test schedule generation algorithm. This approach reduces the chances of a design re-spin due to potential overheating during test.

preprint · 2007 · arXiv

Test Planning for Mixed-Signal SOCs with Wrapped Analog Cores

Many SOCs today contain both digital and analog embedded cores. Even though the test cost for such mixed-signal SOCs is significantly higher than that for digital SOCs, most prior research in this area has focused exclusively on digital cores. We propose a low-cost test development methodology for mixed-signal SOCs that allows the analog and digital cores to be tested in a unified manner, thereby minimizing the overall test cost. The analog cores in the SOC are wrapped such that they can be accessed using a digital test access mechanism (TAM). We evaluate the impact of the use of analog test wrappers on area overhead and test time. To reduce area overhead, we present an analog test wrapper optimization technique, which is then combined with TAM optimization in a cost-oriented heuristic approach for test scheduling. We also demonstrate the feasibility of using analog wrappers by presenting transistor-level simulations for an analog wrapper and a representative core. We present experimental results on test scheduling for an ITC'02 benchmark SOC that has been augmented with five analog cores.

preprint · 2007 · arXiv

Yield Enhancement of Digital Microfluidics-Based Biochips Using Space Redundancy and Local Reconfiguration

As microfluidics-based biochips become more complex, manufacturing yield will have significant influence on production volume and product cost. We propose an interstitial redundancy approach to enhance the yield of biochips that are based on droplet-based microfluidics. In this design method, spare cells are placed in the interstitial sites within the microfluidic array, and they replace neighboring faulty cells via local reconfiguration. The proposed design method is evaluated using a set of concurrent real-life bioassays.
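Local reconfiguration can be viewed as an assignment problem: each faulty cell needs its own neighboring spare. The sketch below checks whether such an assignment exists using a standard augmenting-path bipartite matching. This is one plausible way to model the repair step, not necessarily the paper's procedure; the cell names and the adjacency map are hypothetical inputs.

```python
from typing import Optional


def assign_spares(faulty_to_spares: dict[str, list[str]]) -> Optional[dict[str, str]]:
    """Give every faulty cell a distinct adjacent spare, or return None.

    faulty_to_spares maps each faulty cell to the spare cells next to it.
    """
    spare_owner: dict[str, str] = {}  # spare cell -> faulty cell it replaces

    def try_assign(cell: str, visited: set[str]) -> bool:
        for spare in faulty_to_spares[cell]:
            if spare in visited:
                continue
            visited.add(spare)
            # Take a free spare, or re-route the current owner elsewhere.
            if spare not in spare_owner or try_assign(spare_owner[spare], visited):
                spare_owner[spare] = cell
                return True
        return False

    for cell in faulty_to_spares:
        if not try_assign(cell, set()):
            return None  # at least one faulty cell cannot be replaced
    return {cell: spare for spare, cell in spare_owner.items()}


# Cell B grabs s1 first, then is re-routed to s2 so that A can use s1.
repairable = assign_spares({"B": ["s1", "s2"], "A": ["s1"]})
# Two faulty cells competing for a single spare cannot both be repaired.
unrepairable = assign_spares({"A": ["s1"], "B": ["s1"]})

print(repairable)
print(unrepairable)
```

Under this model a chip is usable whenever a complete assignment exists, so repeating the check over randomly sampled fault patterns would give a rough yield estimate for a given spare layout.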

