Researcher profile

S. R. Nandakumar

S. R. Nandakumar contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 15 - UnverifiedVerification L1Unclaimed author
3works
0followers
1topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

3 published item(s)

preprint2020arXiv

Accurate deep neural network inference using computational phase-change memory

In-memory computing is a promising non-von Neumann approach for making energy-efficient deep learning inference hardware. Crossbar arrays of resistive memory devices can be used to encode the network weights and perform efficient analog matrix-vector multiplications without intermediate movements of data. However, due to device variability and noise, the network needs to be trained in a specific way so that transferring the digitally trained weights to the analog resistive memory devices will not result in significant loss of accuracy. Here, we introduce a methodology to train ResNet-type convolutional neural networks that results in no appreciable accuracy loss when transferring weights to in-memory computing hardware based on phase-change memory (PCM). We also propose a compensation technique that exploits the batch normalization parameters to improve the accuracy retention over time. We achieve a classification accuracy of 93.7% on the CIFAR-10 dataset and a top-1 accuracy on the ImageNet benchmark of 71.6% after mapping the trained weights to PCM. Our hardware results on CIFAR-10 with ResNet-32 demonstrate an accuracy above 93.5% retained over a one day period, where each of the 361,722 synaptic weights of the network is programmed on just two PCM devices organized in a differential configuration.

preprint2020arXiv

Bio-mimetic Synaptic Plasticity and Learning in a sub-500mV Cu/SiO$_2$/W Memristor

The computational efficiency of the human brain is believed to stem from the parallel information processing capability of neurons with integrated storage in synaptic interconnections programmed by local spike triggered learning rules such as spike timing dependent plasticity (STDP). The extremely low operating voltages (approximately $100\,$mV) used to trigger neuronal signaling and synaptic adaptation is believed to be a critical reason for the brain's power efficiency. We demonstrate the feasibility of spike triggered STDP behavior in a two-terminal Cu/SiO$_2$/W memristive device capable of operating below $500\,$mV. We analyze the state-dependent nature of conductance updates in the device to develop a phenomenological model. Using the model, we evaluate the potential of such devices to generate precise spike times under supervised learning conditions and classify handwritten digits from the MNIST dataset in an unsupervised learning setting. The results form a promising step towards creating a low power synaptic device capable of on-chip learning.

preprint2020arXiv

Mixed-precision deep learning based on computational memory

Deep neural networks (DNNs) have revolutionized the field of artificial intelligence and have achieved unprecedented success in cognitive tasks such as image and speech recognition. Training of large DNNs, however, is computationally intensive and this has motivated the search for novel computing architectures targeting this application. A computational memory unit with nanoscale resistive memory devices organized in crossbar arrays could store the synaptic weights in their conductance states and perform the expensive weighted summations in place in a non-von Neumann manner. However, updating the conductance states in a reliable manner during the weight update process is a fundamental challenge that limits the training accuracy of such an implementation. Here, we propose a mixed-precision architecture that combines a computational memory unit performing the weighted summations and imprecise conductance updates with a digital processing unit that accumulates the weight updates in high precision. A combined hardware/software training experiment of a multilayer perceptron based on the proposed architecture using a phase-change memory (PCM) array achieves 97.73% test accuracy on the task of classifying handwritten digits (based on the MNIST dataset), within 0.6% of the software baseline. The architecture is further evaluated using accurate behavioral models of PCM on a wide class of networks, namely convolutional neural networks, long-short-term-memory networks, and generative-adversarial networks. Accuracies comparable to those of floating-point implementations are achieved without being constrained by the non-idealities associated with the PCM devices. A system-level study demonstrates 173x improvement in energy efficiency of the architecture when used for training a multilayer perceptron compared with a dedicated fully digital 32-bit implementation.