Source author record

Brian D. Hoskins

Brian D. Hoskins appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Emerging Technologies, Machine Learning, cond-mat.dis-nn, cond-mat.mes-hall, cond-mat.mtrl-sci, physics.app-ph

Catalog footprint

What is connected

3 works

6 topics

4 close collaborators

preprint · 2022 · arXiv

Implementation of a Binary Neural Network on a Passive Array of Magnetic Tunnel Junctions

The increasing scale of neural networks and their growing application space have produced demand for more energy- and memory-efficient artificial-intelligence-specific hardware. Avenues to mitigate the main issue, the von Neumann bottleneck, include in-memory and near-memory architectures, as well as algorithmic approaches. Here we leverage the low-power, inherently binary operation of magnetic tunnel junctions (MTJs) to demonstrate neural network hardware inference based on passive arrays of MTJs. In general, transferring a trained network model to hardware for inference is confronted by degradation in performance due to device-to-device variations, write errors, parasitic resistance, and nonidealities in the substrate. To quantify the effect of these hardware realities, we benchmark 300 unique weight matrix solutions of a 2-layer perceptron to classify the Wine dataset for both classification accuracy and write fidelity. Despite device imperfections, we achieve software-equivalent accuracy of up to 95.3% with proper tuning of network parameters in 15 × 15 MTJ arrays having a range of device sizes. The success of this tuning process shows that new metrics are needed to characterize the performance and quality of networks reproduced in mixed-signal hardware.
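The kind of inference described in this abstract can be illustrated with a small software sketch. This is not the authors' implementation: the hidden-layer width of 15, the sign activation, and the Gaussian multiplicative variation model in `with_variation` are all assumptions chosen for illustration, and the inputs are random rather than the Wine dataset.

```python
import numpy as np

def binarize(w):
    """Map real values to {-1, +1}; zeros are sent to +1."""
    return np.where(w >= 0, 1.0, -1.0)

def forward(x, w1, w2):
    """Two-layer perceptron: sign activation in the hidden layer, argmax readout."""
    hidden = binarize(x @ w1)
    return np.argmax(hidden @ w2, axis=1)

def with_variation(w_bin, sigma, rng):
    """Hypothetical device model: scale each binary weight by a Gaussian factor."""
    return w_bin * (1.0 + sigma * rng.standard_normal(w_bin.shape))

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 13))                # 8 random samples, 13 features
w1 = binarize(rng.standard_normal((13, 15)))    # assumed hidden width of 15
w2 = binarize(rng.standard_normal((15, 3)))     # 3 output classes

ideal = forward(x, w1, w2)
noisy = forward(x, with_variation(w1, 0.1, rng), with_variation(w2, 0.1, rng))
agreement = float(np.mean(ideal == noisy))      # fraction of matching predictions
```

Comparing `ideal` with `noisy` over many weight matrices is one simple way to see how device variation can change classification results, in the spirit of the benchmarking the abstract describes.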

preprint · 2020 · arXiv

Memory-efficient training with streaming dimensionality reduction

The movement of large quantities of data during the training of a deep neural network presents immense challenges for machine learning workloads. To minimize this overhead, especially on the movement and calculation of gradient information, we introduce streaming batch principal component analysis as an update algorithm. Streaming batch principal component analysis uses stochastic power iterations to generate a stochastic k-rank approximation of the network gradient. We demonstrate that the low-rank updates produced by streaming batch principal component analysis can effectively train convolutional neural networks on a variety of common datasets, with performance comparable to standard mini-batch gradient descent. These results can lead to both improvements in the design of application-specific integrated circuits for deep learning and in the speed of synchronization of machine learning models trained with data parallelism.
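As a rough illustration of the idea of power iterations over a stream, the sketch below refines a rank-k subspace one matrix at a time and uses it to form a low-rank approximation. It is a generic block power iteration, not the streaming batch principal component analysis algorithm from the paper; the function names, the matrix sizes, and the synthetic noisy "gradients" are assumptions.

```python
import numpy as np

def streaming_subspace(batches, k, rng):
    """Refine an orthonormal rank-k basis with one power step per streamed matrix."""
    q = None
    for g in batches:
        if q is None:
            # Random orthonormal starting basis with k columns
            q, _ = np.linalg.qr(rng.standard_normal((g.shape[0], k)))
        # Power step on this batch, then re-orthonormalize: Q <- orth(G G^T Q)
        q, _ = np.linalg.qr(g @ (g.T @ q))
    return q

def project(g, q):
    """Rank-k approximation of g obtained by projecting onto the basis q."""
    return q @ (q.T @ g)

rng = np.random.default_rng(0)
# Synthetic stand-in for per-batch gradients: a rank-2 matrix plus small noise
base = rng.standard_normal((20, 2)) @ rng.standard_normal((2, 30))
batches = [base + 0.01 * rng.standard_normal(base.shape) for _ in range(5)]

q = streaming_subspace(batches, k=2, rng=rng)
rel_err = np.linalg.norm(base - project(base, q)) / np.linalg.norm(base)
```

Only the 20 × 2 basis and the 2 × 30 projected factor need to be stored or communicated, rather than the full 20 × 30 matrix. That reduction is the kind of data-movement saving the abstract points to.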

preprint · 2015 · arXiv

Three-Dimensional Stateful Material Implication Logic

Monolithic three-dimensional integration of memory and logic circuits could dramatically improve performance and energy efficiency of computing systems. Some conventional and emerging memories are suitable for vertical integration, including highly scalable metal-oxide resistive switching devices (memristors), yet integration of logic circuits proves to be much more challenging. Here we demonstrate memory and logic functionality in a monolithic three-dimensional circuit by adapting recently proposed memristor-based stateful material implication logic. Though such logic has already been implemented with a variety of memory devices, prohibitively large device variability in the most prospective memristor-based circuits has limited experimental demonstrations to simple gates and just a few cycles of operations. By developing a low-temperature, low-variability fabrication process, and modifying the original circuit to increase its robustness to device imperfections, we experimentally show, for the first time, reliable multi-cycle multi-gate material implication logic operation within a three-dimensional stack of monolithically integrated memristors. The direct data manipulation in three dimensions enables extremely compact and high-throughput logic-in-memory computing and, remarkably, presents a viable solution for the Feynman Grand Challenge of implementing an 8-bit adder at the nanoscale.
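For readers unfamiliar with material implication (IMP), the sketch below models it at the level of Boolean logic. It shows the well-known construction of NAND from a reset step and two IMP steps on a working bit. It is a truth-table illustration only, not a model of the three-dimensional circuit or the modified scheme described in the abstract, and the function names are assumptions.

```python
def imp(p, q):
    """Material implication: the new value of q is (NOT p) OR q."""
    return (not p) or q

def nand_via_imp(p, q):
    """NAND built from a reset and two IMP steps on a working bit s."""
    s = False        # reset the working bit to logic 0
    s = imp(p, s)    # s now holds NOT p
    s = imp(q, s)    # s now holds (NOT q) OR (NOT p), i.e. NAND(p, q)
    return s

# Truth table for NAND computed through IMP steps
table = {(p, q): nand_via_imp(p, q) for p in (False, True) for q in (False, True)}
```

Because NAND is functionally complete, a sequence of such steps can in principle express any Boolean function. In stateful logic the result of each step is stored in the same device that holds `s`, which is why this style of computation is described as logic-in-memory.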