Source author record

Sorin Cotofana

Sorin Cotofana appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

cond-mat.mes-hall Distributed, Parallel, and Cluster Computing eess.SP Neural and Evolutionary Computing

Catalog footprint

What is connected

2works

4topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2022arXiv

Would Magnonic Circuits Outperform CMOS Counterparts?

In the early stages of a novel technology development, it is difficult to provide a comprehensive assessment of its potential capabilities and impact. Nevertheless, some preliminary estimates can be drawn and are certainly of great interest and in this paper we follow this line of reasoning within the framework of the Spin Wave (SW) computing paradigm. In particular, we are interested in assessing the technological development horizon that needs to be reached in order to unleash the full SW paradigm potential such that SW circuits can outperform CMOS counterparts in terms of energy consumption. In view of the zero power SWs propagation through ferromagnetic waveguides, the overall SW circuit power consumption is determined by the one associated to SWs generation and sensing by means of transducers. While current antenna based transducers are clearly power hungry recent developments indicate that magneto-electric (ME) cells have a great potential for ultra-low power SW generation and sensing. Given that MEs have been only proposed at the conceptual level and no actual experimental demonstration has been reported we cannot evaluate the impact of their utilization on the SW circuit energy consumption. However, we can perform a reverse engineering alike analysis to determine ME delay and power consumption upper bounds that can place SW circuits in the leading position. To this end, we utilize a 32-bit Brent-Kung Adder (BKA) as discussion vehicle and compute the maximum ME delay and power consumption that could potentially enable a SW implementation able to outperform its 7nm CMOS counterpart. We evaluate different BKA SW implementations that rely on conversion or normalization gate cascading and consider continuous or pulsed SW generation scenarios. 31nW is the maximum transducer power consumption for which a 32-bit BKA SW implementation can outperform its 7nm CMOS counterpart.

preprint2020arXiv

Evolutionary Bin Packing for Memory-Efficient Dataflow Inference Acceleration on FPGA

Convolutional neural network (CNN) dataflow inference accelerators implemented in Field Programmable Gate Arrays (FPGAs) have demonstrated increased energy efficiency and lower latency compared to CNN execution on CPUs or GPUs. However, the complex shapes of CNN parameter memories do not typically map well to FPGA on-chip memories (OCM), which results in poor OCM utilization and ultimately limits the size and types of CNNs which can be effectively accelerated on FPGAs. In this work, we present a design methodology that improves the mapping efficiency of CNN parameters to FPGA OCM. We frame the mapping as a bin packing problem and determine that traditional bin packing algorithms are not well suited to solve the problem within FPGA- and CNN-specific constraints. We hybridize genetic algorithms and simulated annealing with traditional bin packing heuristics to create flexible mappers capable of grouping parameter memories such that each group optimally fits FPGA on-chip memories. We evaluate these algorithms on a variety of FPGA inference accelerators. Our hybrid mappers converge to optimal solutions in a matter of seconds for all CNN use-cases, achieve an increase of up to 65% in OCM utilization efficiency for deep CNNs, and are up to 200$\times$ faster than current state-of-the-art simulated annealing approaches.