Source author record

Georgios Zervakis

Georgios Zervakis appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Hardware Architecture Machine Learning

Catalog footprint

What is connected

7works

2topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2022arXiv

Approximate Decision Trees For Machine Learning Classification on Tiny Printed Circuits

Although Printed Electronics (PE) cannot compete with silicon-based systems in conventional evaluation metrics, e.g., integration density, area and performance, PE offers attractive properties such as on-demand ultra-low-cost fabrication, flexibility and non-toxicity. As a result, it targets application domains that are untouchable by lithography-based silicon electronics and thus have not yet seen much proliferation of computing. However, despite the attractive characteristics of PE, the large feature sizes in PE prohibit the realization of complex printed circuits, such as Machine Learning (ML) classifiers. In this work, we exploit the hardware-friendly nature of Decision Trees for machine learning classification and leverage the hardware-efficiency of the approximate design in order to generate approximate ML classifiers that are suitable for tiny, ultra-resource constrained, and battery-powered printed applications.

preprint2022arXiv

Automated Design Approximation to Overcome Circuit Aging

Transistor aging phenomena manifest themselves as degradations in the main electrical characteristics of transistors. Over time, they result in a significant increase of cell propagation delay, leading to errors due to timing violations, since the operating frequency becomes unsustainable as the circuit ages. Conventional techniques employ timing guardbands to mitigate aging-induced delay increase, which leads to considerable performance losses from the beginning of the circuit's lifetime. Leveraging the inherent error resilience of a vast number of application domains, approximate computing was recently introduced as an aging mitigation mechanism. In this work, we present the first automated framework for generating aging-aware approximate circuits. Our framework, by applying directed gate-level netlist approximation, induces a small functional error and recovers the delay degradation due to aging. As a result, our optimized circuits eliminate aging-induced timing errors. Experimental evaluation over a variety of arithmetic circuits and image processing benchmarks demonstrates that for an average error of merely $5\times10^{-3}$, our framework completely eliminates aging-induced timing guardbands. Compared to the respective baseline circuits without timing guardbands (i.e., iso-performance evaluation), the error of the circuits generated by our framework is $1208$x smaller.

preprint2022arXiv

Energy-efficient DNN Inference on Approximate Accelerators Through Formal Property Exploration

Deep Neural Networks (DNNs) are being heavily utilized in modern applications and are putting energy-constraint devices to the test. To bypass high energy consumption issues, approximate computing has been employed in DNN accelerators to balance out the accuracy-energy reduction trade-off. However, the approximation-induced accuracy loss can be very high and drastically degrade the performance of the DNN. Therefore, there is a need for a fine-grain mechanism that would assign specific DNN operations to approximation in order to maintain acceptable DNN accuracy, while also achieving low energy consumption. In this paper, we present an automated framework for weight-to-approximation mapping enabling formal property exploration for approximate DNN accelerators. At the MAC unit level, our experimental evaluation surpassed already energy-efficient mappings by more than $\times2$ in terms of energy gains, while also supporting significantly more fine-grain control over the introduced approximation.

preprint2022arXiv

Hardware Approximate Techniques for Deep Neural Network Accelerators: A Survey

Deep Neural Networks (DNNs) are very popular because of their high performance in various cognitive tasks in Machine Learning (ML). Recent advancements in DNNs have brought beyond human accuracy in many tasks, but at the cost of high computational complexity. To enable efficient execution of DNN inference, more and more research works, therefore, exploit the inherent error resilience of DNNs and employ Approximate Computing (AC) principles to address the elevated energy demands of DNN accelerators. This article provides a comprehensive survey and analysis of hardware approximation techniques for DNN accelerators. First, we analyze the state of the art and by identifying approximation families, we cluster the respective works with respect to the approximation type. Next, we analyze the complexity of the performed evaluations (with respect to the dataset and DNN size) to assess the efficiency, the potential, and limitations of approximate DNN accelerators. Moreover, a broad discussion is provided, regarding error metrics that are more suitable for designing approximate units for DNN accelerators as well as accuracy recovery approaches that are tailored to DNN inference. Finally, we present how Approximate Computing for DNN accelerators can go beyond energy efficiency and address reliability and security issues, as well.

preprint2021arXiv

Control Variate Approximation for DNN Accelerators

In this work, we introduce a control variate approximation technique for low error approximate Deep Neural Network (DNN) accelerators. The control variate technique is used in Monte Carlo methods to achieve variance reduction. Our approach significantly decreases the induced error due to approximate multiplications in DNN inference, without requiring time-exhaustive retraining compared to state-of-the-art. Leveraging our control variate method, we use highly approximated multipliers to generate power-optimized DNN accelerators. Our experimental evaluation on six DNNs, for Cifar-10 and Cifar-100 datasets, demonstrates that, compared to the accurate design, our control variate approximation achieves same performance and 24% power reduction for a merely 0.16% accuracy loss.

preprint2021arXiv

Positive/Negative Approximate Multipliers for DNN Accelerators

Recent Deep Neural Networks (DNNs) managed to deliver superhuman accuracy levels on many AI tasks. Several applications rely more and more on DNNs to deliver sophisticated services and DNN accelerators are becoming integral components of modern systems-on-chips. DNNs perform millions of arithmetic operations per inference and DNN accelerators integrate thousands of multiply-accumulate units leading to increased energy requirements. Approximate computing principles are employed to significantly lower the energy consumption of DNN accelerators at the cost of some accuracy loss. Nevertheless, recent research demonstrated that complex DNNs are increasingly sensitive to approximation. Hence, the obtained energy savings are often limited when targeting tight accuracy constraints. In this work, we present a dynamically configurable approximate multiplier that supports three operation modes, i.e., exact, positive error, and negative error. In addition, we propose a filter-oriented approximation method to map the weights to the appropriate modes of the approximate multiplier. Our mapping algorithm balances the positive with the negative errors due to the approximate multiplications, aiming at maximizing the energy reduction while minimizing the overall convolution error. We evaluate our approach on multiple DNNs and datasets against state-of-the-art approaches, where our method achieves 18.33% energy gains on average across 7 NNs on 4 different datasets for a maximum accuracy drop of only 1%.

preprint2021arXiv

Reliability-Aware Quantization for Anti-Aging NPUs

Transistor aging is one of the major concerns that challenges designers in advanced technologies. It profoundly degrades the reliability of circuits during its lifetime as it slows down transistors resulting in errors due to timing violations unless large guardbands are included, which leads to considerable performance losses. When it comes to Neural Processing Units (NPUs), where increasing the inference speed is the primary goal, such performance losses cannot be tolerated. In this work, we are the first to propose a reliability-aware quantization to eliminate aging effects in NPUs while completely removing guardbands. Our technique delivers a graceful inference accuracy degradation over time while compensating for the aging-induced delay increase of the NPU. Our evaluation, over ten state-of-the-art neural network architectures trained on the ImageNet dataset, demonstrates that for an entire lifetime of 10 years, the average accuracy loss is merely 3%. In the meantime, our technique achieves 23% higher performance due to the elimination of the aging guardband.

Georgios Zervakis

What is connected

Connect this record

See the researcher in context

Building this map preview

7 published item(s)

Approximate Decision Trees For Machine Learning Classification on Tiny Printed Circuits

Automated Design Approximation to Overcome Circuit Aging

Energy-efficient DNN Inference on Approximate Accelerators Through Formal Property Exploration

Hardware Approximate Techniques for Deep Neural Network Accelerators: A Survey

Control Variate Approximation for DNN Accelerators

Positive/Negative Approximate Multipliers for DNN Accelerators

Reliability-Aware Quantization for Anti-Aging NPUs