Researcher profile

Xiaozhe Gu

Xiaozhe Gu contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 19 - UnverifiedVerification L1Unclaimed author
5works
0followers
8topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

5 published item(s)

preprint2022arXiv

A Resource-efficient Spiking Neural Network Accelerator Supporting Emerging Neural Encoding

Spiking neural networks (SNNs) recently gained momentum due to their low-power multiplication-free computing and the closer resemblance of biological processes in the nervous system of humans. However, SNNs require very long spike trains (up to 1000) to reach an accuracy similar to their artificial neural network (ANN) counterparts for large models, which offsets efficiency and inhibits its application to low-power systems for real-world use cases. To alleviate this problem, emerging neural encoding schemes are proposed to shorten the spike train while maintaining the high accuracy. However, current accelerators for SNN cannot well support the emerging encoding schemes. In this work, we present a novel hardware architecture that can efficiently support SNN with emerging neural encoding. Our implementation features energy and area efficient processing units with increased parallelism and reduced memory accesses. We verified the accelerator on FPGA and achieve 25% and 90% improvement over previous work in power consumption and latency, respectively. At the same time, high area efficiency allows us to scale for large neural network models. To the best of our knowledge, this is the first work to deploy the large neural network model VGG on physical FPGA-based neuromorphic hardware.

preprint2021arXiv

E3NE: An End-to-End Framework for Accelerating Spiking Neural Networks with Emerging Neural Encoding on FPGAs

Compiler frameworks are crucial for the widespread use of FPGA-based deep learning accelerators. They allow researchers and developers, who are not familiar with hardware engineering, to harness the performance attained by domain-specific logic. There exists a variety of frameworks for conventional artificial neural networks. However, not much research effort has been put into the creation of frameworks optimized for spiking neural networks (SNNs). This new generation of neural networks becomes increasingly interesting for the deployment of AI on edge devices, which have tight power and resource constraints. Our end-to-end framework E3NE automates the generation of efficient SNN inference logic for FPGAs. Based on a PyTorch model and user parameters, it applies various optimizations and assesses trade-offs inherent to spike-based accelerators. Multiple levels of parallelism and the use of an emerging neural encoding scheme result in an efficiency superior to previous SNN hardware implementations. For a similar model, E3NE uses less than 50% of hardware resources and 20% less power, while reducing the latency by an order of magnitude. Furthermore, scalability and generality allowed the deployment of the large-scale SNN models AlexNet and VGG.

preprint2020arXiv

Dynamic Budget Management with Service Guarantees for Mixed-Criticality Systems

Many existing studies on mixed-criticality (MC) scheduling assume that low-criticality budgets for high-criticality applications are known apriori. These budgets are primarily used as guidance to determine when the scheduler should switch the system mode from low to high. Based on this key observation, in this paper we propose a dynamic MC scheduling model under which low-criticality budgets for individual high-criticality applications are determined at runtime as opposed to being fixed offline. To ensure sufficient budget for high-criticality applications at all times, we use offline schedulability analysis to determine a system-wide total low-criticality budget allocation for all the high-criticality applications combined. This total budget is used as guidance in our model to determine the need for a mode-switch. The runtime strategy then distributes this total budget among the various applications depending on their execution requirement and with the objective of postponing mode-switch as much as possible. We show that this runtime strategy is able to postpone mode-switches for a longer time than any strategy that uses a fixed low-criticality budget allocation for each application. Finally, since we are able to control the total budget allocation for high-criticality applications before mode-switch, we also propose techniques to determine these budgets considering system-wide objectives such as schedulability and service guarantee for low-criticality applications.

preprint2020arXiv

Efficient Schedulability Test for Dynamic-Priority Scheduling of Mixed-Criticality Real-Time Systems

Systems in many safety-critical application domains are subject to certification requirements. In such a system, there are typically different applications providing functionalities that have varying degrees of criticality. Consequently, the certification requirements for functionalities at these different criticality levels are also varying, with very high levels of assurance required for a highly critical functionality, whereas relatively low levels of assurance required for a less critical functionality. Considering the timing assurance given to various applications in the form of guaranteed budgets within deadlines, a theory of real-time scheduling for such multi-criticality systems has been under development in the recent past. In particular, an algorithm called Earliest Deadline First with Virtual Deadlines (EDF-VD) has shown a lot of promise for systems with two criticality levels, especially in terms of practical performance demonstrated through experiment results. In this paper we design a new schedulability test for EDF-VD that extend these performance benefits to multi-criticality systems. We propose a new test based on demand bound functions and also present a novel virtual deadline assignment strategy. Through extensive experiments we show that the proposed technique significantly outperforms existing strategies for a variety of generic real-time systems.

preprint2020arXiv

Resource Efficient Isolation Mechanisms in Mixed-Criticality Scheduling

Mixed-criticality real-time scheduling has been developed to improve resource utilization while guaranteeing safe execution of critical applications. These studies use optimistic resource reservation for all the applications to improve utilization, but prioritize critical applications when the reservations become insufficient at runtime. Many of them however share an impractical assumption that all the critical applications will simultaneously demand additional resources. As a consequence, they under-utilize resources by penalizing all the low-criticality applications. In this paper we overcome this shortcoming using a novel mechanism that comprises a parameter to model the expected number of critical applications simultaneously demanding more resources, and an execution strategy based on the parameter to improve resource utilization. Since most mixed-criticality systems in practice are component-based, we design our mechanism such that the component boundaries provide the isolation necessary to support the execution of low-criticality applications, and at the same time protect the critical ones. We also develop schedulability tests for the proposed mechanism under both a flat as well as a hierarchical scheduling framework. Finally, through simulations, we compare the performance of the proposed approach with existing studies in terms of schedulability and the capability to support low-criticality applications.