Source author record

Steven Furgurson

Steven Furgurson appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Computer Vision, eess.IV, Hardware Architecture, Machine Learning

Catalog footprint

What is connected: 2 works, 4 topics, 4 close collaborators

Preprint, 2020, arXiv

DeepDive: An Integrative Algorithm/Architecture Co-Design for Deep Separable Convolutional Neural Networks

Deep Separable Convolutional Neural Networks (DSCNNs) have become the emerging paradigm by offering modular networks with structural sparsity in order to achieve higher accuracy with relatively fewer operations and parameters. However, there is a lack of customized architectures that can provide flexible solutions that fit the sparsity of DSCNNs. This paper introduces DeepDive, a fully functional, vertical co-design framework for power-efficient implementation of DSCNNs on edge FPGAs. DeepDive's architecture provides the heterogeneous Compute Units (CUs) needed to fully support DSCNNs with various convolutional operators interconnected with structural sparsity. It offers FPGA-aware training and online quantization combined with modular synthesizable C++ CUs, customized for DSCNNs. The execution results on Xilinx's ZCU102 FPGA board demonstrate 47.4 and 233.3 FPS/Watt for MobileNet-V2 and a compact version of EfficientNet, respectively, as two state-of-the-art depthwise separable CNNs. These results show that DeepDive improves FPS/Watt by 2.2× and 1.51× over Jetson Nano high and low power modes, respectively. It also improves FPS/Watt by about 2.27× and 37.25× over two other FPGA implementations. The DeepDive output for MobileNetV2 is available at https://github.com/TeCSAR-UNCC/DeepDive.
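
To make the "fewer operations and parameters" claim concrete, here is a minimal sketch that compares the weight count of a standard convolution with that of a depthwise separable one (a depthwise k×k layer followed by a 1×1 pointwise layer). The layer sizes and function names are illustrative assumptions, not taken from the DeepDive paper or its repository, and biases are ignored.

```python
def standard_conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution (biases ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(k, c_in, c_out):
    """Weights in a depthwise k x k conv plus a 1 x 1 pointwise conv (biases ignored)."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 conv that mixes channels
    return depthwise + pointwise


# Hypothetical layer shape, chosen only for illustration.
k, c_in, c_out = 3, 32, 64

std = standard_conv_params(k, c_in, c_out)
dsc = depthwise_separable_params(k, c_in, c_out)
ratio = dsc / std  # algebraically equal to 1/c_out + 1/k**2

print(f"standard: {std}, depthwise separable: {dsc}, ratio: {ratio:.3f}")
```

For this example shape the separable layer needs roughly one eighth of the weights of the standard layer. The same ratio applies to multiply-accumulate counts at a fixed output resolution.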

Preprint, 2020, arXiv

EfficientHRNet: Efficient Scaling for Lightweight High-Resolution Multi-Person Pose Estimation

There is an increasing demand for lightweight multi-person pose estimation for many emerging smart IoT applications. However, the existing algorithms tend to have large model sizes and intense computational requirements, making them ill-suited for real-time applications and deployment on resource-constrained hardware. Lightweight and real-time approaches are exceedingly rare and come at the cost of inferior accuracy. In this paper, we present EfficientHRNet, a family of lightweight multi-person human pose estimators that are able to perform in real-time on resource-constrained devices. By unifying recent advances in model scaling with high-resolution feature representations, EfficientHRNet creates highly accurate models while reducing computation enough to achieve real-time performance. The largest model is able to come within 4.4% accuracy of the current state-of-the-art, while having 1/3 the model size and 1/6 the computation, achieving 23 FPS on Nvidia Jetson Xavier. Compared to the top real-time approach, EfficientHRNet increases accuracy by 22% while achieving similar FPS with 1/3 the power. At every level, EfficientHRNet proves to be more computationally efficient than other bottom-up 2D human pose estimation approaches, while achieving highly competitive accuracy.
