Source author record

Mohammadreza Baharani

Mohammadreza Baharani appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Machine Learning Hardware Architecture Robotics

Catalog footprint

What is connected

3works

3topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2022arXiv

ATCN: Resource-Efficient Processing of Time Series on Edge

This paper presents a scalable deep learning model called Agile Temporal Convolutional Network (ATCN) for high-accurate fast classification and time series prediction in resource-constrained embedded systems. ATCN is a family of compact networks with formalized hyperparameters that enable application-specific adjustments to be made to the model architecture. It is primarily designed for embedded edge devices with very limited performance and memory, such as wearable biomedical devices and real-time reliability monitoring systems. ATCN makes fundamental improvements over the mainstream temporal convolutional neural networks, including residual connections to increase the network depth and accuracy, and the incorporation of separable depth-wise convolution to reduce the computational complexity of the model. As part of the present work, two ATCN families, namely T0, and T1 are also presented and evaluated on different ranges of embedded processors - Cortex-M7 and Cortex-A57 processor. An evaluation of the ATCN models against the best-in-class InceptionTime and MiniRocket shows that ATCN almost maintains accuracy while improving the execution time on a broad range of embedded and cyber-physical applications with demand for real-time processing on the embedded edge. At the same time, in contrast to existing solutions, ATCN is the first time-series classifier based on deep learning that can be run bare-metal on embedded microcontrollers (Cortex-M7) with limited computational performance and memory capacity while delivering state-of-the-art accuracy.

preprint2022arXiv

DeepTrack: Lightweight Deep Learning for Vehicle Path Prediction in Highways

Vehicle trajectory prediction is essential for enabling safety-critical intelligent transportation systems (ITS) applications used in management and operations. While there have been some promising advances in the field, there is a need for modern deep learning algorithms that allow real-time trajectory prediction on embedded IoT devices. This article presents DeepTrack, a novel deep learning algorithm customized for real-time vehicle trajectory prediction and monitoring applications in arterial management, freeway management, traffic incident management, and work zone management for high-speed incoming traffic. In contrast to previous methods, the vehicle dynamics are encoded using Temporal Convolutional Networks (TCNs) to provide more robust time prediction with less computation. DeepTrack also uses depthwise convolution, which reduces the complexity of models compared to existing approaches in terms of model size and operations. Overall, our experimental results demonstrate that DeepTrack achieves comparable accuracy to state-of-the-art trajectory prediction models but with smaller model sizes and lower computational complexity, making it more suitable for real-world deployment.

preprint2020arXiv

DeepDive: An Integrative Algorithm/Architecture Co-Design for Deep Separable Convolutional Neural Networks

Deep Separable Convolutional Neural Networks (DSCNNs) have become the emerging paradigm by offering modular networks with structural sparsity in order to achieve higher accuracy with relatively lower operations and parameters. However, there is a lack of customized architectures that can provide flexible solutions that fit the sparsity of the DSCNNs. This paper introduces DeepDive, which is a fully-functional, vertical co-design framework, for power-efficient implementation of DSCNNs on edge FPGAs. DeepDive's architecture supports crucial heterogeneous Compute Units (CUs) to fully support DSCNNs with various convolutional operators interconnected with structural sparsity. It offers an FPGA-aware training and online quantization combined with modular synthesizable C++ CUs, customized for DSCNNs. The execution results on Xilinx's ZCU102 FPGA board, demonstrate 47.4 and 233.3 FPS/Watt for MobileNet-V2 and a compact version of EfficientNet, respectively, as two state-of-the-art depthwise separable CNNs. These comparisons showcase how DeepDive improves FPS/Watt by 2.2$\times$ and 1.51$\times$ over Jetson Nano high and low power modes, respectively. It also enhances FPS/Watt about 2.27$\times$ and 37.25$\times$ over two other FPGA implementations. The DeepDive output for MobileNetV2 is available at https://github.com/TeCSAR-UNCC/DeepDive.