Source author record

Tulika Mitra

Tulika Mitra appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Distributed, Parallel, and Cluster Computing Machine Learning Computer Vision Cryptography and Security Networking and Internet Architecture Performance

Catalog footprint

What is connected

5works

6topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2021arXiv

Preventing Catastrophic Forgetting and Distribution Mismatch in Knowledge Distillation via Synthetic Data

With the increasing popularity of deep learning on edge devices, compressing large neural networks to meet the hardware requirements of resource-constrained devices became a significant research direction. Numerous compression methodologies are currently being used to reduce the memory sizes and energy consumption of neural networks. Knowledge distillation (KD) is among such methodologies and it functions by using data samples to transfer the knowledge captured by a large model (teacher) to a smaller one(student). However, due to various reasons, the original training data might not be accessible at the compression stage. Therefore, data-free model compression is an ongoing research problem that has been addressed by various works. In this paper, we point out that catastrophic forgetting is a problem that can potentially be observed in existing data-free distillation methods. Moreover, the sample generation strategies in some of these methods could result in a mismatch between the synthetic and real data distributions. To prevent such problems, we propose a data-free KD framework that maintains a dynamic collection of generated samples over time. Additionally, we add the constraint of matching the real data distribution in sample generation strategies that target maximum information gain. Our experiments demonstrate that we can improve the accuracy of the student models obtained via KD when compared with state-of-the-art approaches on the SVHN, Fashion MNIST and CIFAR100 datasets.

preprint2020arXiv

High-Throughput CNN Inference on Embedded ARM big.LITTLE Multi-Core Processors

IoT Edge intelligence requires Convolutional Neural Network (CNN) inference to take place in the edge devices itself. ARM big.LITTLE architecture is at the heart of prevalent commercial edge devices. It comprises of single-ISA heterogeneous cores grouped into multiple homogeneous clusters that enable power and performance trade-offs. All cores are expected to be simultaneously employed in inference to attain maximal throughput. However, high communication overhead involved in parallelization of computations from convolution kernels across clusters is detrimental to throughput. We present an alternative framework called Pipe-it that employs pipelined design to split convolutional layers across clusters while limiting parallelization of their respective kernels to the assigned cluster. We develop a performance-prediction model that utilizes only the convolutional layer descriptors to predict the execution time of each layer individually on all permitted core configurations (type and count). Pipe-it then exploits the predictions to create a balanced pipeline using an efficient design space exploration algorithm. Pipe-it on average results in a 39% higher throughput than the highest antecedent throughput.

preprint2020arXiv

IsoRAN: Isolation and Scaling for 5G RANvia User-Level Data Plane Virtualization

5G presents a unique set of challenges for cellular network architecture. The architecture needs to be versatile in order to handle a variety of use cases. While network slicing has been proposed as a way to provide such versatility, it is also important to ensure that slices do not adversely interfere with each other. In other words, isolation among network slices is needed. Additionally, the large number of use cases also implies a large number of users, making it imperative that 5G architectures scale efficiently. In this paper we propose IsoRAN, which provides isolation and scaling along with the flexibility needed for 5G architecture. In IsoRAN, users are processed by daemon threads in the Cloud Radio Access Network (CRAN) architecture. Our design allows users from different use cases to be executed, in a distributed manner, on the most efficient hardware to ensure that the Service Level Agreements (SLAs) are met while minimising power consumption. Our experiments show that IsoRAN handles users with different SLA while providing isolation to reduce interference. This increased isolation reduces the drop rate for different users from 42% to nearly 0% in some cases. Finally, we run large scale simulations on real traces to show the benefits for power consumption and cost reduction scale while increasing the number of base stations.

preprint2020arXiv

Neural Network Inference on Mobile SoCs

The ever-increasing demand from mobile Machine Learning (ML) applications calls for evermore powerful on-chip computing resources. Mobile devices are empowered with heterogeneous multi-processor Systems-on-Chips (SoCs) to process ML workloads such as Convolutional Neural Network (CNN) inference. Mobile SoCs house several different types of ML capable components on-die, such as CPU, GPU, and accelerators. These different components are capable of independently performing inference but with very different power-performance characteristics. In this article, we provide a quantitative evaluation of the inference capabilities of the different components on mobile SoCs. We also present insights behind their respective power-performance behavior. Finally, we explore the performance limit of the mobile SoCs by synergistically engaging all the components concurrently. We observe that a mobile SoC provides up to 2x improvement with parallel inference when all its components are engaged, as opposed to engaging only one component.

preprint2019arXiv

KLEESPECTRE: Detecting Information Leakage through Speculative Cache Attacks via Symbolic Execution

Spectre attacks disclosed in early 2018 expose data leakage scenarios via cache side channels. Specifically, speculatively executed paths due to branch mis-prediction may bring secret data into the cache which are then exposed via cache side channels even after the speculative execution is squashed. Symbolic execution is a well-known test generation method to cover program paths at the level of the application software. In this paper, we extend symbolic execution with modelingof cache and speculative execution. Our tool KLEESPECTRE, built on top of the KLEE symbolic execution engine, can thus provide a testing engine to check for the data leakage through cache side-channel as shown via Spectre attacks. Our symbolic cache model can verify whether the sensitive data leakage due to speculative execution can be observed by an attacker at a given program point. Our experiments show that KLEESPECTREcan effectively detect data leakage along speculatively executed paths and our cache model can further make the leakage detection much more precise.

Tulika Mitra

What is connected

Connect this record

See the researcher in context

Building this map preview

5 published item(s)

Preventing Catastrophic Forgetting and Distribution Mismatch in Knowledge Distillation via Synthetic Data

High-Throughput CNN Inference on Embedded ARM big.LITTLE Multi-Core Processors

IsoRAN: Isolation and Scaling for 5G RANvia User-Level Data Plane Virtualization

Neural Network Inference on Mobile SoCs

KLEESPECTRE: Detecting Information Leakage through Speculative Cache Attacks via Symbolic Execution