Source author record

Shubham Jain

Shubham Jain appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Machine Learning Computer Vision eess.SP Emerging Technologies Human-Computer Interaction Cryptography and Security Hardware Architecture Networking and Internet Architecture Other Computer Science

Catalog footprint

What is connected

8works

9topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2022arXiv

Adversarial Detection Avoidance Attacks: Evaluating the robustness of perceptual hashing-based client-side scanning

End-to-end encryption (E2EE) by messaging platforms enable people to securely and privately communicate with one another. Its widespread adoption however raised concerns that illegal content might now be shared undetected. Following the global pushback against key escrow systems, client-side scanning based on perceptual hashing has been recently proposed by tech companies, governments and researchers to detect illegal content in E2EE communications. We here propose the first framework to evaluate the robustness of perceptual hashing-based client-side scanning to detection avoidance attacks and show current systems to not be robust. More specifically, we propose three adversarial attacks--a general black-box attack and two white-box attacks for discrete cosine transform-based algorithms--against perceptual hashing algorithms. In a large-scale evaluation, we show perceptual hashing-based client-side scanning mechanisms to be highly vulnerable to detection avoidance attacks in a black-box setting, with more than 99.9% of images successfully attacked while preserving the content of the image. We furthermore show our attack to generate diverse perturbations, strongly suggesting that straightforward mitigation strategies would be ineffective. Finally, we show that the larger thresholds necessary to make the attack harder would probably require more than one billion images to be flagged and decrypted daily, raising strong privacy concerns. Taken together, our results shed serious doubts on the robustness of perceptual hashing-based client-side scanning mechanisms currently proposed by governments, organizations, and researchers around the world.

preprint2022arXiv

Learning-From-Disagreement: A Model Comparison and Visual Analytics Framework

With the fast-growing number of classification models being produced every day, numerous model interpretation and comparison solutions have also been introduced. For example, LIME and SHAP can interpret what input features contribute more to a classifier's output predictions. Different numerical metrics (e.g., accuracy) can be used to easily compare two classifiers. However, few works can interpret the contribution of a data feature to a classifier in comparison with its contribution to another classifier. This comparative interpretation can help to disclose the fundamental difference between two classifiers, select classifiers in different feature conditions, and better ensemble two classifiers. To accomplish it, we propose a learning-from-disagreement (LFD) framework to visually compare two classification models. Specifically, LFD identifies data instances with disagreed predictions from two compared classifiers and trains a discriminator to learn from the disagreed instances. As the two classifiers' training features may not be available, we train the discriminator through a set of meta-features proposed based on certain hypotheses of the classifiers to probe their behaviors. Interpreting the trained discriminator with the SHAP values of different meta-features, we provide actionable insights into the compared classifiers. Also, we introduce multiple metrics to profile the importance of meta-features from different perspectives. With these metrics, one can easily identify meta-features with the most complementary behaviors in two classifiers, and use them to better ensemble the classifiers. We focus on binary classification models in the financial services and advertising industry to demonstrate the efficacy of our proposed framework and visualizations.

preprint2022arXiv

RadioTransformer: A Cascaded Global-Focal Transformer for Visual Attention-guided Disease Classification

In this work, we present RadioTransformer, a novel visual attention-driven transformer framework, that leverages radiologists' gaze patterns and models their visuo-cognitive behavior for disease diagnosis on chest radiographs. Domain experts, such as radiologists, rely on visual information for medical image interpretation. On the other hand, deep neural networks have demonstrated significant promise in similar tasks even where visual interpretation is challenging. Eye-gaze tracking has been used to capture the viewing behavior of domain experts, lending insights into the complexity of visual search. However, deep learning frameworks, even those that rely on attention mechanisms, do not leverage this rich domain information. RadioTransformer fills this critical gap by learning from radiologists' visual search patterns, encoded as 'human visual attention regions' in a cascaded global-focal transformer framework. The overall 'global' image characteristics and the more detailed 'local' features are captured by the proposed global and focal modules, respectively. We experimentally validate the efficacy of our student-teacher approach for 8 datasets involving different disease classification tasks where eye-gaze data is not available during the inference phase. Code: https://github.com/bmi-imaginelab/radiotransformer.

preprint2021arXiv

TxSim:Modeling Training of Deep Neural Networks on Resistive Crossbar Systems

Resistive crossbars have attracted significant interest in the design of Deep Neural Network (DNN) accelerators due to their ability to natively execute massively parallel vector-matrix multiplications within dense memory arrays. However, crossbar-based computations face a major challenge due to a variety of device and circuit-level non-idealities, which manifest as errors in the vector-matrix multiplications and eventually degrade DNN accuracy. To address this challenge, there is a need for tools that can model the functional impact of non-idealities on DNN training and inference. Existing efforts towards this goal are either limited to inference, or are too slow to be used for large-scale DNN training. We propose TxSim, a fast and customizable modeling framework to functionally evaluate DNN training on crossbar-based hardware considering the impact of non-idealities. The key features of TxSim that differentiate it from prior efforts are: (i) It comprehensively models non-idealities during all training operations (forward propagation, backward propagation, and weight update) and (ii) it achieves computational efficiency by mapping crossbar evaluations to well-optimized BLAS routines and incorporates speedup techniques to further reduce simulation time with minimal impact on accuracy. TxSim achieves orders-of-magnitude improvement in simulation speed over prior works, and thereby makes it feasible to evaluate training of large-scale DNNs on crossbars. Our experiments using TxSim reveal that the accuracy degradation in DNN training due to non-idealities can be substantial (3%-10%) for large-scale DNNs, underscoring the need for further research in mitigation techniques. We also analyze the impact of various device and circuit-level parameters and the associated non-idealities to provide key insights that can guide the design of crossbar-based DNN training accelerators.

preprint2020arXiv

Designing Just-in-Time Detection for Gamified Fitness Frameworks

This paper presents our findings from a multi-year effort to detect motion events early using inertial sensors in real-world settings. We believe early event detection is the next step in advancing motion tracking, and can enable just-in-time interventions, particularly for mHealth applications. Our system targets strength training workouts in the fitness domain, where users perform well-defined movements for each exercise, while wearing an inertial sensor. We collect data for 20 exercises across 12 users over 26 months. We propose an algorithm to detect repetitions before they end, to allow a user to visualize movement derived metrics in real-time. We further develop a gamified approach to display this information to the user and encourage them to perform consistent movements. Participants in a feasibility study find the gamified feedback useful in improving their form. Our system can detect repetition events as early as 500 ms before it ends, which is 2x faster and more accurate than state-of-the-art trackers. We believe our approach will open exciting avenues for tracking, detection, and gamification for fitness frameworks.

preprint2020arXiv

RxNN: A Framework for Evaluating Deep Neural Networks on Resistive Crossbars

Resistive crossbars designed with non-volatile memory devices have emerged as promising building blocks for Deep Neural Network (DNN) hardware, due to their ability to compactly and efficiently realize vector-matrix multiplication (VMM), the dominant computational kernel in DNNs. However, a key challenge with resistive crossbars is that they suffer from a range of device and circuit level non-idealities such as interconnect parasitics, peripheral circuits, sneak paths, and process variations. These non-idealities can lead to errors in VMMs, eventually degrading the DNN's accuracy. It is therefore critical to study the impact of crossbar non-idealities on the accuracy of large-scale DNNs. However, this is challenging because existing device and circuit models are too slow to use in application-level evaluations. We present RxNN, a fast and accurate simulation framework to evaluate large-scale DNNs on resistive crossbar systems. RxNN splits and maps the computations involved in each DNN layer into crossbar operations, and evaluates them using a Fast Crossbar Model (FCM) that accurately captures the errors arising due to crossbar non-idealities while being four-to-five orders of magnitude faster than circuit simulation. FCM models a crossbar-based VMM operation using three stages - non-linear models for the input and output peripheral circuits (DACs and ADCs), and an equivalent non-ideal conductance matrix for the core crossbar array. We implement RxNN by extending the Caffe machine learning framework and use it to evaluate a suite of six large-scale DNNs developed for the ImageNet Challenge. Our experiments reveal that resistive crossbar non-idealities can lead to significant accuracy degradations (9.6%-32%) for these large-scale DNNs. To the best of our knowledge, this work is the first quantitative evaluation of the accuracy of large-scale DNNs on resistive crossbar based hardware.

preprint2020arXiv

TiM-DNN: Ternary in-Memory accelerator for Deep Neural Networks

The use of lower precision has emerged as a popular technique to optimize the compute and storage requirements of complex Deep Neural Networks (DNNs). In the quest for lower precision, recent studies have shown that ternary DNNs (which represent weights and activations by signed ternary values) represent a promising sweet spot, achieving accuracy close to full-precision networks on complex tasks. We propose TiM-DNN, a programmable in-memory accelerator that is specifically designed to execute ternary DNNs. TiM-DNN supports various ternary representations including unweighted {-1,0,1}, symmetric weighted {-a,0,a}, and asymmetric weighted {-a,0,b} ternary systems. The building blocks of TiM-DNN are TiM tiles -- specialized memory arrays that perform massively parallel signed ternary vector-matrix multiplications with a single access. TiM tiles are in turn composed of Ternary Processing Cells (TPCs), bit-cells that function as both ternary storage units and signed ternary multiplication units. We evaluate an implementation of TiM-DNN in 32nm technology using an architectural simulator calibrated with SPICE simulations and RTL synthesis. We evaluate TiM-DNN across a suite of state-of-the-art DNN benchmarks including both deep convolutional and recurrent neural networks. A 32-tile instance of TiM-DNN achieves a peak performance of 114 TOPs/s, consumes 0.9W power, and occupies 1.96mm2 chip area, representing a 300X and 388X improvement in TOPS/W and TOPS/mm2, respectively, compared to an NVIDIA Tesla V100 GPU. In comparison to specialized DNN accelerators, TiM-DNN achieves 55X-240X and 160X-291X improvement in TOPS/W and TOPS/mm2, respectively. Finally, when compared to a well-optimized near-memory accelerator for ternary DNNs, TiM-DNN demonstrates 3.9x-4.7x improvement in system-level energy and 3.2x-4.2x speedup, underscoring the potential of in-memory computing for ternary DNNs.

preprint2013arXiv

A customized flocking algorithm for swarms of sensors tracking a swarm of targets

Wireless mobile sensor networks (WMSNs) are groups of mobile sensing agents with multi-modal sensing capabilities that communicate over wireless networks. WMSNs have more flexibility in terms of deployment and exploration abilities over static sensor networks. Sensor networks have a wide range of applications in security and surveillance systems, environmental monitoring, data gathering for network-centric healthcare systems, monitoring seismic activities and atmospheric events, tracking traffic congestion and air pollution levels, localization of autonomous vehicles in intelligent transportation systems, and detecting failures of sensing, storage, and switching components of smart grids. The above applications require target tracking for processes and events of interest occurring in an environment. Various methods and approaches have been proposed in order to track one or more targets in a pre-defined area. Usually, this turns out to be a complicated job involving higher order mathematics coupled with artificial intelligence due to the dynamic nature of the targets. To optimize the resources we need to have an approach that works in a more straightforward manner while resulting in fairly satisfactory data. In this paper we have discussed the various cases that might arise while flocking a group of sensors to track targets in a given environment. The approach has been developed from scratch although some basic assumptions have been made keeping in mind some previous theories. This paper outlines a customized approach for feasibly tracking swarms of targets in a specific area so as to minimize the resources and optimize tracking efficiency.

Shubham Jain

What is connected

Connect this record

See the researcher in context

Building this map preview

8 published item(s)

Adversarial Detection Avoidance Attacks: Evaluating the robustness of perceptual hashing-based client-side scanning

Learning-From-Disagreement: A Model Comparison and Visual Analytics Framework

RadioTransformer: A Cascaded Global-Focal Transformer for Visual Attention-guided Disease Classification

TxSim:Modeling Training of Deep Neural Networks on Resistive Crossbar Systems

Designing Just-in-Time Detection for Gamified Fitness Frameworks

RxNN: A Framework for Evaluating Deep Neural Networks on Resistive Crossbars

TiM-DNN: Ternary in-Memory accelerator for Deep Neural Networks

A customized flocking algorithm for swarms of sensors tracking a swarm of targets