Researcher profile

Abhijit Suprem

Abhijit Suprem contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 21 - EmergingVerification L1Unclaimed author
9works
0followers
8topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

9 published item(s)

preprint2023arXiv

Continuously Reliable Detection of New-Normal Misinformation: Semantic Masking and Contrastive Smoothing in High-Density Latent Regions

Toxic misinformation campaigns have caused significant societal harm, e.g., affecting elections and COVID-19 information awareness. Unfortunately, despite successes of (gold standard) retrospective studies of misinformation that confirmed their harmful effects after the fact, they arrive too late for timely intervention and reduction of such harm. By design, misinformation evades retrospective classifiers by exploiting two properties we call new-normal: (1) never-seen-before novelty that cause inescapable generalization challenges for previous classifiers, and (2) massive but short campaigns that end before they can be manually annotated for new classifier training. To tackle these challenges, we propose UFIT, which combines two techniques: semantic masking of strong signal keywords to reduce overfitting, and intra-proxy smoothness regularization of high-density regions in the latent space to improve reliability and maintain accuracy. Evaluation of UFIT on public new-normal misinformation data shows over 30% improvement over existing approaches on future (and unseen) campaigns. To the best of our knowledge, UFIT is the first successful effort to achieve such high level of generalization on new-normal misinformation data with minimal concession (1 to 5%) of accuracy compared to oracles trained with full knowledge of all campaigns.

preprint2022arXiv

Constructive Interpretability with CoLabel: Corroborative Integration, Complementary Features, and Collaborative Learning

Machine learning models with explainable predictions are increasingly sought after, especially for real-world, mission-critical applications that require bias detection and risk mitigation. Inherent interpretability, where a model is designed from the ground-up for interpretability, provides intuitive insights and transparent explanations on model prediction and performance. In this paper, we present CoLabel, an approach to build interpretable models with explanations rooted in the ground truth. We demonstrate CoLabel in a vehicle feature extraction application in the context of vehicle make-model recognition (VMMR). CoLabel performs VMMR with a composite of interpretable features such as vehicle color, type, and make, all based on interpretable annotations of the ground truth labels. First, CoLabel performs corroborative integration to join multiple datasets that each have a subset of desired annotations of color, type, and make. Then, CoLabel uses decomposable branches to extract complementary features corresponding to desired annotations. Finally, CoLabel fuses them together for final predictions. During feature fusion, CoLabel harmonizes complementary branches so that VMMR features are compatible with each other and can be projected to the same semantic space for classification. With inherent interpretability, CoLabel achieves superior performance to the state-of-the-art black-box models, with accuracy of 0.98, 0.95, and 0.94 on CompCars, Cars196, and BoxCars116K, respectively. CoLabel provides intuitive explanations due to constructive interpretability, and subsequently achieves high accuracy and usability in mission-critical situations.

preprint2022arXiv

Evaluating Generalizability of Fine-Tuned Models for Fake News Detection

The Covid-19 pandemic has caused a dramatic and parallel rise in dangerous misinformation, denoted an `infodemic' by the CDC and WHO. Misinformation tied to the Covid-19 infodemic changes continuously; this can lead to performance degradation of fine-tuned models due to concept drift. Degredation can be mitigated if models generalize well-enough to capture some cyclical aspects of drifted data. In this paper, we explore generalizability of pre-trained and fine-tuned fake news detectors across 9 fake news datasets. We show that existing models often overfit on their training dataset and have poor performance on unseen data. However, on some subsets of unseen data that overlap with training data, models have higher accuracy. Based on this observation, we also present KMeans-Proxy, a fast and effective method based on K-Means clustering for quickly identifying these overlapping subsets of unseen data. KMeans-Proxy improves generalizability on unseen fake news datasets by 0.1-0.2 f1-points across datasets. We present both our generalizability experiments as well as KMeans-Proxy to further research in tackling the fake news problem.

preprint2022arXiv

MiDAS: Multi-integrated Domain Adaptive Supervision for Fake News Detection

COVID-19 related misinformation and fake news, coined an 'infodemic', has dramatically increased over the past few years. This misinformation exhibits concept drift, where the distribution of fake news changes over time, reducing effectiveness of previously trained models for fake news detection. Given a set of fake news models trained on multiple domains, we propose an adaptive decision module to select the best-fit model for a new sample. We propose MiDAS, a multi-domain adaptative approach for fake news detection that ranks relevancy of existing models to new samples. MiDAS contains 2 components: a doman-invariant encoder, and an adaptive model selector. MiDAS integrates multiple pre-trained and fine-tuned models with their training data to create a domain-invariant representation. Then, MiDAS uses local Lipschitz smoothness of the invariant embedding space to estimate each model's relevance to a new sample. Higher ranked models provide predictions, and lower ranked models abstain. We evaluate MiDAS on generalization to drifted data with 9 fake news datasets, each obtained from different domains and modalities. MiDAS achieves new state-of-the-art performance on multi-domain adaptation for out-of-distribution fake news classification.

preprint2020arXiv

EventMapper: Detecting Real-World Physical Events Using Corroborative and Probabilistic Sources

The ubiquity of social media makes it a rich source for physical event detection, such as disasters, and as a potential resource for crisis management resource allocation. There have been some recent works on leveraging social media sources for retrospective, after-the-fact event detection of large events such as earthquakes or hurricanes. Similarly, there is a long history of using traditional physical sensors such as climate satellites to perform regional event detection. However, combining social media with corroborative physical sensors for real-time, accurate, and global physical detection has remained unexplored. This paper presents EventMapper, a framework to support event recognition of small yet equally costly events (landslides, flooding, wildfires). EventMapper integrates high-latency, high-accuracy corroborative sources such as physical sensors with low-latency, noisy probabilistic sources such as social media streams to deliver real-time, global event recognition. Furthermore, EventMapper is resilient to the concept drift phenomenon, where machine learning models require continuous fine-tuning to maintain high performance. By exploiting the common features of probabilistic and corroborative sources, EventMapper automates machine learning model updates, maintenance, and fine-tuning. We describe three applications built on EventMapper for landslide, wildfire, and flooding detection.

preprint2020arXiv

Looking GLAMORous: Vehicle Re-Id in Heterogeneous Cameras Networks with Global and Local Attention

Vehicle re-identification (re-id) is a fundamental problem for modern surveillance camera networks. Existing approaches for vehicle re-id utilize global features and local features for re-id by combining multiple subnetworks and losses. In this paper, we propose GLAMOR, or Global and Local Attention MOdules for Re-id. GLAMOR performs global and local feature extraction simultaneously in a unified model to achieve state-of-the-art performance in vehicle re-id across a variety of adversarial conditions and datasets (mAPs 80.34, 76.48, 77.15 on VeRi-776, VRIC, and VeRi-Wild, respectively). GLAMOR introduces several contributions: a better backbone construction method that outperforms recent approaches, group and layer normalization to address conflicting loss targets for re-id, a novel global attention module for global feature extraction, and a novel local attention module for self-guided part-based local feature extraction that does not require supervision. Additionally, GLAMOR is a compact and fast model that is 10x smaller while delivering 25% better performance.

preprint2020arXiv

ODIN: Automated Drift Detection and Recovery in Video Analytics

Recent advances in computer vision have led to a resurgence of interest in visual data analytics. Researchers are developing systems for effectively and efficiently analyzing visual data at scale. A significant challenge that these systems encounter lies in the drift in real-world visual data. For instance, a model for self-driving vehicles that is not trained on images containing snow does not work well when it encounters them in practice. This drift phenomenon limits the accuracy of models employed for visual data analytics. In this paper, we present a visual data analytics system, called ODIN, that automatically detects and recovers from drift. ODIN uses adversarial autoencoders to learn the distribution of high-dimensional images. We present an unsupervised algorithm for detecting drift by comparing the distributions of the given data against that of previously seen data. When ODIN detects drift, it invokes a drift recovery algorithm to deploy specialized models tailored towards the novel data points. These specialized models outperform their non-specialized counterpart on accuracy, performance, and memory footprint. Lastly, we present a model selection algorithm for picking an ensemble of best-fit specialized models to process a given input. We evaluate the efficacy and efficiency of ODIN on high-resolution dashboard camera videos captured under diverse environments from the Berkeley DeepDrive dataset. We demonstrate that ODIN's models deliver 6x higher throughput, 2x higher accuracy, and 6x smaller memory footprint compared to a baseline system without automated drift detection and recovery.

preprint2020arXiv

Robust, Extensible, and Fast: Teamed Classifiers for Vehicle Tracking and Vehicle Re-ID in Multi-Camera Networks

As camera networks have become more ubiquitous over the past decade, the research interest in video management has shifted to analytics on multi-camera networks. This includes performing tasks such as object detection, attribute identification, and vehicle/person tracking across different cameras without overlap. Current frameworks for management are designed for multi-camera networks in a closed dataset environment where there is limited variability in cameras and characteristics of the surveillance environment are well known. Furthermore, current frameworks are designed for offline analytics with guidance from human operators for forensic applications. This paper presents a teamed classifier framework for video analytics in heterogeneous many-camera networks with adversarial conditions such as multi-scale, multi-resolution cameras capturing the environment with varying occlusion, blur, and orientations. We describe an implementation for vehicle tracking and vehicle re-identification (re-id), where we implement a zero-shot learning (ZSL) system that performs automated tracking of all vehicles all the time. Our evaluations on VeRi-776 and Cars196 show the teamed classifier framework is robust to adversarial conditions, extensible to changing video characteristics such as new vehicle types/brands and new cameras, and offers real-time performance compared to current offline video analytics approaches.

preprint2020arXiv

Small, Accurate, and Fast Vehicle Re-ID on the Edge: the SAFR Approach

We propose a Small, Accurate, and Fast Re-ID (SAFR) design for flexible vehicle re-id under a variety of compute environments such as cloud, mobile, edge, or embedded devices by only changing the re-id model backbone. Through best-fit design choices, feature extraction, training tricks, global attention, and local attention, we create a reid model design that optimizes multi-dimensionally along model size, speed, & accuracy for deployment under various memory and compute constraints. We present several variations of our flexible SAFR model: SAFR-Large for cloud-type environments with large compute resources, SAFR-Small for mobile devices with some compute constraints, and SAFR-Micro for edge devices with severe memory & compute constraints. SAFR-Large delivers state-of-the-art results with mAP 81.34 on the VeRi-776 vehicle re-id dataset (15% better than related work). SAFR-Small trades a 5.2% drop in performance (mAP 77.14 on VeRi-776) for over 60% model compression and 150% speedup. SAFR-Micro, at only 6MB and 130MFLOPS, trades 6.8% drop in accuracy (mAP 75.80 on VeRi-776) for 95% compression and 33x speedup compared to SAFR-Large.