Researcher profile

Arif Mahmood

Arif Mahmood contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 21 - EmergingVerification L1Unclaimed author
8works
0followers
5topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

8 published item(s)

preprint2022arXiv

Data Augmentation for Graph Data: Recent Advancements

Graph Neural Network (GNNs) based methods have recently become a popular tool to deal with graph data because of their ability to incorporate structural information. The only hurdle in the performance of GNNs is the lack of labeled data. Data Augmentation techniques for images and text data can not be used for graph data because of the complex and non-euclidean structure of graph data. This gap has forced researchers to shift their focus towards the development of data augmentation techniques for graph data. Most of the proposed Graph Data Augmentation (GDA) techniques are task-specific. In this paper, we survey the existing GDA techniques based on different graph tasks. This survey not only provides a reference to the research community of GDA but also provides the necessary information to the researchers of other domains.

preprint2022arXiv

Generative Cooperative Learning for Unsupervised Video Anomaly Detection

Video anomaly detection is well investigated in weakly-supervised and one-class classification (OCC) settings. However, unsupervised video anomaly detection methods are quite sparse, likely because anomalies are less frequent in occurrence and usually not well-defined, which when coupled with the absence of ground truth supervision, could adversely affect the performance of the learning algorithms. This problem is challenging yet rewarding as it can completely eradicate the costs of obtaining laborious annotations and enable such systems to be deployed without human intervention. To this end, we propose a novel unsupervised Generative Cooperative Learning (GCL) approach for video anomaly detection that exploits the low frequency of anomalies towards building a cross-supervision between a generator and a discriminator. In essence, both networks get trained in a cooperative fashion, thereby allowing unsupervised learning. We conduct extensive experiments on two large-scale video anomaly detection datasets, UCF crime, and ShanghaiTech. Consistent improvement over the existing state-of-the-art unsupervised and OCC methods corroborate the effectiveness of our approach.

preprint2022arXiv

Lightweight Encoder-Decoder Architecture for Foot Ulcer Segmentation

Continuous monitoring of foot ulcer healing is needed to ensure the efficacy of a given treatment and to avoid any possibility of deterioration. Foot ulcer segmentation is an essential step in wound diagnosis. We developed a model that is similar in spirit to the well-established encoder-decoder and residual convolution neural networks. Our model includes a residual connection along with a channel and spatial attention integrated within each convolution block. A simple patch-based approach for model training, test time augmentations, and majority voting on the obtained predictions resulted in superior performance. Our model did not leverage any readily available backbone architecture, pre-training on a similar external dataset, or any of the transfer learning techniques. The total number of network parameters being around 5 million made it a significantly lightweight model as compared with the available state-of-the-art models used for the foot ulcer segmentation task. Our experiments presented results at the patch-level and image-level. Applied on publicly available Foot Ulcer Segmentation (FUSeg) Challenge dataset from MICCAI 2021, our model achieved state-of-the-art image-level performance of 88.22% in terms of Dice similarity score and ranked second in the official challenge leaderboard. We also showed an extremely simple solution that could be compared against the more advanced architectures.

preprint2022arXiv

Quantification of Occlusion Handling Capability of a 3D Human Pose Estimation Framework

3D human pose estimation using monocular images is an important yet challenging task. Existing 3D pose detection methods exhibit excellent performance under normal conditions however their performance may degrade due to occlusion. Recently some occlusion aware methods have also been proposed, however, the occlusion handling capability of these networks has not yet been thoroughly investigated. In the current work, we propose an occlusion-guided 3D human pose estimation framework and quantify its occlusion handling capability by using different protocols. The proposed method estimates more accurate 3D human poses using 2D skeletons with missing joints as input. Missing joints are handled by introducing occlusion guidance that provides extra information about the absence or presence of a joint. Temporal information has also been exploited to better estimate the missing joints. A large number of experiments are performed for the quantification of occlusion handling capability of the proposed method on three publicly available datasets in various settings including random missing joints, fixed body parts missing, and complete frames missing, using mean per joint position error criterion. In addition to that, the quality of the predicted 3D poses is also evaluated using action classification performance as a criterion. 3D poses estimated by the proposed method achieved significantly improved action recognition performance in the presence of missing joints. Our experiments demonstrate the effectiveness of the proposed framework for handling the missing joints as well as quantification of the occlusion handling capability of the deep neural networks.

preprint2022arXiv

Reconstruction of Time-varying Graph Signals via Sobolev Smoothness

Graph Signal Processing (GSP) is an emerging research field that extends the concepts of digital signal processing to graphs. GSP has numerous applications in different areas such as sensor networks, machine learning, and image processing. The sampling and reconstruction of static graph signals have played a central role in GSP. However, many real-world graph signals are inherently time-varying and the smoothness of the temporal differences of such graph signals may be used as a prior assumption. In the current work, we assume that the temporal differences of graph signals are smooth, and we introduce a novel algorithm based on the extension of a Sobolev smoothness function for the reconstruction of time-varying graph signals from discrete samples. We explore some theoretical aspects of the convergence rate of our Time-varying Graph signal Reconstruction via Sobolev Smoothness (GraphTRSS) algorithm by studying the condition number of the Hessian associated with our optimization problem. Our algorithm has the advantage of converging faster than other methods that are based on Laplacian operators without requiring expensive eigenvalue decomposition or matrix inversions. The proposed GraphTRSS is evaluated on several datasets including two COVID-19 datasets and it has outperformed many existing state-of-the-art methods for time-varying graph signal reconstruction. GraphTRSS has also shown excellent performance on two environmental datasets for the recovery of particulate matter and sea surface temperature signals.

preprint2021arXiv

Fake Visual Content Detection Using Two-Stream Convolutional Neural Networks

Rapid progress in adversarial learning has enabled the generation of realistic-looking fake visual content. To distinguish between fake and real visual content, several detection techniques have been proposed. The performance of most of these techniques however drops off significantly if the test and the training data are sampled from different distributions. This motivates efforts towards improving the generalization of fake detectors. Since current fake content generation techniques do not accurately model the frequency spectrum of the natural images, we observe that the frequency spectrum of the fake visual data contains discriminative characteristics that can be used to detect fake content. We also observe that the information captured in the frequency spectrum is different from that of the spatial domain. Using these insights, we propose to complement frequency and spatial domain features using a two-stream convolutional neural network architecture called TwoStreamNet. We demonstrate the improved generalization of the proposed two-stream network to several unseen generation architectures, datasets, and techniques. The proposed detector has demonstrated significant performance improvement compared to the current state-of-the-art fake content detectors and fusing the frequency and spatial domain streams has also improved generalization of the detector.

preprint2021arXiv

Leveraging Orientation for Weakly Supervised Object Detection with Application to Firearm Localization

Automatic detection of firearms is important for enhancing the security and safety of people, however, it is a challenging task owing to the wide variations in shape, size, and appearance of firearms. Also, most of the generic object detectors process axis-aligned rectangular areas though, a thin and long rifle may actually cover only a small percentage of that area and the rest may contain irrelevant details suppressing the required object signatures. To handle these challenges, we propose a weakly supervised Orientation Aware Object Detection (OAOD) algorithm which learns to detect oriented object bounding boxes (OBB) while using AxisAligned Bounding Boxes (AABB) for training. The proposed OAOD is different from the existing oriented object detectors which strictly require OBB during training which may not always be present. The goal of training on AABB and detection of OBB is achieved by employing a multistage scheme, with Stage-1 predicting the AABB and Stage-2 predicting OBB. In-between the two stages, the oriented proposal generation module along with the object aligned RoI pooling is designed to extract features based on the predicted orientation and to make these features orientation invariant. A diverse and challenging dataset consisting of eleven thousand images is also proposed for firearm detection which is manually annotated for firearm classification and localization. The proposed ITU Firearm dataset (ITUF) contains a wide range of guns and rifles. The OAOD algorithm is evaluated on the ITUF dataset and compared with current state-of-the-art object detectors, including fully supervised oriented object detectors. OAOD has outperformed both types of object detectors with a significant margin. The experimental results (mAP: 88.3 on AABB & mAP: 77.5 on OBB) demonstrate the effectiveness of the proposed algorithm for firearm detection.

preprint2020arXiv

Localizing Firearm Carriers by Identifying Human-Object Pairs

Visual identification of gunmen in a crowd is a challenging problem, that requires resolving the association of a person with an object (firearm). We present a novel approach to address this problem, by defining human-object interaction (and non-interaction) bounding boxes. In a given image, human and firearms are separately detected. Each detected human is paired with each detected firearm, allowing us to create a paired bounding box that contains both object and the human. A network is trained to classify these paired-bounding-boxes into human carrying the identified firearm or not. Extensive experiments were performed to evaluate effectiveness of the algorithm, including exploiting full pose of the human, hand key-points, and their association with the firearm. The knowledge of spatially localized features is key to success of our method by using multi-size proposals with adaptive average pooling. We have also extended a previously firearm detection dataset, by adding more images and tagging in extended dataset the human-firearm pairs (including bounding boxes for firearms and gunmen). The experimental results ($AP_{hold} = 78.5$) demonstrate effectiveness of the proposed method.