Source author record

Irfan Mehmood

Irfan Mehmood appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Computer Vision, Information Retrieval, Multimedia

Catalog footprint

What is connected

4 works

3 topics

4 close collaborators


Preprint, 2026, arXiv

CapCLIP: A Vision-Language Representation Alignment Approach for Wireless Capsule Endoscopy Analysis

Wireless capsule endoscopy (WCE) enables non-invasive visual assessment of the small bowel, but its clinical utility is constrained by the large volume of frames generated per examination and the difficulty of recognising subtle abnormalities under highly variable imaging conditions. Existing learning-based approaches for WCE are predominantly vision-only, often confined to narrow pathology sets, and show limited transfer across datasets and centres. To address these limitations, this study introduces CapCLIP, a domain-specific vision-language representation learning framework for WCE. CapCLIP aligns capsule endoscopy frames with clinically grounded textual descriptions derived from standardised nomenclature and pathology-aware caption templates, thereby learning embeddings that are both semantically informed and transferable. The proposed framework is evaluated against relevant open-source vision and vision-language foundation models under strict zero-shot conditions using unseen WCE datasets. Evaluation covers three downstream tasks: K-nearest neighbour classification, CLIP-style image-text classification, and text-to-image retrieval. Across these settings, CapCLIP consistently outperforms the compared baselines, with particularly strong gains in zero-shot image-text classification and cross-modal retrieval on out-of-distribution datasets. The results indicate that language-guided representation learning can improve both generalisation and semantic interpretability in WCE analysis. These findings position CapCLIP as a step toward foundation models tailored to capsule endoscopy and support the use of language-grounded WCE analysis.
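The CLIP-style image-text classification and text-to-image retrieval mentioned in the abstract both reduce to ranking by cosine similarity in a shared embedding space. The following is a minimal sketch of that general idea only, not CapCLIP itself: the function names, the labels and the three-dimensional toy embeddings are hypothetical, and a real system would obtain the embeddings from trained image and text encoders.

```python
import math


def l2_normalize(v):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    return sum(x * y for x, y in zip(l2_normalize(a), l2_normalize(b)))


def zero_shot_classify(image_emb, text_embs):
    """Pick the label whose text embedding is closest to the image embedding."""
    scores = {label: cosine(image_emb, emb) for label, emb in text_embs.items()}
    return max(scores, key=scores.get), scores


def retrieve(text_emb, image_embs, k=2):
    """Rank images by similarity to a text query and return the top k names."""
    ranked = sorted(
        image_embs,
        key=lambda name: cosine(text_emb, image_embs[name]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical toy embeddings (assumptions, not real encoder outputs).
text_embs = {
    "polyp": [1.0, 0.1, 0.0],
    "bleeding": [0.0, 1.0, 0.2],
    "normal mucosa": [0.1, 0.0, 1.0],
}
image_embs = {
    "frame_a": [0.9, 0.2, 0.1],
    "frame_b": [0.1, 0.9, 0.3],
    "frame_c": [0.0, 0.1, 1.0],
}

label, scores = zero_shot_classify(image_embs["frame_a"], text_embs)
top_frames = retrieve(text_embs["bleeding"], image_embs, k=1)
```

No labelled training images are used at prediction time, which is what makes the setting zero-shot: the class set is defined entirely by the text prompts.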

Preprint, 2015, arXiv

A novel magic LSB substitution method (M-LSB-SM) using multi-level encryption and achromatic component of an image

Image steganography is a thriving research area of information security in which secret data is embedded in images to hide its existence while keeping statistical detectability as low as possible. This paper proposes a novel magic least significant bit substitution method (M-LSB-SM) for RGB images. The proposed method is based on the achromatic component (I-plane) of the hue-saturation-intensity (HSI) color model and multi-level encryption (MLE) in the spatial domain. The input image is transposed and converted into the HSI color space. The I-plane is divided into four sub-images of equal size, and each sub-image is rotated by a different angle using a secret key. The secret information is divided into four blocks, which are then encrypted using an MLE algorithm (MLEA). Each sub-block of the message is embedded into one of the rotated sub-images based on a specific pattern using magic LSB substitution. Experimental results validate that the proposed method not only enhances the visual quality of stego images but also provides good imperceptibility and multiple security levels compared with several existing prominent methods.
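The abstract does not spell out the magic LSB pattern, the rotation scheme or the MLE algorithm, so the following is only a minimal sketch of plain LSB substitution on a flat list of 8-bit intensity values, the basic operation that such methods build on. The names `embed_lsb` and `extract_lsb` and the toy cover values are assumptions for illustration, not the authors' method.

```python
def embed_lsb(pixels, message):
    """Hide message bytes in the least significant bits of 8-bit values."""
    # Unpack each byte into 8 bits, most significant bit first.
    bits = [(byte >> shift) & 1 for byte in message for shift in range(7, -1, -1)]
    if len(bits) > len(pixels):
        raise ValueError("message is too long for this cover")
    stego = list(pixels)
    for i, bit in enumerate(bits):
        # Clear the lowest bit, then set it to the message bit.
        stego[i] = (stego[i] & ~1) | bit
    return stego


def extract_lsb(pixels, n_bytes):
    """Read n_bytes back from the least significant bits."""
    out = bytearray()
    for b in range(n_bytes):
        value = 0
        for p in pixels[b * 8:(b + 1) * 8]:
            value = (value << 1) | (p & 1)
        out.append(value)
    return bytes(out)


# Hypothetical cover: 64 intensity values standing in for an I-plane.
cover = list(range(100, 164))
stego = embed_lsb(cover, b"hi")
recovered = extract_lsb(stego, 2)
```

Each cover value changes by at most 1, which is why LSB embedding is visually hard to notice. Schemes such as the one described above add encryption and a key-dependent embedding order on top of this step.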

Preprint, 2015, arXiv

Describing Colors, Textures and Shapes for Content Based Image Retrieval - A Survey

Visual media has always been the most enjoyed form of communication. From the advent of television to modern handheld computers, we have witnessed exponential growth in the number of images around us. These images carry a great deal of information that needs to be utilized effectively. Hence, there is an intense need to efficiently index and store large image collections for effective, on-demand retrieval. For this purpose, low-level features extracted from image content, such as color, texture and shape, have been used. Content-based image retrieval systems employing these features have proven very successful. Image retrieval has promising applications in numerous fields and has therefore motivated researchers all over the world. New and improved ways to represent visual content are being developed each day, and a tremendous amount of research has been carried out in the last decade. In this paper we present a detailed overview of some of the powerful color, texture and shape descriptors for content-based image retrieval. A comparative analysis is also carried out to provide insight into outstanding challenges in this field.
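As one concrete instance of a low-level color descriptor, a quantized RGB histogram compared by histogram intersection can be sketched as follows. This is a generic textbook technique chosen for illustration, not necessarily one of the descriptors covered in the survey; the function names, the bin count and the toy pixel data are assumptions.

```python
def color_histogram(pixels, bins_per_channel=4):
    """Normalized joint RGB histogram of a list of (r, g, b) tuples (0-255)."""
    if not pixels:
        raise ValueError("pixels must not be empty")
    step = 256 / bins_per_channel
    hist = [0.0] * (bins_per_channel ** 3)
    for r, g, b in pixels:
        # Quantize each channel, clamping so that 255 falls in the last bin.
        rq, gq, bq = (min(int(c / step), bins_per_channel - 1) for c in (r, g, b))
        hist[(rq * bins_per_channel + gq) * bins_per_channel + bq] += 1
    return [h / len(pixels) for h in hist]


def histogram_intersection(h1, h2):
    """Similarity in [0, 1]; 1 means identical normalized histograms."""
    return sum(min(a, b) for a, b in zip(h1, h2))


# Hypothetical four-pixel "images".
red = color_histogram([(250, 10, 10)] * 4)
also_red = color_histogram([(200, 30, 20)] * 4)
blue = color_histogram([(10, 10, 250)] * 4)

same = histogram_intersection(red, also_red)
different = histogram_intersection(red, blue)
```

A retrieval system would compute such a descriptor once per indexed image and rank the collection by similarity to the query descriptor. Texture and shape descriptors slot into the same pipeline with different feature extractors.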

Preprint, 2015, arXiv

Ontology-based Secure Retrieval of Semantically Significant Visual Contents

Image classification is an active research field in which large amounts of image data are classified into various classes based on their visual content. Researchers have presented various techniques based on low-level features for classifying images into different categories. However, efficient and effective classification and retrieval is still a challenging problem due to the complex nature of visual content. In addition, traditional information retrieval techniques are vulnerable to security risks, making it easy for attackers to retrieve personal visual content such as patient records and law enforcement agency databases. Therefore, we propose a novel ontology-based framework using image steganography for secure image classification and information retrieval. The proposed framework uses a domain-specific ontology to map low-level image features to high-level ontology concepts, which results in efficient classification. Furthermore, the proposed method uses image steganography to hide the semantics of each image inside it as a secret message, making the information retrieval process secure from third parties. The proposed framework reduces the computational complexity of traditional techniques, increasing its suitability for secure, real-time retrieval of visual content from personalized image databases. Experimental results confirm the efficiency, effectiveness, and security of the proposed framework compared with other state-of-the-art systems.
