Researcher profile

Mita Nasipuri

Mita Nasipuri contributes to research discovery and scholarly infrastructure.

Researcher · Affiliation not imported · Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging · Verification L1 · Unclaimed author
41 works
0 followers
9 topics
4 close collaborators

Published work

41 published item(s)

preprint · 2021 · arXiv

Exploring Knowledge Distillation of a Deep Neural Network for Multi-Script identification

Multi-lingual script identification is a difficult task because scene text images contain different languages against complex backgrounds. In current research, deep neural networks are employed as teacher models to train a smaller student network using the teacher model's predictions. This process is known as dark knowledge transfer. It has been quite successful in many domains where the final result cannot be achieved by directly training a student network with a simple architecture. In this paper, we explore a dark knowledge transfer approach that uses a long short-term memory (LSTM) and CNN based assistant model and various deep neural networks as teacher models, with a simple CNN based student network, for multi-script identification from natural scene text images. We explore the performance of different teacher models and their ability to transfer knowledge to a student network. Despite the student network's limited size, our approach obtains satisfactory results on CVSI-2015, a well-known script identification dataset.
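
As a rough illustration of dark knowledge transfer, the sketch below computes a temperature-softened distillation loss for one sample. It follows the commonly used formulation (hard-label cross-entropy plus a scaled KL term between softened teacher and student outputs); the names `softmax`, `distillation_loss`, `temperature` and `alpha` are assumptions made for this example, and the assistant model described in the abstract is not modeled.

```python
import math


def softmax(logits, temperature=1.0):
    """Softmax with temperature; a higher temperature gives a softer distribution."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, true_label,
                      temperature=4.0, alpha=0.5):
    """alpha * CE(student, hard label) + (1 - alpha) * T^2 * KL(teacher_T || student_T).

    temperature and alpha are illustrative defaults, not values from the paper.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student_soft = softmax(student_logits, temperature)
    kl = sum(pt * math.log(pt / ps)
             for pt, ps in zip(p_teacher, p_student_soft) if pt > 0)
    cross_entropy = -math.log(softmax(student_logits)[true_label])
    return alpha * cross_entropy + (1 - alpha) * temperature ** 2 * kl
```

When the student's logits match the teacher's, the KL term vanishes and only the hard-label term remains, which is a quick sanity check for this kind of loss.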

preprint · 2021 · arXiv

Multispectral Object Detection with Deep Learning

Object detection in natural scenes can be a challenging task. In many real-life situations, the visible spectrum is not suitable for traditional computer vision tasks. Moving outside the visible spectrum range, for example to thermal or near-infrared (NIR) images, is much more beneficial in low-visibility conditions, and NIR images are very helpful for understanding an object's material quality. In this work, we use images from both the thermal and NIR spectra for the object detection task. As multi-spectral data with both thermal and NIR images is not available for the detection task, we needed to collect data ourselves. Data collection is a time-consuming process, and we faced many obstacles that we had to overcome. We train the YOLO v3 network from scratch to detect objects in multi-spectral images. To avoid overfitting, we also perform data augmentation and tune hyperparameters.

preprint · 2021 · arXiv

RectiNet-v2: A stacked network architecture for document image dewarping

With the advent of mobile and hand-held cameras, document images have found their way into almost every domain. Dewarping of these images for the removal of perspective distortions and folds is essential so that they can be understood by document recognition algorithms. For this, we propose an end-to-end CNN architecture that can produce distortion free document images from warped documents it takes as input. We train this model on warped document images simulated synthetically to compensate for lack of enough natural data. Our method is novel in the use of a bifurcated decoder with shared weights to prevent intermingling of grid coordinates, in the use of residual networks in the U-Net skip connections to allow flow of data from different receptive fields in the model, and in the use of a gated network to help the model focus on structure and line level detail of the document image. We evaluate our method on the DocUNet dataset, a benchmark in this domain, and obtain results comparable to state-of-the-art methods.

preprint · 2020 · arXiv

A Gated and Bifurcated Stacked U-Net Module for Document Image Dewarping

Capturing images of documents is one of the easiest and most widely used methods of recording them. These images, however, being captured with handheld devices, often contain undesirable distortions that are hard to remove. We propose a supervised Gated and Bifurcated Stacked U-Net module to predict a dewarping grid and create a distortion-free image from the input. While the network is trained on synthetically warped document images, results are calculated on real-world images. The novelty of our method lies not only in a bifurcation of the U-Net that helps eliminate the intermingling of the grid coordinates, but also in the use of a gated network that adds boundary and other minute line-level details to the model. The proposed end-to-end pipeline achieves state-of-the-art performance on the DocUNet dataset after being trained on just 8 percent of the data used in previous methods.

preprint · 2020 · arXiv

A Genetic Algorithm based Kernel-size Selection Approach for a Multi-column Convolutional Neural Network

Deep neural network-based architectures give promising results in various domains including pattern recognition. Finding the optimal combination of hyper-parameters for such a large architecture is tedious and requires a large number of laboratory experiments, and identifying an appropriate kernel size for a given deep learning architecture is particularly challenging. Here, we introduce a genetic algorithm-based technique to reduce the effort of finding the optimal combination of a hyper-parameter (kernel size) of a convolutional neural network-based architecture. The method is evaluated on three popular datasets of different handwritten Bangla characters and digits. The implementation of the proposed methodology can be found at the following link: https://github.com/DeepQn/GA-Based-Kernel-Size.
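
The general idea of searching kernel sizes with a genetic algorithm can be sketched as follows. This is not the code from the linked repository: the search space `KERNEL_CHOICES`, the stand-in `toy_fitness` (which replaces an expensive training-and-validation run) and all parameter values are hypothetical choices made for this example.

```python
import random

KERNEL_CHOICES = [1, 3, 5, 7, 9]  # hypothetical search space for each column


def toy_fitness(kernels):
    """Hypothetical stand-in for validation accuracy; highest at (3, 5, 7)."""
    target = (3, 5, 7)
    return -sum((k - t) ** 2 for k, t in zip(kernels, target))


def evolve(fitness, n_columns=3, pop_size=12, generations=30,
           mutation_rate=0.2, seed=0):
    """Elitist GA: keep the best half, refill with crossover plus mutation."""
    rng = random.Random(seed)
    population = [tuple(rng.choice(KERNEL_CHOICES) for _ in range(n_columns))
                  for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_columns)  # one-point crossover
            child = list(a[:cut] + b[cut:])
            for i in range(n_columns):
                if rng.random() < mutation_rate:  # random-reset mutation
                    child[i] = rng.choice(KERNEL_CHOICES)
            children.append(tuple(child))
        population = parents + children
    return max(population, key=fitness)


best_kernels = evolve(toy_fitness)
```

Because the best half always survives, the best fitness in the population never decreases from one generation to the next; in a real setting the fitness call would dominate the cost.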

preprint · 2020 · arXiv

A Hybrid Swarm and Gravitation based feature selection algorithm for Handwritten Indic Script Classification problem

In any multi-script environment, handwritten script classification is of paramount importance before the document images are fed to their respective Optical Character Recognition (OCR) engines. Over the years, researchers have addressed this complex pattern classification problem by proposing various feature vectors, mostly of large dimension, thereby increasing the computational complexity of the whole classification model. Feature Selection (FS) can serve as an intermediate step to reduce the size of the feature vectors by restricting them to the essential and relevant features. In this paper, we address this issue by introducing a new FS algorithm, called Hybrid Swarm and Gravitation based FS (HSGFS). This algorithm is run on three feature vectors introduced recently in the literature: Distance-Hough Transform (DHT), Histogram of Oriented Gradients (HOG) and Modified log-Gabor (MLG) filter Transform. Three state-of-the-art classifiers, namely Multi-Layer Perceptron (MLP), K-Nearest Neighbour (KNN) and Support Vector Machine (SVM), are used for the handwritten script classification. Handwritten datasets prepared at block, text-line and word level, consisting of 12 officially recognized Indic scripts, are used to evaluate our method. An average improvement in the range of 2-5% is achieved in the classification accuracies while using only about 75-80% of the original feature vectors on all three datasets. The proposed methodology also performs better than some popularly used FS models.

preprint · 2020 · arXiv

A New Approach for Texture based Script Identification At Block Level using Quad Tree Decomposition

A considerable amount of success has been achieved in developing monolingual OCR systems for Indic scripts. But in a country like India, where a multi-script scenario is prevalent, identifying scripts beforehand becomes obligatory. In this paper, we present the significance of Gabor wavelet filters in extracting directional energy and entropy distributions for 11 official handwritten scripts, namely Bangla, Devanagari, Gujarati, Gurumukhi, Kannada, Malayalam, Oriya, Tamil, Telugu, Urdu and Roman. The experimentation is conducted at block level based on a quad-tree decomposition approach and evaluated using six different well-known classifiers. The best identification accuracy of 96.86% is achieved by the Multi Layer Perceptron (MLP) classifier for 3-fold cross validation at level-2 decomposition. The results serve to establish the efficacy of the present approach to the classification of handwritten Indic scripts.

preprint · 2020 · arXiv

A Skip-connected Multi-column Network for Isolated Handwritten Bangla Character and Digit recognition

Finding local invariant patterns in handwritten characters and/or digits for optical character recognition is a difficult task. Variations in writing styles from one person to another make this task challenging. In this work, we propose a non-explicit feature extraction method using a multi-scale multi-column skip convolutional neural network. Local and global features extracted from different layers of the proposed architecture are combined to derive the final feature descriptor encoding a character or digit image. Our method is evaluated on four publicly available datasets of isolated handwritten Bangla characters and digits. Exhaustive comparative analysis against contemporary methods establishes the efficacy of our proposed approach.

preprint · 2020 · arXiv

Computational modeling of Human-nCoV protein-protein interaction network

COVID-19 has created a global pandemic with high morbidity and mortality in 2020. Novel coronavirus (nCoV), also known as Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV2), is responsible for this deadly disease. The International Committee on Taxonomy of Viruses (ICTV) has declared that nCoV is highly genetically similar to the SARS-CoV of the 2003 epidemic (89% similarity). Only a limited amount of clinically validated Human-nCoV protein interaction data is available in the literature. Building on this similarity, the present work focuses on developing a computational model for the nCoV-Human protein interaction network using the experimentally validated SARS-CoV-Human protein interactions. Initially, level-1 and level-2 human spreader proteins are identified in the SARS-CoV-Human interaction network using the Susceptible-Infected-Susceptible (SIS) model. These proteins are considered as potential human targets for nCoV bait proteins. A gene-ontology based fuzzy affinity function is used to construct the nCoV-Human protein interaction network at a 99.98% specificity threshold. This also identifies the level-1 human spreaders for COVID-19 in the human protein-interaction network. Level-2 human spreaders are subsequently identified using the SIS model. The derived host-pathogen interaction network is finally validated using 7 potential FDA listed drugs for COVID-19, with significant overlap between the known drug target proteins and the identified spreader proteins.

preprint · 2020 · arXiv

Cytology Image Analysis Techniques Towards Automation: Systematically Revisited

Cytology is the branch of pathology that deals with the microscopic examination of cells for the diagnosis of carcinoma or inflammatory conditions. Automation in cytology started in the early 1950s with the aim of reducing manual effort in the diagnosis of cancer. The influx of intelligent technological units with high computational power, together with improved specimen collection techniques, helped it reach its technological heights. In the present survey, we focus on image processing techniques that have advanced the automation of cytology. We take a short tour of 17 types of cytology and explore various segmentation and/or classification techniques that evolved during the last three decades and boosted the concept of automation in cytology. It is observed that most of the works address three types of cytology: Cervical, Breast and Lung, which are discussed at length in this paper. The user-end systems developed during that period are summarized to convey the overall growth in the respective domains. Specifically, we discuss the diversity of the state-of-the-art methodologies and their challenges in order to provide prolific and competent future research directions for bringing cytology-based commercial systems into the mainstream.

preprint · 2020 · arXiv

EDC3: Ensemble of Deep-Classifiers using Class-specific Copula functions to Improve Semantic Image Segmentation

In the literature, many fusion techniques have been reported for the segmentation of images, but they primarily focus on the observed output, belief score or probability score of the output classes. In the present work, we use the inter-source statistical dependency among different classifiers to ensemble different deep learning techniques for semantic segmentation of images. For this purpose, a class-wise Copula-based ensembling method is proposed for solving the multi-class segmentation problem. Experimentally, it is observed that the proposed class-specific Copula function improves semantic image segmentation performance more than the traditionally used single Copula function. The performance is also compared with three state-of-the-art ensembling methods.

preprint · 2020 · arXiv

Handwritten Script Identification from Text Lines

In a multilingual country like India where 12 different official scripts are in use, automatic identification of handwritten script facilitates many important applications, such as automatic transcription of multilingual documents, searching for documents on the web or in digital archives containing a particular script, and the selection of a script-specific Optical Character Recognition (OCR) system in a multilingual environment. In this paper, we propose a robust method for identifying scripts from handwritten documents at text-line level. The recognition is based on features extracted using Chain Code Histogram (CCH) and Discrete Fourier Transform (DFT). The proposed method is evaluated on 800 handwritten text lines written in seven Indic scripts, namely Gujarati, Kannada, Malayalam, Oriya, Tamil, Telugu and Urdu, along with Roman script, and yields an average identification rate of 95.14% using a Support Vector Machine (SVM) classifier.
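
To make the Chain Code Histogram feature concrete, here is a minimal sketch using the standard 8-direction Freeman code on a contour given as a list of points. The direction numbering, the coordinate convention (y increasing upward) and the function names are assumptions made for this example; the paper's exact feature layout is not specified in the abstract.

```python
# Freeman 8-direction codes, counter-clockwise starting from east,
# assuming y increases upward (an illustrative convention).
DIRECTIONS = {
    (1, 0): 0, (1, 1): 1, (0, 1): 2, (-1, 1): 3,
    (-1, 0): 4, (-1, -1): 5, (0, -1): 6, (1, -1): 7,
}


def chain_code(contour):
    """Convert consecutive 8-connected contour points into Freeman codes."""
    return [DIRECTIONS[(x1 - x0, y1 - y0)]
            for (x0, y0), (x1, y1) in zip(contour, contour[1:])]


def chain_code_histogram(contour):
    """Normalized frequency of each of the 8 directions along the contour."""
    codes = chain_code(contour)
    hist = [0] * 8
    for code in codes:
        hist[code] += 1
    return [h / len(codes) for h in hist] if codes else hist


# A unit square traced counter-clockwise uses east, north, west and south equally.
square = [(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)]
histogram = chain_code_histogram(square)
```

The normalized histogram discards where along the contour each direction occurs, which makes it a compact, position-independent shape descriptor.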

preprint · 2020 · arXiv

Multistage Curvilinear Coordinate Transform Based Document Image Dewarping using a Novel Quality Estimator

The present work demonstrates a fast and improved technique for dewarping nonlinearly warped document images. The images are first dewarped at the page level by estimating optimum inverse projections using curvilinear homography. The quality of the process is then estimated by evaluating a set of metrics related to the characteristics of the text lines and rectilinear objects, measuring parallelism, orthogonality, etc. These are designed specifically to estimate the quality of the dewarping process without the need for any ground truth. If the quality is estimated to be unsatisfactory, the page-level dewarping process is repeated with finer approximations. This is followed by a line-level dewarping process that makes granular corrections to the warps in individual text lines. The methodology has been tested on the CBDAR 2007 / IUPR 2011 document image dewarping dataset and is seen to yield the best OCR accuracy in the shortest amount of time to date. The usefulness of the methodology has also been evaluated on the DocUNet 2018 dataset with some minor tweaks, and is seen to produce comparable results.

preprint · 2020 · arXiv

Skin Diseases Detection using LBP and WLD- An Ensembling Approach

In all developing and developed countries in the world, skin diseases are becoming a very frequent health problem for people of all age groups. Skin problems affect mental health, can lead to addiction to alcohol and drugs, and sometimes cause social isolation. Considering this importance, we propose an automatic technique to detect three popular skin diseases, Leprosy, Tinea versicolor and Vitiligo, from images of skin lesions. The proposed technique uses the Weber local descriptor and the Local binary pattern to represent the texture pattern of the affected skin regions. This ensemble technique achieved 91.38% accuracy using a multi-level support vector machine classifier, where features are extracted from different regions that are based on the center of gravity. We have also applied some popular deep learning networks such as MobileNet, ResNet_152, GoogLeNet, DenseNet_121, and ResNet_101. We get 89% accuracy using ResNet_101. The ensemble approach clearly outperforms all of the deep learning networks used. This imaging tool will be useful for early skin disease screening.

preprint · 2020 · arXiv

Word Segmentation from Unconstrained Handwritten Bangla Document Images using Distance Transform

Segmentation of handwritten document images into text lines and words is one of the most significant and challenging tasks in the development of a complete Optical Character Recognition (OCR) system. This paper addresses the automatic segmentation of text words directly from unconstrained Bangla handwritten document images. The popular Distance transform (DT) algorithm is applied to locate the outer boundary of the word images. This technique does not generate over-segmented words. A simple post-processing procedure is applied to isolate the under-segmented word images, if any. The proposed technique is tested on 50 random images taken from the CMATERdb1.1.1 database. A satisfactory result is achieved, with a segmentation accuracy of 91.88%, which confirms the robustness of the proposed methodology.
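
For readers unfamiliar with the distance transform itself, the sketch below computes a two-pass city-block distance transform on a small binary image, giving each pixel its distance to the nearest foreground (ink) pixel. The choice of the city-block metric and the function name are assumptions made for this example; how the paper derives word boundaries from the transform is not shown here.

```python
def distance_transform(binary):
    """City-block distance from every pixel to the nearest foreground (1) pixel.

    Uses the classic two-pass scheme: a forward pass propagating distances from
    the top-left, then a backward pass propagating from the bottom-right.
    """
    rows, cols = len(binary), len(binary[0])
    far = rows + cols  # larger than any possible city-block distance in the image
    dist = [[0 if binary[r][c] else far for c in range(cols)] for r in range(rows)]

    for r in range(rows):  # forward pass: look up and left
        for c in range(cols):
            if r > 0:
                dist[r][c] = min(dist[r][c], dist[r - 1][c] + 1)
            if c > 0:
                dist[r][c] = min(dist[r][c], dist[r][c - 1] + 1)

    for r in reversed(range(rows)):  # backward pass: look down and right
        for c in reversed(range(cols)):
            if r < rows - 1:
                dist[r][c] = min(dist[r][c], dist[r + 1][c] + 1)
            if c < cols - 1:
                dist[r][c] = min(dist[r][c], dist[r][c + 1] + 1)
    return dist


image = [[0, 0, 0],
         [0, 1, 0],
         [0, 0, 0]]
distances = distance_transform(image)
```

Thresholding such a distance map groups nearby strokes into one region, which is one plausible way a word's outer boundary could be obtained.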

preprint · 2010 · arXiv

A Hough Transform based Technique for Text Segmentation

Text segmentation is an inherent part of an OCR system irrespective of its application domain. The OCR system contains a segmentation module where the text lines, words and ultimately the characters must be segmented properly for successful recognition. The present work implements a Hough transform based technique for line and word segmentation from digitized images. The proposed technique is applied not only to a document image dataset but also to datasets for a business card reader system and a license plate recognition system. To standardize the evaluation of the system's performance, the technique is also applied to the public domain dataset published on the website of CMATER, Jadavpur University. The document images consist of multi-script printed and handwritten text lines with varying script and line spacing within a single document image. The technique performs quite satisfactorily when applied to low-resolution business card images captured with a mobile camera. The usefulness of the technique is verified by applying it in a commercial project for localizing vehicle license plates in surveillance camera images through the segmentation process itself. The accuracy of the technique for word segmentation, as verified experimentally, is 85.7% for document images, 94.6% for business card images and 88% for surveillance camera images.

preprint · 2010 · arXiv

A New Approach to Keyphrase Extraction Using Neural Networks

Keyphrases provide a simple way of describing a document, giving the reader some clues about its contents. Keyphrases can be useful in various applications such as retrieval engines, browsing interfaces, thesaurus construction, text mining, etc. There are also other tasks for which keyphrases are useful, as we discuss in this paper. This paper describes a neural network based approach to keyphrase extraction from scientific articles. Our results show that the proposed method performs better than some state-of-the-art keyphrase extraction approaches.

preprint · 2010 · arXiv

A novel approach for handwritten Devnagari character recognition

In this paper, a method for recognition of handwritten Devanagari characters is described. The feature vector consists of the accumulated directional gradient changes in different segments, the number of intersection points of the character, the type of spine present and the type of shirorekha present in the character. One Multi-layer Perceptron with conjugate-gradient training is used to classify these feature vectors. The method is applied to a database of 1000 sample characters, and the recognition rate obtained is 88.12%.

preprint · 2010 · arXiv

A Two Stage Classification Approach for Handwritten Devanagari Characters

The paper presents a two-stage classification approach for handwritten Devanagari characters. The first stage uses structural properties such as the shirorekha and the spine of the character, and the second stage exploits some intersection features of characters, which are fed to a feedforward neural network. A simple histogram-based method does not work for finding the shirorekha and the vertical bar (spine) in handwritten Devanagari characters, so we designed a differential distance based technique to find a near-straight line for the shirorekha and spine. This approach has been tested on 50000 samples and achieved an 89.12% success rate.

preprint · 2010 · arXiv

Binarizing Business Card Images for Mobile Devices

Business card images are heterogeneous in nature, as they often contain graphics, pictures and text of various fonts and sizes in both background and foreground. The conventional binarization techniques designed for document images therefore cannot be applied directly on mobile devices. In this paper, we present a fast binarization technique for camera-captured business card images. A card image is split into small blocks. Some of these blocks are classified as part of the background based on intensity variance. Then the non-text regions are eliminated, and the text regions are skew corrected and binarized using a simple yet adaptive technique. Experiments show that the technique is fast, efficient and applicable to mobile devices.
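
The block-classification step can be illustrated with a small sketch that splits a grayscale image into square blocks and labels low-variance blocks as background. The block size, the variance threshold and the function names are hypothetical; the abstract does not state the values or the exact decision rule used in the paper.

```python
def block_variance(image, top, left, size):
    """Population variance of pixel intensities in one block (clipped at the edges)."""
    pixels = [image[r][c]
              for r in range(top, min(top + size, len(image)))
              for c in range(left, min(left + size, len(image[0])))]
    mean = sum(pixels) / len(pixels)
    return sum((p - mean) ** 2 for p in pixels) / len(pixels)


def classify_blocks(image, size, threshold):
    """Label each block 'background' if its intensity variance is below threshold."""
    labels = {}
    for top in range(0, len(image), size):
        for left in range(0, len(image[0]), size):
            variance = block_variance(image, top, left, size)
            labels[(top, left)] = "background" if variance < threshold else "foreground"
    return labels


# Toy 4x4 image: only the top-right block mixes dark and bright pixels.
card = [[200, 200, 0, 255],
        [200, 200, 255, 0],
        [200, 200, 200, 200],
        [200, 200, 200, 200]]
labels = classify_blocks(card, size=2, threshold=100)
```

Uniform regions have near-zero variance, so this cheap test discards most of the card before any costlier text processing, which suits the limited resources of a mobile device.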

preprint · 2010 · arXiv

Classification of fused face images using multilayer perceptron neural network

This paper presents a concept of image pixel fusion of visual and thermal faces, which can significantly improve the overall performance of a face recognition system. Several factors affect face recognition performance, including pose variations, facial expression changes, occlusions and, most importantly, illumination changes. Image pixel fusion of thermal and visual images is a solution to overcome the drawbacks present in the individual thermal and visual face images. Fused images are projected into eigenspace and finally classified using a multi-layer perceptron. In the experiments we have used the Object Tracking and Classification Beyond Visible Spectrum (OTCBVS) benchmark database of thermal and visual face images. Experimental results show that the proposed approach significantly improves the verification and identification performance, with a success rate of 95.07%. The main objective of employing fusion is to produce a fused image that provides the most detailed and reliable information. Fusing multiple images produces a more efficient representation of the image.

preprint · 2010 · arXiv

Classification Of Gradient Change Features Using MLP For Handwritten Character Recognition

A novel, generic scheme for recognizing off-line handwritten English alphabet character images is proposed. The advantage of the technique is that it can be applied in a generic manner to different applications and is expected to perform better in uncertain and noisy environments. The recognition scheme uses a multilayer perceptron (MLP) neural network. The system was trained and tested on a database of 300 samples of handwritten characters. For improved generalization and to avoid overtraining, the whole available dataset was divided into two subsets: a training set and a test set. We achieved 99.10% and 94.15% correct recognition rates on the training and test sets respectively. The proposed scheme is robust with respect to various writing styles and sizes as well as the presence of considerable noise.

preprint · 2010 · arXiv

Classification of Log-Polar-Visual Eigenfaces using Multilayer Perceptron

In this paper we present a simple novel approach to tackle the challenges of scaling and rotation of face images in face recognition. The proposed approach registers the training and testing visual face images by log-polar transformation, which is capable of handling the complications introduced by scaling and rotation. Log-polar images are projected into eigenspace and finally classified using an improved multi-layer perceptron. In the experiments we have used the ORL face database and the Object Tracking and Classification Beyond Visible Spectrum (OTCBVS) database for visual face images. Experimental results show that the proposed approach significantly improves recognition performance when moving from visual to log-polar-visual face images. For the ORL face database, the recognition rate is 89.5% for visual face images and increases to 97.5% for log-polar-visual face images, whereas for the OTCBVS face database the recognition rate is 87.84% for visual images and 96.36% for log-polar-visual face images.
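
Why a log-polar mapping helps with scaling and rotation can be seen from the coordinate transform alone. The sketch below maps a point to (log r, theta) around a chosen center; scaling the point about the center shifts only the first coordinate, and rotating it shifts only the second. The function name and the choice of center are assumptions made for this example, and image resampling is not shown.

```python
import math


def to_log_polar(x, y, cx=0.0, cy=0.0):
    """Map a point to (log radius, angle) relative to the center (cx, cy).

    The point must not coincide with the center, where log(0) is undefined.
    """
    dx, dy = x - cx, y - cy
    radius = math.hypot(dx, dy)
    return math.log(radius), math.atan2(dy, dx)


rho, theta = to_log_polar(3.0, 4.0)

# Scaling by 2 about the center: rho shifts by log(2), theta is unchanged.
rho_scaled, theta_scaled = to_log_polar(6.0, 8.0)

# Rotating by 90 degrees about the center: theta shifts by pi/2, rho is unchanged.
rho_rotated, theta_rotated = to_log_polar(-4.0, 3.0)
```

Since both distortions become plain translations in the log-polar domain, they are much easier to compensate for before projection into eigenspace.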

preprint · 2010 · arXiv

Classification of Polar-Thermal Eigenfaces using Multilayer Perceptron for Human Face Recognition

This paper presents a novel approach to handle the challenges of face recognition. In this work thermal face images are considered, which minimize the effect of illumination changes and of occlusion due to moustaches, beards, adornments, etc. The proposed approach registers the training and testing thermal face images in polar coordinates, which is capable of handling the complications introduced by scaling and rotation. Polar images are projected into eigenspace and finally classified using a multi-layer perceptron. In the experiments we have used the Object Tracking and Classification Beyond Visible Spectrum (OTCBVS) benchmark database of thermal face images. Experimental results show that the proposed approach significantly improves the verification and identification performance, with a success rate of 97.05%.

preprint · 2010 · arXiv

Combining Multiple Feature Extraction Techniques for Handwritten Devnagari Character Recognition

In this paper we present an OCR for Handwritten Devnagari Characters. Basic symbols are recognized by a neural classifier. We have used four feature extraction techniques, namely intersection, shadow feature, chain code histogram and straight line fitting features. Shadow features are computed globally for the character image, while intersection features, chain code histogram features and line fitting features are computed by dividing the character image into different segments. A weighted majority voting technique is used to combine the classification decisions obtained from four Multi Layer Perceptron (MLP) based classifiers. In experiments with a dataset of 4900 samples, the overall recognition rate observed is 92.80% when the top five choices are considered. This method is compared with other recent methods for Handwritten Devnagari Character Recognition, and it has been observed that this approach has a better success rate than the other methods.
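
Weighted majority voting over several classifiers can be sketched in a few lines. Each classifier's predicted label contributes its weight to that label's score, and the highest-scoring label wins. The weights, the labels and the function name below are hypothetical; the abstract does not say how the paper derives its weights, and the top-five-choices evaluation is not modeled.

```python
from collections import defaultdict


def weighted_majority_vote(predictions, weights):
    """Return the label with the largest total weight across classifiers."""
    scores = defaultdict(float)
    for label, weight in zip(predictions, weights):
        scores[label] += weight
    return max(scores, key=scores.get)


# Four hypothetical classifiers, one per feature set, voting on one character.
predictions = ["ka", "kha", "kha", "ga"]
weights = [0.9, 0.3, 0.4, 0.2]  # illustrative, e.g. per-classifier reliability
winner = weighted_majority_vote(predictions, weights)
```

Here the single high-weight classifier outvotes two lower-weight ones that agree with each other, which is exactly where weighted voting differs from a plain majority.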

preprint · 2010 · arXiv

Face Synthesis (FASY) System for Determining the Characteristics of a Face Image

This paper aims at determining the characteristics of a face image by extracting its components. The FASY (FAce SYnthesis) System is a Face Database Retrieval and new Face generation System that is under development. One of its main features is the generation of the requested face when it is not found in the existing database, which also allows the database to grow continuously. To generate the new face image, we need to store the face components in the database, so we have designed a new technique to extract the face components. After extracting the facial feature points, we analyze the components to determine their characteristics. After extraction and analysis, we store the components along with their characteristics in the face database for later use during face construction.

preprint · 2010 · arXiv

Face Synthesis (FASY) System for Generation of a Face Image from Human Description

This paper aims at generating a new face from a human-like description using a new concept. The FASY (FAce SYnthesis) System is a Face Database Retrieval and new Face generation System that is under development. One of its main features is the generation of the requested face when it is not found in the existing database, which also allows the database to grow continuously.

preprint · 2010 · arXiv

FPGA Based Assembling of Facial Components for Human Face Construction

This paper aims at a VLSI realization of the generation of a new face from a textual description. The FASY (FAce SYnthesis) System is a Face Database Retrieval and new Face generation System that is under development. One of its main features is the generation of the requested face when it is not found in the existing database. The new face generation system works in three steps: a searching phase, an assembling phase and a tuning phase. This paper presents the tuning phase, described in a hardware description language, and its implementation in a Field Programmable Gate Array (FPGA) device.

preprint · 2010 · arXiv

Fusion of Daubechies Wavelet Coefficients for Human Face Recognition

In this paper, fusion of visual and thermal images in the wavelet-transformed domain is presented. Here, coefficients of the Daubechies wavelet transform, referred to as D2, computed from visual images are combined with the corresponding coefficients computed in the same manner from thermal images to obtain fused coefficients. Fusion of coefficients is done after decomposition up to the fifth level (Level 5). The inverse Daubechies wavelet transform of those coefficients gives the fused face images. The main advantage of using the wavelet transform is that it is well suited to managing different image resolutions and allows image decomposition into different kinds of coefficients while preserving the image information. The fused images thus obtained are passed through Principal Component Analysis (PCA) for dimensionality reduction, and the reduced fused images are then classified using a multi-layer perceptron. The IRIS Thermal/Visual Face Database was used for the experiments. Experimental results show that the approach presented here achieves a maximum success rate of 100% in many cases.

preprint · 2010 · arXiv

Handwritten Bangla Basic and Compound character recognition using MLP and SVM classifier

A novel approach for recognition of handwritten compound Bangla characters, along with the Basic characters of the Bangla alphabet, is presented here. Compared to Roman scripts such as English, one of the major stumbling blocks in Optical Character Recognition (OCR) of handwritten Bangla script is the large number of complex shaped character classes of the Bangla alphabet. In addition to 50 basic character classes, there are nearly 160 complex shaped compound character classes in the Bangla alphabet. Dealing with such a large variety of handwritten characters with a suitably designed feature set is a challenging problem. Uncertainty and imprecision are inherent in handwritten script. Moreover, such a large variety of complex shaped characters, some of which closely resemble each other, makes the problem of OCR of handwritten Bangla characters more difficult. Considering the complexity of the problem, the present approach attempts to identify compound character classes from the most frequently to the less frequently occurring ones, i.e., in order of importance. The aim is to develop a framework for incrementally increasing the number of learned classes of compound characters, from more frequently occurring ones to less frequently occurring ones, along with Basic characters. In experiments, the technique is observed to produce an average recognition rate of 79.25% after three-fold cross validation of the data, with future scope for improvement and extension.

preprint2010arXiv

Human Face Recognition using Line Features

In this work we investigate a novel approach to handling the challenges of face recognition, which include rotation, scale, occlusion, illumination, etc. We use thermal face images because they minimize the effect of illumination changes and of occlusion due to moustaches, beards, adornments, etc. The proposed approach registers the training and testing thermal face images in polar coordinates, which handles the complications introduced by scaling and rotation. Line features are extracted from the thermal polar images, and feature vectors are constructed from these lines. The feature vectors are then passed through principal component analysis (PCA) for dimensionality reduction. Finally, the images projected into eigenspace are classified using a multi-layer perceptron. In the experiments we used the Object Tracking and Classification Beyond Visible Spectrum (OTCBVS) database. Experimental results show that the proposed approach significantly improves verification and identification performance, with a success rate of 99.25%.
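To see why a polar representation helps with scaling and rotation, consider the following minimal sketch. It is not the paper's registration procedure; the function names and the choice of the origin as image centre are assumptions made for illustration. It shows that rotating and scaling a point about the centre only shifts its angle and multiplies its radius.

```python
import math

def to_polar(x, y, cx=0.0, cy=0.0):
    # Convert a pixel position to (radius, angle) around an assumed centre.
    dx, dy = x - cx, y - cy
    return math.hypot(dx, dy), math.atan2(dy, dx)

def rotate_and_scale(x, y, angle, scale):
    # Rotate a point about the origin by `angle` radians, then scale it.
    c, s = math.cos(angle), math.sin(angle)
    return scale * (x * c - y * s), scale * (x * s + y * c)

r0, t0 = to_polar(3.0, 4.0)
x1, y1 = rotate_and_scale(3.0, 4.0, angle=0.5, scale=2.0)
r1, t1 = to_polar(x1, y1)
# The radius is multiplied by the scale factor and the angle is shifted by
# the rotation, so both distortions become simple offsets in polar space.
```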

preprint2010arXiv

Image Pixel Fusion for Human Face Recognition

In this paper we present a technique for fusing optical and thermal face images based on an image pixel fusion approach. Among the factors that affect face recognition performance on visual images, illumination change is a significant one that needs to be addressed. Thermal images handle illumination conditions better but are less consistent in capturing the texture details of faces. Other factors such as sunglasses, beards and moustaches also complicate the recognition process. Fusing thermal and visual images overcomes the drawbacks of the individual thermal and visual face images. The fused images are projected into an eigenspace, and the projected images are classified using a radial basis function (RBF) neural network and also a multi-layer perceptron (MLP). The Object Tracking and Classification Beyond Visible Spectrum (OTCBVS) benchmark database of thermal and visual face images was used in the experiments. A comparison of the experimental results shows that the proposed approach performs well in recognizing face images, with success rates of 96% and 95.07% for the RBF neural network and the MLP, respectively.
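Pixel-level fusion can be sketched as a per-pixel weighted combination of two registered images. The abstract does not state the fusion weights, so the `alpha` parameter and the function name below are assumptions, and the sketch leaves out the eigenspace projection and classification stages.

```python
def fuse_pixels(visual, thermal, alpha=0.5):
    # Blend two equally sized grayscale images pixel by pixel.
    # alpha is the (assumed) weight of the visual image; 1 - alpha goes to thermal.
    if len(visual) != len(thermal) or any(
        len(v_row) != len(t_row) for v_row, t_row in zip(visual, thermal)
    ):
        raise ValueError("images must have the same shape")
    return [
        [alpha * v + (1 - alpha) * t for v, t in zip(v_row, t_row)]
        for v_row, t_row in zip(visual, thermal)
    ]

fused = fuse_pixels([[100, 200]], [[50, 0]], alpha=0.5)
```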

preprint2010arXiv

Multiple Classifier Combination for Off-line Handwritten Devnagari Character Recognition

This work applies a weighted majority voting technique to combine the classification decisions of three Multi-Layer Perceptron (MLP) based classifiers for recognition of handwritten Devnagari characters, using three different feature sets. The features used are intersection features, shadow features and chain code histogram features. Shadow features are computed globally for the character image, while intersection features and chain code histogram features are computed by dividing the character image into segments. In experiments on a dataset of 4900 samples, the overall recognition rate is 92.16% when the top five choices are considered. The method is compared with other recent methods for handwritten Devnagari character recognition and is observed to have a better success rate.
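Weighted majority voting with a top-k result list can be sketched as follows. The weights, labels and tie-breaking rule are hypothetical; the abstract does not say how the classifier weights were chosen.

```python
from collections import defaultdict

def weighted_majority_vote(predictions, weights, top_k=1):
    # Sum each classifier's weight onto the label it predicted, then return
    # the top_k labels by total weight (ties broken alphabetically).
    scores = defaultdict(float)
    for label, weight in zip(predictions, weights):
        scores[label] += weight
    ranked = sorted(scores, key=lambda lbl: (-scores[lbl], str(lbl)))
    return ranked[:top_k]

# Three classifiers (one per feature set) vote on a single character.
winner = weighted_majority_vote(["ka", "kha", "kha"], [0.5, 0.3, 0.3])
```

Here the two lower-weight classifiers together outvote the single higher-weight one, so `winner` is `["kha"]`.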

preprint2010arXiv

Performance Comparison of SVM and ANN for Handwritten Devnagari Character Recognition

Classification methods based on learning from examples have been widely applied to character recognition since the 1990s and have brought significant improvements in recognition accuracy. This class of methods includes statistical methods, artificial neural networks (ANN), support vector machines (SVM), multiple classifier combination, etc. In this paper, we discuss the characteristics of some classification methods that have been successfully applied to handwritten Devnagari character recognition, and report the results of SVM and ANN classification applied to handwritten Devnagari characters. After preprocessing the character image, we extract shadow features, chain code histogram features, view-based features and longest-run features. These features are then fed to a neural classifier and to a support vector machine for classification. In the neural classifier, we explore three ways of combining the decisions of four MLPs designed for four different features.

preprint2010arXiv

Quotient Based Multiresolution Image Fusion of Thermal and Visual Images Using Daubechies Wavelet Transform for Human Face Recognition

This paper investigates multiresolution level-1 and level-2 quotient-based fusion of thermal and visual images. In the proposed system, method 1, "Decompose then Quotient Fuse Level-1", and method 2, "Decompose-Reconstruct then Quotient Fuse Level-2", both work on wavelet transformations of the visual and thermal face images. The wavelet transform is well suited to managing different image resolutions and allows an image to be decomposed into different kinds of coefficients while preserving the image information without any loss. The approach is based on the definition of an illumination-invariant signature image, which enables analytic generation of the image space under varying illumination. The quotient-fused images are passed through Principal Component Analysis (PCA) for dimensionality reduction and are then classified using a multi-layer perceptron (MLP). The performance of both methods has been evaluated using the OTCBVS and IRIS databases. All classes were tested separately; among them, the maximum recognition result is 100%.

preprint2010arXiv

Recognition of Non-Compound Handwritten Devnagari Characters using a Combination of MLP and Minimum Edit Distance

This paper presents a new two-stage method for recognition of offline handwritten non-compound Devnagari characters. It uses two well-established pattern recognition techniques: one based on neural networks and the other on minimum edit distance. Each technique is applied to a different set of characters. In the first stage, two sets of features are computed and two classifiers are applied to obtain higher recognition accuracy. Two MLPs are used separately to recognize the characters: for one MLP the characters are represented by their shadow features, and for the other by chain code histogram features. The decisions of the two MLPs are combined using a weighted majority scheme. The top three results produced by the combined MLPs in the first stage are used to calculate relative difference values. In the second stage, the character set is divided into two based on these relative differences. The first set consists of characters with distinct shapes, and the second set consists of confused characters, which appear very similar in shape. Characters in the first set are classified using the MLP. Confused characters in the second set are classified using the minimum edit distance method, which makes use of corners detected in a character image using a modified Harris corner detection technique. The method was evaluated on a database of 7154 samples. The overall recognition rate is found to be 90.74%.
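Minimum edit distance classification can be sketched with the standard Levenshtein distance. How the paper encodes detected corners into comparable sequences is not stated in the abstract, so the sequences, template dictionary and function names below are hypothetical placeholders.

```python
def edit_distance(a, b):
    # Levenshtein distance between two sequences, computed row by row.
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]

def nearest_template(query, templates):
    # Return the label whose template sequence is closest to the query.
    return min(templates, key=lambda name: edit_distance(query, templates[name]))

# Hypothetical corner-code sequences for two easily confused characters.
templates = {"char_a": [1, 2, 3, 4], "char_b": [1, 3, 3, 1]}
label = nearest_template([1, 2, 3, 3], templates)
```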

preprint2010arXiv

Reduction of Feature Vectors Using Rough Set Theory for Human Face Recognition

In this paper we describe a procedure to reduce the size of the input feature vector. A complex pattern recognition problem such as face recognition involves input feature vectors of very high dimension. To reduce that dimension we use eigenspace projection (also called Principal Component Analysis), which is essentially a transformation of the feature space. To reduce it further, we apply a feature selection method to select the indispensable features, which remain in the final feature vectors. Features that are not selected are treated as redundant or superfluous and removed from the final feature vector. For feature selection we use the concepts of reduct and core from rough set theory. This method has shown very good performance. It is worth mentioning that in some cases the recognition rate increases as the feature vector dimension decreases.

preprint2010arXiv

Text Region Extraction from Business Card Images for Mobile Devices

Designing a Business Card Reader (BCR) for mobile devices is a challenge for researchers because of large deformations in the acquired images, the varied nature of business cards and, most importantly, the computational constraints of mobile devices. This paper presents a text extraction method designed as part of our work towards developing a BCR for mobile devices. First, the background of a camera-captured image is eliminated at a coarse level. Then, various rule-based techniques are applied to the Connected Components (CCs) to filter out noise and picture regions. The CCs identified as text are then binarized using an adaptive but lightweight binarization technique. Experiments show that the text extraction accuracy is around 98% for a wide range of resolutions, with varying computation time and memory requirements. The optimum performance is achieved for images of resolution 1024x768 pixels, with a text extraction accuracy of 98.54% and space and time requirements of 1.1 MB and 0.16 seconds, respectively.

preprint2010arXiv

Text/Graphics Separation and Skew Correction of Text Regions of Business Card Images for Mobile Devices

Separation of text regions from background texture and graphics is an important step in any optical character recognition system for images containing both text and graphics. In this paper, we present a novel text/graphics separation technique and a method for skew correction of text regions extracted from business card images captured with a cell-phone camera. First, the background is eliminated at a coarse level based on intensity variance. This makes the foreground components distinct from each other. Then the non-text components are removed using various characteristic features of text and graphics. Finally, the text regions are skew corrected for further processing. Experimenting with business card images of various resolutions, we found an optimum performance of 98.25% (recall) with 0.75 MP images, which takes 0.17 seconds of processing time and 1.1 MB of peak memory on a moderately powerful computer (DualCore 1.73 GHz Processor, 1 GB RAM, 1 MB L2 Cache). The developed technique is computationally efficient and consumes little memory, making it applicable to mobile devices.
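Coarse background elimination based on intensity variance can be sketched as a block-wise test: flat regions have low variance and are treated as background. The block size, threshold and function names here are hypothetical; the abstract does not give the actual parameters or the exact rule.

```python
def block_variance(img, top, left, size):
    # Population variance of the grayscale values inside one block
    # (blocks at the image border are clipped to the image).
    values = [
        img[r][c]
        for r in range(top, min(top + size, len(img)))
        for c in range(left, min(left + size, len(img[0])))
    ]
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

def foreground_mask(img, size=2, threshold=100.0):
    # One boolean per block: True where variance exceeds the assumed threshold.
    return [
        [block_variance(img, top, left, size) > threshold
         for left in range(0, len(img[0]), size)]
        for top in range(0, len(img), size)
    ]

# Left block is flat (background); right block has strong contrast (foreground).
img = [[200, 200, 0, 255],
       [200, 200, 255, 0]]
mask = foreground_mask(img, size=2, threshold=100.0)
```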

preprint2010arXiv

Text/Graphics Separation for Business Card Images for Mobile Devices

Separation of text regions from background texture and graphics is an important step in any optical character recognition system for images containing both text and graphics. In this paper, we present a novel text/graphics separation technique for business card images captured with a cell-phone camera. First, the background is eliminated at a coarse level based on intensity variance. This makes the foreground components distinct from each other. Then the non-text components are removed using various characteristic features of text and graphics. Finally, the text regions are skew corrected and binarized for further processing. Experimenting with business card images of various resolutions, we found an optimum performance of 98.54% with 0.75 MP images, which takes 0.17 seconds of processing time and 1.1 MB of peak memory on a moderately powerful computer (DualCore 1.73 GHz Processor, 1 GB RAM, 1 MB L2 Cache). The developed technique is computationally efficient and consumes little memory, making it applicable to mobile devices.

preprint2010arXiv

Word level Script Identification from Bangla and Devanagri Handwritten Texts mixed with Roman Script

India is a multi-lingual country where Roman script is often used alongside different Indic scripts in a text document. To develop a script-specific handwritten Optical Character Recognition (OCR) system, it is therefore necessary to identify the scripts of handwritten text correctly. In this paper, we present a system that automatically separates the scripts of handwritten words in a document written in Bangla or Devanagri mixed with Roman script. In this script separation technique, we first extract the text lines and words from document pages using a script-independent Neighboring Component Analysis technique. We then design a Multi Layer Perceptron (MLP) based classifier for script separation, trained with 8 different word-level holistic features. Two equal-sized datasets, one with Bangla and Roman scripts and the other with Devanagri and Roman scripts, are prepared for system evaluation. On the respective independent text samples, word-level script identification accuracies of 99.29% and 98.43% are achieved.