Researcher profile

Nibaran Das

Nibaran Das contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 21 - EmergingVerification L1Unclaimed author
17works
0followers
3topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

17 published item(s)

preprint2021arXiv

Exploring Knowledge Distillation of a Deep Neural Network for Multi-Script identification

Multi-lingual script identification is a difficult task consisting of different language with complex backgrounds in scene text images. According to the current research scenario, deep neural networks are employed as teacher models to train a smaller student network by utilizing the teacher model's predictions. This process is known as dark knowledge transfer. It has been quite successful in many domains where the final result obtained is unachievable through directly training the student network with a simple architecture. In this paper, we explore dark knowledge transfer approach using long short-term memory(LSTM) and CNN based assistant model and various deep neural networks as the teacher model, with a simple CNN based student network, in this domain of multi-script identification from natural scene text images. We explore the performance of different teacher models and their ability to transfer knowledge to a student network. Although the small student network's limited size, our approach obtains satisfactory results on a well-known script identification dataset CVSI-2015.

preprint2021arXiv

Multispectral Object Detection with Deep Learning

Object detection in natural scenes can be a challenging task. In many real-life situations, the visible spectrum is not suitable for traditional computer vision tasks. Moving outside the visible spectrum range, such as the thermal spectrum or the near-infrared (NIR) images, is much more beneficial in low visibility conditions, NIR images are very helpful for understanding the object's material quality. In this work, we have taken images with both the Thermal and NIR spectrum for the object detection task. As multi-spectral data with both Thermal and NIR is not available for the detection task, we needed to collect data ourselves. Data collection is a time-consuming process, and we faced many obstacles that we had to overcome. We train the YOLO v3 network from scratch to detect an object from multi-spectral images. Also, to avoid overfitting, we have done data augmentation and tune hyperparameters.

preprint2021arXiv

RectiNet-v2: A stacked network architecture for document image dewarping

With the advent of mobile and hand-held cameras, document images have found their way into almost every domain. Dewarping of these images for the removal of perspective distortions and folds is essential so that they can be understood by document recognition algorithms. For this, we propose an end-to-end CNN architecture that can produce distortion free document images from warped documents it takes as input. We train this model on warped document images simulated synthetically to compensate for lack of enough natural data. Our method is novel in the use of a bifurcated decoder with shared weights to prevent intermingling of grid coordinates, in the use of residual networks in the U-Net skip connections to allow flow of data from different receptive fields in the model, and in the use of a gated network to help the model focus on structure and line level detail of the document image. We evaluate our method on the DocUNet dataset, a benchmark in this domain, and obtain results comparable to state-of-the-art methods.

preprint2020arXiv

A Gated and Bifurcated Stacked U-Net Module for Document Image Dewarping

Capturing images of documents is one of the easiest and most used methods of recording them. These images however, being captured with the help of handheld devices, often lead to undesirable distortions that are hard to remove. We propose a supervised Gated and Bifurcated Stacked U-Net module to predict a dewarping grid and create a distortion free image from the input. While the network is trained on synthetically warped document images, results are calculated on the basis of real world images. The novelty in our methods exists not only in a bifurcation of the U-Net to help eliminate the intermingling of the grid coordinates, but also in the use of a gated network which adds boundary and other minute line level details to the model. The end-to-end pipeline proposed by us achieves state-of-the-art performance on the DocUNet dataset after being trained on just 8 percent of the data used in previous methods.

preprint2020arXiv

A Genetic Algorithm based Kernel-size Selection Approach for a Multi-column Convolutional Neural Network

Deep neural network-based architectures give promising results in various domains including pattern recognition. Finding the optimal combination of the hyper-parameters of such a large-sized architecture is tedious and requires a large number of laboratory experiments. But, identifying the optimal combination of a hyper-parameter or appropriate kernel size for a given architecture of deep learning is always a challenging and tedious task. Here, we introduced a genetic algorithm-based technique to reduce the efforts of finding the optimal combination of a hyper-parameter (kernel size) of a convolutional neural network-based architecture. The method is evaluated on three popular datasets of different handwritten Bangla characters and digits. The implementation of the proposed methodology can be found in the following link: https://github.com/DeepQn/GA-Based-Kernel-Size.

preprint2020arXiv

A Skip-connected Multi-column Network for Isolated Handwritten Bangla Character and Digit recognition

Finding local invariant patterns in handwrit-ten characters and/or digits for optical character recognition is a difficult task. Variations in writing styles from one person to another make this task challenging. We have proposed a non-explicit feature extraction method using a multi-scale multi-column skip convolutional neural network in this work. Local and global features extracted from different layers of the proposed architecture are combined to derive the final feature descriptor encoding a character or digit image. Our method is evaluated on four publicly available datasets of isolated handwritten Bangla characters and digits. Exhaustive comparative analysis against contemporary methods establishes the efficacy of our proposed approach.

preprint2020arXiv

Cytology Image Analysis Techniques Towards Automation: Systematically Revisited

Cytology is the branch of pathology which deals with the microscopic examination of cells for diagnosis of carcinoma or inflammatory conditions. Automation in cytology started in the early 1950s with the aim to reduce manual efforts in diagnosis of cancer. The inflush of intelligent technological units with high computational power and improved specimen collection techniques helped to achieve its technological heights. In the present survey, we focus on such image processing techniques which put steps forward towards the automation of cytology. We take a short tour to 17 types of cytology and explore various segmentation and/or classification techniques which evolved during last three decades boosting the concept of automation in cytology. It is observed, that most of the works are aligned towards three types of cytology: Cervical, Breast and Lung, which are discussed elaborately in this paper. The user-end systems developed during that period are summarized to comprehend the overall growth in the respective domains. To be precise, we discuss the diversity of the state-of-the-art methodologies, their challenges to provide prolific and competent future research directions inbringing the cytology-based commercial systems into the mainstream.

preprint2020arXiv

EDC3: Ensemble of Deep-Classifiers using Class-specific Copula functions to Improve Semantic Image Segmentation

In the literature, many fusion techniques are registered for the segmentation of images, but they primarily focus on observed output or belief score or probability score of the output classes. In the present work, we have utilized inter source statistical dependency among different classifiers for ensembling of different deep learning techniques for semantic segmentation of images. For this purpose, in the present work, a class-wise Copula-based ensembling method is newly proposed for solving the multi-class segmentation problem. Experimentally, it is observed that the performance has improved more for semantic image segmentation using the proposed class-specific Copula function than the traditionally used single Copula function for the problem. The performance is also compared with three state-of-the-art ensembling methods.

preprint2020arXiv

Multistage Curvilinear Coordinate Transform Based Document Image Dewarping using a Novel Quality Estimator

The present work demonstrates a fast and improved technique for dewarping nonlinearly warped document images. The images are first dewarped at the page-level by estimating optimum inverse projections using curvilinear homography. The quality of the process is then estimated by evaluating a set of metrics related to the characteristics of the text lines and rectilinear objects for measuring parallelism, orthogonality, etc. These are designed specifically to estimate the quality of the dewarping process without the need of any ground truth. If the quality is estimated to be unsatisfactory, the page-level dewarping process is repeated with finer approximations. This is followed by a line-level dewarping process that makes granular corrections to the warps in individual text-lines. The methodology has been tested on the CBDAR 2007 / IUPR 2011 document image dewarping dataset and is seen to yield the best OCR accuracy in the shortest amount of time, till date. The usefulness of the methodology has also been evaluated on the DocUNet 2018 dataset with some minor tweaks, and is seen to produce comparable results.

preprint2020arXiv

Skin Diseases Detection using LBP and WLD- An Ensembling Approach

In all developing and developed countries in the world, skin diseases are becoming a very frequent health problem for the humans of all age groups. Skin problems affect mental health, develop addiction to alcohol and drugs and sometimes causes social isolation. Considering the importance, we propose an automatic technique to detect three popular skin diseases- Leprosy, Tinea versicolor and Vitiligofrom the images of skin lesions. The proposed technique involves Weber local descriptor and Local binary pattern to represent texture pattern of the affected skin regions. This ensemble technique achieved 91.38% accuracy using multi-level support vector machine classifier, where features are extracted from different regions that are based on center of gravity. We have also applied some popular deep learn-ing networks such as MobileNet, ResNet_152, GoogLeNet,DenseNet_121, and ResNet_101. We get 89% accuracy using ResNet_101. The ensemble approach clearly outperform all of the used deep learning networks. This imaging tool will be useful for early skin disease screening.

preprint2020arXiv

SynCGAN: Using learnable class specific priors to generate synthetic data for improving classifier performance on cytological images

One of the most challenging aspects of medical image analysis is the lack of a high quantity of annotated data. This makes it difficult for deep learning algorithms to perform well due to a lack of variations in the input space. While generative adversarial networks have shown promise in the field of synthetic data generation, but without a carefully designed prior the generation procedure can not be performed well. In the proposed approach we have demonstrated the use of automatically generated segmentation masks as learnable class-specific priors to guide a conditional GAN for the generation of patho-realistic samples for cytology image. We have observed that augmentation of data using the proposed pipeline called "SynCGAN" improves the performance of state of the art classifiers such as ResNet-152, DenseNet-161, Inception-V3 significantly.

preprint2010arXiv

A comparative study of different feature sets for recognition of handwritten Arabic numerals using a Multi Layer Perceptron

The work presents a comparative assessment of seven different feature sets for recognition of handwritten Arabic numerals using a Multi Layer Perceptron (MLP) based classifier. The seven feature sets employed here consist of shadow features, octant centroids, longest runs, angular distances, effective spans, dynamic centers of gravity, and some of their combinations. On experimentation with a database of 3000 samples, the maximum recognition rate of 95.80% is observed with both of two separate combinations of features. One of these combinations consists of shadow and centriod features, i. e. 88 features in all, and the other shadow, centroid and longest run features, i. e. 124 features in all. Out of these two, the former combination having a smaller number of features is finally considered effective for applications related to Optical Character Recognition (OCR) of handwritten Arabic numerals. The work can also be extended to include OCR of handwritten characters of Arabic alphabet.

preprint2010arXiv

Binarizing Business Card Images for Mobile Devices

Business card images are of multiple natures as these often contain graphics, pictures and texts of various fonts and sizes both in background and foreground. So, the conventional binarization techniques designed for document images can not be directly applied on mobile devices. In this paper, we have presented a fast binarization technique for camera captured business card images. A card image is split into small blocks. Some of these blocks are classified as part of the background based on intensity variance. Then the non-text regions are eliminated and the text ones are skew corrected and binarized using a simple yet adaptive technique. Experiment shows that the technique is fast, efficient and applicable for the mobile devices.

preprint2010arXiv

Handwritten Arabic Numeral Recognition using a Multi Layer Perceptron

Handwritten numeral recognition is in general a benchmark problem of Pattern Recognition and Artificial Intelligence. Compared to the problem of printed numeral recognition, the problem of handwritten numeral recognition is compounded due to variations in shapes and sizes of handwritten characters. Considering all these, the problem of handwritten numeral recognition is addressed under the present work in respect to handwritten Arabic numerals. Arabic is spoken throughout the Arab World and the fifth most popular language in the world slightly before Portuguese and Bengali. For the present work, we have developed a feature set of 88 features is designed to represent samples of handwritten Arabic numerals for this work. It includes 72 shadow and 16 octant features. A Multi Layer Perceptron (MLP) based classifier is used here for recognition handwritten Arabic digits represented with the said feature set. On experimentation with a database of 3000 samples, the technique yields an average recognition rate of 94.93% evaluated after three-fold cross validation of results. It is useful for applications related to OCR of handwritten Arabic Digit and can also be extended to include OCR of handwritten characters of Arabic alphabet.

preprint2010arXiv

Handwritten Bangla Basic and Compound character recognition using MLP and SVM classifier

A novel approach for recognition of handwritten compound Bangla characters, along with the Basic characters of Bangla alphabet, is presented here. Compared to English like Roman script, one of the major stumbling blocks in Optical Character Recognition (OCR) of handwritten Bangla script is the large number of complex shaped character classes of Bangla alphabet. In addition to 50 basic character classes, there are nearly 160 complex shaped compound character classes in Bangla alphabet. Dealing with such a large varieties of handwritten characters with a suitably designed feature set is a challenging problem. Uncertainty and imprecision are inherent in handwritten script. Moreover, such a large varieties of complex shaped characters, some of which have close resemblance, makes the problem of OCR of handwritten Bangla characters more difficult. Considering the complexity of the problem, the present approach makes an attempt to identify compound character classes from most frequently to less frequently occurred ones, i.e., in order of importance. This is to develop a frame work for incrementally increasing the number of learned classes of compound characters from more frequently occurred ones to less frequently occurred ones along with Basic characters. On experimentation, the technique is observed produce an average recognition rate of 79.25 after three fold cross validation of data with future scope of improvement and extension.

preprint2010arXiv

Text Region Extraction from Business Card Images for Mobile Devices

Designing a Business Card Reader (BCR) for mobile devices is a challenge to the researchers because of huge deformation in acquired images, multiplicity in nature of the business cards and most importantly the computational constraints of the mobile devices. This paper presents a text extraction method designed in our work towards developing a BCR for mobile devices. At first, the background of a camera captured image is eliminated at a coarse level. Then, various rule based techniques are applied on the Connected Components (CC) to filter out the noises and picture regions. The CCs identified as text are then binarized using an adaptive but light-weight binarization technique. Experiments show that the text extraction accuracy is around 98% for a wide range of resolutions with varying computation time and memory requirements. The optimum performance is achieved for the images of resolution 1024x768 pixels with text extraction accuracy of 98.54% and, space and time requirements as 1.1 MB and 0.16 seconds respectively.

preprint2010arXiv

Word level Script Identification from Bangla and Devanagri Handwritten Texts mixed with Roman Script

India is a multi-lingual country where Roman script is often used alongside different Indic scripts in a text document. To develop a script specific handwritten Optical Character Recognition (OCR) system, it is therefore necessary to identify the scripts of handwritten text correctly. In this paper, we present a system, which automatically separates the scripts of handwritten words from a document, written in Bangla or Devanagri mixed with Roman scripts. In this script separation technique, we first, extract the text lines and words from document pages using a script independent Neighboring Component Analysis technique. Then we have designed a Multi Layer Perceptron (MLP) based classifier for script separation, trained with 8 different wordlevel holistic features. Two equal sized datasets, one with Bangla and Roman scripts and the other with Devanagri and Roman scripts, are prepared for the system evaluation. On respective independent text samples, word-level script identification accuracies of 99.29% and 98.43% are achieved.