Researcher profile

Manoranjan Paul

Manoranjan Paul contributes to research discovery and scholarly infrastructure.

Researcher
Affiliation not imported
Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
13 works
0 followers
6 topics
4 close collaborators

Published work

13 published items

preprint, 2026, arXiv

A Sparse-Attention Deep Learning Model Integrating Heterogeneous Multimodal Features for Parkinson's Disease Severity Profiling

Characterising the heterogeneous presentation of Parkinson's disease (PD) requires integrating biological and clinical markers within a unified predictive framework. While multimodal data provide complementary information, many existing computational models struggle with interpretability, class imbalance, or effective fusion of high-dimensional imaging and tabular clinical features. To address these limitations, we propose the Class-Weighted Sparse-Attention Fusion Network (SAFN), an interpretable deep learning framework for robust multimodal profiling. SAFN integrates MRI cortical thickness, MRI volumetric measures, clinical assessments, and demographic variables using modality-specific encoders and a symmetric cross-attention mechanism that captures nonlinear interactions between imaging and clinical representations. A sparsity-constrained attention-gating fusion layer dynamically prioritises informative modalities, while a class-balanced focal loss (beta = 0.999, gamma = 1.5) mitigates dataset imbalance without synthetic oversampling. Evaluated on 703 participants (570 PD, 133 healthy controls) from the Parkinson's Progression Markers Initiative using subject-wise five-fold cross-validation, SAFN achieves an accuracy of 0.98 ± 0.02 and a PR-AUC of 1.00 ± 0.00, outperforming established machine learning and deep learning baselines. Interpretability analysis shows a clinically coherent decision process, with approximately 60 percent of predictive weight assigned to clinical assessments, consistent with Movement Disorder Society diagnostic principles. SAFN provides a reproducible and transparent multimodal modelling paradigm for computational profiling of neurodegenerative disease.
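
The abstract names a class-balanced focal loss with beta = 0.999 and gamma = 1.5 but does not give its formula. The sketch below assumes the common "effective number of samples" weighting combined with the standard focal term; the function names and the use of the full-dataset class counts (570 PD, 133 controls) are illustrative assumptions, not the authors' implementation.

```python
import math

def class_balanced_weights(counts, beta=0.999):
    """Per-class weights proportional to (1 - beta) / (1 - beta**n_c).

    Assumption: the 'effective number of samples' scheme; weights are
    rescaled so that they sum to the number of classes.
    """
    raw = [(1.0 - beta) / (1.0 - beta ** n) for n in counts]
    scale = len(counts) / sum(raw)
    return [w * scale for w in raw]

def cb_focal_loss(probs, label, weights, gamma=1.5):
    """Class-balanced focal loss for one sample.

    probs  : predicted class probabilities (should sum to 1)
    label  : index of the true class
    weights: output of class_balanced_weights
    """
    p_t = min(max(probs[label], 1e-12), 1.0)  # clamp to avoid log(0)
    return -weights[label] * (1.0 - p_t) ** gamma * math.log(p_t)

# Class 0 = PD (570 participants), class 1 = healthy control (133).
weights = class_balanced_weights([570, 133])

confident = cb_focal_loss([0.05, 0.95], label=1, weights=weights)
uncertain = cb_focal_loss([0.60, 0.40], label=1, weights=weights)
```

In this sketch the minority class receives the larger weight, and the focal term shrinks the loss of well-classified samples, so `uncertain` is larger than `confident`.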

preprint, 2022, arXiv

Debiasing pipeline improves deep learning model generalization for X-ray based lung nodule detection

Lung cancer is the leading cause of cancer death worldwide and a good prognosis depends on early diagnosis. Unfortunately, screening programs for the early diagnosis of lung cancer are uncommon. This is in part due to the at-risk groups being located in rural areas far from medical facilities. Reaching these populations would require a scaled approach that combines mobility, low cost, speed, accuracy, and privacy. We can resolve these issues by combining the chest X-ray imaging mode with a federated deep-learning approach, provided that the federated model is trained on homogeneous data to ensure that no single data source can adversely bias the model at any point in time. In this study we show that an image pre-processing pipeline that homogenizes and debiases chest X-ray images can improve both internal classification and external generalization, paving the way for a low-cost and accessible deep learning-based clinical system for lung cancer screening. An evolutionary pruning mechanism is used to train a nodule detection deep learning model on the most informative images from a publicly available lung nodule X-ray dataset. Histogram equalization is used to remove systematic differences in image brightness and contrast. Model training is performed using all combinations of lung field segmentation, close cropping, and rib suppression operators. We show that this pre-processing pipeline results in deep learning models that successfully generalize to an independent lung nodule dataset, using ablation studies to assess the contribution of each operator in the pipeline. By stripping chest X-ray images of known confounding variables through lung field segmentation, along with suppression of signal noise from the bone structure, we can train a highly accurate deep learning lung nodule detection algorithm with an outstanding generalization accuracy of 89% on nodule samples in unseen data.
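
The abstract does not say which histogram equalization variant the pipeline uses. As an illustration only, the sketch below applies plain global equalization to 8-bit grey levels; the function name `equalize_histogram` and the flat pixel list are assumptions made for this example.

```python
def equalize_histogram(pixels, levels=256):
    """Global histogram equalization of integer grey levels in [0, levels).

    Each value v is mapped to
        round((cdf(v) - cdf_min) / (N - cdf_min) * (levels - 1)),
    which spreads the cumulative distribution over the full range.
    """
    hist = [0] * levels
    for v in pixels:
        hist[v] += 1

    # Cumulative distribution of grey levels.
    cdf, running = [0] * levels, 0
    for level, count in enumerate(hist):
        running += count
        cdf[level] = running

    n = len(pixels)
    cdf_min = min(c for c in cdf if c > 0)
    if n == cdf_min:
        # Constant image: there is no contrast to stretch.
        return list(pixels)

    return [round((cdf[v] - cdf_min) / (n - cdf_min) * (levels - 1))
            for v in pixels]

# A low-contrast strip of pixels is stretched across the 0..255 range.
stretched = equalize_histogram([100, 100, 110, 120, 120, 130])
```

Applying the same mapping to every image is one way to remove systematic brightness and contrast differences between acquisition sources.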

preprint, 2022, arXiv

Dilated convolutional neural network-based deep reference picture generation for video compression

Motion estimation and motion compensation are indispensable parts of inter prediction in video coding. Since the motion vectors of objects are mostly in fractional pixel units, original reference pictures may not provide a suitable reference for motion compensation. In this paper, we propose a deep reference picture generator which can create a picture that is more relevant to the current encoding frame, thereby further reducing temporal redundancy and improving video compression efficiency. Inspired by recent progress in Convolutional Neural Networks (CNNs), this paper proposes to use a dilated CNN to build the generator. Moreover, we insert the generated deep picture into Versatile Video Coding (VVC) as a reference picture and perform a comprehensive set of experiments to evaluate the effectiveness of our network on the latest VVC Test Model (VTM). The experimental results demonstrate that our proposed method achieves on average 9.7% bit saving compared with VVC under the low-delay P configuration.

preprint, 2022, arXiv

Dynamic Point Cloud Compression with Cross-Sectional Approach

The recent development of dynamic point clouds has introduced the possibility of mimicking natural reality and greatly assisting quality of life. However, to be broadcast successfully, dynamic point clouds require higher compression than traditional video because of their huge volume of data. Recently, MPEG finalized a Video-based Point Cloud Compression standard known as V-PCC. However, V-PCC requires huge computational time due to expensive normal calculation and segmentation, sacrifices some points to limit the number of 2D patches, and cannot occupy all the space in the 2D frame. The proposed method addresses these limitations by using a novel cross-sectional approach. This approach reduces expensive normal estimation and segmentation, retains more points, and uses more of the space in 2D frame generation compared to V-PCC. The experimental results using standard video sequences show that the proposed technique can achieve better compression of both geometric and texture data compared to the V-PCC standard.

preprint, 2022, arXiv

Efficient dynamic point cloud coding using Slice-Wise Segmentation

With the fast growth of immersive video sequences, achieving seamless and high-quality compressed 3D content is even more critical. MPEG recently developed a video-based point cloud compression (V-PCC) standard for dynamic point cloud coding. However, point clouds reconstructed using V-PCC suffer from different artifacts, including data lost during pre-processing before existing video coding techniques, e.g., High-Efficiency Video Coding (HEVC), are applied. Patch generation and self-occluded points in the 3D-to-2D projection are the main reasons for missing data in V-PCC. This paper proposes a new method that introduces overlapping slicing as an alternative to patch generation to decrease the number of patches generated and the amount of data lost. In the proposed method, the entire point cloud is cross-sectioned into variable-sized slices based on the number of self-occluded points so that data loss can be minimized in the patch generation process and projection. For this, a variable number of layers is considered, partially overlapped to retain the self-occluded points. An added advantage of the proposed method is that it reduces the bit requirement and encodes geometric data using the slicing base position. The experimental results show that the proposed method is much more flexible than the standard V-PCC method, improves the rate-distortion performance, and significantly decreases the data loss.

preprint, 2022, arXiv

Efficient Motion Modelling with Variable-sized blocks from Hierarchical Cuboidal Partitioning

Motion modelling with block-based architecture has been widely used in video coding, where a frame is divided into fixed-sized blocks that are motion compensated independently. This often leads to coding inefficiency as fixed-sized blocks rarely align with the object boundaries. Although hierarchical block-partitioning has been introduced to address this, the increased number of motion vectors limits the benefit. Recently, approximate segmentation of images with cuboidal partitioning has gained popularity. Not only are the variable-sized rectangular segments (cuboids) readily amenable to block-based image/video coding techniques, but they are also capable of aligning well with the object boundaries. This is because cuboidal partitioning is based on a homogeneity constraint, minimising the sum of squared errors (SSE). In this paper, we have investigated the potential of cuboids in motion modelling against the fixed-sized blocks used in scalable video coding. Specifically, we have constructed a motion-compensated current frame using the cuboidal partitioning information of the anchor frame in a group of pictures (GOP). The predicted current frame has then been used as the base layer while encoding the current frame as an enhancement layer using the scalable HEVC encoder. Experimental results confirm 6.71%-10.90% bitrate savings on 4K video sequences.
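
The SSE-based homogeneity constraint mentioned above can be illustrated with a single partitioning step. The sketch below is a simplified assumption about how one rectangular block might be split into two cuboids, not the authors' algorithm: it tries every horizontal and vertical cut and keeps the one with the lowest total SSE. The names `sse` and `best_split` are hypothetical.

```python
def sse(block):
    """Sum of squared errors of a 2D block around its own mean."""
    flat = [v for row in block for v in row]
    mean = sum(flat) / len(flat)
    return sum((v - mean) ** 2 for v in flat)

def best_split(block):
    """Return (cost, orientation, index) of the cut minimising total SSE.

    'horizontal' cuts between rows index-1 and index; 'vertical' cuts
    between columns index-1 and index. Returns None for a 1x1 block.
    """
    rows, cols = len(block), len(block[0])
    best = None
    for r in range(1, rows):
        cost = sse(block[:r]) + sse(block[r:])
        if best is None or cost < best[0]:
            best = (cost, "horizontal", r)
    for c in range(1, cols):
        left = [row[:c] for row in block]
        right = [row[c:] for row in block]
        cost = sse(left) + sse(right)
        if best is None or cost < best[0]:
            best = (cost, "vertical", c)
    return best

# A frame with a dark left half and a bright right half: the cheapest
# cut falls on the object boundary between columns 1 and 2.
frame = [[10, 10, 200, 200] for _ in range(3)]
cut = best_split(frame)
```

Applying such a step recursively to each resulting segment would give a hierarchy of variable-sized rectangles that tend to follow object boundaries.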

preprint, 2021, arXiv

MAVIDH Score: A COVID-19 Severity Scoring using Chest X-Ray Pathology Features

The application of computer vision for COVID-19 diagnosis is complex and challenging, given the risks associated with patient misclassifications. Arguably, the primary value of medical imaging for COVID-19 lies rather in patient prognosis. Radiological images can guide physicians in assessing the severity of the disease, and a series of images from the same patient at different stages can help to gauge disease progression. Hence, a simple method based on lung-pathology interpretable features for scoring disease severity from chest X-rays is proposed here. As the primary contribution, this method correlates well with patient severity in different stages of disease progression, with competitive results compared to other existing, more complex methods. An original data selection approach is also proposed, allowing the simple model to learn the severity-related features. It is hypothesized that the resulting competitive performance presented here is related to the method being feature-based rather than reliant on lung involvement or opacity as others in the literature are. A second contribution comes from the validation of the results, conceptualized as the scoring of patient groups from different stages of the disease. Besides performing such validation on an independent data set, the results were also compared with other proposed scoring methods in the literature. The results show that there is a significant correlation between the scoring system (MAVIDH) and patient outcome, which could potentially help physicians rate and follow disease progression in COVID-19 patients.

preprint, 2021, arXiv

Potential Features of ICU Admission in X-ray Images of COVID-19 Patients

X-ray images may present non-trivial features with predictive information about patients who develop severe symptoms of COVID-19. If true, this hypothesis may have practical value in allocating resources to particular patients while using a relatively inexpensive imaging technique. The difficulty of testing such a hypothesis comes from the need for large sets of labelled data, which need to be well-annotated and should include the post-imaging severity outcome. This paper presents an original methodology for extracting semantic features that correlate with severity from a data set with patient ICU admission labels through interpretable models. The methodology employs a neural network trained to recognise lung pathologies to extract the semantic features, which are then analysed with low-complexity models to limit overfitting while increasing interpretability. This analysis points out that only a few features explain most of the variance between patients who developed severe symptoms. When applied to an unrelated larger data set with pathology-related clinical notes, the method has been shown to be capable of selecting images for the learned features, which could translate some information about their common locations in the lung. Besides attesting separability of patients who eventually develop severe symptoms, the proposed methods represent a statistical approach highlighting the importance of features related to ICU admission that may have been only qualitatively reported. While handling limited data sets, notable methodological aspects are adopted, such as presenting a state-of-the-art lung segmentation network and the use of low-complexity models to avoid overfitting. The code for methodology and experiments is also available.

preprint, 2020, arXiv

A Practical Blockchain Framework using Image Hashing for Image Authentication

Blockchain is a relatively new technology that can be seen as a decentralised database. Blockchain systems heavily rely on cryptographic hash functions to store their data, which makes it difficult to tamper with any data stored in the system. A topic that was researched along with blockchain is image authentication. Image authentication focuses on investigating and maintaining the integrity of images. As a blockchain system can be useful for maintaining data integrity, image authentication has the potential to be enhanced by blockchain. There are many techniques that can be used to authenticate images; the technique investigated by this work is image hashing. Image hashing is a technique used to calculate how similar two different images are. This is done by converting the images into hashes and then comparing them using a distance formula. To investigate the topic, an experiment involving a simulated blockchain was created. The blockchain acted as a database for images. This blockchain was made up of devices which contained their own unique image hashing algorithms. The blockchain was tested by creating modified copies of the images contained in the database, and then submitting them to the blockchain to see if it will return the original image. Through this experiment it was discovered that it is plausible to create an image authentication system using blockchain and image hashing. However, the design proposed by this work requires refinement, as it appears to struggle in some situations. This work shows that blockchain can be a suitable approach for authenticating images, particularly via image hashing. Other observations include that using multiple image hash algorithms at the same time can increase performance in some cases, as well as that each type of test done to the blockchain has its own unique pattern to its data.
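
The abstract describes image hashing as converting images into hashes and comparing them with a distance formula, without naming the algorithms used. As one possible instance, the sketch below assumes an average hash over a small grayscale grid (already downsampled) and a Hamming distance; both choices and all names are illustrative, not the paper's design.

```python
def average_hash(pixels):
    """Average hash of a small grayscale image given as a 2D list.

    Each pixel contributes one bit: 1 if it is brighter than the image
    mean, otherwise 0. The image is assumed to be downsampled already.
    """
    flat = [v for row in pixels for v in row]
    mean = sum(flat) / len(flat)
    return [1 if v > mean else 0 for v in flat]

def hamming_distance(hash_a, hash_b):
    """Number of bit positions in which two equal-length hashes differ."""
    if len(hash_a) != len(hash_b):
        raise ValueError("hashes must have the same length")
    return sum(a != b for a, b in zip(hash_a, hash_b))

original = [
    [10, 20, 200, 210],
    [15, 25, 205, 215],
    [10, 20, 200, 210],
    [15, 25, 205, 215],
]
# A uniformly brightened copy and an inverted copy of the same image.
brightened = [[v + 10 for v in row] for row in original]
inverted = [[255 - v for v in row] for row in original]

d_bright = hamming_distance(average_hash(original), average_hash(brightened))
d_invert = hamming_distance(average_hash(original), average_hash(inverted))
```

A small distance suggests a modified copy of a stored image, while a large distance suggests a different image; where to place the threshold is an application-specific decision.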

preprint, 2020, arXiv

Automatically Assessing Quality of Online Health Articles

The information ecosystem today is overwhelmed by an unprecedented quantity of data on diverse topics with varied quality. However, the quality of information disseminated in the field of medicine has been questioned, as the negative consequences of health misinformation can be life-threatening. There is currently no generic automated tool for evaluating the quality of online health information spanning a broad range. To address this gap, in this paper, we applied a data mining approach to automatically assess the quality of online health articles based on 10 quality criteria. We have prepared a labeled dataset with 53012 features and applied different feature selection methods to identify the best feature subset, with which our trained classifier achieved an accuracy of 84%-90% across the 10 criteria. Our semantic analysis of features shows the underpinning associations between the selected features and assessment criteria and further rationalizes our assessment approach. Our findings will help in identifying high-quality health articles and thus aid users in shaping their opinions to make the right choice when seeking health-related help online.

preprint, 2020, arXiv

Hyperspectral Imaging to detect Age, Defects and Individual Nutrient Deficiency in Grapevine Leaves

Hyperspectral (HS) imaging was successfully employed in the 380 nm to 1000 nm wavelength range to investigate the efficacy of detecting age, healthiness and individual nutrient deficiency of grapevine leaves collected from vineyards located in central west NSW, Australia. For age detection, the appearance of many healthy grapevine leaves was examined. Then visually defective leaves were compared with healthy leaves. Control leaves and individual nutrient-deficient leaves (e.g. N, K and Mg) were also analysed. Several features were employed at various stages in the Ultraviolet (UV), Visible (VIS) and Near Infrared (NIR) regions to evaluate the experimental data: mean brightness, mean 1st derivative brightness, variation index, mean spectral ratio, normalised difference vegetation index (NDVI) and standard deviation (SD). Experimental results demonstrate that these features could be utilised with a high degree of effectiveness to compare age, identify unhealthy samples, and not only distinguish control leaves from nutrient-deficient leaves but also identify individual nutrient deficiencies. Therefore, our work corroborated that HS imaging has excellent potential as a non-destructive as well as non-contact method to detect age, healthiness and individual nutrient deficiencies of grapevine leaves.
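
Of the features listed above, NDVI has a standard definition: (NIR - Red) / (NIR + Red). The sketch below computes it from two reflectance values; the abstract does not state which wavelengths represent each band, so the example reflectances are made-up placeholders.

```python
def ndvi(nir, red):
    """Normalised difference vegetation index from two reflectances.

    Returns a value in [-1, 1] for non-negative inputs; 0.0 is returned
    when both reflectances are zero to avoid dividing by zero.
    """
    total = nir + red
    if total == 0:
        return 0.0
    return (nir - red) / total

# Placeholder reflectances: healthy leaves typically reflect strongly
# in the near infrared and absorb red light, giving a high NDVI.
healthy_like = ndvi(nir=0.50, red=0.10)
stressed_like = ndvi(nir=0.30, red=0.20)
```

With these placeholder values `healthy_like` is higher than `stressed_like`, which is the kind of contrast a feature-based comparison of leaf groups relies on.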

preprint, 2020, arXiv

Rain Streak Removal in a Video to Improve Visibility by TAWL Algorithm

In computer vision applications, the visibility of the video content is crucial for accurate analysis. Visibility can be affected by several atmospheric interferences in challenging weather; one of them is the appearance of rain streaks. Recently, rain streak removal has attracted considerable interest from researchers, as it has exciting applications such as autonomous cars, intelligent traffic monitoring systems, and multimedia. In this paper, we propose a novel and simple method that combines three novel extracted features focusing on the temporal appearance, wide shape and relative location of the rain streak, which we call the TAWL (Temporal Appearance, Width, and Location) method. The proposed TAWL method adaptively uses features from different resolutions and frame rates. Moreover, it progressively processes features from the upcoming frames so that it can remove rain in real time. The experiments have been conducted using video sequences with both real and synthetic rain to compare the performance of the proposed method against the relevant state-of-the-art methods. The experimental results demonstrate that the proposed method outperforms the state-of-the-art methods by removing more rain streaks while keeping other moving regions.

preprint, 2020, arXiv

Towards Domain-Specific Characterization of Misinformation

The rapid dissemination of health misinformation poses an increasing risk to public health. To understand how best to combat health misinformation, it is important to acknowledge how the fundamental characteristics of misinformation differ from domain to domain. This paper presents a pathway towards domain-specific characterization of misinformation so that we can address the concealed behavior of health misinformation compared to other types and take appropriate action to combat it. With this aim, we outline several possible approaches to identifying the features that discriminate medical misinformation from other types of misinformation. Thereafter, we briefly propose a research plan, followed by the challenges it is likely to face. The findings of the proposed research idea will provide new directions to the misinformation research community.