Researcher profile

Marcellin Atemkeng

Marcellin Atemkeng contributes to research discovery and scholarly infrastructure.

Researcher, Affiliation not imported, Open to collaborate

Trust snapshot

Quick read

Trust 21 (Emerging), Verification L1, Unclaimed author
9 works
0 followers
11 topics
4 close collaborators

Published work

9 published item(s)

preprint, 2026, arXiv

Anomaly Detection in Soil Heavy Metal Contamination Using Unsupervised Learning for Environmental Risk Assessment

Soil contamination by heavy metals poses a persistent environmental and public health concern in rapidly urbanising regions of Ghana, particularly at unregulated waste disposal sites. This study applies an unsupervised machine learning framework to detect and characterise anomalous heavy metal contamination patterns in soils from twelve waste sites and residential controls in the Central Region of Ghana. Concentrations of eight metals (As, Cd, Cr, Cu, Hg, Ni, Pb, Zn) were analysed alongside standard health risk indices, including the Hazard Index (HI) and Incremental Lifetime Cancer Risk (ILCR). Isolation Forest and PCA reconstruction error each identified $12$ anomalous samples ($15.4\%$ of $78$ samples), while DBSCAN detected no density-isolated noise points. A consensus approach isolated six robust anomalies ($7.7\%$), all spatially concentrated at a single site (S3). Anomalies exhibited approximately $70$--$80\%$ higher mean HI values than normal samples, with all consensus anomalies exceeding the HI$=1$ threshold. PCA reconstruction error showed a strong positive association with HI ($r \approx 0.8$), indicating consistency between multivariate deviation and health risk. Three distinct anomaly types were identified: extreme Cu enrichment at S3, anomalously low Ni at S4/S5, and moderate multi-metal (Pb--Zn) co-elevation at S9--S12. The results demonstrate that unsupervised machine learning provides granular, objective insight beyond aggregate indices, enabling targeted site prioritisation and risk-informed environmental management.
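
The abstract does not state how the consensus across detectors was formed. As a minimal sketch, assuming a simple voting rule in which a sample counts as a robust anomaly when at least two detectors flag it, the idea could look like the following. The function name, the `min_votes` parameter, and the toy sample ids are all hypothetical and are not taken from the paper.

```python
from collections import Counter


def consensus_anomalies(flags_by_detector, min_votes=2):
    """Return sorted sample ids flagged by at least `min_votes` detectors.

    `flags_by_detector` maps a detector name to the sample ids it flagged.
    The voting rule is an assumption for illustration only.
    """
    votes = Counter()
    for flagged in flags_by_detector.values():
        # set() guards against a detector listing the same sample twice
        votes.update(set(flagged))
    return sorted(sample for sample, n in votes.items() if n >= min_votes)


# Toy flags (made-up ids, not the study's data)
flags = {
    "isolation_forest": [3, 7, 11, 12],
    "pca_reconstruction": [7, 12, 20],
    "dbscan": [],  # a detector that finds no noise points contributes no votes
}

robust = consensus_anomalies(flags)
print(robust)  # [7, 12]
```

With a detector that flags nothing, a two-vote rule reduces to the intersection of the other two detectors' flags.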

preprint, 2026, arXiv

Bridging visual saliency and large language models for explainable deep learning in medical imaging

The opaque nature of deep learning models remains a significant barrier to their clinical adoption in medical imaging. This paper presents a multimodal explainability framework that bridges the gap between convolutional neural network (CNN) predictions and clinically actionable insights for brain tumor classification, leveraging large language models (LLMs) to deliver human-interpretable diagnostic narratives. The proposed framework operates through three coupled stages. First, nine CNN architectures are extended with a dual-output hybrid formulation that simultaneously optimises a classification head and a segmentation head, enabling spatially richer feature learning. Second, visual saliency attribution methods, namely Grad-CAM, Grad-CAM++, and ScoreCAM, are applied to generate class-discriminative heatmaps, which are subsequently refined into binary tumor masks via an adaptive percentile thresholding pipeline. Third, the resulting masks are mapped onto the Harvard-Oxford cortical atlas to translate pixel-level evidence into named neuroanatomical structures, and the extracted findings are encoded into a structured JSON file that conditions three LLMs (Grok3, Mistral, and LLaMA) to generate coherent, radiological-style diagnostic reports. Evaluated on a dataset of 4,834 contrast-enhanced T1-weighted brain MRI images spanning three tumor classes, InceptionResNetV2 achieved the highest classification performance and Grad-CAM++ yielded the best segmentation overlap. Among the language models, Grok3 led in lexical diversity and coherence, while LLaMA achieved the highest readability score. By integrating visual, anatomical, and linguistic modalities into a unified pipeline, the framework produces explanations that are technically grounded and meaningfully interpretable, advancing the transparency and clinical accountability of artificial intelligence assisted brain tumor diagnosis.
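
The abstract mentions refining saliency heatmaps into binary masks via percentile thresholding but gives no details of the pipeline. The sketch below shows one plausible reading under stated assumptions: the threshold is the nearest-rank percentile of the heatmap's own values, and pixels at or above it are kept. The function name and the default percentile are hypothetical.

```python
import math


def percentile_threshold_mask(heatmap, percentile=90.0):
    """Binarise a 2D heatmap at a percentile of its own values.

    Uses the nearest-rank percentile; this is an illustrative choice,
    not the method described in the paper.
    """
    values = sorted(v for row in heatmap for v in row)
    # nearest-rank: smallest value with at least `percentile` % of data at or below it
    rank = max(1, math.ceil(percentile / 100.0 * len(values)))
    threshold = values[rank - 1]
    mask = [[1 if v >= threshold else 0 for v in row] for row in heatmap]
    return mask, threshold


# Toy 3x3 saliency map (made-up values)
heatmap = [
    [0.1, 0.2, 0.1],
    [0.2, 0.9, 0.8],
    [0.1, 0.7, 0.3],
]

mask, threshold = percentile_threshold_mask(heatmap, percentile=70.0)
print(threshold)  # 0.7
print(mask)       # [[0, 0, 0], [0, 1, 1], [0, 1, 0]]
```

Because the threshold adapts to each heatmap's value distribution, the same percentile keeps a comparable fraction of pixels across images with different saliency scales.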

preprint, 2026, arXiv

Unsupervised Electrofacies Classification and Porosity Characterization in the Offshore Keta Basin Using Wireline Logs

This study presents an unsupervised machine learning workflow for electrofacies analysis in the offshore Keta Basin, Ghana, where core data are scarce. Six standard wireline logs from Well C were analysed over a depth interval comprising approximately $11{,}195$ samples. K-means clustering was applied in multivariate log space, with the clustering structure evaluated using inertia and silhouette diagnostics. Four clusters were identified, supported by an average silhouette coefficient of approximately $0.50$, indicating moderate but meaningful separation. The resulting electrofacies exhibit systematic, depth-continuous patterns associated with variations in clay content, porosity, and rock framework properties, forming a geological continuum from shale-dominated to cleaner sandstone-dominated units. The results demonstrate that log-only, unsupervised clustering supported by quantitative metrics provides a robust and reproducible framework for subsurface characterisation. The proposed workflow offers a practical tool for early-stage formation evaluation in frontier offshore basins and a foundation for future integrated studies.
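
For readers unfamiliar with the silhouette diagnostic mentioned above, here is a small self-contained sketch of the standard mean silhouette coefficient with Euclidean distance. The toy points and labels are made up and unrelated to the Well C logs; the singleton-cluster convention (score 0) is a common choice, noted here as an assumption.

```python
import math


def mean_silhouette(points, labels):
    """Mean silhouette coefficient for labelled points (Euclidean distance)."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # assumed convention for singleton clusters
            continue
        # a: mean distance to the other members of p's cluster
        # (p's zero distance to itself is in the sum, hence len(own) - 1)
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance from p to any other cluster
        b = min(
            sum(math.dist(p, q) for q in other) / len(other)
            for other_lab, other in clusters.items()
            if other_lab != lab
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


# Two well-separated toy groups
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good = mean_silhouette(points, [0, 0, 1, 1])  # close to 0.9
bad = mean_silhouette(points, [0, 1, 0, 1])   # negative: labels cut across groups
print(round(good, 2), bad < 0)
```

Values near 1 indicate compact, well-separated clusters, values near 0 indicate overlap, and negative values suggest points assigned to the wrong cluster, which is why a value around 0.50 is read as moderate separation.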

preprint, 2022, arXiv

Automatic Speech Recognition And Limited Vocabulary: A Survey

Automatic Speech Recognition (ASR) is an active field of research due to its large number of applications and the proliferation of interfaces or computing devices that can support speech processing. However, the bulk of applications are based on well-resourced languages that overshadow under-resourced ones. Yet, ASR represents an undeniable means to promote such languages, especially when designing human-to-human or human-to-machine systems involving illiterate people. An approach to design an ASR system targeting under-resourced languages is to start with a limited vocabulary. ASR using a limited vocabulary is a subset of the speech recognition problem that focuses on the recognition of a small number of words or sentences. This paper aims to provide a comprehensive view of mechanisms behind ASR systems as well as techniques, tools, projects, recent contributions, and possible future directions in ASR using a limited vocabulary. This work consequently provides a way forward when designing an ASR system using limited vocabulary. Although an emphasis is put on limited vocabulary, most of the tools and techniques reported in this survey can be applied to ASR systems in general.

preprint, 2022, arXiv

Creating awareness about security and safety on highways to mitigate wildlife-vehicle collisions by detecting and recognizing wildlife fences using deep learning and drone technology

In South Africa, it is a common practice for people to leave their vehicles beside the road when traveling long distances for a short comfort break. This practice might increase human encounters with wildlife, threatening their security and safety. Here we intend to create awareness about wildlife fencing, using drone technology and computer vision algorithms to recognize and detect wildlife fences and associated features. We collected data at Amakhala and Lalibela private game reserves in the Eastern Cape, South Africa. We used wildlife electric fence data containing single and double fences for the classification task. Additionally, we used aerial and still annotated images extracted from the drone and still cameras for the segmentation and detection tasks. The model training results from the drone camera outperformed those from the still camera. Generally, poor model performance is attributed to (1) over-decompression of images and (2) the ability of drone cameras to capture more details on images for the machine learning model to learn as compared to still cameras that capture only the front view of the wildlife fence. We argue that our model can be deployed on client-edge devices to inform people about the presence and significance of wildlife fencing, which minimizes human encounters with wildlife, thereby mitigating wildlife-vehicle collisions.

preprint, 2022, arXiv

Predicting Fuel Consumption in Power Generation Plants using Machine Learning and Neural Networks

The instability of power generation from national grids has led industries (e.g., telecommunication) to rely on plant generators to run their businesses. However, these secondary generators create additional challenges such as fuel leakages in and out of the system and perturbations in the fuel level gauges. Consequently, telecommunication operators face a constant need for fuel to supply diesel generators. With the increase in fuel prices due to socio-economic factors, excessive fuel consumption and fuel pilferage become a problem, and this affects the smooth running of the network companies. In this work, we compared four machine learning algorithms (i.e., Gradient Boosting, Random Forest, Neural Network, and Lasso) to predict the amount of fuel consumed by a power generation plant. After evaluating the predictive accuracy of these models, the Gradient Boosting model outperformed the other three regressor models with the highest Nash efficiency value of 99.1%.
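
The abstract reports a "Nash efficiency" without defining it. Assuming this refers to the Nash-Sutcliffe efficiency, NSE = 1 - sum((obs - pred)^2) / sum((obs - mean(obs))^2), a minimal sketch of the metric follows. The function name and the toy values are hypothetical and are not the paper's data.

```python
def nash_sutcliffe_efficiency(observed, predicted):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the mean predictor."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    if variance == 0:
        raise ValueError("observed values are constant; NSE is undefined")
    return 1.0 - residual / variance


# Toy fuel-consumption readings (made-up numbers)
observed = [10.0, 12.0, 14.0, 16.0]

perfect = nash_sutcliffe_efficiency(observed, observed)
baseline = nash_sutcliffe_efficiency(observed, [13.0] * 4)  # always predict the mean
print(perfect, baseline)  # 1.0 0.0
```

Under this reading, a value of 99.1% would mean the model's squared errors are under 1% of the variance of the observed fuel consumption.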

preprint, 2022, arXiv

The extended HI halo of NGC 4945 as seen by MeerKAT

Observations of the neutral atomic hydrogen (HI) in the nuclear starburst galaxy NGC 4945 with MeerKAT are presented. We find a large amount of halo gas, previously missed by HI observations, accounting for 6.8% of the total HI mass. This is most likely gas blown into the halo by star formation. Our maps go down to a $3\sigma$ column density level of $5\times10^{18}$ cm$^{-2}$. We model the HI distribution using tilted-ring fitting techniques and find a warp on the galaxy's approaching and receding sides. The HI in the northern side of the galaxy appears to be suppressed. This may be the result of ionisation by the starburst activity in the galaxy, as suggested by a previous study. The origin of the warp is unclear but could be due to past interactions or ram pressure stripping. Broad, asymmetric HI absorption lines extending beyond the HI emission velocity channels are present towards the nuclear region of NGC 4945. Such broad lines suggest the existence of a nuclear ring moving at a high circular velocity. This is supported by the clear rotation patterns in the HI absorption velocity field. The asymmetry of the absorption spectra can be caused by outflows or inflows of gas in the nuclear region of NGC 4945. The continuum map shows small extensions on both sides of the galaxy's major axis that might be signs of outflows resulting from the starburst activity.

preprint, 2021, arXiv

Xova: Baseline-Dependent Time and Channel Averaging for Radio Interferometry

Xova is a software package that implements baseline-dependent time and channel averaging on Measurement Set data. The uv-samples along a baseline track are aggregated into a bin until a specified decorrelation tolerance is exceeded. The degree of decorrelation in the bin correspondingly determines the amount of channel and timeslot averaging that is suitable for samples in the bin. This necessarily implies that the number of channels and timeslots varies per bin and that the output data loses the rectilinear shape of the input data.
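
The greedy binning idea described above can be illustrated with a toy sketch. This is not Xova's API or its decorrelation model: `greedy_bins` and `span_proxy` are hypothetical names, and the uv distance spanned by a bin is used only as a stand-in for a real decorrelation estimate. The sketch also closes a bin just before the sample that would push it over the tolerance, which is one possible reading of the description.

```python
import math


def greedy_bins(uv_track, tolerance, decorrelation):
    """Group consecutive uv-samples into bins whose decorrelation stays within tolerance."""
    bins, current = [], []
    for sample in uv_track:
        candidate = current + [sample]
        if current and decorrelation(candidate) > tolerance:
            bins.append(current)  # adding this sample would exceed the tolerance
            current = [sample]
        else:
            current = candidate
    if current:
        bins.append(current)
    return bins


def span_proxy(samples):
    """Hypothetical stand-in: uv distance between a bin's first and last sample."""
    return math.dist(samples[0], samples[-1])


# Toy tracks: a long baseline sweeps through the uv-plane faster than a short one
short_track = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
long_track = [(0.0, 0.0), (10.0, 0.0), (20.0, 0.0), (30.0, 0.0)]

short_sizes = [len(b) for b in greedy_bins(short_track, 5.0, span_proxy)]
long_sizes = [len(b) for b in greedy_bins(long_track, 5.0, span_proxy)]
print(short_sizes, long_sizes)  # [4] [1, 1, 1, 1]
```

The differing bin sizes show why the averaged output no longer fits a regular time-by-channel grid: each baseline ends up with its own number of bins.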

preprint, 2020, arXiv

Filling the uv-gaps of the current VLBI network in Africa

On the African continent, South Africa has world-class astronomical facilities for advanced radio astronomy research. With the advent of the Square Kilometre Array project in South Africa (SA SKA), six countries in Africa (SA SKA partner countries) have joined South Africa to contribute towards the African Very Long Baseline Interferometry (VLBI) Network (AVN). Each of the AVN countries will soon have a single-dish radio telescope that will be part of the AVN, the European VLBI Network, and the global VLBI network. The SKA and the AVN will enable very high sensitivity VLBI in the southern hemisphere. In the current AVN network, there is a gap in coverage in the central African region. This work analyses the scientific impact if new antennas were to be built or old telecommunication facilities were to be converted to radio telescopes in each of the six countries in central Africa, i.e., Cameroon, Gabon, Congo, Equatorial Guinea, Chad, and the Central African Republic. The work also discusses some of the economic and skills-transfer impacts of having a radio interferometer in this area of Africa.