Researcher profile

George Boateng

George Boateng contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 21 - EmergingVerification L1Unclaimed author
7works
0followers
3topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

7 published item(s)

preprint2026arXiv

Adesua: Development and Feasibility Study of an AI WhatsApp Bot for Science Learning in West Africa

Sub-Saharan Africa faces persistently high student-teacher ratios and shortages of qualified teachers, limiting students' access to personalized learning support and formative assessment. To address this challenge, we present Adesua, a WhatsApp-based AI Teaching Assistant for science education that extends the Kwame for Science platform. Adesua leverages WhatsApp's widespread adoption in Africa to provide accessible, curriculum-aligned learning support for Junior High School (JHS) and Senior High School (SHS) students across West Africa. The system integrates curated textbooks and 33 years of national examination questions with generative AI to enable conversational question answering and automated assessment with feedback via a WhatsApp bot. Students can ask science questions, take timed or untimed multiple-choice tests by topic or exam year, and receive instant grading and detailed explanations of correct and incorrect responses. A 6-month feasibility deployment in 2025 had 56 active users in Ghana, including students and parents. Quantitative evaluation showed a high perceived usefulness, with a helpfulness score of 93.75\% for AI-generated answers, albeit with a small number of ratings (n=16). These preliminary results provide a basis for more extensive future evaluation of a WhatsApp-based AI assistant to assess its potential to offer scalable, low-cost personalized learning support and formative assessment in resource-constrained educational contexts.

preprint2026arXiv

Eskwai for Students: Generative AI Assistant for Legal Education in Ghana

Recent advances in generative AI have shown their potential to be leveraged for legal education. Yet, work on the development and deployment of such systems for legal education in the Global South is limited. In this work, we developed Eskwai for Students, a generative AI assistant to help law students with their legal education. Eskwai for Students is a retrieval augmented generation (RAG) system that provides answers to a wide range of legal questions for law students grounded in a curated database of over 12K case laws and 1.4K legislation in Ghana. We deployed Eskwai for Students in a longitudinal study of 30 months (2.5 years) used by 3.1K law students in Ghana who made 32K queries. We evaluated the helpfulness of our AI, and provided insight into the kinds of queries law students submit to this generative AI tool, which raises some ethical concerns. This work contributes to an understanding of how law students in the Global South are using generative AI for their studies and the ways it could be leveraged responsibly to advance legal education.

preprint2026arXiv

NSMQ Riddles: A Benchmark of Scientific and Mathematical Riddles for Quizzing Large Language Models

Large Language Models (LLMs) have shown good performance on various science educational benchmarks, demonstrating their potential for use in science and mathematics education. Yet, LLMs tend to be evaluated on science and mathematical educational datasets from the Western world, with an underrepresentation of datasets from the Global South. Furthermore, they tend to have multiple-choice answer options that are trivial to evaluate. In this work, we present NSMQ Riddles, a novel benchmark of Scientific and Mathematical Riddles from Ghana's National Science and Maths Quiz (NSMQ) competition to evaluate LLMs. The NSMQ is an annual live TV competition for senior secondary school students in Ghana that brings together the smartest high school students in Ghana who compete in teams of 2 by answering questions in biology, chemistry, physics, and math over five rounds and five stages until a winning team is crowned for that year. NSMQ Riddles consists of 11 years of riddle questions (n=1.8K) from the 5th round, with each riddle containing a minimum of 3 clues. Students compete to be the first to guess the answer on any of the clues, with earlier clues being vague and also fetching more points. The answers are usually a number, word, or short phrase, allowing for automatic evaluation. We evaluated state-of-the-art models: closed (GPT-5.4, Gemini 3.1 Pro, Claude Opus 4.6) and open models (Kimi-K2.5, DeepSeek-V3.1, GPT-OSS-120B) with high and low reasoning settings. Our evaluation shows that the dataset is challenging even for state-of-the-art LLMs, which performed worse than the best student contestants. This work contributes a novel and challenging benchmark for scientific and mathematical reasoning from the Global South towards enabling a true global benchmarking of LLMs' capabilities for science and mathematics education.

preprint2022arXiv

"Are you okay, honey?": Recognizing Emotions among Couples Managing Diabetes in Daily Life using Multimodal Real-World Smartwatch Data

Couples generally manage chronic diseases together and the management takes an emotional toll on both patients and their romantic partners. Consequently, recognizing the emotions of each partner in daily life could provide an insight into their emotional well-being in chronic disease management. Currently, the process of assessing each partner's emotions is manual, time-intensive, and costly. Despite the existence of works on emotion recognition among couples, none of these works have used data collected from couples' interactions in daily life. In this work, we collected 85 hours (1,021 5-minute samples) of real-world multimodal smartwatch sensor data (speech, heart rate, accelerometer, and gyroscope) and self-reported emotion data (n=612) from 26 partners (13 couples) managing diabetes mellitus type 2 in daily life. We extracted physiological, movement, acoustic, and linguistic features, and trained machine learning models (support vector machine and random forest) to recognize each partner's self-reported emotions (valence and arousal). Our results from the best models (balanced accuracies of 63.8% and 78.1% for arousal and valence respectively) are better than chance and our prior work that also used data from German-speaking, Swiss-based couples, albeit, in the lab. This work contributes toward building automated emotion recognition systems that would eventually enable partners to monitor their emotions in daily life and enable the delivery of interventions to improve their emotional well-being.

preprint2022arXiv

Development, Deployment, and Evaluation of DyMand -- An Open-Source Smartwatch and Smartphone System for Capturing Couples' Dyadic Interactions in Chronic Disease Management in Daily Life

Dyadic interactions of couples are of interest as they provide insight into relationship quality and chronic disease management. Currently, ambulatory assessment of couples' interactions entails collecting data at random or scheduled times which could miss significant couples' interaction/conversation moments. In this work, we developed, deployed and evaluated DyMand, a novel open-source smartwatch and smartphone system for collecting self-report and sensor data from couples based on partners' interaction moments. Our smartwatch-based algorithm uses the Bluetooth signal strength between two smartwatches each worn by one partner, and a voice activity detection machine-learning algorithm to infer that the partners are interacting, and then to trigger data collection. We deployed the DyMand system in a 7-day field study and collected data about social support, emotional well-being, and health behavior from 13 (N=26) Swiss-based heterosexual couples managing diabetes mellitus type 2 of one partner. Our system triggered 99.1% of the expected number of sensor and self-report data when the app was running, and 77.6% of algorithm-triggered recordings contained partners' conversation moments compared to 43.8% for scheduled triggers. The usability evaluation showed that DyMand was easy to use. DyMand can be used by social, clinical, or health psychology researchers to understand the social dynamics of couples in everyday life, and for developing and delivering behavioral interventions for couples who are managing chronic diseases.

preprint2022arXiv

Emotion Recognition among Couples: A Survey

Couples' relationships affect the physical health and emotional well-being of partners. Automatically recognizing each partner's emotions could give a better understanding of their individual emotional well-being, enable interventions and provide clinical benefits. In the paper, we summarize and synthesize works that have focused on developing and evaluating systems to automatically recognize the emotions of each partner based on couples' interaction or conversation contexts. We identified 28 articles from IEEE, ACM, Web of Science, and Google Scholar that were published between 2010 and 2021. We detail the datasets, features, algorithms, evaluation, and results of each work as well as present main themes. We also discuss current challenges, research gaps and propose future research directions. In summary, most works have used audio data collected from the lab with annotations done by external experts and used supervised machine learning approaches for binary classification of positive and negative affect. Performance results leave room for improvement with significant research gaps such as no recognition using data from daily life. This survey will enable new researchers to get an overview of this field and eventually enable the development of emotion recognition systems to inform interventions to improve the emotional well-being of couples.

preprint2022arXiv

Kwame for Science: An AI Teaching Assistant Based on Sentence-BERT for Science Education in West Africa

Africa has a high student-to-teacher ratio which limits students' access to teachers. Consequently, students struggle to get answers to their questions. In this work, we extended Kwame, our previous AI teaching assistant, adapted it for science education, and deployed it as a web app. Kwame for Science answers questions of students based on the Integrated Science subject of the West African Senior Secondary Certificate Examination (WASSCE). Kwame for Science is a Sentence-BERT-based question-answering web app that displays 3 paragraphs as answers along with a confidence score in response to science questions. Additionally, it displays the top 5 related past exam questions and their answers in addition to the 3 paragraphs. Our preliminary evaluation of the Kwame for Science with a 2.5-week real-world deployment showed a top 3 accuracy of 87.5% (n=56) with 190 users across 11 countries. Kwame for Science will enable the delivery of scalable, cost-effective, and quality remote education to millions of people across Africa.