Researcher profile

Xiaojuan Ma

Xiaojuan Ma contributes to research discovery and scholarly infrastructure.

Trust snapshot

Quick read

Trust 21 (Emerging), Verification L1, Unclaimed author
18 works
0 followers
10 topics
4 close collaborators

Published work

18 published items

Preprint, 2026, arXiv

Legitimizing, Developing, and Sustaining Feminist HCI in East Asia: Challenges and Opportunities

Feminist HCI has been rapidly developing in East Asian contexts in recent years. The region's unique cultural and political backgrounds have contributed valuable, situated knowledge, revealing topics such as localized digital feminism practices, or women's complex navigation among social expectations. However, the very factors that ground these perspectives also create significant survival challenges for researchers in East Asia. These include a scarcity of dedicated funding, the stigma of being perceived as less valuable than productivity-oriented technologies, and the lack of senior researchers and established, resilient communities. Grounded in these challenges and our prior collective practices, we propose this meet-up with two focused goals: (1) to provide a legitimized channel for Feminist HCI researchers to connect and build community, and (2) to facilitate an action-oriented dialogue on how to legitimize, develop, and sustain Feminist HCI in the East Asian context. The website for this meet-up is: https://feminist-hci.github.io/

Preprint, 2026, arXiv

Mixed Reality Scenic Live Streaming for Cultural Heritage: Visual Interactions in a Historic Landscape

Scenic Live Streams (SLS), capturing real-world scenic sites from fixed cameras without streamers, have gained increasing popularity recently. They afford unique real-time lenses into remote sites for viewers' synchronous and collective engagement. Noting the lack of dynamism and interactivity in SLS, we aim to maximize its potential by making it interactive. With our approach, named MRSLS, we overlaid plain SLS with interactive Mixed Reality content that matches the site's geographical structures and local cultural backgrounds. We further highlight the substantial benefit of MRSLS to cultural heritage site interactions, and we demonstrate this design proposal with an MRSLS prototype at a UNESCO-listed heritage site in China. The design process includes an interview (N=6) to pinpoint local scenery and culture, as well as two iterative design studies (N=15, 14). A mixed-methods, between-subjects study (N=43, 37) shows that MRSLS affords immersive scenery appreciation, effective cultural imprints, and vivid shared experience. With its balance between cultural, participatory, and authentic attributes, we call for more HCI attention to (MR)SLS as an under-explored design space.

Preprint, 2023, arXiv

Who Should I Trust: AI or Myself? Leveraging Human and AI Correctness Likelihood to Promote Appropriate Trust in AI-Assisted Decision-Making

In AI-assisted decision-making, it is critical for human decision-makers to know when to trust AI and when to trust themselves. However, prior studies calibrated human trust only based on AI confidence indicating AI's correctness likelihood (CL) but ignored humans' CL, hindering optimal team decision-making. To mitigate this gap, we proposed to promote humans' appropriate trust based on the CL of both sides at a task-instance level. We first modeled humans' CL by approximating their decision-making models and computing their potential performance in similar instances. We demonstrated the feasibility and effectiveness of our model via two preliminary studies. Then, we proposed three CL exploitation strategies to calibrate users' trust explicitly/implicitly in the AI-assisted decision-making process. Results from a between-subjects experiment (N=293) showed that our CL exploitation strategies promoted more appropriate human trust in AI, compared with only using AI confidence. We further provided practical implications for more human-compatible AI-assisted decision-making.

Preprint, 2022, arXiv

ASCEND: A Spontaneous Chinese-English Dataset for Code-switching in Multi-turn Conversation

Code-switching is a speech phenomenon occurring when a speaker switches language during a conversation. Despite the spontaneous nature of code-switching in conversational spoken language, most existing works collect code-switching data from read speech instead of spontaneous speech. ASCEND (A Spontaneous Chinese-English Dataset) is a high-quality Mandarin Chinese-English code-switching corpus built on spontaneous multi-turn conversational dialogue sources collected in Hong Kong. We report ASCEND's design and procedure for collecting the speech data, including annotations. ASCEND consists of 10.62 hours of clean speech, collected from 23 bilingual speakers of Chinese and English. Furthermore, we conduct baseline experiments using pre-trained wav2vec 2.0 models, achieving a best performance of 22.69% character error rate and 27.05% mixed error rate.
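The character error rate quoted in this abstract is conventionally defined as the character-level edit distance between a reference transcript and the recognizer's output, divided by the reference length. The sketch below illustrates that standard definition only; it is not the ASCEND authors' scoring script. The function names and the choice to strip spaces before scoring are assumptions made for illustration.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (unit cost per edit)."""
    prev = list(range(len(hyp) + 1))  # distances from the empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # deleting i reference items to reach an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def character_error_rate(reference, hypothesis):
    """CER = character edit distance / number of reference characters.

    Assumption: spaces are ignored, so Chinese characters and English
    letters are each scored as single characters.
    """
    ref = list(reference.replace(" ", ""))
    hyp = list(hypothesis.replace(" ", ""))
    if not ref:
        raise ValueError("reference must contain at least one character")
    return edit_distance(ref, hyp) / len(ref)


if __name__ == "__main__":
    # Hypothetical code-switched utterance with one substituted character.
    print(character_error_rate("我今天去 school", "我明天去 school"))
```

A mixed error rate for code-switched speech is typically computed the same way over mixed tokens (Chinese characters and English words) rather than over characters alone; that variant is not implemented here.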

Preprint, 2022, arXiv

Automatic Speech Recognition Datasets in Cantonese: A Survey and New Dataset

Automatic speech recognition (ASR) on low resource languages improves the access of linguistic minorities to technological advantages provided by artificial intelligence (AI). In this paper, we address the problem of data scarcity for the Hong Kong Cantonese language by creating a new Cantonese dataset. Our dataset, Multi-Domain Cantonese Corpus (MDCC), consists of 73.6 hours of clean read speech paired with transcripts, collected from Cantonese audiobooks from Hong Kong. It comprises philosophy, politics, education, culture, lifestyle and family domains, covering a wide range of topics. We also review all existing Cantonese datasets and analyze them according to their speech type, data source, total size and availability. We further conduct experiments with Fairseq S2T Transformer, a state-of-the-art ASR model, on the biggest existing dataset, Common Voice zh-HK, and our proposed MDCC, and the results show the effectiveness of our dataset. In addition, we create a powerful and robust Cantonese ASR model by applying multi-dataset learning on MDCC and Common Voice zh-HK.

Preprint, 2022, arXiv

CI-AVSR: A Cantonese Audio-Visual Speech Dataset for In-car Command Recognition

With the rise of deep learning and intelligent vehicles, the smart assistant has become an essential in-car component to facilitate driving and provide extra functionalities. In-car smart assistants should be able to process general as well as car-related commands and perform corresponding actions, which eases driving and improves safety. However, there is a data scarcity issue for low-resource languages, hindering the development of research and applications. In this paper, we introduce a new dataset, Cantonese In-car Audio-Visual Speech Recognition (CI-AVSR), for in-car command recognition in the Cantonese language with both video and audio data. It consists of 4,984 samples (8.3 hours) of 200 in-car commands recorded by 30 native Cantonese speakers. Furthermore, we augment our dataset using common in-car background noises to simulate real environments, producing a dataset 10 times larger than the collected one. We provide detailed statistics of both the clean and the augmented versions of our dataset. Moreover, we implement two multimodal baselines to demonstrate the validity of CI-AVSR. Experiment results show that leveraging the visual signal improves the overall performance of the model. Although our best model can achieve considerable quality on the clean test set, the speech recognition quality on the noisy data is still inferior and remains an extremely challenging task for real in-car speech recognition systems. The dataset and code will be released at https://github.com/HLTCHKUST/CI-AVSR.

Preprint, 2022, arXiv

Evaluating the Effect of Enhanced Text-Visualization Integration on Combating Misinformation in Data Story

Misinformation has disruptive effects on our lives. Many researchers have looked into means to identify and combat misinformation in text or data visualization. However, there is still a lack of understanding of how misinformation can be introduced when text and visualization are combined to tell data stories, not to mention how to improve the lay public's awareness of possible misperceptions about facts in narrative visualization. In this paper, we first analyze where misinformation could possibly be injected into the production-consumption process of data stories through a literature survey. Then, as a first step towards combating misinformation in data stories, we explore possible defensive design methods to enhance the reader's awareness of information misalignment when data facts are scripted and visualized. More specifically, we conduct a between-subjects crowdsourcing study to investigate the impact of two design methods enhancing text-visualization integration, i.e., explanatory annotation and interactive linking, on users' awareness of misinformation in data stories. The study results show that although most participants still cannot find misinformation, the two design methods can significantly lower the perceived credibility of the text or visualizations. Our work informs the possibility of fighting an infodemic through defensive design methods.

Preprint, 2022, arXiv

Fantastic Questions and Where to Find Them: FairytaleQA -- An Authentic Dataset for Narrative Comprehension

Question answering (QA) is a fundamental means to facilitate assessment and training of narrative comprehension skills for both machines and young children, yet there is a scarcity of high-quality QA datasets carefully designed to serve this purpose. In particular, existing datasets rarely distinguish fine-grained reading skills, such as the understanding of varying narrative elements. Drawing on reading education research, we introduce FairytaleQA, a dataset focusing on narrative comprehension of kindergarten to eighth-grade students. Generated by educational experts based on an evidence-based theoretical framework, FairytaleQA consists of 10,580 explicit and implicit questions derived from 278 children-friendly stories, covering seven types of narrative elements or relations. Our dataset is valuable in two ways: First, we ran existing QA models on our dataset and confirmed that this annotation helps assess models' fine-grained learning skills. Second, the dataset supports the question generation (QG) task in the education domain. Through benchmarking with QG models, we show that the QG model trained on FairytaleQA is capable of asking high-quality and more diverse questions.

Preprint, 2022, arXiv

How to Save Lives with Microblogs? Lessons From the Usage of Weibo for Requests for Medical Assistance During COVID-19

During recent crises like COVID-19, microblogging platforms have become popular channels for affected people seeking assistance such as medical supplies and rescue operations from emergency responders and the public. Despite this common practice, the affordances of microblogging services for help-seeking during crises that need immediate attention are not well understood. To fill this gap, we analyzed 8K posts from COVID-19 patients or caregivers requesting urgent medical assistance on Weibo, the largest microblogging site in China. Our mixed-methods analyses suggest that existing microblogging functions need to be improved in multiple aspects to sufficiently facilitate help-seeking in emergencies, including capabilities of search and tracking requests, ease of use, and privacy protection. We also find that people tend to stick to certain well-established functions for publishing requests, even after better alternatives emerge. These findings have implications for designing microblogging tools to better support help requesting and responding during crises.

Preprint, 2022, arXiv

Know it to Defeat it: Exploring Health Rumor Characteristics and Debunking Efforts on Chinese Social Media during COVID-19 Crisis

Health-related rumors spreading online during a public crisis may pose a serious threat to people's well-being. Existing crisis informatics research lacks in-depth insights into the characteristics of health rumors and the efforts to debunk them on social media in a pandemic. To fill this gap, we conduct a comprehensive analysis of four months of rumor-related online discussion during COVID-19 on Weibo, a Chinese microblogging site. Results suggest that the dread (cause fear) type of health rumors provoked significantly more discussions and lasted longer than the wish (raise hope) type. We further explore how four kinds of social media users (i.e., government, media, organization, and individual) combat health rumors, and identify their preferred way of sharing debunking information and the key rhetoric strategies used in the process. We examine the relationship between debunking and rumor discussions using a Granger causality approach, and show the efficacy of debunking in suppressing rumor discussions, which is time-sensitive and varies across rumor types and debunkers. Our results can provide insights into crisis informatics and risk management on social media in pandemic settings.

Preprint, 2022, arXiv

RankAxis: Towards a Systematic Combination of Projection and Ranking in Multi-Attribute Data Exploration

Projection and ranking are frequently used analysis techniques in multi-attribute data exploration. Both families of techniques help analysts with tasks such as identifying similarities between observations and determining ordered subgroups, and have shown good performances in multi-attribute data exploration. However, they often exhibit problems such as distorted projection layouts, obscure semantic interpretations, and non-intuitive effects produced by selecting a subset of (weighted) attributes. Moreover, few studies have attempted to combine projection and ranking into the same exploration space to complement each other's strengths and weaknesses. For this reason, we propose RankAxis, a visual analytics system that systematically combines projection and ranking to facilitate the mutual interpretation of these two techniques and jointly support multi-attribute data exploration. A real-world case study, expert feedback, and a user study demonstrate the efficacy of RankAxis.

Preprint, 2022, arXiv

TalkTive: A Conversational Agent Using Backchannels to Engage Older Adults in Neurocognitive Disorders Screening

Conversational agents (CAs) have great potential to reduce clinicians' burden in screening for neurocognitive disorders among older adults. It is important, therefore, to develop CAs that can be engaging, to elicit conversational speech input from older adult participants for supporting assessment of cognitive abilities. As an initial step, this paper presents research in developing the backchanneling ability in CAs in the form of a verbal response to engage the speaker. We analyzed 246 conversations of cognitive assessments between older adults and human assessors, and derived the categories of reactive backchannels (e.g. "hmm") and proactive backchannels (e.g. "please keep going"). These categories are used in the development of TalkTive, a CA which can predict both timing and form of backchanneling during cognitive assessments. The study then invited 36 older adult participants to evaluate the backchanneling feature. Results show that proactive backchanneling is more appreciated by participants than reactive backchanneling.

Preprint, 2022, arXiv

When Gamification Spoils Your Learning: A Qualitative Case Study of Gamification Misuse in a Language-Learning App

More and more learning apps like Duolingo are using some form of gamification (e.g., badges, points, and leaderboards) to enhance user learning. However, they are not always successful. Gamification misuse is a phenomenon that occurs when users become too fixated on gamification and get distracted from learning. This undesirable phenomenon wastes users' precious time and negatively impacts their learning performance. However, there has been little research in the literature to understand gamification misuse and inform future gamification designs. Therefore, this paper aims to fill this knowledge gap by conducting the first extensive qualitative research on gamification misuse in a popular learning app called Duolingo. Duolingo is currently the world's most downloaded learning app used to learn languages. This study consists of two phases: (I) a content analysis of data from Duolingo forums (from the past nine years) and (II) semi-structured interviews with 15 international Duolingo users. Our research contributes to the Human-Computer Interaction (HCI) and Learning at Scale (L@S) research communities in three ways: (1) elaborating the ramifications of gamification misuse on user learning, well-being, and ethics, (2) identifying the most common reasons for gamification misuse (e.g., competitiveness, overindulgence in playfulness, and herding), and (3) providing designers with practical suggestions to prevent (or mitigate) the occurrence of gamification misuse in their future designs of gamified learning apps.

Preprint, 2021, arXiv

CASS: Towards Building a Social-Support Chatbot for Online Health Community

Chatbot systems, despite their popularity in today's HCI and CSCW research, fall short for one of two reasons: 1) many of the systems use a rule-based dialog flow, thus they can only respond to a limited number of pre-defined inputs with pre-scripted responses; or 2) they are designed with a focus on single-user scenarios, thus it is unclear how these systems may affect other users or the community. In this paper, we develop a generalizable chatbot architecture (CASS) to provide social support for community members in an online health community. The CASS architecture is based on advanced neural network algorithms, thus it can handle new inputs from users and generate a variety of responses to them. CASS is also generalizable as it can be easily migrated to other online communities. With a follow-up field experiment, CASS is proven useful in supporting individual members who seek emotional support. Our work also helps fill the research gap on how a chatbot may influence the whole community's engagement.

Preprint, 2021, arXiv

Characterizing Student Engagement Moods for Dropout Prediction in Question Pool Websites

Problem-Based Learning (PBL) is a popular approach to instruction that supports students to get hands-on training by solving problems. Question Pool websites (QPs) such as LeetCode, Code Chef, and Math Playground help PBL by supplying authentic, diverse, and contextualized questions to students. Nonetheless, empirical findings suggest that 40% to 80% of students registered in QPs drop out in less than two months. This research is the first attempt to understand and predict student dropouts from QPs via exploiting students' engagement moods. Adopting a data-driven approach, we identify five different engagement moods for QP students, namely challenge-seeker, subject-seeker, interest-seeker, joy-seeker, and non-seeker. We find that students have collective preferences for answering questions in each engagement mood, and deviation from those preferences increases their probability of dropping out significantly. Last but not least, this paper contributes by introducing a new hybrid machine learning model (which we call Dropout-Plus) for predicting student dropouts in QPs. The test results on a popular QP in China, with nearly 10K students, show that Dropout-Plus can exceed the rival algorithms' dropout prediction performance in terms of accuracy, F1-measure, and AUC. We wrap up our work by giving some design suggestions to QP managers and online learning professionals to reduce their student dropouts.

Preprint, 2020, arXiv

A Visual Analytics Approach to Scheduling Customized Shuttle Buses via Perceiving Passengers' Travel Demands

Shuttle buses have been a popular means to move commuters sharing similar origins and destinations during periods of high travel demand. However, planning and deploying reasonable, customized service bus systems becomes challenging when the commute demand is rather dynamic. It is difficult, if not impossible, to form a reliable, unbiased estimation of user needs in such a case using traditional modeling methods. We propose a visual analytics approach to facilitating assessment of actual, varying travel demands and planning of night customized shuttle systems. A preliminary case study verifies the efficacy of our approach.

Preprint, 2020, arXiv

Friend Network as Gatekeeper: A Study of WeChat Users' Consumption of Friend-Curated Contents

Social media enables users to publish, disseminate, and access information easily. The downside is that it has fewer gatekeepers of what content is allowed to enter public circulation than the traditional media. In this paper, we present preliminary empirical findings from WeChat, a popular Chinese messaging app, indicating that social media users leverage their friend networks collectively as latent, dynamic gatekeepers for content consumption. Taking a mixed-methods approach, we analyze over seven million users' information consumption behaviors on WeChat and conduct an online survey of 216 users. Both quantitative and qualitative evidence suggests that the friend network indeed acts as a gatekeeper in social media. Whereas traditional gatekeepers decided what should be produced, the friend network helps separate the worthy from the unworthy for individual information consumption, and its structure and dynamics, which play an important role in gatekeeping, may inspire the future design of socio-technical systems.

Preprint, 2020, arXiv

Investigating the Effects of Robot Engagement Communication on Learning from Demonstration

Robot Learning from Demonstration (RLfD) is a technique for robots to derive policies from instructors' examples. Although the reciprocal effects of student engagement on teacher behavior are widely recognized in the educational community, it is unclear whether the same phenomenon holds true for RLfD. To fill this gap, we first design three types of robot engagement behavior (attention, imitation, and a hybrid of the two) based on the learning literature. We then conduct, in a simulation environment, a within-subject user study to investigate the impact of different robot engagement cues on humans compared to a "without-engagement" condition. Results suggest that engagement communication significantly changes the human's estimation of the robots' capability and significantly raises their expectation towards the learning outcomes, even though we do not run actual learning algorithms in the experiments. Moreover, imitation behavior affects humans more than attention does in all metrics, while their combination has the most profound influences on humans. We also find that communicating engagement via imitation or the combined behavior significantly improves humans' perception of the quality of demonstrations, even if all demonstrations are of the same quality.