Source author record

Juho Kim

Juho Kim appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Human-Computer Interaction · Artificial Intelligence · Computer Science and Game Theory · cs.CY · Cryptography and Security · Social and Information Networks

Catalog footprint

What is connected

11 works

6 topics

4 close collaborators


preprint · 2026 · arXiv

Domain-Independent Game Abstraction using Word Embedding Techniques

Many games of interest in the real world are often intractably large, thereby necessitating the use of game abstraction to shrink them in size, typically by many magnitudes. Over the last two decades, there have been significant advances in game abstraction; however, the domain-specific nature (usually poker) of much of the prior work prevents those techniques from being easily generalized to other settings without extensively analyzing the game at hand. In this paper, we propose a domain-independent approach to game abstraction, which applies word embedding techniques from the field of natural language processing. Treating each action as a word and gameplay data as a corpus, word vectors can be trained to represent each action as a real-valued vector, which can then be clustered to facilitate game abstraction. We also explore the use of foundational embedding models and show that action embeddings obtained this way can capture a surprising amount of information about the underlying game. Experimental results demonstrate that our proposed game abstraction technique is effective, although it does not outperform specialized algorithms tailored to specific games.

preprint · 2026 · arXiv

Heuristic Pathologies and Further Variance Reduction via Uncertainty Propagation in the AIVAT Family of Techniques

How should an agent's performance in a multiagent environment be evaluated when there is a limited sample size or a high cost of running a trial? The AIVAT family of variance reduction techniques was proposed to address this challenge by introducing unbiased low-variance estimators of agents' expected payoffs. An important component of AIVAT is a heuristic value function that discriminates between potentially low- and high-value counterfactual histories. A notable gap in the literature is that there is little to no constraint or guideline on how the heuristic value function should be chosen or how uncertainty in its output should be handled. In our first contribution, we parameterize the heuristic value function to highlight AIVAT's potential vulnerabilities: a) the sample variance can be set pathologically low by directly applying gradient descent on the sample variance, and b) one can p-hack to draw a desired statistical conclusion via gradient descent/ascent on the test statistic. The main takeaway is that the heuristic value function should be fixed prior to observing the evaluation data! In our second contribution, we show how the heuristic uncertainty can be propagated to quantify the uncertainty of AIVAT estimates. It is then possible to further reduce the variance using inverse-variance weighted averaging, but AIVAT's unbiasedness guarantee may have to be sacrificed. In our experiments, we use a dataset of 10,000 poker hands to demonstrate our heuristic pathology and uncertainty results, with the latter yielding a 43.0% reduction in the number of samples (poker hands) needed to draw statistical conclusions.
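The inverse-variance weighted averaging mentioned in this abstract can be illustrated with a minimal sketch. The function name and the sample numbers are hypothetical and are not taken from the paper; the code shows only the generic weighting step for independent estimates, not AIVAT itself.

```python
def inverse_variance_mean(estimates, variances):
    """Combine independent estimates, weighting each by 1 / variance."""
    if len(estimates) != len(variances) or not estimates:
        raise ValueError("need equally sized, non-empty inputs")
    if any(v <= 0 for v in variances):
        raise ValueError("variances must be positive")
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    mean = sum(w * x for w, x in zip(weights, estimates)) / total
    # Variance of the combined estimate, assuming the inputs are independent.
    return mean, 1.0 / total


# Hypothetical payoff estimates with their (assumed known) variances.
mean, var = inverse_variance_mean([1.0, 3.0], [1.0, 4.0])
```

Under the independence assumption, the combined variance is never larger than the smallest input variance, which is why less certain estimates are down-weighted rather than discarded.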

preprint · 2026 · arXiv

Parallelizing Counterfactual Regret Minimization

Parallelization has played an instrumental role in the field of artificial intelligence (AI), drastically reducing the time taken to train and evaluate large AI models. In contrast to its impact in the broader field of AI, applying parallelization to computational game solving is relatively unexplored, despite its great potential. In this paper, we parallelize the family of counterfactual regret minimization (CFR) algorithms, which were central to important breakthroughs for solving large imperfect-information games. We present a generalized parallelization framework, reframing CFR as a series of linear algebra operations. Then, existing techniques for parallelizing linear algebra operations can be applied to accelerate CFR. We also describe how our technique can be applied to other tabular members of the CFR family of algorithms, including the state-of-the-art, such as CFR+, discounted CFR, and predictive variants of CFR. Experimentally, we show that our CFR implementation on a GPU is up to four orders of magnitude faster than Google DeepMind OpenSpiel's CFR implementations on a CPU.

preprint · 2026 · arXiv

Watermarking Game-Playing Agents in Perfect-Information Extensive-Form Games

Watermarking techniques for large language models (LLMs), which encode hidden information in the output so its source can be verified, have gained significant attention in recent days, thanks to their potential capability to detect accidental or deliberate misuse. Similar challenges involving model misuse also exist in the context of game-playing, such as when detecting the unauthorized use of AI tools in gaming platforms (e.g., cheating in online chess). In this paper, we initiate the study of how game-playing strategies can be watermarked. We show how the KGW watermark for LLMs can be adapted to watermark game-playing agents in perfect-information extensive-form games. The watermark can then be detected using a statistical test. We show that the degradation in the quality of the watermarked strategy profile, quantified by the expected utility, can be bounded, but there is a tradeoff between detectability and quality. In our experiments, we bootstrap the watermarking framework to various chess engines and demonstrate that a) the impact of the watermark on the quality of the strategy is negligible and b) the watermark can be detected with just a handful of games.
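The statistical test mentioned in this abstract is not spelled out here. As a rough sketch, the one-proportion z-test used by the original KGW watermark for LLMs looks as follows. Treating moves as the counted unit, the function name, and the sample numbers are all assumptions made for illustration and may differ from the paper's actual test.

```python
import math


def green_count_z_score(green_moves, total_moves, gamma=0.5):
    """z-score for observing `green_moves` green-listed choices out of
    `total_moves`, when an unwatermarked player would pick a green
    choice with probability `gamma` (the null hypothesis)."""
    if total_moves <= 0:
        raise ValueError("total_moves must be positive")
    if not 0.0 < gamma < 1.0:
        raise ValueError("gamma must lie strictly between 0 and 1")
    expected = gamma * total_moves
    std = math.sqrt(total_moves * gamma * (1.0 - gamma))
    return (green_moves - expected) / std


# Hypothetical observation: 70 of 100 moves fall in the green list.
z = green_count_z_score(70, 100, gamma=0.5)
```

A large positive z-score would be evidence against the null hypothesis of unwatermarked play; the threshold to use is a design choice that trades false positives against detection power.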

preprint · 2022 · arXiv

RLens: A Computer-aided Visualization System for Supporting Reflection on Language Learning under Distributed Tutorship

With the rise of the gig economy, online language tutoring platforms are becoming increasingly popular. These platforms provide temporary and flexible jobs for native speakers as tutors and allow language learners to have one-on-one speaking practices on demand, on which learners occasionally practice the language with different tutors. With such distributed tutorship, learners can hold flexible schedules and receive diverse feedback. However, learners face challenges in consistently tracking their learning progress because different tutors provide feedback from diverse standards and perspectives, and hardly refer to learners' previous experiences with other tutors. We present RLens, a visualization system for facilitating learners' learning progress reflection by grouping different tutors' feedback, tracking how each feedback type has been addressed across learning sessions, and visualizing the learning progress. We validate our design through a between-subjects study with 40 real-world learners. Results show that learners can successfully analyze their progress and common language issues under distributed tutorship with RLens, while most learners using the baseline interface had difficulty achieving reflection tasks. We further discuss design considerations of computer-aided systems for supporting learning under distributed tutorship.

preprint · 2022 · arXiv

Understanding Distributed Tutorship in Online Language Tutoring

With the rise of the gig economy, online language tutoring platforms are becoming increasingly popular. They provide temporary and flexible jobs for native speakers as tutors and allow language learners to have one-on-one speaking practices on demand. However, the lack of stable relationships hinders tutors and learners from building long-term trust. "Distributed tutorship" -- temporally discontinuous learning experience with different tutors -- has been underexplored yet has many implications for modern learning platforms. In this paper, we analyzed tutorship sequences of 15,959 learners and found that around 40% of learners change to new tutors every session; 44% of learners change to new tutors while sometimes reverting to previous tutors; only 16% of learners change to new tutors and then fix on one tutor. We also found suggestive evidence that higher distributedness -- higher diversity and lower continuity in tutorship -- is correlated with slower improvements in speaking performance scores with a similar number of sessions. We further surveyed 519 and interviewed 40 learners and found that more learners preferred fixed tutorship while some do not have it due to various reasons. Finally, we conducted semi-structured interviews with three tutors and one product manager to discuss the implications for improving the continuity in learning under distributed tutorship.

preprint · 2015 · arXiv

Mudslide: A Spatially Anchored Census of Student Confusion for Online Lecture Videos

Educators have developed an effective technique to get feedback after in-person lectures, called "muddy card." Students are given time to reflect and write the "muddiest" (least clear) point on an index card, to hand in as they leave class. This practice of assigning end-of-lecture reflection tasks to generate explicit student feedback is well suited for adaptation to the challenge of supporting feedback in online video lectures. We describe the design and evaluation of Mudslide, a prototype system that translates the practice of muddy cards into the realm of online lecture videos. Based on an in-lab study of students and teachers, we find that spatially contextualizing students' muddy point feedback with respect to particular lecture slides is advantageous to both students and teachers. We also reflect on further opportunities for enhancing this feedback method based on teachers' and students' experiences with our prototype.

preprint · 2015 · arXiv

RIMES: Embedding Interactive Multimedia Exercises in Lecture Videos

Teachers in conventional classrooms often ask learners to express themselves and show their thought processes by speaking out loud, drawing on a whiteboard, or even using physical objects. Despite the pedagogical value of such activities, interactive exercises available in most online learning platforms are constrained to multiple-choice and short answer questions. We introduce RIMES, a system for easily authoring, recording, and reviewing interactive multimedia exercises embedded in lecture videos. With RIMES, teachers can prompt learners to record their responses to an activity using video, audio, and inking while watching lecture videos. Teachers can then review and interact with all the learners' responses in an aggregated gallery. We evaluated RIMES with 19 teachers and 25 students. Teachers created a diverse set of activities across multiple subjects that tested deep conceptual and procedural knowledge. Teachers found the exercises useful for capturing students' thought processes, identifying misconceptions, and engaging students with content.

preprint · 2015 · arXiv

Supporting Instructors in Collaborating with Researchers using MOOClets

Most education and workplace learning takes place in classroom contexts far removed from laboratories or field sites with special arrangements for scientific research. But digital online resources provide a novel opportunity for large scale efforts to bridge the real world and laboratory settings which support data collection and randomized A/B experiments comparing different versions of content or interactions [2]. However, there are substantial technological and practical barriers in aligning instructors and researchers to use learning technologies like blended lessons/exercises & MOOCs as both a service for students and a realistic context to conduct research. This paper explains how the concept of a MOOClet can facilitate research-practitioner collaborations. MOOClets [3] are defined as modular components of a digital resource that can be implemented in technology to: (1) allow modification to create multiple versions, (2) allow experimental comparison and personalization of different versions, (3) reliably specify what data are collected. We suggest a framework in which instructors specify what kinds of changes to lessons, exercises, and emails they would be willing to adopt, and what data they will collect and make available. Researchers can then: (1) specify or design experiments that compare the effects of different versions on quantifiable outcomes. (2) Explore algorithms for maximizing particular outcomes by choosing alternative versions of a MOOClet based on the input variables available. We present a prototype survey tool for instructors intended to facilitate practitioner researcher matches and successful collaborations.

preprint · 2015 · arXiv

Using and Designing Platforms for In Vivo Education Experiments

In contrast to typical laboratory experiments, the everyday use of online educational resources by large populations and the prevalence of software infrastructure for A/B testing lead us to consider how platforms can embed in vivo experiments that do not merely support research, but ensure practical improvements to their educational components. Examples are presented of randomized experimental comparisons conducted by subsets of the authors in three widely used online educational platforms: Khan Academy, edX, and ASSISTments. We suggest design principles for platform technology to support randomized experiments that lead to practical improvements enabling Iterative Improvement and Collaborative Work and explain the benefit of their implementation by WPI co-authors in the ASSISTments platform.

preprint · 2014 · arXiv

Attendee-Sourcing: Exploring The Design Space of Community-Informed Conference Scheduling

Constructing a good conference schedule for a large multi-track conference needs to take into account the preferences and constraints of organizers, authors, and attendees. Creating a schedule which has fewer conflicts for authors and attendees, and thematically coherent sessions is a challenging task. Cobi introduced an alternative approach to conference scheduling by engaging the community to play an active role in the planning process. The current Cobi pipeline consists of committee-sourcing and author-sourcing to plan a conference schedule. We further explore the design space of community-sourcing by introducing attendee-sourcing -- a process that collects input from conference attendees and encodes them as preferences and constraints for creating sessions and schedule. For CHI 2014, a large multi-track conference in human-computer interaction with more than 3,000 attendees and 1,000 authors, we collected attendees' preferences by making available all the accepted papers at the conference on a paper recommendation tool we built called Confer, for a period of 45 days before announcing the conference program (sessions and schedule). We compare the preferences marked on Confer with the preferences collected from Cobi's author-sourcing approach. We show that attendee-sourcing can provide insights beyond what can be discovered by author-sourcing. For CHI 2014, the results show value in the method and attendees' participation. It produces data that provides more alternatives in scheduling and complements data collected from other methods for creating coherent sessions and reducing conflicts.

Institution

Affiliation not imported yet

This author record came from a source that does not expose affiliation metadata. Once the author claims the profile or we enrich the record from another provider, this section will link to the concrete institution.

Topic footprint