Trust snapshot

Quick read

Trust 21 - EmergingVerification L1Unclaimed author
45works
0followers
12topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

45 published item(s)

preprint2026arXiv

Learning Local Constraints for Reinforcement-Learned Content Generators

Constraint-based game content generators that learn local constraints from existing content, such as Wave Function Collapse (WFC), can generate visually satisfying game levels but face challenges in guaranteeing global properties, such as playability. On the other hand, reinforcement-learning trained generators can guarantee global properties -- because such properties can easily be included in reward functions -- but the results can be visually dissatisfying. In this paper, we explore ways to combine these methods. Specifically, we constrain the action space of a PCGRL generator with constraints learned by WFC, effectively allowing the PCGRL generator to achieve global properties while forced to adhere to local constraints. To better analyze how this hybrid content generation method operates, we vary the number and type of inputs, and we test whether to randomly collapse the starting state and exclude rare patterns. While the method is sensitive to hyperparameter tuning, the best of our trained generators produce visually satisfying and playable puzzle-platform game levels -- such as Lode Runner levels -- with desired global properties.

preprint2026arXiv

The Garden of Forking Paths: Narrative Arc-Conditioned Gameplay Planning

Narrative archetypes (e.g., Hero's Journey, Three-act structure) provide universal story structures that resonate across cultures and media and are important for video game storytelling, yet existing LLM-based methods lack explicit use of these archetypes in procedurally generated games. We propose Forking Garden, a framework for narrative arc-conditioned gameplay planning that generates branching games from user-provided storylines. Our approach first generates a diverse pool of independent nodes, then assembles them into a dungeon graph via arc-guided constraint algorithms, where each node achieves multimodal alignment of gameplay elements. We develop an end-to-end interactive system that instantiates the framework.

preprint2025arXiv

Mortar: Evolving Mechanics for Automatic Game Design

We present Mortar, a system for autonomously evolving game mechanics for automatic game design. Game mechanics define the rules and interactions that govern gameplay, and designing them manually is a time-consuming and expert-driven process. Mortar combines a quality-diversity algorithm with a large language model to explore a diverse set of mechanics, which are evaluated by synthesising complete games that incorporate both evolved mechanics and those drawn from an archive. The mechanics are evaluated by composing complete games through a tree search procedure, where the resulting games are evaluated by their ability to preserve a skill-based ordering over players -- that is, whether stronger players consistently outperform weaker ones. We assess the mechanics based on their contribution towards the skill-based ordering score in the game. We demonstrate that Mortar produces games that appear diverse and playable, and mechanics that contribute more towards the skill-based ordering score in the game. We perform ablation studies to assess the role of each system component and a user study to evaluate the games based on human feedback.

preprint2023arXiv

Pathfinding Neural Cellular Automata

Pathfinding makes up an important sub-component of a broad range of complex tasks in AI, such as robot path planning, transport routing, and game playing. While classical algorithms can efficiently compute shortest paths, neural networks could be better suited to adapting these sub-routines to more complex and intractable tasks. As a step toward developing such networks, we hand-code and learn models for Breadth-First Search (BFS), i.e. shortest path finding, using the unified architectural framework of Neural Cellular Automata, which are iterative neural networks with equal-size inputs and outputs. Similarly, we present a neural implementation of Depth-First Search (DFS), and outline how it can be combined with neural BFS to produce an NCA for computing diameter of a graph. We experiment with architectural modifications inspired by these hand-coded NCAs, training networks from scratch to solve the diameter problem on grid mazes while exhibiting strong generalization ability. Finally, we introduce a scheme in which data points are mutated adversarially during training. We find that adversarially evolving mazes leads to increased generalization on out-of-distribution examples, while at the same time generating data-sets with significantly more complex solutions for reasoning tasks.

preprint2022arXiv

Aesthetic Bot: Interactively Evolving Game Maps on Twitter

This paper describes the implementation of the Aesthetic Bot, an automated Twitter account that posts images of small game maps that are either user-made or generated from an evolutionary system. The bot then prompts users to vote via a poll posted in the image's thread for the most aesthetically pleasing map. This creates a rating system that allows for direct interaction with the bot in a way that is integrated seamlessly into a user's regularly updated Twitter content feed. Upon conclusion of the each voting round, the bot learns from the distribution of votes for each map to emulate user preferences for design and visual aesthetic in order to generate maps that would win future vote pairings. We discuss the ongoing results and emerging behaviors that have occurred since the release of this system from both the bot's generation of game maps and the participating Twitter users.

preprint2022arXiv

Approximating Gradients for Differentiable Quality Diversity in Reinforcement Learning

Consider the problem of training robustly capable agents. One approach is to generate a diverse collection of agent polices. Training can then be viewed as a quality diversity (QD) optimization problem, where we search for a collection of performant policies that are diverse with respect to quantified behavior. Recent work shows that differentiable quality diversity (DQD) algorithms greatly accelerate QD optimization when exact gradients are available. However, agent policies typically assume that the environment is not differentiable. To apply DQD algorithms to training agent policies, we must approximate gradients for performance and behavior. We propose two variants of the current state-of-the-art DQD algorithm that compute gradients via approximation methods common in reinforcement learning (RL). We evaluate our approach on four simulated locomotion tasks. One variant achieves results comparable to the current state-of-the-art in combining QD and RL, while the other performs comparably in two locomotion tasks. These results provide insight into the limitations of current DQD algorithms in domains where gradients must be approximated. Source code is available at https://github.com/icaros-usc/dqd-rl

preprint2022arXiv

Designer Modeling through Design Style Clustering

We propose modeling designer style in mixed-initiative game content creation tools as archetypical design traces. These design traces are formulated as transitions between design styles; these design styles are in turn found through clustering all intermediate designs along the way to making a complete design. This method is implemented in the Evolutionary Dungeon Designer, a research platform for mixed-initiative systems to create adventure and dungeon crawler games. We present results both in the form of design styles for rooms, which can be analyzed to better understand the kind of rooms designed by users, and in the form of archetypical sequences between these rooms, i.e., Designer Personas.

preprint2022arXiv

Diversity and Novelty MasterPrints: Generating Multiple DeepMasterPrints for Increased User Coverage

This work expands on previous advancements in genetic fingerprint spoofing via the DeepMasterPrints and introduces Diversity and Novelty MasterPrints. This system uses quality diversity evolutionary algorithms to generate dictionaries of artificial prints with a focus on increasing coverage of users from the dataset. The Diversity MasterPrints focus on generating solution prints that match with users not covered by previously found prints, and the Novelty MasterPrints explicitly search for prints with more that are farther in user space than previous prints. Our multi-print search methodologies outperform the singular DeepMasterPrints in both coverage and generalization while maintaining quality of the fingerprint image output.

preprint2022arXiv

Generating and Adapting to Diverse Ad-Hoc Cooperation Agents in Hanabi

Hanabi is a cooperative game that brings the problem of modeling other players to the forefront. In this game, coordinated groups of players can leverage pre-established conventions to great effect, but playing in an ad-hoc setting requires agents to adapt to its partner's strategies with no previous coordination. Evaluating an agent in this setting requires a diverse population of potential partners, but so far, the behavioral diversity of agents has not been considered in a systematic way. This paper proposes Quality Diversity algorithms as a promising class of algorithms to generate diverse populations for this purpose, and generates a population of diverse Hanabi agents using MAP-Elites. We also postulate that agents can benefit from a diverse population during training and implement a simple "meta-strategy" for adapting to an agent's perceived behavioral niche. We show this meta-strategy can work better than generalist strategies even outside the population it was trained with if its partner's behavioral niche can be correctly inferred, but in practice a partner's behavior depends and interferes with the meta-agent's own behavior, suggesting an avenue for future research in characterizing another agent's behavior during gameplay.

preprint2022arXiv

Generating Diverse Indoor Furniture Arrangements

We present a method for generating arrangements of indoor furniture from human-designed furniture layout data. Our method creates arrangements that target specified diversity, such as the total price of all furniture in the room and the number of pieces placed. To generate realistic furniture arrangement, we train a generative adversarial network (GAN) on human-designed layouts. To target specific diversity in the arrangements, we optimize the latent space of the GAN via a quality diversity algorithm to generate a diverse arrangement collection. Experiments show our approach discovers a set of arrangements that are similar to human-designed layouts but varies in price and number of furniture pieces.

preprint2022arXiv

Illuminating Diverse Neural Cellular Automata for Level Generation

We present a method of generating diverse collections of neural cellular automata (NCA) to design video game levels. While NCAs have so far only been trained via supervised learning, we present a quality diversity (QD) approach to generating a collection of NCA level generators. By framing the problem as a QD problem, our approach can train diverse level generators, whose output levels vary based on aesthetic or functional criteria. To efficiently generate NCAs, we train generators via Covariance Matrix Adaptation MAP-Elites (CMA-ME), a quality diversity algorithm which specializes in continuous search spaces. We apply our new method to generate level generators for several 2D tile-based games: a maze game, Sokoban, and Zelda. Our results show that CMA-ME can generate small NCAs that are diverse yet capable, often satisfying complex solvability criteria for deterministic agents. We compare against a Compositional Pattern-Producing Network (CPPN) baseline trained to produce diverse collections of generators and show that the NCA representation yields a better exploration of level-space.

preprint2022arXiv

Keke AI Competition: Solving puzzle levels in a dynamically changing mechanic space

The Keke AI Competition introduces an artificial agent competition for the game Baba is You - a Sokoban-like puzzle game where players can create rules that influence the mechanics of the game. Altering a rule can cause temporary or permanent effects for the rest of the level that could be part of the solution space. The nature of these dynamic rules and the deterministic aspect of the game creates a challenge for AI to adapt to a variety of mechanic combinations in order to solve a level. This paper describes the framework and evaluation metrics used to rank submitted agents and baseline results from sample tree search agents.

preprint2022arXiv

Learning Controllable 3D Level Generators

Procedural Content Generation via Reinforcement Learning (PCGRL) foregoes the need for large human-authored data-sets and allows agents to train explicitly on functional constraints, using computable, user-defined measures of quality instead of target output. We explore the application of PCGRL to 3D domains, in which content-generation tasks naturally have greater complexity and potential pertinence to real-world applications. Here, we introduce several PCGRL tasks for the 3D domain, Minecraft (Mojang Studios, 2009). These tasks will challenge RL-based generators using affordances often found in 3D environments, such as jumping, multiple dimensional movement, and gravity. We train an agent to optimize each of these tasks to explore the capabilities of previous research in PCGRL. This agent is able to generate relatively complex and diverse levels, and generalize to random initial states and control targets. Controllability tests in the presented tasks demonstrate their utility to analyze success and failure for 3D generators.

preprint2022arXiv

Mech-Elites: Illuminating the Mechanic Space of GVGAI

This paper introduces a fully automatic method of mechanic illumination for general video game level generation. Using the Constrained MAP-Elites algorithm and the GVG-AI framework, this system generates the simplest tile based levels that contain specific sets of game mechanics and also satisfy playability constraints. We apply this method to illuminate mechanic space for $4$ different games in GVG-AI: Zelda, Solarfox, Plants, and RealPortals.

preprint2022arXiv

Model-free Neural Counterfactual Regret Minimization with Bootstrap Learning

Counterfactual Regret Minimization (CFR) has achieved many fascinating results in solving large-scale Imperfect Information Games (IIGs). Neural network approximation CFR (neural CFR) is one of the promising techniques that can reduce computation and memory consumption by generalizing decision information between similar states. Current neural CFR algorithms have to approximate cumulative regrets. However, efficient and accurate approximation in a large-scale IIG is still a tough challenge. In this paper, a new CFR variant, Recursive CFR (ReCFR), is proposed. In ReCFR, Recursive Substitute Values (RSVs) are learned and used to replace cumulative regrets. It is proven that ReCFR can converge to a Nash equilibrium at a rate of $O({1}/{\sqrt{T}})$. Based on ReCFR, a new model-free neural CFR with bootstrap learning, Neural ReCFR-B, is proposed. Due to the recursive and non-cumulative nature of RSVs, Neural ReCFR-B has lower-variance training targets than other neural CFRs. Experimental results show that Neural ReCFR-B is competitive with the state-of-the-art neural CFR algorithms at a much lower training cost.

preprint2022arXiv

Mutation Models: Learning to Generate Levels by Imitating Evolution

Search-based procedural content generation (PCG) is a well-known method for level generation in games. Its key advantage is that it is generic and able to satisfy functional constraints. However, due to the heavy computational costs to run these algorithms online, search-based PCG is rarely utilized for real-time generation. In this paper, we introduce mutation models, a new type of iterative level generator based on machine learning. We train a model to imitate the evolutionary process and use the trained model to generate levels. This trained model is able to modify noisy levels sequentially to create better levels without the need for a fitness function during inference. We evaluate our trained models on a 2D maze generation task. We compare several different versions of the method: training the models either at the end of evolution (normal evolution) or every 100 generations (assisted evolution) and using the model as a mutation function during evolution. Using the assisted evolution process, the final trained models are able to generate mazes with a success rate of 99% and high diversity of 86%. The trained model is many times faster than the evolutionary process it was trained on. This work opens the door to a new way of learning level generators guided by an evolutionary process, meaning automatic creation of generators with specifiable constraints and objectives that are fast enough for runtime deployment in games.

preprint2022arXiv

Persona-driven Dominant/Submissive Map (PDSM) Generation for Tutorials

In this paper, we present a method for automated persona-driven video game tutorial level generation. Tutorial levels are scenarios in which the player can explore and discover different rules and game mechanics. Procedural personas can guide generators to create content which encourages or discourages certain playstyle behaviors. In this system, we use procedural personas to calculate the behavioral characteristics of levels which are evolved using the quality-diversity algorithm known as Constrained MAP-Elites. An evolved map's quality is determined by its simplicity: the simpler it is, the better it is. Within this work, we show that the generated maps can strongly encourage or discourage different persona-like behaviors and range from simple solutions to complex puzzle-levels, making them perfect candidates for a tutorial generative system.

preprint2022arXiv

Pommerman: A Multi-Agent Playground

We present Pommerman, a multi-agent environment based on the classic console game Bomberman. Pommerman consists of a set of scenarios, each having at least four players and containing both cooperative and competitive aspects. We believe that success in Pommerman will require a diverse set of tools and methods, including planning, opponent/teammate modeling, game theory, and communication, and consequently can serve well as a multi-agent benchmark. To date, we have already hosted one competition, and our next one will be featured in the NIPS 2018 competition track.

preprint2022arXiv

Predicting Personas Using Mechanic Frequencies and Game State Traces

We investigate how to efficiently predict play personas based on playtraces. Play personas can be computed by calculating the action agreement ratio between a player and a generative model of playing behavior, a so-called procedural persona. But this is computationally expensive and assumes that appropriate procedural personas are readily available. We present two methods for estimating player persona, one using regular supervised learning and aggregate measures of game mechanics initiated, and another based on sequence learning on a trace of closely cropped gameplay observations. While both of these methods achieve high accuracy when predicting play personas defined by agreement with procedural personas, they utterly fail to predict play style as defined by the players themselves using a questionnaire. This interesting result highlights the value of using computational methods in defining play personas.

preprint2022arXiv

Reinforcement Learning with Dual-Observation for General Video Game Playing

Reinforcement learning algorithms have performed well in playing challenging board and video games. More and more studies focus on improving the generalisation ability of reinforcement learning algorithms. The General Video Game AI Learning Competition aims to develop agents capable of learning to play different game levels that were unseen during training. This paper summarises the five years' General Video Game AI Learning Competition editions. At each edition, three new games were designed. The training and test levels were designed separately in the first three editions. Since 2020, three test levels of each game were generated by perturbing or combining two training levels. Then, we present a novel reinforcement learning technique with dual-observation for general video game playing, assuming that it is more likely to observe similar local information in different levels rather than global information. Instead of directly inputting a single, raw pixel-based screenshot of the current game screen, our proposed general technique takes the encoded, transformed global and local observations of the game screen as two simultaneous inputs, aiming at learning local information for playing new levels. Our proposed technique is implemented with three state-of-the-art reinforcement learning algorithms and tested on the game set of the 2020 General Video Game AI Learning Competition. Ablation studies show the outstanding performance of using encoded, transformed global and local observations as input.

preprint2022arXiv

Transfer Dynamics in Emergent Evolutionary Curricula

PINSKY is a system for open-ended learning through neuroevolution in game-based domains. It builds on the Paired Open-Ended Trailblazer (POET) system, which originally explored learning and environment generation for bipedal walkers, and adapts it to games in the General Video Game AI (GVGAI) system. Previous work showed that by co-evolving levels and neural network policies, levels could be found for which successful policies could not be created via optimization alone. Studied in the realm of Artificial Life as a potentially open-ended alternative to gradient-based fitness, minimal criteria (MC)-based selection helps foster diversity in evolutionary populations. The main question addressed by this paper is how the open-ended learning actually works, focusing in particular on the role of transfer of policies from one evolutionary branch ("species") to another. We analyze the dynamics of the system through creating phylogenetic trees, analyzing evolutionary trajectories of policies, and temporally breaking down transfers according to species type. Furthermore, we analyze the impact of the minimal criterion on generated level diversity and inter-species transfer. The most insightful finding is that inter-species transfer, while rare, is crucial to the system's success.

preprint2022arXiv

Watts: Infrastructure for Open-Ended Learning

This paper proposes a framework called Watts for implementing, comparing, and recombining open-ended learning (OEL) algorithms. Motivated by modularity and algorithmic flexibility, Watts atomizes the components of OEL systems to promote the study of and direct comparisons between approaches. Examining implementations of three OEL algorithms, the paper introduces the modules of the framework. The hope is for Watts to enable benchmarking and to explore new types of OEL algorithms. The repo is available at \url{https://github.com/aadharna/watts}

preprint2021arXiv

Greedy Scheduling: A Neural Network Method to Reduce Task Failure in Software Crowdsourcing

Context: Highly dynamic and competitive crowdsourcing software development (CSD) marketplaces may experience task failure due to unforeseen reasons, such as increased competition over shared supplier resources, or uncertainty associated with a dynamic worker supply. Existing analysis reveals an average task failure ratio of 15.7\% in software crowdsourcing markets. Goal: The objective of this study is to provide a task scheduling recommendation model for software crowdsourcing platforms in order to improve the success and efficiency of software crowdsourcing. Method: We propose a task scheduling method based on neural networks, and develop a system that can predict and analyze task failure probability upon arrival. More specifically, the model uses a range of input variables, including the number of open tasks in the platform, the average task similarity between arriving tasks and open tasks, the winner's monetary prize, and task duration, to predict the probability of task failure on the planned arrival date and two surplus days. This prediction will offer the recommended day associated with the lowest task failure probability to post the task. The proposed model is based on the workflow and data of Topcoder, one of the primary software crowdsourcing platforms. Results: We present a model that suggests the best recommended arrival dates for any task in the project with surplus of two days per task in the project. The model on average provided 4\% lower failure ratio per project.

preprint2021arXiv

Interactive Constrained MAP-Elites: Analysis and Evaluation of the Expressiveness of the Feature Dimensions

We propose the Interactive Constrained MAP-Elites, a quality-diversity solution for game content generation, implemented as a new feature of the Evolutionary Dungeon Designer: a mixed-initiative co-creativity tool for designing dungeons. The feature uses the MAP-Elites algorithm, an illumination algorithm that segregates the population among several cells depending on their scores with respect to different behavioral dimensions. Users can flexibly and dynamically alternate between these dimensions anytime, thus guiding the evolutionary process in an intuitive way, and then incorporate suggestions produced by the algorithm in their room designs. At the same time, any modifications performed by the human user will feed back into MAP-Elites, closing a circular workflow of constant mutual inspiration. This paper presents the algorithm followed by an in-depth analysis of its behaviour, with the aims of evaluating the expressive range of all possible dimension combinations in several scenarios, as well as discussing their influence in the fitness landscape and in the overall performance of the mixed-initiative procedural content generation.

preprint2021arXiv

Mixed-Initiative Level Design with RL Brush

This paper introduces RL Brush, a level-editing tool for tile-based games designed for mixed-initiative co-creation. The tool uses reinforcement-learning-based models to augment manual human level-design through the addition of AI-generated suggestions. Here, we apply RL Brush to designing levels for the classic puzzle game Sokoban. We put the tool online and tested it in 39 different sessions. The results show that users using the AI suggestions stay around longer and their created levels on average are more playable and more complex than without.

preprint2020arXiv

A Continuous Information Gain Measure to Find the Most Discriminatory Problems for AI Benchmarking

This paper introduces an information-theoretic method for selecting a subset of problems which gives the most information about a group of problem-solving algorithms. This method was tested on the games in the General Video Game AI (GVGAI) framework, allowing us to identify a smaller set of games that still gives a large amount of information about the abilities of different game-playing agents. This approach can be used to make agent testing more efficient. We can achieve almost as good discriminatory accuracy when testing on only a handful of games as when testing on more than a hundred games, something which is often computationally infeasible. Furthermore, this method can be extended to study the dimensions of the effective variance in game design between these games, allowing us to identify which games differentiate between agents in the most complementary ways.

preprint2020arXiv

A Fast and Efficient Stochastic Opposition-Based Learning for Differential Evolution in Numerical Optimization

A fast and efficient stochastic opposition-based learning (OBL) variant is proposed in this paper. OBL is a machine learning concept to accelerate the convergence of soft computing algorithms, which consists of simultaneously calculating an original solution and its opposite. Recently, a stochastic OBL variant called BetaCOBL was proposed, which is capable of controlling the degree of opposite solutions, preserving useful information held by original solutions, and preventing the waste of fitness evaluations. While it has shown outstanding performance compared to several state-of-the-art OBL variants, the high computational cost of BetaCOBL may hinder it from cost-sensitive optimization problems. Also, as it assumes that the decision variables of a given problem are independent, BetaCOBL may be ineffective for optimizing inseparable problems. In this paper, we propose an improved BetaCOBL that mitigates all the limitations. The proposed algorithm called iBetaCOBL reduces the computational cost from $O(NP^{2} \cdot D)$ to $O(NP \cdot D)$ ($NP$ and $D$ stand for population size and a dimension, respectively) using a linear time diversity measure. Also, the proposed algorithm preserves strongly dependent variables that are adjacent to each other using multiple exponential crossover. We used differential evolution (DE) variants to evaluate the performance of the proposed algorithm. The results of the performance evaluations on a set of 58 test functions show the excellent performance of iBetaCOBL compared to ten state-of-the-art OBL variants, including BetaCOBL.

preprint2020arXiv

Advanced Cauchy Mutation for Differential Evolution in Numerical Optimization

Among many evolutionary algorithms, differential evolution (DE) has received much attention over the last two decades. DE is a simple yet powerful evolutionary algorithm that has been used successfully to optimize various real-world problems. Since it was introduced, many researchers have developed new methods for DE, and one of them makes use of a mutation based on the Cauchy distribution to increase the convergence speed of DE. The method monitors the results of each individual in the selection operator and performs the Cauchy mutation on consecutively failed individuals, which generates mutant vectors by perturbing the best individual with the Cauchy distribution. Therefore, the method can locate the consecutively failed individuals to new positions close to the best individual. Although this approach is interesting, it fails to take into account establishing a balance between exploration and exploitation. In this paper, we propose a sigmoid based parameter control that alters the failure threshold for performing the Cauchy mutation in a time-varying schedule, which can establish a good ratio between exploration and exploitation. Experiments and comparisons have been done with six conventional and six advanced DE variants on a set of 30 benchmark problems, which indicate that the DE variants assisted by the proposed algorithm are highly competitive, especially for multimodal functions.

preprint2020arXiv

Adversarial Encoder-Multi-Task-Decoder for Multi-Stage Processes

In multi-stage processes, decisions occur in an ordered sequence of stages. Early stages usually have more observations with general information (easier/cheaper to collect), while later stages have fewer observations but more specific data. This situation can be represented by a dual funnel structure, in which the sample size decreases from one stage to the other while the information increases. Training classifiers in this scenario is challenging since information in the early stages may not contain distinct patterns to learn (underfitting). In contrast, the small sample size in later stages can cause overfitting. We address both cases by introducing a framework that combines adversarial autoencoders (AAE), multi-task learning (MTL), and multi-label semi-supervised learning (MLSSL). We improve the decoder of the AAE with an MTL component so it can jointly reconstruct the original input and use feature nets to predict the features for the next stages. We also introduce a sequence constraint in the output of an MLSSL classifier to guarantee the sequential pattern in the predictions. Using real-world data from different domains (selection process, medical diagnosis), we show that our approach outperforms other state-of-the-art methods.

preprint2020arXiv

Automatic Critical Mechanic Discovery Using Playtraces in Video Games

We present a new method of automatic critical mechanic discovery for video games using a combination of game description parsing and playtrace information. This method is applied to several games within the General Video Game Artificial Intelligence (GVG-AI) framework. In a user study, human-identified mechanics are compared against system-identified critical mechanics to verify alignment between humans and the system. The results of the study demonstrate that the new method is able to match humans with higher consistency than baseline. Our system is further validated by comparing MCTS agents augmented with critical mechanics and vanilla MCTS agents on $4$ games from GVG-AI. Our new playtrace method shows a significant performance improvement over the baseline for all 4 tested games. The proposed method also shows either matched or improved performance over the old method, demonstrating that playtrace information is responsible for more complete critical mechanic discovery.

preprint2020arXiv

Baba is Y'all: Collaborative Mixed-Initiative Level Design

We present a collaborative mixed-initiative system for building levels for the puzzle game "Baba is You". Unlike previous mixed-initiative systems, Baba is Y'all is designed for collaborative asynchronous creation by multiple users over the internet. The system includes several AI-assisted features to help designers, including a level evolver and an automated player for playtesting. The level archives catalogues levels according to which mechanics are implemented and not implemented, allowing the system to ask users to design levels with specific combinations of mechanics. We describe the operation of the system and the results of small-scale informal user test, and discuss future development paths for this system as well as for collaborative mixed-initiative systems in general.

preprint2020arXiv

Capturing Local and Global Patterns in Procedural Content Generation via Machine Learning

Recent procedural content generation via machine learning (PCGML) methods allow learning from existing content to produce similar content automatically. While these approaches are able to generate content for different games (e.g. Super Mario Bros., DOOM, Zelda, and Kid Icarus), it is an open questions how well these approaches can capture large-scale visual patterns such as symmetry. In this paper, we propose match-three games as a domain to test PCGML algorithms regarding their ability to generate suitable patterns. We demonstrate that popular algorithm such as Generative Adversarial Networks struggle in this domain and propose adaptations to improve their performance. In particular we augment the neighborhood of a Markov Random Fields approach to not only take local but also symmetric positional information into account. We conduct several empirical tests including a user study that show the improvements achieved by the proposed modifications, and obtain promising results.

preprint2020arXiv

Co-generation of game levels and game-playing agents

Open-endedness, primarily studied in the context of artificial life, is the ability of systems to generate potentially unbounded ontologies of increasing novelty and complexity. Engineering generative systems displaying at least some degree of this ability is a goal with clear applications to procedural content generation in games. The Paired Open-Ended Trailblazer (POET) algorithm, heretofore explored only in a biped walking domain, is a coevolutionary system that simultaneously generates environments and agents that can solve them. This paper introduces a POET-Inspired Neuroevolutionary System for KreativitY (PINSKY) in games, which co-generates levels for multiple video games and agents that play them. This system leverages the General Video Game Artificial Intelligence (GVGAI) framework to enable co-generation of levels and agents for the 2D Atari-style games Zelda and Solar Fox. Results demonstrate the ability of PINSKY to generate curricula of game levels, opening up a promising new avenue for research at the intersection of procedural content generation and artificial life. At the same time, results in these challenging game domains highlight the limitations of the current algorithm and opportunities for improvement.

preprint2020arXiv

Covariance Matrix Adaptation for the Rapid Illumination of Behavior Space

We focus on the challenge of finding a diverse collection of quality solutions on complex continuous domains. While quality diver-sity (QD) algorithms like Novelty Search with Local Competition (NSLC) and MAP-Elites are designed to generate a diverse range of solutions, these algorithms require a large number of evaluations for exploration of continuous spaces. Meanwhile, variants of the Covariance Matrix Adaptation Evolution Strategy (CMA-ES) are among the best-performing derivative-free optimizers in single-objective continuous domains. This paper proposes a new QD algorithm called Covariance Matrix Adaptation MAP-Elites (CMA-ME). Our new algorithm combines the self-adaptation techniques of CMA-ES with archiving and mapping techniques for maintaining diversity in QD. Results from experiments based on standard continuous optimization benchmarks show that CMA-ME finds better-quality solutions than MAP-Elites; similarly, results on the strategic game Hearthstone show that CMA-ME finds both a higher overall quality and broader diversity of strategies than both CMA-ES and MAP-Elites. Overall, CMA-ME more than doubles the performance of MAP-Elites using standard QD performance metrics. These results suggest that QD algorithms augmented by operators from state-of-the-art optimization algorithms can yield high-performing methods for simultaneously exploring and optimizing continuous search spaces, with significant applications to design, testing, and reinforcement learning among other domains.

preprint2020arXiv

Evaluating the Rainbow DQN Agent in Hanabi with Unseen Partners

Hanabi is a cooperative game that challenges exist-ing AI techniques due to its focus on modeling the mental states ofother players to interpret and predict their behavior. While thereare agents that can achieve near-perfect scores in the game byagreeing on some shared strategy, comparatively little progresshas been made in ad-hoc cooperation settings, where partnersand strategies are not known in advance. In this paper, we showthat agents trained through self-play using the popular RainbowDQN architecture fail to cooperate well with simple rule-basedagents that were not seen during training and, conversely, whenthese agents are trained to play with any individual rule-basedagent, or even a mix of these agents, they fail to achieve goodself-play scores.

preprint2020arXiv

Increasing Generality in Machine Learning through Procedural Content Generation

Procedural Content Generation (PCG) refers to the practice, in videogames and other games, of generating content such as levels, quests, or characters algorithmically. Motivated by the need to make games replayable, as well as to reduce authoring burden, limit storage space requirements, and enable particular aesthetics, a large number of PCG methods have been devised by game developers. Additionally, researchers have explored adapting methods from machine learning, optimization, and constraint solving to PCG problems. Games have been widely used in AI research since the inception of the field, and in recent years have been used to develop and benchmark new machine learning algorithms. Through this practice, it has become more apparent that these algorithms are susceptible to overfitting. Often, an algorithm will not learn a general policy, but instead a policy that will only work for a particular version of a particular task with particular initial parameters. In response, researchers have begun exploring randomization of problem parameters to counteract such overfitting and to allow trained policies to more easily transfer from one environment to another, such as from a simulated robot to a robot in the real world. Here we review the large amount of existing work on PCG, which we believe has an important role to play in increasing the generality of machine learning methods. The main goal here is to present RL/AI with new tools from the PCG toolbox, and its secondary goal is to explain to game developers and researchers a way in which their work is relevant to AI research.

preprint2020arXiv

Mario Level Generation From Mechanics Using Scene Stitching

This paper presents a level generation method for Super Mario by stitching together pre-generated "scenes" that contain specific mechanics, using mechanic-sequences from agent playthroughs as input specifications. Given a sequence of mechanics, our system uses an FI-2Pop algorithm and a corpus of scenes to perform automated level authoring. The system outputs levels that have a similar mechanical sequence to the target mechanic sequence but with a different playthrough experience. We compare our system to a greedy method that selects scenes that maximize the target mechanics. Our system is able to maximize the number of matched mechanics while reducing emergent mechanics using the stitching process compared to the greedy approach.

preprint2020arXiv

Multi-Objective level generator generation with Marahel

This paper introduces a new system to design constructive level generators by searching the space of constructive level generators defined by Marahel language. We use NSGA-II, a multi-objective optimization algorithm, to search for generators for three different problems (Binary, Zelda, and Sokoban). We restrict the representation to a subset of Marahel language to push the evolution to find more efficient generators. The results show that the generated generators were able to achieve good performance on most of the fitness functions over these three problems. However, on Zelda and Sokoban, they tend to depend on the initial state than modifying the map.

preprint2020arXiv

Multi-Stage Transfer Learning with an Application to Selection Process

In multi-stage processes, decisions happen in an ordered sequence of stages. Many of them have the structure of dual funnel problem: as the sample size decreases from one stage to the other, the information increases. A related example is a selection process, where applicants apply for a position, prize, or grant. In each stage, more applicants are evaluated and filtered out, and from the remaining ones, more information is collected. In the last stage, decision-makers use all available information to make their final decision. To train a classifier for each stage becomes impracticable as they can underfit due to the low dimensionality in early stages or overfit due to the small sample size in the latter stages. In this work, we proposed a \textit{Multi-StaGe Transfer Learning} (MSGTL) approach that uses knowledge from simple classifiers trained in early stages to improve the performance of classifiers in the latter stages. By transferring weights from simpler neural networks trained in larger datasets, we able to fine-tune more complex neural networks in the latter stages without overfitting due to the small sample size. We show that it is possible to control the trade-off between conserving knowledge and fine-tuning using a simple probabilistic map. Experiments using real-world data demonstrate the efficacy of our approach as it outperforms other state-of-the-art methods for transfer learning and regularization.

preprint2020arXiv

PCGRL: Procedural Content Generation via Reinforcement Learning

We investigate how reinforcement learning can be used to train level-designing agents. This represents a new approach to procedural content generation in games, where level design is framed as a game, and the content generator itself is learned. By seeing the design problem as a sequential task, we can use reinforcement learning to learn how to take the next action so that the expected final level quality is maximized. This approach can be used when few or no examples exist to train from, and the trained generator is very fast. We investigate three different ways of transforming two-dimensional level design problems into Markov decision processes and apply these to three game environments.

preprint2020arXiv

Rotation, Translation, and Cropping for Zero-Shot Generalization

Deep Reinforcement Learning (DRL) has shown impressive performance on domains with visual inputs, in particular various games. However, the agent is usually trained on a fixed environment, e.g. a fixed number of levels. A growing mass of evidence suggests that these trained models fail to generalize to even slight variations of the environments they were trained on. This paper advances the hypothesis that the lack of generalization is partly due to the input representation, and explores how rotation, cropping and translation could increase generality. We show that a cropped, translated and rotated observation can get better generalization on unseen levels of two-dimensional arcade games from the GVGAI framework. The generality of the agents is evaluated on both human-designed and procedurally generated levels.

preprint2020arXiv

Tree Search vs Optimization Approaches for Map Generation

Search-based procedural content generation uses stochastic global optimization algorithms to search for game content. However, standard tree search algorithms can be competitive with evolution on some optimization problems. We investigate the applicability of several tree search methods to level generation and compare them systematically with several optimization algorithms, including evolutionary algorithms. We compare them on three different game level generation problems: Binary, Zelda, and Sokoban. We introduce two new representations that can help tree search algorithms deal with the large branching factor of the generation problem. We find that in general, optimization algorithms clearly outperform tree search algorithms, but given the right problem representation certain tree search algorithms perform similarly to optimization algorithms, and in one particular problem, we see surprisingly strong results from MCTS.

preprint2020arXiv

Unified Multi-Domain Learning and Data Imputation using Adversarial Autoencoder

We present a novel framework that can combine multi-domain learning (MDL), data imputation (DI) and multi-task learning (MTL) to improve performance for classification and regression tasks in different domains. The core of our method is an adversarial autoencoder that can: (1) learn to produce domain-invariant embeddings to reduce the difference between domains; (2) learn the data distribution for each domain and correctly perform data imputation on missing data. For MDL, we use the Maximum Mean Discrepancy (MMD) measure to align the domain distributions. For DI, we use an adversarial approach where a generator fill in information for missing data and a discriminator tries to distinguish between real and imputed values. Finally, using the universal feature representation in the embeddings, we train a classifier using MTL that given input from any domain, can predict labels for all domains. We demonstrate the superior performance of our approach compared to other state-of-art methods in three distinct settings, DG-DI in image recognition with unstructured data, MTL-DI in grade estimation with structured data and MDMTL-DI in a selection process using mixed data.

preprint2019arXiv

Empowering Quality Diversity in Dungeon Design with Interactive Constrained MAP-Elites

We propose the use of quality-diversity algorithms for mixed-initiative game content generation. This idea is implemented as a new feature of the Evolutionary Dungeon Designer, a system for mixed-initiative design of the type of levels you typically find in computer role playing games. The feature uses the MAP-Elites algorithm, an illumination algorithm which divides the population into a number of cells depending on their values along several behavioral dimensions. Users can flexibly and dynamically choose relevant dimensions of variation, and incorporate suggestions produced by the algorithm in their map designs. At the same time, any modifications performed by the human feed back into MAP-Elites, and are used to generate further suggestions.

preprint2019arXiv

Procedural Content Generation through Quality Diversity

Quality-diversity (QD) algorithms search for a set of good solutions which cover a space as defined by behavior metrics. This simultaneous focus on quality and diversity with explicit metrics sets QD algorithms apart from standard single- and multi-objective evolutionary algorithms, as well as from diversity preservation approaches such as niching. These properties open up new avenues for artificial intelligence in games, in particular for procedural content generation. Creating multiple systematically varying solutions allows new approaches to creative human-AI interaction as well as adaptivity. In the last few years, a handful of applications of QD to procedural content generation and game playing have been proposed; we discuss these and propose challenges for future work.