Source author record

Jialin Liu

Jialin Liu appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.


Artificial Intelligence; Machine Learning; Neural and Evolutionary Computing; Computer Science and Game Theory; Computer Vision; Distributed, Parallel, and Cluster Computing; math.OC; astro-ph.GA; astro-ph.SR; Computational Engineering, Finance, and Science; eess.IV; eess.SP; hep-ph; Performance; physics.chem-ph; physics.comp-ph; physics.data-an

Catalog footprint

What is connected

23 works

17 topics

4 close collaborators


preprint, 2026, arXiv

Bifurcation Models: Learning Set-Valued Solution Maps with Weight-Tied Dynamics

Many scientific and combinatorial problems admit multiple correct solutions, not a single label. Standard supervised learning resolves this ambiguity by choosing one solution as the target, but this hidden selector can be arbitrary, discontinuous, and harder to learn than the underlying solution set. We study bifurcation models, a weight-tied dynamical view in which different initializations can converge to different stable equilibria, so the model represents an attractor landscape rather than one chosen branch. We prove that broad set-valued maps with locally Lipschitz branches can be represented by regular equilibrium dynamics and that the induced selectors are almost everywhere regular, while manual selectors can be arbitrarily irregular. Experiments on frustrated Ising models show that such dynamics can discover multiple valid equilibria without branch labels and outperform single-branch supervision. Allen--Cahn experiments further show that diversity is not automatic: it can be encouraged explicitly, but with an accuracy--diversity tradeoff.

preprint, 2026, arXiv

The identification of new Herbig Ae/Be stars from LAMOST DR7

Herbig Ae/Be stars (HAeBes) are critical tracers of intermediate- and high-mass star formation, yet their census remains incomplete compared to low-mass young stellar objects like T-Tauri stars. To expand the known population, we systematically searched for HAeBes in LAMOST DR7 low-resolution spectra. Following Sun et al., we applied Uniform Manifold Approximation and Projection (UMAP) for dimensionality reduction and Support Vector Machine (SVM) classification, identifying $\sim$240,000 spectra with potential H$α$ emission. After removing contaminants (non-stellar objects, extragalactic sources, CVs, and Algol systems) and restricting to B/A-type stars, we obtained 1,835 candidates through 2MASS/WISE visual inspection. Spectral energy distribution analysis confirmed 143 sources with infrared excess ($J$-band or longer wavelengths), including 92 known HAeBes. From the remaining 51 candidates, we classified 26 with strong infrared excess as new HAeBes. Color-index analysis of confirmed HAeBes and classical Ae/Be stars (CAeBes) revealed that the $(K-W1)_0$ vs. $(W2-W3)_0$ diagram effectively separates these populations: CAeBes predominantly occupy $(K-W1)_0 \leq 0.5$ and $(W2-W3)_0 \leq 1.1$, while other regions trace transition disks ($(K-W1)_0 < 0.5$ and $(W2-W3)_0 > 1.1$), globally depleted disks ($(K-W1)_0 > 0.5$ and $(W2-W3)_0 < 1.1$), and Class I/Flat/II HAeBes ($(K-W1)_0 > 0.5$ and $(W2-W3)_0 > 1.1$). More importantly, the HAeBes exhibit a clear evolutionary gradient on this diagram, with those in the Class III, Class II, Flat-SED, and Class I evolutionary stages being effectively distinguished by concentric ellipses that are roughly centered at (0,0) with semi-major axes of $a$=1.5, $a$=3.0, and $a$=4.0, and a semi-major to semi-minor axis ratio of 1.6:1.
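The quadrant boundaries quoted in the abstract above can be read as a simple lookup on the two dereddened colours. The following is a minimal sketch only, not the authors' code: the function name `classify_region` and the label strings are hypothetical, and the abstract does not say how points lying exactly on a boundary are assigned, so the treatment of equality here is an assumption.

```python
def classify_region(k_w1: float, w2_w3: float) -> str:
    """Map dereddened colours (K-W1)_0 and (W2-W3)_0 to a quadrant label.

    Thresholds 0.5 and 1.1 are taken from the abstract. Points exactly on
    a threshold are treated as "not red" (an assumption; the abstract only
    fixes this for the CAeBe quadrant).
    """
    red_near = k_w1 > 0.5   # excess in K-W1
    red_mid = w2_w3 > 1.1   # excess in W2-W3

    if not red_near and not red_mid:
        return "CAeBe-dominated"
    if not red_near and red_mid:
        return "transition disk"
    if red_near and not red_mid:
        return "globally depleted disk"
    return "Class I/Flat/II HAeBe"


if __name__ == "__main__":
    # Hypothetical colour pairs, chosen only to hit each quadrant once.
    for colours in [(0.2, 0.5), (0.3, 2.0), (1.0, 0.8), (1.5, 2.5)]:
        print(colours, classify_region(*colours))
```

The concentric-ellipse separation of evolutionary classes is left out of the sketch because the abstract does not state which colour axis carries the semi-major axis.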

preprint, 2022, arXiv

Generating Game Levels of Diverse Behaviour Engagement

In recent years, there has been growing interest in experience-driven procedural level generation. Various metrics have been formulated to model player experience and help generate personalised levels. In this work, we question whether experience metrics can adapt to agents with different personas. We start by reviewing existing metrics for evaluating game levels. Then, focusing on platformer games, we design a framework integrating various agents and evaluation metrics. Experimental studies on Super Mario Bros. indicate that using the same evaluation metrics but agents with different personas can generate levels for a particular persona. This implies that, for simple games, using a game-playing agent of a specific player archetype as a level tester is probably all we need to generate levels of diverse behaviour engagement.

preprint, 2022, arXiv

Online Game Level Generation from Music

Games consist of multiple types of content, and the harmony of different content types plays an essential role in game design. However, most works on procedural content generation consider only one type of content at a time. In this paper, we propose and formulate online level generation from music, by matching a level feature to a music feature in real time while adapting to players' play speed. A generic framework named online player-adaptive procedural content generation via reinforcement learning, OPARL for short, is built upon experience-driven reinforcement learning and controllable reinforcement learning to enable online level generation from music. Furthermore, a novel control policy based on local search and k-nearest neighbours is proposed and integrated into OPARL to control the level generator considering the play data collected online. Results of simulation-based experiments show that our implementation of OPARL is able to generate playable levels with a difficulty degree matched to the "energy" dynamic of music for different artificial players in an online fashion.

preprint, 2022, arXiv

Reinforcement Learning with Dual-Observation for General Video Game Playing

Reinforcement learning algorithms have performed well in playing challenging board and video games. More and more studies focus on improving the generalisation ability of reinforcement learning algorithms. The General Video Game AI Learning Competition aims to develop agents capable of learning to play different game levels that were unseen during training. This paper summarises five years of General Video Game AI Learning Competition editions. At each edition, three new games were designed. The training and test levels were designed separately in the first three editions. Since 2020, three test levels of each game were generated by perturbing or combining two training levels. Then, we present a novel reinforcement learning technique with dual-observation for general video game playing, assuming that it is more likely to observe similar local information in different levels rather than global information. Instead of directly inputting a single, raw pixel-based screenshot of the current game screen, our proposed general technique takes the encoded, transformed global and local observations of the game screen as two simultaneous inputs, aiming at learning local information for playing new levels. Our proposed technique is implemented with three state-of-the-art reinforcement learning algorithms and tested on the game set of the 2020 General Video Game AI Learning Competition. Ablation studies show the outstanding performance of using encoded, transformed global and local observations as input.

preprint, 2020, arXiv

A Hybrid Evolutionary Algorithm for Reliable Facility Location Problem

The reliable facility location problem (RFLP) is an important research topic of operational research and plays a vital role in the decision-making and management of modern supply chain and logistics. Through solving RFLP, the decision-maker can obtain reliable location decisions under the risk of facilities' disruptions or failures. In this paper, we propose a novel model for the RFLP. Instead of assuming that a fixed number of facilities is allocated to each customer, as in existing works, we set the number of allocated facilities as an independent variable in our proposed model, which makes our model closer to real-life scenarios but more difficult to solve by traditional methods. To handle it, we propose EAMLS, a hybrid evolutionary algorithm, which combines a memorable local search (MLS) method and an evolutionary algorithm (EA). Additionally, a novel metric called l3-value is proposed to assist the analysis of the algorithm's convergence speed and examine the process of evolution. The experimental results show the effectiveness and superior performance of our EAMLS, compared to a CPLEX solver and a Genetic Algorithm (GA), on large-scale problems.

preprint, 2020, arXiv

A Novel CNet-assisted Evolutionary Level Repairer and Its Applications to Super Mario Bros

Applying latent variable evolution to game level design has become more and more popular as little human expert knowledge is required. However, defective levels with illegal patterns may be generated due to the violation of constraints for level design. A traditional way of repairing the defective levels is programming specific rule-based repairers to patch the flaw. However, programming these constraints is sometimes complex and not straightforward. An autonomous level repairer which is capable of learning the constraints is needed. In this paper, we propose a novel approach, CNet, to learn the probability distribution of a tile given its surrounding tiles on a set of real levels, and then detect the illegal tiles in newly generated levels. Then, an evolutionary repairer is designed to search for optimal replacement schemes, equipped with a novel search space constructed with the help of CNet and a novel heuristic function. The proposed approaches are shown to be effective in our case study of repairing GAN-generated and artificially destroyed levels of the Super Mario Bros. game. Our CNet-assisted evolutionary repairer can also be easily applied to other games whose levels can be represented by a matrix of objects or tiles.

preprint, 2020, arXiv

Efficient Probabilistic Inference in the Quest for Physics Beyond the Standard Model

We present a novel probabilistic programming framework that couples directly to existing large-scale simulators through a cross-platform probabilistic execution protocol, which allows general-purpose inference engines to record and control random number draws within simulators in a language-agnostic way. The execution of existing simulators as probabilistic programs enables highly interpretable posterior inference in the structured model defined by the simulator code base. We demonstrate the technique in particle physics, on a scientifically accurate simulation of the tau lepton decay, which is a key ingredient in establishing the properties of the Higgs boson. Inference efficiency is achieved via inference compilation where a deep recurrent neural network is trained to parameterize proposal distributions and control the stochastic simulator in a sequential importance sampling scheme, at a fraction of the computational cost of a Markov chain Monte Carlo baseline.

preprint, 2020, arXiv

Interactive Evolution and Exploration Within Latent Level-Design Space of Generative Adversarial Networks

Generative Adversarial Networks (GANs) are an emerging form of indirect encoding. The GAN is trained to induce a latent space on training data, and a real-valued evolutionary algorithm can search that latent space. Such Latent Variable Evolution (LVE) has recently been applied to game levels. However, it is hard for objective scores to capture level features that are appealing to players. Therefore, this paper introduces a tool for interactive LVE of tile-based levels for games. The tool also allows for direct exploration of the latent dimensions, and allows users to play discovered levels. The tool works for a variety of GAN models trained for both Super Mario Bros. and The Legend of Zelda, and is easily generalizable to other games. A user study shows that both the evolution and latent space exploration features are appreciated, with a slight preference for direct exploration, but combining these features allows users to discover even better levels. User feedback also indicates how this system could eventually grow into a commercial design tool, with the addition of a few enhancements.

preprint, 2020, arXiv

Learning Convolutional Sparse Coding on Complex Domain for Interferometric Phase Restoration

Interferometric phase restoration has been investigated for decades and most of the state-of-the-art methods have achieved promising performances for InSAR phase restoration. These methods generally follow the nonlocal filtering processing chain aiming at circumventing the staircase effect and preserving the details of phase variations. In this paper, we propose an alternative approach for InSAR phase restoration, i.e. Complex Convolutional Sparse Coding (ComCSC) and its gradient regularized version. To the best of our knowledge, this is the first time that the InSAR phase restoration problem is solved in a deconvolutional fashion. The proposed methods can not only suppress interferometric phase noise, but also avoid the staircase effect and preserve the details. Furthermore, they provide insight into the elementary phase components of the interferometric phases. The experimental results on synthetic and realistic high- and medium-resolution datasets from TerraSAR-X StripMap and Sentinel-1 interferometric wide swath mode, respectively, show that our method outperforms previous state-of-the-art methods based on nonlocal InSAR filters, particularly InSAR-BM3D. The source code of this paper will be made publicly available for reproducible research inside the community.

preprint, 2020, arXiv

Task Augmentation by Rotating for Meta-Learning

Data augmentation is one of the most effective approaches for improving the accuracy of modern machine learning models, and it is also indispensable to train a deep model for meta-learning. In this paper, we introduce a task augmentation method by rotating, which increases the number of classes by rotating the original images 90, 180 and 270 degrees, unlike traditional augmentation methods, which increase the number of images. With a larger number of classes, we can sample more diverse task instances during training. Therefore, task augmentation by rotating allows us to train a deep network by meta-learning methods with little over-fitting. Experimental results show that our approach is better than using rotation to increase the number of images and achieves state-of-the-art performance on miniImageNet, CIFAR-FS, and FC100 few-shot learning benchmarks. The code is available at www.github.com/AceChuse/TaskLevelAug.
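The idea of turning each rotation of an image into a new class, rather than a new sample of the same class, can be illustrated with a small standard-library sketch. This is an illustration under stated assumptions and not the code from the linked repository: images are plain nested lists, and the label offset scheme `label + k * num_classes` and the function names `rotate90` and `augment_tasks` are hypothetical choices made for this example.

```python
def rotate90(image):
    """Rotate a 2-D image (list of rows) by 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def augment_tasks(dataset, num_classes):
    """Return a dataset with four times as many classes.

    Each (image, label) pair yields four pairs: the original plus the
    90, 180 and 270 degree rotations. Rotation k is given the new label
    label + k * num_classes (an assumed labelling scheme), so rotated
    images form distinct classes instead of extra samples.
    """
    augmented = []
    for image, label in dataset:
        rotated = image
        for k in range(4):
            augmented.append((rotated, label + k * num_classes))
            rotated = rotate90(rotated)
    return augmented


if __name__ == "__main__":
    toy = [([[1, 2], [3, 4]], 0), ([[5, 6], [7, 8]], 1)]  # two toy classes
    for image, label in augment_tasks(toy, num_classes=2):
        print(label, image)
```

Few-shot tasks would then be sampled from the enlarged class set in the usual way; that sampling step is not shown.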

preprint, 2020, arXiv

Towards in-store multi-person tracking using head detection and track heatmaps

Computer vision algorithms are being implemented across a breadth of industries to enable technological innovations. In this paper, we study the problem of computer-vision-based customer tracking in the retail industry. To this end, we introduce a dataset collected from a camera in an office environment where participants mimic various behaviors of customers in a supermarket. In addition, we describe an illustrative example of the use of this dataset for tracking participants based on a head tracking model in an effort to minimize errors due to occlusion. Furthermore, we propose a model for recognizing customers and staff based on their movement patterns. The model is evaluated using a real-world dataset collected in a supermarket over a 24-hour period and achieves 98% accuracy during training and 93% accuracy during evaluation.

preprint, 2020, arXiv

Versatile Black-Box Optimization

Automatically choosing the right algorithm using problem descriptors is a classical component of combinatorial optimization. It is also a good tool for making evolutionary algorithms fast, robust and versatile. We present Shiwa, an algorithm that performs well at discrete and continuous, noisy and noise-free, sequential and parallel black-box optimization. Our algorithm is experimentally compared to competitors on YABBOB, a BBOB-comparable testbed, and on some variants of it, and then validated on several real-world testbeds.

preprint, 2020, arXiv

Wavelet Scattering Networks for Atomistic Systems with Extrapolation of Material Properties

The dream of machine learning in materials science is for a model to learn the underlying physics of an atomic system, allowing it to move beyond interpolation of the training set to the prediction of properties that were not present in the original training data. In addition to advances in machine learning architectures and training techniques, achieving this ambitious goal requires a method to convert a 3D atomic system into a feature representation that preserves rotational and translational symmetry, smoothness under small perturbations, and invariance under re-ordering. The atomic orbital wavelet scattering transform preserves these symmetries by construction, and has achieved great success as a featurization method for machine learning energy prediction. Both in small molecules and in the bulk amorphous $\text{Li}_α\text{Si}$ system, machine learning models using wavelet scattering coefficients as features have demonstrated a comparable accuracy to Density Functional Theory at a small fraction of the computational cost. In this work, we test the generalizability of our $\text{Li}_α\text{Si}$ energy predictor to properties that were not included in the training set, such as elastic constants and migration barriers. We demonstrate that statistical feature selection methods can reduce over-fitting and lead to remarkable accuracy in these extrapolation tasks.

preprint, 2019, arXiv

Etalumis: Bringing Probabilistic Programming to Scientific Simulators at Scale

Probabilistic programming languages (PPLs) are receiving widespread attention for performing Bayesian inference in complex generative models. However, applications to science remain limited because of the impracticability of rewriting complex scientific simulators in a PPL, the computational cost of inference, and the lack of scalable implementations. To address these, we present a novel PPL framework that couples directly to existing scientific simulators through a cross-platform probabilistic execution protocol and provides Markov chain Monte Carlo (MCMC) and deep-learning-based inference compilation (IC) engines for tractable inference. To guide IC inference, we perform distributed training of a dynamic 3DCNN--LSTM architecture with a PyTorch-MPI-based framework on 1,024 32-core CPU nodes of the Cori supercomputer with a global minibatch size of 128k: achieving a performance of 450 Tflop/s through enhancements to PyTorch. We demonstrate a Large Hadron Collider (LHC) use-case with the C++ Sherpa simulator and achieve the largest-scale posterior inference in a Turing-complete PPL.

preprint, 2016, arXiv

Automatically Reinforcing a Game AI

A recent research trend in Artificial Intelligence (AI) is the combination of several programs into one single, stronger program; this is termed portfolio methods. We here investigate the application of such methods to Game Playing Programs (GPPs). In addition, we consider the case in which only one GPP is available, by decomposing this single GPP into several ones through the use of parameters or even simply random seeds. These portfolio methods are trained in a learning phase. We propose two different offline approaches. The simplest one, BestArm, is a straightforward optimization of seeds or parameters; it performs quite well against the original GPP, but performs poorly against an opponent which repeats games and learns. The second one, namely Nash-portfolio, performs similarly in a "one game" test, and is much more robust against an opponent who learns. We also propose an online learning portfolio, which tests several of the GPPs repeatedly and progressively switches to the best one using a bandit algorithm.

preprint, 2016, arXiv

Bandit-Based Random Mutation Hill-Climbing

The Random Mutation Hill-Climbing algorithm is a direct search technique mostly used in discrete domains. It repeats the process of randomly selecting a neighbour of the best-so-far solution and accepts the neighbour if it is better than or equal to it. In this work, we propose a novel method to select the neighbour solution using a set of independent multi-armed bandit-style selection units, which results in a bandit-based Random Mutation Hill-Climbing algorithm. The new algorithm significantly outperforms Random Mutation Hill-Climbing in both OneMax (in noise-free and noisy cases) and Royal Road problems (in the noise-free case). The algorithm shows particular promise for discrete optimisation problems where each fitness evaluation is expensive.
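For orientation, the baseline loop described in the abstract's first two sentences can be sketched on noise-free OneMax as follows. This is the plain Random Mutation Hill-Climbing baseline only, not the bandit-based variant the paper proposes; the single-bit-flip neighbourhood, the function names and the fixed evaluation budget are assumptions made for this illustration.

```python
import random


def one_max(bits):
    """Noise-free OneMax fitness: the number of ones in the bit string."""
    return sum(bits)


def rmhc(n_bits, budget, seed=0):
    """Plain Random Mutation Hill-Climbing on OneMax.

    Repeatedly flips one uniformly chosen bit of the best-so-far solution
    (assumed neighbourhood) and accepts the neighbour if its fitness is
    better than or equal to the current best.
    """
    rng = random.Random(seed)
    best = [rng.randint(0, 1) for _ in range(n_bits)]
    best_fit = one_max(best)
    for _ in range(budget):
        neighbour = best[:]              # copy, then mutate one position
        i = rng.randrange(n_bits)
        neighbour[i] ^= 1
        fit = one_max(neighbour)
        if fit >= best_fit:              # accept ties as well as improvements
            best, best_fit = neighbour, fit
    return best, best_fit


if __name__ == "__main__":
    solution, fitness = rmhc(n_bits=20, budget=2000)
    print(fitness, solution)
```

In this baseline the mutated position is drawn uniformly at random; the abstract indicates that the bandit-based variant replaces that selection with a set of bandit-style selection units, whose details are not given here.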

preprint, 2016, arXiv

Learning opening books in partially observable games: using random seeds in Phantom Go

Many artificial intelligences (AIs) are randomized. One can be lucky or unlucky with the random seed; we quantify this effect and show that, perhaps contrary to intuition, it is far from negligible. Then, we apply two different existing algorithms for selecting good seeds and good probability distributions over seeds. This mainly leads to learning an opening book. We apply this to Phantom Go, which, like all phantom games, is hard for opening book learning. We improve the winning rate from 50% to 70% in 5x5 against the same AI, and from approximately 0% to 40% in 5x5, 7x7 and 9x9 against a stronger (learning) opponent.

preprint, 2016, arXiv

Matrix Factorization at Scale: a Comparison of Scientific Data Analytics in Spark and C+MPI Using Three Case Studies

We explore the trade-offs of performing linear algebra using Apache Spark, compared to traditional C and MPI implementations on HPC platforms. Spark is designed for data analytics on cluster computing platforms with access to local disks and is optimized for data-parallel tasks. We examine three widely-used and important matrix factorizations: NMF (for physical plausibility), PCA (for its ubiquity) and CX (for data interpretability). We apply these methods to TB-sized problems in particle physics, climate modeling and bioimaging. The data matrices are tall-and-skinny, which enables the algorithms to map conveniently into Spark's data-parallel model. We perform scaling experiments on up to 1600 Cray XC40 nodes, describe the sources of slowdowns, and provide tuning guidance to obtain high performance.

preprint, 2016, arXiv

PANDA: Extreme Scale Parallel K-Nearest Neighbor on Distributed Architectures

Computing $k$-Nearest Neighbors (KNN) is one of the core kernels used in many machine learning, data mining and scientific computing applications. Although kd-tree based $O(\log n)$ algorithms have been proposed for computing KNN, due to their inherent sequentiality, linear algorithms are being used in practice. This limits the applicability of such methods to millions of data points, with limited scalability for Big Data analytics challenges in the scientific domain. In this paper, we present parallel and highly optimized kd-tree based KNN algorithms (both construction and querying) suitable for distributed architectures. Our algorithm includes novel approaches for pruning search space and improving load balancing and partitioning among nodes and threads. Using TB-sized datasets from three science applications: astrophysics, plasma physics, and particle physics, we show that our implementation can construct a kd-tree of 189 billion particles in 48 seconds using $\sim$50,000 cores. We also demonstrate computation of KNN of 19 billion queries in 12 seconds. We demonstrate almost linear speedup both for shared and distributed memory computers. Our algorithm outperforms earlier implementations by more than an order of magnitude, thereby radically improving the applicability of our implementation to state-of-the-art Big Data analytics problems. In addition, we showcase performance and scalability on the recently released Intel Xeon Phi processor, showing that our algorithm scales well even on massively parallel architectures.

preprint, 2016, arXiv

Rolling Horizon Coevolutionary Planning for Two-Player Video Games

This paper describes a new algorithm for decision making in two-player real-time video games. As with Monte Carlo Tree Search, the algorithm can be used without heuristics and has been developed for use in general video game AI. The approach is to extend recent work on rolling horizon evolutionary planning, which has been shown to work well for single-player games, to two (or in principle many) player games. To select an action the algorithm co-evolves two (or in the general case N) populations, one for each player, where each individual is a sequence of actions for the respective player. The fitness of each individual is evaluated by playing it against a selection of action-sequences from the opposing population. When choosing an action to take in the game, the first action is chosen from the fittest member of the population for that player. The new algorithm is compared with a number of general video game AI algorithms on three variations of a two-player space battle game, with promising results.

preprint, 2015, arXiv

Algorithm Portfolios for Noisy Optimization

Noisy optimization is the optimization of objective functions corrupted by noise. A portfolio of solvers is a set of solvers equipped with an algorithm selection tool for distributing the computational power among them. Portfolios are widely and successfully used in combinatorial optimization. In this work, we study portfolios of noisy optimization solvers. We obtain mathematically proved performance (in the sense that the portfolio performs nearly as well as the best of its solvers) by an ad hoc portfolio algorithm dedicated to noisy optimization. A somewhat surprising result is that it is better to compare solvers with some lag, i.e., to base the current recommendation of the best solver on their performance earlier in the run. An additional finding is a principled method for distributing the computational power among solvers in the portfolio.

preprint, 2015, arXiv

Random Multi-Constraint Projection: Stochastic Gradient Methods for Convex Optimization with Many Constraints

Consider convex optimization problems subject to a large number of constraints. We focus on stochastic problems in which the objective takes the form of expected values and the feasible set is the intersection of a large number of convex sets. We propose a class of algorithms that perform both stochastic gradient descent and random feasibility updates simultaneously. At every iteration, the algorithms sample a number of projection points onto randomly selected small subsets of all constraints. Three feasibility update schemes are considered: averaging over random projected points, projecting onto the most distant sample, and projecting onto a special polyhedral set constructed based on sample points. We prove the almost sure convergence of these algorithms, and analyze the iterates' feasibility error and optimality error, respectively. We provide new convergence rate benchmarks for stochastic first-order optimization with many constraints. The rate analysis and numerical experiments reveal that the algorithm using the polyhedral-set projection scheme is the most efficient one among known algorithms.
