Source author record

Tristan Cazenave

Tristan Cazenave appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Artificial Intelligence, Machine Learning, Computer Science and Game Theory, Robotics, eess.SY, math.CO, Multiagent Systems, Systems and Control

Catalog footprint

What is connected

12 works

8 topics

4 close collaborators

preprint · 2022 · arXiv

Optimal Solving of Constrained Path-Planning Problems with Graph Convolutional Networks and Optimized Tree Search

Deep learning-based methods are growing in prominence for planning purposes. In this paper, we present a hybrid planner that combines a graph machine learning model and an optimal solver based on branch and bound tree search for path-planning tasks. More specifically, a graph neural network is used to assist the branch and bound algorithm in handling constraints associated with a desired solution path. There are multiple downstream practical applications, such as Autonomous Unmanned Ground Vehicles (AUGV), typically deployed in disaster relief or search and rescue operations. In off-road environments, AUGVs must dynamically optimize a source-destination path under various operational constraints, out of which several are difficult to predict in advance and need to be addressed online. We conduct experiments on realistic scenarios and show that graph neural network support enables substantial speedup and smoother scaling to harder path-planning problems. Additionally, information provided by the graph neural network enables the approach to outperform problem-specific handcrafted heuristics, highlighting the potential graph neural networks hold for path-planning tasks.

preprint · 2022 · arXiv

Refutation of Spectral Graph Theory Conjectures with Monte Carlo Search

We demonstrate how Monte Carlo Search (MCS) algorithms, namely Nested Monte Carlo Search (NMCS) and Nested Rollout Policy Adaptation (NRPA), can be used to build graphs and find counter-examples to spectral graph theory conjectures in minutes.

preprint · 2022 · arXiv

Solving Disjunctive Temporal Networks with Uncertainty under Restricted Time-Based Controllability using Tree Search and Graph Neural Networks

Planning under uncertainty is an area of interest in artificial intelligence. We present a novel approach based on tree search and graph machine learning for the scheduling problem known as Disjunctive Temporal Networks with Uncertainty (DTNU). Dynamic Controllability (DC) of DTNUs seeks a reactive scheduling strategy to satisfy temporal constraints in response to uncontrollable action durations. We introduce new semantics for reactive scheduling: Time-based Dynamic Controllability (TDC) and a restricted subset of TDC, R-TDC. We design a tree search algorithm to determine whether or not a DTNU is R-TDC. Moreover, we leverage a graph neural network as a heuristic for tree search guidance. Finally, we conduct experiments on a known benchmark on which we show R-TDC to retain significant completeness with regard to DC, while being faster to prove. This results in the tree search processing fifty percent more DTNU problems in R-TDC than the state-of-the-art DC solver does in DC with the same time budget. We also observe that graph neural network search guidance leads to substantial performance gains on benchmarks of more complex DTNUs, with up to eleven times more problems solved than the baseline tree search.

preprint · 2021 · arXiv

Optimizing $αμ$

$αμ$ is a search algorithm which repairs two defects of Perfect Information Monte Carlo search: strategy fusion and non-locality. In this paper, we optimize $αμ$ for the game of Bridge, avoiding useless computations. The proposed optimizations are general and apply to other imperfect information turn-based games. We define multiple optimizations involving Pareto fronts, and show that these optimizations speed up the search. Some of these optimizations are cuts that stop the search at a node, while others keep track of which possible worlds have become redundant, avoiding unnecessary, costly evaluations. We also measure the benefits of parallelizing the double dummy searches at the leaves of the $αμ$ search tree.

preprint · 2021 · arXiv

Stabilized Nested Rollout Policy Adaptation

Nested Rollout Policy Adaptation (NRPA) is a Monte Carlo search algorithm for single player games. In this paper we propose to modify NRPA in order to improve the stability of the algorithm. Experiments show it improves the algorithm for different application domains: SameGame, Traveling Salesman with Time Windows and Expression Discovery.
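For orientation, the following is a minimal sketch of baseline NRPA as it is commonly described, not the stabilized variant this abstract proposes. The toy problem (pick a bit string that maximizes a score), the function names, and the parameters `alpha` and `iterations` are illustrative assumptions and do not come from the paper.

```python
import math
import random


def playout(policy, length, moves, rng):
    """Sample one sequence, choosing each move with softmax(policy weights)."""
    seq = []
    for step in range(length):
        weights = [math.exp(policy.get((step, m), 0.0)) for m in moves]
        seq.append(rng.choices(moves, weights=weights)[0])
    return seq


def adapt(policy, seq, moves, alpha=1.0):
    """Shift the policy toward the moves of seq (gradient step on its log-likelihood)."""
    new = dict(policy)
    for step, chosen in enumerate(seq):
        z = sum(math.exp(policy.get((step, m), 0.0)) for m in moves)
        for m in moves:
            p = math.exp(policy.get((step, m), 0.0)) / z
            new[(step, m)] = new.get((step, m), 0.0) - alpha * p
        new[(step, chosen)] = new.get((step, chosen), 0.0) + alpha
    return new


def nrpa(level, policy, score, length, moves, rng, iterations=10):
    """Nested search: level 0 is a playout, higher levels adapt toward their best sequence."""
    if level == 0:
        seq = playout(policy, length, moves, rng)
        return score(seq), seq
    best_score, best_seq = float("-inf"), None
    policy = dict(policy)  # each level works on its own copy of the policy
    for _ in range(iterations):
        s, seq = nrpa(level - 1, policy, score, length, moves, rng, iterations)
        if s >= best_score:
            best_score, best_seq = s, seq
        policy = adapt(policy, best_seq, moves)
    return best_score, best_seq


if __name__ == "__main__":
    # Hypothetical toy task: maximize the number of ones in an 8-bit string.
    rng = random.Random(0)
    print(nrpa(2, {}, sum, 8, [0, 1], rng))
```

The policy here is keyed by `(step, move)`; real applications would use a domain-specific move code instead.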

preprint · 2020 · arXiv

Generalized Nested Rollout Policy Adaptation

Nested Rollout Policy Adaptation (NRPA) is a Monte Carlo search algorithm for single player games. In this paper we propose to generalize NRPA with a temperature and a bias and to analyze the algorithms theoretically. The generalized algorithm is named GNRPA. Experiments show it improves on NRPA for different application domains: SameGame and the Traveling Salesman Problem with Time Windows.
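One way to picture a policy with a temperature and a bias is a softmax over adjusted move weights. The sketch below is an assumption about the general shape of such a policy; in particular, whether the bias sits inside or outside the temperature division is a guess here, and the function name is hypothetical.

```python
import math


def move_probabilities(weights, biases, temperature=1.0):
    """Softmax over weight / temperature + bias for each candidate move.

    Assumption: the bias is added after dividing by the temperature.
    """
    logits = [w / temperature + b for w, b in zip(weights, biases)]
    top = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - top) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    # With equal weights, the bias alone decides the preference between moves.
    print(move_probabilities([0.0, 0.0], [math.log(3.0), 0.0]))
```

With all biases at zero and a temperature of 1, this reduces to the plain softmax used by standard NRPA playouts.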

preprint · 2020 · arXiv

Mobile Networks for Computer Go

The architecture of the neural networks used in Deep Reinforcement Learning programs such as Alpha Zero or Polygames has been shown to have a great impact on the performance of the resulting playing engines. For example, the use of residual networks gave a 600 Elo increase in the strength of Alpha Go. This paper proposes to evaluate the benefit of Mobile Networks for the game of Go using supervised learning, as well as the use of a policy head and a value head different from the Alpha Zero heads. The accuracy of the policy, the mean squared error of the value, the efficiency of the networks with respect to the number of parameters, and the playing speed and strength of the trained networks are evaluated.
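To put a 600-point rating gain in perspective, the standard logistic Elo model converts a rating gap into an expected score. The helper below is a small illustration of that formula; the function name is hypothetical and the abstract itself does not report win rates.

```python
def elo_expected_score(rating_gap):
    """Expected score of the higher-rated side under the standard Elo model."""
    return 1.0 / (1.0 + 10.0 ** (-rating_gap / 400.0))


if __name__ == "__main__":
    for gap in (0, 200, 600):
        print(gap, round(elo_expected_score(gap), 3))
```

Under this model, a 600-point gap corresponds to an expected score of roughly 0.97 for the stronger engine.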

preprint · 2020 · arXiv

Monte Carlo Game Solver

We present a general algorithm to order moves so as to speed up exact game solvers. It uses online learning of playout policies and Monte Carlo Tree Search. The learned policy and the information in the Monte Carlo tree are used to order moves in game solvers. They greatly improve the solving time for multiple games.

preprint · 2020 · arXiv

Monte Carlo Inverse Folding

The RNA Inverse Folding problem comes from computational biology. The goal is to find a molecule that has a given folding. It is important for scientific fields such as bioengineering, pharmaceutical research, biochemistry, synthetic biology and RNA nanostructures. Nested Monte Carlo Search has given excellent results for this problem. We propose to adapt and evaluate different Monte Carlo Search algorithms for the RNA Inverse Folding problem.

preprint · 2020 · arXiv

Polygames: Improved Zero Learning

Since DeepMind's AlphaZero, Zero learning has quickly become the state-of-the-art method for many board games. It can be improved using a fully convolutional structure (no fully connected layer). Using such an architecture plus global pooling, we can create bots independent of the board size. The training can be made more robust by keeping track of the best checkpoints during the training and by training against them. Using these features, we release Polygames, our framework for Zero learning, with its library of games and its checkpoints. We won against strong humans at the game of Hex in 19x19, which was often said to be intractable for zero learning; and in Havannah. We also won several first places at the TAAI competitions.

preprint · 2016 · arXiv

Learning opening books in partially observable games: using random seeds in Phantom Go

Many artificial intelligences (AIs) are randomized. One can be lucky or unlucky with the random seed; we quantify this effect and show that, perhaps contrary to intuition, it is far from negligible. Then, we apply two different existing algorithms for selecting good seeds and good probability distributions over seeds. This mainly leads to learning an opening book. We apply this to Phantom Go, which, like all phantom games, is hard for opening book learning. We improve the winning rate from 50% to 70% in 5x5 against the same AI, and from approximately 0% to 40% in 5x5, 7x7 and 9x9 against a stronger (learning) opponent.

preprint · 2015 · arXiv

Depth, balancing, and limits of the Elo model

Much work has been devoted to the computational complexity of games. However, such complexity measures are not necessarily relevant for estimating complexity in human terms. Therefore, human-centered measures have been proposed, e.g. the depth. This paper discusses the depth of various games and extends it to a continuous measure. We provide new depth results and present tools (given-first-move, pie rule, size extension) for increasing it. We also use these measures for analyzing games and opening moves in Y, NoGo, Killall Go, and the effect of pie rules.

