Editors
Tristan Cazenave
Universit Paris-Dauphine, Paris, France
Olivier Teytaud
Facebook FAIR, Paris, France
Mark H. M. Winands
Maastricht University, Maastricht, Limburg, The Netherlands
ISSN 1865-0929 e-ISSN 1865-0937
Communications in Computer and Information Science
ISBN 978-3-030-89452-8 e-ISBN 978-3-030-89453-5
https://doi.org/10.1007/978-3-030-89453-5
Springer Nature Switzerland AG 2021
This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
This Springer imprint is published by the registered company Springer Nature Switzerland AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
Preface
These proceedings contain the papers presented at the first Monte Carlo Search Workshop, which was held virtually as an IJCAI workshop on January 7, 2021.
Monte Carlo search is a family of search algorithms that have many applications in different domains. It is the state of the art in many perfect- and imperfect-information games. Other applications include the RNA inverse folding problem, logistics, multiple sequence alignment, general game playing, puzzles, 3D packing with object orientation, cooperative pathfinding, software testing and heuristic model checking.
In recent years, many researchers have explored different variants of Monte Carlo search, their relationship to deep reinforcement learning, and their different applications. The purpose of the workshop was to bring these researchers together to present their research, discuss future research directions, and cross-fertilize the different communities. Submissions were welcome in all fields related to Monte Carlo search, including:
Monte Carlo tree search and upper confidence trees,
Nested Monte Carlo search,
Non-locality in Monte Carlo search,
Combination with zero learning,
Monte Carlo belief-state estimation,
Self-adaptive Monte Carlo search,
Monte Carlo in many player games,
Industrial applications,
Scientific applications,
Applications in games,
Applications in puzzles.
Each paper was sent to three reviewers. If conflicting views on a paper were reported, it was sent to an additional reviewer. Program chairs were not involved in the review assignment and final selection of their own papers as these were handled separately by the other chairs. In the end, 11 contributions were accepted for presentation at the workshop, of which nine made it into these proceedings. Here we provide a brief outline of these nine contributions, in the order in which they appear in the proceedings.
The Search Algorithm for the Game of Bridge by Tristan Cazenave and Vronique Ventos proposes a search algorithm that improves on perfect information Monte Carlo search for some contracts of bridge. It repairs two known defects of perfect information Monte Carlo search, namely strategy fusion and non locality.
Stabilized Nested Rollout Policy Adaptation by Tristan Cazenave, Jean-Baptiste Sevestre, and Mathieu Toulemont presents a straightforward improvement of Nested Rollout Policy Adaptation (NRPA) that gives good results in SameGame, the Traveling Salesman with Time Windows, and Expression Discovery. It stabilizes the learning of the policy by playing multiple playouts before adaptations at level one.
In zoNNscan: A Boundary-Entropy Index for Zone Inspection of Neural Models, Adel Jaouen and Erwan Le Merrer introduce an index that is intended to inform the boundary uncertainty (in terms of the presence of other classes) around one given input data point. It is based on confidence entropy, and is implemented through Monte Carlo sampling in the multidimensional ball surrounding that input.
Reward shaping is often a hidden critical component of reinforcement learning. In Ordinal Monte Carlo Tree Search, Tobias Joppen and Johannes Frnkranz discuss the use of only a ranking information on rewards (as opposed to numerical values). This makes their proposed Monte Carlo tree search (MCTS) variant robust to poorly chosen reward values.
Monte Carlo Game Solver by Tristan Cazenave uses the information on move ordering collected during a Monte Carlo tree search to order moves in Alpha-Beta search. Various small games are solved with this Alpha-Beta variant equipped with Monte Carlo move ordering. Overall, it improves on Alpha-Beta with simple move ordering heuristics.
Generalized Nested Rollout Policy Adaptation by Tristan Cazenave demonstrates mathematically that Generalized Nested Rollout Policy Adaptation with a bias and a temperature is equivalent to Nested Rollout Policy Adaptation with initialization of the weights and a different learning rate. Using a bias is more convenient and more general than initializing weights as demonstrated for SameGame. Good results are also obtained for the Traveling Salesman Problem with Time Windows.
Monte Carlo Inverse Folding by Tristan Cazenave and Thomas Fournier addresses the design of RNA molecules with Monte Carlo search. They test various optimizations of NRPA on the RNA design problem and reach state-of-the-art results on the Eterna100 benchmark using general improvements of Monte Carlo search.