Turn-based multi-action adversarial games are challenging scenarios in which each player's turn consists of a sequence of atomic actions. The order in which an AI agent executes these atomic actions can strongly affect the outcome of the turn. One of the main challenges in game artificial intelligence is designing a heuristic function that helps agents select the optimal turn to play, given a particular state of the game. In this paper, we report results of using the recently developed N-Tuple Bandit Evolutionary Algorithm to tune the heuristic function parameters. For evaluation, we measure how the tuned heuristic function affects the performance of the state-of-the-art evolutionary algorithm Online Evolution Planning. The multi-action adversarial strategy card game Legends of Code and Magic was used as a testbed. Results indicate that the N-Tuple Bandit Evolutionary Algorithm can effectively tune the heuristic function parameters to improve the performance of the agent.
DOI: https://doi.org/10.1007/978-3-030-43722-0_26
Cite this work
@inproceedings{montoliu2020cards,
  author    = {Raul Montoliu and Raluca D. Gaina and Diego Perez-Liebana and Daniel Delgado and Simon M. Lucas},
  title     = {{Efficient Heuristic Policy Optimisation for a Challenging Strategic Card Game}},
  year      = {2020},
  booktitle = {{International Conference on the Applications of Evolutionary Computation (EvoStar)}},
  volume    = {12104},
  pages     = {403--418},
  publisher = {Springer},
  doi       = {10.1007/978-3-030-43722-0_26},
  abstract  = {Turn-based multi-action adversarial games are challenging scenarios in which each player's turn consists of a sequence of atomic actions. The order in which an AI agent executes these atomic actions can strongly affect the outcome of the turn. One of the main challenges in game artificial intelligence is designing a heuristic function that helps agents select the optimal turn to play, given a particular state of the game. In this paper, we report results of using the recently developed N-Tuple Bandit Evolutionary Algorithm to tune the heuristic function parameters. For evaluation, we measure how the tuned heuristic function affects the performance of the state-of-the-art evolutionary algorithm Online Evolution Planning. The multi-action adversarial strategy card game Legends of Code and Magic was used as a testbed. Results indicate that the N-Tuple Bandit Evolutionary Algorithm can effectively tune the heuristic function parameters to improve the performance of the agent.},
}