In recent years, researchers have achieved great success in applying Deep Reinforcement Learning (DRL) algorithms to Real-time Strategy (RTS) games, creating strong autonomous agents that can defeat professional players in StarCraft II. However, existing approaches to tackling full games have high computational costs, usually requiring thousands of GPUs and CPUs for weeks. This paper makes two main contributions to address this issue: 1) we introduce Gym-μRTS (pronounced "gym-micro-RTS") as a fast-to-run RL environment for full-game RTS research, and 2) we present a collection of techniques to scale DRL to play full-game μRTS, along with ablation studies demonstrating their empirical importance. In a single-map setting, our best-trained bot defeats every μRTS bot we tested from past μRTS competitions, resulting in a state-of-the-art DRL agent that takes only about 60 hours of training on a single machine (one GPU, three vCPUs, 16 GB RAM).
URL: https://arxiv.org/abs/2105.13807
GitHub: https://github.com/vwxyzjn/gym-microrts
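As a rough sketch of how the environment is used, the snippet below runs a vectorized Gym-μRTS environment against a built-in scripted opponent with random actions. It is based on the project's README-style API (`MicroRTSGridModeVecEnv`, `microrts_ai`); exact parameter names and signatures may differ across versions, so treat this as an illustrative sketch rather than a definitive reference.

```python
# Illustrative sketch of the Gym-μRTS vectorized API; parameter names
# follow the project's README and may vary between package versions.
import numpy as np
from gym_microrts import microrts_ai
from gym_microrts.envs.vec_env import MicroRTSGridModeVecEnv

envs = MicroRTSGridModeVecEnv(
    num_selfplay_envs=0,          # no self-play pairs in this sketch
    num_bot_envs=1,               # one environment vs. a scripted bot
    max_steps=2000,               # episode length cap
    ai2s=[microrts_ai.coacAI],    # opponent: CoacAI, a past competition bot
    map_paths=["maps/16x16/basesWorkers16x16.xml"],
    reward_weight=np.array([10.0, 1.0, 1.0, 0.2, 1.0, 4.0]),  # shaped-reward weights
)

obs = envs.reset()
for _ in range(100):
    # A trained policy would produce actions here; we sample randomly.
    action = envs.action_space.sample()
    obs, reward, done, info = envs.step(np.array([action]))
envs.close()
```

In a training loop, the random `action` would be replaced by the output of the DRL policy network described in the paper.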
Cite this work
@article{huang2021gym,
  author   = {Huang, Shengyi and Ontanon, Santiago and Bamford, Chris and Grela, Lukasz},
  title    = {{Gym-μRTS: Toward Affordable Full Game Real-time Strategy Games Research with Deep Reinforcement Learning}},
  year     = {2021},
  journal  = {{arXiv:2105.13807}},
  url      = {https://arxiv.org/abs/2105.13807},
  abstract = {In recent years, researchers have achieved great success in applying Deep Reinforcement Learning (DRL) algorithms to Real-time Strategy (RTS) games, creating strong autonomous agents that could defeat professional players in StarCraft~II. However, existing approaches to tackle full games have high computational costs, usually requiring the use of thousands of GPUs and CPUs for weeks. This paper has two main contributions to address this issue: 1) We introduce Gym-μRTS (pronounced 'gym-micro-RTS') as a fast-to-run RL environment for full-game RTS research and 2) we present a collection of techniques to scale DRL to play full-game μRTS as well as ablation studies to demonstrate their empirical importance. Our best-trained bot can defeat every μRTS bot we tested from the past μRTS competitions when working in a single-map setting, resulting in a state-of-the-art DRL agent while only taking about 60 hours of training using a single machine (one GPU, three vCPU, 16GB RAM).},
}