"A Reinforcement Learning Self-Play Approach for Informing Wargaming An" by Kathleen A. MacLean

Date of Award

3-2024

Document Type

Thesis

Degree Name

Master of Science in Operations Research

Department

Department of Operational Sciences

First Advisor

Matthew Robbins, PhD

Abstract

The integration of reinforcement learning (RL) into wargames to learn strategic and operational insights is of interest to the United States Air Force. This thesis explores the application of an RL SARSA(λ) algorithm to the wargame Stratagem MIST. The primary objective is to select air and ground combat policies for the Blue Agent that effectively counter various opponent strategies across different terrains. Testing across these opponents and terrains enables a comprehensive evaluation of the Blue Agent’s adaptability and performance under varying combat conditions. The use of basis functions, linear value function approximations, and specific air and ground strategies simplifies the state and action spaces of the Markov decision process (MDP), enabling computational tractability. A Latin hypercube design is employed to explore hyperparameter configurations, aiming to maximize total rewards in various combat scenarios. Key findings reveal the efficacy of SARSA(λ) in the Stratagem MIST environment, highlighting the promising role of RL algorithms and self-play in wargaming. Limitations due to computational resources point to the need for enhanced capabilities for more extensive simulations.
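
For readers unfamiliar with the method named in the abstract, the following is a minimal, generic sketch of a single SARSA(λ) update with a linear value function approximation and accumulating eligibility traces. It is not the thesis's implementation: the function name `sarsa_lambda_update`, the three-element feature vectors, and the values of `alpha`, `gamma`, and `lam` are all illustrative assumptions, and nothing here models Stratagem MIST itself.

```python
import numpy as np


def sarsa_lambda_update(theta, trace, phi_sa, phi_next, reward, done,
                        alpha=0.1, gamma=0.95, lam=0.8):
    """One SARSA(lambda) step for a linear model Q(s, a) = theta . phi(s, a).

    All names and default hyperparameters are hypothetical, chosen only
    for illustration; they are not taken from the thesis.
    """
    q_sa = theta @ phi_sa
    # A terminal transition has no successor value.
    q_next = 0.0 if done else theta @ phi_next
    td_error = reward + gamma * q_next - q_sa
    # Accumulating trace: decay the old trace, then add current features.
    trace = gamma * lam * trace + phi_sa
    theta = theta + alpha * td_error * trace
    return theta, trace, td_error


# Toy usage with one-hot basis functions (assumed, not from the thesis).
theta = np.zeros(3)
trace = np.zeros(3)
theta, trace, delta1 = sarsa_lambda_update(
    theta, trace,
    phi_sa=np.array([1.0, 0.0, 0.0]),
    phi_next=np.array([0.0, 1.0, 0.0]),
    reward=1.0, done=False)
theta, trace, delta2 = sarsa_lambda_update(
    theta, trace,
    phi_sa=np.array([0.0, 1.0, 0.0]),
    phi_next=np.array([0.0, 0.0, 1.0]),
    reward=2.0, done=True)
```

In this toy run the second reward also adjusts the weight of the first feature, because the eligibility trace still carries credit from the earlier step. In a hyperparameter study such as the one the abstract describes, values like `alpha` and `lam` would be among the quantities varied.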

AFIT Designator

AFIT-ENS-MS-24-M-088

Comments

A 12-month embargo was observed for posting this work on AFIT Scholar.

Distribution Statement A, Approved for Public Release. PA case number on file.
