"A Reinforcement Learning Approach to the 2v2 Beyond Visual Range Air C" by Jacob J. Pike

Author

Jacob J. Pike

Date of Award

3-2024

Document Type

Thesis

Degree Name

Master of Science

Department

Department of Operational Sciences

First Advisor

Matthew Robbins, PhD

Abstract

This research examines a 2v2 air combat maneuvering problem (ACMP) in a Beyond Visual Range (BVR) environment. A discrete-time, infinite-horizon Markov Decision Process (MDP) model represents the BVR-ACMP, seeking to determine high-quality policies for a pair of autonomous aircraft to execute tactical maneuvers and firing decisions. The Advanced Framework for Simulation, Integration, and Modeling (AFSIM) characterizes the complex six-degree-of-freedom (6-DOF) aircraft operations, encompassing kinematics, sensors, and weapons. Given the high dimensionality and continuous nature of the state and decision variables, a deep reinforcement learning (RL) solution approach is adopted wherein the value function is approximated via a Neural Network (NN). The research includes designing neutral starting-state scenarios for training and assessing the impact of adversarial behaviors and missile characteristics on decision policies. A three-stage hyperparameter tuning experiment is conducted to obtain high-quality policies. Several case studies are examined to evaluate the effectiveness of the deep RL approach, demonstrating its feasibility for generating aircraft behavior models for AFSIM-based air combat simulation studies.
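To illustrate the kind of value-function approximation the abstract describes, the sketch below trains a feedforward neural network on temporal-difference targets for a discounted MDP. It is a minimal, hypothetical example only: the state dimension, network architecture, discount factor, and synthetic transition data are assumptions for illustration, not details taken from the thesis, which generates its trajectories from the AFSIM 6-DOF simulation.

```python
# Minimal sketch of NN-based value-function approximation for a discounted MDP.
# All sizes and data here are illustrative assumptions, not thesis parameters.
import torch
import torch.nn as nn

STATE_DIM = 16   # assumed placeholder for the BVR-ACMP state vector size
GAMMA = 0.99     # assumed discount factor for the infinite-horizon MDP

class ValueNet(nn.Module):
    """Feedforward approximation of the state-value function V(s)."""
    def __init__(self, state_dim: int, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, s: torch.Tensor) -> torch.Tensor:
        return self.net(s).squeeze(-1)

value_net = ValueNet(STATE_DIM)
optimizer = torch.optim.Adam(value_net.parameters(), lr=1e-3)

# Synthetic (s, r, s') transitions stand in for simulation rollouts.
states = torch.randn(256, STATE_DIM)
rewards = torch.randn(256)
next_states = torch.randn(256, STATE_DIM)

for _ in range(100):
    with torch.no_grad():
        targets = rewards + GAMMA * value_net(next_states)  # TD(0) targets
    loss = nn.functional.mse_loss(value_net(states), targets)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

In practice, the approximated value function would be paired with a policy-improvement or action-selection step over the maneuver and firing decisions; the loop above shows only the supervised fitting of the value network to bootstrapped targets.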

AFIT Designator

AFIT-ENS-MS-24-M-095

Comments

A 12-month embargo was observed for posting this work on AFIT Scholar.

Distribution Statement A, Approved for Public Release. PA case number on file.
