Date of Award

3-2025

Document Type

Thesis

Degree Name

Master of Science

Department

Department of Operational Sciences

First Advisor

Matthew A. Robbins, PhD

Abstract

Artificial intelligence (AI) is increasingly important in warfighting, and emerging technologies allow AI to control aircraft and weapons systems. This research investigates the application of reinforcement learning (RL), using the Proximal Policy Optimization (PPO) algorithm, to a two-versus-two (2v2) beyond-visual-range (BVR) air combat maneuvering problem (ACMP). Implemented in the Advanced Framework for Simulation, Integration, and Modeling (AFSIM), the methodology frames the engagement as a Markov decision process in which an autonomous RL agent learns continuous control decisions (throttle, pitch, roll, and yaw) under a cooperative communication scheme. A multi-phase curriculum-learning approach supports the progressive acquisition of flight stability, weapon deployment, and air combat tactics. Through hyperparameter tuning and reward shaping, the PPO agent demonstrates an emergent capacity to balance offensive missile usage with evasive maneuvers. The findings highlight the algorithm’s potential to develop intelligent, adaptive behaviors in aerial engagements, offering pathways to improved tactical simulations and future research in reinforcement learning for combat aviation.
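
For readers unfamiliar with PPO, the following is a minimal sketch of the standard clipped surrogate objective that the algorithm maximizes. It is not taken from the thesis and does not reflect its AFSIM implementation; the function name `ppo_clipped_objective`, its argument names, and the clipping value of 0.2 are assumptions chosen for illustration.

```python
import math


def ppo_clipped_objective(new_log_probs, old_log_probs, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate objective over a batch of samples.

    Hypothetical helper for illustration only. Each sample contributes
    min(r * A, clip(r, 1 - eps, 1 + eps) * A), where r is the probability
    ratio between the new and old policies and A is the advantage estimate.
    clip_eps=0.2 is an assumed value, not one reported in the thesis.
    """
    if not (len(new_log_probs) == len(old_log_probs) == len(advantages)):
        raise ValueError("all inputs must have the same length")
    if not advantages:
        raise ValueError("inputs must not be empty")

    total = 0.0
    for lp_new, lp_old, adv in zip(new_log_probs, old_log_probs, advantages):
        # Probability ratio pi_new(a|s) / pi_old(a|s), computed from log-probabilities.
        ratio = math.exp(lp_new - lp_old)
        # Restrict the ratio to [1 - eps, 1 + eps] to limit the size of a policy update.
        clipped_ratio = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Taking the minimum gives a pessimistic bound on the policy improvement.
        total += min(ratio * adv, clipped_ratio * adv)
    return total / len(advantages)
```

In this sketch, a sample whose probability ratio is 2 with an advantage of 1 contributes only 1.2 to the objective, so the agent gains nothing from moving the policy further in a single update than the clip range allows.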

AFIT Designator

AFIT-ENS-MS-25-M-169

Comments

An embargo was observed for this posting.

Distribution A: Approved for public release, Distribution Unlimited. PA case number 88ABW-2025-0320
