Date of Award
Master of Science in Operations Research
Department of Operational Sciences
Matthew J.D. Robbins, PhD.
The United States Air Force (USAF) officer sustainment system involves making accession and promotion decisions for nearly 64 thousand officers annually. We formulate a discrete time stochastic Markov decision process model to examine this military workforce planning problem. The large size of the motivating problem suggests that conventional exact dynamic programming algorithms are inappropriate. As such, we propose two approximate dynamic programming (ADP) algorithms to solve the problem. We employ a least-squares approximate policy iteration (API) algorithm with instrumental variables Bellman error minimization to determine approximate policies. In this API algorithm, we use a modified version of the Bellman equation based on the post-decision state variable. Approximating the value function using a post-decision state variable allows us to find the best policy for a given approximation using a decomposable mixed integer nonlinear programming formulation. We also propose an approximate value iteration algorithm using concave adaptive value estimation (CAVE). The CAVE algorithm identities an improved policy for a test problem based on the current USAF officer sustainment system. The CAVE algorithm obtains a statistically significant 2.8% improvement over the currently employed USAF policy, which serves as the benchmark.
DTIC Accession Number
Hoecherl, Joseph C., "Approximate Dynamic Programming Algorithms for United States Air Force Officer Sustainment" (2015). Theses and Dissertations. 116.