Date of Award

3-26-2015

Document Type

Thesis

Degree Name

Master of Science in Operations Research

Department

Department of Operational Sciences

First Advisor

Matthew J. D. Robbins, PhD

Abstract

The United States Air Force (USAF) officer sustainment system involves making accession and promotion decisions for nearly 64,000 officers annually. We formulate a discrete-time stochastic Markov decision process model to examine this military workforce planning problem. The large size of the motivating problem renders conventional exact dynamic programming algorithms computationally impractical. As such, we propose two approximate dynamic programming (ADP) algorithms to solve the problem. We employ a least-squares approximate policy iteration (API) algorithm with instrumental variables Bellman error minimization to determine approximate policies. In this API algorithm, we use a modified version of the Bellman equation based on the post-decision state variable. Approximating the value function using a post-decision state variable allows us to find the best policy for a given approximation using a decomposable mixed-integer nonlinear programming formulation. We also propose an approximate value iteration algorithm using concave adaptive value estimation (CAVE). The CAVE algorithm identifies an improved policy for a test problem based on the current USAF officer sustainment system, obtaining a statistically significant 2.8% improvement over the currently employed USAF policy, which serves as the benchmark.
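
The thesis's modified Bellman equation is not reproduced on this page. As a rough sketch only, the standard post-decision-state form from the ADP literature reads as follows, with illustrative notation (pre-decision state S_t, decision x_t, post-decision state S_t^x, contribution function C, discount factor gamma) assumed here rather than taken from the thesis:

\[
V^x\left(S_t^x\right) \;=\; \mathbb{E}\!\left[\, \max_{x_{t+1} \in \mathcal{X}(S_{t+1})} \left\{ C\left(S_{t+1}, x_{t+1}\right) + \gamma\, V^x\left(S_{t+1}^x\right) \right\} \,\middle|\, S_t^x \right]
\]

Because the expectation sits outside the maximization, the inner problem is deterministic for each sampled next state, which is what permits the decomposable mathematical programming formulation the abstract mentions.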
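As a minimal illustration of the CAVE idea, and not the thesis's implementation, the Python sketch below maintains a piecewise-linear concave value function approximation over a single scalar resource level (say, officers in one rank and year group). The class name, update rule, and the simplified clipping projection used to restore concavity are all assumptions for illustration.

import numpy as np

class ConcaveValueApproximation:
    # Piecewise-linear concave approximation of a value function in one
    # scalar resource dimension. slopes[r] is the estimated marginal value
    # of the (r+1)-th unit; concavity means slopes are non-increasing in r.

    def __init__(self, max_resource, initial_slope=0.0):
        self.slopes = np.full(max_resource, initial_slope)

    def value(self, r):
        # Value of holding r units: sum of the first r marginal values.
        return float(self.slopes[:r].sum())

    def update(self, r, sampled_slope, stepsize):
        # Smooth a sampled marginal value at level r into the estimate.
        v = (1.0 - stepsize) * self.slopes[r] + stepsize * sampled_slope
        self.slopes[r] = v
        # Restore concavity by clipping: slopes left of r may not fall
        # below v, and slopes right of r may not exceed v. This preserves
        # the non-increasing order whenever it held before the update.
        self.slopes[:r] = np.maximum(self.slopes[:r], v)
        self.slopes[r + 1:] = np.minimum(self.slopes[r + 1:], v)

Within an approximate value iteration loop, sampled_slope would come from a dual value or finite difference of the stage problem at the simulated post-decision state, and stepsize would decline over iterations.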

AFIT Designator

AFIT-ENS-MS-15-M-126

DTIC Accession Number

ADA615246
