Approximate Dynamic Programming for the United States Air Force Officer Manpower Planning Problem

Kimberly S. West


The United States Air Force (USAF) makes officer accession and promotion decisions annually. Optimal manpower planning of the commissioned officer corps is vital to ensuring a well-balanced manpower system. A manpower system that is neither over-manned nor under-manned is desirable because it is the most cost-effective. The Air Force Officer Manpower Planning Problem (AFO-MPP) is introduced, which models officer accessions, promotions, and the uncertainty in retention rates. The objective of the AFO-MPP is to identify the policy for accession and promotion decisions that minimizes the expected total discounted cost of maintaining the required number of officers in the system over an infinite time horizon. The AFO-MPP is formulated as an infinite-horizon Markov decision problem, and a policy is found using approximate dynamic programming (ADP). A least-squares temporal differencing (LSTD) algorithm is employed to determine the best approximate policies. Six computational experiments are conducted with varying retention rates and officer manning starting conditions. The policies determined by the LSTD algorithm are compared to the benchmark policy, which is the policy currently practiced by the USAF. Results indicate that when the manpower system starts with on-target numbers of officers per rank, the ADP policy outperforms the benchmark policy. When the starting state is unbalanced, with more officers in junior ranks, the benchmark policy outperforms the ADP policy. When the starting state is unbalanced, with more officers in senior ranks, there is no statistically significant difference between the ADP and benchmark policies. In this starting state, the ADP policy has smaller variance, indicating that the ADP policy is more dependable than the benchmark policy.
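As a rough illustration of the LSTD idea named in the abstract, the sketch below estimates the discounted cost-to-go of a fixed policy from sampled transitions by solving a small linear system. It is not the model or the code used in this work: the two-state chain, the costs, the discount factor of 0.5, the one-hot features, and the helper names `solve_linear`, `lstd`, and `one_hot` are all hypothetical choices made only to keep the example small.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Build the augmented matrix [A | b] without modifying the inputs.
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    # Back substitution.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def lstd(transitions, features, gamma):
    """Least-squares temporal-difference estimate of value-function weights.

    transitions: list of (state, cost, next_state) sampled under one policy.
    features:    maps a state to a list of basis-function values.
    Accumulates A = sum phi(s) (phi(s) - gamma * phi(s'))^T and
    b = sum phi(s) * cost, then solves A theta = b.
    """
    k = len(features(transitions[0][0]))
    A = [[0.0] * k for _ in range(k)]
    b = [0.0] * k
    for s, cost, s_next in transitions:
        phi, phi_next = features(s), features(s_next)
        for i in range(k):
            b[i] += phi[i] * cost
            for j in range(k):
                A[i][j] += phi[i] * (phi[j] - gamma * phi_next[j])
    return solve_linear(A, b)


def one_hot(state):
    """Hypothetical features: one indicator per state of a 2-state chain."""
    return [1.0 if state == i else 0.0 for i in range(2)]


# Toy data (assumed): state 0 costs 1 and moves to state 1;
# state 1 costs 2 and moves back to state 0.
transitions = [(0, 1.0, 1), (1, 2.0, 0)]
theta = lstd(transitions, one_hot, gamma=0.5)
print(theta)
```

With one-hot features the weights coincide with the state values, so the result can be checked by hand against the Bellman equations V0 = 1 + 0.5 V1 and V1 = 2 + 0.5 V0, which give V0 = 8/3 and V1 = 10/3. In an approximate policy iteration scheme, an evaluation step of this kind would alternate with a policy improvement step; a realistic manpower model would also need a much richer state and basis-function design than this toy example.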