Document Type


Publication Date



A gambler in possession of R chips/coins is allowed N(>R) pulls/trials at a slot machine. Upon pulling the arm, the slot machine realizes a random state i ɛ{1, ..., M} with probability p(i) and the corresponding positive monetary reward g(i) is presented to the gambler. The gambler can accept the reward by inserting a coin in the machine. However, the dilemma facing the gambler is whether to spend the coin or keep it in reserve hoping to pick up a greater reward in the future. We assume that the gambler has full knowledge of the reward distribution function. We are interested in the optimal gambling strategy that results in the maximal cumulative reward. The problem is naturally posed as a Stochastic Dynamic Program whose solution yields the optimal policy and expected cumulative reward. We show that the optimal strategy is a threshold policy, wherein a coin is spent if and only if the number of coins r exceeds a state and stage/trial dependent threshold value. We illustrate the utility of the result on a military operational scenario.


Sourced from the publisher's version at:
Krishnamoorthy, K., Pachter, M., & Casbeer, D. W. (2016). Optimal Policy for Sequential Stochastic Resource Allocation. Procedia Computer Science, 95, 483–488.

Published under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0)

Part of the special issue of Procedia Computer Science: Complex Adaptive Systems Los Angeles, CA November 2-4, 2016



Source Publication

Procedia Computer Science