10.1016/j.procs.2016.09.325">
 

Optimal Policy for Sequential Stochastic Resource Allocation

Document Type

Article

Publication Date

11-2018

Abstract

A gambler in possession of R chips/coins is allowed N(>R) pulls/trials at a slot machine. Upon pulling the arm, the slot machine realizes a random state i ɛ{1, ..., M} with probability p(i) and the corresponding positive monetary reward g(i) is presented to the gambler. The gambler can accept the reward by inserting a coin in the machine. However, the dilemma facing the gambler is whether to spend the coin or keep it in reserve hoping to pick up a greater reward in the future. We assume that the gambler has full knowledge of the reward distribution function. We are interested in the optimal gambling strategy that results in the maximal cumulative reward. The problem is naturally posed as a Stochastic Dynamic Program whose solution yields the optimal policy and expected cumulative reward. We show that the optimal strategy is a threshold policy, wherein a coin is spent if and only if the number of coins r exceeds a state and stage/trial dependent threshold value. We illustrate the utility of the result on a military operational scenario.

Comments

This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial-NoDerivatives License, which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited, and is not altered, transformed, or built upon in any way. CC BY-NC-ND 4.0

Sourced from the published version of record cited below.

Part of the special issue of Procedia Computer Science: Complex Adaptive Systems Los Angeles, CA November 2-4, 2016

Source Publication

Procedia Computer Science

Share

COinS