Document Type

Conference Proceeding

Publication Date



Reinforcement learning is one of the more attractive machine learning technologies because it learns without supervision and continues to learn even as the environment it operates in changes. This ability extends to complex domains through function approximation of the domain’s policy. The function approximation presented here is fuzzy state aggregation. This article combines fuzzy state aggregation with the current policy hill climbing methods Win or Learn Fast (WoLF) and policy-dynamics based WoLF (PD-WoLF); the combination exceeds the learning rate and performance of fuzzy state aggregation combined with Q-learning. Tests in the TileWorld domain demonstrate that policy hill climbing performs better than the existing Q-learning implementations.
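For readers unfamiliar with the policy hill climbing family named in the abstract, the following is a minimal sketch of one tabular WoLF policy hill climbing update as it is commonly described in the literature. It is not the paper's implementation: fuzzy state aggregation and the PD-WoLF variant are omitted, and the function name `wolf_phc_step`, the array layout, and the parameter values (`alpha`, `gamma`, `delta_win`, `delta_lose`) are assumptions chosen for illustration.

```python
import numpy as np

def wolf_phc_step(Q, pi, pi_avg, counts, s, a, r, s_next,
                  alpha=0.1, gamma=0.9, delta_win=0.01, delta_lose=0.04):
    """One tabular WoLF-PHC update for the transition (s, a, r, s_next).

    Q, pi and pi_avg are (n_states, n_actions) arrays; counts is (n_states,).
    All hyperparameter defaults are illustrative, not taken from the paper.
    """
    n_actions = Q.shape[1]
    # Standard Q-learning backup for the visited state-action pair.
    Q[s, a] = (1 - alpha) * Q[s, a] + alpha * (r + gamma * Q[s_next].max())
    # Incrementally track the average policy played in state s.
    counts[s] += 1
    pi_avg[s] += (pi[s] - pi_avg[s]) / counts[s]
    # "Winning" means the current policy has a higher expected value
    # than the average policy; learn cautiously when winning, fast otherwise.
    winning = pi[s] @ Q[s] > pi_avg[s] @ Q[s]
    delta = delta_win if winning else delta_lose
    # Shift probability mass toward the greedy action, never taking more
    # from an action than it currently holds, so pi[s] stays a distribution.
    best = int(np.argmax(Q[s]))
    for b in range(n_actions):
        if b != best:
            step = min(pi[s, b], delta / (n_actions - 1))
            pi[s, b] -= step
            pi[s, best] += step
    return Q, pi, pi_avg, counts
```

In this sketch the only difference from plain policy hill climbing is the choice between two step sizes; a fuzzy state aggregation variant would presumably replace the table lookups with membership-weighted sums, which is not shown here.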


AFIT Scholar furnishes the draft of this conference paper. The published version, as it appears in the proceedings cited below, is available to purchase from ACTA Press as part of the IASTED CA 2006 proceedings, or as an individual paper (paper 529-049).

Source Publication

Eighth IASTED International Conference on Control and Applications (CA 2006)