Document Type
Conference Proceeding
Publication Date
5-2006
Abstract
Reinforcement learning is one of the more attractive machine learning techniques because of its unsupervised learning structure and its ability to continue learning even as the environment in which it operates changes. This ability to learn without supervision in a changing environment can be applied to complex domains through function approximation of the domain's policy. The function approximation presented here is fuzzy state aggregation. This paper presents the use of fuzzy state aggregation with the policy hill-climbing methods Win or Lose Fast (WoLF) and policy-dynamics-based WoLF (PD-WoLF), which exceed the learning rate and performance of fuzzy state aggregation combined with Q-learning. Test results in the TileWorld domain demonstrate that the policy hill-climbing methods perform better than the existing Q-learning implementation.
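To make the approach concrete, the following is a minimal sketch of WoLF policy hill climbing layered on a fuzzy state aggregation, written in Python. It is an illustration under stated assumptions, not the authors' implementation: the class name, the membership vector `mu` over a fixed number of fuzzy aggregate states, and all parameter values (alpha, gamma, delta_win, delta_lose) are hypothetical choices made here for clarity.

```python
import numpy as np

class WoLFPHCAgent:
    """Hedged sketch: WoLF policy hill climbing over fuzzy aggregate states.

    The environment state is assumed to be summarized by a membership vector
    mu (one weight per fuzzy aggregate).  Q-values, the policy, and the
    average policy are stored per aggregate and blended by membership.
    """

    def __init__(self, n_aggregates, n_actions,
                 alpha=0.1, gamma=0.9, delta_win=0.01, delta_lose=0.04):
        self.nA = n_actions
        self.alpha, self.gamma = alpha, gamma
        self.delta_win, self.delta_lose = delta_win, delta_lose
        self.Q = np.zeros((n_aggregates, n_actions))
        self.pi = np.full((n_aggregates, n_actions), 1.0 / n_actions)
        self.avg_pi = np.full((n_aggregates, n_actions), 1.0 / n_actions)
        self.counts = np.zeros(n_aggregates)

    def act(self, mu, rng):
        # Blend the per-aggregate policies by fuzzy membership, then sample.
        probs = mu @ self.pi
        probs = probs / probs.sum()
        return rng.choice(self.nA, p=probs)

    def update(self, mu, a, reward, mu_next):
        # Membership-weighted off-policy (Q-learning style) TD update.
        q_s = mu @ self.Q
        q_next = mu_next @ self.Q
        td_error = reward + self.gamma * q_next.max() - q_s[a]
        self.Q[:, a] += self.alpha * mu * td_error

        # Track the average policy, weighting each aggregate by membership.
        self.counts += mu
        weights = mu / np.maximum(self.counts, 1e-8)
        self.avg_pi += weights[:, None] * (self.pi - self.avg_pi)

        # WoLF step size: small when winning, large when losing.
        winning = (self.pi * self.Q).sum(axis=1) >= (self.avg_pi * self.Q).sum(axis=1)
        delta = np.where(winning, self.delta_win, self.delta_lose) * mu

        # Hill-climb each visited aggregate's policy toward its greedy action.
        greedy = self.Q.argmax(axis=1)
        for i in np.nonzero(mu > 0)[0]:
            move = np.minimum(self.pi[i], delta[i] / (self.nA - 1))
            move[greedy[i]] = 0.0
            self.pi[i] -= move
            self.pi[i, greedy[i]] += move.sum()

# Illustrative usage with random membership vectors (not a TileWorld model):
rng = np.random.default_rng(0)
agent = WoLFPHCAgent(n_aggregates=5, n_actions=4)
mu = rng.dirichlet(np.ones(5))
mu_next = rng.dirichlet(np.ones(5))
a = agent.act(mu, rng)
agent.update(mu, a, reward=1.0, mu_next=mu_next)
```

The key design point reflected here is the WoLF heuristic: the policy learning rate switches between a cautious step when the current policy outperforms its historical average and an aggressive step when it does not, which is what distinguishes WoLF-style hill climbing from plain Q-learning over the same fuzzy aggregates.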
Source Publication
Eighth IASTED International Conference on Control and Applications (CA 2006)
Recommended Citation
Wardell, D., & Peterson, G. L. (2006). Fuzzy State Aggregation and Off-Policy Reinforcement Learning for Stochastic Environments. Eighth IASTED International Conference on Control and Applications (CA 2006), 145–152.
Comments
AFIT Scholar furnishes the draft version of this conference paper. The published version, as it appears in the proceedings cited above, is available for purchase from ACTA Press as part of the IASTED CA 2006 proceedings or as an individual paper (paper 529-049).