10.1145/3487920">
 

Developing Action Policies with Q-Learning and Shallow Neural Networks on Reconfigurable Embedded Devices

Document Type

Article

Publication Date

12-2020

Abstract

The size of sensor networks supporting smart cities is ever increasing. Sensor network resiliency becomes vital for critical networks such as emergency response and waste water treatment. One approach is to engineer “self-aware” sensors that can proactively change their component composition in response to changes in work load when critical devices fail. By extension, these devices could anticipate their own termination, such as battery depletion, and offload current tasks onto connected devices. These neighboring devices can then reconfigure themselves to process these tasks, thus avoiding catastrophic network failure. In this article, we compare and contrast two types of self-aware sensors. One set uses Q-learning to develop a policy that guides device reaction to various environmental stimuli, whereas the others use a set of shallow neural networks to select an appropriate reaction. The novelty lies in the use of field programmable gate arrays embedded on the sensors that take into account internal system state, configuration, and learned state-action pairs, which guide device decisions to meet system demands. Experiments show that even relatively simple reward functions develop both Q-learning policies and shallow neural networks that yield positive device behaviors in dynamic environments.

Comments

The "Link to Full Text" on this page opens the full article hosted at ACM.

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the owner/authors.

Source Publication

ACM Transactions on Autonomous and Adaptive Systems (ISSN 1556-4665 | eISSN 1556-4703)

Share

COinS