TY - JOUR
T1 - Asynchronous action-reward learning for nonstationary serial supply chain inventory control
AU - Kim, Chang Ouk
AU - Kwon, Ick Hyun
AU - Baek, Jun Geol
PY - 2008/2
Y1 - 2008/2
N2 - Action-reward learning is a reinforcement learning method. In this machine learning approach, an agent interacts with a non-deterministic control domain. The agent selects actions at decision epochs, and the control domain gives rise to rewards with which the performance measures of the actions are updated. The objective of the agent is to select the best future actions based on the updated performance measures. In this paper, we develop an asynchronous action-reward learning model which updates the performance measures of actions faster than conventional action-reward learning. This learning model is suitable for nonstationary control domains, where the rewards for actions vary over time. Based on asynchronous action-reward learning, two situation-reactive inventory control models (a centralized and a decentralized model) are proposed for a two-stage serial supply chain with nonstationary customer demand. A simulation-based experiment was performed to evaluate the performance of the two proposed models.
AB - Action-reward learning is a reinforcement learning method. In this machine learning approach, an agent interacts with a non-deterministic control domain. The agent selects actions at decision epochs, and the control domain gives rise to rewards with which the performance measures of the actions are updated. The objective of the agent is to select the best future actions based on the updated performance measures. In this paper, we develop an asynchronous action-reward learning model which updates the performance measures of actions faster than conventional action-reward learning. This learning model is suitable for nonstationary control domains, where the rewards for actions vary over time. Based on asynchronous action-reward learning, two situation-reactive inventory control models (a centralized and a decentralized model) are proposed for a two-stage serial supply chain with nonstationary customer demand. A simulation-based experiment was performed to evaluate the performance of the two proposed models.
UR - http://www.scopus.com/inward/record.url?scp=37649013265&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=37649013265&partnerID=8YFLogxK
U2 - 10.1007/s10489-007-0038-2
DO - 10.1007/s10489-007-0038-2
M3 - Article
AN - SCOPUS:37649013265
SN - 0924-669X
VL - 28
SP - 1
EP - 16
JO - Applied Intelligence
JF - Applied Intelligence
IS - 1
ER -