Implementing Action Mask in Proximal Policy Optimization (PPO) Algorithm
Cheng-Yen Tang, Chien-Hung Liu, Woei-Kae Chen, Shingchern D. You. The Korean Institute of Communications and Information Sciences, 2020, ICT Express, Vol. 6, No. 3
The proximal policy optimization (PPO) algorithm is a promising algorithm in reinforcement learning. In this paper, we propose to add an action mask in the PPO algorithm. The mask indicates whether an action is valid or invalid for each state. Simulation results show that, when compared with the original version, the proposed algorithm yields much higher return with a moderate number of training steps. Therefore, it is useful and valuable to incorporate such a mask if applicable.
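The abstract does not say how the mask enters the policy, so the following is only a minimal sketch of one common approach, not necessarily the authors' method: the logits of invalid actions are set to negative infinity before the softmax, so those actions get probability zero and are never sampled. The function names `masked_action_probs` and `sample_action`, the example logits, and the example mask are all hypothetical and chosen for illustration.

```python
import numpy as np


def masked_action_probs(logits, mask):
    """Softmax over logits, with invalid actions (mask == False) given probability 0.

    Assumption: the mask is applied to the policy logits. The abstract only
    states that the mask marks each action as valid or invalid per state.
    """
    logits = np.asarray(logits, dtype=np.float64)
    mask = np.asarray(mask, dtype=bool)
    if not mask.any():
        raise ValueError("at least one action must be valid")
    # Invalid logits become -inf, so exp() maps them to exactly 0.
    masked = np.where(mask, logits, -np.inf)
    # Subtract the largest valid logit for numerical stability.
    exp = np.exp(masked - masked.max())
    return exp / exp.sum()


def sample_action(logits, mask, rng):
    """Sample a valid action and return it with its log-probability."""
    probs = masked_action_probs(logits, mask)
    action = int(rng.choice(len(probs), p=probs))
    return action, float(np.log(probs[action]))


rng = np.random.default_rng(0)
logits = [2.0, 0.5, -1.0, 1.0]       # hypothetical policy outputs for one state
mask = [True, False, True, False]    # hypothetical: only actions 0 and 2 are valid
probs = masked_action_probs(logits, mask)
action, log_prob = sample_action(logits, mask, rng)
```

In a PPO update under this assumption, the same mask would be applied when recomputing the log-probability of the stored action, so that the old and new policies in the probability ratio are defined over the same set of valid actions.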