1 "https://github.com/hill-a/stable-baselines/pull/453"
2 "https://github.com/hill-a/stable-baselines"
3 "https://en.wikipedia.org/wiki/Snake_(video_game_genre)"
4 Y.-C. Wu, "TAM: Using trainable-action-mask to improve sample-efficiency in reinforcement learning for dialogue systems" 1-8, 2019
5 "Source code"
6 R. Sutton, "Reinforcement Learning: An Introduction" MIT Press 2018
7 J. Schulman, "Proximal policy optimization algorithms"
8 "Modified code"
9 T. Zahavy, "Learn what not to learn: Action elimination with deep reinforcement learning" 3562-3573, 2018
10 V. Mnih, "Human-level control through deep reinforcement learning" 518:529-533, 2015
11 OpenAI Ltd., "Gym toolkit software"
12 X. Gao, "Deep reinforcement learning for time series: playing idealized trading games"