RISS Search - Domestic Academic Journal Articles

1
Development of License Plate Recognition on Complex Scene with Plate-Style Classification and Confidence Scoring Based on KNN

MONTERO, Vince Jebryl; JEONG, Yong-Jin Institute of Electronics, Information and Communic 2018 IEICE Transactions on Information and Systems Vol.101e.d No.12
2
Solving Survival Gridworld Problem Using Hybrid Policy Modified Q-Based Reinforcement

Vince Jebryl Montero, Woo-Young Jung, Yong-Jin Jeong 한국전기전자학회 (Institute of Korean Electrical and Electronics Engineers) 2019 전기전자학회논문지 (Journal of IKEEE) Vol.23 No.4
KCI
This paper explores a model-free, value-based approach to solving the survival gridworld problem. The survival gridworld problem poses a challenge in which an agent must take risks to gain better rewards. The classic value-based approach in model-free reinforcement learning assumes minimal-risk decisions. The proposed method applies hybrid on-policy and off-policy updates to experience roll-outs, using a modified Q-based update equation that introduces a parametric linear rectifier and a motivational discount. The significance of this approach is that it allows model-free training of agents that take risk factors and motivated exploration into account to make better path decisions. Experiments suggest that the proposed method achieves better exploration and path selection, resulting in higher episode scores than classic off-policy and on-policy Q-based updates.
3
Solving Survival Gridworld Problem Using Hybrid Policy Modified Q-Based Reinforcement

Montero, Vince Jebryl; Jung, Woo-Young; Jeong, Yong-Jin Institute of Korean Electrical and Electronics Eng 2019 전기전자학회논문지 (Journal of IKEEE) Vol.23 No.4
