RISS (Academic Research Information Service)

      Environment-Agnostic Architecture for Heterogeneous Multi-Environment Reinforcement Learning = 이종 다중환경 강화학습을 위한 환경-범용적 아키텍처

      https://www.riss.kr/link?id=T16955296

      Multilingual Abstract

      Training a Reinforcement Learning (RL) agent from scratch in each new environment can be inefficient. The computational and time costs can be significantly reduced if the agent can learn across diverse environments and transfer what it has learned. However, learning across multiple environments is challenging because state and action spaces vary between RL problems. Naive parameter sharing, with environment-specific layers for the different state-action spaces, does not effectively facilitate transfer learning. In this work, we present a flexible, environment-agnostic architecture designed to learn across multiple environments simultaneously while enabling efficient transfer learning to new environments. We also develop training algorithms within the proposed architecture for both online and offline RL. Our experiments demonstrate that a single agent can be trained across heterogeneous environments and that naive parameter sharing is not effective for transfer learning.
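
To make the naive baseline mentioned in the abstract concrete, the following is a minimal sketch of parameter sharing with environment-specific layers: one shared trunk, plus a separate input and output layer per environment to absorb the differing state and action sizes. It is not the thesis's proposed architecture. The class name `NaiveSharedAgent`, the hidden size, and the environment names and dimensions are all assumptions chosen for illustration.

```python
import math
import random


def make_matrix(rows, cols, rng):
    """Return a rows x cols matrix of small random weights."""
    scale = 1.0 / math.sqrt(cols)
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class NaiveSharedAgent:
    """Hypothetical baseline: shared trunk, per-environment input/output layers."""

    def __init__(self, env_dims, hidden=8, seed=0):
        rng = random.Random(seed)
        # One trunk whose weights are shared by every environment.
        self.trunk = make_matrix(hidden, hidden, rng)
        # Environment-specific layers map each state size into the trunk
        # and map the trunk output onto each action size.
        self.encoders = {
            name: make_matrix(hidden, obs_dim, rng)
            for name, (obs_dim, _) in env_dims.items()
        }
        self.decoders = {
            name: make_matrix(act_dim, hidden, rng)
            for name, (_, act_dim) in env_dims.items()
        }

    def forward(self, env_name, observation):
        """Run one observation through encoder -> shared trunk -> decoder."""
        h = [math.tanh(v) for v in matvec(self.encoders[env_name], observation)]
        h = [math.tanh(v) for v in matvec(self.trunk, h)]
        return matvec(self.decoders[env_name], h)


# Illustrative (observation_dim, action_dim) pairs; not taken from the thesis.
ENV_DIMS = {"cartpole": (4, 2), "hopper": (11, 3)}

agent = NaiveSharedAgent(ENV_DIMS)
cartpole_out = agent.forward("cartpole", [0.0] * 4)
hopper_out = agent.forward("hopper", [0.1] * 11)
```

In this layout, adding a new environment means creating a fresh encoder and decoder from scratch; only the trunk carries over, which is one plausible reading of why the abstract reports that this kind of sharing transfers poorly.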

      Table of Contents

      • Abstract i
      • Korean Abstract ii
      • Preface iii
      • Table of Contents iii
      • List of Tables vi
      • List of Figures vii
      • 1 Introduction 1
      • 1.1 Motivation 1
      • 1.2 Contributions 2
      • 2 Background 4
      • 2.1 Multi-Environment Reinforcement Learning 4
      • 2.2 Proximal Policy Optimization 5
      • 2.3 Implicit Q Learning 5
      • 2.4 Structured State Space Sequence Model 5
      • 3 Methods 7
      • 3.1 Environment-Agnostic Architecture for Heterogeneous Multi-Environment RL 7
      • 3.1.1 Arbitrary 1D Input-Output Agent 7
      • 3.1.2 Decentralized Distributed Algorithm for Heterogeneous Environments 10
      • 3.1.3 DDPPO for Heterogeneous Environments 11
      • 3.1.4 DDIQL for Heterogeneous Environments 12
      • 3.1.5 Stabilizing Multi-Objective Optimization 12
      • 4 Experiments 13
      • 4.1 Experimental Setup 13
      • 4.2 Online and Offline Multi-Environment Training 13
      • 4.2.1 Online Multi-Environment Training 13
      • 4.2.2 Offline Multi-Environment Training 16
      • 4.3 Online Multi-Environment Pretraining and Transfer Learning 17
      • 4.4 Ablation Study 19
      • 5 Conclusion 21
      • Reference 22
      • Appendix 28
      • A Additional Experiment Results 28
      • A.1 Classic-to-Mujoco 28
      • A.2 Mujoco-to-Classic 29
      • B Ablation Study 31
      • B.1 Transfer learning 31
      • B.2 Scratch learning 32