RISS 학술연구정보서비스

검색
다국어 입력

http://chineseinput.net/에서 pinyin(병음)방식으로 중국어를 변환할 수 있습니다.

변환된 중국어를 복사하여 사용하시면 됩니다.

예시)
  • 中文 을 입력하시려면 zhongwen을 입력하시고 space를누르시면됩니다.
  • 北京 을 입력하시려면 beijing을 입력하시고 space를 누르시면 됩니다.
닫기
    인기검색어 순위 펼치기

    RISS 인기검색어

      검색결과 좁혀 보기

      선택해제

      오늘 본 자료

      • 오늘 본 자료가 없습니다.
      더보기
      • 무료
      • 기관 내 무료
      • 유료
      • KCI등재

        A tutorial on the art of dynamic programming for some issues concerning Bellman’s principle of optimality

        Mizutani Eiji,Dreyfus Stuart 한국통신학회 2023 ICT Express Vol.9 No.6

        Reinforcement learning (RL) is fundamental to current artificial intelligence (AI). Since dynamic programming, which is based on Richard Bellman’s principle of optimality, is the basis of RL (and other AI disciplines such as A* search), it is important to apply that principle correctly and artfully. This tutorial uses examples, many from published articles, to examine both the artful and occasionally erroneous application of Bellman’s principle. Our contribution is twofold. First, we emphasize the necessity of a general thought experiment by asking what we call “consultant questions” to determine the proper state space for DP formulation to be established by Bellman’s optimality principle. Second, we convey the role of art based on common sense and effortful cleverness in designing efficient DP. Since Bellman’s principle is intuitive based on common sense, it is sometimes misunderstood. For illustrative purposes, given a directed acyclic graph (DAG), we consider four deterministic minimum-cost path network problems (found in the literature), each bringing up some validity issue concerning Bellman’s principle on a different criterion. We approach them consistently by the conventional wisdom of state augmentation (on top of the physical system state space). As a result, we show that our artful choice of state description not only renders the principle valid in each problem, but also makes each DP as efficient as the standard DP that solves a shortest-path problem in the same DAG, circumventing successfully the so-called curse of dimensionality, a price to be paid frequently by state enlargement. After that, we discuss the potential importance of the treated type of path network problems for practical applications (e.g., communications network).

      연관 검색어 추천

      이 검색어로 많이 본 자료

      활용도 높은 자료

      해외이동버튼