시공간 강화 학습 기반 HVAC 시뮬레이션 모델 보정 계수 최적화 = Optimizing Calibration Coefficient in HVAC Simulation Models Using Spatio-Temporal Reinforcement Learning

Multilingual Abstract

Heating, Ventilation, and Air Conditioning (HVAC) simulation models help refine the performance of real-world HVAC systems by emulating their thermodynamic behavior. However, purely physics-based simulation models are limited in how accurately they capture the complexity of thermodynamic interactions and dynamic environmental variations. To address this, simulation models introduce calibration coefficients that are adjusted through numerical optimization within the thermodynamic process. While this approach mitigates some discrepancies, it does not fully bridge the gap between simulation and reality. Robust decision-making models capable of handling uncertainty and dynamic environmental conditions are therefore essential.
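To make the idea of a numerically optimized calibration coefficient concrete, here is a minimal Python sketch. It is not the method of this study: the toy model (outlet temperature = inlet temperature + coefficient × physics-predicted change), the function names, the search bounds, and the sample values are all assumptions chosen for illustration. A single scalar coefficient is fitted by golden-section search so that the simulated temperatures match the measured ones as closely as possible.

```python
# Hypothetical toy example: every name and number here is an illustrative assumption.

def simulate_outlet_temp(inlet_temp, physics_delta, coeff):
    """Toy model: outlet temperature = inlet + coeff * physics-predicted change."""
    return inlet_temp + coeff * physics_delta


def squared_error(coeff, samples):
    """Sum of squared differences between simulated and measured outlet temperatures."""
    return sum(
        (simulate_outlet_temp(t_in, delta, coeff) - t_meas) ** 2
        for t_in, delta, t_meas in samples
    )


def calibrate(samples, lo=0.0, hi=2.0, tol=1e-6):
    """Golden-section search for the coefficient in [lo, hi] that minimizes the error."""
    inv_phi = (5 ** 0.5 - 1) / 2
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)  # left probe point
        d = a + inv_phi * (b - a)  # right probe point
        if squared_error(c, samples) < squared_error(d, samples):
            b = d  # minimum lies in [a, d]
        else:
            a = c  # minimum lies in [c, b]
    return (a + b) / 2


# (inlet temperature, physics-predicted change, measured outlet temperature)
samples = [(20.0, -5.0, 16.0), (22.0, -6.0, 17.2), (25.0, -8.0, 18.6)]
best = calibrate(samples)
print(round(best, 3))
```

The sample measurements were generated with a coefficient of 0.8, so the search recovers a value close to 0.8. A static fit of this kind is the baseline that the abstract describes as insufficient when conditions change over time.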
Reinforcement Learning (RL), an adaptive decision-making algorithm for dynamic environments, has been widely explored within the Partially Observable Markov Decision Process (POMDP) framework using deep neural networks. POMDP models system states as only partially observable, allowing optimal decision-making under uncertainty. This property makes it particularly suitable for HVAC systems, which experience dynamic variations and incomplete observations.
This study proposes a calibration coefficient optimization method that integrates spatio-temporal representation into a POMDP-based RL framework. The proposed framework effectively models hidden variables and dynamic environmental variations, addressing uncertainties inherent in HVAC systems. By leveraging spatio-temporal representation, it incorporates measured data to compensate for missing or uncertain information, ensuring seamless integration of multiple HVAC unit datasets while preserving temporal dependencies. Notably, the proposed method achieves stable optimization, even under rapid temperature fluctuations and varying load conditions, significantly reducing discrepancies between simulations and actual measurements. These findings confirm that the POMDP-based RL calibration framework, enhanced with spatio-temporal representation, effectively bridges the gap between simulation models and real-world HVAC systems.
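One simple way to picture combining measurements from several HVAC units while keeping their time order is a sliding window over gap-filled series. The Python sketch below is an assumption made for illustration, not the representation used in this study: the forward-fill rule for missing readings, the function names, and the sample values are all hypothetical.

```python
# Hypothetical sketch: forward-fill gaps, then stack units into time-ordered windows.

def forward_fill(series):
    """Replace None with the most recent observed value (leading gaps stay None)."""
    filled, last = [], None
    for value in series:
        if value is not None:
            last = value
        filled.append(last)
    return filled


def build_windows(unit_series, window):
    """Return sliding windows indexed as [time step][unit], oldest step first."""
    filled = [forward_fill(series) for series in unit_series]
    steps = min(len(series) for series in filled)
    return [
        [[series[t] for series in filled] for t in range(start, start + window)]
        for start in range(steps - window + 1)
    ]


# Temperature readings from two units; None marks a missing measurement.
unit_a = [21.0, None, 21.4, 21.9]
unit_b = [19.5, 19.7, None, 20.1]
windows = build_windows([unit_a, unit_b], window=3)
print(windows[0])
```

Each window could serve as one observation for a learning agent, so that recent history stands in for state that is not directly measured. Forward filling is only the simplest stand-in for the compensation of missing data described above.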


Multilingual Abstract

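HVAC (Heating, Ventilation, and Air Conditioning) simulation models help refine the performance of real HVAC systems by emulating their thermodynamic behavior. However, purely physics-based simulation models alone are limited in how fully they can capture the complexity of thermodynamic interactions and dynamic environmental changes. To compensate, simulation models introduce calibration coefficients, tuned through numerical optimization, into the thermodynamic process. This approach reduces the discrepancy between simulation and the real system, but it is still not enough to close the gap completely. Robust decision-making models that can handle uncertainty and dynamic environmental conditions are therefore essential.
Reinforcement learning, an adaptive decision-making algorithm for dynamic environments, has been actively studied, chiefly within POMDP (Partially Observable Markov Decision Process) frameworks built on deep neural networks. Because a POMDP treats the system state as only partially observed, it enables optimal decision-making under uncertainty, a property well suited to HVAC systems, which fluctuate frequently and are incompletely observed.
This study proposes a calibration coefficient optimization technique that integrates spatio-temporal representation into a POMDP-based reinforcement learning framework. The proposed framework effectively models the uncertainty and dynamic environmental changes of HVAC systems, uses measured data to compensate for missing or uncertain information, and can integrate datasets from multiple HVAC units while preserving temporal dependencies. In particular, the proposed technique achieves stable optimization even under rapid temperature fluctuations and changing load conditions, and it greatly reduces the difference between simulated and measured values. These results demonstrate that a POMDP-based reinforcement learning calibration framework using spatio-temporal representation effectively narrows the gap between simulation models and real HVAC systems.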



Table of Contents