Heating, Ventilation, and Air Conditioning (HVAC) simulation models help refine the performance of real-world HVAC systems by emulating their thermodynamic behavior. However, purely physics-based simulation models are limited in how accurately they capture complex thermodynamic interactions and dynamic environmental variations. To address this, simulation models introduce calibration coefficients that are adjusted through numerical optimization within the thermodynamic process. While this approach mitigates some discrepancies, it does not fully bridge the gap between simulation and reality. Thus, robust decision-making models capable of handling uncertainties and dynamic environmental conditions are essential.
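To make the idea of a calibration coefficient concrete, the following is a minimal sketch rather than the model used in this study. It assumes a hypothetical one-equation heating-coil model in which a single coefficient scales the delivered heat, and fits that coefficient to synthetic "measurements" with a golden-section search. The function names, the value 0.85, and the sample data are all illustrative assumptions.

```python
import math

CP_AIR = 1.006  # kJ/(kg*K), approximate specific heat of air


def simulate_outlet_temp(t_in, q_kw, m_dot, coeff):
    """Toy physics model (assumed): outlet air temperature in deg C.

    coeff is the calibration coefficient scaling the nominal heat input.
    """
    return t_in + coeff * q_kw / (m_dot * CP_AIR)


def squared_error(coeff, samples):
    """Sum of squared differences between simulated and measured outlet temps."""
    return sum(
        (simulate_outlet_temp(t_in, q_kw, m_dot, coeff) - t_meas) ** 2
        for t_in, q_kw, m_dot, t_meas in samples
    )


def golden_section_minimize(f, lo, hi, tol=1e-6):
    """Minimize a unimodal function f on the interval [lo, hi]."""
    inv_phi = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    while abs(b - a) > tol:
        if f(c) < f(d):
            b = d
        else:
            a = c
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
    return (a + b) / 2


# Synthetic data: generated from the toy model itself with a "true" coefficient.
TRUE_COEFF = 0.85
operating_points = [(18.0, 5.0, 0.5), (20.0, 8.0, 0.8), (16.5, 3.0, 0.4)]
samples = [
    (t_in, q_kw, m_dot, simulate_outlet_temp(t_in, q_kw, m_dot, TRUE_COEFF))
    for t_in, q_kw, m_dot in operating_points
]

fitted_coeff = golden_section_minimize(
    lambda c: squared_error(c, samples), 0.0, 2.0
)
print(f"fitted calibration coefficient: {fitted_coeff:.4f}")
```

Because the synthetic data come from the same model, the search recovers the generating coefficient almost exactly. With real measurements, a residual error remains, which is the discrepancy this abstract refers to.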
Reinforcement Learning (RL), an adaptive decision-making approach for dynamic environments, has been widely explored within the Partially Observable Markov Decision Process (POMDP) framework using deep neural networks. A POMDP treats the system state as only partially observable, which allows decisions to be optimized under uncertainty. This property makes it particularly suitable for HVAC systems, which experience dynamic variations and incomplete observations.
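Partial observability can be illustrated with the standard discrete Bayes belief update, b'(s') ∝ O(o | s') Σ_s T(s' | s) b(s). The sketch below is a textbook tabular version, not the deep-network formulation referred to above. The two hidden states, the observation labels, and all probabilities are hypothetical, and a single fixed action is assumed so that the transition table does not depend on the action.

```python
# Hidden states and observations are illustrative assumptions.
STATES = ["normal", "fouled"]

# TRANSITION[s][s_next] = P(s_next | s) under one fixed action (assumed).
TRANSITION = {
    "normal": {"normal": 0.9, "fouled": 0.1},
    "fouled": {"normal": 0.0, "fouled": 1.0},
}

# OBSERVATION[s][o] = P(o | s), where o is a coarse simulation-error reading.
OBSERVATION = {
    "normal": {"low_error": 0.8, "high_error": 0.2},
    "fouled": {"low_error": 0.3, "high_error": 0.7},
}


def update_belief(belief, observation):
    """Return the posterior belief over hidden states after one observation."""
    # Prediction step: push the current belief through the transition model.
    predicted = {
        s_next: sum(belief[s] * TRANSITION[s][s_next] for s in STATES)
        for s_next in STATES
    }
    # Correction step: weight each state by the observation likelihood.
    unnormalized = {
        s: OBSERVATION[s][observation] * predicted[s] for s in STATES
    }
    total = sum(unnormalized.values())
    if total == 0.0:
        raise ValueError("observation has zero probability under the model")
    return {s: p / total for s, p in unnormalized.items()}


prior = {"normal": 0.5, "fouled": 0.5}
posterior = update_belief(prior, "high_error")
print(posterior)
```

The agent never sees the hidden state directly. It acts on the belief, which shifts toward "fouled" after a high-error reading.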
This study proposes a calibration coefficient optimization method that integrates spatio-temporal representation into a POMDP-based RL framework. The proposed framework effectively models hidden variables and dynamic environmental variations, addressing uncertainties inherent in HVAC systems. By leveraging spatio-temporal representation, it incorporates measured data to compensate for missing or uncertain information, ensuring seamless integration of multiple HVAC unit datasets while preserving temporal dependencies. Notably, the proposed method achieves stable optimization, even under rapid temperature fluctuations and varying load conditions, significantly reducing discrepancies between simulations and actual measurements. These findings confirm that the POMDP-based RL calibration framework, enhanced with spatio-temporal representation, effectively bridges the gap between simulation models and real-world HVAC systems.
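One simple way to combine data from several HVAC units while keeping time order is to stack time-aligned sliding windows, with one axis for time and one for units. The sketch below shows only that generic idea. The abstract does not specify the representation actually used, and the unit names, readings, and the helper `build_windows` are hypothetical.

```python
def build_windows(series_by_unit, window):
    """Stack time-aligned sliding windows across units.

    Returns (units, windows), where windows[i][t][u] is the reading of unit
    units[u] at time step i + t. Series are truncated to the shortest length.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    units = sorted(series_by_unit)
    length = min(len(series_by_unit[u]) for u in units)
    windows = []
    for start in range(length - window + 1):
        windows.append(
            [
                [series_by_unit[u][t] for u in units]  # spatial axis: units
                for t in range(start, start + window)  # temporal axis
            ]
        )
    return units, windows


# Hypothetical supply-air temperatures (deg C) from two units.
readings = {
    "ahu_1": [20.0, 20.5, 21.0, 21.4],
    "ahu_2": [19.0, 19.2, 19.9, 20.3],
}
units, windows = build_windows(readings, window=3)
print(units, len(windows))
```

Each window keeps consecutive time steps adjacent, so a downstream model sees the units side by side without losing their temporal ordering.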