Quantitative estimates of the CO2 net atmospheric flux (NAF) of lakes and reservoirs in various climatic regions have been presented based on the thermodynamic equilibrium theory of the carbonate system, but global estimates remain highly uncertain. This is because the CO2 NAF depends largely on the partial pressure of carbon dioxide (PCO2) in the water, whereas PCO2 measurements are scarce. The purpose of this study was to estimate the CO2 NAF of Daecheong Reservoir (Korea) in 2012 and 2013 while accounting for the uncertainty of the PCO2 estimation method, using field data collected from various surface waters in the Geum River and Saemangeum basins of Korea. In addition, multiple linear regression (MLR) and machine learning models (random forest, RF, and artificial neural network, ANN) were used to identify the major environmental factors that determine daily PCO2 variations in Daecheong Reservoir, and NAF prediction models were developed with the selected input variables. The results showed that the pH, alkalinity (Alk), and dissolved inorganic carbon (DIC) measurements are thermodynamically consistent within the carbonate system, although the calculated PCO2 is highly sensitive to the accuracy of the pH measurements, particularly at low pH. Daecheong Reservoir was found to be a net source of CO2 to the atmosphere, with NAFs of 2,590 and 771 mg CO2 m-2 d-1 in 2012 and 2013, respectively. A stepwise multiple regression selected five independent variables (WT, pH, Alk, Chl-a, Uw) for the parsimonious model. The R2 values of the MLR, RF, and ANN models for the estimated NAF were 0.699, 0.975, and 0.997, respectively. The RF and ANN models performed much better in estimating high NAF values, whereas the MLR model significantly underestimated them. A 10-fold cross-validation with random sampling was applied to evaluate the three models and indicated that the ANN model performed best, followed by the RF and MLR models.
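The pH sensitivity noted above can be illustrated with a minimal carbonate-equilibrium sketch. It is not the study's calculation procedure: the function name `pco2_from_ph_alk` is hypothetical, and the equilibrium constants are assumed textbook approximations for dilute freshwater at 25 °C (pK1 ≈ 6.35, pK2 ≈ 10.33, pKH ≈ 1.47, pKw ≈ 14.0), with no temperature or ionic-strength correction.

```python
def pco2_from_ph_alk(ph, alk_eq_per_l, pk1=6.35, pk2=10.33, pkh=1.47, pkw=14.0):
    """Estimate PCO2 (in microatmospheres) from pH and total alkalinity.

    All constants are assumed 25 degC, zero-ionic-strength approximations;
    a real application would correct them for water temperature.
    """
    h = 10.0 ** (-ph)      # hydrogen ion concentration, mol/L
    k1 = 10.0 ** (-pk1)    # first dissociation constant of carbonic acid
    k2 = 10.0 ** (-pk2)    # second dissociation constant
    kh = 10.0 ** (-pkh)    # Henry's law constant, mol/(L atm)
    kw = 10.0 ** (-pkw)    # ion product of water

    # Carbonate alkalinity: total alkalinity minus [OH-] plus [H+]
    carbonate_alk = alk_eq_per_l - kw / h + h
    # carbonate_alk = [HCO3-] + 2[CO3--] = [HCO3-] * (1 + 2*K2/[H+])
    hco3 = carbonate_alk / (1.0 + 2.0 * k2 / h)
    # Dissolved CO2 from the first dissociation equilibrium
    co2_aq = h * hco3 / k1
    # Henry's law converts dissolved CO2 to partial pressure (atm -> uatm)
    return co2_aq / kh * 1.0e6


if __name__ == "__main__":
    alk = 1.0e-3  # hypothetical alkalinity of 1 meq/L, for illustration only
    for ph in (6.0, 7.0, 8.0):
        low = pco2_from_ph_alk(ph - 0.1, alk)
        mid = pco2_from_ph_alk(ph, alk)
        print(f"pH {ph:.1f}: PCO2 = {mid:.0f} uatm, "
              f"change for a -0.1 pH error = {low - mid:.0f} uatm")
```

Under these assumptions, a 0.1-unit pH error shifts the computed PCO2 by roughly 26% at any circumneutral pH, so the absolute error is largest where PCO2 is highest, that is, at low pH.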