DOAJ Open Access 2026

Hourly ozone concentration estimation and its health impact study based on ensemble machine learning: A case study of Taiyuan City

Rule DU, Xiaojuan YANG, Ruixia NIU, Yang XU, Guiming ZHU, +2 others

Abstract

Background: Ozone (O3) is a major air pollutant. Existing monitoring networks have unevenly distributed sites, insufficient coverage in underdeveloped areas, and low temporal resolution, which makes hourly data difficult to obtain. This limits the dynamic identification of pollution and the formulation of prevention and control strategies.

Objective: To construct an hourly O3 concentration estimation model based on ensemble machine learning, in order to improve the accuracy of pollution exposure assessment and to explore the health impacts of O3.

Methods: This study integrated land use regression modeling with machine learning techniques, using random forest and XGBoost algorithms to build the base models and non-negative least squares to stack them. The ensemble model was trained and validated across China using high-resolution, multi-source geographic data (e.g., meteorological data, population density, land cover types, and aerosol optical thickness). It was then tested in Taiyuan City and combined with a distributed lag non-linear model to analyze the association between O3 and emergency admissions.

Results: The ensemble model predicted O3 concentration well, with a higher coefficient of determination (R²) and a lower root-mean-square error (RMSE) than the single models: R² improved from 0.90 to 0.92 and RMSE decreased from 11.41 to 10.62, enhancing both prediction accuracy and generalization ability. In the application to Taiyuan City, the model successfully imputed hourly data for the entire year. The distributed lag non-linear model analysis showed that the relative risk (RR) values for the 6th to 8th days following O3 exposure were 1.14 (95%CI: 1.01, 1.29), 1.16 (95%CI: 1.02, 1.31), and 1.14 (95%CI: 1.01, 1.29), respectively, all significantly higher than 1, indicating a significant lagged association (lag 6-8 d) between O3 and the number of emergency room visits.

Conclusion: A high-precision, hourly O3 concentration estimation model was successfully constructed by combining the land use regression model with an ensemble machine learning approach, providing a scientific basis for environmental policy formulation and public health intervention. The application of the model verifies its generalization ability and practical value, and it can provide a new technical framework for subsequent environmental health research.
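
The stacking step described in the Methods can be illustrated with a minimal sketch. It is not the authors' implementation: the observations and the two sets of base-model predictions below are invented toy numbers, the names `nnls_two`, `rf_pred`, and `xgb_pred` are hypothetical, and the absence of an intercept is an assumption. For exactly two base models, the non-negative least squares problem can be solved by checking the few candidate solutions directly, which keeps the example free of third-party libraries.

```python
import math

def rmse(y, pred):
    """Root-mean-square error between observations and predictions."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y, pred)) / len(y))

def r2(y, pred):
    """Coefficient of determination."""
    mean_y = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, pred))
    ss_tot = sum((a - mean_y) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot

def nnls_two(p1, p2, y):
    """Non-negative least squares for two predictors (hypothetical helper).

    Minimises sum((w1*p1 + w2*p2 - y)^2) subject to w1 >= 0 and w2 >= 0 by
    comparing the unconstrained solution (if feasible) with the solutions
    that set one or both weights to zero.
    """
    a11 = sum(u * u for u in p1)
    a22 = sum(v * v for v in p2)
    a12 = sum(u * v for u, v in zip(p1, p2))
    b1 = sum(u * t for u, t in zip(p1, y))
    b2 = sum(v * t for v, t in zip(p2, y))

    candidates = [(0.0, 0.0)]
    if a11 > 0:
        candidates.append((max(b1 / a11, 0.0), 0.0))
    if a22 > 0:
        candidates.append((0.0, max(b2 / a22, 0.0)))
    det = a11 * a22 - a12 * a12
    if det > 1e-12:
        w1 = (b1 * a22 - b2 * a12) / det
        w2 = (a11 * b2 - a12 * b1) / det
        if w1 >= 0 and w2 >= 0:
            candidates.append((w1, w2))

    def sse(w):
        return sum((w[0] * u + w[1] * v - t) ** 2 for u, v, t in zip(p1, p2, y))

    return min(candidates, key=sse)

# Toy data (invented for illustration, not taken from the study).
y = [40.0, 55.0, 62.0, 80.0, 95.0, 110.0, 72.0, 48.0]
rf_pred = [44.0, 51.0, 66.0, 75.0, 99.0, 104.0, 70.0, 53.0]
xgb_pred = [37.0, 58.0, 60.0, 84.0, 91.0, 113.0, 76.0, 45.0]

weights = nnls_two(rf_pred, xgb_pred, y)
stacked = [weights[0] * a + weights[1] * b for a, b in zip(rf_pred, xgb_pred)]

print("weights:", weights)
print("RMSE rf / xgb / stacked:",
      rmse(y, rf_pred), rmse(y, xgb_pred), rmse(y, stacked))
print("R2   rf / xgb / stacked:",
      r2(y, rf_pred), r2(y, xgb_pred), r2(y, stacked))
```

In this sketch the weights are fitted and scored on the same toy numbers, so the stacked error cannot exceed that of either base model by construction. In practice the weights would typically be fitted on out-of-fold predictions, and the improvement reported in the Results comes from the study's own validation, not from this example.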

Authors (7)

Rule DU
Xiaojuan YANG
Ruixia NIU
Yang XU
Guiming ZHU
Qian GAO
Tong WANG

Citation Format

DU, R., YANG, X., NIU, R., XU, Y., ZHU, G., GAO, Q. et al. (2026). Hourly ozone concentration estimation and its health impact study based on ensemble machine learning: A case study of Taiyuan City. https://doi.org/10.11836/JEOM25283

Journal Information
Publication Year: 2026
Source Database: DOAJ
DOI: 10.11836/JEOM25283
Access: Open Access