Evolutionary-Assisted Data-Driven Approach for Dissolved Oxygen Modeling: A Case Study in Kosovo
Abstract
Dissolved oxygen (DO) is widely recognized as a fundamental parameter in assessing water quality, given its critical role in supporting aquatic ecosystems. Accurate estimation of DO levels is crucial for effective management of riverine environments, especially in anthropogenically stressed regions. In this study, a hybrid machine learning (ML) framework is introduced to predict DO concentrations, with hyperparameter optimization performed through Genetic Algorithm Search with Cross-Validation (GASearchCV). The methodology was applied to a dataset collected from the Sitnica River in Kosovo, comprising more than 18,000 observations of temperature, conductivity, pH, and dissolved oxygen. Three ML models, Elastic Net (EN), Support Vector Regression (SVR), and Light Gradient Boosting Machine (LGBM), were fine-tuned using cross-validation and assessed using five performance metrics: coefficient of determination (R²), root mean square error (RMSE), mean absolute error (MAE), mean absolute relative error (MARE), and mean square error (MSE). Among them, the LGBM model yielded the best predictive results, achieving an R² of 0.944 and RMSE of 8.430 mg/L on average. A Monte Carlo Simulation-based uncertainty analysis further confirmed the model’s robustness, enabling comparison of the trade-off between uncertainty and predictive precision. Comparison with recent studies confirms the proposed framework’s competitive performance, demonstrating the effectiveness of automated tuning and ensemble learning in achieving reliable and real-time water quality forecasting.
The methodology offers a scalable and reliable solution for advancing data-driven water quality forecasting, with direct applicability to real-time environmental monitoring and sustainable resource management.
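The five performance metrics named in the abstract can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function name `regression_metrics` and the sample values are hypothetical, and MARE is assumed here to be the mean of |observed − predicted| / |observed| expressed as a fraction, which may differ from the exact definition used in the study.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return R2, RMSE, MAE, MARE and MSE for two equal-length sequences.

    Assumes every observed value is non-zero (needed for MARE) and that the
    observed values are not all identical (needed for R2).
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]

    # Mean square error and its square root.
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)

    # Mean absolute error.
    mae = sum(abs(e) for e in errors) / n

    # Mean absolute relative error (assumed definition, as a fraction).
    mare = sum(abs(e) / abs(t) for e, t in zip(errors, y_true)) / n

    # Coefficient of determination: 1 - SS_res / SS_tot.
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot

    return {"R2": r2, "RMSE": rmse, "MAE": mae, "MARE": mare, "MSE": mse}


# Made-up DO values in mg/L, for illustration only (not data from the study).
observed = [8.0, 10.0, 12.0, 6.0]
predicted = [7.5, 10.5, 11.0, 6.5]
scores = regression_metrics(observed, predicted)
```

Because RMSE is the square root of MSE, the two always rank models identically; reporting both mainly helps when comparing against studies that publish only one of them.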
Authors (12)
Bruno da S. Macêdo
Larissa Lima
Douglas Lima Fonseca
Tales H. A. Boratto
Camila M. Saporetti
Osman Fetoshi
Edmond Hajrizi
Pajtim Bytyçi
Uilson R. V. Aires
Roland Yonaba
Priscila Capriles
Leonardo Goliatt
Quick Access
- Publication Year: 2025
- Source Database: DOAJ
- DOI: 10.3390/earth6030081
- Access: Open Access ✓