Source identification of sudden water pollution events in the Dongliao River using a hybrid machine learning framework

Source: PubMed "swarm"
Sci Rep. 2026 Mar 4. doi: 10.1038/s41598-026-41724-8. Online ahead of print.

ABSTRACT

Rapid and accurate identification of pollution sources is critical for emergency management but remains challenged by the high computational cost and uncertainty of traditional numerical models. To address this, this study aims to develop a novel hybrid framework that integrates machine learning with numerical modeling for efficient and robust source inversion. A MIKE 21 hydrodynamic-water quality model of the Dongliao River was developed to generate a synthetic dataset to train long short-term memory (LSTM), kernel extreme learning machine, and support vector machine surrogate models. Among them, LSTM achieved superior accuracy and was selected for further integration. For deterministic source identification, a whale optimization algorithm (WOA)-LSTM model was developed, significantly reducing both inversion error and computation time. A probabilistic inversion system was subsequently established by coupling the WOA-LSTM model with a Bayesian framework to characterize posterior probability distributions. Comparative analysis under data noise scenarios revealed that while the deterministic method performed poorly, the probabilistic approach demonstrated remarkable robustness, improving inversion accuracy by over 47%. These findings demonstrate that integrating a physics-informed ML surrogate with Bayesian inference effectively addresses the trade-off between efficiency and uncertainty. This framework offers a powerful tool for intelligent early warning systems, supporting decision-makers in the effective management and mitigation of sudden water pollution incidents.

PMID:41781444 | DOI:10.1038/s41598-026-41724-8
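
The probabilistic inversion idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not give the model details, so everything here is an assumption: a closed-form 1D advection-dispersion solution stands in for the trained LSTM surrogate, a brute-force grid evaluation stands in for the WOA-driven search, and all names and values (`surrogate`, `grid_posterior`, `U`, `D`, `AREA`, `X_OBS`, `SIGMA`, the grids, the true source) are hypothetical. The sketch recovers the posterior over release mass and source location from noisy readings at a single downstream station, assuming a flat prior and Gaussian observation noise.

```python
import math
import random

# Hypothetical river and monitoring setup (illustrative values, not from the paper).
U = 0.5         # mean flow velocity, m/s
D = 10.0        # longitudinal dispersion coefficient, m^2/s
AREA = 20.0     # cross-sectional area, m^2
X_OBS = 5000.0  # monitoring station position, m
SIGMA = 2.0     # assumed observation noise standard deviation, mg/L


def surrogate(mass_kg, x0_m, t_s):
    """Toy stand-in for a trained surrogate: concentration (mg/L) at X_OBS
    for an instantaneous release of mass_kg at position x0_m at time 0."""
    spread = math.sqrt(4.0 * math.pi * D * t_s)
    travel = X_OBS - x0_m - U * t_s
    return 1000.0 * mass_kg / (AREA * spread) * math.exp(-travel ** 2 / (4.0 * D * t_s))


def grid_posterior(obs, times, mass_grid, x0_grid):
    """Normalized posterior on a (x0, mass) grid: flat prior, Gaussian likelihood."""
    log_like = []
    for x0 in x0_grid:
        row = []
        for mass in mass_grid:
            sse = sum((o - surrogate(mass, x0, t)) ** 2 for o, t in zip(obs, times))
            row.append(-0.5 * sse / SIGMA ** 2)
        log_like.append(row)
    # Subtract the maximum before exponentiating to avoid numerical underflow.
    peak = max(max(row) for row in log_like)
    weights = [[math.exp(v - peak) for v in row] for row in log_like]
    total = sum(sum(row) for row in weights)
    return [[w / total for w in row] for row in weights]


# Synthetic "observations": a hypothetical true source plus Gaussian noise.
rng = random.Random(0)
times = [600.0 + 300.0 * k for k in range(47)]   # 10 min to 4 h, every 5 min
true_mass, true_x0 = 500.0, 1000.0               # kg, m
obs = [surrogate(true_mass, true_x0, t) + rng.gauss(0.0, SIGMA) for t in times]

mass_grid = [100.0 + 10.0 * j for j in range(91)]   # 100 to 1000 kg
x0_grid = [10.0 * i for i in range(201)]            # 0 to 2000 m
post = grid_posterior(obs, times, mass_grid, x0_grid)

# Posterior means summarize the estimate; the spread of `post` carries the uncertainty.
mass_mean = sum(post[i][j] * mass_grid[j]
                for i in range(len(x0_grid)) for j in range(len(mass_grid)))
x0_mean = sum(post[i][j] * x0_grid[i]
              for i in range(len(x0_grid)) for j in range(len(mass_grid)))
print(f"posterior mean mass: {mass_mean:.1f} kg, location: {x0_mean:.1f} m")
```

A grid is only practical for two or three unknowns. With more source parameters (for example release time and duration), a sampler or an optimizer-guided search would be the more realistic choice, which is presumably where a fast surrogate pays off.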