Evaluating heavy metal pollution and health risks in river systems using Random Forest and XGBoost: Evidence from the Shkumbin River

dc.contributor.author: Shyti, Bederiana
dc.contributor.author: Basha, Lule
dc.contributor.author: Bekteshi, Lirim
dc.date.accessioned: 2025-12-16T09:52:06Z
dc.date.available: 2025-12-16T09:52:06Z
dc.date.issued: 2025-12-19
dc.description.abstract: Surface water contamination by heavy metals poses significant ecological and health risks due to their persistence, bioaccumulation, and toxicity. This research evaluated the concentrations of cadmium (Cd), chromium (Cr), copper (Cu), iron (Fe), lead (Pb), and zinc (Zn) in river water samples and assessed their impact on the Heavy Metal Pollution Index (HPI). Descriptive statistics revealed substantial variation among sampling sites, with HPI values ranging from 2.15 to 21.94. Although Cd and Pb were generally present in low concentrations, their localized maxima indicated potential hot spots of contamination, whereas Fe and Zn showed higher overall levels. To identify the most influential predictors of HPI, two machine learning regression models, Random Forest (RF) and Extreme Gradient Boosting (XGBoost), were implemented. The RF model explained more than 90% of the variance in HPI, with Cd, Zn, and Cr emerging as the most critical contributors. The XGBoost model achieved even higher predictive accuracy (R² = 0.998, RMSE = 0.76), confirming Cd and Cr as dominant predictors, together accounting for nearly 80% of the model’s explanatory power. These findings highlight the pivotal role of Cd and Cr in shaping HPI dynamics and demonstrate the utility of ensemble learning methods for environmental monitoring and risk assessment.
dc.identifier.issn: 1313-9940
dc.identifier.uri: https://doi.uni-plovdiv.bg/handle/store/833
dc.language.iso: en
dc.publisher: Plovdiv University Press "Paisii Hilendarski"
dc.subject: Heavy Metal Pollution Index
dc.subject: Machine Learning Models
dc.subject: Random Forest
dc.subject: XGBoost Models
dc.title: Evaluating heavy metal pollution and health risks in river systems using Random Forest and XGBoost: Evidence from the Shkumbin River
dc.type: Article
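
Note on the Heavy Metal Pollution Index: the abstract reports HPI values but this record does not state how they were computed. A minimal sketch of one widely used weighted-mean formulation is given here, assuming unit weights W_i = 1/S_i and sub-indices Q_i = 100 * |M_i - I_i| / (S_i - I_i), where M_i is the measured concentration, S_i the permissible standard, and I_i the ideal value. The function name, the zero default for I_i, and the standards in the usage note are illustrative assumptions, not values taken from the article.

```python
def heavy_metal_pollution_index(measured, standard, ideal=None):
    """Weighted-mean HPI: sum(W_i * Q_i) / sum(W_i), with W_i = 1 / S_i.

    measured, standard, ideal: dicts keyed by metal symbol, all in the
    same concentration unit. Ideal values default to 0 (an assumption).
    """
    ideal = ideal or {}
    weighted_sum = 0.0
    weight_total = 0.0
    for metal, s_i in standard.items():
        m_i = measured[metal]
        i_i = ideal.get(metal, 0.0)
        # Sub-index: percentage of the permissible range that is used up.
        q_i = 100.0 * abs(m_i - i_i) / (s_i - i_i)
        # Unit weight: inversely proportional to the permissible standard.
        w_i = 1.0 / s_i
        weighted_sum += w_i * q_i
        weight_total += w_i
    return weighted_sum / weight_total


# Hypothetical permissible limits (placeholders, not from the article).
example_standard = {"Cd": 3.0, "Cr": 50.0, "Pb": 10.0}
# Hypothetical sample at exactly half of each limit.
example_sample = {metal: s / 2 for metal, s in example_standard.items()}
example_hpi = heavy_metal_pollution_index(example_sample, example_standard)
```

Under this formulation a sample sitting exactly at every permissible limit scores 100, and metals with the strictest limits carry the largest weights.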
Files
Original bundle
Name: eb20252084.pdf
Size: 569.26 KB
Format: Adobe Portable Document Format