Source:
Foods - Scientific journal (MDPI)
Foods, Vol. 15, Pages 1176: Ensemble Learning Based on Bagging and Hybrid Sampling for Food Safety Risk Prediction
Foods doi: 10.3390/foods15071176
Authors:
Dafang Li
Zhengyong Zhang
Qingchun Wu
Xin Chen
Food safety sampling inspections are critical for risk prevention in complex supply chains, yet the extremely low frequency of high-risk samples poses substantial challenges for accurate risk prediction. To address the limitations of conventional machine learning models under severe class imbalance, this study proposes a unified Bagging–Stacking framework that integrates stacking ensembles, bagging, and SMOTE–Tomek hybrid resampling to enhance minority-class detection in food safety risk prediction. The stacking ensemble serves as the core of the framework, combining five tree-based base learners with Logistic Regression as the meta-learner to improve classification robustness. Balanced bootstrap subsets generated through bagging and SMOTE–Tomek hybrid resampling further improve minority-class representation, while a probability-based threshold optimization mechanism is incorporated to refine high-risk classification. Experiments on real-world inspection data show that the proposed framework substantially improves high-risk recall while simultaneously increasing precision, yielding the highest F1 among all compared models. It also maintains stable overall performance across varying test set proportions, demonstrating robustness and consistent generalization. SHAP analysis identifies storage conditions, production month, shelf life, package, and food category as key contributors to risk prediction, aligning with established mechanisms of food safety risk formation. Overall, the proposed framework provides accurate, robust, and interpretable support for food safety risk prediction, offering practical value for proactive risk prevention and more efficient regulatory resource allocation.
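The abstract does not say how the probability threshold is chosen. As an illustration only, the sketch below assumes the cutoff is selected by sweeping candidate values over predicted high-risk probabilities and keeping the one that maximizes F1 for the high-risk class. The function names (`f1_at_threshold`, `best_threshold`) and the toy labels and probabilities are hypothetical and are not taken from the paper.

```python
from typing import Sequence, Tuple


def f1_at_threshold(y_true: Sequence[int], proba: Sequence[float], threshold: float) -> float:
    """F1 for the high-risk class (label 1), predicting 1 when proba >= threshold."""
    tp = fp = fn = 0
    for label, p in zip(y_true, proba):
        pred = 1 if p >= threshold else 0
        if pred == 1 and label == 1:
            tp += 1
        elif pred == 1 and label == 0:
            fp += 1
        elif pred == 0 and label == 1:
            fn += 1
    denom = 2 * tp + fp + fn
    # F1 = 2TP / (2TP + FP + FN); defined as 0 when there are no positives at all.
    return 2 * tp / denom if denom else 0.0


def best_threshold(y_true: Sequence[int], proba: Sequence[float]) -> Tuple[float, float]:
    """Try each distinct predicted probability as a cutoff and return (threshold, F1).

    Ties are resolved in favor of the lower threshold, which favors recall.
    """
    best_t, best_f1 = 0.5, -1.0
    for t in sorted(set(proba)):
        score = f1_at_threshold(y_true, proba, t)
        if score > best_f1:
            best_t, best_f1 = t, score
    return best_t, best_f1


# Hypothetical validation data: 2 high-risk samples out of 10 (assumed, not from the paper).
y_val = [0, 0, 0, 0, 0, 0, 0, 1, 0, 1]
p_val = [0.02, 0.05, 0.08, 0.10, 0.12, 0.15, 0.30, 0.35, 0.40, 0.45]

threshold, f1 = best_threshold(y_val, p_val)
print(f"chosen threshold={threshold}, F1={f1:.2f}")
print(f"F1 at default 0.5 cutoff={f1_at_threshold(y_val, p_val, 0.5):.2f}")
```

On this toy data no predicted probability reaches 0.5, so the default cutoff flags no high-risk samples at all, while the swept cutoff recovers both at the cost of one false positive. In practice the cutoff would presumably be tuned on held-out validation data rather than on the test set, so that the reported metrics stay unbiased.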