Data-driven reference evapotranspiration (ET0) estimation: a comparative study of regression and machine learning techniques

Publication date: 01/05/2024
Source: Journal of Applied Research in Technology & Engineering (JARTE)
Abstract
Accurate computation of evapotranspiration is critical for efficient irrigation planning and agricultural water management. This study evaluated the performance of regression and machine learning (ML) techniques in estimating daily reference evapotranspiration (ET0) and compared their estimates with ET0 computed by the Penman–Monteith (PM-56) method of the Food and Agriculture Organization (FAO). The study used 31 years (1990–2020) of meteorological data from the Pusa Institute, ICAR-IARI, New Delhi, which is located in a semi-arid climatic zone. Six regression techniques were used: multiple linear regression (MLR), elastic net regression (ENR), ridge regression (RDR), lasso regression (LASSOR), partial least squares regression (PLSR), and Poisson regression (POR). Four machine learning models, namely radial basis function (RBF), M5Tree, locally weighted learning (LWL), and gradient boosted tree (GBT), were also evaluated for predicting daily ET0. During training, the M5Tree model outperformed all other models in predicting daily ET0, with a mean absolute error (MAE), mean squared error (MSE), root-mean-squared error (RMSE), R-squared (R2), mean absolute percentage error (MAPE), and Willmott's index (d) of 0.088, 0.018, 0.136, 0.994, 3.073%, and 0.914, respectively. During testing, the M5Tree model gave MAE, MSE, RMSE, R2, MAPE, and d values of 0.114, 0.16, 0.382, 0.949, 3.946%, and 0.988, respectively. During training, the GBT model reported MAE, MSE, RMSE, R2, MAPE, and d of 0.097, 0.021, 0.145, 0.993, 3.257%, and 0.998, respectively, which were slightly poorer than those of the M5Tree model. However, the GBT model showed the best performance during testing, with MAE, MSE, RMSE, R2, MAPE, and d of 0.126, 0.053, 0.230, 0.982, 4.166%, and 0.995, respectively. The POR model performed the worst in predicting daily ET0 values, as evidenced by the prediction error statistics.
In conclusion, the developed ET0 model may be used to predict ET0 accurately in semi-arid regions for efficient irrigation scheduling, especially where a weighing-type field lysimeter is not available.
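
The six error statistics quoted in the abstract can be computed from paired series of reference (PM-56) and model-predicted ET0. The following is a minimal Python sketch, not the authors' code. The function name `error_statistics` and the sample values are hypothetical. R2 is assumed here to be the coefficient of determination (the study may instead report the squared correlation coefficient), and d is assumed to be Willmott's original index of agreement.

```python
import math


def error_statistics(observed, predicted):
    """Return MAE, MSE, RMSE, R2, MAPE (%) and Willmott's d for paired series.

    Assumes `observed` holds the reference ET0 values and contains no zeros
    (MAPE divides by each observed value) and is not constant (R2 and d
    divide by sums of deviations from the observed mean).
    """
    if len(observed) != len(predicted) or not observed:
        raise ValueError("series must be non-empty and of equal length")

    n = len(observed)
    mean_obs = sum(observed) / n
    residuals = [p - o for o, p in zip(observed, predicted)]

    mae = sum(abs(r) for r in residuals) / n
    ss_res = sum(r * r for r in residuals)  # residual sum of squares
    mse = ss_res / n
    rmse = math.sqrt(mse)

    # Assumption: R2 as the coefficient of determination, 1 - SS_res / SS_tot.
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    r2 = 1.0 - ss_res / ss_tot

    mape = 100.0 / n * sum(abs(r / o) for r, o in zip(residuals, observed))

    # Assumption: Willmott's original index of agreement,
    # d = 1 - sum((P - O)^2) / sum((|P - mean(O)| + |O - mean(O)|)^2).
    potential = sum(
        (abs(p - mean_obs) + abs(o - mean_obs)) ** 2
        for o, p in zip(observed, predicted)
    )
    d = 1.0 - ss_res / potential

    return {"MAE": mae, "MSE": mse, "RMSE": rmse, "R2": r2, "MAPE": mape, "d": d}


# Illustrative values only; these are not data from the study.
reference_et0 = [2.0, 4.0, 6.0]
predicted_et0 = [2.5, 4.0, 5.5]
for name, value in error_statistics(reference_et0, predicted_et0).items():
    print(f"{name}: {value:.3f}")
```

Because RMSE is the square root of MSE, the two values should agree for any one model and data split, which gives a quick consistency check on reported statistics.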