Title: Digital Mapping of Soil pH and Driving Factor Analysis Based on Environmental Variable Screening
Authors: Huang, H (Huang, He); Liu, YL (Liu, Yaolin); Liu, YF (Liu, Yanfang); Tong, ZM (Tong, Zhaomin); Ren, ZQ (Ren, Zhouqiao); Xie, YF (Xie, Yifan)
Source: SUSTAINABILITY Volume: 17 Issue: 7 Article Number: 3173 DOI: 10.3390/su17073173 Published Date: 2025 APR 3
Abstract: This study comprehensively considers soil formation factors such as land use types, soil types, depths, and geographical conditions in Lanxi City, China. Using multi-source public data, three environmental variable screening methods (the Boruta algorithm, Recursive Feature Elimination (RFE), and Particle Swarm Optimization (PSO)) were used to optimize and combine 47 environmental variables for modeling soil pH from data collected on farmland in the study area in 2022, and their effects were evaluated. A Random Forest (RF) model was used to predict soil pH in the study area. At the same time, Pearson correlation analysis, an environmental variable importance assessment based on the RF model, and the SHAP explanatory model were used to explore the main controlling factors of soil pH and reveal its spatial differentiation mechanism. The results showed that, with a large number of environmental variables, the RF model with covariates selected by PSO had higher prediction accuracy than the Boruta-RF, RFE-RF, and all-variable RF models (MAE = 0.496, RMSE = 0.641, R² = 0.413, LCCC = 0.508). This indicates that PSO, as a covariate selection method, effectively optimized the input variables for the RF model, enhancing its performance. In addition, the results of the Pearson correlation analysis, the RF-model-based environmental variable importance assessment, and the SHAP explanatory model consistently indicate that Channel Network Base Level (CNBL), Elevation (DEM), Temperature mean (T_m), Evaporation (E_m), Land surface temperature mean (LST_m), and Humidity mean (H_m) are key factors affecting the spatial differentiation of soil pH.
In summary, the approach of using PSO for covariate selection before applying the RF model exhibits high prediction accuracy and can serve as an effective method for predicting the spatial distribution of soil pH, providing important references for accurately simulating the spatial mapping of soil attributes in hilly and basin areas.
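The four accuracy metrics reported in the abstract (MAE, RMSE, R², LCCC) can be illustrated with a minimal sketch. It assumes the standard textbook definitions, with R² taken as the coefficient of determination and LCCC as Lin's concordance correlation coefficient computed with population (1/n) moments; the record does not state which variants the authors used. The function name `evaluate_predictions` and the sample values are hypothetical and are not taken from the paper.

```python
import math


def evaluate_predictions(observed, predicted):
    """Return MAE, RMSE, R2 and LCCC for paired observed/predicted values.

    Assumed definitions (not confirmed by the source record):
      R2   = 1 - SS_res / SS_tot
      LCCC = 2 * cov / (var_obs + var_pred + (mean_obs - mean_pred) ** 2)
    """
    n = len(observed)
    if n != len(predicted) or n < 2:
        raise ValueError("need two equally long sequences with at least 2 values")

    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    errors = [p - o for o, p in zip(observed, predicted)]

    ss_res = sum(e * e for e in errors)
    ss_tot = sum((o - mean_o) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values are constant; R2 and LCCC are undefined")

    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(ss_res / n)
    r2 = 1.0 - ss_res / ss_tot

    # Population (1/n) variances and covariance, as in Lin's original estimator.
    var_o = ss_tot / n
    var_p = sum((p - mean_p) ** 2 for p in predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted)) / n
    lccc = 2.0 * cov / (var_o + var_p + (mean_o - mean_p) ** 2)

    return {"MAE": mae, "RMSE": rmse, "R2": r2, "LCCC": lccc}


if __name__ == "__main__":
    # Hypothetical soil pH values, for illustration only.
    observed_ph = [5.2, 6.1, 5.8, 6.9, 5.5]
    predicted_ph = [5.6, 5.9, 5.7, 6.3, 5.8]
    for name, value in evaluate_predictions(observed_ph, predicted_ph).items():
        print(f"{name} = {value:.3f}")
```

Unlike R², LCCC penalizes systematic bias as well as scatter, so a model whose predictions are uniformly shifted from the observations can score a low R² yet a moderate LCCC, or the reverse.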
Author Keywords: PSO; environmental variable screening; SHAP; soil pH; digital soil mapping
KeyWords Plus: FEATURE-SELECTION; SPATIAL PREDICTION; SEMIARID REGION; RANDOM FOREST; MODEL; REFLECTANCE; REGRESSION; PEDOLOGY; QUALITY; SYSTEM
Addresses: [Huang, He; Liu, Yaolin; Liu, Yanfang; Tong, Zhaomin; Xie, Yifan] Wuhan Univ, Sch Resource & Environm Sci, Wuhan 430079, Peoples R China.
[Ren, Zhouqiao] Zhejiang Acad Agr Sci, Inst Digital Agr, Hangzhou 310021, Peoples R China.
Corresponding Address: Liu, YL (corresponding author), Wuhan Univ, Sch Resource & Environm Sci, Wuhan 430079, Peoples R China.
E-mail Addresses: 2022182050055@whu.edu.cn; yaolin610@yeah.net; yfliu610@whu.edu.cn; 2019202050107@whu.edu.cn; renzq@zaas.ac.cn; 2021182050044@whu.edu.cn
Impact Factor: 3.3