Multiclass spatial predictions of borehole yield in southern Mali by means of machine learning classifiers



Bibliographic Details
Authors: Gómez Escalonilla, Víctor; Diancoumba, Oumou; Traoré, D.Y.; Montero González, Esperanza; Martín Loeches, Miguel Martín; Martínez Santos, Pedro
Resource type: article
Publication date: 2022
Country: Spain
Institution: Universidad Complutense de Madrid (UCM)
Repository: Docta Complutense
Language: English
OAI Identifier: oai:docta.ucm.es:20.500.14352/72575
Online access: https://hdl.handle.net/20.500.14352/72575
Access level: open access
Keywords: 556.3
Machine learning
Groundwater exploration
Yield prediction
GIS
Mali
Hydrology
2508 Hydrology
Description
Summary:
Study region: Regions of Bamako, Kati and Kangaba, southwestern Mali.
Study focus: Machine learning-based mapping of borehole yield. Three algorithms were trained on an imbalanced multiclass database of boreholes, with twenty variables used as predictors of borehole yield. All models returned balanced and geometric scores on the order of 0.80, with area under the receiver operating characteristic curve of up to 0.87. Three main methodological conclusions are drawn: (a) evaluating different machine learning classifiers and resampling strategies, and then selecting the best-performing ones, is shown to be a good strategy in this type of study; (b) ad hoc calibration tools, such as data on borehole success rates, provide an apt complement to standard machine learning metrics; and (c) a multiclass approach with an imbalanced database is a greater challenge than predicting a binary outcome, but potentially results in a finer depiction of field conditions.
New hydrological insights for the region: Alluvial sediments were found to be the most productive areas, while the Mandingue Plateau has the lowest groundwater potential. Piedmont areas show intermediate groundwater potential. Elevation, basement depth, slope and geology rank among the most important variables. Lower clay content, slope and elevation, together with greater basement depth and saturated thickness, were linked to the most productive class.
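The summary does not define its "balanced and geometric scores". The following is a minimal sketch, assuming they correspond to balanced accuracy (the mean of per-class recall) and the geometric mean of per-class recall, two metrics commonly used with imbalanced multiclass data. The class names and labels are hypothetical and are not taken from the study.

```python
from math import prod


def per_class_recall(y_true, y_pred):
    """Return {class label: recall} for every class present in y_true."""
    recalls = {}
    for label in sorted(set(y_true)):
        total = sum(1 for t in y_true if t == label)
        hits = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        recalls[label] = hits / total
    return recalls


def balanced_accuracy(y_true, y_pred):
    """Arithmetic mean of per-class recall (assumed meaning of 'balanced score')."""
    recalls = per_class_recall(y_true, y_pred)
    return sum(recalls.values()) / len(recalls)


def geometric_mean_score(y_true, y_pred):
    """Geometric mean of per-class recall (assumed meaning of 'geometric score')."""
    recalls = per_class_recall(y_true, y_pred)
    return prod(recalls.values()) ** (1 / len(recalls))


# Hypothetical imbalanced yield classes: 2 low, 4 medium, 10 high boreholes.
y_true = ["low"] * 2 + ["medium"] * 4 + ["high"] * 10
y_pred = (
    ["low", "high"]                              # 1 of 2 low boreholes recovered
    + ["medium", "medium", "medium", "high"]     # 3 of 4 medium boreholes recovered
    + ["high"] * 10                              # all high boreholes recovered
)

plain_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"plain accuracy:    {plain_accuracy:.3f}")
print(f"balanced accuracy: {balanced_accuracy(y_true, y_pred):.3f}")
print(f"geometric mean:    {geometric_mean_score(y_true, y_pred):.3f}")
```

In this toy case plain accuracy is inflated by the majority class, while both recall-based scores are pulled down by the poorly predicted minority class, which is why such metrics suit imbalanced databases.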