Soil moisture is a key variable in natural resource management: it informs agricultural monitoring, drought and flood prediction, forest fire risk assessment, and water resource management. Data-driven models have become important tools for understanding and forecasting soil moisture dynamics because they can learn complex patterns and relationships from large datasets that traditional models might miss. Rapid advances in artificial intelligence are driving more research and practical applications of machine learning in this area.
Transferability in watershed hydrology refers to the ability to apply an established hydrological model and its parameters to other basins, especially those with limited data. It is crucial for accurate hydrological forecasting and effective water resource management. While data-driven models show potential for watershed hydrology, their transferability depends on factors such as model selection, regional characteristics, and the hydrological processes specific to each basin. Satellite data also play a significant role in validating these models and in building confidence in their soil moisture predictions. In this study, SMAP (Soil Moisture Active Passive) data are used to validate the output of the data-driven model.
Figure 1: Map of the study area showing key river basins in Thailand. The basins are heavily influenced by the Asian summer monsoon and are essential for agriculture and water resource management.
The long short-term memory (LSTM) model was first developed for the Salawin Basin to predict soil moisture from four hydrological variables: precipitation, evapotranspiration, surface runoff, and groundwater storage. Each variable was normalized with MinMaxScaler to improve learning efficiency. The dataset was split into training and testing sets, and the model was built as a sequential stack of three LSTM layers followed by a dense output layer. Training ran for 100 epochs with the Adam optimizer and mean squared error as the loss function. The model output was then compared with SMAP soil moisture data, and predictive accuracy was assessed with metrics such as the Nash-Sutcliffe efficiency and percent bias.
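The preprocessing and architecture described above can be sketched roughly as follows. This is an illustrative outline, not the study's actual code: the window-based framing (predicting the value that follows each window), the layer width `units=64`, and all function names are assumptions introduced here. The scaling helpers are written in plain Python to mirror what MinMaxScaler does, and the Keras model is only built if TensorFlow is installed.

```python
def fit_min_max(rows):
    """Learn per-column (min, max) from the training rows only."""
    columns = list(zip(*rows))
    return [(min(col), max(col)) for col in columns]


def apply_min_max(rows, ranges):
    """Scale each column to [0, 1] using previously fitted ranges."""
    scaled = []
    for row in rows:
        scaled.append([
            (value - lo) / (hi - lo) if hi > lo else 0.0
            for value, (lo, hi) in zip(row, ranges)
        ])
    return scaled


def make_windows(features, target, window):
    """Pair each run of `window` consecutive feature rows with the
    target value that immediately follows it (an assumed framing)."""
    X, y = [], []
    for i in range(len(features) - window):
        X.append(features[i:i + window])
        y.append(target[i + window])
    return X, y


def build_lstm(window, n_features, units=64):
    """Three stacked LSTM layers and a dense output, compiled with
    Adam and mean squared error. `units` is a hypothetical width."""
    from tensorflow import keras  # imported lazily; needs TensorFlow

    model = keras.Sequential([
        keras.Input(shape=(window, n_features)),
        # Intermediate LSTM layers must return full sequences so the
        # next LSTM layer receives one vector per time step.
        keras.layers.LSTM(units, return_sequences=True),
        keras.layers.LSTM(units, return_sequences=True),
        keras.layers.LSTM(units),
        keras.layers.Dense(1),
    ])
    model.compile(optimizer="adam", loss="mean_squared_error")
    return model
```

Fitting the scaling ranges on the training rows and reusing them for the test rows keeps information from the test period out of training. To train, the nested lists would be converted to arrays and passed to `model.fit(..., epochs=100)`.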
Figure 2: Illustrated Methodology for Soil Moisture Prediction Using LSTM: A Step-by-Step Workflow from Data Preprocessing to Model Transfer Across Basins.
Figure 3: Soil Moisture Prediction Using LSTM Model Across Multiple Basins: Salawin, North Khong, Mun, Chao Phraya, and Tha Chin.
The LSTM model was first developed to predict soil moisture in the Salawin Basin. It was trained on data from April 2015 to December 2020 and tested on data from January 2021 to October 2023, and it performed strongly when validated against SMAP satellite data. When the model was transferred to other basins with different hydrological characteristics, Nash-Sutcliffe efficiency (NSE) values for the training period ranged from 0.74 for the Tha Chin Basin to 0.86 for the Mun Basin. Performance during the testing period was more variable, with the Tha Chin and North Khong Basins showing lower NSE values. Even so, percent bias (PBIAS) generally fell within the “very good” range, except for the Tha Chin Basin, where the model underestimated soil moisture.
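For readers unfamiliar with the two metrics, a minimal sketch of how they can be computed is given below. The function names are hypothetical, and the PBIAS sign convention (observed minus simulated, so that positive values mean underestimation) is an assumption; the study may use the opposite sign.

```python
def nse(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 means the
    model is no better than predicting the observed mean."""
    mean_obs = sum(observed) / len(observed)
    squared_error = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance_term = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - squared_error / variance_term


def pbias(observed, simulated):
    """Percent bias, using the observed-minus-simulated convention:
    positive values indicate the model underestimates on average."""
    total_diff = sum(o - s for o, s in zip(observed, simulated))
    return 100.0 * total_diff / sum(observed)
```

Here `observed` would hold the SMAP soil moisture values and `simulated` the LSTM predictions for the same dates in a given basin.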
The Mun Basin’s high predictive accuracy was attributed to its hydrological and geographical similarity to the Salawin Basin, including comparable average elevation and land use patterns. Conversely, the Tha Chin Basin’s unique characteristics, such as its low mean elevation of 31 meters and distinct precipitation regime, posed challenges for the model, leading to reduced accuracy and underestimation of soil moisture. These findings highlight the influence of regional hydrological and geographical differences on the model’s transferability and performance.
ALICE-LAB: Asian Land Information for Climate and Environmental Research Laboratory