Document Type : Research Article
Authors
1
Department of Civil Engineering, Faculty of Engineering, University of Zabol, Zabol, Iran
2
Department of Information Technology, Payame Noor University (PNU), P.O. Box 19395-3697, Tehran, I.R. of Iran
3
PhD student in Civil Engineering (Water Resources Management and Engineering), Faculty of Technology and Engineering, Islamic Azad University, South Tehran Branch, Tehran.
4
Surveying Department, Faculty of Engineering, University of Zabol, Zabol, Iran.
10.22034/ijwer.2025.536685.1102
Abstract
Accurate prediction of suspended sediment transport is important for the sustainability of river engineering. This study investigates the feasibility of a new intelligent model, the M5 model tree with a radial basis function (RM5Tree), for predicting suspended sediment load from daily data at the Trenton station on the Delaware River (USA). For this purpose, several input combinations were defined from sediment and river flow records. The prediction accuracy of the proposed model was assessed with statistical measures and graphical comparisons against well-known predictive models, namely an artificial neural network (ANN) and the classical M5 model tree. The root mean square error and coefficient of determination values show that the proposed RM5Tree model predicts suspended sediment load more accurately than these benchmarks.
Keywords: Sediment transport prediction, river engineering sustainability, RM5Tree
Introduction: Sediment load comprises bed load and suspended sediment load (SSL). SSL makes up the main part of sediment transport and follows a more complex pattern than bed load, so a reliable intelligent model for predicting SSL is a fundamental research topic in water resources. The SSL pattern is highly stochastic because it is influenced by several hydrological and morphological variables related to the characteristics of the watershed (Kisi and Yaseen 2019). Laboratory determination of sediment concentration requires extensive effort to collect samples and carry out several analytical procedures. These procedures are also time-consuming and unreliable under flood conditions. Machine learning models offer practical computational alternatives that overcome these drawbacks. A review of the literature shows that very few studies have estimated SSL with model trees (MT), a class of decision tree models. Talebi, Mahjoobi et al. (2017) used MT and regression trees (RT) to predict daily sediment discharge in the Heidarabad basin of Iran, with stream discharge and precipitation as predictor variables. They compared MT with ANN and concluded that MT performed well.
The present study implements a new version of the M5Tree model, integrated with a radial basis function as a hybrid model (RM5Tree), for predicting SSL on a daily time scale. The newly developed model is validated against the classical M5Tree model.
Methodology: This study utilized 32 years of daily river discharge (Q) and suspended sediment load (SSL) data from the Trenton Station on the Delaware River, USA (USGS Station No. 01463500). The data were split into training (70%) and testing (30%) sets. Descriptive statistics for the input variables are provided in Table 1, and the study area is illustrated in Figure 1.
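The 70/30 split described above can be sketched as follows. The text does not say whether the split was chronological or random; this minimal sketch assumes a chronological split, which is common for daily time series, and the function name `chronological_split` is hypothetical.

```python
def chronological_split(records, train_fraction=0.7):
    """Split a time-ordered sequence into a leading training part
    and a trailing testing part (assumed chronological split)."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be strictly between 0 and 1")
    cut = int(len(records) * train_fraction)
    return records[:cut], records[cut:]


# Toy stand-in for the daily record; the real series spans 32 years.
daily = list(range(10))
train, test = chronological_split(daily)
```

Keeping the test period after the training period avoids using future observations to predict past ones, which matters when lagged values serve as inputs.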
To capture temporal dependencies, six input combinations (scenarios) were developed using current and lagged values of Q and SSL (written S in the list below):
i. Qt, St-1, St-2
ii. Qt, Qt-1, St-1
iii. Qt, Qt-1, Qt-2
iv. Qt, Qt-1, St-1, St-2
v. Qt, Qt-1, Qt-2, St-1
vi. Qt, Qt-1, Qt-2, St-1, St-2
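The six scenarios can be written as pairs of lag lists, one for discharge and one for sediment load. The sketch below is illustrative only: the names `SCENARIOS` and `build_lagged_inputs` are hypothetical, and it assumes the prediction target is the sediment load at time t.

```python
# Each scenario maps to (lags of Q, lags of S); lag 0 means the current day.
SCENARIOS = {
    "i": ((0,), (1, 2)),
    "ii": ((0, 1), (1,)),
    "iii": ((0, 1, 2), ()),
    "iv": ((0, 1), (1, 2)),
    "v": ((0, 1, 2), (1,)),
    "vi": ((0, 1, 2), (1, 2)),
}


def build_lagged_inputs(q, s, q_lags, s_lags):
    """Return (X, y): each row of X holds Q and S at the requested lags,
    and y holds S at time t (assumed target)."""
    max_lag = max(list(q_lags) + list(s_lags))
    X, y = [], []
    # Start at max_lag so that every requested lag is available.
    for t in range(max_lag, len(q)):
        row = [q[t - k] for k in q_lags] + [s[t - k] for k in s_lags]
        X.append(row)
        y.append(s[t])
    return X, y


# Toy series, not data from the study.
q = [10.0, 12.0, 15.0, 11.0, 9.0]
s = [1.0, 2.0, 4.0, 3.0, 1.5]
X, y = build_lagged_inputs(q, s, *SCENARIOS["vi"])
```

With scenario vi, the first usable day is the third one, because two preceding values of both series are required.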
Three predictive models were applied: a multilayer perceptron neural network (MLPNN, referred to as ANN elsewhere in this text), the classical M5 model tree (M5Tree), and the proposed radial basis M5Tree (RM5Tree). The MLPNN was trained with the Levenberg–Marquardt backpropagation algorithm (Figure 2a). M5Tree builds a decision tree with linear regression models at its leaves, while RM5Tree extends this by first mapping the input data into a radial space via a normal cumulative distribution function (Figures 2b and 2c).
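The exact form of the radial mapping is not given in this text, so the following is only one possible reading, offered as a sketch: each input vector is replaced by its Euclidean distances to a set of centers, and each distance is passed through the standard normal cumulative distribution function. The function names, the `width` parameter, and the two example centers are all assumptions made for illustration.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def radial_features(x, centers, width=1.0):
    """Map an input vector to one feature per center:
    Phi(distance to center / width). Assumed form, not the paper's."""
    feats = []
    for c in centers:
        r = math.dist(x, c)  # Euclidean distance to this center
        feats.append(normal_cdf(r / width))
    return feats


# Two hypothetical centers in a two-dimensional input space.
centers = [[0.0, 0.0], [3.0, 4.0]]
feats = radial_features([0.0, 0.0], centers)
```

Under this reading, a point lying on a center maps to 0.5 and distant points approach 1, so the transformed features are bounded. A tree model would then be fitted on these features instead of the raw inputs.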
Performance was evaluated using Root Mean Square Error (RMSE), Mean Absolute Error (MAE), Nash–Sutcliffe Efficiency (NSE), and Agreement Index (d). These metrics, computed for all models and input combinations, are presented in Table 2 to determine the most effective approach for daily SSL prediction.
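For reference, the four metrics can be computed as in the sketch below. It uses the standard definitions of RMSE, MAE, and NSE, and assumes that the agreement index d is Willmott's index of agreement; the text does not state which formulation was used. The function names are hypothetical.

```python
import math


def rmse(obs, pred):
    """Root mean square error."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def mae(obs, pred):
    """Mean absolute error."""
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)


def nse(obs, pred):
    """Nash-Sutcliffe efficiency: 1 - SSE / variance of observations about their mean."""
    mean_obs = sum(obs) / len(obs)
    sse = sum((o - p) ** 2 for o, p in zip(obs, pred))
    return 1.0 - sse / sum((o - mean_obs) ** 2 for o in obs)


def agreement_index(obs, pred):
    """Willmott's index of agreement d (assumed formulation)."""
    mean_obs = sum(obs) / len(obs)
    sse = sum((o - p) ** 2 for o, p in zip(obs, pred))
    potential = sum((abs(p - mean_obs) + abs(o - mean_obs)) ** 2
                    for o, p in zip(obs, pred))
    return 1.0 - sse / potential


# Toy values, not data from the study: every prediction is 1 unit too high.
obs = [1.0, 2.0, 3.0, 4.0]
pred = [2.0, 3.0, 4.0, 5.0]
scores = (rmse(obs, pred), mae(obs, pred), nse(obs, pred), agreement_index(obs, pred))
```

RMSE and MAE carry the units of SSL (t/day) and are better when lower, whereas NSE and d are dimensionless and equal 1 for a perfect fit.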
Results and Discussion: The proposed radial basis M5Tree (RM5Tree) model predicted daily suspended sediment load (SSL) more accurately than the benchmark models (ANN and classical M5Tree) across all six input scenarios. The evaluation metrics (RMSE, MAE, NSE, and agreement index d) are presented in Table 2 and show that RM5Tree consistently outperformed the other models in both the training and testing phases. The optimal input combination included current discharge and two lagged values each of discharge and sediment load (scenario vi: Qt, Qt-1, Qt-2, SSLt-1, SSLt-2). This configuration, combined with a 15-center radial transformation, yielded the lowest RMSE (2090 t/day), the highest NSE (0.86), and the best overall agreement (d = 0.92). The improvement is further illustrated in Figure 3, where RM5Tree shows the highest d/MAE ratio, and in Figure 4, where its predictions align most closely with the observed data.
The innovation of this study lies in integrating a radial basis transformation with the M5Tree framework to improve generalization and handle nonlinearity more effectively. By mapping input data into a radial space using a normal cumulative distribution function, RM5Tree captures subtle variations in SSL dynamics that traditional models overlook. Compared to previous models such as GEP, ANN, W-GEP, and neuro-fuzzy approaches used by Shiri and Kişi (2012) and Vafakhah (2012), RM5Tree achieved substantially lower RMSE values, an improvement in prediction accuracy of up to 66%. Models such as CART (Choubin et al., 2018b) and MT (Talebi et al., 2017) also relied on decision trees, but they lacked the hybrid transformation mechanism that distinguishes RM5Tree, which limited their adaptability to highly stochastic environments.
A major strength of RM5Tree is its adaptability across varying input configurations. Unlike ANN, which is sensitive to input dimensionality and structure, RM5Tree maintained stable performance across all scenarios. It also avoids overfitting by transforming the data into a smoother, radially distributed input space, which improves its generalization capability. In summary, this study introduces a novel hybrid machine learning framework for SSL prediction that outperforms both conventional and state-of-the-art models in accuracy and robustness. The results contribute a practical and scalable solution for sediment management in river engineering, with significant implications for sustainable water resource planning.
Conclusion: This study introduced a novel hybrid model (RM5Tree) for predicting daily suspended sediment load (SSL) from river discharge and historical sediment data. Among the six tested input scenarios, the best performance was achieved using current discharge and two preceding values each of discharge and SSL. The RM5Tree model significantly outperformed the benchmark models (ANN, M5Tree), achieving the lowest RMSE (2090 t/day) and the highest NSE (0.86), as shown in Table 2. The innovation lies in transforming the input data into radial space, which enhances the model's ability to capture nonlinear, stochastic sediment patterns. Compared with previous models such as ANN, GEP, and CART, RM5Tree demonstrated higher accuracy and better generalization. This approach offers a reliable tool for sediment prediction and river engineering applications, contributing to better watershed management and infrastructure planning.