Predicting axial load capacity in elliptical fiber reinforced polymer concrete steel double skin columns using machine learning


Scientific Reports volume 15, Article number: 12899 (2025)


The current study investigates the application of artificial intelligence (AI) techniques, including machine learning (ML) and deep learning (DL), in predicting the ultimate load-carrying capacity and ultimate strain of both hollow and solid hybrid elliptical fiber-reinforced polymer (FRP)–concrete–steel double-skin tubular columns (DSTCs) under axial loading. The implemented AI techniques include five ML models, namely Gene Expression Programming (GEP), Artificial Neural Network (ANN), Random Forest (RF), Adaptive Boosting (ADB), and eXtreme Gradient Boosting (XGBoost), and one DL model, the Deep Neural Network (DNN). Due to the scarcity of experimental data on hybrid elliptical DSTCs, an accurate finite element (FE) model was developed to provide additional numerical insights. The reliability of the proposed nonlinear FE model was validated against existing experimental results. The validated model was then employed in a parametric study to generate 112 data points. The parametric study examined the impact of concrete strength, the cross-sectional size of the inner steel tube, and FRP thickness on the ultimate load-carrying capacity and ultimate strain of both hollow and solid hybrid elliptical DSTCs. The effectiveness of the AI application was assessed by comparing the models’ predictions with FE results. Among the models, XGBoost and RF achieved the best performance in both training and testing with respect to the determination coefficient (R2), Root Mean Square Error (RMSE), and Mean Absolute Error (MAE) values. The study provided insights into the contributions of individual features to predictions using the SHapley Additive exPlanations (SHAP) approach. The SHAP results, based on the best prediction performance of the XGBoost model, indicate that the area of the concrete core has the most significant effect on the load-carrying capacity of hybrid elliptical DSTCs, followed by the unconfined concrete strength and the total thickness of FRP multiplied by its elastic modulus. Additionally, a user interface platform was developed to streamline the practical application of the proposed AI models in predicting the axial capacity of DSTCs.

Currently, a variety of engineering techniques are employed to enhance the strength, durability, and weather resistance of concrete structures1,2. Among these, Engineered Cementitious Composites (ECC) and fiber-reinforced polymer (FRP) have emerged as widely adopted methods in the field of structural engineering. These materials offer significant improvements in performance, making them increasingly popular for modern construction applications. ECC are high-performance fiber-reinforced materials with superior crack control and strain-hardening properties3,4,5,6,7. Their durability and mechanical strength make them ideal for structural repair and reinforcement8,9,10,11,12,13,14,15. Additionally, confining concrete with FRP effectively mitigates the brittle nature of concrete and enhances its durability16,17,18,19. Confining concrete between an outer FRP tube and an inner steel tube to form hybrid double-skin tubular columns (hybrid DSTCs) (see Fig. 1) provides significant advantages. The outer FRP tube acts as formwork for concrete casting, reducing the costs associated with conventional concrete molding formwork. Due to the high corrosion resistance of FRP, structures confined with FRP are well-suited for use in harsh environmental conditions20,21,22,23,24,25,26,27,28,29,30,31,32. ECC also offers a notable advantage by serving as permanent formwork for the infilled concrete11. Lai et al.11 proposed an innovative semi-precast steel-reinforced concrete (SRC) hybrid composite column featuring a precast ECC jacket as permanent formwork. Experimental tests on six stub columns, varying ECC thickness, steel size, and reinforcement spacing, demonstrated improved load capacity, ductility, and deformation response compared to conventional SRC columns.

Hybrid DSTCs. (a) Circular DSTCs with a circular steel tube. (b) Circular DSTCs with a square steel tube. (c) Elliptical DSTCs with an elliptical steel tube. (d) Elliptical DSTCs with a rectangular steel tube 28.

In DSTCs, the infilled concrete prevents early local buckling of the inner steel tube and enhances its load-carrying capacity33. The structural benefits of integrating FRP, concrete, and steel tubes in DSTCs have garnered significant interest from researchers.

Teng et al.34,35 conducted the first experimental study on hybrid DSTCs under axial compression, outlining their advantages. This led to a broad range of research, including: (1) monotonic axial compression34,35,36,37,38,39,40, (2) cyclic axial compression41,42,43, (3) eccentric compression44, (4) combined axial compression and cyclic lateral loading45,46,47, and (5) lateral impact loading48,49. These studies highlight the performance and potential of hybrid DSTCs in various loading conditions.

The effectiveness of confinement and the overall performance of hybrid DSTCs are influenced by various factors. Specifically, as the unconfined concrete strength increases, the enhancements in strength, strain capacity, and energy absorption provided by FRP confinement decline significantly50. Refs.22,51,52,53 highlighted the influence of FRP thickness, ply configurations, and aspect ratio on the strength and strain behavior of FRP-confined columns.

Gu et al. 54 highlighted the influence of FRP fracture strain on the drift capacity of FRP-retrofitted reinforced concrete under simulated seismic loads. Under monotonic compression loading, circular CFRP-confined concrete demonstrates superior confinement effectiveness compared to rectangular CFRP-confined concrete55. Unconfined concrete exhibits brittle behavior upon reaching its ultimate strength, while FRP confinement enhances the strength of the infilled concrete and enables it to display ductile behavior56. As a result, the stress–strain response of confined concrete differs from that of unconfined concrete due to the additional strength provided by FRP. Numerous researchers have developed various stress–strain models for confined concrete under static compression and simulated seismic loads57,58,59,60.

Existing studies, however, are mostly focused on hybrid DSTCs with a circular cross-section (i.e., the cross-section of the outer FRP tube is circular) (Figs. 1a and b). Although circular DSTCs are attractive as bridge piers, elliptical DSTCs are preferred when the column is subjected to different loads in the two horizontal directions (Figs. 1c and d). Elliptical DSTCs can provide different bending stiffness and moment capacity about the two axes of symmetry without significantly reducing the confining effect of the FRP tube61.

The existing literature on hybrid elliptical DSTCs with an elliptical steel tube can be divided into two categories: (1) DSTCs with a hollow inner steel tube (hollow—hybrid elliptical DSTCs), shown in Fig. 2a, and (2) DSTCs with the inner steel tube filled with concrete (solid—hybrid elliptical DSTCs), depicted in Fig. 2b.

Cross-sectional configurations of elliptical columns.

Most existing studies focus on hollow hybrid elliptical DSTCs28,61,62,63. However, a gap remains in investigating the behavior of solid hybrid elliptical DSTCs under axial loading. Additionally, the experimental tests28,62 examined the effects of the elliptical aspect ratio, FRP tube thickness, and void area ratio of hollow hybrid elliptical DSTCs. This study addresses this gap through a nonlinear finite element analysis (FEA) of short hybrid elliptical DSTCs. The reliability of the proposed FE model was validated by comparing its results with the relevant experimental test data presented in Ref.62. The validated model was then employed in a parametric study to generate 112 data points. The parametric study examined the impact of concrete strength (ranging from 29 to 72 MPa), the cross-sectional size of the inner steel tube, and FRP thickness on the ultimate load-carrying capacity and ultimate strain of both hollow and solid hybrid elliptical DSTCs. In addition, several Artificial Intelligence (AI) models were proposed to predict the ultimate load-carrying capacity and ultimate strain of hybrid elliptical DSTCs. AI, particularly Machine Learning (ML), has gained significant attention in civil engineering for solving complex problems64.

ML is widely used in structural engineering to predict component behavior65,66,67,68, assess buckling loads69,70,71,72, and forecast axial load capacity in columns, offering an alternative to traditional methods73,74,75,76. In their study, Yang et al.77 propose a hybrid machine learning model using a beta differential evolution-improved particle swarm optimization algorithm (BDE-IPSO) to optimize an artificial neural network (ANN) for predicting alkali-silica reaction (ASR)-induced concrete expansion. With 11 input variables and a database of 1900 ASR datasets, the model accurately captures experimental aspects of ASR expansion. Lai et al.78 analyze the seismic performance of steel-reinforced concrete (SRC) columns using machine learning (ML). A database of 248 SRC column tests under axial and cyclic loading was used. RF and XGBoost models excelled in failure mode and bearing capacity prediction, respectively, demonstrating ML’s accuracy and robustness for SRC column design. In another study, Lai et al.79 highlight the importance of concrete confinement in enhancing strength and ductility, noting that traditional models are often time-consuming and unreliable. They evaluate these models by breaking down the process into key steps: confining pressure determination, confinement effectiveness calculation, and confined strength/strain assessment. These steps are rigorously compared with experimental and numerical results. Using a database of 466 specimens, six machine learning (ML) models are employed to analyze correlations between input parameters and confinement degree. Interpretability analysis further validates the ML models, demonstrating their superiority over traditional semi-empirical methods for rapid and efficient confinement effect quantification, paving the way for advancements in concrete structural analysis. Isleem et al.80 have recently used different ANN approaches. Their analysis considers a database of 226 FRP-confined circular and noncircular concrete specimens to predict the different components of the stress–strain response. Additionally, Isleem et al.81 used various Artificial Neural Network (ANN) models to predict the confined compressive load of GFRP-RC hollow-core concrete columns at different loading stages, based on simulations of 116 specimens. In another study, Abdulla82 developed an ANN-based empirical expression to predict the axial compression capacity and strain of CFPT specimens, using 72 test data sets. The ANN model demonstrated superior accuracy, with a 2.8% average absolute error in compressive strength prediction and a 6.58% error in strain at peak stress. Ali et al.83 predicted the axial load-carrying capacities of FRP-RC columns using Deep Neural Network (DNN) and Convolutional Neural Network (CNN) models. These models, calibrated with various neurons in hidden layers, accurately predicted the capacities, achieving R2 values of 0.943 for DNN and 0.936 for CNN.

Additionally, the study provided insights into the contributions of individual features to predictions using the SHapley Additive exPlanations (SHAP) approach. Using the best-trained ML model, a graphical user interface was programmed in Python for practical use in designing hybrid elliptical DSTCs under axial loading.

In this study, the finite element (FE) software ABAQUS84 was utilized to develop a model for determining the ultimate load-carrying capacity and the ultimate strain of hybrid elliptical DSTCs under axial loading. The developed FE model was then employed to generate a total of 112 FE datasets, as detailed in the accompanying table, which presents the cross-sectional dimensions, material properties, ultimate load (\({P}_{u,FE}\)), and ultimate strain (\({\varepsilon }_{u,FE}\)) of the columns. The FE model considered two typical configurations for hybrid elliptical DSTCs, as shown in Fig. 2. The following sections provide a detailed explanation of the material definition, especially the nonlinear behavior, surface identification and interaction, element type and mesh selection, as well as the loading and boundary conditions.

Due to the solid nature of the core concrete, 8-node reduced-integration brick elements (C3D8R) with three degrees of freedom per node effectively simulate its deformation features. For the steel tube, both solid and shell elements are suitable for capturing deformation and local buckling. However, for thick steel tubes, the shell thickness may be much smaller than the element size, which can affect modeling accuracy. Using C3D8R ensures that the steel tube mesh conforms to the curved contact boundary, accurately depicting the deformation of the elliptical steel tube. For the FRP tube, S4R shell elements are appropriate for analyzing stress and strain because of the tube’s thin-walled nature.

Rigid body constraints were used to tie the top and bottom surfaces to their respective reference points (RPs) at the lower and upper surfaces. Boundary conditions were applied to RP1 and RP2 (see Fig. 3). At RP2, all displacements and rotations were constrained (i.e., \({d}_{x}\)=\({d}_{y}\)=\({d}_{z}\)= 0.0 and \({\theta }_{x}\),\({\theta }_{y}\), \({\theta }_{z}\)=0.0). At RP1, all were constrained except vertical displacement (\({d}_{z}\)), which was free. The load was simulated by applying a downward displacement to RP1 equivalent to 1/6th of the column’s length85.

Typical meshing, boundary conditions, and loading conditions of hybrid elliptical DSTCs.

In the FE model, surface-to-surface contact was utilized to account for the interaction between the inner core concrete and the steel tube. According to test results on hybrid elliptical DSTCs, a friction coefficient of 0.3 in the tangential direction was applied. This coefficient is also consistent with that used in Refs.86,87,88. Most numerical models assume perfect bonding between the FRP tube and RC members89,90,91. In this model, the external FRP is bonded to the concrete surface using surface-to-surface tie constraints. Here, the FRP contact surface is designated as the slave surface, while the concrete surface serves as the master surface18. The use of the default pressure-overclosure relationship for “hard” contact in normal behavior is essential for simulating realistic interactions between structural materials, such as steel and concrete. This type of contact mechanism ensures that penetration of the slave surface into the master surface is minimized, effectively replicating physical contact where compression can occur, but tensile forces cannot be transferred across the interface92. For tangential behavior, the transfer of both shear and normal forces is facilitated by the frictional interaction at the interface. Frictional behavior is defined by the stresses acting at the contact surfaces and is essential in simulating the load transfer and resistance to relative motion. The current model’s application of the stiffness (penalty) method to simulate friction between the steel plates and FRP and the concrete surfaces provides a balanced approach in which a defined friction coefficient governs how the surfaces resist sliding. This penalty method applies a corrective force when slip occurs, ensuring that the interaction remains consistent with expected physical behavior. This comprehensive modeling of friction helps capture the complex response of materials under combined shear and normal forces, contributing to a more accurate analysis of the structure’s performance92.

Linear and nonlinear material behavior was used in defining the behavior of materials in FEA. The detail of each material behavior in hybrid elliptical DSTCs is described in the following subsections.

In this study, a unidirectional FRP with a nominal thickness of 0.354 mm and a density of \(1.7\times {10}^{-9}\) tonne/mm3 (i.e., 1.7 g/cm3) was employed. The material behavior of FRP was idealized as orthotropic linear elastic up to tensile rupture. In ABAQUS, the elastic behavior of FRP was modeled by defining its engineering constants, including the elastic modulus (E, MPa), Poisson’s ratio (ν), tensile strength (\({f}_{u}\), MPa), and shear modulus (G, MPa), along the material’s principal directions. These constants, taken from Ref.62, specify an elastic modulus of 72 GPa, a tensile strength of 1530 MPa, and an ultimate strain of 2.13%.

Unconfined concrete demonstrates brittle behavior under uniaxial monotonic loading after attaining its ultimate strength. Confining concrete using a steel tube or FRP composite improves its behavior and allows the concrete to perform as an inelastic (elastic–plastic) material93. The strength and strain enhancements of the confined concrete mainly depend on the degree of confinement provided93. The core concrete confined by FRP acquires confinement pressure as the applied axial load increases and the concrete starts expanding laterally; this confinement action is therefore passive. In FEA, modeling the concrete stress–strain (σ–ε) relation by considering this confinement pressure plays a vital role in the accuracy of the analysis results. The current model adopts the stress–strain relationship for confined concrete proposed by Chen et al.62 to predict the behavior of elliptical DSTCs. This relationship is described as:

where, \({\rho }_{k}\) represents the confinement stiffness ratio, \({t}_{f}\) and \({E}_{f}\) denote the thickness and elastic modulus of the FRP tube, respectively, \({\rho }_{f}\) is the FRP volumetric ratio, and \({\varepsilon }_{h,rup}\) refers to the FRP ultimate hoop strain.

ABAQUS offers several models for defining the behavior of elastic–plastic materials such as confined concrete. In the current study, the modified Drucker-Prager yield criterion with the Drucker-Prager hardening sub-option offered by ABAQUS was utilized to define the nonlinear plasticity behavior of the confined concrete. The extended Drucker-Prager model can be used in conjunction with the elastic behavior model to define the plasticity of materials by allowing for simultaneous inelastic dilation (volume increase)94.

A bilinear elastic–plastic material model with isotropic strain hardening was used to define the behavior of the inner elliptical steel tube. The bilinear elastic–plastic steel material has two regions in the stress–strain curve, the linear elastic region and the plastic region, as proposed by Han and Huo95. For the linear elastic behavior, the elastic modulus (E) and Poisson’s ratio (ν) were defined. The plastic behavior was defined using the ultimate strength (fu) with the associated plastic strain of the material. The tensile stress–strain relation was drawn using the expressions given in Eqs. (7)–(9).

where \({\sigma }_{i}\), \({E}_{s}\), \({\varepsilon }_{s}\), \({\varepsilon }_{sy}\) are the tensile stress (MPa), elastic modulus (MPa), strain (mm/mm), and yield strain (mm/mm), respectively.

The most crucial aspect of modeling such columns using the finite element technique is the validation and calibration of the proposed model. Generally, a wide range of datasets from experimentally investigated columns is used to verify the developed model. However, for the hybrid elliptical DSTC section chosen in this study, it was not possible to compile an extensive dataset. This section compares the structural behavior of hybrid elliptical DSTCs, specifically their ultimate axial capacities and load–deflection responses predicted by the FE models, with selected experimental results reported by Chen et al.62. Chen et al.62 conducted laboratory tests on eight hybrid elliptical DSTCs. The sectional dimensions of the outer elliptical concrete are 250 mm × 125 mm (\(2a\times 2b\)). Similarly, the sectional dimensions of the inner steel tube are 176 mm × 88 mm × 6 mm (\({2a}_{s}\times {2b}_{s}\times {t}_{s}\)). The average cubic compressive strength of the infilled concrete was 72.4 MPa. The column height (H) is 500 mm. The yield strengths of the outer and inner steel tubes are presented in Table 1.

Table 1 presents the ultimate axial strength values from both experimental testing and finite element (FE) analysis, where \({P}_{u,FE}\) represents the predicted ultimate axial capacity from the FE models and \({P}_{u,exp}\) represents the ultimate axial capacity from experimental testing. The mean value (μ) of the \({P}_{u,exp}/{P}_{u,FE}\) ratio is 1.008, with a coefficient of variation (CoV) of 0.021. The superior performance of this FE model is likely due to its consideration of the confinement effect induced by the inner steel tube.

Additionally, Fig. 4 illustrates a comparison of the axial load versus axial displacement curves derived from FE analyses with those obtained from experimental measurements. The parameters of the specimens depicted in the figure are provided in Table 1. In general, the FE models effectively predict the initial stiffness of the specimens. The discrepancy between FE model predictions and experimental results can be attributed to several factors. Firstly, the boundary conditions in FE models may not accurately replicate those in experiments, potentially influencing the observed behavior. Additionally, variations in material properties among experimental specimens, such as concrete compressive strength and steel yield strength, are not fully accounted for by the FE models, which typically use average material properties. Moreover, experimental variability due to factors such as fabrication tolerances, measurement errors, and environmental conditions further contributes to discrepancies88,96.

Validation of FEM models with experimental tests of Chen et al. 62.

Within the vast domain of Artificial Intelligence (AI), Machine Learning (ML) and Deep Learning (DL) are established as progressive subsets that delve deeper into the capabilities of automated data processing and pattern recognition97. ML represents a significant advancement in AI, where algorithms learn from data and make predictions98. ML models identify trends and patterns through iterative learning from input data, improving with experience akin to human knowledge. These models excel at handling structured datasets and performing various tasks, from classification to regression, without specific programming for each task’s nuances. DL, a subset of ML, draws inspiration from the human brain’s information-processing patterns99. It utilizes neural networks with multiple layers (hence “deep”) that process data in intricate ways. DL is particularly effective for handling vast volumes of data, recognizing patterns, and making decisions100. It is ideally suited for tasks such as image recognition, speech recognition, and natural language processing, where the data is abundant and complex.

AI’s architecture can be visualized as concentric circles, as depicted in Fig. 5, with AI as the broadest category encompassing all computational intelligence forms. ML nests within AI, indicating it is a method to achieve AI through automated data interpretation. DL, in turn, is a specialized segment within ML, representing a refined approach that models high-level abstractions in data through complex neural network architectures. Understanding the relationships between these subsets is crucial for advancements in fields such as predictive modelling for structural engineering. Here, the focus is on predicting the first peak and failure loads using these technologies. ML lays the foundation with predictive algorithms, while DL adds the necessary sophistication for capturing structural data’s complex, non-linear relationships101. This hierarchical structure highlights the evolution of AI from simple task automation to sophisticated problem-solving capabilities.

Visual representation of AI, ML, and DL.

In this study, five ML models, i.e., Gene Expression Programming (GEP), Artificial Neural Network (ANN), Random Forest (RF), Adaptive Boosting (ADB), and eXtreme Gradient Boosting (XGBoost), and one DL model (i.e., Deep Neural Network (DNN)) were developed using the Python programming environment within the Anaconda software. The ensemble techniques boost the performance metrics of predictive models, notably diminishing error rates and yielding higher correlations between predicted and actual values. The improvement in the models’ performance can be credited to the ensemble’s ability to mitigate issues such as underfitting, overfitting, or a mismatch between the model and the dataset. The DNN is designed to capture complex patterns and relationships in the data through multiple layers of interconnected neurons. The DNN architecture comprised several hidden layers, each consisting of numerous neurons that apply non-linear transformations to the input data. This architecture allows the model to learn high-level abstractions from raw data, making it particularly powerful for regression tasks.

The adopted models were developed to estimate the ultimate load and the ultimate strain of hybrid elliptical DSTCs using a final dataset of 112 data points with five input features: the area of the steel tube (As, mm2), the yield strength of the steel tube (fy, MPa), the area of the concrete core (Ac, mm2), the unconfined concrete strength (fc, MPa), and the total thickness of FRP multiplied by its elastic modulus (tf · Ef, mm·MPa). These variables were designated as Input1 (X1), Input2 (X2), Input3 (X3), Input4 (X4), and Input5 (X5), respectively. The ultimate load (Pu, kN) and ultimate strain (eu, mm/mm) of hybrid elliptical DSTCs were considered as output variables and denoted as Y1 and Y2, respectively.

A comprehensive statistical summary was compiled for each feature to provide a foundational understanding of the dataset utilized in this study. This summary encapsulates central tendency measures (mean, median), dispersion metrics (standard deviation, variance), and shape statistics (skewness, kurtosis), which are instrumental in preliminary data inspection and subsequent analytical phases. Table 2 presents these descriptive statistics for the variables in the dataset.
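As a minimal sketch of how Table 2-style statistics can be reproduced, assuming the 112 FE data points are stored in a CSV file (the file name below is hypothetical) with columns named X1–X5, Y1, and Y2 as defined above, pandas provides the required central tendency, dispersion, and shape measures directly:

```python
import pandas as pd

# Hypothetical file holding the 112 FE data points (columns X1..X5, Y1, Y2)
df = pd.read_csv("dstc_fe_dataset.csv")

# Central tendency, dispersion, and shape statistics (cf. Table 2)
summary = df.describe().T
summary["median"] = df.median()
summary["variance"] = df.var()
summary["skewness"] = df.skew()
summary["kurtosis"] = df.kurt()  # excess kurtosis (0 for a normal distribution)
print(summary[["mean", "median", "std", "variance", "skewness", "kurtosis", "min", "max"]])
```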

X1 exhibits a significant range between its maximum and minimum values, indicating variability within the dataset. The standard deviation and variance reflect the spread of the values around the mean, which is centrally located between the maximum and minimum. The negative skewness indicates that the distribution is left-skewed, meaning the values are concentrated toward the higher end with a tail toward lower values, while the kurtosis suggests the distribution has lighter tails than a normal distribution. X2 also shows a notable range, with the standard deviation and variance indicating a moderate spread around the mean. The distribution is right-skewed, suggesting more values are concentrated on the lower side, and the kurtosis indicates lighter tails compared to a normal distribution. X3 has a wide range of values, and its standard deviation and variance demonstrate considerable variability around the mean. The skewness is positive, indicating a right-skewed distribution with more values on the lower end, while the kurtosis shows that the distribution has heavier tails than a normal distribution.

X4 exhibits a moderate range, with a standard deviation and variance indicating variability around the mean. The skewness suggests a slight right skew, while the kurtosis indicates that the distribution has lighter tails than a normal distribution. X5 shows a substantial range and significant variability around the mean, as indicated by its standard deviation and variance. The skewness is positive, indicating a slight right skew, and the kurtosis suggests lighter tails than a normal distribution. Y1 has a wide range of values, with a standard deviation and variance indicating a significant spread around the mean. The distribution is right-skewed, suggesting a concentration of lower values, and the kurtosis indicates heavier tails compared to a normal distribution, pointing to the presence of outliers or extreme values. Y2 shows a narrow range of values, with a standard deviation and variance reflecting minimal variability around the mean. The distribution is nearly symmetric, as indicated by the low skewness, and the kurtosis suggests lighter tails compared to a normal distribution. Overall, the descriptive statistics reveal the complexity of the dataset, showcasing key characteristics such as variability, central tendencies, and distribution patterns.

Examining histogram distributions provides an empirical data visualization, highlighting each variable’s frequency and distribution characteristics. Figure 6 shows Kernel Density Estimation (KDE) plots for all variables. Each plot visualizes the density distribution of the respective variable. For X1, the plot shows a bimodal distribution with two distinct peaks. The mean value is centered around 2381.56 mm2 with a standard deviation of 272.85 mm2. The distribution is left-skewed, as indicated by a skewness of -1.24, meaning more values are on the higher side. The kurtosis value of 0.49 suggests the distribution has lighter tails than a normal distribution, indicating fewer extreme values. For X2, the plot exhibits a bimodal distribution with two prominent peaks. The mean is around 432.39 MPa, with a standard deviation of 121.49 MPa. The distribution is right-skewed, shown by a skewness of 0.67, indicating more values are concentrated on the lower side. The kurtosis value of -1.56 suggests the distribution has lighter tails than a normal distribution, indicating a lower likelihood of outliers.

KDE plots of all features.

For X3, the plot shows a broad range of values with a single peak. The mean value is approximately 22,559.84 mm2, and the standard deviation is 9498.47 mm2, indicating significant variability. The skewness of 1.20 suggests a right-skewed distribution, with more values on the lower side. The kurtosis value of 0.94 indicates that the distribution has heavier tails than a normal distribution, pointing to outliers or extreme values. For X4, the plot represents a bimodal distribution with two distinct peaks. The mean value is around 49.13 MPa, with a standard deviation of 18.38 MPa. The skewness of 0.30 indicates a slight right skew, while the kurtosis value of -1.60 suggests the distribution has lighter tails than a normal distribution, meaning fewer extreme values. For X5, the plot represents a distribution with multiple peaks, indicating significant variability. The mean value is approximately 108,324.00 mm MPa, and the standard deviation is 48,202.50 mm MPa. The distribution is slightly right-skewed, as shown by a skewness of 0.58, with more values on the lower side. The kurtosis value of -0.64 suggests the distribution has lighter tails than a normal distribution.

For Y1, the plot shows a broad range of values with a single peak. The mean value is around 3144.08 kN, with a standard deviation of 1253.10 kN, indicating significant spread. The skewness of 1.41 suggests a right-skewed distribution with lower values. The kurtosis value of 2.45 indicates the distribution has heavier tails than a normal distribution, pointing to outliers. For Y2, the plot shows a narrow range of values with a distinct peak. The mean value is approximately 0.02 mm/mm, with a standard deviation of 0.002 mm/mm, indicating minimal variability. The skewness of 0.20 suggests a nearly symmetric distribution, while the kurtosis value of -0.39 indicates lighter tails than a normal distribution.
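A hedged sketch of how Fig. 6-style KDE panels can be generated with seaborn, reusing the DataFrame `df` from the earlier sketch; the grid layout and figure size are arbitrary choices:

```python
import matplotlib.pyplot as plt
import seaborn as sns

variables = ["X1", "X2", "X3", "X4", "X5", "Y1", "Y2"]
fig, axes = plt.subplots(2, 4, figsize=(16, 7))

# One KDE panel per variable, as in Fig. 6
for ax, col in zip(axes.ravel(), variables):
    sns.kdeplot(data=df, x=col, fill=True, ax=ax)
    ax.set_title(col)

axes.ravel()[-1].set_visible(False)  # hide the unused eighth panel
plt.tight_layout()
plt.show()
```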

Figure 7 presents a Pearson correlation heatmap between the input variables (X1 to X5) and the output variables (Y1 and Y2). Pearson correlation coefficients measure the linear relationship between two variables, with values ranging from -1, indicating a perfect negative correlation, to + 1, indicating a perfect positive correlation. In the heatmap, several key interrelationships are observed. X1 shows a moderate positive correlation with X3 (0.275) and Y1 (0.283), indicating that as X1 increases, both X3 and Y1 also tend to increase. This suggests a somewhat direct relationship among these variables. Conversely, X2 has no significant correlation with most of the other variables, suggesting independence within this dataset.

Pearson correlation heatmap of all features.

X3 stands out with a strong positive correlation with Y1 (0.727) and a moderate positive correlation with X1 (0.275). This implies that as X3 increases, Y1 and X1 also tend to increase, suggesting a close relationship between these variables. Meanwhile, X4 and X5 exhibit a strong positive correlation (0.709), indicating that these variables tend to increase together, reflecting a strong interdependence. Additionally, X5 shows a moderate positive correlation with Y1 (0.495) and Y2 (0.445), indicating that as X5 increases, both Y1 and Y2 tend to increase. This suggests that X5 has a significant impact on both Y1 and Y2. Y1 correlates positively with X3 (0.727), X4 (0.551), and X5 (0.495), indicating that these variables tend to increase together, further highlighting their interconnection. Y2 presents an interesting pattern, with a positive correlation with X4 (0.222) and X5 (0.445) but a moderate negative correlation with X3 (-0.597). This indicates that as Y2 increases, X3 tends to decrease, while increases in X4 and X5 are associated with increases in Y2. This dual behavior suggests a complex relationship involving Y2, X4, X5, and X3.
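The Fig. 7 heatmap can be reproduced along the following lines, again reusing `df`; the color map is an arbitrary choice:

```python
import matplotlib.pyplot as plt
import seaborn as sns

# Pearson correlation matrix between the five inputs and the two outputs (cf. Fig. 7)
corr = df[["X1", "X2", "X3", "X4", "X5", "Y1", "Y2"]].corr(method="pearson")

plt.figure(figsize=(7, 6))
sns.heatmap(corr, annot=True, fmt=".3f", cmap="coolwarm", vmin=-1, vmax=1)
plt.title("Pearson correlation heatmap")
plt.tight_layout()
plt.show()
```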

The GEP model is an evolutionary algorithm that evolves computer models or programs. The GEP model was first developed by Ferreira 102. The model combines the advantages of genetic programming and algorithms, using fixed-length chromosomes to represent varying-length expressions. The model can create models that adapt to complex, non-linear problems, making it a powerful tool for tasks requiring symbolic regression or classification. The GEP model starts with a random set of chromosomes and is then evaluated for effectiveness. The best ones reproduce, creating new variations. This development, selection, and reproduction cycle continues, with some randomness added for diversity. The current study used GeneXproTools software to develop the GEP model 103.

The ANN model is capable of modelling both simple and complex systems, avoiding standard statistical assumptions71,72. The structure of an ANN encompasses seven essential elements: neurons or processing units, their activation state, each neuron’s output function, a network of weighted connections, a rule for activity propagation through these weights, a function to calculate new activation levels from inputs and current states, and a learning rule to update weights based on new experiences. This framework, as outlined by Flood and Kartam104, enables ANNs to adapt and learn from data inputs without the limitations of traditional statistical methods. During the training procedure of the ANN model, the input and output data are employed to alter the interconnections among the neurons. This adaptation is achieved through the implementation of a learning algorithm. The training process of an ANN can be dissected into two fundamental stages: the feed-forward phase and the back-propagation phase. In the initial stage, the signals entering the input layer traverse the hidden layer before reaching the output layer. The Levenberg–Marquardt algorithm is employed to reduce the overall error function, leading to adjustments in the weights. This approach is noted for its speed and effectiveness, surpassing traditional procedures105. In this study, the ANN model was developed using the neural network toolbox in MATLAB R2021a software106.

Firstly, the RF model creates several training sets from the original data using bootstrap sampling107. Each of these sets contains about two-thirds of the original data, leaving one-third out, known as the out-of-bag data. Then, for each training set, a regression tree is built; together, these trees make up a forest. During the growth of each tree, the attributes considered for splitting at each node are chosen from a random subset of the features, and each tree is grown up to a pre-defined maximum depth. This process ensures that each regression tree is different, improving the overall prediction power of the combined model. Once all the trees are trained, the algorithm averages their individual predictions to estimate the value for a new sample.
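As an illustrative sketch (not the authors’ exact script), an RF regressor for Y1 can be configured with the optimal hyperparameters later reported in Table 6; the 80/20 train/test split and random seed are assumptions:

```python
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

X = df[["X1", "X2", "X3", "X4", "X5"]]
y1 = df["Y1"]  # ultimate load Pu (kN)

# 80/20 split and random seed are illustrative assumptions
X_train, X_test, y_train, y_test = train_test_split(X, y1, test_size=0.2, random_state=42)

# n_estimators and max_depth follow the reported optimum for Y1 (Table 6)
rf = RandomForestRegressor(n_estimators=589, max_depth=28, oob_score=True, random_state=42)
rf.fit(X_train, y_train)
print("OOB R2:", rf.oob_score_, "Test R2:", rf.score(X_test, y_test))
```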

AdaBoost is a machine learning algorithm in the category of ensemble methods and is a well-known boosting algorithm for both classification and regression tasks68. Specifically, AdaBoost regression targets regression problems. Like other ensemble approaches, this method merges several simple models to form a more accurate and robust predictor. The core concept behind AdaBoost regression involves constructing a robust model by amalgamating several weak ones, where each new model focuses more on the data points that its predecessors inaccurately predicted. Through iterative training, AdaBoost regression fine-tunes weak models on a weighted version of the dataset, emphasizing previously mispredicted data points in each round. This process continues until a set number of models have been trained or a desired level of accuracy is reached.
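A corresponding AdaBoost sketch for Y1, reusing the split above; the n_estimators and learning_rate values follow the optimum reported in Table 6, while the default decision-tree weak learner is an assumption:

```python
from sklearn.ensemble import AdaBoostRegressor

# Hyperparameters follow the reported optimum for Y1 (Table 6);
# the default decision-tree weak learner is an assumption
adb = AdaBoostRegressor(n_estimators=1000, learning_rate=1.78, random_state=42)
adb.fit(X_train, y_train)
print("Test R2:", adb.score(X_test, y_test))
```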

XGBoost is a highly effective and versatile ML library that excels in handling large datasets and complex predictive modelling tasks108. Its ability to handle missing values, regularization techniques, and parallel processing capabilities make it a reliable choice for various applications. With its high accuracy, speed, and scalability, XGBoost is a popular choice for many industries, including finance, computer vision, and natural language processing107. Overall, XGBoost is a powerful tool for data scientists and machine learning engineers, offering a robust and efficient way to build predictive models that can drive business decisions and improve outcomes.
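An XGBoost sketch for Y1 in the same spirit, with the hyperparameters reported as optimal in Table 6:

```python
from xgboost import XGBRegressor

# Hyperparameters follow the reported optimum for Y1 (Table 6)
xgb = XGBRegressor(n_estimators=456, max_depth=3, learning_rate=0.253, random_state=42)
xgb.fit(X_train, y_train)
print("Test R2:", xgb.score(X_test, y_test))
```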

Hyperparameter tuning using Bayesian Optimization (BO) is a sophisticated method for optimizing ML models. Unlike brute force or grid search methods, BO intelligently explores the hyperparameter space by leveraging probabilistic models109. It begins with an initial set of hyperparameters and evaluates their performance using a chosen metric. Based on these results, BO updates its probabilistic model to predict which hyperparameters will yield better performance. It then selects new hyperparameters to evaluate, balancing exploring new regions with exploiting known good areas. This iterative process continues until satisfactory hyperparameters are found. By dynamically adapting to the model’s performance, BO offers a more efficient and practical approach to hyperparameter tuning, saving time and resources while maximizing model performance.

In this study, BO is conducted using fivefold cross-validation (CV) to evaluate model performance during the hyperparameter tuning process. This approach ensures a balance between computational efficiency and statistical reliability, making it a practical choice for optimizing complex machine learning models. The use of fivefold CV instead of the more traditional tenfold CV is a strategic decision aimed at reducing the computational cost of Bayesian Optimization without compromising the robustness of performance estimates. While tenfold CV is often preferred for its slightly more precise evaluation metrics, the additional computational overhead is substantial, particularly in resource-intensive optimization processes. By using 5 folds, the computational burden is reduced by nearly half, while still providing sufficient diversity and reliability in the training and validation splits. Based on previous studies, the difference in variance and bias of performance estimates between fivefold and tenfold CV is minimal, particularly for larger datasets where each fold still contains a representative subset of the data.
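A minimal sketch of BO with fivefold CV; scikit-optimize’s BayesSearchCV is assumed here as the interface (the study does not name a specific library), and the search bounds and number of iterations are illustrative:

```python
from skopt import BayesSearchCV
from skopt.space import Integer, Real
from xgboost import XGBRegressor

# Illustrative search space; bounds are assumptions
search_space = {
    "n_estimators": Integer(100, 1000),
    "max_depth": Integer(2, 30),
    "learning_rate": Real(0.01, 0.5, prior="log-uniform"),
}

opt = BayesSearchCV(
    estimator=XGBRegressor(random_state=42),
    search_spaces=search_space,
    n_iter=50,                     # number of BO evaluations (assumption)
    cv=5,                          # fivefold cross-validation, as adopted in the study
    scoring="neg_root_mean_squared_error",
    random_state=42,
)
opt.fit(X_train, y_train)
print("Best hyperparameters:", opt.best_params_)
```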

The DNN architecture for predicting the two outputs is meticulously designed to handle the complexity and non-linearity inherent in structural engineering data. As depicted in Fig. 8, the model’s architecture is crafted to identify intricate patterns within the data, enabling precise and accurate predictions. This sophisticated design ensures the DNN can effectively capture and model the relationships between input features and the desired outputs.

DNN model architecture with all the layers.

In this study, the input layer of the DNN model is designed to receive the pre-processed features from the dataset. Given the number of features, the input layer includes one neuron per feature, resulting in 5 neurons corresponding to the 5 input features (X1 to X5). The hidden part of the DNN model consists of multiple hidden layers, each containing a specific number of neurons. A common practice is to start with a larger number of neurons in the initial hidden layer and gradually reduce the number in subsequent layers. However, the architecture can also include complex patterns such as ‘bottleneck’ layers to aid in feature compression and abstraction. The first hidden layer may contain twice the number of neurons as the input features to capture the non-linear relationships adequately. Subsequent hidden layers may gradually reduce the neuron count to funnel the network towards its output. Each neuron in these layers is connected to every neuron in the preceding and subsequent layers, forming a dense network. The DNN model is exposed to the data in batches during its training stage. After each batch, the model’s weights are updated to minimize the loss function. The model undergoes a predefined number of epochs, where one epoch corresponds to the network seeing the entire dataset once. In addition, the performance of the model is regularly assessed on the test dataset not seen by the model during training. This helps monitor for overfitting and guides hyperparameter tuning.
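A hedged Keras sketch of such an architecture; the layer widths are assumptions chosen to follow the described pattern (first hidden layer about twice the input size, then tapering toward the two outputs), not the tuned configuration:

```python
from tensorflow import keras
from tensorflow.keras import layers

# Layer widths are illustrative; the first hidden layer is roughly twice the
# input dimension and later layers taper toward the two outputs
model = keras.Sequential([
    layers.Input(shape=(5,)),          # X1..X5
    layers.Dense(10, activation="relu"),
    layers.Dense(8, activation="relu"),
    layers.Dense(4, activation="relu"),
    layers.Dense(2),                   # Y1 (Pu) and Y2 (eu)
])

# MSE loss with the Adam optimizer, as described later in the text
model.compile(optimizer=keras.optimizers.Adam(learning_rate=1e-3), loss="mse", metrics=["mae"])
model.summary()
```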

In this DNN model, the backpropagation technique is utilized to optimize weights and biases through the gradient descent algorithm, which is fundamental for training neural networks. According to Tseng and Yun110, backpropagation is an iterative, gradient-based optimization process aimed at minimizing prediction errors. The algorithm involves two crucial stages: forward propagation and backward propagation. During forward propagation, input data traverses through the network’s interconnected layers, where each neuron processes inputs by applying a weighted sum followed by an activation function to generate an output. This output is then passed forward to the next layer, continuing this process until the final layer produces the model’s output. In the backward propagation stage, the error, defined as the difference between the predicted output and the actual target values, is calculated and propagated backward through the network. This process evaluates the contribution of each weight to the error, adjusting the weights to reduce it, thereby refining the model’s predictive capabilities.

The backpropagation algorithm updates the weights and biases in a DNN by applying the chain rule to compute the gradient of the loss function with respect to each weight and bias. The updates are made in the direction that reduces the loss and are proportional to the negative gradient. The general update equations for the weights and biases are given in Eq. (10) and Eq. (11), respectively.
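In the standard gradient-descent form implied by the definitions that follow, these updates can be written as

$$W_{ij}^{(l)} \leftarrow W_{ij}^{(l)} - \gamma \frac{\partial \mathcal{L}}{\partial W_{ij}^{(l)}} \quad (10)$$

$$b_{i}^{(l)} \leftarrow b_{i}^{(l)} - \gamma \frac{\partial \mathcal{L}}{\partial b_{i}^{(l)}} \quad (11)$$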

where \({W}_{ij}^{\left(l\right)}\) represents the weight from neuron \(j\) in layer \((l-1)\) to neuron \(i\) in layer \(l\), \({b}_{i}^{\left(l\right)}\) is the bias for neuron \(i\) in layer \(l\), \(\mathcal{L}\) is the loss function that measures the difference between the predicted output and the actual target values, \(\gamma\) is the learning rate, a hyperparameter that controls the size of the update step, \(\frac{\partial \mathcal{L}}{\partial {W}_{ij}^{\left(l\right)}}\) is the partial derivative of the loss function with respect to the weight \({W}_{ij}^{\left(l\right)}\), and \(\frac{\partial \mathcal{L}}{\partial {b}_{i}^{\left(l\right)}}\) is the partial derivative of the loss function with respect to the bias \({b}_{i}^{\left(l\right)}\).

During the backward pass of backpropagation, partial derivatives (gradients) are computed for each layer, starting from the output layer and moving backward through the network. Each neuron’s error signal is calculated using the derivative of the activation function and the error signals propagated from the subsequent layer. This error signal is crucial for determining the gradients of the loss function with respect to the weights and biases. These gradients are then used in a gradient descent step to adjust the weights and biases to minimize the loss function. This process is iteratively repeated for multiple epochs over the training dataset until the network’s performance converges or meets a predefined stopping criterion. Table 3 illustrates the steps of DNN model development.

A crucial component of the DNN is the activation function used within the neurons. In a DNN, the activation function is critical in dictating how a neural network transforms input into output, determining whether and to what extent a signal should proceed through the network. The non-linear transformation applied over the input signal allows the network to learn and perform more complex tasks beyond mere linear classification or regression. By introducing non-linearity, activation functions enable the network to understand and represent sophisticated phenomena in the data, such as the hierarchical or interactive effects between variables. Without activation functions, a DNN would essentially operate as a linear regression model, incapable of handling the intricacies of complex data patterns that are commonplace in real-world problems.

In this study, the activation functions considered in the development of the DNN model include: Rectified Linear Unit (ReLU), which is computationally efficient and reduces the vanishing gradient problem; Sigmoid, which outputs values between 0 and 1 for binary classification tasks; Tanh, which produces values from -1 to 1 and aids faster convergence due to zero-centered outputs; Softmax, which is used in multi-class classification to generate probability distributions; Softplus, a smooth approximation of ReLU; Softsign, a simpler alternative to Tanh; Scaled Exponential Linear Unit (SELU), which offers self-normalizing properties by preserving input mean and variance; Exponential Linear Unit (ELU), which helps mitigate the vanishing gradient problem; the Exponential function, used in specialized layers; and Swish, which combines the properties of ReLU and Sigmoid for smooth non-linearity.

Proper initialization of weights is crucial for the efficient training and performance of the neural network. In this study, the following initializers are considered. The `glorot_uniform` initializer, also known as Xavier uniform initializer, sets the initial weights by drawing samples from a uniform distribution. The range of this distribution is calculated based on the number of input and output units in the weight tensor, helping to maintain the variance of activations through the layers and preventing vanishing or exploding gradient problems. Similarly, the `glorot_normal` initializer, or Xavier normal initializer, uses a truncated normal distribution centered on zero with a standard deviation that accounts for the number of input and output units. The `he_normal` and `he_uniform` initializers are particularly suited for layers with ReLU activation functions. The `he_normal` initializer draws weights from a truncated normal distribution, while the `he_uniform` initializer uses a uniform distribution. Both are calculated based on the number of input units, ensuring efficient gradient flow through the network.

The `lecun_normal` and `lecun_uniform` initializers are designed for layers with scaled exponential linear units (SELU). The `lecun_normal` initializer draws samples from a truncated normal distribution, and the `lecun_uniform` initializer uses a uniform distribution. These initializers ensure the network maintains proper variance across layers, which is vital for deep networks. Other initializers include the `normal` and `random_normal` initializers, which draw weights from normal distributions, and the `random_uniform` initializer, which uses a uniform distribution. While straightforward, these initializers may not always be optimal compared to those that account for the network’s architecture. The `orthogonal` initializer sets weights to be orthogonal, preserving gradient flow in recurrent neural networks, and the `identity` initializer sets weights to the identity matrix, primarily for recurrent layers.

Lastly, the `zeros` and `ones` initializers set all weights to zero or one, respectively. These are generally not recommended for hidden layers as they do not break symmetry, leading to neurons learning the same features. By allowing the selection of these initializers during hyperparameter tuning, the hyperparameter choice function provides flexibility in optimizing the model’s performance, ensuring efficient and practical training.
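A sketch of how these activation and initializer options can be exposed as tunable choices; KerasTuner’s hyperparameter API is assumed here as the choice function, and the listed candidate values are a subset shown for brevity:

```python
from tensorflow import keras
from tensorflow.keras import layers

def build_model(hp):
    """DNN whose activation, weight initializer, and width are tunable choices."""
    activation = hp.Choice("activation", ["relu", "tanh", "selu", "elu", "swish", "softplus"])
    initializer = hp.Choice("initializer", ["glorot_uniform", "glorot_normal",
                                            "he_normal", "he_uniform",
                                            "lecun_normal", "lecun_uniform"])
    units = hp.Int("units", min_value=8, max_value=64, step=8)

    model = keras.Sequential([
        layers.Input(shape=(5,)),
        layers.Dense(units, activation=activation, kernel_initializer=initializer),
        layers.Dense(units // 2, activation=activation, kernel_initializer=initializer),
        layers.Dense(2),
    ])
    model.compile(optimizer="adam", loss="mse")
    return model
```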

The model employs the Mean Squared Error (MSE) loss function to quantify the difference between predicted values and actual values. This choice of loss function is particularly suitable for regression tasks, where the goal is to minimize the squared differences between the predicted and observed outcomes. By doing so, MSE penalizes larger errors more heavily, encouraging the model to make more accurate predictions. Optimizers play a crucial role in this process by adjusting the network’s weights to minimize the loss function, thereby enhancing the model’s performance.

Adam, a widely used optimizer, combines the advantages of Root Mean Square Propagation (RMSprop) and Momentum. It adapts the learning rates for each parameter based on their first and second moments of the gradients, effectively balancing the speed and stability of the training process.

Another commonly used optimizer is Stochastic Gradient Descent (SGD), which updates parameters using the gradient of the loss function. SGD can be used with or without momentum, where momentum helps accelerate the gradient vectors in the right directions, leading to faster convergence.

RMSprop is another adaptive learning rate method that adjusts the learning rate for each parameter, making it particularly effective in non-stationary settings. AdamW, a variant of Adam, incorporates weight decay to improve regularization and prevent overfitting, offering a balance between adaptive learning rates and strong regularization. Adadelta extends Adagrad by dynamically adjusting the learning rate over time, reducing the aggressive, monotonically decreasing learning rate issue observed in Adagrad. Adagrad, which adapts learning rates based on the frequency of parameter updates, is particularly useful for dealing with sparse data.

Adamax, a variant of Adam, operates based on the infinity norm and is designed to be robust and stable. Nadam combines Adam’s adaptive learning rates with Nesterov’s accelerated gradients, aiming to achieve faster convergence by anticipating future gradients. Lastly, the Lion optimizer employs a combination of learning rate scaling and weight decay, promoting robust optimization by maintaining a balance between exploration and exploitation during training. Each of these optimizers offers unique advantages, making them suitable for different types of tasks and data characteristics, thereby providing flexibility in optimizing the neural network’s performance.

Hyperparameter tuning is a critical step in optimizing a DNN to improve its performance. In this study, the BO technique is utilized as the tuning method. This technique stands out for its efficiency in finding the best hyperparameters by building a probabilistic model of the function mapping the hyperparameter space to the objective, based on past evaluations. By using BO, one can efficiently navigate the hyperparameter space and converge on an optimal set that offers the best performance with respect to the defined evaluation metric. This method reduces the computational expense and time required compared to grid search or random search methods, especially when the hyperparameter space is large. The objective is to find hyperparameters that yield the most accurate predictions while maintaining the model’s ability to generalize to new, unseen data. BO takes a probabilistic view of the objective function (the MSE in this study) and uses that model to make decisions about which point in the hyperparameter space to try next. The general idea is to use a Gaussian Process (GP) to fit the observed data and then choose the next observation so as to improve the model. Table 4 shows the BO process for DNN hyperparameter tuning.
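A minimal sketch of the BO search for the DNN, assuming KerasTuner’s BayesianOptimization tuner and reusing the `build_model` function and data split from the earlier sketches (with a two-column target covering Y1 and Y2); trial counts, fit settings, and output paths are illustrative:

```python
import keras_tuner as kt

# Two-column target (Y1, Y2) for the two-output network
Y_train = df.loc[X_train.index, ["Y1", "Y2"]]

# GP-based Bayesian optimization over the choices defined in build_model
tuner = kt.BayesianOptimization(
    build_model,
    objective="val_loss",        # validation MSE
    max_trials=40,               # number of BO trials (assumption)
    directory="dnn_tuning",      # hypothetical output directory
    project_name="dstc_dnn",
)
tuner.search(X_train, Y_train, validation_split=0.2, epochs=200, batch_size=16, verbose=0)
best_dnn = tuner.get_best_models(num_models=1)[0]
```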

Assessing the effectiveness of each model is essential to ensure the practicality and scientific reliability of the outcomes111. While training datasets help construct models, they only reveal how well the models fit the given data. Therefore, testing datasets are crucial for validating the models. Evaluation and comparison of models typically involve two main methods: visual and quantitative assessments. Visual methods include scatter plots, violin boxplots, and Taylor diagrams, which offer quick insights into prediction accuracy across various statistical measures such as the maximum, minimum, median, and quartiles112,113,114. Unlike quantitative metrics, which may not capture these aspects, visual methods provide rapid, engaging, and informative comparisons. However, they may lack detailed information about model performance115,116,117,118,119,120,121. As a result, eight quantitative metrics were utilized: correlation coefficient (r), determination coefficient (R2), Root Mean Square Error (RMSE), Mean Absolute Error (MAE), Mean Absolute Percentage Error (MAPE), Mean Bias Error (MBE), Cumulative Prediction Error Index (CPI), and Variance Accounted For (VAF). The equations for these performance metrics and their ideal values are listed in Table 5.
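Most of these metrics have closed-form definitions that can be evaluated directly; the sketch below implements the common ones with NumPy (the composite CPI from Table 5 is not reproduced here) and scores the fitted XGBoost sketch on the test split:

```python
import numpy as np

def evaluation_metrics(y_true, y_pred):
    """Common regression metrics; MBE is defined here as predicted minus actual."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    err = y_pred - y_true
    ss_res = np.sum(err ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return {
        "r": np.corrcoef(y_true, y_pred)[0, 1],
        "R2": 1.0 - ss_res / ss_tot,
        "RMSE": np.sqrt(np.mean(err ** 2)),
        "MAE": np.mean(np.abs(err)),
        "MAPE": 100.0 * np.mean(np.abs(err / y_true)),
        "MBE": np.mean(err),
        "VAF": 100.0 * (1.0 - np.var(y_true - y_pred) / np.var(y_true)),
    }

print(evaluation_metrics(y_test, xgb.predict(X_test)))
```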

To analyze the sensitivity and interpret ML models on both a global and a more detailed level, researchers use the SHAP approach, which draws on principles from cooperative game theory122. The SHAP method was employed to gauge the comparative impact of the input variables on the prediction process. As an advanced method within the realm of explainable AI, SHAP helps clarify the complex interactions between the input variables and the model predictions. It offers critical insights by identifying which features are most influential on predictions and how they modify the predicted results123. Equation (20) shows that the Shapley value \(\left({\phi }_{i}\right)\) for feature i is determined by calculating the average marginal contribution of that feature across all possible permutations of features. In this equation, N represents the set of all features, S represents a subset of features that excludes feature i, ∣S∣ denotes the cardinality of set S, v(S) represents the model’s prediction when only the features in set S are considered, and v(S ∪ \(\left\{i\right\}\)) represents the model’s prediction when feature i is added to set S124.
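In its standard cooperative-game form (cf. Eq. (20)), the Shapley value reads \(\phi_i=\sum_{S\subseteq N\setminus \{i\}}\frac{|S|!\,\left(|N|-|S|-1\right)!}{|N|!}\left[v(S\cup \{i\})-v(S)\right]\). In practice these values are computed with the shap library; a minimal sketch for the tree-based XGBoost model fitted above:

```python
import shap

# TreeExplainer suits the tree-based XGBoost model identified as the best performer
explainer = shap.TreeExplainer(xgb)
shap_values = explainer.shap_values(X)

# Global view: mean |SHAP value| per feature and its direction of influence
shap.summary_plot(shap_values, X, feature_names=["As", "fy", "Ac", "fc", "tf*Ef"])
```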

A PDP demonstrates the functional relationship between a limited set of input parameters and the model’s predictions. It illustrates the extent to which the predictions are influenced by the values of these specific input parameters125. Moreover, PDPs highlight the impact of each parameter on the predicted outcomes generated by a machine learning algorithm. As a global method, PDPs consider all instances within the dataset, thereby revealing the overall relationship between a feature and the forecasted outcome126. These plots offer valuable insights into the relative contribution of each input variable to the predicted outcome. Furthermore, one-dimensional PDPs (PDPs-1D) are utilized to depict the relationship between the predicted outcome and a single input parameter.
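One-dimensional PDPs can be generated directly from a fitted estimator; a minimal scikit-learn sketch for the XGBoost model above:

```python
import matplotlib.pyplot as plt
from sklearn.inspection import PartialDependenceDisplay

# One-dimensional PDPs for each input feature of the fitted XGBoost model
PartialDependenceDisplay.from_estimator(xgb, X, features=["X1", "X2", "X3", "X4", "X5"])
plt.tight_layout()
plt.show()
```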

To make the predictive model accessible and user-friendly, a GUI is developed using the Tkinter library in Python. Tkinter, a standard GUI library, offers a straightforward approach to creating interactive applications127. The development process starts with setting up the Python environment with Tkinter, which is included in the standard Python distribution. The interface design focuses on simplicity and user guidance, incorporating input fields for users to enter the required features, a button to trigger predictions, and labels or instructions for ease of use. The core functionality of the GUI involves creating input fields with Entry widgets, a Button widget to initiate the prediction process, and a Label or Text widget to display the predicted outputs. The trained predictive model is integrated into the GUI using a library such as pickle to load the model and make predictions based on user inputs. To ensure accessibility and collaboration, the GUI application is hosted on GitHub (https://github.com), which provides version control, collaboration features, and easy distribution.
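A minimal sketch of such a Tkinter interface is given below; the model file name and feature list are placeholders, and the published application on GitHub bundles the actual trained model.

```python
# Sketch of a Tkinter prediction GUI: Entry widgets for inputs, a Button to
# trigger prediction, and a Label for the result. "model.pkl" is a
# placeholder path to a pickled, trained model.
import pickle
import tkinter as tk

with open("model.pkl", "rb") as f:        # placeholder model file
    model = pickle.load(f)

FEATURES = ["X1", "X2", "X3", "X4", "X5"]

root = tk.Tk()
root.title("Axial capacity predictor")

entries = {}
for row, name in enumerate(FEATURES):
    tk.Label(root, text=name).grid(row=row, column=0, padx=5, pady=2)
    entry = tk.Entry(root)
    entry.grid(row=row, column=1, padx=5, pady=2)
    entries[name] = entry

result = tk.Label(root, text="Prediction: -")
result.grid(row=len(FEATURES) + 1, column=0, columnspan=2, pady=5)

def predict():
    x = [[float(entries[name].get()) for name in FEATURES]]
    result.config(text=f"Prediction: {model.predict(x)}")

tk.Button(root, text="Predict", command=predict).grid(
    row=len(FEATURES), column=0, columnspan=2, pady=5)

root.mainloop()
```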

Table 6 provides a detailed overview of the optimized hyperparameters for the adopted ML models to predict Y1 and Y2. The table presents optimal hyperparameter values for various models when predicting Y1 and Y2. For the GEP model, when predicting Y1, the optimal values include having 100 chromosomes, 3 genes, a head size of 8, a tail size of 9, employing the strategy of optimal evolution, using multiplication as the linking function, and incorporating a function set consisting of addition, subtraction, multiplication, and division. For predicting Y2 with GEP, the settings are similar with 100 chromosomes and 3 genes, but the tail size is increased to 25, the linking function changes to addition, and the function set is expanded to include a variety of operations such as square root, absolute value, and additional mathematical combinations. In the case of ANN, for both Y1 and Y2, the optimal structure involves having a single hidden layer with 10 neurons, utilizing the sigmoid activation function, and employing the Levenberg–Marquardt estimation method.

For the RF model, the prediction of Y1 is optimized with 589 estimators and a maximum depth of 28, whereas for Y2, fewer estimators (228) and a shallower depth (8) are preferred. The optimal settings of the ADB model for Y1 include 1000 estimators and a learning rate of 1.78, while for Y2, these values are adjusted to 502 estimators and a learning rate of 1.82. Lastly, the XGBoost model for Y1 achieves optimal performance with 456 estimators, a maximum depth of 3, and a learning rate of 0.253. For Y2, the model is fine-tuned to use 599 estimators, a maximum depth of 5, and a learning rate of 0.201. This detailed description underscores the tailored hyperparameter adjustments necessary for each model to optimize performance for Y1 and Y2.
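For illustration, the ensemble models with the Y1 hyperparameters reported above could be instantiated as sketched below, assuming scikit-learn's RandomForestRegressor and AdaBoostRegressor and the xgboost package's XGBRegressor; remaining arguments are library defaults and the random seed is arbitrary.

```python
# Sketch of the RF, ADB, and XGBoost regressors with the reported Y1
# hyperparameters; unspecified arguments are left at library defaults.
from sklearn.ensemble import RandomForestRegressor, AdaBoostRegressor
from xgboost import XGBRegressor

rf_y1 = RandomForestRegressor(n_estimators=589, max_depth=28, random_state=0)
adb_y1 = AdaBoostRegressor(n_estimators=1000, learning_rate=1.78, random_state=0)
xgb_y1 = XGBRegressor(n_estimators=456, max_depth=3, learning_rate=0.253,
                      random_state=0)

# For Y2 (reported values): RandomForestRegressor(n_estimators=228, max_depth=8),
# AdaBoostRegressor(n_estimators=502, learning_rate=1.82),
# XGBRegressor(n_estimators=599, max_depth=5, learning_rate=0.201)
```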

For Y1, the developed equation (Eq. 21) from the GEP model can be expressed as follows:

where d0, d1, d2, d3, and d4 denote X1, X2, X3, X4, and X5, respectively. The numerical constants in the first gene, c2 and c7, were -238.78 and 322.39, respectively. Similarly, the constants in the second gene, c1, c4, c6, and c7, were 79.69, -7.72, -2.02, and 536,960.90, respectively. Likewise, c1 and c2 in the third gene were 2562.40 and 13.18, respectively.

Figure 9a shows the comparison between the actual and predicted values of Y1 using the GEP model. For the training dataset, the model shows a very high positive correlation between actual and predicted values, explaining a significant percentage of the variance in the actual values. The model’s errors, as indicated by RMSE and MAE, are relatively low, and the MAPE shows that the average percentage error is minimal. The MBE indicates a slight underestimation bias, while the CPI and VAF reflect the model’s high accuracy. For the testing dataset, the model still shows a high positive correlation, though slightly lower than for the training data. The RMSE and MAE are higher for the testing data, indicating larger errors compared to the training data. The MAPE shows a moderate increase in percentage error, and the MBE indicates a more substantial underestimation bias. However, the CPI and VAF still demonstrate that the model retains good predictive performance and explains a significant portion of the variance in the actual values.

Scatter plots showing the comparison between the actual and predicted values based on the GEP model for (a) Y1 and (b) Y2.

For Y2, the developed equation (Eq. 22) from the GEP model can be expressed as follows:

where d0, d1, d2, d3, and d4 denote X1, X2, X3, X4, and X5, respectively. The numerical constants in the 1st gene, c1, c2, c3, and c7, were -9.19, 23.47, -2.37, and 95.19, respectively. In the 3rd gene, the values of constants c2, c5, and c8 were -0.441, -2.22, and 382.82, respectively. Notably, there were no constants in the 2nd gene.

Figure 9b shows the comparison between the actual and predicted values of Y2 using the GEP model. For the training dataset, the model shows a moderate positive correlation between actual and predicted values, explaining a reasonable percentage of the variance in the actual values. The model’s errors, as indicated by RMSE and MAE, are relatively low, suggesting that the prediction errors are small on average. The MAPE demonstrates that the average percentage error is moderate, and the MBE is close to zero, showing minimal bias in the predictions. The CPI and VAF reflect the model’s performance, indicating moderate accuracy. For the testing dataset, the model maintains a similar level of performance, with a positive correlation between actual and predicted values. However, the correlation is slightly lower than that of the training data. The RMSE and MAE are consistent with those of the training data, indicating stable prediction errors. The MAPE shows a slight increase in percentage error, and the MBE remains close to zero, indicating minimal bias. The CPI and VAF values are slightly lower for the testing data, suggesting a slight decrease in predictive performance but still demonstrating reasonable accuracy.

Equation (23) was obtained using the developed ANN model to predict Y1 and Y2. It can be expressed as follows:

where x1 is the input layer matrix; w1 is the weight matrix of connections between the neurons of the input and hidden layer; b1 is the vector of weights of bias neurons at the hidden layer; b2 is the vector of weights of bias neurons at the output layer; and w2 is the weight matrix of connections between the hidden and output layer. The final results of the weights and biases of the ANN model are shown in Eqs. (24–27).
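With the notation defined above, a generic single-hidden-layer form consistent with Eq. (23), using the sigmoid activation reported in Table 6, can be written as:

\(y={w}_{2}\,f\left({w}_{1}{x}_{1}+{b}_{1}\right)+{b}_{2}\), where \(f\left(z\right)=1/\left(1+{e}^{-z}\right)\) is the sigmoid function applied element-wise at the hidden layer.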

Figure 10a shows the comparison between the actual and predicted values of Y1 using the ANN model. For the training dataset, the model shows a very high positive correlation between actual and predicted values, explaining a significant percentage of the variance in the actual values. The model’s errors, as indicated by RMSE and MAE, are relatively low, and the MAPE shows that the average percentage error is minimal. The MBE indicates a slight underestimation bias, while the CPI and VAF reflect the model’s high accuracy. For the testing dataset, the model still shows a high positive correlation, though slightly lower than for the training data. The RMSE and MAE are higher for the testing data, indicating larger errors compared to the training data. The MAPE shows a moderate increase in percentage error, and the MBE indicates a more substantial underestimation bias. However, the CPI and VAF still demonstrate that the model retains good predictive performance and explains a significant portion of the variance in the actual values. Compared to the GEP model, the ANN model shows slightly better performance in terms of lower RMSE and higher VAF, indicating a more accurate prediction capability.

Scatter plots showing the comparison between the actual and predicted values based on the ANN model for (a) Y1 and (b) Y2.

Figure 10b shows the comparison between the actual and predicted values of Y2 using the ANN model. For the training dataset, the model demonstrates a high positive correlation between actual and predicted values, explaining a substantial percentage of the variance in the actual values. The model’s errors, as indicated by RMSE and MAE, are relatively low, suggesting small average prediction errors. The MAPE demonstrates that the average percentage error is moderate, and the MBE is zero, showing no bias in the predictions. The CPI and VAF reflect good model performance, indicating high accuracy. For the testing dataset, the model maintains a positive correlation between actual and predicted values, although the correlation is lower than for the training data. The RMSE and MAE remain consistent with those of the training data, indicating stable prediction errors. The MAPE shows a slight increase in percentage error, and the MBE remains zero, indicating no bias. The CPI and VAF values are slightly lower for the testing data, suggesting a slight decrease in predictive performance but still demonstrating reasonable accuracy.

Compared to the GEP model, the ANN model shows a higher positive correlation for both training and testing datasets, indicating better predictive capability. The ANN model has slightly lower RMSE and MAE values, suggesting smaller prediction errors on average. The MAPE values are also lower for the ANN model, indicating better average percentage error. The ANN model’s CPI and VAF values are higher, reflecting better overall accuracy and variance explanation. Overall, the ANN model outperforms the GEP model, providing more accurate and reliable predictions.

Figure 11a shows the comparison between the actual and predicted values of Y1 using the RF model. For the training dataset, the model shows a very high positive correlation between actual and predicted values, explaining a significant percentage of the variance in the actual values. The model’s errors, as indicated by RMSE and MAE, are relatively low, and the MAPE shows that the average percentage error is minimal. The MBE indicates a slight underestimation bias, while CPI and VAF reflect the model’s high accuracy. For the testing dataset, the model still shows a high positive correlation, though lower than the training data. The RMSE and MAE are significantly higher for the testing data, indicating larger errors compared to the training data. The MAPE shows a noticeable increase in percentage error, and the MBE indicates a substantial underestimation bias. However, despite these increases in error metrics, the CPI and VAF demonstrate that the model retains a reasonable predictive performance and explains a substantial portion of the variance in the actual values.

Scatter plots showing the comparison between the actual and predicted values based on the RF model for (a) Y1 and (b) Y2.

Compared to the GEP and ANN models, the RF model shows a higher RMSE and MAE for the testing data, indicating larger prediction errors. The RF model’s performance metrics suggest that while it performs well on the training data, its generalization to the testing data is not as strong as the other models.

Figure 11b shows the comparison between the actual and predicted values of Y2 using the RF model. For the training dataset, the model demonstrates a very high positive correlation between actual and predicted values, explaining a substantial percentage of the variance in the actual values. The model’s errors, as indicated by RMSE and MAE, are extremely low, suggesting very small average prediction errors. The MAPE demonstrates that the average percentage error is minimal, and the MBE is zero, showing no bias in the predictions. The CPI and VAF reflect excellent model performance, indicating very high accuracy. For the testing dataset, the model maintains a high positive correlation between actual and predicted values, although the correlation is slightly lower than for the training data. The RMSE and MAE remain low, indicating stable prediction errors. The MAPE shows a small increase in percentage error, and the MBE remains zero, indicating no bias. The CPI and VAF values are slightly lower for the testing data, suggesting a slight decrease in predictive performance but still demonstrating high accuracy.

Compared to the GEP and ANN models, the RF model shows the highest positive correlation for both training and testing datasets, indicating superior predictive capability. The RF model has the lowest RMSE and MAE values, suggesting the smallest prediction errors on average. The MAPE values are also the lowest among the three models, indicating the best average percentage error. The RF model’s CPI and VAF values are the highest, reflecting the best overall accuracy and variance explanation. Overall, the RF model outperforms both the GEP and ANN models, providing the most accurate and reliable predictions.

Figure 12a shows the comparison between the actual and predicted values of Y1 using the ADB model. For the training dataset, the model shows a very high positive correlation between actual and predicted values, explaining a significant percentage of the variance in the actual values. The model’s errors, as indicated by RMSE and MAE, are relatively low, and the MAPE shows that the average percentage error is moderate. The MBE indicates an overestimation bias, while CPI and VAF reflect the model’s high accuracy. For the testing dataset, the model still shows a high positive correlation, though significantly lower than the training data. The RMSE and MAE are substantially higher for the testing data, indicating larger errors compared to the training data. The MAPE shows a significant increase in percentage error, and the MBE indicates a slight overestimation bias. Despite these increases in error metrics, the CPI and VAF demonstrate that the model retains reasonable predictive performance but with reduced accuracy compared to the training data. Compared to the GEP, ANN, and RF models, the ADB model shows higher RMSE and MAE for both training and testing data, indicating larger prediction errors. The performance metrics suggest that while the ADB model performs well on the training data, its generalization to the testing data is weaker, showing larger errors and lower accuracy than the other models.

Scatter plots showing the comparison between the actual and predicted values based on the ADB model for (a) Y1 and (b) Y2.

Figure 12b shows the comparison between the actual and predicted values of Y2 using the ADB model. For the training dataset, the model shows a very high positive correlation between actual and predicted values, explaining a significant percentage of the variance in the actual values. The model’s errors, as indicated by RMSE and MAE, are extremely low, suggesting very small average prediction errors. The MAPE demonstrates that the average percentage error is minimal, and the MBE is zero, showing no bias in the predictions. The CPI and VAF reflect excellent model performance, indicating very high accuracy. For the testing dataset, the model maintains a high positive correlation between actual and predicted values, although the correlation is lower than for the training data. The RMSE and MAE remain low, indicating stable prediction errors. The MAPE shows an increase in percentage error, and the MBE remains zero, indicating no bias. The CPI and VAF values are lower for the testing data, suggesting a decrease in predictive performance but still demonstrating good accuracy.

Compared to the GEP model, the ADB model shows a higher positive correlation for both training and testing datasets, indicating better predictive capability. The ADB model has lower RMSE and MAE values than the GEP model, suggesting smaller prediction errors on average. The MAPE values are also lower for the ADB model, indicating better average percentage error. The ADB model’s CPI and VAF values are higher, reflecting better overall accuracy and variance explanation. Compared to the ANN model, the ADB model shows a similar level of positive correlation for the training data but a slightly lower correlation for the testing data. The RMSE and MAE values for the ADB model are comparable to those of the ANN model, suggesting similar prediction errors. The MAPE values are slightly higher for the ADB model, indicating a larger average percentage error. The CPI and VAF values are slightly lower for the ADB model, reflecting slightly lower overall accuracy and variance explanation.

Compared to the RF model, the ADB model shows a lower positive correlation for both training and testing datasets, indicating lower predictive capability. The ADB model has higher RMSE and MAE values than the RF model, suggesting larger prediction errors on average. The MAPE values are also higher for the ADB model, indicating a larger average percentage error. The RF model’s CPI and VAF values are higher, reflecting better overall accuracy and variance explanation. Overall, the ADB model performs better than the GEP model but not as well as the ANN and RF models, providing accurate and reliable predictions but with slightly lower accuracy and higher errors compared to ANN and RF.

Figure 13a shows the comparison between the actual and predicted values of Y1 using the XGBoost model. For the training dataset, the model shows an almost perfect positive correlation between actual and predicted values, explaining nearly all the variance in the actual values. The model’s errors, as indicated by RMSE and MAE, are extremely low, and the MAPE shows that the average percentage error is minimal. The MBE is practically zero, indicating no bias in the predictions, while CPI and VAF reflect the model’s exceptionally high accuracy.

Scatter plots showing the comparison between the actual and predicted values based on the XGBoost model for (a) Y1 and (b) Y2.

For the testing dataset, the model still shows a very high positive correlation, though slightly lower than the training data. The RMSE and MAE are higher for the testing data compared to the training data, indicating larger errors, but they are still relatively low. The MAPE shows a small percentage error, and the MBE indicates a slight underestimation bias. Despite these increases, the CPI and VAF values remain high, demonstrating that the model retains excellent predictive performance and explains a substantial portion of the variance in the actual values.

Compared to the GEP, ANN, RF, and ADB models, the XGBoost model shows the lowest RMSE and MAE for both training and testing datasets, indicating the smallest prediction errors. The performance metrics suggest that the XGBoost model outperforms the other models in terms of accuracy and predictive capability, making it the most effective model in this comparison.

Figure 13b shows the comparison between the actual and predicted values of Y2 using the XGBoost model. For the training dataset, the model shows a very high positive correlation between actual and predicted values, explaining a significant percentage of the variance in the actual values. The model’s errors, as indicated by RMSE and MAE, are extremely low, suggesting very small average prediction errors. The MAPE indicates that the average percentage error is minimal, and the MBE is zero, showing no bias in the predictions. The CPI and VAF reflect excellent model performance, indicating very high accuracy. For the testing dataset, the model maintains a high positive correlation between actual and predicted values, although the correlation is slightly lower than for the training data. The RMSE and MAE remain low, indicating stable prediction errors. The MAPE shows an increase in percentage error, and the MBE remains zero, indicating no bias. The CPI and VAF values are lower for the testing data, suggesting a decrease in predictive performance but still demonstrating good accuracy.

Compared to the GEP model, the XGBoost model shows a much higher positive correlation for both training and testing datasets, indicating superior predictive capability. The XGBoost model has significantly lower RMSE and MAE values than the GEP model, suggesting smaller prediction errors on average. The MAPE values are also lower for the XGBoost model, indicating a better average percentage error. The XGBoost model’s CPI and VAF values are much higher, reflecting better overall accuracy and variance explanation. Compared to the ANN model, the XGBoost model demonstrates a higher positive correlation for both training and testing datasets, indicating better predictive capability. The RMSE and MAE values for the XGBoost model are lower than those of the ANN model, suggesting smaller prediction errors. The MAPE values are also lower for the XGBoost model, indicating a better average percentage error. The CPI and VAF values for the XGBoost model are higher, reflecting better overall accuracy and variance explanation.

In comparison to the RF model, the XGBoost model shows a slightly lower positive correlation for both training and testing datasets, but the differences are minimal. The RMSE and MAE values for the XGBoost model are comparable to those of the RF model, indicating similar prediction errors. The MAPE values are slightly higher for the XGBoost model, suggesting a larger average percentage error. The CPI and VAF values for the XGBoost model are lower than those of the RF model, reflecting slightly lower overall accuracy and variance explanation. Compared to the ADB model, the XGBoost model shows a higher positive correlation for both training and testing datasets, indicating better predictive capability. The XGBoost model has lower RMSE and MAE values than the ADB model, suggesting smaller prediction errors on average. The MAPE values are also lower for the XGBoost model, indicating a better average percentage error. The XGBoost model’s CPI and VAF values are higher, reflecting better overall accuracy and variance explanation. Overall, the XGBoost model outperforms the GEP and ADB models and is comparable to the ANN and RF models, providing highly accurate and reliable predictions with low errors and high accuracy.

Figure 14 shows the training and validation loss curves for the deep learning model over 100 epochs. The X-axis represents the number of training epochs, where an epoch is one complete pass through the entire training dataset. The Y-axis represents the loss value, which is a measure of how well the model’s predictions match the actual target values. Lower loss values indicate better model performance. The blue line represents the training loss, which is the loss on the training dataset over each epoch. It shows how well the model is learning the training data. The red line represents the validation loss, which is the loss on the validation dataset over each epoch. This line shows how well the model generalizes to unseen data.

Learning curve of DNN model training and validation loss against epoch.

At the beginning (epoch 0), both the training and validation loss values are high, indicating that the model starts with poor performance. As the epochs progress, both the training and validation loss values decrease rapidly, indicating that the model is learning and improving its performance. After about 20 epochs, both the training and validation loss values begin to converge and decrease more slowly, indicating that the model is approaching its optimal performance.

The training and validation loss curves are very close to each other throughout the training process, which suggests minimal overfitting. This indicates that the model is performing well on both the training and validation datasets. The plot might indicate the epoch at which early stopping was applied. Early stopping is a technique to prevent overfitting by stopping the training process when the validation loss stops improving. The figure also indicates that the DNN is training effectively, as evidenced by the continuous decrease in both training and validation loss. The close alignment of the training and validation loss curves suggests that the model has good generalization capability and is not overfitting to the training data. This means that the model should perform well on unseen data, making it a reliable predictor.
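For context, early stopping in Keras is typically configured as in the brief sketch below; the patience value and validation split are illustrative, not necessarily those used in this study.

```python
# Illustrative early-stopping setup in Keras; patience and validation split
# are placeholder values.
from tensorflow.keras.callbacks import EarlyStopping

early_stop = EarlyStopping(monitor="val_loss", patience=10,
                           restore_best_weights=True)
# history = model.fit(X_train, y_train, validation_split=0.2,
#                     epochs=100, callbacks=[early_stop])
```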

The best developed DNN model utilized the AdamW optimizer, the ReLU activation function, a normal initializer, and an l2 regularizer with a value of 0.000448. These hyperparameters were found to work synergistically to create a robust and accurate model. The best architecture for the developed DNN model is configured as 5–1088–448–160–2. The DNN architecture consists of an input layer with the five features (X1–X5), followed by a first hidden dense layer with 1088 neurons, a second hidden dense layer with 448 neurons, a third hidden dense layer with 160 neurons, and an output dense layer with 2 neurons. Each dense layer is fully connected, meaning every neuron in one layer is connected to every neuron in the subsequent layer. This architecture transforms the input data through several layers of abstraction, producing the final output with the desired dimensionality.
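A sketch of this 5–1088–448–160–2 configuration in Keras is shown below, using the reported ReLU activations, normal kernel initializer, l2 regularization of 0.000448, and AdamW optimizer (available in recent TensorFlow releases); the learning rate is a placeholder.

```python
# Sketch of the reported DNN configuration (5-1088-448-160-2) in Keras;
# the learning rate is a placeholder, other settings follow the text.
import tensorflow as tf

reg = tf.keras.regularizers.l2(0.000448)
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(5,)),                                   # X1-X5
    tf.keras.layers.Dense(1088, activation="relu",
                          kernel_initializer="random_normal", kernel_regularizer=reg),
    tf.keras.layers.Dense(448, activation="relu",
                          kernel_initializer="random_normal", kernel_regularizer=reg),
    tf.keras.layers.Dense(160, activation="relu",
                          kernel_initializer="random_normal", kernel_regularizer=reg),
    tf.keras.layers.Dense(2),                                            # Y1 and Y2
])
model.compile(optimizer=tf.keras.optimizers.AdamW(learning_rate=1e-3),
              loss="mse")
model.summary()
```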

Figure 15 shows the architecture of the developed DNN model. The input layer, dense_7, is a fully connected (dense) layer with an input shape of (None, 5), meaning each input sample has 5 features. The ‘None’ dimension represents the batch size, which can be any number of samples. The output shape of this layer is (None, 1088), indicating that this layer has 1088 neurons, and each input sample will be transformed into a 1088-dimensional output vector.

Best-developed DNN model architecture.

The first hidden layer, dense_8, is also a fully connected layer with an input shape of (None, 1088), where the input to this layer is the output from the previous layer. The output shape of this layer is (None, 448), meaning this layer has 448 neurons, transforming each 1088-dimensional input vector into a 448-dimensional output vector. The second hidden layer, dense_9, is another fully connected layer with an input shape of (None, 448). The input to this layer is the output from the previous layer, which is a vector of size 448. The output shape of this layer is (None, 160), indicating that this layer has 160 neurons, transforming each 448-dimensional input vector into a 160-dimensional output vector. The output layer, dense_10, is a fully connected layer with an input shape of (None, 160). The input to this layer is the output from the previous layer, which is a vector of size 160. The output shape of this layer is (None, 2), indicating that this layer has 2 neurons, transforming each 160-dimensional input vector into a 2-dimensional output vector. This is the final output layer to predict the two outputs.

Figure 16a shows the comparison between the actual and predicted values of Y1 using the DNN model. For the training dataset, the model shows a very high positive correlation between actual and predicted values, explaining a significant percentage of the variance in the actual values. The model’s errors, as indicated by RMSE and MAE, are relatively low, and the MAPE shows that the average percentage error is minimal. The MBE indicates a slight overestimation bias, while CPI and VAF reflect the model’s high accuracy. For the testing dataset, the model still shows a high positive correlation, though slightly lower than the training data. The RMSE and MAE are higher for the testing data compared to the training data, indicating larger errors. The MAPE shows a moderate increase in percentage error, and the MBE indicates a slight underestimation bias. Despite these increases, the CPI and VAF values remain high, demonstrating that the model retains good predictive performance and explains a substantial portion of the variance in the actual values. Compared to the GEP, ANN, RF, ADB, and XGBoost models, the DNN model shows competitive performance with relatively low RMSE and MAE for both training and testing datasets. The performance metrics suggest that the DNN model performs well in terms of accuracy and predictive capability, making it a strong contender among the models compared.

Scatter plots showing the comparison between the actual and predicted values based on the DNN model for (a) Y1 and (b) Y2.

Figure 16b shows the comparison between the actual and predicted values of Y2 using the DNN model. For the training dataset, the model shows a very high positive correlation between actual and predicted values, explaining a substantial percentage of the variance in the actual values. The model’s errors, as indicated by RMSE and MAE, are low, suggesting small average prediction errors. The MAPE indicates that the average percentage error is minimal, and the MBE is zero, showing no bias in the predictions. The CPI and VAF reflect good model performance, indicating high accuracy. For the testing dataset, the model maintains a positive correlation between actual and predicted values, though the correlation is lower than for the training data. The RMSE and MAE are higher for the testing data, indicating larger prediction errors. The MAPE shows an increase in percentage error, and the MBE remains zero, indicating no bias. The CPI and VAF values are lower for the testing data, suggesting a decrease in predictive performance but still demonstrating reasonable accuracy.

Compared to the GEP model, the DNN model shows a higher positive correlation for both training and testing datasets, indicating better predictive capability. The DNN model has lower RMSE and MAE values than the GEP model, suggesting smaller prediction errors on average. The MAPE values are also lower for the DNN model, indicating a better average percentage error. The DNN model’s CPI and VAF values are higher, reflecting better overall accuracy and variance explanation. Compared to the ANN model, the DNN model demonstrates a similar level of positive correlation for the training data but a lower correlation for the testing data. The RMSE and MAE values for the DNN model are comparable to those of the ANN model for the training data but are higher for the testing data, suggesting larger prediction errors. The MAPE values are slightly higher for the DNN model, indicating a larger average percentage error. The CPI and VAF values are lower for the DNN model, reflecting slightly lower overall accuracy and variance explanation.

In comparison to the RF model, the DNN model shows a lower positive correlation for both training and testing datasets, indicating lower predictive capability. The DNN model has higher RMSE and MAE values than the RF model, suggesting larger prediction errors on average. The MAPE values are also higher for the DNN model, indicating a larger average percentage error. The RF model’s CPI and VAF values are higher, reflecting better overall accuracy and variance explanation. Compared to the ADB model, the DNN model shows a similar level of positive correlation for the training data but a lower correlation for the testing data. The RMSE and MAE values for the DNN model are comparable to those of the ADB model for the training data but are higher for the testing data, suggesting larger prediction errors. The MAPE values are slightly higher for the DNN model, indicating a larger average percentage error. The CPI and VAF values are lower for the DNN model, reflecting slightly lower overall accuracy and variance explanation.

Finally, compared to the XGBoost model, the DNN model shows a lower positive correlation for both training and testing datasets, indicating lower predictive capability. The DNN model has higher RMSE and MAE values than the XGBoost model, suggesting larger prediction errors on average. The MAPE values are also higher for the DNN model, indicating a larger average percentage error. The XGBoost model’s CPI and VAF values are higher, reflecting better overall accuracy and variance explanation. Overall, the DNN model performs better than the GEP model but not as well as the ANN, RF, ADB, and XGBoost models, providing accurate and reliable predictions but with slightly lower accuracy and higher errors compared to the other models.

Violin boxplots combine elements of boxplots and kernel density plots to provide a more detailed representation of the data distribution. When comparing actual and predicted values, they can show the spread and density of errors, giving insights into the variance and bias of the model’s predictions. Figure 17a shows the violin plot comparing the distributions of predicted values for different models to the actual testing dataset of Y1. The actual testing dataset of Y1 shows a relatively smooth and symmetrical distribution, indicating that the observed values are spread out in a balanced manner around a central value with some degree of variability. This distribution serves as the benchmark against which the models are evaluated. The GEP model’s violin plot shows a distribution that is quite similar to the actual data, suggesting that GEP captures the variability of the real observations well. It maintains a balanced spread, indicating that the model’s predictions are neither too spread out nor too concentrated around the mean. The ANN model’s violin plot displays a distribution that closely mirrors the actual data, with a slightly wider spread. This suggests that the ANN model accurately captures the central tendency of the actual data while also reflecting the variability observed in the real measurements.

Violin boxplots comparing actual and predicted values from the adopted models in the testing stage for (a) Y1 and (b) Y2.

The RF model’s violin plot is narrower compared to the actual data, indicating less variability in its predictions. This suggests that the RF model’s predictions are more concentrated around the mean value, which may result in underestimating the spread of the actual observations. The ADB model’s violin plot shows a wider distribution compared to the actual data, indicating that this model predicts a broader range of values. This suggests that the ADB model tends to produce more varied predictions, which might include more extreme values than those observed in the actual data. The XGBoost model’s violin plot demonstrates a distribution that is quite aligned with the actual data, though slightly more concentrated. This indicates that the XGBoost model captures the central tendency well, with a slight underestimation of the variability compared to the real observations. The DNN model’s violin plot shows a distribution that is similar in shape to the actual data but is slightly narrower. This suggests that the DNN model captures the central tendency effectively but underestimates the spread, indicating less variability in its predictions compared to the actual observations.

Figure 17b shows the violin plot comparing the distributions of predicted values for different models to the actual testing dataset of Y2. The actual testing dataset of Y2 demonstrates a relatively smooth and symmetrical distribution, indicating that the observed values are spread out in a balanced manner around a central value with some degree of variability. This distribution serves as the benchmark against which the models are evaluated. The GEP model’s violin plot displays a distribution that is quite similar to the actual data, suggesting that GEP captures the variability of the real observations well. It maintains a balanced spread, indicating that the model’s predictions are neither too spread out nor too concentrated around the mean. This suggests that the GEP model can effectively mimic the actual data distribution. The ANN model’s violin plot shows a distribution that closely mirrors the actual data, with a slightly wider spread. This indicates that the ANN model accurately captures the central tendency of the actual data while also reflecting the variability observed in the real measurements. This wider spread suggests that the ANN model is able to account for a broader range of values within the data.

The RF model’s violin plot appears narrower compared to the actual data, indicating less variability in its predictions. This suggests that the RF model’s predictions are more concentrated around the mean value, potentially underestimating the spread of the actual observations. This narrower distribution implies that the RF model might not capture the full variability of the data. The ADB model’s violin plot shows a wider distribution compared to the actual data, indicating that this model predicts a broader range of values. This suggests that the ADB model tends to produce more varied predictions, which might include more extreme values than those observed in the actual data. This wider spread indicates that the ADB model may be sensitive to capturing outliers or extreme values in the data. The XGBoost model’s violin plot demonstrates a distribution that is quite aligned with the actual data, though slightly more concentrated. This indicates that the XGBoost model captures the central tendency well, with a slight underestimation of the variability compared to the real observations. This suggests that while the XGBoost model is accurate in predicting the central values, it might miss some of the variability present in the actual data. The DNN model’s violin plot shows a distribution that is similar in shape to the actual data but is slightly narrower. This suggests that the DNN model captures the central tendency effectively but underestimates the spread, indicating less variability in its predictions compared to the actual observations. This narrower distribution suggests that the DNN model might not fully capture the variability of the data.
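Such violin comparisons can be produced with a few lines of matplotlib, as sketched below with synthetic placeholder data standing in for the actual and predicted testing values.

```python
# Sketch of violin plots comparing actual values with model predictions;
# all arrays here are synthetic placeholders.
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
actual = rng.normal(3000, 1200, 23)                      # placeholder testing values of Y1
predicted = {"GEP": actual + rng.normal(0, 300, 23),     # placeholder predictions
             "XGBoost": actual + rng.normal(0, 150, 23)}

data = [actual] + list(predicted.values())
labels = ["Actual"] + list(predicted)

fig, ax = plt.subplots()
ax.violinplot(data, showmedians=True)    # kernel density outline with median marker
ax.set_xticks(range(1, len(labels) + 1))
ax.set_xticklabels(labels)
ax.set_ylabel("Y1")
plt.show()
```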

Taylor diagrams are a specialized graphical representation that quantifies the similarity between actual and predicted values. These diagrams plot the correlation, the standard deviation, and the root mean square error of predictions on a single chart. This provides a comprehensive view of a model’s accuracy, variability, and overall performance compared to the actual observations. Figure 18a shows the Taylor diagram for Y1, evaluating the performance of the adopted models against the actual testing dataset. The GEP model had a standard deviation of 1264.59 and a correlation coefficient of 0.983. Its position in the diagram indicates that it follows the observed data pattern closely but with slightly higher variability. The centered Root Mean Square Deviation (RMSD) of 299.19 suggests a relatively small deviation from the actual values, making it a reasonably accurate model. The ANN model stands out with a high standard deviation of 1423.39 and a strong correlation coefficient of 0.995. This model captures the actual data’s variability well and has a very small centered RMSD of 136.59, indicating its predictions are very close to the actual values, making ANN a highly reliable model in this context.
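The three statistics summarized by a Taylor diagram can be computed directly, as in the sketch below; the centered RMSD obeys the identity \({E}^{{\prime}2}={\sigma }_{p}^{2}+{\sigma }_{o}^{2}-2{\sigma }_{p}{\sigma }_{o}r\), and the sample values are placeholders.

```python
# Sketch of the statistics a Taylor diagram summarizes: observed and
# predicted standard deviations, correlation, and centered RMSD.
import numpy as np

def taylor_stats(actual, predicted):
    actual, predicted = np.asarray(actual, float), np.asarray(predicted, float)
    std_o, std_p = actual.std(), predicted.std()
    r = np.corrcoef(actual, predicted)[0, 1]
    centered_rmsd = np.sqrt(np.mean(((predicted - predicted.mean())
                                     - (actual - actual.mean()))**2))
    return std_o, std_p, r, centered_rmsd

print(taylor_stats([10, 20, 30, 40], [12, 18, 33, 39]))   # placeholder values
```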

Taylor diagrams during the testing phase for (a) Y1 and (b) Y2.

The RF model shows a lower standard deviation of 1100.67 compared to the actual data, with a correlation coefficient of 0.950. Despite having a higher centered RMSD of 513.07, it captures the variability moderately well but has more deviations in its predictions compared to models like ANN. ADB exhibits the lowest standard deviation of 963.66, indicating less variability in its predictions. However, the correlation coefficient of 0.905 and a high centered RMSD of 688.55 suggest that while ADB’s predictions are less variable, they are less accurate and more distant from the actual data. XGBoost has a standard deviation of 1320.72 and a high correlation coefficient of 0.995, indicating strong alignment with the actual data’s variability and pattern. The centered RMSD of 175.95 shows that XGBoost’s predictions are very close to the actual values, making it one of the top-performing models. The DNN model, with a standard deviation of 1258.58 and a high correlation coefficient of 0.992, captures the variability and the pattern of the actual data well. The centered RMSD of 242.69 indicates that DNN’s predictions are also quite close to the actual values, marking it as another reliable model.

In conclusion, the Taylor diagram visually and quantitatively demonstrates that models like ANN and XGBoost are top performers due to their high correlation coefficients and low centered RMSD, indicating predictions closely matching the actual data. GEP and DNN also perform well, though with slightly higher RMSD. Models like RF and ADB, while capturing some variability, exhibit higher deviations from the actual values and, therefore, are less accurate in comparison to ANN and XGBoost.

Figure 18b shows the Taylor diagram for Y2, evaluating the performance of the adopted models against the actual testing dataset. The GEP model has a mean of 0.0192 and a standard deviation of 0.00145. The variance is 2.11E-6, with a centered RMS difference of 0.001. The correlation coefficient for GEP is 0.784, indicating a moderate correlation with the actual data. The ANN model has a mean of 0.0195 and a standard deviation of 0.0017. Its variance is 2.72E-6, and it has a centered RMS difference of 8.46E-4. The correlation coefficient is 0.878, suggesting a strong correlation with the actual data and a good predictive performance. The RF model has a mean of 0.0193 and a standard deviation of 0.0016. The variance is 2.43E-6, with a centered RMS difference of 5.39E-4. The correlation coefficient is 0.956, indicating a very high correlation and excellent predictive performance. The ADB model, marked by a blue star, shows a mean of 0.0194 and a standard deviation of 0.00147. The variance is 2.16E-6, with a centered RMS difference of 7.4E-4. The correlation coefficient for ADB is 0.912, demonstrating a strong correlation and good performance. The XGBoost model has a mean of 0.0193 and a standard deviation of 0.0015. Its variance is 2.28E-6, and the centered RMS difference is 5.59E-4. The correlation coefficient is 0.955, showing a very high correlation and excellent predictive performance. The DNN model has a mean of 0.0194 and a standard deviation of 0.00147. The variance is 2.16E-6, with a centered RMS difference of 8.41E-4. The correlation coefficient for DNN is 0.881, indicating a strong correlation and good predictive accuracy. Overall, the Taylor diagram reveals that the RF and XGBoost models perform the best, with the highest correlation coefficients and lower RMS differences, indicating their predictions are closest to the actual data. The ANN, ADB, and DNN models also show good performance but with slightly lower correlation coefficients and higher RMS differences. The GEP model, while still useful, has a lower correlation coefficient and higher RMS difference, indicating it is less accurate than the other models.

A rank analysis is performed to assess the overall performance of the adopted models based on their performance metrics. Table 7 illustrates the ranking of each model’s performance across each metric in predicting Y1. These scores were then summed to give an overall rank, with lower scores indicating better performance. The total rank for each model is the sum of its training and testing scores. Accordingly, the model with the lowest total rank is considered the most effective, whereas the model with the highest total rank is considered the least effective.
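The ranking procedure can be reproduced with a short pandas sketch, as below; the metric values are placeholders used only to show the mechanics of per-metric ranking and rank summation.

```python
# Sketch of the rank analysis: rank each model per metric (1 = best),
# then sum the ranks into a total score. Metric values are placeholders.
import pandas as pd

metrics = pd.DataFrame(
    {"RMSE": [300, 140, 510, 690, 180, 240],          # lower is better
     "R2":   [0.96, 0.99, 0.90, 0.82, 0.99, 0.98]},   # higher is better
    index=["GEP", "ANN", "RF", "ADB", "XGBoost", "DNN"])

ranks = pd.DataFrame({
    "RMSE": metrics["RMSE"].rank(ascending=True),
    "R2":   metrics["R2"].rank(ascending=False),
})
ranks["Total"] = ranks.sum(axis=1)
print(ranks.sort_values("Total"))   # lowest total score = most effective model
```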

The GEP model exhibited a relatively strong performance across the board, particularly during the training stage. In training, it ranked fifth or fourth in most metrics and performed particularly well in CPI and VAF. The total score of 71 reflects a consistent performance, ranking it fifth overall. The ANN model showed a varied performance between training and testing. During training, it performed exceptionally well in RMSE but lower in MAE and MAPE. In the testing stage, it ranked first or second in most metrics, demonstrating excellent generalization capabilities. The total score of 32 ranks it as the second-best model overall. The RF model demonstrated balanced and consistent performance across both stages. It ranked well in training for RMSE and CPI but was less strong in testing for MAE and MAPE. With a total score of 67, it is considered a reliable model with stable performance, ranking fourth overall. The ADB model had the lowest performance among all models, consistently ranking sixth in nearly every metric for both training and testing stages. The high total score of 92 indicates significant challenges in predicting Y1 accurately, placing it at the bottom of the ranking.

The XGBoost model showed outstanding performance, consistently ranking first in nearly all metrics during training and maintaining strong performance in the testing stage, especially in R2 and MAPE. The total score of 25 makes it the best-performing model overall, indicating its robustness and strong generalization ability. The DNN model displayed a mixed performance, ranking second or third in several training metrics but falling slightly in testing. Despite this, it showed strong generalization with solid performance in RMSE and MAPE during testing. The total score of 49 positions it as a competitive model, particularly in its ability to generalize well. In summary, the rank analysis highlights the XGBoost model as the best performer, followed by the ANN and DNN models. The GEP and RF models provide stable and reliable results, while the ADB model is the least effective in predicting Y1.

Table 8 illustrates the ranking of each model’s performance across each metric in predicting Y2. The GEP model exhibited the lowest performance in predicting Y2. It consistently ranked sixth across most metrics in both the training and testing stages. The only relatively better performance was in MBE during the test stage, where it ranked third. With a high total score of 88, it is the least effective model for predicting Y2, ranking sixth overall. The ANN model also showed low performance, ranking fifth across most metrics in both the training and testing stages. It consistently underperformed, similar to the GEP model. The total score of 82 places it fifth overall, indicating significant challenges in its predictive capabilities.

The RF model demonstrated the best performance among all models. It consistently ranked first in nearly all metrics during both training and testing stages, except for VAF in the training stage, where it ranked fourth. With a total score of 20, it is the top-performing model, showing strong predictive power and generalization ability. The ADB model exhibited moderate performance, ranking third across most metrics in both the training and testing stages. While not the best, it provided stable and reliable results. The total score of 50 ranks it third overall, indicating a decent balance between predictive accuracy and generalization. The XGBoost model showed strong performance, consistently ranking second across most metrics in both the training and testing stages. It excelled in maintaining low error rates and high predictive accuracy. The total score of 32 makes it the second-best model overall, showcasing its robustness and efficiency. The DNN model displayed mixed performance, ranking fourth in both the training and testing stages. While it performed relatively well in some metrics, it fell behind in others. The total score of 64 positions it fourth overall, indicating moderate effectiveness in predicting Y2. In summary, the rank analysis highlights the RF model as the best performer, followed by the XGBoost and ADB models. The DNN and ANN models provide moderate results, while the GEP model is the least effective in predicting Y2.

SHAP summary plots were created based on the best prediction performance of XGBoost and RF models for Y1 and Y2, respectively. Figure 19a is a SHAP summary bar plot, which illustrates the average impact of each feature on Y1. The x-axis represents the mean absolute SHAP value, indicating the average magnitude of the impact each feature has on the predictions for Y1. The y-axis lists the features in descending order of importance. X3 has the highest impact, significantly influencing Y1, followed by X4, X5, X2, and X1. This plot reveals that X3 contributes the most to Y1, while X1 has the least influence. Figure 19b is a SHAP summary dot plot, providing a more detailed view of the impact and importance of each feature on Y1. Each dot represents a single prediction, with the position along the x-axis indicating the SHAP value (impact on Y1) for that prediction. The color of the dots represents the feature value, ranging from blue (low value) to red (high value). This plot shows not only the magnitude of the impact each feature has on Y1 but also how this impact varies with different feature values. For example, higher values of X3 tend to have a larger positive impact on Y1. The SHAP summary dot plot for X3 shows that higher values of X3 significantly increase the predictions for Y1. The dots are mostly red on the positive side of the x-axis, indicating that high values of X3 lead to higher Y1. In contrast, the blue dots on the negative side show that low values of X3 reduce the predictions for Y1.

SHAP feature importance plots for Y1 based on the XGBoost model (a) summary and (b) bar plots.

For X4, the SHAP summary dot plot shows a similar trend. Higher values of X4 are associated with positive SHAP values, indicating an increase in Y1. Conversely, lower values of X4 correspond to negative SHAP values, decreasing Y1. In the case of X5, the plot reveals that both low and high values have significant impacts, but high values generally increase Y1, while low values tend to decrease it. The SHAP summary dot plot for X2 shows a balanced distribution with high values having a moderate positive impact on Y1, and low values having a negative impact. Lastly, X1 has the least influence on Y1, as seen in both the bar plot and the summary dot plot. The impact of X1 is smaller compared to other features, with high values slightly increasing Y1 and low values slightly decreasing it.
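For reproducibility, SHAP summary plots of this kind can be generated with the shap package as sketched below; the fitted XGBoost model and data are synthetic placeholders used only to demonstrate the plotting calls, not the study's trained model.

```python
# Sketch of SHAP bar and dot (beeswarm) summary plots for a tree ensemble;
# model and data are synthetic placeholders.
import numpy as np
import shap
from xgboost import XGBRegressor

rng = np.random.default_rng(0)
X = rng.random((112, 5))
y = 3 * X[:, 2] + X[:, 3] + rng.normal(scale=0.1, size=112)
feature_names = ["X1", "X2", "X3", "X4", "X5"]

model = XGBRegressor(n_estimators=200, max_depth=3).fit(X, y)

explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)

shap.summary_plot(shap_values, X, feature_names=feature_names, plot_type="bar")  # mean |SHAP| per feature
shap.summary_plot(shap_values, X, feature_names=feature_names)                   # dot/beeswarm plot
```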

Figure 20a is a SHAP summary bar plot, illustrating the average impact of each feature on Y2. The x-axis represents the mean absolute SHAP value, indicating the average magnitude of the impact each feature has on predictions for Y2. The y-axis lists the features in descending order of importance. X1 has the highest impact, significantly influencing Y2, followed by X5, X3, X2, and X4. This plot reveals that X1 contributes the most to Y2, while X4 has the least influence. Figure 20b is a SHAP summary dot plot, providing a more detailed view of the impact and importance of each feature on Y2. Each dot represents a single prediction, with the position along the x-axis indicating the SHAP value (impact on Y2) for that prediction. The color of the dots represents the feature value, ranging from blue (low value) to red (high value). This plot shows not only the magnitude of the impact each feature has on Y2 but also how this impact varies with different feature values. The SHAP summary dot plot for X1 shows that higher values of X1 significantly increase the predictions for Y2. The dots are mostly red on the positive side of the x-axis, indicating that high values of X1 lead to higher Y2. In contrast, the blue dots on the negative side show that low values of X1 reduce the predictions for Y2. For X5, the SHAP summary dot plot shows that higher values of X5 generally increase Y2, while lower values decrease it. The plot for X3 indicates a varied impact on Y2, with higher values showing a tendency to increase Y2, while lower values tend to have a lesser impact. The SHAP summary dot plot for X2 shows a balanced distribution with high values having a moderate positive impact on Y2, and low values having a negative impact. Lastly, X4 has the least influence on Y2, as seen in both the bar plot and the summary dot plot. The impact of X4 is smaller compared to other features, with high values slightly increasing Y2 and low values slightly decreasing it.

SHAP feature importance plots for Y2 based on the RF model (a) summary and (b) bar plots.

Figure 21 represents PDPs for the features X1 through X5, showing their impact on the output variable Y1. Each plot illustrates how changes in the value of a single feature influence the predicted outcome, Y1, while keeping other features constant. The plot for X1 indicates a relatively steady increase in Y1 as the value of X1 increases from around 1800 to 2600. This suggests that higher values of X1 generally lead to higher predicted values of Y1. The plot for X2 shows a similar upward trend, indicating that as X2 increases from approximately 350 to 600, Y1 also increases, albeit at a slightly slower rate compared to X1. The plot for X3 displays a more pronounced increase, with a noticeable jump in Y1 as X3 values rise from 20,000 to 40,000. This indicates that X3 has a significant positive impact on Y1, especially at higher values. The plot for X4 shows a steady increase in Y1 as X4 values rise from around 30 to 70, suggesting that higher X4 values are associated with higher Y1 predictions. Finally, the plot for X5 illustrates a consistent upward trend, with Y1 increasing as X5 values move from 50,000 to 200,000. This indicates that higher values of X5 are positively correlated with higher predicted values of Y1.

PDPs for Y1 based on the XGBoost model.

Overall, both the SHAP and PDP analyses agree on the relative importance of the features and their effects on Y1. X3, X4, and X5 are identified as the most influential features in both analyses, while X2 and X1 have comparatively lesser impacts. This consistency between the SHAP and PDP analyses reinforces the reliability of the model’s interpretation and the significance of these features in predicting Y1.

Figure 22 represents PDPs for the features X1 through X5, showing their impact on the output variable Y2. Each plot illustrates how changes in the value of a single feature influence the predicted outcome, Y2, while keeping other features constant. The plot for X1 shows a non-linear relationship with Y2. Initially, as X1 increases from around 1800 to 2400, Y2 also increases, indicating a positive impact. However, a sharp decline is observed after X1 reaches approximately 2400, suggesting a threshold beyond which increases in X1 lead to a decrease in Y2. The plot for X2 indicates a relatively flat relationship with Y2. There is a slight initial decrease in Y2 as X2 increases from 350 to 400, but after that, Y2 remains nearly constant as X2 continues to increase up to 600. This suggests that changes in X2 have a minimal impact on Y2 within this range.

PDPs for Y2 based on the RF model.

The plot for X3 shows a generally decreasing trend. As X3 increases from around 20,000 to 40,000, Y2 decreases, indicating that higher values of X3 tend to reduce Y2. This negative relationship suggests that X3 has a detrimental effect on Y2. The plot for X4 is relatively flat, indicating that changes in X4 have little to no impact on Y2. Y2 remains almost constant across the range of X4 values from 30 to 70, suggesting that X4 is not a significant predictor for Y2. The plot for X5 shows a positive relationship with Y2. As X5 increases from 50,000 to 200,000, Y2 also increases. This indicates that higher values of X5 lead to higher predictions for Y2, highlighting the positive impact of X5 on Y2.

Overall, both the SHAP and PDP analyses agree on the relative importance of the features and their effects on Y2. X1 and X5 are identified as the most influential features in both analyses, while X3 shows a notable but complex influence. X2 and X4 have comparatively lesser impacts, as indicated by both methods. This consistency reinforces the reliability of the model’s interpretation and the significance of these features in predicting Y2.

This section presents a significant advancement to meet the practical needs of engineers and designers in efficiently utilizing ML models. Although the complex processes of database assembly, model training, and validation have traditionally impeded the seamless integration of ML into everyday design tasks, an innovative solution has been developed. A Python web application has been created featuring a model with optimized hyperparameters accessible through an intuitive graphical user interface (GUI) built with the Tkinter package128. This GUI is specifically designed to predict outputs, as shown in Fig. 23.

GUI screenshot for predicting Y1 and Y2.

The GUI presents a streamlined layout where users can enter values for the input variables. Upon inputting these variables, both calculated outputs are dynamically displayed, thereby providing immediate and tangible insights into the structural capacity of the column under consideration. To facilitate wider access and foster collaborative improvements, the GUI has been hosted on GitHub, making it readily available for use and further development by the community. This not only democratizes the use of advanced predictive models but also invites contributions to refine the tool and adapt it to various specific needs within the field of structural engineering. Finally, the GUI can be freely accessed at the following URL: https://github.com/mkamel24/ULS.

In this study, five machine learning (ML) models (GEP, ANN, RF, ADB, and XGBoost) and one deep learning (DL) model (DNN) are employed to predict the ultimate load-carrying capacity and ultimate strain of hybrid elliptical double-skin tubular columns (DSTCs) under axial loading. Two typical configurations of hybrid elliptical DSTCs were examined: the hollow-hybrid elliptical DSTC and the filled-hybrid elliptical DSTC. The dataset used for training and testing the ML and DL models consists of 112 data points generated from finite element (FE) models, and the accuracy of the models is assessed by comparing their predictions with the FE results. A user-friendly graphical user interface (GUI) tool has also been developed for practicing engineers. The following outcomes are obtained from this study (Table 9):

The RF and XGBoost models perform best, with the highest correlation coefficients and the lowest RMS differences, indicating that their predictions are closest to the FE results.

The ANN, ADB, and DNN models also show good performance but with slightly lower correlation coefficients and higher RMS differences. The GEP model, while still useful, has a lower correlation coefficient and higher RMS difference, indicating it is less accurate than the other models.

The XGBoost model’s CPI and VAF values are the highest among the models, reflecting better overall accuracy and variance explanation (a short sketch of how such evaluation metrics can be computed follows these outcomes).

The results from SHAP, based on the best prediction performance of the XGBoost model, indicate that the area of the concrete core has the most significant effect on the load-carrying capacity of hybrid elliptical DSTCs, followed by the unconfined concrete strength and the total thickness of FRP multiplied by its elastic modulus.

A user-friendly GUI was developed using Tkinter, providing a convenient platform for users to input parameters and predict the ultimate load-carrying capacity and ultimate strain of hybrid elliptical DSTCs under axial loading.
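As a worked illustration of how such evaluation metrics can be computed from FE-derived and predicted values, the sketch below calculates R2, RMSE, MAE, and variance accounted for (VAF) on placeholder arrays; the composite performance index (CPI) formulation used in the paper is not reproduced here.

```python
# Minimal sketch: R2, RMSE, MAE, and VAF for a set of predictions.
# y_true and y_pred are placeholder arrays, not the study's results.
import numpy as np
from sklearn.metrics import r2_score, mean_squared_error, mean_absolute_error

y_true = np.array([2100.0, 2350.0, 1980.0, 2620.0, 2440.0])
y_pred = np.array([2085.0, 2370.0, 2010.0, 2590.0, 2465.0])

r2 = r2_score(y_true, y_pred)
rmse = float(np.sqrt(mean_squared_error(y_true, y_pred)))
mae = mean_absolute_error(y_true, y_pred)

# VAF (variance accounted for), commonly reported as a percentage.
vaf = (1.0 - np.var(y_true - y_pred) / np.var(y_true)) * 100.0

print(f"R2={r2:.3f}, RMSE={rmse:.1f}, MAE={mae:.1f}, VAF={vaf:.1f}%")
```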

The data presented in this study are available on request from the corresponding author.

Fayed, S. et al. Shear strengthening of RC beams using prestressed near-surface mounted bars reducing the probability of construction failure risk. Materials. 17, 5701 (2024).


Hamoda, A. et al. Strengthening of reinforced concrete columns incorporating different configurations of stainless-steel plates 106577 (Elsevier, 2024).


Yuan, F., Pan, J. & Leung, C. K. Y. Flexural behaviors of ECC and concrete/ECC composite beams reinforced with basalt fiber-reinforced polymer. J. Compos. Constr. 17, 591–602 (2013).


Lu, C., Yu, J. & Leung, C. K. Tensile performance and impact resistance of Strain Hardening Cementitious Composites (SHCC) with recycled fibers. Constr. Build. Mater. 171, 566–576 (2018).


Li, L.-Z., Bai, Y., Yu, K.-Q., Yu, J.-T. & Lu, Z.-D. Reinforced high-strength engineered cementitious composite (ECC) columns under eccentric compression: Experiment and theoretical model. Eng. Struct. 198, 109541 (2019).


Hamoda, A. et al. Shear strengthening of simply supported deep beams using galvanized corrugated sheet filled with high-performance concrete. Case Stud. Constr. Mater. 21, e04085 (2024).


Hamoda, A. et al. Experimental and numerical investigations of the shear performance of reinforced concrete deep beams strengthened with hybrid SHCC-mesh. Case Stud. Constr. Mater. 21, e03495 (2024).


Li, V. C. Engineered Cementitious Composites (ECC) 286–290 (2019).

Zhang, Y., Deng, M. & Dong, Z. Seismic response and shear mechanism of engineered cementitious composite (ECC) short columns. Eng. Struct. 192, 296–304 (2019).


Zhu, J.-X., Xu, L.-Y., Huang, B.-T., Weng, K.-F. & Dai, J.-G. Recent developments in Engineered/Strain-Hardening Cementitious Composites (ECC/SHCC) with high and ultra-high strength. Constr. Build. Mater. 342, 127956 (2022).


Lai, B.-L., Zhang, M.-Y., Chen, Z.-P., Liew, J. R. & Zheng, Y.-Y. Axial compressive behavior and design of semi-precast steel reinforced concrete composite columns with permanent ECC formwork 105130 (Elsevier, 2023).


Hamoda, A., Shahin, R. I., Ahmed, M., Abadel, A. A. & Yehia, S. A. Flexural behaviour of normal concrete circular beams strengthened with ECC and stainless steel tubes. Magazine Concr. Res. 77, 1–18 (2024).


Hamoda, A., Yehia, S. A., Abadel, A. A., Sennah, K. & Shahin, R. I. Strengthening of simply supported deep beams with openings using steel-reinforced ECC and externally bonded CFRP sheets. Magazine Concr. Res. 77, 1–17 (2024).


Sennah, K., Hamoda, A., Abadel, A., Yehia, S. & Shahin, R. Shear strengthening of simply-supported deep beams with openings incorporating combined steel-reinforced engineered cementitious composites and externally bonded carbon fibre-reinforced polymer sheets. Mag. Concr. Res. (2024).

Hamoda, A. et al. Shear strengthening of normal concrete deep beams with openings using strain-hardening cementitious composites with glass fiber mesh 107994 (Elsevier, 2025).


Zeng, J.-J., Lv, J.-F., Lin, G., Guo, Y.-C. & Li, L.-J. Compressive behavior of double-tube concrete columns with an outer square FRP tube and an inner circular high-strength steel tube. Constr. Build. Mater. 184, 668–680 (2018).


Zhang, B. et al. Seismic performance of elliptical FRP-concrete-steel tubular columns under combined axial load and reversed lateral load. Eng. Struct. 286, 116135 (2023).


Isleem, H. F. et al. Parametric investigation of rectangular CFRP-confined concrete columns reinforced by inner elliptical steel tubes using finite element and machine learning models. Heliyon. 10, e23666 (2024).


Mohamed, H. S. et al. Compressive behavior of elliptical concrete-filled steel tubular short columns using numerical investigation and machine learning techniques. Sci. Rep. 14, 27007 (2024).


Green, M. F., Bisby, L. A., Fam, A. Z. & Kodur, V. K. FRP confined concrete columns: Behaviour under extreme conditions. Cement Concr. Compos. 28, 928–937 (2006).


Saenz, N. & Pantelides, C. P. Short and medium term durability evaluation of FRP-confined circular concrete. J. Compos. Constr. 10, 244–253 (2006).


Wong, Y., Yu, T., Teng, J. & Dong, S. Behavior of FRP-confined concrete in annular section columns. Compos. B Eng. 39, 451–466 (2008).


El-Hacha, R., Green, M. F. & Wight, G. R. Effect of severe environmental exposures on CFRP wrapped concrete columns. J. Compos. Constr. 14, 83–93 (2010).


Isleem, H. F., Wang, D. & Wang, Z. Modeling the axial compressive stress-strain behavior of CFRP-confined rectangular RC columns under monotonic and cyclic loading. Compos. Struct. 185, 229–240 (2018).


Isleem, H. F., Wang, Z., Wang, D. & Smith, S. T. Monotonic and cyclic axial compressive behavior of CFRP-confined rectangular RC columns. J. Compos. Constr. 22, 04018023 (2018).


Isleem, H. F., Wang, D. & Wang, Z. A new numerical model for polymer-confined rectangular concrete columns. Proc. Inst. Civ. Eng. Struct. Build. 172, 528–544 (2019).


Isleem, H. F., Tahir, M. & Wang, Z. Axial stress–strain model developed for rectangular RC columns confined with FRP wraps and anchors 779–788 (Elsevier, 2020).


Zhang, B. et al. Elliptical FRP-concrete-steel double-skin tubular columns under monotonic axial compression. Adv. Polym. Technol. 2020, 7573848 (2020).


Isleem, H. F. et al. Axial compressive strength models of eccentrically-loaded rectangular reinforced concrete columns confined with FRP. Materials. 14, 3498 (2021).


Ahmed, M. et al. Numerical analysis of circular steel–reinforced concrete-filled steel tubular stub columns. Mag. Concr. Res. 76, 303–318 (2023).


Isleem, H. F. et al. Finite element, analytical, artificial neural network models for carbon fibre reinforced polymer confined concrete filled steel columns with elliptical cross sections. Front. Mater. 9, 1115394 (2023).


Ahmed, M. et al. Nonlinear analysis of square steel-reinforced concrete-filled steel tubular short columns considering local buckling. Struct. Concr. 25, 69–84 (2024).


İpek, S., Erdoğan, A. & Güneyisi, E. M. Compressive behavior of concrete-filled double skin steel tubular short columns with the elliptical hollow section. J. Build. Eng. 38, 102200 (2021).


Teng, J., Yu, T. & Wong, Y. Behavior of hybrid FRP-concrete-steel double-skin tubular columns. In Proc. 2nd Int. Conf. on FRP Composites in Civil Engineering 811–818 (Adelaide, Australia, 2004).

Teng, J., Yu, T., Wong, Y. & Dong, S. Hybrid FRP–concrete–steel tubular columns: concept and behavior. Constr. Build. Mater. 21, 846–854 (2007).


Yu, T., Teng, J. & Wong, Y. Stress-strain behavior of concrete in hybrid FRP-concrete-steel double-skin tubular columns. J. Struct. Eng. 136, 379–389 (2010).


Zhang, B., Yu, T. & Teng, J. Axial compression tests on hybrid double-skin tubular columns filled with high strength concrete (2011).

Fanggi, B. A. L. & Ozbakkaloglu, T. Compressive behavior of aramid FRP–HSC–steel double-skin tubular columns. Constr. Build. Mater. 48, 554–565 (2013).


Zhang, B., Teng, J. G. & Yu, T. Compressive behavior of double-skin tubular columns with high-strength concrete and a filament-wound FRP tube. J. Compos. Constr. 21, 04017029 (2017).


Zheng, J. & Ozbakkaloglu, T. Sustainable FRP–recycled aggregate concrete–steel composite columns: Behavior of circular and square columns under axial compression. Thin-Walled Structures. 120, 60–69 (2017).


Ozbakkaloglu, T. & Akin, E. Behavior of FRP-confined normal-and high-strength concrete under cyclic axial compression. J. Compos. Constr. 16, 451–463 (2012).


Yu, T., Zhang, B., Cao, Y. & Teng, J. Behavior of hybrid FRP-concrete-steel double-skin tubular columns subjected to cyclic axial compression. Thin-walled structures. 61, 196–203 (2012).


Ozbakkaloglu, T. & Louk Fanggi, B. A. FRP–HSC–steel composite columns: behavior under monotonic and cyclic axial compression. Mater. Struct. 48, 1075–1093 (2015).


Yu, T., Wong, Y. & Teng, J. Behavior of hybrid FRP-concrete-steel double-skin tubular columns subjected to eccentric compression. Adv. Struct. Eng. 13, 961–974 (2010).


Han, L.-H., Tao, Z., Liao, F.-Y. & Xu, Y. Tests on cyclic performance of FRP–concrete–steel double-skin tubular columns. Thin-Walled Structures. 48, 430–439 (2010).


Zhang, B., Teng, J. & Yu, T. Experimental behavior of hybrid FRP–concrete–steel double-skin tubular columns under combined axial compression and cyclic lateral loading. Eng. Struct. 99, 214–231 (2015).


Idris, Y. & Ozbakkaloglu, T. Behavior of square fiber reinforced polymer–high-strength concrete–steel double-skin tubular columns under combined axial compression and reversed-cyclic lateral loading. Eng. Struct. 118, 307–319 (2016).


Wang, R., Han, L.-H. & Tao, Z. Behavior of FRP–concrete–steel double skin tubular members under lateral impact: Experimental study. Thin-Walled Structures. 95, 363–373 (2015).


Abdelkarim, O. I. & ElGawady, M. A. Performance of hollow-core FRP–concrete–steel bridge columns subjected to vehicle collision. Eng. Struct. 123, 517–531 (2016).


Cui, C. & Sheikh, S. Experimental study of normal-and high-strength concrete confined with fiber-reinforced polymers. J. Compos. Constr. 14, 553–561 (2010).


Parvin, A. & Jamwal, A. S. Effects of wrap thickness and ply configuration on composite-confined concrete cylinders. Compos. Struct. 67, 437–442 (2005).


Wu, Y.-F. & Wei, Y.-Y. Effect of cross-sectional aspect ratio on the strength of CFRP-confined rectangular concrete columns. Eng. Struct. 32, 32–45 (2010).


Rashid, S. P. & Bahrami, A. Structural performance of infilled steel–concrete composite thin-walled columns combined with FRP and CFRP: A comprehensive review. Materials. 16, 1564 (2023).


Gu, D.-S., Wu, G., Wu, Z.-S. & Wu, Y.-F. Confinement effectiveness of FRP in retrofitting circular concrete columns under simulated seismic load. J. Compos. Constr. 14, 531–540 (2010).


Ozbakkaloglu, T. & Oehlers, D. J. Concrete-filled square and rectangular FRP tubes under axial compression. J. Compos. Constr. 12, 469–477 (2008).


Al-Rousan, R. Z. & Barfed, M. H. Impact of curvature type on the behavior of slender reinforced concrete rectangular column confined with CFRP composite. Compos. B Eng. 173, 106939 (2019).


Teng, J., Huang, Y., Lam, L. & Ye, L. Theoretical model for fiber-reinforced polymer-confined concrete. J. Compos. Constr. 11, 201–210 (2007).


Eid, R. & Paultre, P. Analytical model for FRP-confined circular reinforced concrete columns. J. Compos. Constr. 12, 541–552 (2008).


Turgay, T., Köksal, H. O., Polat, Z. & Karakoç, C. Stress–strain model for concrete confined with CFRP jackets. Mater. Des. 30, 3243–3251 (2009).


Cui, C. & Sheikh, S. Analytical model for circular normal-and high-strength concrete columns confined with FRP. J. Compos. Constr. 14, 562–572 (2010).


Bing, Z. et al. Behavior of elliptical GFRP-concrete-steel double-skin tubular columns under axial compression. J. Build. Struct. 40, 185–191 (2019).


Chen, G., Wang, Y., Yu, T., Zhang, B. & Han, B. Elliptical FRP–concrete–steel double-skin tubular columns: axial behavior, interaction mechanism, and modeling. J. Compos. Constr. 26, 04022078 (2022).


Zhang, B. et al. Elliptical concrete-filled FRP tubes with an embedded H-shaped steel under axial compression and cyclic lateral loading: Experimental study and modelling. Compos. Struct. 330, 117839 (2024).


Pan, Y. & Zhang, L. Roles of artificial intelligence in construction engineering and management: A critical review and future trends. Autom. Constr. 122, 103517 (2021).


Pham, T. M. & Hadi, M. N. Predicting stress and strain of FRP-confined square/rectangular columns using artificial neural networks. J. Compos. Constr. 18, 04014019 (2014).


Naser, M. & Kodur, V. Explainable machine learning using real, synthetic and augmented fire tests to predict fire resistance and spalling of RC columns. Eng. Struct. 253, 113824 (2022).


Yehia, S. A., Fayed, S., Zakaria, M. H. & Shahin, R. I. Prediction of RC T-Beams Shear Strength based on machine learning. Int. J. Concr. Struct. Mater. 18, 52 (2024).


Yehia, S. A., Shahin, R. I. & Fayed, S. Compressive behavior of eco-friendly concrete containing glass waste and recycled concrete aggregate using experimental investigation and machine learning techniques. Constr. Build. Mater. 436, 137002 (2024).


Waszczyszyn, Z. & Bartczak, M. Neural prediction of buckling loads of cylindrical shells with geometrical imperfections. Int. J. Non-Linear Mech. 37, 763–775 (2002).


Degtyarev, V. V. & Tsavdaridis, K. D. Buckling and ultimate load prediction models for perforated steel beams using machine learning algorithms. J. Build. Eng. 51, 104316 (2022).


Shahin, R. I., Ahmed, M., Yehia, S. A. & Liang, Q. Q. ANN model for predicting the elastic critical buckling coefficients of prismatic tapered steel web plates under stress gradients. Eng. Struct. 294, 116794 (2023).


Shahin, R. I., Ahmed, M., Liang, Q. Q. & Yehia, S. A. Predicting the web crippling capacity of cold-formed steel lipped channels using hybrid machine learning techniques. Eng. Struct. 309, 118061 (2024).


Asteris, P. G., Lemonis, M. E., Nguyen, T.-A., Van Le, H. & Pham, B. T. Soft computing-based estimation of ultimate axial load of rectangular concrete-filled steel tubes. Steel Compos. Struct. Int. J. 39, 471–491 (2021).


Ly, H.-B. et al. Estimation of axial load-carrying capacity of concrete-filled steel tubes using surrogate models. Neural Comput. Appl. 33, 3437–3458 (2021).


Sarir, P., Chen, J., Asteris, P. G., Armaghani, D. J. & Tahir, M. Developing GEP tree-based, neuro-swarm, and whale optimization models for evaluation of bearing capacity of concrete-filled steel tube columns. Eng. Comput. 37, 1–19 (2021).


Tran, V. L., Ahmed, M. & Gohari, S. Prediction of the ultimate axial load of circular concrete-filled stainless steel tubular columns using machine learning approaches. Struct. Concr. 24, 3908–3932 (2023).


Yang, L. et al. Prediction of alkali-silica reaction expansion of concrete using artificial neural networks. Cement Concr. Compos. 140, 105073 (2023).


Lai, B.-L., Bao, R.-L., Zheng, X.-F., Vasdravellis, G. & Mensinger, M. Machine-learning assisted analysis on the seismic performance of steel reinforced concrete composite columns 107065 (Elsevier, 2024).


Lai, B. L., Zhou, X., Zheng, X. F., Li, S. & Venkateshwaran, A. Theoretical reevaluation and machine learning analysis on the concrete confinement effect of square reinforced concrete columns. Struct. Concr. https://doi.org/10.1002/suco.202400992 (2024).


Isleem, H. F., Peng, F. & Tayeh, B. A. Confinement model for LRS FRP-confined concrete using conventional regression and artificial neural network techniques. Compos. Struct. 279, 114779 (2022).


Isleem, H. F. et al. Finite element and artificial neural network modeling of FRP-RC columns under axial compression loading. Front. Mater. 9, 888909 (2022).


Abdulla, N. A. Using the artificial neural network to predict the axial strength and strain of concrete-filled plastic tube. J. Soft Comput. Civ. Eng. 4, 63–84 (2020).


Ali, S., Ahmad, J., Iqbal, U., Khan, S. & Hadi, M. N. Neural network-based models versus empirical models for the prediction of axial load-carrying capacities of FRP-reinforced circular concrete columns. Struct. Concr. 25, 1148–1164 (2024).


ABAQUS. Standard User’s Manual, Version 6.12 (Dassault Systemes Corp., Providence, RI, USA, 2012).

Pagoulatou, M., Sheehan, T., Dai, X. & Lam, D. Finite element analysis on the capacity of circular concrete-filled double-skin steel tubular (CFDST) stub columns. Eng. Struct. 72, 102–112 (2014).


Dai, X. & Lam, D. Numerical modelling of the axial compressive behaviour of short concrete-filled elliptical steel columns. J. Constr. Steel Res. 66, 931–942 (2010).


Dai, X., Lam, D., Jamaluddin, N. & Ye, J. Numerical analysis of slender elliptical concrete filled columns under axial compression. Thin-Walled Structures. 77, 26–35 (2014).


Wang, J., Shen, Q., Wang, F. & Wang, W. Experimental and analytical studies on CFRP strengthened circular thin-walled CFST stub columns under eccentric compression. Thin-Walled Structures. 127, 102–119 (2018).


Mortazavi, A. A., Pilakoutas, K. & Son, K. S. RC column strengthening by lateral pre-tensioning of FRP. Constr. Build. Mater. 17, 491–497 (2003).


Jiang, T. & Teng, J. Theoretical model for slender FRP-confined circular RC columns. Constr. Build. Mater. 32, 66–76 (2012).


Ganganagoudar, A., Mondal, T. G. & Prakash, S. S. Analytical and finite element studies on behavior of FRP strengthened RC beams under torsion. Compos. Struct. 153, 876–885 (2016).


Yehia, S. A., Fayed, S., Shahin, R. I. & Ahmed, R. B. Effect of existing holes under the loading plate on local compressive strength of plain concrete blocks: an experimental and numerical study. Case Stud. Constr. Mater. 21, e03937 (2024).


Binici, B. An analytical model for stress–strain behavior of confined concrete. Eng. Struct. 27, 1040–1051 (2005).


Dassault Systemes Simulia Corp. ABAQUS/Standard User’s Manual, Version 6.9 (2020).

Han, L.-H. & Huo, J.-S. Concrete-filled hollow structural steel columns after exposure to ISO-834 fire standard. J. Struct. Eng. 129, 68–78 (2003).


Yan, X.-F., Zhao, Y.-G. & Lin, S. Compressive behaviour of circular CFDST short columns with high-and ultrahigh-strength concrete. Thin-Walled Structures. 164, 107898 (2021).


Madvari, R. F. Artificial Intelligence (AI), Machine Learning (ML) and Deep Learning (DL) on Health, Safety and Environment (HSE). Arch. Occup. Health (2022).

Alzubi, J., Nayyar, A. & Kumar, A. Machine learning from theory to algorithms: an overview. J. Phys. Conf. Ser. 1142, 012012 (2018).


Mahmud, M., Kaiser, M. S., McGinnity, T. M. & Hussain, A. Deep learning in mining biological data. Cogn. Comput. 13, 1–33 (2021).


Patterson, J. & Gibson, A. Deep Learning: A Practitioner’s Approach (O’Reilly Media, 2017).

Ma, L. & Sun, B. Machine learning and AI in marketing–Connecting computing power to human insights. Int. J. Res. Mark. 37, 481–504 (2020).


Ferreira, C. Gene expression programming in problem solving. In Soft Computing and Industry: Recent Applications 635–653 (Springer, 2002).

Hamed, A. K., Elshaarawy, M. K. & Alsaadawi, M. M. Stacked-based machine learning to predict the uniaxial compressive strength of concrete materials. Comput. Struct. 308, 107644. https://doi.org/10.1016/j.compstruc.2025.107644 (2025).


Flood, I. & Kartam, N. Neural networks in civil engineering II: Systems and application. J. Comput. Civ. Eng. 8, 149–162 (1994).


Elshaarawy, M. K. & Hamed, A. K. Modeling hydraulic jump roller length on rough beds: a comparative study of ANN and GEP models. J. Umm Al-Qura Univ. Eng. Architect. https://doi.org/10.1007/s43995-024-00093-x (2025).


Elshaarawy, M. K. & Eltarabily, M. G. Machine learning models for predicting water quality index: optimization and performance analysis for El Moghra Egypt. Water Supply. 24(9), 3269–3294. https://doi.org/10.2166/ws.2024.189 (2024).


Eltarabily, M. G. et al. Predicting seepage losses from lined irrigation canals using machine learning models. Front. Water. 5, 1287357. https://doi.org/10.3389/frwa.2023.1287357 (2023).


Chen, T. & Guestrin, C. XGBoost: A scalable tree boosting system. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining 785–794 (2016).

Luat, N.-V., Han, S. W. & Lee, K. Genetic algorithm hybridized with eXtreme gradient boosting to predict axial compressive capacity of CCFST columns. Compos. Struct. 278, 114733 (2021).


Tseng, P. & Yun, S. A coordinate gradient descent method for nonsmooth separable minimization. Math. Program. 117, 387–423 (2009).


Eltarabily, M. G., Elshaarawy, M. K., Elkiki, M. & Selim, T. Modeling surface water and groundwater interactions for seepage losses estimation from unlined and lined canals. Water Sci. 37, 315–328. https://doi.org/10.1080/23570008.2023.2248734 (2023).


Elshaarawy, M. K. Stacked-based hybrid gradient boosting models for estimating seepage from lined canals. J. Water Process Eng. 1(70), 106913. https://doi.org/10.1016/j.jwpe.2024.106913 (2025).


Eltarabily, M. G., Selim, T., Elshaarawy, M. K. & Mourad, M. H. Numerical and experimental modeling of geotextile soil reinforcement for optimizing settlement and stability of loaded slopes of irrigation canals. Environ. Earth Sci. 83, 246. https://doi.org/10.1007/s12665-024-11560-y (2024).


Tian, W., Isleem, H. F., Hamed, A. K. & Elshaarawy, M. K. Enhancing discharge prediction over Type-A piano key weirs: An innovative machine learning approach. Flow Measure. Instrument. 100, 102732. https://doi.org/10.1016/j.flowmeasinst.2024.102732 (2024).


Isleem, H. F., Elshaarawy, M. K. & Hamed, A. K. Analysis of flow dynamics and energy dissipation in piano key and labyrinth weirs using computational fluid dynamics. Comput. Fluid Dyn. Anal. Simul. https://doi.org/10.5772/intechopen.1006332 (2024).


Shahin, R. I., Ahmed, M. & Yehia, S. A. Elastic buckling of prismatic web plate under shear with simply-supported boundary conditions. Buildings 13, 2879 (2023).


Elshaarawy, M. K. & Elmasry, N. H. Experimental and numerical modeling of seepage in trapezoidal channels. Knowl. Based Eng. Sci. 5(3), 43–60. https://doi.org/10.51526/kbes.2024.5.3.43-60 (2024).


Isleem, H. F. et al. Numerical and machine learning modeling of GFRP confined concrete-steel hollow elliptical columns. Sci. Rep. 14(1), 18647. https://doi.org/10.1038/s41598-024-68360-4 (2024).


Shaeer, Z. A. S. A., Shahin, R. I., El-Baghdady, G. I. & Yehia, S. A. Numerical and Analytical Solution for Nonlinear Free Vibration of Tapered Beams. Mansoura Engineering Journal. 49, 6 (2024).


Yehia, S. & Shahin, R. Elastic local buckling of trapezoidal plates under linear stress gradients. Mag. Civ. Eng. 17 (2024).

Yehia, S. A., Tayeh, B. & Shahin, R. I. Critical buckling coefficient for simply supported tapered steel web plates. Struct. Eng. Mech. 90, 273 (2024).


Elshaarawy, M. K., Elmasry, N. H., Selim, T., Elkiki, M. & Eltarabily, M. G. Determining seepage loss predictions in lined canals through optimizing advanced gradient boosting techniques. Water Conserv. Sci. Eng. 9(2), 75. https://doi.org/10.1007/s41101-024-00306-3 (2024).


Vakharia, V., Gupta, V. K. & Kankar, P. K. A comparison of feature ranking techniques for fault diagnosis of ball bearing. Soft. Comput. 20, 1601–1619 (2016).


Mantena, S., Mahammood, V. & Rao, K. N. Prediction of soil salinity in the Upputeru river estuary catchment, India, using machine learning techniques. Environ. Monit. Assess. 195, 1006 (2023).


Zhang, J., Li, D. & Wang, Y. Toward intelligent construction: Prediction of mechanical properties of manufactured-sand concrete using tree-based models. J. Clean. Prod. 258, 120665 (2020).


Tran, V. Q., Dang, V. Q. & Ho, L. S. Evaluating compressive strength of concrete made with recycled concrete aggregates using machine learning approach. Constr. Build. Mater. 323, 126578 (2022).


Lundh, F. An Introduction to Tkinter. www.pythonware.com/library/tkinter/introduction/index.htm (1999).

Elshaarawy, M. K. & Hamed, A. K. Machine learning and interactive GUI for estimating roller length of hydraulic jumps. Neural Comput. Appl. https://doi.org/10.1007/s00521-024-10846-3 (2024).



School of Art and Design, Yunnan Light and Textile Industry Vocational College, Kunming City, 650300, Yunnan Province, China

Focai Yu

Department of Computer Science, University of York, York, YO10 5DD, UK

Haytham F. Isleem

School of Business, Nanjing University of Information Science & Technology, Jiangbei New District, Nanjing City, Jiangsu Province, China

Walaa J. K. Almoghayer

Department of Civil Engineering, Higher Institute of Engineering and Technology, Kafrelsheikh, Egypt

Ramy I. Shahin & Saad A. Yehia

Department of Electrical Engineering, Imam Khomeini Naval Science University of Nowshahr, Nowshahr, Iran

Mohammad Khishe

Applied Science Research Center, Applied Science Private University, Amman, Jordan

Mohammad Khishe

Civil Engineering Department, Faculty of Engineering, Horus University-Egypt, New Damietta, 34517, Egypt

Mohamed Kamel Elshaarawy

Saveetha Institute of Medical and Technical Sciences, Department of Biosciences, Saveetha School of Engineering, Chennai, 602105, India

Mohammad Khishe


Supervision, Funding acquisition, Methodology: H.F.I., K.E.; Project Administration: W.J.K.A.; Conceptualization, Formal Analysis, Validation, Writing – original draft, Investigation, Software, Resources, Data Curation: F.Y., H.F.I., W.J.K.A., M.K., M.K.E.; Visualization, Writing – review & editing: F.Y., H.F.I., W.J.K.A., R.I.S., S.A.Y., M.K.E., M.K. All authors have read and agreed to the published version of the manuscript.

Correspondence to Haytham F. Isleem, Walaa J. K. Almoghayer or Mohammad Khishe.

The authors declare no competing interests.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

See Table 9.

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.


Yu, F., Isleem, H.F., Almoghayer, W.J.K. et al. Predicting axial load capacity in elliptical fiber reinforced polymer concrete steel double skin columns using machine learning. Sci Rep 15, 12899 (2025). https://doi.org/10.1038/s41598-025-97258-y


Received: 23 November 2024

Accepted: 03 April 2025

Published: 15 April 2025

DOI: https://doi.org/10.1038/s41598-025-97258-y
