Abstract
This paper introduces a robust framework for demand forecasting that comprises data preprocessing, data transformation and standardization, feature selection, cross-validation, and a regression ensemble. Bagging (random forest regression (RFR)), boosting (gradient boosting regression (GBR) and extreme gradient boosting regression (XGBR)), and stacking (STACK) are employed as ensemble models. Support vector regression (SVR), extreme learning machine (ELM), and multilayer perceptron neural network (MLP) are adopted as reference machine learning (ML) models. Hyperparameters are tuned with the grid search method to maximize the coefficient of determination (R²) and minimize the root mean square error (RMSE). All tests are carried out on a steel industry dataset under identical experimental conditions. The STACK(1) (ELM + GBR + XGBR-SVR) and STACK(2) (ELM + GBR + XGBR-LASSO) models outperform the other models, each attaining the highest R² of 0.97. Ranked by performance, the models are STACK(1), STACK(2), XGBR, GBR, RFR, MLP, ELM, and SVR. Because it improves model performance and reduces the risk in decision-making, the ensemble method can be used to forecast demand in the steel industry one month ahead.
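As a minimal illustration of the tuning criterion named in the abstract, the sketch below computes RMSE and the coefficient of determination from their standard definitions and runs an exhaustive grid search that keeps the parameter combination with the lowest validation RMSE. It is not the authors' implementation: the toy linear predictor, the parameter names `slope` and `intercept`, and the validation data are hypothetical placeholders standing in for the regression models and the steel industry dataset.

```python
import itertools
import math


def rmse(y_true, y_pred):
    """Root mean square error: sqrt(mean((y - y_hat)^2))."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def grid_search(param_grid, score_fn):
    """Evaluate every combination in param_grid; return the one with the lowest score."""
    names = list(param_grid)
    best_params, best_score = None, float("inf")
    for values in itertools.product(*(param_grid[n] for n in names)):
        params = dict(zip(names, values))
        score = score_fn(params)
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Hypothetical validation data, generated from y = 2x + 1 for illustration only.
x_val = [1.0, 2.0, 3.0, 4.0]
y_val = [3.0, 5.0, 7.0, 9.0]


def validation_rmse(params):
    """Score a toy linear predictor (a stand-in for a real regressor) on the validation data."""
    preds = [params["slope"] * x + params["intercept"] for x in x_val]
    return rmse(y_val, preds)


best_params, best_rmse = grid_search(
    {"slope": [1.0, 2.0, 3.0], "intercept": [0.0, 1.0]},
    validation_rmse,
)
best_preds = [best_params["slope"] * x + best_params["intercept"] for x in x_val]
best_r2 = r2_score(y_val, best_preds)
```

In practice the scoring function would fit a candidate model on training folds and return its cross-validated RMSE; the search loop itself would be unchanged.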