Application of GSTARI (1,1,1) Model for Forecasting the Consumer Price Index (CPI) in Three Cities in Central Java

ABSTRACT


A. INTRODUCTION
One indicator that describes economic conditions is the inflation rate (Ireland, 2020). Inflation is a condition in which the prices of goods and services tend to increase over time (Fahlevi et al., 2020). Inflation is often regarded as essentially identical to an increase in the quantity of money. An inflation rate that is too high or too low negatively affects the economy (Ardiansyah, 2017). According to the Central Bank of Indonesia (BI), the inflation rate target for 2021 is 3% with a deviation of ±1%. Several indicators are used in the calculation of inflation, one of which is the Consumer Price Index (CPI) (Gospodinov & Ng, 2013). The CPI measures the average change in the prices paid by consumers for consumer goods and services (Yaziz et al., 2017). The goods and services included are obtained from the Household Expenditure Survey conducted by the Central Bureau of Statistics Indonesia (BPS). The CPI data are collected periodically from several locations, specifically 90 districts/cities in Indonesia, which are called inflation districts/cities. Central Java is one of the provinces that plays a significant role in Indonesia's economy, and it covers six inflation districts/cities that can provide an overview of price changes in Central Java Province. In 2019, the inflation rate in Central Java was 2.81%, a decrease of 0.01 percentage points from 2.82% in 2018. This inflation rate remains within the range targeted by BI. The dynamics of the CPI in the inflation districts/cities in Central Java are affected by both time and other locations, thereby producing space-time data.
Some research to forecast CPI data has been conducted, one of which forecast the CPI in Indonesia using the ARIMA (p,d,q) model. That research concluded that the best model to forecast the CPI data in Indonesia is ARIMA (1,0,0), and that the model has excellent accuracy because its MAPE value is less than 10% (Ahmar et al., 2018). Another study found that the best ARIMA model to forecast CPI data in Bandar Lampung city is ARIMA (1,1,0) (Kharimah et al., 2015). Compared to machine learning methods such as ANN, kNN, and SVR, the time series method Autoregressive Distributed Lag (ARDL) performs better in forecasting CPI data (Ulke et al., 2016). These studies showed that autoregressive (AR) time series models can forecast CPI data accurately. However, such models forecast the CPI only from its own past values, without involving the spatial effect of other locations. Space-Time Autoregressive Moving Average (STARMA) is a time series model that involves both space and time effects in forecasting data (Cheng et al., 2011; Zou et al., 2018). The STARMA model was first introduced to predict crime rates in 14 areas in Southeast Boston in 1980 (Pfeifer & Deutsch, 1980). In STARMA, the model parameters are estimated using a spatial weight matrix. A uniform (homogeneous) spatial weight matrix is most commonly used in STARMA modeling (Rathod et al., 2018). In STARMA, each location is assumed to be homogeneous, so the model parameters are global for all locations. This assumption makes the STARMA model inappropriate for heterogeneous locations (Cheng et al., 2014). Generalized Space Time Autoregressive (GSTAR) was introduced to address heterogeneity among locations (Borovkova et al., 2008). GSTAR is a generalization of the STAR model that provides different autoregressive parameters for each location, so that it can be applied to locations with heterogeneous characteristics (Ruchjana et al., 2012).
Some research has forecast CPI data using the GSTAR model. Compared to the ARIMA model, the GSTARI model has 7%-38% higher prediction accuracy in forecasting CPI data in 4 main cities in China (Dalian, Shenyang, Changchun, and Harbin) (Ji et al., 2019). Another study showed that the best model to forecast CPI data in Surabaya, Kediri, and Probolinggo City is GSTAR (1,1) (Harini & Nuronia, 2020). Meanwhile, the GSTARI-ARCH model was used to forecast the CPI in Medan, Padangsidimpuan, Pematangsiantar, and Sibolga City to overcome non-constant error variance (Bonar et al., 2017). This research aims to obtain the best GSTAR model to forecast the CPI data in three inflation cities in Central Java (Surakarta, Semarang, and Tegal).

B. METHODS
This study is applied research that seeks the best method to forecast the CPI in three cities in Central Java. The data used are secondary data from Statistics of Central Java Province. Three variables are used: Z_1(t), the Consumer Price Index (CPI) of Surakarta City; Z_2(t), the CPI of Semarang City; and Z_3(t), the CPI of Tegal City. The data are monthly time series from January 2015 to December 2019, divided into in-sample data of 48 observations (January 2015-December 2018) and out-of-sample data of 12 observations (January 2019-December 2019). In this study, the GSTAR parameters were estimated using Inverse Distance Weight (IDW) and Normalization Cross-Correlation (NCC) weighting (Setiawan et al., 2016). The stages of the analysis are shown in Figure 1. The process starts by calculating the correlation among the variables, then checks the heterogeneity of the locations using the Gini index and the stationarity of the data (in mean and variance). If the variables are highly correlated, the Gini index is greater than 1, and the data are stationary in mean and variance, the spatial weight matrices are calculated using IDW and NCC weighting.
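The two weighting schemes named above can be sketched as follows. This is a minimal illustration, not the study's code: the formulas are the forms commonly used in GSTAR work (row-normalised inverse distances, and lag-1 cross-correlations normalised by their absolute row sum), and the source does not state them explicitly. The function names, the distances in `dist_km`, and the synthetic series `z_demo` are all hypothetical.

```python
import numpy as np

def idw_weights(dist):
    """Row-normalised inverse-distance weights with a zero diagonal."""
    dist = np.asarray(dist, dtype=float)
    off_diag = ~np.eye(dist.shape[0], dtype=bool)
    inv = np.zeros_like(dist)
    inv[off_diag] = 1.0 / dist[off_diag]      # closer locations get larger weights
    return inv / inv.sum(axis=1, keepdims=True)

def ncc_weights(z, lag=1):
    """Weights from normalised cross-correlations at time lag `lag` (lag >= 1).

    z has shape (T, N): T time points for N locations.
    """
    z = np.asarray(z, dtype=float)
    zc = z - z.mean(axis=0)
    # num[i, j] = sum over t of zc_i(t) * zc_j(t - lag)
    num = zc[lag:].T @ zc[:-lag]
    scale = np.sqrt((zc ** 2).sum(axis=0))
    r = num / np.outer(scale, scale)
    np.fill_diagonal(r, 0.0)                  # a location is not its own neighbour
    return r / np.abs(r).sum(axis=1, keepdims=True)

# Hypothetical pairwise distances in km (Surakarta, Semarang, Tegal); not measured values.
dist_km = np.array([
    [0.0, 100.0, 250.0],
    [100.0, 0.0, 165.0],
    [250.0, 165.0, 0.0],
])
w_idw = idw_weights(dist_km)

# Synthetic stand-in for a stationary CPI series (48 months, 3 cities).
rng = np.random.default_rng(0)
common = rng.normal(size=(48, 1))
z_demo = common + 0.5 * rng.normal(size=(48, 3))
w_ncc = ncc_weights(z_demo, lag=1)
```

Both functions return a matrix with a zero diagonal. IDW depends only on geography, whereas NCC is recomputed from the data and may contain negative entries.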

Figure 1. CPI Forecasting Process
After that, the STACF and STPACF are plotted (Deng et al., 2016; Mukhaiyar & Pasaribu, 2012; Wei, 2019) to determine the autoregressive order ($p$) and spatial order ($\lambda$) of the GSTAR model, with the final choice based on the AIC value. Then, the parameters of GSTAR ($p$, $\lambda$) are estimated using IDW and NCC weighting. Diagnostic checks of the residuals cover the homoscedasticity, multivariate normality, and white noise assumptions. If the residuals fulfill the assumptions, the best model is found by comparing MAPE values. Finally, the CPI data are forecast using the best model. The GSTAR ($p$, $\lambda$) model can be written as follows:

$$Z(t) = \sum_{k=1}^{p}\left[\Phi_{k0}\,Z(t-k) + \sum_{l=1}^{\lambda_k}\Phi_{kl}\,W^{(l)}\,Z(t-k)\right] + e(t) \qquad (1)$$

where $Z(t)$ is the vector of observations at the $N$ locations at time $t$; $\lambda_k$ is the spatial order of the $k$-th autoregressive term; $W^{(l)}$ is the spatial weight matrix of spatial lag $l$; $\Phi_{kl}$ is the diagonal matrix whose diagonal elements are the autoregressive and space-time parameters for each location, $(\phi_{kl}^{(1)}, \ldots, \phi_{kl}^{(N)})$; and $e(t)$ is white noise with mean vector $0$ and variance-covariance matrix $\sigma^2 I$.

C. RESULT AND DISCUSSION

Variable Correlation
Figure 2 shows the general pattern of the CPI data for the three cities in Central Java Province. The pattern in the three locations is broadly the same, increasing over time. This finding indicates a relationship in the CPI data among the three locations. To quantify it, correlation coefficients are calculated between the CPI series of the three locations. Table 1 shows that the CPI series of the three cities are highly correlated, with coefficients close to 1. Thus, multivariate modeling can be performed on these data.
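The pairwise correlation check can be sketched in a few lines. The CPI values below are hypothetical placeholders, not the study's data, and the Pearson coefficient is assumed because the source does not name the correlation measure.

```python
import numpy as np

# Hypothetical monthly CPI values (columns: Surakarta, Semarang, Tegal).
cpi = np.array([
    [118.2, 117.9, 116.5],
    [118.6, 118.4, 116.9],
    [118.9, 118.6, 117.4],
    [119.5, 119.3, 117.8],
    [119.8, 119.9, 118.5],
    [120.4, 120.2, 118.9],
])

# rowvar=False treats each column (city) as one variable.
corr = np.corrcoef(cpi, rowvar=False)
print(np.round(corr, 3))
```

Series that share an upward trend produce off-diagonal entries near 1, which is the pattern the text reports for Table 1.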

Gini Index
The next step is to check the heterogeneity of the locations using the Gini index. The Gini index of the CPI data in Surakarta, Semarang, and Tegal is shown in Table 2. According to the table, the Gini index in the three cities is 1.00, indicating that the CPI of the three cities is heterogeneous and can be modelled by GSTAR.

Mean and Variance Stationarity
The next stage is to check the stationarity of the data in mean and variance. Figure 2 shows that the CPI data are not stationary in mean but are stationary in variance. Based on the ACF and PACF plots and the Dickey-Fuller test, the data become stationary after first-order differencing (d = 1), as shown in Figure 3. Thus, the model used is GSTARI.
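First-order differencing, the "I" in GSTARI, can be illustrated as below. The CPI levels are hypothetical, and the sketch covers only the transformation and its inverse, not the Dickey-Fuller test itself.

```python
import numpy as np

# Hypothetical CPI levels (rows: months; columns: Surakarta, Semarang, Tegal).
cpi = np.array([
    [130.0, 131.0, 128.0],
    [130.4, 131.5, 128.3],
    [130.9, 131.8, 128.9],
    [131.1, 132.4, 129.2],
])

d_cpi = np.diff(cpi, n=1, axis=0)   # first differences; one row fewer than cpi

# Undo the differencing: cumulative sums added back to the first observation.
restored = np.vstack([cpi[0], cpi[0] + np.cumsum(d_cpi, axis=0)])
```

The model is fitted to `d_cpi`, and forecasts of the differences are converted back to CPI levels in the same way as `restored`.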

Autoregressive and Spatial Order
The optimum time lag and spatial lag orders of the GSTARI model are identified from the STACF and STPACF plots of the stationary sample data. Figure 4 shows the STACF and STPACF plots of the Surakarta, Semarang, and Tegal CPI. Based on the plots, the selected spatial lag order is one because the three locations are in the same province. The STPACF plot also cuts off at several time lags, including lags 1, 4, and 6, so several time orders are candidates. The optimal time lag order is the one with the minimum AIC value. Time lag 1 produces the minimum AIC value of -6.08, so the selected time lag order is p = 1. Based on this identification, the model used is GSTARI (1,1,1).
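A short derivation shows what the GSTARI (1,1,1) structure implies for the CPI levels. It assumes the model is the first-order GSTAR form applied to the first-differenced series, with diagonal parameter matrices $\Phi_{10}$ and $\Phi_{11}$ and a single spatial weight matrix $W^{(1)}$; the source does not write this step out.

```latex
\begin{aligned}
\Delta Z(t) &= \Phi_{10}\,\Delta Z(t-1) + \Phi_{11} W^{(1)}\,\Delta Z(t-1) + e(t),
\qquad \Delta Z(t) = Z(t) - Z(t-1) \\
Z(t) &= \bigl(I + \Phi_{10} + \Phi_{11} W^{(1)}\bigr)\,Z(t-1)
      - \bigl(\Phi_{10} + \Phi_{11} W^{(1)}\bigr)\,Z(t-2) + e(t)
\end{aligned}
```

Under this form, each location's own lag-1 coefficient exceeds its lag-2 coefficient in magnitude by exactly 1, and every cross-location coefficient appears at lag 2 with the opposite sign.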

GSTARI Model Parameter Estimation
The results of parameter estimation of the GSTARI model using IDW and NCC weighting are described in Table 3.
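One way such estimates could be obtained is a separate least-squares regression per location on the differenced series. The source does not state which estimator was used, so least squares is an assumption here; `fit_gstari_111`, the uniform weight matrix, and the simulated data are hypothetical.

```python
import numpy as np

def fit_gstari_111(z, w):
    """Least-squares fit of a GSTARI(1,1,1) model, one regression per location.

    z: (T, N) array of levels; w: (N, N) weight matrix with a zero diagonal.
    Returns (phi10, phi11), each an array of length N.
    """
    dz = np.diff(np.asarray(z, dtype=float), axis=0)   # first differences
    y = dz[1:]                                         # dZ(t)
    own = dz[:-1]                                      # dZ(t-1) at the same location
    neigh = dz[:-1] @ np.asarray(w, dtype=float).T     # weighted dZ(t-1) of neighbours
    n_loc = dz.shape[1]
    phi10 = np.empty(n_loc)
    phi11 = np.empty(n_loc)
    for i in range(n_loc):
        x = np.column_stack([own[:, i], neigh[:, i]])
        coef, *_ = np.linalg.lstsq(x, y[:, i], rcond=None)
        phi10[i], phi11[i] = coef
    return phi10, phi11

# Simulated data with known parameters (illustrative values only).
rng = np.random.default_rng(42)
w_uniform = (np.ones((3, 3)) - np.eye(3)) / 2.0
true_phi10 = np.array([0.10, 0.10, 0.25])
true_phi11 = np.array([0.30, 0.25, 0.15])
n_obs = 2000
dz_sim = np.zeros((n_obs, 3))
for t in range(1, n_obs):
    dz_sim[t] = (true_phi10 * dz_sim[t - 1]
                 + true_phi11 * (w_uniform @ dz_sim[t - 1])
                 + rng.normal(size=3))
z_sim = 100.0 + np.cumsum(dz_sim, axis=0)              # integrate back to levels

est_phi10, est_phi11 = fit_gstari_111(z_sim, w_uniform)
```

Running the same function once with an IDW matrix and once with an NCC matrix would give the two sets of estimates that the text compares.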

Diagnostic Check
A diagnostic check of the residuals of the GSTARI (1,1,1) model determines whether they fulfill the assumptions of homoscedasticity, multivariate normality, and white noise. Homoscedasticity means that the error variance is constant. It was examined by testing the squared errors of the GSTARI (1,1,1) model under each weighting. The white noise test was performed by re-modelling the residuals obtained from the model for each location, and the multivariate normality test checked whether the model's residuals follow a multivariate normal distribution. The tests show that the GSTARI (1,1,1) model with both IDW and NCC weighting produces residuals that meet the assumptions of homoscedasticity, white noise, and multivariate normality.

Selecting The Best GSTARI Model
Several measures can be used to assess forecasting accuracy in a time series model, one of which is the Mean Absolute Percentage Error (MAPE). A forecasting model is categorized as Very Good if it produces a MAPE smaller than 10%. MAPE can also be used as a criterion for selecting the best model. Table 4 shows the MAPE values generated by the GSTARI (1,1,1) model in forecasting CPI data in three cities (Surakarta, Semarang, and Tegal). Based on Table 4, forecasting CPI data using the GSTARI (1,1,1) model with IDW and NCC weighting resulted in a MAPE of less than 10% in each location and on average. These values show that the GSTARI (1,1,1) model with IDW and NCC weighting has very good accuracy in forecasting CPI data in Surakarta, Semarang, and Tegal. Furthermore, it is necessary to determine the better of the two weightings based on the minimum MAPE. The GSTARI (1,1,1) model that produces the lowest mean MAPE across the 3 locations is the one with NCC weighting, at 0.3701% for the in-sample data and 0.2914% for the out-of-sample data. By location, the smallest MAPE value is obtained for Semarang with NCC weighting. This finding indicates that the best model for forecasting the CPI in Surakarta, Semarang, and Tegal City is GSTARI (1,1,1) with NCC weighting.
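For reference, MAPE can be computed as below. This uses the standard definition (mean of absolute errors relative to the actual values, expressed in percent); the function name and the sample numbers are illustrative only.

```python
import numpy as np

def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return 100.0 * np.mean(np.abs((actual - predicted) / actual))

# Illustrative values: both forecasts are off by 1% of the actual value.
example_mape = mape([100.0, 200.0], [99.0, 202.0])
```

Because CPI values are far from zero, the division by `actual` is well behaved for this kind of data.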

Forecasting
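The best GSTARI (1,1,1) model, with NCC weighting, is used to forecast the CPI in Surakarta, Semarang, and Tegal. From the parameter estimation results, the forecasting model for each location is as follows.

$$\hat{Z}_1(t) = 1.0938\,Z_1(t-1) + 0.1583\,Z_2(t-1) + 0.1583\,Z_3(t-1) - 0.0938\,Z_1(t-2) - 0.1583\,Z_2(t-2) - 0.1583\,Z_3(t-2) \qquad (2)$$

$$\hat{Z}_2(t) = 0.1307\,Z_1(t-1) + 1.1127\,Z_2(t-1) + 0.1313\,Z_3(t-1) - 0.1307\,Z_1(t-2) - 0.1127\,Z_2(t-2) - 0.1313\,Z_3(t-2) \qquad (3)$$

$$\hat{Z}_3(t) = 0.0650\,Z_1(t-1) + 0.0652\,Z_2(t-1) + 1.2475\,Z_3(t-1) - 0.0650\,Z_1(t-2) - 0.0652\,Z_2(t-2) - 0.2475\,Z_3(t-2) \qquad (4)$$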
Equation (2) is the GSTARI (1,1,1) model for forecasting the Surakarta City CPI. It implies that the CPI of Surakarta in month t is influenced by the CPI of Surakarta (coefficient 1.0938), Semarang (0.1583), and Tegal (0.1583) in month t-1. In addition, it is influenced by the CPI of Surakarta (-0.0938), Semarang (-0.1583), and Tegal (-0.1583) in month t-2.
Equation (3) is the GSTARI (1,1,1) model for forecasting the Semarang City CPI. It implies that the CPI of Semarang in month t is influenced by the CPI of Surakarta (coefficient 0.1307), Semarang (1.1127), and Tegal (0.1313) in month t-1, and by the CPI of Surakarta (-0.1307), Semarang (-0.1127), and Tegal (-0.1313) in month t-2. Equation (4) is the forecasting model for Tegal City. It implies that the CPI of Tegal in month t is influenced by the CPI of Surakarta (0.0650), Semarang (0.0652), and Tegal (1.2475) in month t-1, and by the CPI of Surakarta (-0.0650), Semarang (-0.0652), and Tegal (-0.2475) in month t-2. From these equations, it can be concluded that the CPI in the three cities in Central Java is influenced by past values at the same location and by past values at the other locations.
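The coefficients quoted above can be collected into matrices to produce a one-step forecast for all three cities at once. The coefficients are those reported in the text; the function name and the two months of CPI levels used as input are hypothetical.

```python
import numpy as np

# Lag-2 coefficient magnitudes as reported in the text
# (rows and columns ordered Surakarta, Semarang, Tegal).
B = np.array([
    [0.0938, 0.1583, 0.1583],
    [0.1307, 0.1127, 0.1313],
    [0.0650, 0.0652, 0.2475],
])
A1 = np.eye(3) + B    # coefficients on Z(t-1); diagonal 1.0938, 1.1127, 1.2475
A2 = -B               # coefficients on Z(t-2)

def forecast_next(z_prev1, z_prev2):
    """One-step forecast of Z(t) from Z(t-1) and Z(t-2)."""
    z_prev1 = np.asarray(z_prev1, dtype=float)
    z_prev2 = np.asarray(z_prev2, dtype=float)
    return A1 @ z_prev1 + A2 @ z_prev2

# Hypothetical CPI levels for two consecutive months; not taken from the study's data.
z_t2 = np.array([130.0, 131.0, 128.0])
z_t1 = np.array([130.4, 131.5, 128.3])
z_hat = forecast_next(z_t1, z_t2)
```

Since `A1 + A2` is the identity matrix, the forecast equals the latest level plus `B` times the latest month-on-month change, so a flat CPI is forecast to stay flat.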
To assess the accuracy of the forecasting models above, we used the out-of-sample data to forecast the CPI in Surakarta, Semarang, and Tegal from January 2019 to December 2019. The predicted values are compared with the actual data, and accuracy is measured by the MAPE. For Surakarta City, forecasting with equation (2) gives a MAPE of 0.2944%. For Semarang City with equation (3) and Tegal City with equation (4), the MAPE values are 0.2693% and 0.3106%, respectively. The lowest MAPE is produced by equation (3), which predicts the CPI in Semarang City. On average, the three models produce a MAPE of 0.2914%. The results show that the GSTARI (1,1,1) model with NCC weighting in equations (2), (3), and (4) has good accuracy in forecasting CPI data in Surakarta, Semarang, and Tegal because the MAPE values are far below 10%.

Figure 5. Actual and Prediction Data Plots
Forecasting with the GSTARI (1,1,1) model with NCC weighting produces the predicted and actual data plots shown in Figure 5. The solid red line shows the actual CPI in Surakarta City for January 2019-December 2019, while the red dotted line shows the values forecast by equation (2). Both lines trend upward over time, and the pattern of the predicted CPI values resembles that of the actual data in the corresponding months. The solid green line shows the actual CPI in Semarang City, and the green dotted line shows the predicted values. These plots show that the GSTARI (1,1,1) model in equation (3) predicts the CPI in Semarang City very well, because the predicted values follow a pattern very similar to the actual data. Finally, the solid blue line shows the actual CPI in Tegal City, and the blue dotted line shows the predicted values, which are very close to the actual data. From this chart, we found that the CPI forecasts from GSTARI (1,1,1) with NCC weighting resemble the actual data pattern in all three locations. Therefore, it can be concluded that the GSTARI (1,1,1) model with NCC weighting can be used to forecast CPI data in Surakarta, Semarang, and Tegal. These results show that the GSTAR model has good accuracy in forecasting data that are highly correlated across locations and heterogeneous among locations.

D. CONCLUSION AND SUGGESTIONS
This study concludes that the GSTARI (1,1,1) model with Normalization Cross-Correlation (NCC) weighting is the best model for forecasting CPI data in 3 cities in Central Java (Surakarta, Semarang, and Tegal). The best model was chosen on the basis of the lowest MAPE value and a diagnostic check confirming that the residuals fulfill the assumptions of homoscedasticity, white noise, and multivariate normality. The forecasting plot shows that the forecasts follow the actual data pattern, so the GSTARI (1,1,1) model with NCC weighting has excellent accuracy in forecasting CPI data in the 3 cities. The results of this research can be used by the government as a consideration in making economic policies now and in the future. This study uses only three (Surakarta, Semarang, and Tegal) of the six inflation districts/cities in Central Java Province. Future studies need to involve all six districts/cities in the analysis to determine the spatio-temporal effect of the other districts/cities.