Rainfall Forecasting Using an Adaptive Neuro-Fuzzy Inference System with a Grid Partitioning Approach to Mitigating Flood Disasters

ABSTRACT


A. INTRODUCTION
Rainfall is water droplets that fall from clouds onto the ground surface over a certain period and is measured as a depth in millimetres (Pendergrass, 2018). Rain has positive and negative impacts on society. A positive impact is hydroelectric power generation, while a negative impact is flooding. Floods affect sectors such as agriculture, transportation, and the regional economy, with the economy and transportation most affected. Mitigating these negative impacts is a focus for the government. Several factors influence rainfall in each region, such as temperature, humidity, and wind speed (Rohmana et al., 2019).
Semarang is one of the regions in Indonesia prone to hydrometeorological disasters such as floods (Hidayat et al., 2018). The El Nino Southern Oscillation (ENSO) is closely related to hydrometeorological phenomena such as rainfall in Semarang. In addition, climate change makes hydrometeorological disasters worse (Suryadi et al., 2017). In recent years, Semarang has experienced frequent floods; seven sub-districts were recorded as flooded in 2022 due to high rainfall.
Predicting future rainfall intensity is very important for preparing mitigation measures. Many techniques are used for prediction, such as linear regression, Autoregressive Integrated Moving Average (ARIMA), and Artificial Neural Networks (ANN) (Chukwueloka & Nwosu, 2023; Yolanda & Kariyam, 2023). Rainfall data do not follow a normal distribution, so the ARIMA approach does not work optimally. Nonlinear methods such as the Adaptive Neuro-Fuzzy Inference System (ANFIS) are expected to work better than ARIMA.
ANFIS works well in capturing the variability of rainfall. A study in Sudan used ANFIS to develop a long-term weather forecast model to predict rainfall, using monthly meteorological data from 2000 to 2012 for 24 meteorological stations spread across the country. The results show that ANFIS can capture the dynamic behaviour of rainfall data and provide satisfactory results (Bushara & Abraham, 2015). Suparta and Samah (2012) applied the ANFIS method to predict rainfall in Tangerang with an accuracy of 80%.
ANFIS can be combined with other methods to obtain more accurate results, namely grid partitioning, subtractive clustering, and fuzzy c-means clustering. Sanikhani et al. (2012) compared ANFIS with grid partitioning, ANFIS with subtractive clustering, and ANFIS with fuzzy c-means clustering. Their results show that ANFIS with grid partitioning is better than ANFIS with subtractive clustering and ANFIS with fuzzy c-means clustering. Several other studies have also applied the ANFIS method (Abebe & Endalie, 2023; Sahoo et al., 2023; Yildirim et al., 2022).
Accurate forecasting is urgently needed to mitigate hydrometeorological disasters. Therefore, this research uses ANFIS with a grid partition approach to predict rainfall in the city of Semarang. Generating rules with a grid partition approach is intended to avoid the curse of dimensionality in forecasting and is expected to produce maximum accuracy. The ANFIS-GP model is evaluated with Root Mean Square Error (RMSE) and Mean Absolute Percentage Error (MAPE). The best model is used to forecast thirty days into the future.

B. METHODS
1. Data and Variables
This research uses secondary data from the website of the Meteorology, Climatology and Geophysics Agency (BMKG) of Central Java Province. Daily data were taken from January 2021 to December 2022. The data are divided into in-sample and out-sample sets. The in-sample set contains 609 observations, from January 2021 to August 2022, while the out-sample set contains 121 observations, from September to December 2022. The variables used in this research are rainfall, temperature, humidity, and wind speed.
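The in-sample and out-sample division can be illustrated with a minimal Python sketch. It assumes the split is purely positional (the first 609 daily observations, then the remaining 121); the function name and the placeholder series are hypothetical and do not come from the BMKG data.

```python
# Positional (chronological) split of a daily series into in-sample and
# out-sample parts. The sizes 609 and 121 come from the text; the series
# itself is a placeholder, not the BMKG data.
N_IN_SAMPLE = 609
N_OUT_SAMPLE = 121


def chronological_split(series, n_in_sample):
    """Split without shuffling so the out-sample part is strictly later."""
    if not 0 < n_in_sample < len(series):
        raise ValueError("n_in_sample must be between 1 and len(series) - 1")
    return series[:n_in_sample], series[n_in_sample:]


# Placeholder series with one entry per day (609 + 121 = 730 days).
daily_values = list(range(N_IN_SAMPLE + N_OUT_SAMPLE))
in_sample, out_sample = chronological_split(daily_values, N_IN_SAMPLE)
```

Keeping the order intact matters for time series: shuffling before splitting would let later observations leak into the in-sample set.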

Data Pre-processing
The first step in this research was data pre-processing, which addresses missing data. Missing values were handled by averaging the values on the same day and month in different years (Hadeed et al., 2020). Because precipitation, temperature, humidity, and wind speed are seasonal variables, this approach is appropriate.

The next step is to determine the membership function. A membership function maps points in the input data to their membership values. The Gaussian and Generalized Bell membership functions are commonly used because they can represent a wide variety of distribution shapes (Gupta et al., 2023).
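The missing-value rule described in the pre-processing step can be sketched as follows. This is only an illustration under stated assumptions: the function name, the dictionary layout (date mapped to a value or None), and the toy values are hypothetical, and the source does not say what is done when no counterpart from another year exists, so such entries are simply left missing here.

```python
from datetime import date
from statistics import mean


def impute_same_calendar_day(series):
    """Fill None entries with the mean of the same day and month in other years.

    `series` maps datetime.date -> float or None. Entries that have no
    observed counterpart in another year are left as None.
    """
    filled = dict(series)
    for day, value in series.items():
        if value is not None:
            continue
        counterparts = [
            v for d, v in series.items()
            if d.month == day.month and d.day == day.day
            and d.year != day.year and v is not None
        ]
        if counterparts:
            filled[day] = mean(counterparts)
    return filled


# Illustrative values only (not BMKG observations).
toy = {
    date(2020, 3, 5): 26.0,
    date(2021, 3, 5): None,
    date(2022, 3, 5): 28.0,
}
filled = impute_same_calendar_day(toy)
```

With only two years of data, as in this study, the average reduces to the value observed on the same date in the other year.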
The Gaussian membership function, shown in Figure 1, is formed from two parameters, $\sigma$ and $c$:

$$\mu(x) = \exp\left(-\frac{(x - c)^2}{2\sigma^2}\right) \tag{1}$$

where $\mu(x)$ is the degree of membership of a variable $x$, $\sigma$ is the standard deviation ($\sigma^2$ is the variance), and $c$ is the mean. The Generalized Bell membership function is

$$\mu(x) = \frac{1}{1 + \left|\dfrac{x - c}{a}\right|^{2b}} \tag{2}$$

The curve in Figure 2 is formed from three parameters, $a$, $b$, and $c$, where $\mu(x)$ is the degree of membership of a variable $x$, parameter $b$ is positive, and parameter $c$ indicates the center of the curve.

The third step is to model rainfall data using the ANFIS method. ANFIS has almost the same architecture as a fuzzy rule base model and almost the same construction as a neural network with radial basis functions (Zhu et al., 2022). The output of this network is a linear combination of radial basis functions of the inputs and the neuron parameters. If the fuzzy inference system is assumed to have two inputs, $x_{1,t}$ and $x_{2,t}$, and one output, $f$, then the first-order Sugeno model has two rules (Zhu et al., 2022):

Rule 1: If $x_{1,t}$ is $A_1$ and $x_{2,t}$ is $B_1$, then $f_1 = p_1 x_{1,t} + q_1 x_{2,t} + r_1$
Rule 2: If $x_{1,t}$ is $A_2$ and $x_{2,t}$ is $B_2$, then $f_2 = p_2 x_{1,t} + q_2 x_{2,t} + r_2$

The neural network in the ANFIS method performs the same function as a fuzzy inference system. The parameters of the fuzzy inference system are updated through a neural network learning process that uses pairs of input and output data. The ANFIS network consists of several layers, as shown in Figure 3 (Guerra et al., 2022):

Layer 1
The first layer, called the fuzzification layer, is the input layer. The output of each neuron is the membership degree given by the input membership function, for example the Generalized Bell function. The shape of the Generalized Bell curve changes when the values of its parameters change. These parameters are called premise parameters.
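The fuzzification step can be illustrated with a short Python sketch of the two membership functions used in this research, written in their standard textbook forms. The function names and the parameter values below are hypothetical and chosen only to show how the premise parameters shape the curve; they are not the fitted values.

```python
import math


def gaussian_mf(x, c, sigma):
    """Gaussian membership: exp(-(x - c)^2 / (2 * sigma^2))."""
    return math.exp(-((x - c) ** 2) / (2.0 * sigma ** 2))


def generalized_bell_mf(x, a, b, c):
    """Generalized Bell membership: 1 / (1 + |(x - c) / a|^(2b))."""
    return 1.0 / (1.0 + abs((x - c) / a) ** (2.0 * b))


# Hypothetical premise parameters for a temperature input in degrees Celsius.
at_center = gaussian_mf(28.0, c=28.0, sigma=2.0)             # 1.0 at the center
one_sigma = gaussian_mf(30.0, c=28.0, sigma=2.0)             # exp(-0.5)
crossover = generalized_bell_mf(30.0, a=2.0, b=2.0, c=28.0)  # 0.5 at x = c + a
```

In the Generalized Bell function, `a` sets the width, `b` sets how steep the shoulders are, and `c` sets the center, so changing any of them changes the membership degree returned for the same input.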

Layer 2
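The second layer, the rule layer, consists of fixed neurons (with the symbol $\Pi$) whose output is the product of all incoming signals. For inputs $x_{1,t}$ and $x_{2,t}$ with fuzzy sets $A_i$ and $B_i$, it is formulated as follows:

$$w_i = \mu_{A_i}(x_{1,t}) \cdot \mu_{B_i}(x_{2,t}), \quad i = 1, 2 \tag{3}$$

where $w_i$ is the output of the $i$-th node, and $\mu_{A_i}$ and $\mu_{B_i}$ are the signals from the neurons in the first layer. In general, the AND operator is used.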

Layer 3
This layer, called the fuzzy reasoning layer, consists of fixed neurons (with the symbol N). Each neuron computes the ratio of the $i$-th firing strength ($w_i$) to the sum of all firing strengths from the second layer:

$$\bar{w}_i = \frac{w_i}{w_1 + w_2}, \quad i = 1, 2 \tag{4}$$

where $\bar{w}_i$ is the normalized firing strength.
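The second and third layers can be sketched in Python for the grid partition setting used later in this research, where every combination of one membership function per input forms a rule. The function names and the membership degrees are hypothetical, and the product is used as the AND operator.

```python
from itertools import product
from math import prod


def firing_strengths(memberships):
    """Layer 2: one product (AND) node per grid-partition rule.

    `memberships[k]` lists the membership degrees of input k, one per
    membership function. Every combination of one degree per input is a rule.
    """
    return [prod(combo) for combo in product(*memberships)]


def normalize(strengths):
    """Layer 3: divide each firing strength by the sum of all strengths."""
    total = sum(strengths)
    if total == 0:
        raise ValueError("all firing strengths are zero")
    return [w / total for w in strengths]


# Hypothetical membership degrees: three inputs, two membership functions each.
degrees = [[0.8, 0.2], [0.6, 0.4], [0.5, 0.5]]
w = firing_strengths(degrees)   # 2 ** 3 = 8 rules
w_bar = normalize(w)
```

The rule count grows as the number of membership functions raised to the number of inputs, which is why three inputs with two membership functions each give eight firing strengths.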

Layer 4
This layer, called the defuzzification layer, consists of adaptive neurons whose output is formulated as follows:

$$\bar{w}_i f_i = \bar{w}_i \left(p_i x_{1,t} + q_i x_{2,t} + r_i\right) \tag{5}$$

where $\bar{w}_i$ is the normalized firing strength from the third layer, and $p_i$, $q_i$, and $r_i$ are the parameters of the neuron. These parameters are called consequent parameters.

Layer 5
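In the fifth layer, a single neuron (with the symbol $\Sigma$) sums all the outputs of the fourth layer, which is formulated as follows:

$$f = \sum_i \bar{w}_i f_i = \frac{\sum_i w_i f_i}{\sum_i w_i} \tag{6}$$

Model performance is evaluated with RMSE and MAPE (Fauzi et al., 2023; Kharisudin et al., 2023):

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{t=1}^{n}\left(y_t - \hat{y}_t\right)^2}$$

$$\mathrm{MAPE} = \frac{100\%}{n}\sum_{t=1}^{n}\left|\frac{y_t - \hat{y}_t}{y_t}\right|$$

where $y_t$ is the actual data, $\hat{y}_t$ is the forecast, and $n$ is the number of observations. RMSE and MAPE were calculated on the in-sample and out-sample data to assess the goodness of the model. Once the goodness of the model is known, the next step is to forecast rainfall for the next month.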
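The two error measures can be sketched in Python using their standard definitions. The function names and the toy values are hypothetical. The source does not state how days with zero actual rainfall are treated in MAPE, so this sketch simply raises an error in that case.

```python
import math


def rmse(actual, forecast):
    """Root Mean Square Error."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((y - y_hat) ** 2 for y, y_hat in zip(actual, forecast))
    return math.sqrt(squared / len(actual))


def mape(actual, forecast):
    """Mean Absolute Percentage Error, in percent."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(y == 0 for y in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    ratios = sum(abs((y - y_hat) / y) for y, y_hat in zip(actual, forecast))
    return 100.0 * ratios / len(actual)


# Illustrative values only (not results from this research).
actual = [10.0, 20.0, 40.0]
forecast = [12.0, 18.0, 40.0]
toy_rmse = rmse(actual, forecast)
toy_mape = mape(actual, forecast)
```

RMSE is expressed in the units of the data and penalizes large errors more heavily, while MAPE is scale-free, which is why the two are reported together.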

C. RESULT AND DISCUSSION
The results and discussion are divided into two parts: descriptive statistics and modeling using the Adaptive Neuro-Fuzzy Inference System (ANFIS) method. The descriptive statistics give a general description of the research variables, while the ANFIS modeling covers determining the best model and forecasting.

Descriptive Statistics
The rainfall pattern in the city of Semarang is of the monsoon type, which theoretically has peak rainfall from December to February (Kusumawardhani & Gernowo, 2015). Figure 4(a) shows that the highest rainfall occurs in December-February, but anomalies often occur in certain months. In recent years, the peak of the monsoon rainfall pattern has shifted because of climate change and El Nino or La Nina. Rainfall anomalies often occur in the city of Semarang because it borders directly on the Java Sea. Anomalies occurred in every month throughout 2021; the highest, 171 mm, occurred in June 2021. Factors that cause rainfall anomalies include climate change, atmospheric stability, population density, and local topography (Harada et al., 2020; Lakshmi & Schaaf, 2001; Lima et al., 2017; Zhao et al., 2019). Rainfall anomalies in Indonesia significantly affect the agricultural sector, especially food crop production (Dirgahayu et al., 2012).

Temperature and relative humidity are negatively correlated: when the temperature is high, the relative humidity is low (see Figures 4(b) and 4(c)). The average temperature in Semarang over the two years was 28.01°C, while the average humidity was 76.2%. The highest temperature in the two years was 31°C, while the lowest was 24°C.

Rainfall Modeling Using the ANFIS Method
To predict rainfall in Semarang, a model is built using the Adaptive Neuro-Fuzzy Inference System (ANFIS) method with a grid partitioning (GP) approach. The membership function with the smallest RMSE and MAPE values is selected as the best model and then used for prediction. The ANFIS architecture is shown in Figure 5.

The first stage in the ANFIS method is fuzzification, in which crisp (classical) input values are converted into fuzzy numbers. The membership function to be used is determined in this layer. This process produces the nonlinear (premise) parameters of each membership function. In this research, Gaussian and Generalized Bell membership functions are used. The Gaussian membership function has two nonlinear parameters, $\sigma$ and $c$, while the Generalized Bell membership function has three: $a$, $b$, and $c$. The optimal parameter values are obtained from the backward pass of the learning process, and they differ between the two membership functions. The parameter values are used to determine the degree of membership. Each of the three inputs has two membership functions, so six groups of membership degrees are produced.

The second step in the ANFIS method is to determine the rules. In this study, a grid partition approach was used. Figure 5 shows that there are eight rules: the number of membership functions per input (two) raised to the power of the number of input variables (three), $2^3 = 8$.

Based on Table 6, the Gaussian membership function has the smallest RMSE and MAPE values. Figure 6 (red line and blue line) shows that the model with the Gaussian membership function can follow the actual data. However, there are prediction errors on certain days, caused by temperature, humidity, and wind speed, which change dynamically from day to day. These results align with Yonar & Yonar (2023), who predicted air pollution using the ANFIS approach; in that study, the RMSE and Mean Square Error (MSE) values were 0.065925 and 0.0043461. Other research also confirms that ANFIS performs well, with RMSE values in the range 0-1,389 (Navale & Mhaske, 2023). This research shows that ANFIS is very good at predicting volatile data with spikes, such as rainfall data. Several studies have reached the same conclusion about the ANFIS method (Li et al., 2023; Saleh et al., 2023; Samantaray et al., 2023; Sangar et al., 2024).

D. CONCLUSION AND SUGGESTIONS
Rainfall in Semarang City fluctuates, and anomalies occur every month. Climate change and El Nino or La Nina cause these anomalies. The Adaptive Neuro-Fuzzy Inference System (ANFIS) with a Grid Partitioning (GP) approach performs better with the Gaussian membership function than with the Generalized Bell membership function. The RMSE and MAPE values obtained by ANFIS-GP with the Gaussian membership function are 0.0898 and 5.2911, respectively. The 30-day rainfall forecast shows an anomaly of 102.53 mm on day 30, which could cause hydrometeorological disasters such as flooding. For further research, other membership functions can be used to increase forecasting accuracy.

Figure 4. Time Series Plot of (a) Rainfall, (b) Mean Temperature, (c) Relative Humidity, and (d) Wind Speed

Figure 6. Prediction with ANFIS-GP using Gaussian Membership Function

In Equation (6), $\bar{w}_i$ is the normalized firing strength, $\sum \bar{w}_i f_i$ is the sum of the outputs of Layer 4, and $\sum w_i$ is the sum of the firing strengths that are normalized in Layer 3. The final step in this research is to evaluate the model using Root Mean Square Error (RMSE) and Mean Absolute Percentage Error (MAPE).

Table 2. Non-Linear Parameter Values of Gaussian Membership Functions

Table 3. Non-Linear Parameter Values of Generalized Bell Membership Function

In the rules, $x_1$ is temperature, $x_2$ is humidity, and $x_3$ is wind speed. The membership functions of the inputs are the premises, and C is the conclusion. The next step is defuzzification, which returns fuzzy numbers to crisp (classical) numbers using the linear (consequent) parameters produced by the forward pass of the learning process. In contrast to the nonlinear parameters, the linear parameters are the same in number for both membership functions, namely $p_i$, $q_i$, $r_i$, and $s_i$. The number of parameter groups corresponds to the number of rules, so there are eight groups of linear parameters.
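The defuzzification and output steps for three inputs can be sketched in Python as follows. This is a minimal illustration, not the fitted model: the function names are hypothetical, the pairing of p, q, r with temperature, humidity, and wind speed (with s as the constant term) is an assumption, and the numeric values are made up.

```python
def rule_output(params, x1, x2, x3):
    """First-order Sugeno consequent: f = p*x1 + q*x2 + r*x3 + s."""
    p, q, r, s = params
    return p * x1 + q * x2 + r * x3 + s


def anfis_output(w_bar, consequents, x1, x2, x3):
    """Layers 4 and 5: weight each rule output by w_bar, then sum."""
    if len(w_bar) != len(consequents):
        raise ValueError("one consequent parameter group is needed per rule")
    return sum(
        w * rule_output(params, x1, x2, x3)
        for w, params in zip(w_bar, consequents)
    )


# Hypothetical values: eight rules with equal normalized firing strengths and
# identical consequent groups (p, q, r, s); these are not fitted parameters.
w_bar = [1.0 / 8] * 8
consequents = [(0.5, 0.1, -0.2, 1.0)] * 8
y_hat = anfis_output(w_bar, consequents, x1=28.0, x2=76.0, x3=3.0)
```

Because the output is linear in p, q, r, and s once the normalized firing strengths are fixed, these parameters can be estimated in the forward pass, while the premise parameters are adjusted in the backward pass.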

Table 4. Linear Parameter Values of Gaussian Membership Function

Table 5. Linear Parameter Values of Generalized Bell Membership Function

Meanwhile, the ANFIS equation for the Generalized Bell membership function is based on the parameter values in Table 5. The ANFIS models with the Gaussian and Generalized Bell membership functions are evaluated using the Root Mean Square Error (RMSE) and Mean Absolute Percentage Error (MAPE) values. An ANFIS model is good if it has small RMSE and MAPE values. The performance of the Gaussian and Generalized Bell membership functions is compared in Table 6.

Table 6. Comparison of RMSE and MAPE values