Comparison Model Analysis Time of Earthquake Occurrence in Indonesia based on Hazard Rate with Single Decrement Method

ABSTRACT


A. INTRODUCTION
Indonesia is an archipelagic country that is prone to natural disasters, especially earthquakes and tsunamis. The main cause is the tectonic setting of the Indonesian region, which lies at the meeting point of three major world plates, namely the Eurasian Continental Plate, the Indian-Australian Ocean Plate and the Pacific Ocean Plate. The meeting of these plates causes earthquakes. Active faults on land and at sea can also cause earthquakes. Earthquakes result in damage to buildings and loss of life (Cipta et al., 2021).
There were 18 destructive earthquakes in Indonesia in 2016, 19 in 2017 and 23 in 2018 (Siswadi, 2018). Among them was the Sumenep earthquake on October 10, 2018, with a magnitude of 6.4, which damaged dozens of houses and killed 3 people. The largest earthquake of that year was the Donggala-Palu earthquake on September 28, 2018, with a magnitude of 7.5, which caused a tsunami and liquefaction; more than 2000 people died, 1000 people were missing and thousands of houses were damaged. The Lombok earthquake occurred on August 5, 2018, with an initial magnitude of 7, and was followed by aftershocks. In this event 259 people died, more than 1000 people suffered serious injuries, more than 200 thousand people suffered minor injuries, and various public facilities and buildings were damaged (BBC News, 2018). The damage and loss of life caused considerable financial losses. The National Disaster Management Agency (BNPB) put the total loss from the Lombok earthquake at Rp 12.15 trillion, comprising damage to buildings of Rp 10.15 trillion and economic losses of Rp 2 trillion. One way to overcome such financial losses is insurance (Gumelar, 2018).
Many models have been used to forecast earthquakes, such as the assessment of point process models using epicentral location data (Bray & Schoenberg, 2013). This approach shows the dominant locations of earthquakes. Another approach studies post-earthquake emergency response in a smart city using a semi-Markov model (Ghosh & Gosavi, 2017). That elaborate semi-Markov model captures the stochastic dynamics of the events that follow an earthquake and is used to quantify the hazard rate to which people are exposed and to estimate the restoration time. The exponentiated Weibull distribution has also been used to model earthquake interevent times (Pasari & Dikshit, 2018). That work describes properties of the exponentiated Weibull distribution such as the survival rate, mode, median and hazard rate. In addition, insurance companies need a numerical analysis to estimate the probability of an earthquake at a certain location and time, as well as the expected value and the variance of the distribution of earthquakes. The hazard rate has an important role in the theory of the likelihood of an earthquake occurring. If the hazard rate is known, then the joint density for the realization of earthquake data in $(0,T)$ can be determined. Generally, earthquake hazard rates are estimated based on the point process likelihood equation introduced by Vere-Jones in 1995 (Daley & Vere-Jones, 2003). The single decrement method has many applications. For example, it can be used to estimate claim occurrence in non-life insurance (Rangkuti & Sunusi, 2019). That research concluded that the level of risk is influenced by the ratio of the number of claims filed to the number of people who had an accident but did not submit a claim, and also by the ratio between the time intervals of claim occurrence from the beginning of the observation.
A function $X$ is called a random variable if $\Omega$ is a sample space with a probability measure and $X$ is a real-valued function defined on the members of $\Omega$ (Miller, 2014). The expected value is defined in two cases: 1. If $X$ is a discrete random variable with probability mass function $p(x)$, then the expected value of $X$ is defined as $E(X) = \sum_{x} x\,p(x)$, provided the sum converges absolutely. If the sum diverges, then the expected value of $X$ does not exist. 2. If $X$ is a continuous random variable with probability density function $f(x)$, then the expected value of $X$ is defined as $E(X) = \int_{-\infty}^{\infty} x\,f(x)\,dx$, provided the integral converges absolutely. If the integral diverges, then the expected value of $X$ does not exist (Hogg, McKean, & Craig, 2019).
The variance of a random variable $X$ can be defined as $\operatorname{Var}(X) = E\left[(X - E(X))^{2}\right]$ (Ghahramani, 2018).
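As a small illustration of these definitions, the following Python sketch computes the expected value and the variance of a finite discrete random variable, taking the variance as the expected squared deviation from the mean. The probability mass function `pmf` and the function names are hypothetical and chosen only for illustration; they are not data from this study.

```python
from math import fsum


def expected_value(pmf):
    """Return E(X) = sum of x * p(x) for a finite discrete distribution."""
    return fsum(x * p for x, p in pmf.items())


def variance(pmf):
    """Return Var(X) = E[(X - E(X))^2] for a finite discrete distribution."""
    mean = expected_value(pmf)
    return fsum((x - mean) ** 2 * p for x, p in pmf.items())


# Hypothetical pmf (illustrative only): number of damaging earthquakes in a month.
pmf = {0: 0.5, 1: 0.3, 2: 0.2}

mean = expected_value(pmf)
var = variance(pmf)
print(f"E(X) = {mean:.2f}, Var(X) = {var:.2f}")
```

For a finite support the sums always converge, so the existence conditions in the definition only matter for distributions with infinitely many values.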
The cumulative distribution function $F(x)$ of a random variable $X$ is the probability that $X$ is less than or equal to a given value, denoted $F(x) = P(X \le x)$. The cumulative distribution function must meet the following conditions: 1. $0 \le F(x) \le 1$ for every $x$. 2. $F(x)$ is a non-decreasing function. 3. $\lim_{x \to -\infty} F(x) = 0$ and $\lim_{x \to \infty} F(x) = 1$.
The survival function $S(x)$ of a random variable $X$ is the probability that $X$ is greater than a given value, denoted $S(x) = P(X > x) = 1 - F(x)$. The survival function must meet the following conditions: 1. $0 \le S(x) \le 1$ for every $x$. 2. $S(x)$ is a non-increasing function. 3. $\lim_{x \to -\infty} S(x) = 1$ and $\lim_{x \to \infty} S(x) = 0$.
The hazard function is the ratio of the probability density function to the survival function,
$$\mu(x) = \frac{f(x)}{S(x)}. \tag{2}$$
In addition to the above formula, the hazard function can also be formulated as
$$\mu(x) = -\frac{d}{dx}\ln S(x). \tag{3}$$
For a nonnegative random variable, the relationship between the survival function and the hazard function is
$$S(x) = \exp\left(-\int_{0}^{x} \mu(u)\,du\right). \tag{4}$$
The hazard function has many uses, such as a nationally consistent probabilistic tsunami hazard assessment for Indonesia. That assessment produces time-independent forecasts of tsunami hazard at the coast using data on tsunamis generated by local, regional and distant earthquake sources (Horspool et al., 2014). Hazard can also be used to illustrate the limitations of earthquake hazard mapping, because key aspects of hazard maps often depend on poorly constrained parameters, whose values are chosen based on the mapmakers' preconceptions (Stein et al., 2012). An additional step is needed to improve such estimates, and one option is the single decrement method. The hazard rate from the single decrement method has been used to estimate mortality in DKI Jakarta (Nisa, 2017).
The hazard rate approach using the single decrement method with the likelihood approach requires exit time information, namely the time when an earthquake occurs. Let $d_{t_0}$ represent the number of earthquakes that occurred in the interval $(t_0, t_0 + 1]$ and $n_{t_0} - d_{t_0}$ represent the number of earthquakes that occurred after $t_0 + 1$. Because the time of occurrence differs for each earthquake, the earthquakes are considered separately and the likelihood function is the product of the contributions of the individual earthquakes. The likelihood for the i-th earthquake in the interval $(t_0, t_0 + 1]$ is given by the probability density function for the occurrence of an earthquake in that interval, given that no earthquake occurred until $t_0$. This can be stated as follows.
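$$L_i = f\left(t_0^{(i)} \mid T > t_0\right) = \frac{S\left(t_0^{(i)}\right)\mu\left(t_0^{(i)}\right)}{S(t_0)}, \tag{5}$$
i.e. the i-th contribution to $L$, where $t_0^{(i)}$ is the time of the i-th earthquake. If $s_i = t_0^{(i)} - t_0$ is the time of occurrence of the i-th earthquake within the interval $(t_0, t_0 + 1]$, with $0 < s_i \le 1$, and ${}_{s}p_{t_0} = S(t_0 + s)/S(t_0)$, then
$$L_i = {}_{s_i}p_{t_0}\,\mu_{t_0 + s_i}. \tag{6}$$
The contribution of the $d_{t_0}$ earthquakes to $L$ is $\prod_{i=1}^{d_{t_0}} {}_{s_i}p_{t_0}\,\mu_{t_0 + s_i}$. The contribution of the $n_{t_0} - d_{t_0}$ earthquakes that occur after $t_0 + 1$ is $(p_{t_0})^{n_{t_0} - d_{t_0}}$, where $p_{t_0} = {}_{1}p_{t_0}$ and $n_{t_0}$ is the number of earthquakes that appear at time $t_0$ or later. Thus, the total likelihood is
$$L = (p_{t_0})^{n_{t_0} - d_{t_0}} \prod_{i=1}^{d_{t_0}} {}_{s_i}p_{t_0}\,\mu_{t_0 + s_i}. \tag{7}$$
To solve equation (7) for $q_{t_0} = 1 - p_{t_0}$, it is necessary to assume that ${}_{s}p_{t_0}\,\mu_{t_0 + s}$ can be expressed in terms of $q_{t_0}$.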
We then review the cases in which $l_{t_0 + s}$, the number of earthquakes that occur after $t_0 + s$, is assumed to be a linear or an exponential function of $s$, as required to express ${}_{s}p_{t_0}\,\mu_{t_0 + s}$. If $l_{t_0 + s}$ is a linear function, then
$$l_{t_0 + s} = a + bs. \tag{8}$$
The estimator for $q_{t_0}$ is
$$\hat{q}_{t_0} = \frac{d_{t_0}}{n_{t_0}}. \tag{9}$$
If $l_{t_0 + s}$ is an exponential function, then
$$l_{t_0 + s} = a\,b^{s}. \tag{10}$$
In this case the hazard rate is constant over the interval, and the estimator for $\mu_{t_0}$ is
$$\hat{\mu}_{t_0} = \frac{d_{t_0}}{\sum_{i=1}^{d_{t_0}} s_i + \left(n_{t_0} - d_{t_0}\right)}, \tag{11}$$
which is the maximum likelihood estimator for $\mu_{t_0}$. The hazard rate has an important role because it is related to the estimated probability of an event occurring in a certain location. Based on the definition, for the time interval $(t_0, t_0 + \Delta t_0)$, the probability of occurrence of an event is $\mu(t_0)\Delta t_0$, and the probability that there is no event in the interval $(t_0, t_0 + s)$ is
$$\exp\left(-\int_{t_0}^{t_0 + s} \mu(u)\,du\right). \tag{12}$$
This study discusses a method for estimating the earthquake hazard rate using the single decrement method. This method is adapted from the estimation method in actuarial studies that is commonly used in constructing mortality tables. A previous study that estimated the earthquake hazard rate in the Nusa Tenggara region found that the single decrement hazard rate estimate is more informative than the estimate from the point process likelihood (Sunusi et al., 2013).
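The interval estimators under the linear and exponential assumptions can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the linear case uses the ratio of earthquakes in the interval to earthquakes at or after its start, and the exponential case uses the standard constant-hazard maximum likelihood form (events divided by total exposure), which is derived here from the total likelihood and may differ in presentation from the authors' formula. The function names and the sample numbers are hypothetical.

```python
from math import exp, fsum


def q_hat_linear(d, n):
    """Linear assumption: estimated interval probability d / n."""
    if n <= 0 or not 0 <= d <= n:
        raise ValueError("need n > 0 and 0 <= d <= n")
    return d / n


def mu_hat_exponential(exit_offsets, n):
    """Exponential (constant hazard) assumption.

    exit_offsets holds s_i, the time of each earthquake measured from the
    start of the unit interval, with 0 < s_i <= 1. Events that occur after
    the interval each contribute one full unit of exposure.
    """
    d = len(exit_offsets)
    if n <= 0 or d > n:
        raise ValueError("need n > 0 and at most n exit offsets")
    if any(not 0 < s <= 1 for s in exit_offsets):
        raise ValueError("each offset must satisfy 0 < s <= 1")
    exposure = fsum(exit_offsets) + (n - d)
    return d / exposure


# Hypothetical interval: 10 earthquakes at or after t0, 4 of them inside (t0, t0 + 1].
offsets = [0.2, 0.5, 0.9, 1.0]
q_hat = q_hat_linear(len(offsets), 10)
mu_hat = mu_hat_exponential(offsets, 10)
q_from_mu = 1.0 - exp(-mu_hat)  # interval probability implied by the constant hazard
print(q_hat, mu_hat, q_from_mu)
```

The two assumptions give different interval probabilities from the same counts because the exponential form also uses the exact exit times, not only the number of events.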
The problem investigated in this study is how to find the probability of an earthquake, the expected value and the variance of the earthquake occurrence data in $(0,T)$ for each province in Indonesia. The purpose of this study is to obtain these quantities for each province by numerical analysis, using the hazard rate estimate obtained from the single decrement method with the likelihood approach.
The conclusions of this research are expected to provide input to insurance companies and the government for managing the risk of loss due to earthquakes in each province in Indonesia. The results can also be used by insurance companies and the government to pay closer attention to the provinces that have the greatest chance of experiencing an earthquake disaster. The expected output of this research is a journal publication, so that the results can be useful for industry and society. The published results take the form of a model of earthquake risk in Indonesia based on data from each province.
Research relevant to this study is that of Maulidi (2014), which was limited to the province of Aceh, which has experienced a tsunami. In that study, the best model for estimating the time of earthquake occurrence in Aceh was the cubic model. A case study with a broader scope, namely all of Indonesia, would be very useful, which motivates the present research. In addition, other studies have developed models related to earthquake countermeasures. Earthquakes in Indonesia can be classified using Self Organizing Maps (Halim & Widodo, 2017). This method does not require special assumptions and can analyze multiple variables. That study formed the 4 best clusters based on the number of damaged buildings, damaged roads, deaths and affected victims. Another study grouped earthquake events by hypocenter using the Naïve Bayes algorithm optimized with the AdaBoost algorithm and concluded that the most common class was earthquakes with a hypocenter depth of 60 km to 300 km (Prathivi, 2020).

B. METHODS
The research was carried out through data analysis using Excel 2010, IBM SPSS Statistics 25, and Minitab 21. The data were obtained from the Indonesian Agency for Meteorology, Climatology and Geophysics (Badan Meteorologi, Klimatologi dan Geofisika, BMKG) through the website http://repogempa.bmkg.go.id/repo_new/. The selection criteria were dates from January 2010 to December 2020 and a rectangular geographic area covering all of Indonesia, from 6° N (top latitude) to 11° S (bottom latitude) and from 95° E to 141° E longitude. The minimum magnitude was 5 SR and the maximum was 15 SR (Richter scale).
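The selection criteria above can be expressed as a simple filter. The following Python sketch is illustrative only: the `Quake` record layout, the field names and the sample rows are hypothetical and do not reflect the actual BMKG export format, and the bottom latitude is assumed to be 11 degrees south (written as -11.0).

```python
from datetime import date
from typing import NamedTuple


class Quake(NamedTuple):
    day: date
    latitude: float   # degrees, north positive (assumed convention)
    longitude: float  # degrees, east positive (assumed convention)
    magnitude: float


def meets_criteria(q: Quake) -> bool:
    """Apply the date, bounding-box and magnitude criteria of the study."""
    in_period = date(2010, 1, 1) <= q.day <= date(2020, 12, 31)
    in_area = -11.0 <= q.latitude <= 6.0 and 95.0 <= q.longitude <= 141.0
    in_magnitude = 5.0 <= q.magnitude <= 15.0
    return in_period and in_area and in_magnitude


# Hypothetical rows, made up for illustration.
records = [
    Quake(date(2018, 5, 14), -2.5, 120.0, 6.1),  # satisfies every criterion
    Quake(date(2009, 6, 2), -1.0, 100.0, 6.8),   # before the study period
    Quake(date(2015, 3, 1), -8.0, 110.0, 4.2),   # below the minimum magnitude
    Quake(date(2016, 7, 9), 12.0, 125.0, 5.5),   # north of the bounding box
]
selected = [q for q in records if meets_criteria(q)]
print(len(selected))
```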
The method used is quantitative analysis. Quantitative research is an approach that relies on numbers, from data collection and interpretation of the data to the presentation of the results. This study uses secondary data analysis that leads to mathematical modeling of the analyzed data. The hazard rate has an important role in forecasting the probability of occurrence of events in a certain time interval. In this study, the hazard rate was estimated using the single decrement method, referred to as HRSD. HRSD includes a likelihood approach and a moment approach. The hazard rate is denoted by $\mu_{t_0}$.
Suppose $W(t_0) = T - t_0$ represents the waiting time until the next earthquake, given that a time $t_0$ has elapsed since the last earthquake, where $T$ is the recurrence time between two earthquakes, and let $\mu$, $S$ and $f$ represent the hazard rate, survival function and probability density function, respectively. The hazard rate $\mu_{t}$ can be expressed as
$$\mu_{t} = \frac{f(t)}{S(t)} = -\frac{d}{dt}\ln S(t). \tag{13}$$
By integrating (13) from $t_0$ to $t_0 + \Delta t_0$, the probability that no earthquake has occurred until $t_0 + \Delta t_0$, given that there has not been an earthquake until $t_0$, is
$$P(T > t_0 + \Delta t_0 \mid T > t_0) = \frac{S(t_0 + \Delta t_0)}{S(t_0)} = \exp\left(-\int_{t_0}^{t_0 + \Delta t_0} \mu_u\,du\right). \tag{14}$$
Suppose $t_0 = 0$, i.e. shortly after an earthquake occurs; then we get
$$P(T > \Delta t_0) = S(\Delta t_0) = \exp\left(-\int_{0}^{\Delta t_0} \mu_u\,du\right). \tag{15}$$
The earthquake forecast is formulated as the conditional probability of occurrence of an earthquake up to $t_0 + \Delta t_0$, given that no earthquake has occurred until $t_0$. The distributions of the recurrence time $T$ and of the waiting time $W(t_0)$ until the next earthquake are respectively
$$F_T(t) = P(T \le t) = 1 - S(t) \tag{16}$$
and
$$F_{W}(\Delta t_0) = P(T \le t_0 + \Delta t_0 \mid T > t_0) = 1 - \frac{S(t_0 + \Delta t_0)}{S(t_0)}. \tag{17}$$
In this expression, (17) is the probability that an earthquake occurs between $t_0$ and $t_0 + \Delta t_0$, given that no earthquake has occurred until $t_0$.
The data used are secondary data on earthquakes that occurred in each of the 34 provinces in Indonesia. The data needed are the time of the earthquake, the depth of its source and its magnitude. The recorded data are earthquakes that have the potential to cause damage, namely those with a magnitude above 5. The data on earthquakes in all provinces in Indonesia were collected through the central BMKG institution.
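The conditional forecast described in this section, namely the probability that an earthquake does or does not occur in the next interval given that none has occurred so far, can be sketched numerically from any hazard function. The Python sketch below integrates the hazard with the trapezoidal rule; the constant hazard of 0.3 per time unit and all function names are hypothetical and used only to make the example checkable.

```python
from math import exp


def cumulative_hazard(hazard, start, end, steps=10_000):
    """Trapezoidal approximation of the integral of hazard(u) from start to end."""
    width = (end - start) / steps
    total = 0.5 * (hazard(start) + hazard(end))
    for k in range(1, steps):
        total += hazard(start + k * width)
    return total * width


def prob_no_event(hazard, t0, dt):
    """P(no earthquake until t0 + dt | no earthquake until t0)."""
    return exp(-cumulative_hazard(hazard, t0, t0 + dt))


def prob_event(hazard, t0, dt):
    """P(earthquake between t0 and t0 + dt | no earthquake until t0)."""
    return 1.0 - prob_no_event(hazard, t0, dt)


def constant_hazard(u):
    """Hypothetical hazard rate of 0.3 per time unit (illustrative only)."""
    return 0.3


p_quiet = prob_no_event(constant_hazard, 2.0, 1.0)
p_quake = prob_event(constant_hazard, 2.0, 1.0)
print(p_quiet, p_quake)
```

With a constant hazard the result does not depend on $t_0$; a time-varying hazard function can be passed in the same way.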
The data analysis in this study proceeds as follows:

1. Determine the hazard rate
The first step of the data analysis is to determine the hazard rate. The hazard rate is used to estimate the probability of occurrence of an earthquake in each province. Once the probability is known, the distribution of earthquake occurrences in each province can be determined.

2. Determine the survival model for the occurrence of earthquakes in each province in Indonesia
The survival model can be derived once the hazard function is found. The survival model is used to estimate the expected value and the risk of an earthquake in each province in Indonesia. This study compares four models, namely linear ($\mu_{t_0}^{*} = \beta_0 + \beta_1 t_0 + \varepsilon$), quadratic ($\mu_{t_0}^{*} = \beta_0 + \beta_1 t_0 + \beta_2 t_0^{2} + \varepsilon$), cubic ($\mu_{t_0}^{*} = \beta_0 + \beta_1 t_0 + \beta_2 t_0^{2} + \beta_3 t_0^{3} + \varepsilon$) and exponential, and selects the model with the smallest standard error of the estimate and the highest $R^{2}$.

3. Determine the expected value and variance of the distribution to estimate the risk of an earthquake in each province
The expected value of a random variable can be interpreted as the long-run weighted average of its values. The variance is a measure of data dispersion, that is, of the distance from the data values to the data mean. The variance can indicate the level of risk: the greater the variance, the greater the risk associated with the random variable. In this study, the risk of an earthquake is estimated from the variance obtained from the survival function found in the previous stage.

4. Summarize the results of data analysis
The next step is to draw conclusions from the analyzed data. The results of the data analysis take the form of the occurrence distribution function along with the expected value and variance for each province. Conclusions can be drawn by looking for provinces with a high risk of earthquakes. The risk limit for an earthquake is the average of the risks of all 34 provinces. If the variance of a province is above the average, it is categorized as an area with a high risk of earthquakes.
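The last two steps, computing the expected value and variance from a survival function and then flagging provinces whose variance exceeds the average, can be sketched as below. This is only a sketch under stated assumptions: it uses the identities $E(T) = \int_0^\infty S(t)\,dt$ and $E(T^2) = \int_0^\infty 2t\,S(t)\,dt$ for a nonnegative waiting time, truncates the integrals at a finite upper limit, and assumes constant hazard rates. The province names and rates are placeholders, not results of this study.

```python
from math import exp


def integrate(func, lower, upper, steps):
    """Trapezoidal rule on [lower, upper]."""
    width = (upper - lower) / steps
    total = 0.5 * (func(lower) + func(upper))
    for k in range(1, steps):
        total += func(lower + k * width)
    return total * width


def mean_and_variance(survival, upper=80.0, steps=40_000):
    """Mean and variance of a nonnegative waiting time from its survival function."""
    first = integrate(survival, 0.0, upper, steps)
    second = integrate(lambda t: 2.0 * t * survival(t), 0.0, upper, steps)
    return first, second - first ** 2


def exponential_survival(rate):
    """Survival function for a constant hazard rate (assumption of this sketch)."""
    return lambda t: exp(-rate * t)


# Placeholder hazard rates per province (hypothetical values).
hypothetical_rates = {"Province A": 0.5, "Province B": 1.0, "Province C": 2.0}

moments = {
    name: mean_and_variance(exponential_survival(rate))
    for name, rate in hypothetical_rates.items()
}

# Risk limit: the average variance over all provinces.
threshold = sum(var for _, var in moments.values()) / len(moments)
labels = {
    name: ("high" if var > threshold else "not high")
    for name, (_, var) in moments.items()
}
print(labels)
```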

C. RESULT AND DISCUSSION
The hazard rate has an important role in determining the probability of an event within a certain interval. Maulidi (2014) stated that this method uses two sub-methods, namely the maximum likelihood method and the moment method. Let $W(t_0) = T - t_0$ represent the waiting time for an earthquake to appear, where $t_0$ is the time the first earthquake appears and $T$ is the time the next earthquake appears. Let $\mu$, $S$ and $f$ represent the hazard rate, survival function and probability density function. The hazard rate can be expressed by
$$\mu_{t} = \frac{f(t)}{S(t)}.$$
To determine the likelihood estimate, exit time information is needed (the time when an earthquake occurs). Suppose $d_{t_0}$ represents the number of earthquakes that occurred in the interval $(t_0, t_0 + 1]$ and $n_{t_0} - d_{t_0}$ represents the number of earthquakes that occurred after $t_0 + 1$. The likelihood for the i-th earthquake in the interval $(t_0, t_0 + 1]$, which occurs at time $t_0^{(i)}$, is
$$L_i = f\left(t_0^{(i)} \mid T > t_0\right) = \frac{S\left(t_0^{(i)}\right)\mu\left(t_0^{(i)}\right)}{S(t_0)}.$$
If $s_i = t_0^{(i)} - t_0$ is the time of occurrence of the i-th earthquake within the interval $(t_0, t_0 + 1]$, with $0 < s_i \le 1$, then
$$L_i = {}_{s_i}p_{t_0}\,\mu_{t_0 + s_i}.$$
The contribution of the $d_{t_0}$ earthquakes to $L$ is $\prod_{i=1}^{d_{t_0}} {}_{s_i}p_{t_0}\,\mu_{t_0 + s_i}$, so the total likelihood is
$$L = (p_{t_0})^{n_{t_0} - d_{t_0}} \prod_{i=1}^{d_{t_0}} {}_{s_i}p_{t_0}\,\mu_{t_0 + s_i}.$$
To complete the likelihood estimation, it is necessary to assume that ${}_{s}p_{t_0}\,\mu_{t_0 + s}$ can be expressed in terms of $q_{t_0}$. The assumption that can be used is a linear distribution, which is in the form of
$$l_{t_0 + s} = a + bs. \tag{23}$$
For $s = 0$ we get $a = l_{t_0}$, while for $s = 1$ we get $a + b = l_{t_0 + 1}$, so that $b = l_{t_0 + 1} - a = l_{t_0 + 1} - l_{t_0} = -d_{t_0}$ and the equation becomes $l_{t_0 + s} = l_{t_0} - s\,d_{t_0}$. Maulidi (2014) calculated the estimated hazard rate for each point. The resulting estimator $\hat{q}_{t_0} = d_{t_0}/n_{t_0}$ is the maximum likelihood estimator for $q_{t_0}$, from which the hazard rate value $\mu_{t_0}$ is obtained. This estimate of the hazard rate can be applied to model estimation for earthquakes in Indonesia. The data selected are only earthquakes that have the potential to cause damage, namely those above 5.00 SR. The scatterplot of the earthquake magnitude data for 10 years, produced with SPSS, is shown in Figure 1.
The hazard rate value is estimated with the single decrement method using the likelihood estimator approach, under the assumption that the waiting time is linearly distributed. The results of the calculation of the hazard rate value are given in Table 1, and the hazard rate plot for each time unit is shown in Figure 2. Here $n_{t_0}$ is the number of earthquakes that occurred at or after $t_0$, $d_{t_0}$ is the number of earthquakes that occurred in the interval $(t_0, t_0 + 1]$, $q_{t_0}$ is the probability of an earthquake occurring in the interval $(t_0, t_0 + 1]$ given that there has been no earthquake until $t_0$, and $\mu_{t_0}$ is the earthquake hazard rate immediately after $t_0$. Next, the model with the smallest error relative to the actual data is determined. The hazard function is estimated by fitting the coefficients of four model types, namely linear ($\mu_{t_0}^{*} = \beta_0 + \beta_1 t_0 + \varepsilon$), quadratic ($\mu_{t_0}^{*} = \beta_0 + \beta_1 t_0 + \beta_2 t_0^{2} + \varepsilon$), cubic ($\mu_{t_0}^{*} = \beta_0 + \beta_1 t_0 + \beta_2 t_0^{2} + \beta_3 t_0^{3} + \varepsilon$) and exponential. Based on the SPSS analysis in Table 2, the quadratic and cubic models do not fit because of collinearity, so they are not recommended as estimators of the earthquake hazard rate in Indonesia.
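The comparison between the two remaining candidate models can be illustrated with ordinary least squares. The Python sketch below fits a linear model directly and an exponential model by regressing the logarithm of the hazard rate on time, then compares $R^{2}$ and the standard error of the estimate. The time index and hazard values are hypothetical, and both summaries are computed on the original scale, which is a choice of this sketch and may differ from how a statistics package such as SPSS reports them.

```python
from math import exp, log, sqrt


def fit_line(xs, ys):
    """Ordinary least squares for y = b0 + b1 * x; returns (b0, b1)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def fit_summary(ys, fitted, n_params):
    """Return (R squared, standard error of the estimate)."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - f) ** 2 for y, f in zip(ys, fitted))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot, sqrt(ss_res / (len(ys) - n_params))


# Hypothetical hazard rates by time index (illustrative, not Table 1).
t = [1, 2, 3, 4, 5, 6]
mu = [0.10, 0.13, 0.18, 0.24, 0.33, 0.45]

# Linear model: mu = a0 + a1 * t.
a0, a1 = fit_line(t, mu)
linear_fit = [a0 + a1 * x for x in t]

# Exponential model: ln(mu) = c0 + c1 * t, so mu = exp(c0 + c1 * t).
c0, c1 = fit_line(t, [log(y) for y in mu])
exponential_fit = [exp(c0 + c1 * x) for x in t]

r2_linear, see_linear = fit_summary(mu, linear_fit, 2)
r2_exponential, see_exponential = fit_summary(mu, exponential_fit, 2)
best = "exponential" if r2_exponential > r2_linear else "linear"
print(best, round(r2_linear, 3), round(r2_exponential, 3))
```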

Linear Model
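Suppose $\mu(t) = a + bt$; then we get the survival function
$$S(t) = \exp\left(-\int_{0}^{t} (a + bu)\,du\right) = \exp\left(-\left(at + \frac{b t^{2}}{2}\right)\right).$$
Let $\Delta t = \tau$; then
$$S(\tau) = \exp\left(-\left(a\tau + \frac{b \tau^{2}}{2}\right)\right).$$
We get the distribution function $F(\tau) = 1 - S(\tau)$. Then the probability density function for $\tau$ is
$$f(\tau) = (a + b\tau)\exp\left(-\left(a\tau + \frac{b \tau^{2}}{2}\right)\right).$$
The following are the results of the linear model from the analysis to estimate the hazard function. Based on Table 3, the estimated hazard function equation is $\mu_{t_0} = -159.002 + 0.079\,t_0$. The model summary is shown in Table 4. Table 4, obtained from the regression model in SPSS, shows that $t_0$ explains 55% of the variation in the hazard function, with a standard error of the estimate of 0.23. The expected value $E(W(\tau))$ for this model, where $\tau = \Delta t$, is built on the survival factor $e^{159.002\tau - 0.079\tau^{2}/2}$ that follows from the fitted coefficients.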
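The relationship between a linear hazard function, its survival function, its distribution function and its density can be checked numerically. The Python sketch below uses hypothetical positive coefficients `a` and `b` (not the fitted values from Table 3, which refer to calendar years) and compares the closed-form density with a finite-difference derivative of the distribution function.

```python
from math import exp


def survival(t, a, b):
    """S(t) = exp(-(a*t + b*t^2/2)) for the linear hazard mu(t) = a + b*t."""
    return exp(-(a * t + 0.5 * b * t * t))


def distribution(t, a, b):
    """F(t) = 1 - S(t)."""
    return 1.0 - survival(t, a, b)


def density(t, a, b):
    """f(t) = mu(t) * S(t) = (a + b*t) * exp(-(a*t + b*t^2/2))."""
    return (a + b * t) * survival(t, a, b)


# Hypothetical coefficients chosen so that the hazard is nonnegative for t >= 0.
a, b = 0.1, 0.05
t = 2.0

# Central finite difference of F as an independent check of the density.
step = 1e-6
numeric_density = (distribution(t + step, a, b) - distribution(t - step, a, b)) / (2 * step)
print(density(t, a, b), numeric_density)
```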

Exponential Model
The following are the results of the exponential model from the analysis to estimate the hazard function. Based on Table 5, the estimated hazard function equation is $\mu_{t_0} = 2.735 + 0.260\,t_0$. The model summary is shown in Table 6, where the independent variable is Year (X). Table 6 shows that $t_0$ explains 74.9% of the variation in the hazard function, with a standard error of the estimate of 0.484.

D. CONCLUSION AND SUGGESTIONS
Based on the results and data analysis, it can be concluded that earthquake occurrence in Indonesia can be modeled with the hazard rate function, estimated using the likelihood estimator in the single decrement method. To complete the likelihood estimator, it is necessary to assume that ${}_{s}p_{t_0}\,\mu_{t_0 + s}$ can be expressed in terms of $q_{t_0}$. Under the assumption of a linear distribution, we get $\hat{q}_{t_0} = d_{t_0}/n_{t_0}$ and the corresponding hazard rate estimate $\hat{\mu}_{t_0}$. The hazard rate model with the best fit is the exponential model, with $R^{2} = 74.9\%$. The value of $R^{2}$ for the linear function is 55%, while the quadratic and cubic models are declared unfeasible.