Probabilistic Prediction Model Using Bayesian Inference in Climate Field: A Systematic Literature Review

Wildfires recur every year and damage natural ecosystems. Anticipating them requires a prediction model that produces accurate predictions. One approach to developing probabilistic prediction models is Bayesian inference. The purpose of this research is to review the methods that can be used to develop probabilistic prediction models with the Bayesian approach. The methodology is a Systematic Literature Review (SLR), which provides a comprehensive review of Bayesian inference research on probabilistic prediction models. The search strategy was the Boolean technique applied to the Scopus, IEEE Xplore, and ArXiv databases. Articles were selected on inclusion and exclusion criteria that favor novelty and a clear explanation of Bayesian methods, especially for prediction in the field of climate. The results show that probabilistic models can provide more accurate results than deterministic models. Bayesian Model Averaging (BMA) is widely used because it is easy to implement and extend, which can make predictions more accurate. Probabilistic prediction models with a Bayesian approach have considerable potential to grow, as seen from the number of research publications over the past 5 years. Research on probabilistic prediction models with Bayesian approaches in the field of climate is dominated by the research community in China, with the main problems related to hydrology.

A. INTRODUCTION
Bayesian inference is widely used to build prediction models, especially in biostatistics (Xie, 2022; Gaskins et al., 2021; Patrick, 2012), economics (Lekar et al., 2019; Gilboa et al., 2008; Cooper et al., 2004), and industry (Adedipe et al., 2020; Ghosh et al., 2020; Basuki et al., 2014). Bayesian models provide instruments for risk estimation and enable decision makers to integrate objective expert estimates with historical data. Using a Bayesian network model, Pendharkar et al. (2005) demonstrated how a confidence update procedure can be used to incorporate decision-making risk. They compared the predictive performance of the Bayesian model with that of nonparametric neural network forecasting models and well-known regression tree forecasting models, and showed that the Bayesian model is competitive for forecasting software development effort. Deep learning approaches to prediction frequently produce deterministic estimates and do not account for the inherent uncertainty in model predictions. Abbaszadeh et al. (2022) present a framework for probabilistic estimation that employs Bayesian Model Averaging (BMA). By taking model uncertainty into account, the proposed method generates more accurate and trustworthy soybean yield forecasts.
Climate change strongly affects other natural phenomena that can cause natural disasters in some areas, so predicting climate change is important. Climate change affects the entire world, and in particular India's fragile Himalayan mountain region, which is highly significant as a climatic indicator. Haq (2022) developed and optimized a Deep Long Short-Term Memory (CDLSTM) model to forecast temperature and rainfall values for all Himalayan states. In the field of climate, Bayesian inference is still rarely used to develop probabilistic prediction models, especially in Indonesia. Therefore, this review article studies the application of the Bayesian approach in building probabilistic prediction models in the field of climate. For descriptive analysis and synthesis of articles, we use a Systematic Literature Review (SLR). The findings of this study are expected to provide information on the current state of research and on future research prospects for Bayesian inference in building probabilistic prediction models in climate analysis, especially in Indonesia.

B. METHODS
The methodological principles of this research are based on the Systematic Literature Review (SLR) and apply the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) methodology. PRISMA provides a flowchart that can improve the reporting of systematic reviews and meta-analyses (Moher et al., 2009). The PRISMA flowchart has four phases: identification, screening, eligibility, and inclusion. According to Albuquerque et al. (2021), four things are essential in conducting an SLR: exploring the stages of literature assessment, identifying the step process, developing research questions, and deciding how to assess articles to overcome possible bias. The stages of the literature review and the step process can follow Pickering & Byrne (2014): defining the topic, formulating research questions, identifying keywords, conducting electronic searches, assessing publications, acquiring data, cleaning data, testing publications, revising summary tables, drafting methods, and revising the writing until the paper is finished. We used the general guidelines introduced by Petticrew & Roberts (2008), which focus on developing research questions and assessing articles to overcome possible bias in an SLR. The descriptive analysis in this study uses text mining to provide a comprehensive overview of the research topic. The detailed process can be seen in Figure 1 (Xiao & Watson, 2019).

Research Strategy
The article search strategy in this research uses the Boolean technique, which is adapted to obtain the essential search terms in the article collection. The search terms used are "Bayesian AND Probabilistic AND Prediction AND Climate." The article search covered the Scopus, IEEE Xplore, and ArXiv databases. Articles were selected based on the inclusion and exclusion criteria in Table 1.

Table 1. Inclusion and exclusion criteria

Inclusion criteria:
1. Must be articles or conference papers published in the last 10 years (2012-2022).
2. The article should include the development and implementation of Bayesian inference for model prediction.
3. Must be written in English.
4. QA value > 80%; the QA questions refer to Table 3.

Exclusion criteria:
1. Articles are written in another language.
2. Bayesian implementation does not focus on prediction models.
3. Articles do not focus on climate.
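As an illustration, the Boolean query and the publication-type and publication-year criteria could be expressed as a filter over bibliographic records. This is only a sketch under stated assumptions: the record fields (`title`, `abstract`, `year`, `doc_type`) and the function name `matches_search` are hypothetical and do not reflect the export format of Scopus, IEEE Xplore, or ArXiv, and the two sample records are made up.

```python
# Hypothetical record layout; real database exports use different field names.
SEARCH_TERMS = ("bayesian", "probabilistic", "prediction", "climate")
ALLOWED_TYPES = {"article", "conference paper"}


def matches_search(record, first_year=2012, last_year=2022):
    """Return True if a record satisfies the Boolean AND query and the
    publication-type and publication-year criteria."""
    text = (record["title"] + " " + record["abstract"]).lower()
    has_all_terms = all(term in text for term in SEARCH_TERMS)
    in_period = first_year <= record["year"] <= last_year
    allowed_type = record["doc_type"].lower() in ALLOWED_TYPES
    return has_all_terms and in_period and allowed_type


# Made-up records used only to exercise the filter.
records = [
    {"title": "Bayesian probabilistic prediction of rainfall",
     "abstract": "A climate application.", "year": 2016, "doc_type": "Article"},
    {"title": "Deterministic climate prediction",
     "abstract": "No Bayesian model is used.", "year": 2018, "doc_type": "Article"},
]
selected = [r for r in records if matches_search(r)]
```

Simple substring matching is used here for brevity; a real search engine applies its own tokenization and field rules, so counts from this sketch would not necessarily match those returned by the databases.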

Research Questions
The purpose of this study was also to answer the research questions in Table 2. We used the general guidelines developed by Petticrew & Roberts (2008), which focus on developing research questions and on how to assess articles to avoid bias in an SLR. Articles obtained through the search strategy were then screened by title and abstract against the research questions in Table 2.

Table 2. Research questions
1. What are the advantages and disadvantages of probabilistic prediction models?
2. What are the advantages and disadvantages of Bayesian Inference in Probabilistic models?
3. What methods are easy to implement and develop with Bayesian Inference?
4. What are the countries that have developed the concept of Bayesian Inference in prediction models?

Quality Assurance
After screening, all articles were ready for quality testing. The quality test followed the method developed by Al-Emran et al. (2018), using the six criteria in Table 3. Each article receives a score of 1 for a "yes" answer to a quality test question, 0.5 for a "partial" answer, and 0 for a "no" answer. Three people conducted the assessment, and their scores were averaged to obtain the final score. This study used only papers with an assessment percentage above 80%.

Table 3. Quality assurance questions
1. Are the research aims specified clearly?
2. Is the information presented clear and concise?
3. Does the study provide enough explanation of its methodology?
4. Do the findings of the study add to the understanding of Bayesian models?
5. Are the conclusions clearly identified?
6. Are the conclusions logical and consistent with the flow of the paper?
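The scoring rule can be sketched in a few lines. The function names and the example ratings below are hypothetical; the sketch assumes that each assessor's six answers are averaged first and that the three assessor averages are then averaged, which is one reasonable reading of the procedure described above.

```python
SCORES = {"yes": 1.0, "partial": 0.5, "no": 0.0}


def qa_percentage(ratings_by_assessor):
    """Average the per-assessor scores and express them as a percentage.

    ratings_by_assessor: one list per assessor, each holding one answer
    ("yes", "partial", or "no") per quality question.
    """
    per_assessor = [
        sum(SCORES[answer] for answer in ratings) / len(ratings)
        for ratings in ratings_by_assessor
    ]
    return 100.0 * sum(per_assessor) / len(per_assessor)


def passes_quality_check(ratings_by_assessor, threshold=80.0):
    """An article is kept only if its percentage is above the threshold."""
    return qa_percentage(ratings_by_assessor) > threshold


# Made-up ratings for one article: three assessors, six questions each.
example = [
    ["yes", "yes", "yes", "partial", "yes", "yes"],
    ["yes", "yes", "partial", "yes", "yes", "yes"],
    ["yes", "yes", "yes", "yes", "yes", "yes"],
]
```

With these made-up ratings the article scores roughly 94% and would be retained; an article rated "partial" on every question would score 50% and be excluded.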

C. RESULT AND DISCUSSION
The search with the keywords described in the research strategy section returned 196 records, including articles, conference papers, conference reviews, reviews, and book chapters published in 2012-2022 on the broad theme of Bayesian methods in climate. Following the inclusion and exclusion criteria, the collected records were first filtered by type so that only articles and conference papers remained. Because the records come from three databases (Scopus, IEEE Xplore, and ArXiv), duplicates had to be removed; two duplicate articles were found across the three databases. The remaining articles were evaluated by title and abstract according to the inclusion and exclusion criteria in Table 1.
Next, the 52 articles that passed the title and abstract screening were evaluated against the research questions in Table 2. After this screening, 16 papers were assessed using the quality criteria in Table 3, as described in the previous section. Table 4 shows the results of the review of these 16 articles. Three papers did not qualify because their percentage score was below 80%, so only 13 papers are used in this study. Details of the paper selection process are shown in Figure 2.
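Duplicate removal across several databases can be sketched as follows. The text does not state how duplicates were identified, so matching on a normalized title is an assumption made purely for illustration; the record fields and the sample records are hypothetical.

```python
import re


def normalize_title(title):
    """Lowercase a title and collapse punctuation and whitespace."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def remove_duplicates(records):
    """Keep the first occurrence of each normalized title."""
    seen = set()
    unique = []
    for record in records:
        key = normalize_title(record["title"])
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique


# Made-up records: the first two differ only in case and punctuation.
records = [
    {"title": "Bayesian Model Averaging for Rainfall", "source": "Scopus"},
    {"title": "Bayesian model averaging for rainfall.", "source": "IEEE Xplore"},
    {"title": "A Bayesian Network for Flood Risk", "source": "ArXiv"},
]
unique = remove_duplicates(records)
```

Matching on a DOI, where one is available, is usually more reliable than matching on titles, since titles can differ slightly between databases.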

Quality Assurance Results
Based on the questions in Table 3 and the rating scheme described in the previous section, the assessors rated the 16 articles considered in this study. The maximum possible percentage score is 100%. Some articles scored below 80%, indicating that they did not meet the quality assurance requirement described in the previous section; this study does not use those articles. The assessment results for the 16 articles are shown in Table 4, where Q1-Q6 denote quality assurance questions 1-6, A1-A3 denote assessors 1-3, and S1-S16 denote source articles 1-16.

Descriptive Analysis
The position of Bayesian inference research on probabilistic prediction models in the field of climate can be analyzed using diagrams and graphs. Figure 2 shows the number of citations of scientific articles. Figure 3 shows the classification results of related articles. Figure 4 shows the percentage of probabilistic prediction model development locations using Bayesian inference. Based on Figure 2, the number of citations each year is higher than the number of publications each year. The most publications appeared in 2013, and no related publication appeared in the following year, which indicates the need for more research on probabilistic prediction models using Bayesian inference.
The largest share of publications (38.5%) used the Bayesian Model Averaging (BMA) approach to build probabilistic prediction models, as shown in Figure 3. Based on the results of these studies, BMA is an approach that is easy to implement and modify and that produces good accuracy in probabilistic prediction models. As a framework for model selection and combination, BMA can be applied to a broad variety of problems. The determination of posterior model probabilities makes BMA a very simple model selection procedure. Under the Bayesian paradigm, such probabilities are clearly interpretable, the model space may be as large as necessary, and there is no need to keep track of the number of parameters or the kind of penalty, as is required by the information criterion approaches used in the statistical literature.
The second and third most widely used methods are Traditional Bayesian (TB) and Bayesian Network (BN). Both methods are strongly influenced by the choice of prior model. This differs from the BMA method, which uses the entire set of prior models to form the posterior. For BN, there is no universally accepted method for building networks from data, which makes the design of Bayesian networks a relatively large undertaking. A further problem is that Bayesian networks can only exploit the causal influences recognized by the person programming them, as shown in Figure 3.
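To make the role of posterior model probabilities concrete, the sketch below applies Bayes' rule over a finite set of candidate models, p(M_k | D) ∝ p(D | M_k) p(M_k). The function name and the log marginal likelihood values are hypothetical and chosen only for illustration; obtaining the marginal likelihoods themselves is the hard part in practice and is not shown.

```python
import math


def posterior_model_probabilities(log_marginal_likelihoods, prior_probs=None):
    """Apply Bayes' rule over a finite set of models.

    p(M_k | D) is proportional to p(D | M_k) * p(M_k). Working in log space
    and subtracting the maximum avoids numerical underflow.
    """
    n = len(log_marginal_likelihoods)
    if prior_probs is None:
        prior_probs = [1.0 / n] * n  # uniform prior over models
    log_post = [ll + math.log(p)
                for ll, p in zip(log_marginal_likelihoods, prior_probs)]
    shift = max(log_post)
    unnormalized = [math.exp(lp - shift) for lp in log_post]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


# Made-up log marginal likelihoods for three candidate models.
probs = posterior_model_probabilities([-120.0, -121.0, -125.0])
```

The resulting probabilities sum to one and can be read directly as the relative support for each model, which is the interpretability advantage noted above.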

Answer to Research Questions
Based on the objectives and findings summarized in Table 5, the drawback of the probabilistic prediction model is that it requires good prior knowledge, because prior knowledge strongly influences the computational results of the model. It therefore requires more development and computation time than a non-probabilistic model. On the other hand, because the probabilistic model is an update of a previous model, it can provide more accurate prediction results. The accuracy of probabilistic models is also affected by how they account for uncertainty in bias, model trends, and variability (Fan et al., 2017). Therefore, checking the quality of prediction accuracy can be done more efficiently. Bayesian inference is one alternative for obtaining prediction results from probabilistic models. The basic concept of the Bayesian approach is to update the information in each model; as with probabilistic models in general, updating the information can significantly improve accuracy. However, the computation takes longer and the resulting model tends to be "expensive", which can be a disadvantage of Bayesian inference for probabilistic models. Although computationally expensive, Bayesian inference prediction models can provide good benefits, considering the importance of accurate prediction results for decision support systems (Foster et al., 2020; Siegert et al., 2016).
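The idea of updating information can be illustrated with the simplest conjugate case, a Beta prior updated with binomial data. This example is not taken from the reviewed articles; the wet-day framing, the prior, and the observation counts are all made up for illustration.

```python
def update_beta(alpha, beta, successes, trials):
    """Conjugate update of a Beta(alpha, beta) prior with binomial data."""
    return alpha + successes, beta + (trials - successes)


def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)


# Prior belief: roughly a 50% chance of a wet day, held weakly.
prior = (2.0, 2.0)
# Made-up observations: 3 wet days out of 10, then 6 wet days out of 20.
after_first = update_beta(*prior, successes=3, trials=10)
after_second = update_beta(*after_first, successes=6, trials=20)
```

Each posterior serves as the prior for the next batch of data, so the estimate moves from the prior mean of 0.5 toward the observed frequency of 0.3 as evidence accumulates. The sketch also shows why the choice of prior matters most when data are scarce.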
The Bayesian approach has several methods, including Bayesian Model Averaging (BMA), Bayesian Network, Bayesian Hierarchical Model, and Bayesian Inferential Framework (Agustin & Adi, 2021; Alinezhad et al., 2021; Foster et al., 2020; Ji et al., 2019; Kapsch et al., 2012; Kim et al., 2017; Y. Li et al., 2022; Liang et al., 2013; Olson et al., 2016; Röpnack et al., 2013; Sharifahmadian & Latifi, 2013; Siegert et al., 2016). Among the articles used in this study, 38% used the BMA method. The BMA method is easy to implement and develop, and developing the method and its weighting can increase accuracy (Fan et al., 2017; Olson et al., 2016). The second most common method is traditional Bayesian, which uses only one input, namely the error value or "uncertainties" of the previous model. The third is the Bayesian network method for probabilistic prediction models. All of these methods use previous information to increase knowledge so that predictions become more accurate. The three methods have characteristics whose usefulness can be matched to the needs of the problem. The main problem in building a probabilistic prediction model with the Bayesian approach is the selection of the priors and models used as input for the next stage. The BMA and Bayesian Network methods are suitable for overcoming this problem. The BMA method is easy to implement and modify. There are three versions of the BMA method, all of which use all priors or previous models to obtain the posterior distribution. In the BMA method, the posterior distribution is assumed to follow Equation 2, written here in the usual BMA notation, where $g_k$ is a Gaussian distribution with hyperparameters mean $(a_k + b_k f_k)$ and variance $\sigma^2$:

$$p(y \mid f_1, \ldots, f_K) = \sum_{k=1}^{K} w_k \, g_k(y \mid f_k) \qquad (2)$$

The parameter vector of the model comprises $(a_k, b_k)$, $\sigma^2$, and $w_k$ for each $k = 1, \ldots, K$. For clarity, the different schematics of the three versions are shown in Figure 5.
Of the three versions of BMA, two modify the standard method to improve the quality of the prior distribution, namely Leave One Out (LOO) BMA and Perturbed BMA. These modifications are inspired by leave-one-out cross-validation and random forests. LOO BMA estimates the mean with the LOO method and then calculates the weight and variance with maximum likelihood. In perturbed BMA, by contrast, the parameters are chosen randomly: perturbed BMA generates multiple models by injecting randomness into the data, similar to a random forest.
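A minimal sketch of a BMA predictive distribution is given below, assuming the common Gaussian-mixture form in which each member forecast `f_k` is bias-corrected as `a_k + b_k * f_k`, all members share one variance, and the weights sum to one. The function names, symbols, and numerical values are illustrative assumptions; the parameters are set by hand rather than fitted, and neither the LOO nor the perturbed variant is implemented.

```python
import math


def gaussian_pdf(y, mean, variance):
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-0.5 * (y - mean) ** 2 / variance) / math.sqrt(2 * math.pi * variance)


def bma_predictive_pdf(y, forecasts, weights, a, b, variance):
    """Weighted mixture of Gaussians centred on bias-corrected forecasts."""
    return sum(
        w * gaussian_pdf(y, a_k + b_k * f_k, variance)
        for w, a_k, b_k, f_k in zip(weights, a, b, forecasts)
    )


def bma_predictive_mean(forecasts, weights, a, b):
    """Mean of the mixture: weighted average of bias-corrected forecasts."""
    return sum(w * (a_k + b_k * f_k)
               for w, a_k, b_k, f_k in zip(weights, a, b, forecasts))


def bma_predictive_variance(forecasts, weights, a, b, variance):
    """Between-model spread plus the shared within-model variance."""
    mean = bma_predictive_mean(forecasts, weights, a, b)
    between = sum(w * (a_k + b_k * f_k - mean) ** 2
                  for w, a_k, b_k, f_k in zip(weights, a, b, forecasts))
    return between + variance


# Made-up member forecasts (for example, temperature) and hand-set parameters.
forecasts = [20.0, 22.0, 25.0]
weights = [0.5, 0.3, 0.2]   # must sum to 1
a = [0.0, 0.0, 0.0]         # additive bias corrections
b = [1.0, 1.0, 1.0]         # multiplicative bias corrections
mean = bma_predictive_mean(forecasts, weights, a, b)
var = bma_predictive_variance(forecasts, weights, a, b, variance=1.0)
```

The predictive variance splits into a between-model term and a within-model term, which is how a BMA forecast expresses model uncertainty in addition to the uncertainty of each individual model.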
Another method is the Bayesian Network (BN). Unlike the BMA method, which mainly uses prior distributions that are all fairly similar, the BN method is effective when the variables and knowledge differ from the prior knowledge and need to be integrated into one framework. BN is suitable in such cases because it encodes the probabilistic relationships between variables and allows for mutually independent variables, as shown in Figure 5.
The countries that have developed the concept of Bayesian inference in prediction models are China, Germany, Indonesia, Australia, the USA, the UK, Iran, and Korea. Most of the research is in China, with the main problems in the field of hydrology. In other countries, studies focus on Sea Surface Temperature (SST), which is influenced by rainfall and global climate. While some studies are conducted globally, others focus on one region, such as East Nusa Tenggara, East Asia, or South Australia, because these regions have consistent yearly drought patterns. Consistent drought patterns are an advantage in building prediction models with good results.

Table 5. Objectives and findings of the reviewed articles

Foster et al. (2020)
Objective: Implementation, calibration, and performance evaluation of a Linear Inverse Model (LIM) in SST anomaly forecasting using Bayesian inference and probabilistic scoring.
Findings:
- Bayesian inference provides better probabilistic and estimation capabilities than traditional point estimation methods for LIM models.
- The choice of prior distribution significantly impacts the estimation results, such as improving the ability of the model to capture the anomalous variability of SST.
- Bayesian inference is computationally expensive.

Siegert et al. (2016)
Objective: Application of a Bayesian framework for verification and recalibration of ensemble forecasts based on a small hindcast dataset.
Findings:
- The Bayesian framework can improve experience capabilities compared to more straightforward recalibration methods.
- Computational methods for Bayesian analysis tend to be "expensive."
- Inferences depend on the assumption of appropriate parametric middles.

Fan et al. (2017)
Objective: Using Bayesian inference to calculate model weights on climate change projection ensembles to create probabilistic projections.
Findings:
- The method presented is more robust.
- Different priors can lead to different conclusions.
- Bayesian inference considers "uncertainties" in model bias, trends, and internal variability, including errors in the observations used.

Alinezhad et al. (2021)
Objective: Hydro-climate projections with uncertainties in the Zayandeh-Rud river basin using Bayesian Model Averaging (BMA).
Findings:
- The BMA approach produces better estimates of meteorology, including temperature and rainfall, from the observed values during the base period.

Olson et al. (2016)
Objective: Probabilistic projections using Regional Climate Models (RCMs) with Bayesian Model Averaging (BMA).
Findings:
- Weighted projections are well-calibrated and more precise than unweighted projections.

Liang et al. (2013)
Objective: Forecasting on several combinations of hydrological models using Bayesian Model Averaging (BMA).
Findings:
- BMA is always better than Simple Average Methods (SAM).
- The BMA approach is excellent and robust for integrating single models.
- Confidence interval prediction with BMA can provide information to support flood control decisions.

Röpnack et al. (2013)
Objective: Measuring disaster impact as a decision support system using Bayesian Network.
Findings:
- The updated Bayesian approach can be used to guide maintenance strategies and operations of buildings and infrastructure.

Ji et al. (2019)
Objective: Improving rainfall prediction capability in East Asia based on Ensemble Prediction System (EPS) outputs from ECMWF, NCEP, and UKMO from the TIGGE dataset.
Findings:
- Light rainfall events are more accurately predicted using the standard deterministic BMA, with limited ability for moderate and heavy rainfall events.
- The categorized deterministic BMA model is superior to the standard model, especially for moderate and heavy rainfall.
- The categorized deterministic BMA model provides better-calibrated rainfall probabilities than the standard model.
- The superiority of the BMA model to climatological forecasts only sometimes lasts for a more extended period.

Y. Li et al. (2022)
Objective: Developing a Spatiotemporal Projection Based Bayesian Hierarchical Model (STP-BHM) for sub-seasonal rainfall forecasting.
Findings:
- The STP-BHM model can provide skillful and reliable probabilistic forecasts.
- The Bayesian statistical model is more flexible and efficient for assessing multiple reliable sources.
- The STP-BHM model shows the functional predictive ability for both below-normal and above-normal events, and positive Brier ability is seen at all critical times.
- The STP-BHM model outperforms the NCEP S2S dynamic model when the lead time is more than 5s.

Sharifahmadian & Latifi (2013)
Objective: Measuring the performance of the Water Environment Risk prediction model using Bayesian Network.
Findings:
- The proposed method effectively predicts Water Environment Risk and improves water resource.

D. CONCLUSION AND SUGGESTIONS
In conclusion, this article presents a systematic review of probabilistic prediction models using Bayesian inference in the field of climate. The main goal of this review was to gain insight into such models and to find the best approach to implementing them. Overall, probabilistic prediction models can improve accuracy. Methods used to build probabilistic prediction models in the climate field include the Bayesian Inferential Framework, Bayesian Network, Bayesian Hierarchical Model, and Bayesian Model Averaging (BMA). BMA is easy to modify and implement, and it is the most common method, used by 38% of the papers to build prediction models. Most of the research on probabilistic prediction models using Bayesian inference comes from China, with the main problems related to hydrology.