Analysis of Food Security Factors in Indonesia using SEM-GSCA with the Alternating Least Squares Method

ABSTRACT


A. INTRODUCTION
Food security is a critical issue and a top priority in the policies of developing countries. Several countries worldwide have adopted measures and policies, such as food reserves and price controls, in response to the anticipated food crisis (Ahmed & Ambinakudige, 2023; Fahad et al., 2023). To comprehensively understand the factors influencing food security in Indonesia, an adequate analytical method is required. One effective method is Structural Equation Modeling (SEM), a conceptual modeling approach widely used to answer research questions involving indirectly observed (latent) or directly observed dependent and independent variables (Sarstedt et al., 2022; Setiawan et al., 2021). In this research, SEM was chosen as the analytical method because of its ability to analyse complex relationships between latent variables and observed variables, assessing the various factors that influence food security simultaneously.
SEM combines factor analysis, which helps identify underlying latent variables, with path analysis, which allows direct and indirect relationships among the variables to be examined within a structural framework (Huang et al., 2022; Mai et al., 2018). Food security has been studied with SEM in several countries. For instance, Denny et al. (2018) conducted cross-scale empirical research using SEM on food security in Africa. Usman et al. (2023) used PLS-SEM modeling to explore the relationship between climate change, irrigation water, agriculture, rural livelihoods, and food security in Pakistan. Pervaiz et al. (2019) also used this method to determine the extent of agricultural land use and the level of food security in Pakistan. SEM was also used to assess the relationship between rural food security and both smallholder and commercial agriculture in Mexico (Galeana-Pizaña et al., 2021). In Indonesia, Riptanti et al. (2022) presented an analysis model for the loss of dryland agriculture in food-insecure areas. That research was carried out in East Nusa Tenggara Province, an area with relatively high levels of food insecurity in Indonesia.
Although this conceptual approach has been widely applied in previous research, it has not been used in conjunction with Generalized Structured Component Analysis (GSCA), which is an innovative method. According to Hwang et al. (2017) and Cho et al. (2020), GSCA is a component-based approach to SEM in which latent variables are estimated as weighted composites of a set of indicators. GSCA has certain advantages: it does not rely on normal distribution assumptions, and it does not require a large amount of data. In addition, it optimizes a single global criterion for the fit of the entire model, consistently minimizing the sum of squared errors to estimate the model parameters (Fakfare et al., 2023; Hwang et al., 2020). This research examined GSCA-based SEM using the Alternating Least Squares (ALS) method, an optimization algorithm that finds the parameter values minimizing the squared error between the empirical data and the data predicted by the model. Using ALS to analyse food security in Indonesia provided an advantage in estimating complex parameters and yielded more accurate analytical results. Hence, this research aimed to identify and analyse the determinants of food security in Indonesia, with a focus on factors related to the availability, access, and utilization of food resources. This was realized through the application of GSCA-based SEM with the ALS method, which can provide insight into these determinants.

B. METHODS
1. Data
Data on food security indicators for all 34 provinces in Indonesia were gathered from authoritative sources, including the Central Bureau of Statistics (Indonesian: Badan Pusat Statistik or BPS) and the Indonesian Ministry of Agriculture. A total of 4 latent variables and 11 indicator variables were used, as presented in Table 1. These variables were considered fixed, meaning each indicator variable had undergone systematic testing through Confirmatory Factor Analysis and effectively described a specific construct (factor) (Hair et al., 2020). The decision to treat these variables as fixed is based on theoretical and practical considerations. Theoretically, the selected indicators align with the established food security conceptual framework and are compatible with the existing literature and theoretical models. Practically, the data are available from official sources such as BPS and the Indonesian Ministry of Agriculture. This combination of theoretical relevance and data reliability supports the decision to treat the variables as fixed in the analysis. Based on relevant theories and the research results obtained, both direct and indirect relationships were observed between the latent variables. Consequently, a conceptual model was formulated and is presented in Figure 1, showing these relationships in the form of a path diagram:

2. Methods
a. ALS Estimation of SEM-GSCA
GSCA consists of three sub-models, which ALS handles jointly through matrix algebra. In particular, the measurement, structural, and weighted relation models are integrated into a single equation in matrix notation, as follows:

$$\begin{bmatrix} I \\ W' \end{bmatrix} z = \begin{bmatrix} C' \\ B' \end{bmatrix} W' z + \begin{bmatrix} \varepsilon \\ \zeta \end{bmatrix} \tag{1}$$

where $I$ is the identity matrix of order $J$, $[I \; W]'$ is a matrix of order $T \times J$, $[C \; B]'$ is a matrix of order $T \times P$, and $[\varepsilon' \; \zeta']'$ is a vector of size $T \times 1$, where $T = J + P$. Here $J$ is the number of indicators, $P$ is the number of latent variables, $C$ holds the loadings, $B$ the path coefficients, and $W$ the component weights. Equation (1) can be written more compactly as:

$$V'z = A'W'z + e \tag{2}$$

with $V = [I \; W]$, $A = [C \; B]$, and $e = [\varepsilon' \; \zeta']'$.

Let $z_i$ denote the $J \times 1$ vector of indicators measured on the $i$-th observation in a sample of $N$ observations ($i = 1, 2, \ldots, N$). The parameters to be estimated are the matrices $V$, $W$, and $A$. They are estimated by minimizing the least squares criterion, that is, by making the residuals $e_i$ as small as possible over all $N$ observations:

$$\phi = \sum_{i=1}^{N} (V'z_i - A'W'z_i)'(V'z_i - A'W'z_i) \tag{3}$$

with respect to $V$, $W$, and $A$. Throughout, $SS(X) = \mathrm{tr}(X'X)$ for any matrix $X$.

Let $Z$ denote the $N \times J$ matrix formed by stacking the observations one under the other, namely $Z = [z_1, z_2, \ldots, z_N]'$. Equation (3) can then be rewritten as:

$$\phi = SS(ZV - ZWA) = SS(\Psi - \Gamma A), \quad \text{where } \Psi = ZV \text{ and } \Gamma = ZW \tag{4}$$

Equation (4) cannot be solved directly because the matrices $V$, $W$, and $A$ may contain zero or other fixed elements. To address this, De Leeuw et al. (1976) developed ALS, which was used to minimize Equation (4). ALS in GSCA consists of two steps: updating $A$ for fixed $V$ and $W$, then updating $V$ and $W$ for fixed $A$.

The first step updates the matrix $A$ as follows:
1) Update the loadings or path coefficients in $A$ for fixed $V$ and $W$. Using the vec operator (Airola and Pahikkala, 2018), Equation (4) can be written as:

$$\phi = SS(\mathrm{vec}(\Psi) - \mathrm{vec}(\Gamma A)) \tag{5}$$

2) Apply the Kronecker product property $\mathrm{vec}(\Gamma A) = (I \otimes \Gamma)\,\mathrm{vec}(A)$ to Equation (5), which yields:

$$\phi = SS(\mathrm{vec}(\Psi) - (I \otimes \Gamma)\,\mathrm{vec}(A))$$

Let $a$ be the vector formed by removing the zero elements from $\mathrm{vec}(A)$, and let $\Omega$ be the matrix formed by deleting the columns of $(I \otimes \Gamma)$ that correspond to the zero elements of $\mathrm{vec}(A)$, assuming that $(I \otimes \Gamma)$ has full rank. The least squares estimate of $a$ for fixed $V$ and $W$ is then:

$$\hat{a} = (\Omega'\Omega)^{-1}\Omega'\,\mathrm{vec}(\Psi)$$

assuming that $\Omega'\Omega$ is non-singular.
3) Reconstruct the new matrix $A$ from $\hat{a}$.

The second step updates $V$ and $W$ for fixed $A$. As seen in Equation (4), some columns are shared by $V$ and $W$, while others appear in only one of them. Therefore, each column of $V$ and $W$ must be updated separately. The algorithm is as follows:
a) Suppose that only one column is shared by $V$ and $W$. Let $v_p$ denote the $p$-th column of $V$ ($p = 1, 2, \ldots, T$) and $w_q$ the $q$-th column of $W$ ($q = 1, 2, \ldots, P$), and assume that $v_p$ and $w_q$ are the same, $s = v_p = w_q$.
b) Define $\Lambda = WA$.
c) Let $V_{(-p)}$ be the matrix $V$ with its $p$-th column set to the zero vector; $V^{*}_{(p)}$ the matrix $V$ with all columns set to the zero vector except the $p$-th; $\Lambda_{(-q)}$ the product of the matrix $W$ with its $q$-th column set to the zero vector and the matrix $A$ with its $q$-th row set to the zero vector; $\Lambda^{*}_{(q)}$ the product of the matrix $W$ with all columns set to the zero vector except the $q$-th and the matrix $A$ with all rows set to the zero vector except the $q$-th; $e'_{(p)}$ the $1 \times T$ row vector whose elements are all zero except the $p$-th, which is one; and $a'_{(q)}$ the $q$-th row of $A$. To update $s$, Equation (4) can be rewritten as:

$$\begin{aligned}
\phi &= SS\big(Z[V_{(-p)} + V^{*}_{(p)}] - Z[\Lambda_{(-q)} + \Lambda^{*}_{(q)}]\big) \\
&= SS\big(Zs(e'_{(p)} - a'_{(q)}) - Z(\Lambda_{(-q)} - V_{(-p)})\big) \\
&= SS(Zs\beta' - Z\Delta)
\end{aligned} \tag{6}$$

where $\beta' = e'_{(p)} - a'_{(q)}$ and $\Delta = \Lambda_{(-q)} - V_{(-p)}$.
d) Let $K$ and $L$ denote the numbers of columns containing unknown parameters in $V$ and $W$, respectively, and let $U$ denote the number of columns common to $V$ and $W$, so that $D = K + L - U$. To update all parameters in $V$ and $W$, Equation (6) can be generalized as:

$$\phi = \sum_{k=1}^{D} SS(Zs_k\beta' - Z\Delta) \tag{7}$$

where $\beta$ and $\Delta$ are formed for each column $s_k$ as in step c). By applying the Kronecker product property (Makkulau et al., 2010) to Equation (7), the following is obtained:

$$\phi = \sum_{k=1}^{D} SS\big((\beta \otimes Z)s_k - \mathrm{vec}(Z\Delta)\big)$$

Let $\eta_k$ be the vector formed by removing the fixed elements from $s_k$, and let $\Pi$ be the matrix formed by deleting the columns of $\beta \otimes Z$ that correspond to the fixed elements of $s_k$. The least squares estimate of $\eta_k$, assuming that $\Pi'\Pi$ is non-singular, can be written as:

$$\hat{\eta}_k = (\Pi'\Pi)^{-1}\Pi'\,\mathrm{vec}(Z\Delta)$$

ALS is considerably more intricate than ordinary least squares (OLS). It proceeds iteratively to minimize the residual error of the parameter estimates. First, the procedure updates the parameter matrix $A$ while keeping $V$ and $W$ constant. Next, it updates $V$ and $W$ while keeping $A$ constant. This back-and-forth process continues until convergence, which is reached when the difference between the current estimate and the previous estimate falls below a predetermined threshold, usually set at 0.0001. The result is the optimal parameter matrix $A$, which represents the SEM-GSCA parameter estimates.
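The alternating scheme described above can be illustrated on a much smaller problem. The following is a minimal Python sketch, not the GSCA estimator itself: it fits a rank-one approximation $X \approx uv'$ by alternating two closed-form least squares updates, and stops when the change in the sum of squared residuals falls below 0.0001. The function name `als_rank_one`, the toy matrix, and the use of the loss change as the stopping rule are assumptions made for illustration.

```python
def als_rank_one(X, tol=1e-4, max_iter=500):
    """Toy alternating least squares: approximate X by the outer product u v'.

    Illustrative only. GSCA alternates between A and (V, W) in the same spirit,
    but with the constrained updates described in the text.
    """
    n_rows, n_cols = len(X), len(X[0])
    u = [1.0] * n_rows  # arbitrary starting values (assumption)
    v = [1.0] * n_cols
    prev_loss = float("inf")
    loss = prev_loss
    iteration = 0
    for iteration in range(1, max_iter + 1):
        # Step 1: update v for fixed u (closed-form least squares per column).
        uu = sum(ui * ui for ui in u)
        v = [sum(u[i] * X[i][j] for i in range(n_rows)) / uu
             for j in range(n_cols)]
        # Step 2: update u for fixed v (closed-form least squares per row).
        vv = sum(vj * vj for vj in v)
        u = [sum(X[i][j] * v[j] for j in range(n_cols)) / vv
             for i in range(n_rows)]
        # Sum of squared residuals after both updates.
        loss = sum((X[i][j] - u[i] * v[j]) ** 2
                   for i in range(n_rows) for j in range(n_cols))
        # Stop when the improvement is below the threshold.
        if abs(prev_loss - loss) < tol:
            break
        prev_loss = loss
    return u, v, loss, iteration


# Hypothetical data: a nearly rank-one matrix.
X_demo = [[2.0, 4.1], [1.0, 1.9], [3.0, 6.2]]
u_hat, v_hat, final_loss, n_iter = als_rank_one(X_demo)
print("iterations:", n_iter, "loss:", round(final_loss, 6))
```

Each update is an ordinary least squares problem with the other block held fixed, so the loss cannot increase from one step to the next. That property is what makes the alternating approach usable when the joint problem has no closed-form solution.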
b. Data Analysis Procedure Using SEM-GSCA
The data analysis using SEM-GSCA followed these steps:
1) Model specification. Model specification comprised determining the structural and measurement models to be tested. In this research, the model consisted of 4 latent variables, namely Availability ($\gamma_1$), Access ($\gamma_2$), Utilization ($\gamma_3$), and Food Security ($\gamma_4$), and 11 indicator variables $z_j$, where $j = 1, 2, \ldots, 11$, as shown in Figure 2.
2) Parameter estimation. ALS for GSCA parameter estimation was carried out using the R-Studio 4.2.1 software.
3) Evaluation of measurement models. The measurement model was evaluated by examining the loading factor, which was used to test the validity of each indicator. An indicator is said to meet convergent validity if its loading factor (correlation coefficient) is at least 0.5 (Shrestha, 2021). Subsequently, Composite Reliability (CR) and Average Variance Extracted (AVE) were computed to assess the reliability of the variables. A latent variable is deemed highly reliable when its CR exceeds the threshold of 0.7 (Hair Jr. et al., 2018).
4) Structural model evaluation. The structural model was evaluated using the CRb parameter significance test and R-Square ($R^2$) (Hwang and Takane, 2014). The bootstrap $t$ statistic, called the Critical Ratio (CRb), is calculated by dividing the parameter estimate by its bootstrap standard error. If the bootstrap $t$ value is equal to or greater than the critical value of the $t$ distribution, the parameter estimate is considered statistically significant at the level $\alpha = 0.05$ (Jung et al., 2019). In practice, parameters are considered significant if |CRb| > 2.
5) Evaluation of the overall fit of the model. This evaluation was carried out after the measurement model and structural model were found to be significant. The FIT, Adjusted FIT (AFIT), and Goodness of Fit Index (GFI) values were considered, as these measures have been used in previous model evaluations. FIT values range from 0 to 1 (Shi et al., 2020), and GFI values between 0.9 and 1 are considered favorable (Manosuthi et al., 2021).
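The CRb rule in step 4 reduces to a few lines of arithmetic. The following is a minimal Python sketch under stated assumptions: the bootstrap standard error is taken as the sample standard deviation of the bootstrap replicates, and the estimate and replicate values are invented for illustration and are not results from this research. The function names are hypothetical.

```python
import statistics


def critical_ratio(estimate, bootstrap_estimates):
    """CRb = parameter estimate divided by its bootstrap standard error."""
    # Assumption: the bootstrap standard error is the sample standard
    # deviation of the estimates obtained from the bootstrap resamples.
    bootstrap_se = statistics.stdev(bootstrap_estimates)
    return estimate / bootstrap_se


def is_significant(cr, threshold=2.0):
    """Apply the |CRb| > 2 rule of thumb described in the text."""
    return abs(cr) > threshold


# Hypothetical path coefficient and bootstrap replicates (not from the paper).
path_estimate = 0.40
replicates = [0.35, 0.45, 0.40, 0.30, 0.50]

cr_b = critical_ratio(path_estimate, replicates)
print("CRb:", round(cr_b, 3), "significant:", is_significant(cr_b))
```

A real analysis would use far more than five bootstrap resamples; the short list here only keeps the arithmetic easy to follow.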

C. RESULT AND DISCUSSION
1. Specification of Measurement Model
The measurement model describes the relationship between latent variables and indicator variables. Figure 2 is the path diagram of the measurement model used:

2. Specification of Structural Model
The structural model, on the other hand, describes the relationships between the latent variables, four of which were used in this research. The following is the path diagram formed in this regard (Figure 3). The latent variable $\gamma_1$ influences the latent variables $\gamma_2$ and $\gamma_3$, with residuals $\zeta_1$ and $\zeta_3$, respectively. The variables $\gamma_1$, $\gamma_2$, and $\gamma_3$ all have paths to $\gamma_4$, meaning that these three latent variables influence the latent variable $\gamma_4$, with residual $\zeta_2$.

3. Specification of Weighted Relationship Model
GSCA defines each latent variable as a component, that is, a weighted combination of its indicators. The weighted relation model expresses this relationship between indicator and latent variables explicitly. The weighted relation model in matrix form becomes:

4. Parameter Estimation
The estimated weight parameters linking the indicator variables to the latent variables are presented in Figure 4 (Jendryczko & Nussbeck, 2022). As shown in Figure 5, a positive estimate indicates that an increase in the value of one variable leads to an increase in the value of the other. Conversely, a negative estimate means that an increase in the value of one variable corresponds to a decrease in the value of the other, one example being the latent variable availability ($\gamma_1$). Among its three indicators, only $z_2$ had a negative value. This indicates that greater average daily per capita protein consumption in rural and urban areas corresponds to decreased food availability in Indonesia. The other indicator variables, $z_1$ and $z_3$, yielded positive estimates, indicating that the greater the average per capita daily consumption of tubers and vegetables in a province, the greater the food availability. The estimated loadings between latent variables and indicator variables can be seen in Figure 5. In a recursive relationship, the latent variable is influenced by the indicator variable and in turn influences that same indicator variable, as shown in Figure 5. In this situation, a reciprocal relationship, or interdependence, exists between the latent variable and the indicator variable. Apart from the relationships between indicator variables and their respective latent variables, the relationships among the latent variables themselves can be examined, either directly or indirectly. For instance, the latent variable "access" had a direct effect of -0.73 on food security. The latent variable "availability", on the other hand, influenced the access variable, which in turn influenced the food security variable, so that availability affected food security indirectly.

5. Model Evaluation
a. Evaluation of Measurement Models
The measurement model was evaluated to test the relationship between indicators and latent variables through the following tests:
1) Convergent validity
Convergent validity was tested by examining the estimated loadings. As established by previous research, convergent validity is fulfilled if the loading exceeds 0.5 (Hair et al., 2010). Based on the R program output, the estimated loadings are presented in Table 2. The results in Table 2 show that all indicators had an estimated loading larger than 0.5, indicating that they were valid.

2) Convergent Reliability
In evaluating the appropriateness of the measurement model leveraged in the final SEM, the robustness of its measurement was established through reliability assessment, using measures such as Composite Reliability (CR) and Average Variance Extracted (AVE).CR and AVE values for evaluating the reliability of the measurement model are presented in Table 3. Table 3 shows that the CR and AVE values of each latent variable measured based on the indicator variable were greater than 0.7 and 0.5, respectively, meaning the variables used were reliable.
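The two reliability measures can be computed directly from the loadings of one latent variable. The following is a minimal Python sketch, assuming standardized loadings and the commonly used formulas $CR = (\sum \lambda)^2 / ((\sum \lambda)^2 + \sum (1 - \lambda^2))$ and $AVE = \sum \lambda^2 / n$. The loadings are invented for illustration and are not the values in Table 2 or Table 3, and the function names are hypothetical.

```python
def composite_reliability(loadings):
    """CR from standardized loadings of a single latent variable."""
    sum_loadings_sq = sum(loadings) ** 2
    # Error variance of each indicator, assuming standardized loadings.
    error_variance = sum(1.0 - l * l for l in loadings)
    return sum_loadings_sq / (sum_loadings_sq + error_variance)


def average_variance_extracted(loadings):
    """AVE: mean of the squared standardized loadings."""
    return sum(l * l for l in loadings) / len(loadings)


# Hypothetical loadings for one latent variable (not from the paper).
example_loadings = [0.8, 0.7, 0.9]

cr = composite_reliability(example_loadings)
ave = average_variance_extracted(example_loadings)
print("CR:", round(cr, 3), "reliable:", cr > 0.7)
print("AVE:", round(ave, 3), "adequate:", ave > 0.5)
```

Both measures increase with the size of the loadings, which is why indicators that pass the 0.5 loading criterion tend to produce acceptable CR and AVE values as well.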
b. Evaluation of Structural Model
1) Parameter Significance Test
CRb was calculated by dividing each estimated parameter coefficient by its standard error, and the results are presented in Table 4. Based on the results in Table 4, the structural model form is as follows: The CRb values in Table 4 showed that the availability variable had a significant impact on the access, utilization, and food security variables. Similarly, the access variable significantly influenced the utilization and food security variables. Food security was also found to be significantly influenced by the availability, access, and utilization variables.
2) R-Square test ($R^2$)
The $R^2$ value is generally used to determine the influence of exogenous variables on the endogenous variables in a model. In SEM-GSCA, the $R^2$ value is interpreted in the same way as a squared correlation: it gives the proportion of the variance in an endogenous variable that is accounted for by the model. The $R^2$ values of the endogenous latent variables are presented in Table 5.

The least squares estimate of $\eta_k$, for which $\Pi$ was presumed to have full rank so that $\Pi'\Pi$ is non-singular, was $\hat{\eta}_k = (\Pi'\Pi)^{-1}\Pi'\,\mathrm{vec}(Z\Delta)$. Several factors were found responsible for the varying results obtained in this research, including the selection of the right model, the quality of the data used, the analytical method, and the characteristics and complexity of the estimated parameters. This research considered these factors to ensure the validity and reliability of its results. For this reason, all the variables used were fixed before the analysis. This means that the indicator variables had been tested systematically with CFA to confirm that each describes its construct (factor). This research has significant advantages, primarily stemming from the use of GSCA alongside ALS. This combination offers a solution to some of the challenges encountered in traditional SEM analysis, particularly the constraints associated with assuming normally distributed data and requiring large sample sizes. Regardless of the consistency and efficiency of the estimated parameters, the analytical results obtained were still very accurate. However, this research had several shortcomings, including that it did not describe numerically the iteration process used to obtain minimum residuals and reach convergence. This investigation is in line with Cho & Choi (2020) with regard to GSCA, where reflective indicators were found to produce the most efficient estimator in variance-based SEM. Cho et al. (2023) also observed similar results, where the component-based method was more robust to construct misrepresentations than the factor-based method. However, the method used in this investigation differs from previous research. It combined GSCA with ALS, which made a novel contribution to the modeling of the latent variable relationships. Based on the definition of food security by the Food and Agriculture Organization (FAO), this research compiled the Food Security Index (FSI) from the dimensions of availability, utilization, and access, with intake of per capita food production and GDP per capita serving as the key indicators to characterize the situation of food security. The results offer substantial implications for understanding the interconnections among the latent variables that characterize food security in Indonesia. In addition, the research introduced a new method, the integration of GSCA with ALS, as an alternative in SEM analysis. The method applies to various fields of social research, and even with minimal data it can be used to estimate complex parameters and produce more detailed and accurate analyses. Lastly, this research serves as a basis for decision-making and strategic planning to improve food security in Indonesia.

D. CONCLUSION AND SUGGESTIONS
In conclusion, the use of GSCA-based SEM with ALS underscores the significant influence of the latent variables availability, access, and utilization on Indonesia's food security, with a model assessment resulting in a GFI of 98%. This analysis emphasizes the need to address geographic and seasonal disparities in food availability, particularly in remote or vulnerable areas, to ensure equitable access nationally. Additionally, understanding the factors that influence physical and economic food accessibility, such as transportation infrastructure and market availability, is critical to addressing urban and rural disparities. Exploring food consumption patterns and nutritional practices at various levels of society is essential to explain the dynamics of food security. Further research should analyze the impact of external factors such as climate change and economic crises on food availability, accessibility, and overall food security in various regions of Indonesia. In addition, the direct and indirect impacts of latent variables on food security should be studied, while also considering external indicators that could make the model proposed in this research more complete.

Figure 2. Measurement Model Specification Path Diagram

Figure 3. Structural Model Specification Path Diagram

Figure 4. Path diagram of the estimated weight of indicator variables to latent variables

Figure 5. Path diagram for the estimation of loading parameters

Table 2. The estimated value of the measurement model loadings

Table 3. CR and AVE values of measurement models

Table 4. Parameter Significance Test Results

Table 5. $R^2$ of endogenous latent variables

Table 6. Model fit evaluation

From Table 6, it can be seen that the FIT and AFIT values were larger than 0.5. The FIT value signifies that the model could account for approximately 59% of the variation in the data. However, the obtained AFIT value of 56% was influenced by the complexity of the model. It is important to clarify that an increase in the number of variables present in a model tends to elevate both FIT and AFIT values. For GFI, a value of 98% was obtained, indicating that the model had a good fit. Based on the results of the model fit evaluation, it was concluded that the estimated model was good for describing the level of food security in Indonesia. The principal outcome of this research pertained to the least squares estimation of the vector $a$, given by $\hat{a} = (\Omega'\Omega)^{-1}\Omega'\,\mathrm{vec}(\Psi)$. This estimation required the matrix $(I \otimes \Gamma)$ to have full rank so that $\Omega'\Omega$ is non-singular.