4.2 Analysis of OLS and GWR Results
In the untransformed OLS model for residential facilities, all coefficients are negative except those for flood depth (7339.759) and family income (491.837). Flood duration has a coefficient of -0.041, inundated area -0.014, and land price -0.322. The positive coefficients mean that damage is expected to increase as flood depth and family income increase, while the negative coefficients mean that damage is expected to decrease as flood duration, inundated area and land price increase. The constant of the regression is -149.248, which differs somewhat from the constants of the other untransformed OLS models. Because the constant is the predicted damage when all five parameters are zero, a negative value suggests that relevant variables may be missing from the model. For the transformed dataset, the intercept (Intercept = -4.0 × 10^-6) is negative, as is the coefficient for land price (BCLP = -8.0 × 10^-5). The coefficients for inundated area (BCFAR = 1.0 × 10^-3), flood duration (BCDUR = 2.0 × 10^-6), family income (BCINC = 5.5 × 10^-5) and flood depth (BCDEP = 9.59 × 10^-4) are positive, with BCFAR the largest. Thus, for the transformed data, inundated area has the greatest bearing on flood damage, followed by flood depth. In summary, flood depth has the largest coefficient, and hence the highest influence, in the untransformed dataset, while inundated area has the largest coefficient in the transformed dataset.
In contrast, the coefficients from the untransformed OLS model for commercial facilities are positive except for flood duration and land price. Flood depth has a coefficient of 11892.550, flood duration -3.954, inundated area 0.203, family income 134.682 and land price -1.155. Damage is therefore expected to increase with flood depth, inundated area and family income, and to decrease as flood duration and land price increase. The constant of the regression is +2692.710, a positive value with respect to the amount of flood damage. For the transformed dataset, the intercept (Intercept = 0.230) is also positive, as are the coefficients for inundated area (BCFAR = 0.025), land price (BCLP = 2.522 × 10^-3) and flood duration (BCDUR = 2.7 × 10^-5), with BCFAR again the largest. The remaining coefficients are negative (BCDEP = -0.150 and BCINC = -0.010). This shows that, for the transformed data, inundated area has the greatest bearing on flood damage, followed by flood depth.
The coefficients from the untransformed OLS model for agricultural facilities are all positive. Flood depth has a coefficient of 3509.237, flood duration 6.721, inundated area 0.936, family income 1222.567 and land price 3.301. Damage is therefore expected to rise with every increase in flood depth, flood duration, inundated area, family income and the land price of the affected region. The constant of the regression, however, is -4790.713, a negative value with respect to the amount of flood damage. For the transformed dataset, the intercept (Intercept = 0.073) is positive, as are the coefficients for inundated area (BCFAR = 0.456) and land price (BCLP = 2.57 × 10^-4), with the former the largest. All the other coefficients are negative (BCDEP = -0.025, BCDUR = -7.11 × 10^-4 and BCINC = -0.020). This shows that, for the transformed data, inundated area has the greatest bearing on flood damage, followed by family income.
Some coefficients that are positive in OLS become negative in GWR, and vice versa, in both the untransformed and transformed datasets. This indicates that the factors have different effects under global and local conditions. Several evaluation measures, namely the coefficient of determination (R²), log-likelihood and AIC, were used to evaluate the OLS and GWR models (see Tables 6 and 7).
Table 6
OLS and GWR Evaluation for Untransformed Datasets
Parameter | OLS (Residential) | GWR (Residential) | OLS (Commercial) | GWR (Commercial) | OLS (Agricultural) | GWR (Agricultural) |
R² | 0.284 | 0.614 | 0.175 | 0.566 | 0.270 | 0.979 |
Log-likelihood | 4928.850 | 4775.675 | 8291.250 | 8049.553 | 294.519 | 241.335 |
AIC | 9871.699 | 9752.683 | 16596.500 | 16435.690 | 603.038 | 534.615 |
Table 7
OLS and GWR Evaluation for Transformed Datasets
Parameter | OLS (Residential) | GWR (Residential) | OLS (Commercial) | GWR (Commercial) | OLS (Agricultural) | GWR (Agricultural) |
R² | 0.138 | 0.481 | 0.190 | 0.320 | 0.305 | 0.873 |
Log-likelihood | 2904.051 | 3030.034 | 1446.733 | 1512.389 | 68.421 | 93.960 |
AIC | -5794.102 | -5845.397 | -2879.465 | -2671.310 | -122.843 | -134.956 |
The coefficient of determination (R-squared) improved from OLS to GWR for all of the above models. For the untransformed datasets, the commercial and residential facilities improved only to 0.57 and 0.61, respectively, while the agricultural facilities achieved the highest R-squared value, 0.98. However, the coefficient of determination is not the sole criterion for judging whether the GWR model significantly improves on the OLS model. The log-likelihood and AIC are two further measures needed to evaluate model performance. The tabulated log-likelihood of the GWR models for the untransformed datasets was lower than that of OLS. As for the AIC, a common rule is that the AIC values of two models should differ by at least 3 in absolute terms for the difference to be regarded as an improvement in performance. All of the generated models satisfy this condition.
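The relation between the tabulated log-likelihood and AIC values can be checked with a short calculation. The sketch below assumes the common definition AIC = 2k - 2 ln L, where k is the number of estimated parameters; it also assumes that the Table 6 log-likelihoods are magnitudes of negative values, which is not stated in the text. The dictionary names are illustrative only.

```python
# Values copied from Tables 6 and 7.
# Assumption: Table 6 log-likelihoods are magnitudes of negative values,
# so their sign is flipped here; Table 7 values are used as printed.
log_likelihood_ols = {
    ("untransformed", "residential"): -4928.850,
    ("untransformed", "commercial"): -8291.250,
    ("untransformed", "agricultural"): -294.519,
    ("transformed", "residential"): 2904.051,
    ("transformed", "commercial"): 1446.733,
    ("transformed", "agricultural"): 68.421,
}

# AIC stored as (OLS, GWR) pairs.
aic = {
    ("untransformed", "residential"): (9871.699, 9752.683),
    ("untransformed", "commercial"): (16596.500, 16435.690),
    ("untransformed", "agricultural"): (603.038, 534.615),
    ("transformed", "residential"): (-5794.102, -5845.397),
    ("transformed", "commercial"): (-2879.465, -2671.310),
    ("transformed", "agricultural"): (-122.843, -134.956),
}


def implied_parameter_count(aic_value, log_lik):
    """Solve AIC = 2k - 2 ln L for k."""
    return (aic_value + 2.0 * log_lik) / 2.0


# Parameter count implied by each OLS (AIC, log-likelihood) pair.
implied_k = {
    key: implied_parameter_count(aic[key][0], log_likelihood_ols[key])
    for key in aic
}

# Signed change in AIC when moving from OLS to GWR (negative = lower AIC).
aic_change = {key: gwr - ols for key, (ols, gwr) in aic.items()}

for key in aic:
    print(key, round(implied_k[key], 3), round(aic_change[key], 3))
```

Under these assumptions, every OLS pair implies approximately the same parameter count, which suggests the tabulated values are internally consistent. The signed differences also show the direction of each change: with the values as tabulated, the AIC decreases from OLS to GWR in every case except the transformed commercial model, where it increases.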
Several tests were also performed to analyze the datasets. For the untransformed data, the t-test showed that ‘flood depth’ (dep) is the only statistically significant parameter variable at the 1% significance level (p-value = 0.00000). For the transformed data, BCDEP (flood depth) and BCFAR (inundated area) were statistically significant at the 1% and 5% levels, respectively.
The Jarque-Bera Test rejected the normality of the residual distribution for the untransformed data (JB = 8812.902, p-value = 0.00000). Moreover, the Breusch-Pagan Test (BP = 1152.643, p-value = 0.00000), Koenker-Basset Test (KB = 106.607, p-value = 0.582) and the White Test on specification of robust test (WT = 132.148, p-value = 0.00000) confirmed the presence of spatial heteroscedasticity. The model is therefore spatially non-stationary. For the transformed data, JB = 53443.481 (p-value = 0.00000), which means the residuals are still not normally distributed. In this case, the Breusch-Pagan Test (BP = 98.723, p-value = 0.00000), Koenker-Basset Test (KB = 3.776, p-value = 0.582) and the White Test on specification of robust test (WT = 7.106, p-value = 0.996) all failed to indicate that the data are spatially non-stationary.
For the residential facilities datasets, the multicollinearity condition number was 10.950 for the untransformed data and 42.476 for the transformed data. The untransformed value is less than 30, while the transformed value exceeds this standard threshold. Thus, the latter has an issue with multicollinearity.
For the untransformed commercial facilities dataset, only 17.49% of the variation in the dependent variable is explained. This model therefore accounts for only approximately 17.49% of the variation in flood damage in the 2012 flood event in Gunsan City. For the transformed data, this increased to 19.00%.
The results of the t-test for the untransformed data showed that ‘flood depth’ (dep) is the only statistically significant parameter variable at the 1% significance level (p-value = 0.00000). For the transformed data, BCDEP (flood depth) is statistically significant at the 1% level, together with BCINC (family income). In this case, flood depth is indeed the most influential of the five factors.
The Jarque-Bera Test rejected the normality of the residual distribution for the untransformed data (JB = 96078.419, p-value = 0.00000). Additionally, the Breusch-Pagan Test (BP = 2656.296, p-value = 0.00000), Koenker-Basset Test (KB = 94.588, p-value = 0.00000) and the White Test on specification of robust test (WT = 117.630, p-value = 0.00000) confirmed the presence of spatial heteroscedasticity. The model is therefore spatially non-stationary. For the transformed data, JB = 8.797 (p-value = 0.012), so normality is still rejected at the 5% level. However, the Breusch-Pagan Test (BP = 6.184, p-value = 0.289), Koenker-Basset Test (KB = 5.205, p-value = 0.391) and the White Test on specification of robust test (WT = 17.955, p-value = 0.590) all failed to indicate that the data are spatially non-stationary.
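The link between a heteroscedasticity statistic and its p-value can be illustrated with a small calculation. The sketch below assumes that the Breusch-Pagan and Koenker-Basset statistics follow a chi-square distribution with 5 degrees of freedom, one per explanatory variable; the text does not state the degrees of freedom, so this is an assumption. The function name `chi2_sf_df5` is hypothetical, and the closed form used is specific to 5 degrees of freedom.

```python
import math


def chi2_sf_df5(x):
    """Upper-tail probability of a chi-square variable with 5 degrees of freedom.

    Uses the closed form for odd degrees of freedom:
    Q(x) = erfc(sqrt(x/2)) + sqrt(2/pi) * exp(-x/2) * (sqrt(x) + x**1.5 / 3)
    """
    normal_tail = math.erfc(math.sqrt(x / 2.0))
    correction = (
        math.sqrt(2.0 / math.pi)
        * math.exp(-x / 2.0)
        * (math.sqrt(x) + x ** 1.5 / 3.0)
    )
    return normal_tail + correction


# (statistic, reported p-value) for the transformed commercial data.
reported = {"BP": (6.184, 0.289), "KB": (5.205, 0.391)}

for name, (statistic, p_reported) in reported.items():
    p_computed = chi2_sf_df5(statistic)
    # A p-value above 0.05 means homoscedasticity is not rejected at the 5% level.
    print(name, round(p_computed, 3), p_reported, p_computed > 0.05)
```

If the assumption about the degrees of freedom holds, the computed values should agree closely with the reported p-values, and both lie well above 0.05, so heteroscedasticity is not detected for the transformed commercial data.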
The test for multicollinearity was also performed on the commercial facilities datasets. The multicollinearity condition number was 10.254 for the untransformed data and 38.679 for the transformed data. The untransformed value is less than 30, while the transformed value exceeds this standard threshold. Therefore, the latter has an issue with multicollinearity.
For the untransformed agricultural facilities dataset, only 26.97% of the variation in the dependent variable is explained. This model therefore accounts for only approximately 26.97% of the variation in flood damage in the 2012 flood event in Gunsan City. For the transformed data, this increased to 30.48%.
All the resulting coefficients of the parameter variables are given in the same units as their associated explanatory variables. The coefficient reflects the expected change in the dependent variable for every 1 unit change in the associated explanatory variable, holding all other variables constant.
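The interpretation of a coefficient as the expected change per unit change, holding the other variables constant, can be illustrated with the untransformed residential OLS coefficients reported earlier in this section. The sketch below is illustrative only: the input values in `baseline` are hypothetical and carry no units from the study, and the function name `predict_damage` is not taken from the source.

```python
# Untransformed OLS coefficients for residential facilities, as reported above.
RESIDENTIAL_OLS = {
    "flood_depth": 7339.759,
    "flood_duration": -0.041,
    "inundated_area": -0.014,
    "family_income": 491.837,
    "land_price": -0.322,
}
RESIDENTIAL_INTERCEPT = -149.248


def predict_damage(inputs, coefficients=RESIDENTIAL_OLS, intercept=RESIDENTIAL_INTERCEPT):
    """Linear prediction: intercept plus the sum of coefficient * value."""
    return intercept + sum(coefficients[name] * value for name, value in inputs.items())


# Hypothetical input values, chosen only to demonstrate the arithmetic.
baseline = {
    "flood_depth": 1.0,
    "flood_duration": 24.0,
    "inundated_area": 100.0,
    "family_income": 3.0,
    "land_price": 500.0,
}

# Raise flood depth by one unit and hold every other variable constant.
deeper = dict(baseline, flood_depth=baseline["flood_depth"] + 1.0)

change = predict_damage(deeper) - predict_damage(baseline)
print(round(change, 3))
```

Whatever baseline is chosen, the change in predicted damage equals the flood depth coefficient, because all other terms cancel in the difference.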
However, the results of the t-test for the untransformed data showed that ‘flood depth’ (dep) is the only statistically significant parameter variable at the 5% significance level (p-value = 0.04). The t-test is used to assess whether an explanatory variable is statistically significant. The null hypothesis is that the coefficient is, for all intents and purposes, equal to zero (and consequently is not helping the model). When the probability is very small, the chance of the coefficient being essentially zero is also small. This again supports flood depth being the most influential of the five factors.
The Jarque-Bera Test also rejects the normality of the residual distribution for the untransformed data at the 5% level (JB = 6.705, p-value = 0.04). The Breusch-Pagan Test (BP = 26.905, p-value = 0.00006) and Koenker-Basset Test (KB = 16.123, p-value = 0.00065) are significant at the 1% level, which points to spatial heteroscedasticity, whereas the White Test on specification of robust test (WT = 29.427, p-value = 0.080) is not significant at the 5% level. The evidence for spatial non-stationarity in the untransformed model is therefore mixed. For the transformed data, JB = 1.649 (p-value = 0.438), so normality of the residuals is not rejected. In addition, the Breusch-Pagan Test (BP = 1.552, p-value = 0.907), Koenker-Basset Test (KB = 2.795, p-value = 0.732) and the White Test on specification of robust test (WT = 23.924, p-value = 0.246) all failed to indicate that the data are spatially non-stationary.
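The Jarque-Bera p-values can be reproduced from the statistics alone. The sketch below assumes the usual asymptotic result that the JB statistic follows a chi-square distribution with 2 degrees of freedom, for which the upper-tail probability reduces to exp(-JB / 2). The function name `jarque_bera_p_value` is illustrative and not taken from the source.

```python
import math


def jarque_bera_p_value(jb_statistic):
    """Upper-tail probability of a chi-square variable with 2 degrees of freedom."""
    return math.exp(-jb_statistic / 2.0)


# JB statistics reported in this section for the smaller test values.
reported_jb = {
    "commercial, transformed": 8.797,
    "agricultural, untransformed": 6.705,
    "agricultural, transformed": 1.649,
}

for label, jb in reported_jb.items():
    p_value = jarque_bera_p_value(jb)
    # Normality is rejected at a given level when the p-value falls below it.
    print(label, round(p_value, 3), p_value < 0.05, p_value < 0.01)
```

Under this assumption, the p-value for JB = 6.705 falls between 0.01 and 0.05, while the p-value for JB = 1.649 is far above either level, so normality is not rejected for the transformed agricultural data.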
The test for multicollinearity was also performed. The multicollinearity condition number was 8.283 for the untransformed data and 13.574 for the transformed data. In this final case, both values are less than 30, which indicates that no multicollinearity problem exists.