J. Korean Soc. Hazard Mitig. > Volume 25(3); 2025 > Article
Kim, Kang, Lee, Wang, Kim, and Baek: Daily Flow Forecasting based on Deterministic and Stochastic Features

Abstract

Hydrological time series have been forecast using various models. A previous study employed nonlinear dynamical methods, namely the correlation dimension and the deterministic versus stochastic (DVS) algorithm, to analyze the characteristics of daily flow data. That analysis revealed that the daily streamflow observed at the St. Johns River near Cocoa, Florida, USA, exhibited chaotic characteristics, whereas the daily inflow to the Soyanggang Dam reservoir showed stochastic properties. However, the nonlinearity of the flows was not investigated, and stochastic models were not fitted for flow modeling and forecasting. Therefore, the present study tests the nonlinearity of the flow data and fits stochastic models to them. In addition, the forecasting results obtained from the DVS algorithm and neural networks were compared with those from the fitted stochastic models. The forecasts derived from the DVS algorithm and neural networks were more accurate for the daily streamflow than for the daily inflow. Furthermore, when the AR (1) model was applied to the daily streamflow and the ARMA (3, 1) model to the daily inflow, the chaotic daily streamflow again yielded more accurate forecasts. The findings suggest that the dynamic structure inherent in hydrological time series may influence forecasting performance. Notably, both flows exhibited nonlinearity based on BDS statistics, indicating that nonlinear time-series models may be more appropriate for analysis.

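Summary

Hydrological time series have been forecast using a variety of models, and a previous study analyzed the characteristics of daily flow data using chaos analysis techniques, namely the correlation dimension and the DVS (deterministic vs. stochastic) algorithm. The daily streamflow series observed at the St. Johns River near Cocoa, Florida, USA, showed nonlinear dynamical characteristics, whereas the daily inflow to the Soyanggang Dam reservoir showed stochastic characteristics. However, that study did not examine the nonlinearity of the data or stochastic models for flow forecasting and modeling. Therefore, this study tested the nonlinearity of the flow data and compared the forecasts from the selected stochastic models with those from the DVS algorithm and artificial neural networks. The forecasts obtained with the DVS algorithm and neural networks were more accurate for the daily streamflow data than for the stochastic daily inflow. In addition, when the daily streamflow was forecast with the AR (1) model and the daily inflow with the ARMA (3, 1) model, forecast accuracy was again better for the daily streamflow than for the daily inflow. This suggests that the chaotic dynamic structure inherent in hydrological time series can influence forecasting performance; however, the nonlinearity test based on the BDS statistic showed nonlinearity in both series, indicating that nonlinear models may be more suitable than linear models for analysis and forecasting.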

1. Introduction

Hydrological time series encompass the temporal variations of various hydrological variables such as precipitation, runoff, and reservoir inflow, and have long attracted attention due to their inherently complex dynamic characteristics. Traditionally, linear time series models such as ARMA (autoregressive moving average) and ARIMA (autoregressive integrated moving average) have been widely used for the analysis and forecasting of hydrological phenomena. These models offer computational simplicity and interpretability, but often fall short in capturing the nonlinear, nonstationary, and chaotic properties found in natural systems (Butts et al., 2014; Sang et al., 2015; Di et al., 2019; Ombadi et al., 2021).
However, natural hydrological processes are governed by inherently nonlinear dynamics arising from interactions among rainfall-runoff responses, land surface conditions, and human interventions. Such complexity suggests that hydrological time series may possess deterministic chaos, which exhibits irregular yet structured patterns and is highly sensitive to initial conditions. This perspective has led many researchers to analyze hydrological series using tools from chaos theory to uncover hidden deterministic structures and enhance short-term prediction accuracy (Sivakumar and Singh, 2012; Ehret et al., 2014; Kedra, 2014; Bancheri et al., 2019).
Over the past decades, chaos-based models such as those using correlation dimension and phase space reconstruction have been applied to hydrological series to distinguish chaotic behavior from stochastic noise (Grassberger and Procaccia, 1983; Holzfuss and Mayer-Kress, 1986; Tsonis and Elsner, 1988; Rodriguez-Iturbe et al., 1989; Graf and Elbert, 1990; Sharifi et al., 1990; Barnett, 1993; Lall et al., 1996; Puente and Obregon, 1996; Sangoyomi et al., 1996; Porporato and Ridolfi, 1997; Kim et al., 2001; Kim et al., 2003; Paik et al., 2005; Salas et al., 2005; Kim and Kim, 2008; Kim et al., 2009; Kyoung et al., 2011; Kim et al., 2014; H.S. Kim et al., 2015; S. Kim et al., 2015). More recently, nonlinear time series analysis techniques, including the DVS (deterministic versus stochastic) algorithm of Casdagli (1992), have emerged as effective tools for forecasting chaotic systems by leveraging local dynamic structures.
Yu et al. (2025) demonstrated the potential of using quantitative tools for effectively analyzing the nonlinear dynamics of hydrological time series and provided a theoretical foundation for chaos-based forecasting models. Similarly, Li et al. (2014) applied six nonlinear analytical methods to runoff series and identified both the presence of low-dimensional chaos and the varying intensity of nonlinearity across multiple time scales.
In parallel, rapid advances in machine learning have enabled the use of artificial neural networks (ANNs) and deep learning models such as LSTM and GRU for hydrological prediction. These models are capable of learning complex temporal patterns from data, and often outperform traditional models in terms of predictive accuracy (Gao et al., 2020; Swagatika et al., 2024; Waqas and Humphries, 2024; Widiasari and Efendi, 2024). Nevertheless, their black-box nature makes it difficult to interpret physical dynamics, which can limit their utility in practice-oriented hydrological applications (Jung et al., 2021; Kim et al., 2022; Kwak et al., 2022; Liu et al., 2022; Cambria et al., 2023; Han et al., 2023; Wang et al., 2024; Zhang et al., 2025).
To provide a comprehensive understanding of prediction strategies under different data characteristics, this study applies and compares three forecasting methodologies: (1) a chaos-based approach using the DVS algorithm, (2) a neural network-based learning approach, and (3) a traditional statistical model, ARMA. By including ARMA models as a baseline, the study aims to assess the relative strengths and limitations of linear versus nonlinear and data-driven models in forecasting hydrological series.
This research focuses on two distinct time series: the daily streamflow at the St. Johns River near Cocoa, Florida (characterized by low-dimensional deterministic chaos), and the daily inflow to Soyang Reservoir in South Korea (showing stochastic properties). The study evaluates the predictability of these datasets using each modeling approach under multiple lead times.
Ultimately, the goal is to provide practical guidance on selecting appropriate forecasting techniques based on the underlying dynamic nature of the hydrological series—whether deterministic, stochastic, or nonlinear. Through this comparison, the study demonstrates how a model’s forecasting capability can vary significantly depending on the internal structure of the data.
While this study draws upon the analytical foundation established by Wang et al. (2019), which first applied chaos theory to distinguish the deterministic characteristics of the St. Johns River and Soyang inflow series using DVS and ANN models, our research advances this work by expanding the methodological comparison and rigorously evaluating forecast performance under varying lead times and model classes. Specifically, this study (1) systematically compares chaos-based, machine learning-based, and statistical models under identical datasets, (2) incorporates additional performance indicators to highlight model degradation over time, and (3) provides practical insights into model suitability based on the underlying system dynamics—deterministic chaos versus stochastic processes. This allows for a more nuanced understanding of when and how nonlinear models offer meaningful forecasting advantages over traditional methods.

2. Study Area and Application Data

The data sets used in this study are the daily streamflow series at the St. Johns River near Cocoa, Florida, USA (case-1; USGS-02232400), and the daily inflow series at Soyang Reservoir in Korea (case-2; https://www.water.or.kr). Both data sets were obtained from Wang et al. (2019), who investigated the chaotic behavior of the case-1 series and found that it showed deterministic chaos. The case-1 series consists of 12,784 measurements from January 1, 1954 to December 31, 1988. The St. Johns River is the longest river in Florida, stretching approximately 310 miles (about 500 km). Unlike most rivers in North America, it flows northward from central Florida to the Atlantic Ocean. The river basin covers around 8,840 square miles (about 22,900 km²), accounting for approximately 23% of Florida’s total area.
The case-2 data set is the inflow series of the Soyang Reservoir and consists of 8,776 measurements from January 1, 1974 to December 31, 1997. The Soyang Reservoir, located in Gangwon Province, South Korea, was formed by the construction of Soyang Dam, which was completed in 1973. It is the largest multi-purpose dam in Korea and is used for flood control, water supply, and hydroelectric power generation. The dam is 123 meters high and 530 meters long, with a total storage capacity of approximately 2.9 billion cubic meters. The time series plots are shown in Figs. 1 and 2.
Fig. 1. Time Series of Case-1
Fig. 2. Time Series of Case-2

3. Chaos Characterization and BDS Statistic

3.1 Chaotic Behavior of Flow Series

Wang et al. (2019) reconstructed the two flow series in phase space using the delay method (Packard et al., 1980; Takens, 1981). A single record of an observable $x_t$, $t = 1, 2, \ldots, N$, where N is the data size, can be reconstructed in an m-dimensional phase space to obtain the attractor. This reconstruction takes the form shown in Eq. (1):
$$\{x(t),\ x(t+\tau),\ x(t+2\tau),\ \ldots,\ x(t+(m-1)\tau)\} \tag{1}$$
where τ is the delay time. The delay time can be estimated with methods such as the C-C algorithm (Kim et al., 1999), the autocorrelation function, or mutual information; this study uses the autocorrelation function for convenience and simplicity. Wang et al. (2019) investigated the chaotic behavior of case-1 and case-2 by estimating their correlation dimensions. The streamflow series at the St. Johns River near Cocoa, USA, has a correlation dimension of 3.305, which indicates a chaotic characteristic. For the inflow series at Soyang Reservoir, in contrast, the estimated correlation dimension keeps increasing as the embedding dimension increases, so it is difficult to conclude that the inflow series is chaotic. The inflow series may therefore have a stochastic property.
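As an illustration of the reconstruction in Eq. (1), the following Python sketch builds delay vectors and picks a delay time from the autocorrelation function. The function names (`autocorrelation`, `choose_delay`, `delay_embed`) and the 1/e threshold are assumptions made for this example; the text states only that the autocorrelation function was used, not which criterion was applied to it.

```python
import math


def autocorrelation(x, lag):
    """Sample autocorrelation of the sequence x at the given lag."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x)
    cov = sum((x[t] - mean) * (x[t + lag] - mean) for t in range(n - lag))
    return cov / var


def choose_delay(x, max_lag=100, threshold=1.0 / math.e):
    """Return the first lag whose autocorrelation falls below the threshold.

    The 1/e threshold is an assumed rule of thumb, not taken from the paper.
    """
    for lag in range(1, max_lag + 1):
        if autocorrelation(x, lag) < threshold:
            return lag
    return max_lag


def delay_embed(x, m, tau):
    """Delay vectors {x(t), x(t+tau), ..., x(t+(m-1)tau)} as in Eq. (1)."""
    n_vectors = len(x) - (m - 1) * tau
    return [[x[t + n * tau] for n in range(m)] for t in range(n_vectors)]


# Example on a synthetic periodic series (not the flow data of this study).
series = [math.sin(2 * math.pi * t / 20) for t in range(200)]
tau = choose_delay(series)
vectors = delay_embed(series, m=3, tau=tau)
```

Each element of `vectors` is one point of the reconstructed attractor; the correlation dimension would then be estimated from the distances between these points.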

3.2 BDS Statistic and Nonlinearity Test

The residuals of linear and nonlinear models can be tested with conventional nonparametric test statistics as well as with the BDS statistic. Brock et al. (1991) and Brock et al. (1996) studied the BDS statistic, which is based on the correlation integral, to test the null hypothesis that the data are independently and identically distributed (iid). The correlation integral is defined in Eq. (2):
$$C(m,N,r) = \frac{2}{M(M-1)} \sum_{1 \le i < j \le M} \Theta\left(r - \lVert x_i - x_j \rVert\right), \quad r > 0 \tag{2}$$
where Θ(a) = 0 if a < 0 and Θ(a) = 1 if a ≥ 0.
N is the size of the data set and M = N - (m - 1) is the number of embedded points in m-dimensional space. This test has been particularly useful for chaotic systems and nonlinear stochastic systems. Under the iid hypothesis, the BDS statistic is defined in Eq. (3) for m > 1:
$$BDS(m,N,r) = \frac{\sqrt{M}}{\sigma}\left[C(m,N,r) - C^{m}(1,M,r)\right] \tag{3}$$
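As N → ∞, this statistic converges to a standard normal distribution. The asymptotic variance σ²(m, M, r) and the quantity K(m, M, r) are estimated as described in Eq. (4):
$$\sigma^2(m,M,r) = 4m(m-1)C^{2(m-1)}\left(K - C^2\right) + K^m - C^{2m} + 2\sum_{i=1}^{m-1}\left[C^{2i}\left(K^{m-i} - C^{2(m-i)}\right) - mC^{2(m-i)}\left(K - C^2\right)\right]$$
$$K(m,M,r) = \frac{6}{M(M-1)(M-2)} \sum_{1 \le i < j < k \le M} \left[\Theta\left(r - \lVert x_i - x_j \rVert\right)\,\Theta\left(r - \lVert x_j - x_k \rVert\right)\right] \tag{4}$$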
The BDS statistic distinguishes random (iid) time series from time series generated by chaotic or nonlinear stochastic processes, although it cannot distinguish a nonlinear deterministic system from a nonlinear stochastic system. In Tables 1 and 2, the statistic values of both flow series lie outside the 95% interval [-1.96, 1.96] for every combination of m and r, so the iid hypothesis is rejected and both series can be regarded as having nonlinear characteristics. Combined with the correlation dimension results, case-1 may therefore have a nonlinear deterministic property and case-2 a nonlinear stochastic property.
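The correlation integral of Eq. (2), on which the BDS statistic is built, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the maximum norm is used for the distance (the text does not name the norm), unit delay is assumed, the one-dimensional integral is computed over all N points, and the function names are hypothetical. The variance of Eq. (4) is deliberately left out, so the sketch returns only the bracketed term of Eq. (3).

```python
from itertools import combinations


def embed(x, m):
    """Embedded points (x[t], ..., x[t+m-1]); there are M = N - (m - 1) of them."""
    return [x[t:t + m] for t in range(len(x) - (m - 1))]


def correlation_integral(x, m, r):
    """C(m, N, r) of Eq. (2): fraction of point pairs within distance r."""
    points = embed(list(x), m)
    n_points = len(points)
    close = 0
    for a, b in combinations(points, 2):
        # Maximum norm: an assumption made for this sketch.
        distance = max(abs(u - v) for u, v in zip(a, b))
        if r - distance >= 0:  # Heaviside function, Theta(a) = 1 for a >= 0
            close += 1
    return 2.0 * close / (n_points * (n_points - 1))


def bds_bracket(x, m, r):
    """C(m, N, r) - C(1, N, r)^m, the bracketed term of Eq. (3)."""
    return correlation_integral(x, m, r) - correlation_integral(x, 1, r) ** m
```

For iid data the bracketed term is close to zero, whereas structured data (for example a strictly alternating series) give a value clearly different from zero.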
Table 1. BDS Statistic (Case-1); 95% C.I. = [-1.96, 1.96]

| m | r | Value of statistic |
|---|---|---|
| 2 | 0.5 s | -4.32405 |
| 2 | 1.0 s | -266.10230 |
| 2 | 1.5 s | -555.91740 |
| 2 | 2.0 s | -953.57390 |
| 3 | 0.5 s | 134.97800 |
| 3 | 1.0 s | -73.88440 |
| 3 | 1.5 s | -220.66850 |
| 3 | 2.0 s | -404.65710 |
| 4 | 0.5 s | 253.49710 |
| 4 | 1.0 s | -5.70750 |
| 4 | 1.5 s | -111.64900 |
| 4 | 2.0 s | -227.21160 |
| 5 | 0.5 s | 423.39840 |
| 5 | 1.0 s | 33.57927 |
| 5 | 1.5 s | -59.61061 |
| 5 | 2.0 s | -144.09760 |
Table 2. BDS Statistic (Case-2); 95% C.I. = [-1.96, 1.96]

| m | r | Value of statistic |
|---|---|---|
| 2 | 0.5 s | -168.74060 |
| 2 | 1.0 s | -464.05220 |
| 2 | 1.5 s | -781.33750 |
| 2 | 2.0 s | -1,082.61500 |
| 3 | 0.5 s | -38.73886 |
| 3 | 1.0 s | -180.08350 |
| 3 | 1.5 s | -325.4670 |
| 3 | 2.0 s | -460.67340 |
| 4 | 0.5 s | 3.47516 |
| 4 | 1.0 s | -89.57717 |
| 4 | 1.5 s | -178.87250 |
| 4 | 2.0 s | -260.64140 |
| 5 | 0.5 s | 24.41526 |
| 5 | 1.0 s | -47.67996 |
| 5 | 1.5 s | -111.02740 |
| 5 | 2.0 s | -167.59330 |

4. Flow Forecasting and Results Analysis

4.1 DVS Algorithm and Forecasting

For a scalar time series $x_i$, $i = 1, 2, \ldots, N$, the DVS algorithm attempts to fit models of the form shown in Eq. (5):
$$x_{i+T} \approx f\left(x_i,\ x_{i-\tau},\ \ldots,\ x_{i-(m-1)\tau}\right) \tag{5}$$
A least-squares method is used to find the function f that gives the best prediction of $x_{i+T}$, in the sense that it minimizes the squared error within the model class. The integers T and m define the following quantities.
  • T: lead time or prediction horizon (prediction time into the future)

  • m: embedding dimension or dimension of the reconstructed phase space (number of taps of the tapped delay line)

The m delayed values are combined into the delay vector $x_i$. The taps of the delay line are assumed to be equally spaced, i.e., $x_{i+T} \approx f(x_i, x_{i-\tau}, \ldots, x_{i-(m-1)\tau})$, where τ is the lag time or lag spacing between the taps. With these definitions, the DVS algorithm is given by
  • (1) Normalize the time series to zero mean and unit variance.

  • (2) Divide the time series into two parts: 1) a training set or fitting set $x_1, \ldots, x_{N_f}$ used to fit the model, and 2) a test set $x_{N_f+1}, \ldots, x_{N_f+N_t}$ used to evaluate the model. $N_f$ denotes the number of points in the fitting set and $N_t$ the number of points in the test set.

  • (3) Choose T and m.

  • (4) Choose a test delay vector $x_i$ for a T-step-ahead forecasting task ($i > N_f$).

  • (5) Compute the distances $d_{ij}$ of the test vector $x_i$ from the training vectors $x_j$ (for all j such that $(m-1)\tau < j < i - T$).

  • (6) Order the distances $d_{ij}$.

  • (7) Find the k nearest neighbors $x_{j(1)}$ through $x_{j(k)}$ of $x_i$, and fit an affine model with coefficients $\alpha_0, \ldots, \alpha_m$ of the form shown in Eq. (6):

$$x_{j(l)+T} \approx \alpha_0 + \sum_{n=1}^{m} \alpha_n\, x_{j(l)-(n-1)\tau}, \quad l = 1, \ldots, k, \qquad 2(m+1) < k < N_f - T - (m-1)\tau \tag{6}$$

  • (8) Use the fitted model from step (7) to estimate a T-step-ahead forecast $\hat{x}_{i+T}(k)$ starting from the test vector, and compute its error $e_{i+T}(k) = |\hat{x}_{i+T}(k) - x_{i+T}|$.

  • (9) Repeat steps (4) through (8) as (i + T) runs through the test set, and compute the mean absolute forecasting error as in Eq. (7):

$$E_m(k) = \frac{1}{N_t} \sum_{(i+T)=1}^{N_t} e_{i+T}(k) \tag{7}$$
Vary the embedding dimension m, and plot the curves Em(k) as functions of the number of nearest neighbors k. Such a plot of the family of curves is called a DVS plot.
The name of the algorithm derives from the fact that the shapes of the resulting plots can provide evidence of low dimensional deterministic chaos, or of high dimensional or stochastic dynamics. Low dimensional chaos is typically characterized by U-shaped or monotonically increasing plots whose minimum Em(k) values are small and occur at low values of k. High dimensional or stochastic behavior is often indicated by relatively large minimum Em(k) values occurring at high k values (Casdagli, 1992).
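The DVS procedure described above can be sketched in plain Python as follows. This is a minimal illustration, not the code used by Wang et al. (2019) or in this study: the helper names (`normalize`, `solve`, `fit_affine`, `dvs_error`), the Euclidean distance between delay vectors, and the tiny ridge term added for numerical stability are all assumptions.

```python
import math


def normalize(x):
    """Rescale a series to zero mean and unit variance (step 1)."""
    n = len(x)
    mean = sum(x) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in x) / n)
    return [(v - mean) / std for v in x]


def solve(a, b):
    """Solve the linear system a z = b by Gaussian elimination with pivoting."""
    n = len(b)
    aug = [a[i][:] + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda row: abs(aug[row][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[row][c] -= factor * aug[col][c]
    z = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][c] * z[c] for c in range(row + 1, n))
        z[row] = (aug[row][n] - tail) / aug[row][row]
    return z


def fit_affine(vectors, targets, ridge=1e-9):
    """Least-squares coefficients [a0, a1, ..., am] of Eq. (6) (step 7)."""
    design = [[1.0] + list(v) for v in vectors]
    p = len(design[0])
    # Normal equations; the small ridge term (an assumption) avoids singularity.
    gram = [[sum(d[i] * d[j] for d in design) + (ridge if i == j else 0.0)
             for j in range(p)] for i in range(p)]
    rhs = [sum(d[i] * y for d, y in zip(design, targets)) for i in range(p)]
    return solve(gram, rhs)


def dvs_error(x, n_fit, m, tau, lead, k):
    """Mean absolute lead-step forecast error E_m(k) over the test part of x."""
    def delay_vector(i):
        return [x[i - n * tau] for n in range(m)]

    errors = []
    for i in range(n_fit, len(x) - lead):           # test vectors (step 4)
        test_vector = delay_vector(i)
        candidates = range((m - 1) * tau, i - lead)  # earlier vectors (step 5)
        nearest = sorted(
            candidates,
            key=lambda j: math.dist(test_vector, delay_vector(j)))[:k]
        coef = fit_affine([delay_vector(j) for j in nearest],
                          [x[j + lead] for j in nearest])
        forecast = coef[0] + sum(c * v for c, v in zip(coef[1:], test_vector))
        errors.append(abs(forecast - x[i + lead]))   # step 8
    return sum(errors) / len(errors)                 # Eq. (7)
```

Calling `dvs_error` for a range of k values at several embedding dimensions m gives the family of curves Em(k) that make up a DVS plot. For a deterministic test signal such as the logistic map, the error at small k is expected to be much lower than at large k.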
Wang et al. (2019) used the DVS algorithm to examine the properties of the two flow series and to forecast them, and the relation between Em(k) and k led to the same conclusions as the correlation dimension analysis. Figs. 3 and 4, obtained from Wang et al. (2019), show that the daily streamflow series at the St. Johns River near Cocoa has a chaotic characteristic, whereas the daily inflow series at Soyang Reservoir does not.
Fig. 3. DVS Plot for Case-1 (T = 1 day)
Fig. 4. DVS Plot for Case-2 (T = 1 day)
The DVS plots indicate the best k and m. Forecasting is then performed with the local linear approximation method (Farmer and Sidorowich, 1987) using the best k and m. For each of the two daily flow series, a test set of 301 days is used and the remaining data form the training set. Because the DVS algorithm relates peak flows to other peak flows and low flows to other low flows, the magnitude of the data may have only a small effect on the forecast error. Figs. 5 and 6 show the relationship between the observed and forecasted values for each lead time (T = 1, 10, 20 days).
Fig. 5. Relationship between Observed and Forecasted Values by DVS Algorithm (Case-1)
Fig. 6. Relationship between Observed and Forecasted Values by DVS Algorithm (Case-2)
For the 1-day lead time, the correlation coefficient between the forecasted and observed values is 0.9995 for the chaotic streamflow series and 0.6311 for the inflow series (Figs. 5 and 6). Forecast accuracy decreases as the lead time increases. In terms of correlation coefficients, the chaotic streamflow at the St. Johns River near Cocoa is forecast more accurately than the inflow, and its forecasts for lead times of 10 and 20 days are also relatively satisfactory.

4.2 Neural Network Forecasting

To improve prediction accuracy, this study applies a neural network approach, which is a widely adopted method in the field of artificial intelligence. A feedforward neural network with three layers of neurons was employed and trained using the backpropagation algorithm. The network consists of an input layer, one or more hidden layers, and an output layer, where information flows in one direction and errors are propagated backward during the training process to update weights.
In this study, streamflow forecasting is formulated as a nonlinear mapping problem where previous observations are used to estimate future values. The model can be expressed mathematically as Eq. (8):
$$\hat{Q}(t) = f\left[Q(t-1),\ \ldots,\ Q(t-n_Q)\right] \tag{8}$$
In this equation, $\hat{Q}(t)$ represents the predicted streamflow at time t, while f denotes the nonlinear function that captures the relationship between the past and future values of the series. The parameter $n_Q$ is the number of past observations used as input. In this case, five previous flow values [Q(t-1), Q(t-2), Q(t-3), Q(t-4), Q(t-5)] are used to predict the current value [Q(t)], forming a single input-output pair for training the network.
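The construction of the input-output pairs in Eq. (8) can be illustrated with a short Python sketch. The function name `make_lagged_pairs` is hypothetical, and the sketch covers only the data preparation with five lagged flows; the network architecture and training settings are not reproduced here.

```python
def make_lagged_pairs(flows, n_lags=5):
    """Build training pairs ([Q(t-1), ..., Q(t-n_lags)], Q(t)) from a flow series."""
    inputs, targets = [], []
    for t in range(n_lags, len(flows)):
        # Most recent value first, matching the order Q(t-1), ..., Q(t-n_lags).
        inputs.append([flows[t - lag] for lag in range(1, n_lags + 1)])
        targets.append(flows[t])
    return inputs, targets


# Toy series (not the study data): seven values give two training pairs.
inputs, targets = make_lagged_pairs([1, 2, 3, 4, 5, 6, 7], n_lags=5)
```

A series of N values yields N - 5 pairs, each of which is presented to the network as one training sample.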
In the results of Wang et al. (2019) for the neural network with a 1-day lead time, the correlation coefficient is 0.9994 for the chaotic streamflow series and 0.6286 for the non-chaotic inflow series. The neural network thus also gives accurate forecasts with low forecasting error for the chaotic streamflow series. Its results for lead times of 10 and 20 days are relatively satisfactory as well, although the results of the DVS algorithm are slightly better. For the daily inflow series at Soyang Reservoir, however, the forecasting performance is considerably lower, as shown in Tables 3 and 4 and Figs. 7 and 8.
Table 3. Forecasting Results for Case-1 Based on Neural Network

| | Observed | T = 1 day | T = 10 day | T = 20 day |
|---|---|---|---|---|
| Mean (cfs) | 1,110.0166 | 1,128.3836 | 1,268.0264 | 1,262.9949 |
| Standard dev. | 1,172.8175 | 1,200.7887 | 1,238.7293 | 1,042.6037 |
| Peak (cfs) | 5,390 | 5,403.1866 | 5,020.2897 | 4,177.0611 |
| Peak time (day) | 3 | 1 | 9 | 19 |
| Volume (ft³) | 2.887×10^10 | 2.935×10^10 | 3.297×10^10 | 3.285×10^10 |
| AMB | - | 32.1488 | 264.9647 | 477.5641 |
| RMSE | - | 51.9367 | 366.3604 | 701.8710 |
| RRMSE | - | 0.0322 | 0.2271 | 0.4350 |
| MRE | - | 0.0303 | 0.3139 | 0.5832 |
| Correlation coef. | - | 0.9994 | 0.9638 | 0.8144 |
Table 4. Forecasting Results for Case-2 Based on Neural Network

| | Observed | T = 1 day | T = 10 day | T = 20 day |
|---|---|---|---|---|
| Mean (cms) | 78.2837 | 50.6108 | 77.9351 | 32.2398 |
| Standard dev. | 131.1263 | 39.7390 | 6.2353 | 3.1620 |
| Peak (cms) | 1,023.5 | 336.954 | 122.822 | 54.9504 |
| Peak time (min) | 64 | 65 | 74 | 84 |
| Volume (m³) | 2.036×10^9 | 1.316×10^9 | 2.027×10^9 | 0.838×10^9 |
| AMB | - | 41.1067 | 71.6665 | 57.2552 |
| RMSE | - | 113.7866 | 130.2752 | 138.5819 |
| RRMSE | - | 0.7460 | 0.8541 | 0.9086 |
| MRE | - | 0.6436 | 2.0956 | 0.7301 |
| Correlation coef. | - | 0.6286 | 0.0686 | 0.0751 |
Fig. 7. Relationship between Observed and Forecasted Values by Neural Network (Case-1)
Fig. 8. Relationship between Observed and Forecasted Values by Neural Network (Case-2)
To evaluate the prediction performance, several statistical error metrics were used: AMB (Absolute Mean Bias), which indicates the average size of bias; RMSE (Root Mean Squared Error), which emphasizes larger errors; RRMSE (Relative RMSE), which normalizes RMSE by the mean of observations for comparability; MRE (Mean Relative Error), which reflects the average proportional error; and the correlation coefficient (R), which shows the strength of the linear relationship between observed and predicted values.
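The error metrics listed above can be computed as in the following Python sketch. The exact formulas are not given in the text, so the definitions here are assumptions: AMB is taken as the mean of the absolute errors and MRE as the mean of |error| / |observed|. Two RRMSE variants are returned because the tabulated RRMSE values appear consistent with dividing RMSE by the root mean square of the observations rather than by their mean (for example, 51.9367 / 0.0322 ≈ 1,613 in Table 3, whereas the observed mean is 1,110.0166).

```python
import math


def forecast_metrics(observed, forecast):
    """Return AMB, RMSE, two RRMSE variants, MRE, and correlation coefficient R."""
    n = len(observed)
    errors = [f - o for o, f in zip(observed, forecast)]
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mean_o = sum(observed) / n
    mean_f = sum(forecast) / n
    cov = sum((o - mean_o) * (f - mean_f) for o, f in zip(observed, forecast))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_f = sum((f - mean_f) ** 2 for f in forecast)
    return {
        "AMB": sum(abs(e) for e in errors) / n,            # assumed definition
        "RMSE": rmse,
        "RRMSE_mean": rmse / mean_o,                       # normalized by mean
        "RRMSE_rms": rmse / math.sqrt(sum(o * o for o in observed) / n),
        "MRE": sum(abs(e) / abs(o) for e, o in zip(errors, observed)) / n,
        "R": cov / math.sqrt(var_o * var_f),
    }


# Toy example (not the study data): a constant offset of 0.5.
metrics = forecast_metrics([1.0, 2.0, 3.0, 4.0], [1.5, 2.5, 3.5, 4.5])
```

In the toy example a constant offset leaves the correlation coefficient at 1 while AMB and RMSE are both 0.5, which shows why the correlation coefficient alone does not capture bias.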

4.3 ARMA Model Forecasting

This study selects time series models for the two flow series using the Akaike Information Criterion (AIC) and the Schwarz Bayesian Criterion (SBC). Table 5 lists both criteria for the candidate models of the streamflow series at the St. Johns River near Cocoa (case-1), for which the AR (1) model was adopted; note that the lowest AIC in Table 5 occurs at AR (15) and the lowest SBC at ARMA (3, 1), so AR (1) is not the minimum-criterion candidate. For the inflow series at Soyang Reservoir (case-2), the ARMA (3, 1) model, which has the lowest AIC in Table 6, was selected.
Table 5. Streamflow Series at St. Johns River Near Cocoa

| Model | AIC | SBC |
|---|---|---|
| AR (1) | -57,033.738 | -57,018.826 |
| AR (2) | -60,451.587 | -60,429.219 |
| AR (3) | -60,606.386 | -60,576.562 |
| AR (14) | -60,866.959 | -60,755.12 |
| AR (15) | -60,878.465 | -60,759.17 |
| AR (16) | -60,876.471 | -60,749.72 |
| ARMA (1, 1) | -59,572.22 | -59,549.852 |
| ARMA (2, 1) | -60,699.823 | -60,669.999 |
| ARMA (3, 1) | -60,846.107 | -60,808.827 |
Table 6. Inflow Series at Soyang Reservoir

| Model | AIC | SBC |
|---|---|---|
| AR (1) | 3,503.0533 | 3,517.2065 |
| AR (2) | 3,087.4847 | 3,108.7145 |
| AR (3) | 2,967.7949 | 2,996.1012 |
| AR (11) | 2,778.3537 | 2,863.2726 |
| AR (12) | 2,774.8127 | 2,866.8083 |
| AR (13) | 2,774.0828 | 2,873.1550 |
| AR (14) | 2,774.1724 | 2,880.3211 |
| ARMA (1, 1) | 2,940.9157 | 2,962.14539 |
| ARMA (2, 1) | 2,776.7955 | 2,805.1018 |
| ARMA (3, 1) | 2,772.8937 | 2,808.2766 |
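How the two criteria rank candidate models can be illustrated with a short Python sketch. The formulas below are the common least-squares forms, AIC = n ln(σ²) + 2k and SBC = n ln(σ²) + k ln(n), and are an assumption, since the text does not state which form was used; the dictionary simply copies the values of Table 6.

```python
import math


def aic(n, sigma2, k):
    """Akaike criterion for n observations, residual variance sigma2, k parameters."""
    return n * math.log(sigma2) + 2 * k


def sbc(n, sigma2, k):
    """Schwarz Bayesian criterion; penalizes each parameter by ln(n) instead of 2."""
    return n * math.log(sigma2) + k * math.log(n)


# (AIC, SBC) pairs copied from Table 6 (inflow series at Soyang Reservoir).
TABLE_6 = {
    "AR (1)": (3503.0533, 3517.2065),
    "AR (2)": (3087.4847, 3108.7145),
    "AR (3)": (2967.7949, 2996.1012),
    "AR (11)": (2778.3537, 2863.2726),
    "AR (12)": (2774.8127, 2866.8083),
    "AR (13)": (2774.0828, 2873.1550),
    "AR (14)": (2774.1724, 2880.3211),
    "ARMA (1, 1)": (2940.9157, 2962.14539),
    "ARMA (2, 1)": (2776.7955, 2805.1018),
    "ARMA (3, 1)": (2772.8937, 2808.2766),
}

best_by_aic = min(TABLE_6, key=lambda name: TABLE_6[name][0])
best_by_sbc = min(TABLE_6, key=lambda name: TABLE_6[name][1])
```

With the Table 6 values, the minimum AIC is attained by ARMA (3, 1), the model adopted for case-2, while the minimum SBC is attained by ARMA (2, 1); the stronger ln(n) penalty of the SBC favors the model with one fewer parameter.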
The forecasting results using AR (1) and ARMA (3, 1) models were compared with the results from the DVS algorithm and neural networks for each lead time (T = 1, 10, and 20 days). For case-1, the AR (1) model showed high accuracy with a correlation coefficient of R = 0.9994 at T = 1 day, comparable to the results of the DVS algorithm (R = 0.9995) and neural network model (R = 0.9994). However, the accuracy decreased as the lead time increased, with R = 0.8103 at T = 20 days.
In contrast, the performance of the ARMA (3, 1) model for case-2 (Soyang Reservoir) was relatively poor. At T = 1 day, the correlation coefficient was R = 0.6139, and the forecasting accuracy dropped sharply to R = 0.1328 at T = 10 days and R = 0.0695 at T = 20 days. This trend was consistent with the results obtained from the DVS and neural network models, which also showed lower performance for the stochastic inflow series compared to the chaotic streamflow series.
In the forecasting results using ARMA models, the AR (1) model provided highly accurate results for the chaotic streamflow series, while the ARMA (3, 1) model showed limited forecasting capability for the stochastic inflow series. The forecasting performance of both models across different lead times is summarized in Tables 7 and 8, and the relationship between observed and forecasted values is illustrated in Figs. 9 and 10. As shown, the AR (1) model captures the trend of the observed values well, especially at shorter lead times, whereas the ARMA (3, 1) model struggles to predict the inflow dynamics at the Soyang Reservoir, with forecasting performance deteriorating rapidly as the lead time increases.
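For reference, a T-step-ahead forecast from an AR (1) model can be sketched in Python as below. The text does not describe how the multi-step forecasts were generated, so this is only the textbook conditional-mean forecast, with the coefficient estimated from the lag-1 autocorrelation; the function names `fit_ar1` and `forecast_ar1` are hypothetical.

```python
def fit_ar1(x):
    """Estimate the mean and AR(1) coefficient from the lag-1 autocorrelation."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x)
    cov = sum((x[t] - mean) * (x[t + 1] - mean) for t in range(n - 1))
    return mean, cov / var


def forecast_ar1(last_value, mean, phi, lead):
    """Conditional-mean forecast `lead` steps ahead: mean + phi**lead * deviation."""
    return mean + phi ** lead * (last_value - mean)


# Toy example (not the study data).
mean, phi = fit_ar1([1.0, 2.0, 3.0, 4.0, 5.0])
one_step = forecast_ar1(5.0, mean, phi, lead=1)
twenty_step = forecast_ar1(5.0, mean, phi, lead=20)
```

Because |phi| < 1 for a stationary series, the factor phi**lead shrinks with the lead time and the forecast moves toward the series mean, so the information carried by the latest observation fades as the horizon grows.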
Table 7. Forecasting Results for Case-1 Based on AR (1)

| | Observed | T = 1 day | T = 10 day | T = 20 day |
|---|---|---|---|---|
| Mean (cfs) | 1,110.0166 | 1,125.9961 | 1,258.8694 | 1,336.1104 |
| Standard dev. | 1,172.8175 | 1,193.8547 | 1,346.3631 | 1,355.1663 |
| Peak (cfs) | 5,390 | 5,410.03 | 5,453.81 | 5,360.43 |
| Peak time (day) | 3 | 1 | 9 | 19 |
| Volume (ft³) | 2.887×10^10 | 2.928×10^10 | 3.274×10^10 | 3.475×10^10 |
| AMB | - | 29.1589 | 261.8785 | 540.1991 |
| RMSE | - | 47.4525 | 403.6798 | 828.9676 |
| RRMSE | - | 0.0294 | 0.2502 | 0.5138 |
| MRE | - | 0.0285 | 0.2313 | 0.4783 |
| Correlation coef. | - | 0.9994 | 0.9651 | 0.8103 |
Table 8. Forecasting Results for Case-2 Based on ARMA (3, 1)

| | Observed | T = 1 day | T = 10 day | T = 20 day |
|---|---|---|---|---|
| Mean (cms) | 78.2837 | 61.1076 | 43.1531 | 34.9857 |
| Standard dev. | 131.1263 | 70.0456 | 33.7713 | 22.5125 |
| Peak (cms) | 1,023.5 | 432.22 | 180.59 | 117.25 |
| Peak time (min) | 64 | 121 | 133 | 143 |
| Volume (m³) | 2.036×10^9 | 1.589×10^9 | 1.122×10^9 | 9.099×10^8 |
| AMB | - | 33.6009 | 56.4928 | 61.4458 |
| RMSE | - | 105.2654 | 135.4000 | 138.2088 |
| RRMSE | - | 0.6905 | 0.8881 | 0.9065 |
| MRE | - | 0.3389 | 0.7268 | 0.8835 |
| Correlation coef. | - | 0.6139 | 0.1328 | 0.0695 |
Fig. 9. Relationship between Observed and Forecasted Values by AR (1) (Case-1)
Fig. 10. Relationship between Observed and Forecasted Values by ARMA (3, 1) (Case-2)

5. Conclusions

This study analyzed the dynamic characteristics of daily streamflow time series from the St. Johns River and the Soyang Reservoir using nonlinear dynamic techniques, including correlation dimension and the DVS algorithm, as well as neural network and ARMA-based forecasting models. The comparative results revealed distinct behaviors between the two time series and highlighted the impact of these characteristics on forecasting performance.
The St. Johns River data exhibited low-dimensional deterministic chaos, as confirmed by both correlation dimension analysis and DVS plots. Forecasting using the DVS algorithm and neural networks produced high accuracy, particularly for short-term lead times, with correlation coefficients exceeding 0.99 for a 1-day lead time. Similarly, the AR (1) model showed comparable performance, suggesting that linear models may still be effective when underlying dynamics are well captured.
Additionally, a detailed examination of the prediction accuracy metrics across lead times reveals notable trends. For the St. Johns River dataset, all three models (DVS, neural network, and AR (1)) maintained strong performance at T = 1 day, with RMSE values around 47-52 and correlation coefficients above 0.999. However, as the lead time increased to T = 20 days, RMSE values rose significantly (e.g., 828.97 for AR (1)), and correlation coefficients declined to approximately 0.81. The RRMSE of the neural network and AR (1) models also grew from about 0.03 at T = 1 day to 0.23-0.25 at T = 10 days and 0.44-0.51 at T = 20 days (Tables 3 and 7), reflecting growing relative errors over time. This illustrates that even in chaotic systems, predictive accuracy gradually deteriorates with longer horizons, though performance remains reasonably acceptable.
In contrast, the inflow series at the Soyang Reservoir displayed stochastic and nonlinear characteristics, as indicated by gradually increasing correlation dimension values and strong deviations in the BDS statistic. Forecasting accuracy for this dataset was relatively lower across all methods, including DVS, neural networks, and ARMA (3, 1), with correlation coefficients decreasing significantly as the lead time increased. In particular, RMSEs exceeded 100 across all lead times, and correlation coefficients dropped below 0.1 by T = 20 days, while RRMSE values surpassed 0.9. These results highlight that the stochastic nature of the inflow series imposes fundamental limitations on predictive accuracy, and such sensitivity is exacerbated with increasing lead times. This underlines the critical need to consider the underlying system dynamics when interpreting performance metrics and selecting appropriate forecasting methods.
Overall, the findings underscore the importance of identifying the intrinsic dynamic structure of hydrological time series when selecting forecasting approaches. While chaos-based methods and shallow neural networks offer strong predictive capabilities for systems with deterministic nonlinearity, their performance declines for highly stochastic series. Conversely, conventional models such as ARMA may provide stable, though limited, accuracy for short-term forecasts in stochastic systems.
Future work should explore hybrid modeling approaches that combine stochastic frameworks with deep learning architectures or chaos-informed structures to improve long-term forecasting accuracy. Additionally, integrating external hydrological drivers, such as precipitation and land-use change data, could enhance model interpretability and robustness in operational forecasting.

References

1. Bancheri, M, Serafin, F, and Rigon, R (2019) The representation of hydrological dynamical systems using extended petri nets (EPN). Water Resources Research, Vol. 55, No. 11, pp. 8895-8921 doi:10.1029/2019WR025099.
crossref pdf
2. Barnett, K.D (1993) On the estimation of the correlation dimension and its application to radar reflector discrimination, NASA Contractor Report 4564, DOT/ FAA/RD-93/41.

3. Brock, W.A, Hsieh, D, and LeBaron, B (1991). Nonlinear dynamics, chaos and instability:Statistical theory and ecnomic evidence. Cambridge MA: MIT Press.

4. Brock, W.A, Scheinkman, J.A, Dechert, W.D, and LeBaron, B (1996) A test for independence based on the correlation dimension. Econometrics Review, Vol. 15, No. 3, pp. 197-235 doi:10.1080/07474939608800353.
crossref
5. Butts, M, Drews, M, Larsen, M.A.D, Lerer, S, Rasmussen, S.H, and Grooss, J (2014) Embedding complex hydrology in the regional climate system - Dynamic coupling across different modelling domains. Advances in Water Resources, Vol. 74, pp. 166-184 doi:10.1016/j.advwatres.2014.09.004.
crossref
6. Cambria, E, Malandri, L, Mercorio, F, Mezzanzaniva, M, and Nobani, N (2023) A survey on XAI and natural language explanations. Information Processing &Management, Vol. 60, No. 1, pp. 103111 doi:10.1016/j.ipm.2022.103111.
crossref
7. Casdagli, M (1992) Chaos and deterministic versus stochastic non-linear modeling. J. R. Statist. Soc. B, Vol. 54, No. 2, pp. 303-328 doi:10.1111/j.2517-6161.1992.tb01884.x.
crossref pdf
8. Di, C, Wang, T, Istanbulluoglu, E, Jayawardena, A.W, Li, S, and Chen, X (2019) Deterministic chaotic dynamics in soil moisture across Nebraska. Journal of Hydrology, Vol. 578, pp. 124048 doi:10.1016/j.jhydrol.2019.124048.
crossref
9. Ehret, U, Gupta, H.V, Sivapalan, M, Weijs, S.V, Schymanski, S.J, and Blöschl, G (2014) Advancing catchment hydrology to deal with predictions under change. Hydrol. Earth Syst. Sci, Vol. 18, No. 2, pp. 649-671 doi:10.5194/hess-18-649-2014.
crossref
10. Farmer, J.D, and Sidorrwich, J.J (1987) Predicting chaotic time series. Physical Review Letters, Vol. 59, pp. 845-848 doi:10.1103/PhysRevLett.59.845.
crossref pmid
11. Gao, S, Huang, Y, Zhang, S, Han, J, Wang, G, Zhang, M, and Lin, Q (2020) Short-term runoff prediction with GRU and LSTM networks without requiring time step optimization during sample generation. Journal of Hydrology, Vol. 589, pp. 125188 doi:10.1016/j.jhydrol.2020.125188.
crossref
12. Graf, K.E, and Elbert, T (1990) Dimensional analysis of the waking EEG. Chaos in Brain Function, pp. 135-152 doi:10.1007/978-3-642-75545-3_11.
crossref pmid
13. Grassberger, P, and Procaccia, I (1983) Measuring the strangeness of strange attractors. Physica D:Nonlinear Phenomena, Vol. 9, No. 1-2, pp. 189-208 doi:10.1016/0167-2789(83)90298-1.
crossref
14. Han, H, Kim, D, Wang, W, and Kim, H.S (2023) Dam inflow prediction using large-scale climate variability and deep learning approach: A case study in South Korea. Water Supply, Vol. 23, No. 2, pp. 934-947 doi:10.2166/ws.2023.012.
15. Holzfuss, J, and Mayer-Kress, G (1986) An approach to error-estimation in the application of dimension algorithms. Dimensions and Entropies in Chaotic Systems, Vol. 32, pp. 114-122 doi:10.1007/978-3-642-71001-8_15.
16. Jung, J, Han, H, Kim, K, and Kim, H.S (2021) Machine learning-based small hydropower potential prediction under climate change. Energies, Vol. 14, No. 12, pp. 3643-3653 doi:10.3390/en14123643.
17. Kedra, M (2014) Deterministic chaotic dynamics of Raba River flow (Polish Carpathian Mountains). Journal of Hydrology, Vol. 509, No. 13, pp. 474-503 doi:10.1016/j.jhydrol.2013.11.055.
18. Kim, H.S, Eykholt, R, and Salas, J.D (1999) Nonlinear dynamics, delay times, and embedding windows. Physica D: Nonlinear Phenomena, Vol. 127, No. 1-2, pp. 48-60 doi:10.1016/S0167-2789(98)00240-1.
19. Kim, H.S, Kang, D.S, and Kim, J.H (2003) The BDS statistic and residual test. Stochastic Environmental Research and Risk Assessment, Vol. 17, No. 1-2, pp. 104-115 doi:10.1007/s00477-002-0118-0.
20. Kim, H.S, Lee, K.H, Kyoung, M.S, Sivakumar, B, and Lee, E.T (2009) Measuring nonlinear dependence in hydrologic time series. Stochastic Environmental Research and Risk Assessment, Vol. 23, No. 7, pp. 907-916 doi:10.1007/s00477-008-0268-9.
21. Kim, H.S, Park, J, Yoo, J, and Kim, T.W (2015) Assessment of drought hazard, vulnerability, and risk: A case study for administrative districts in South Korea. Journal of Hydro-environment Research, Vol. 9, No. 1, pp. 28-35 doi:10.1016/j.jher.2013.07.003.
22. Kim, H.S, Yoon, Y.N, Kim, J.H, and Kim, J.H (2001) Searching for strange attractor in wastewater flow. Stochastic Environmental Research and Risk Assessment, Vol. 15, No. 5, pp. 399-413 doi:10.1007/s004770100078.
23. Kim, J, Lee, H, Lee, M, Han, H, Kim, D, and Kim, H.S (2022) Development of a deep learning-based prediction model for water consumption at the household level. Water, Vol. 14, No. 9, pp. 1512 doi:10.3390/w14091512.
24. Kim, S, Kim, Y, Lee, J, and Kim, H.S (2015) Identifying and evaluating chaotic behavior in hydro-meteorological processes. Advances in Meteorology, Vol. 2015, pp. 195940 doi:10.1155/2015/195940.
25. Kim, S, Noh, H, Kang, N, Lee, K.H, Kim, Y.S, and Lim, S (2014) Noise reduction analysis of radar rainfall using chaotic dynamics and filtering techniques. Advances in Meteorology, Vol. 2014, pp. 517571 doi:10.1155/2014/517571.
26. Kim, S.W, and Kim, H.S (2008) Uncertainty reduction of the flood stage forecasting using neural networks model. Journal of the American Water Resources Association, Vol. 44, No. 1, pp. 148-165 doi:10.1111/j.1752-1688.2007.00144.x.
27. Kwak, J, Han, H, Kim, S, and Kim, H.S (2022) Is the deep-learning technique a completely alternative for the hydrological model?: A case study on Hyeongsan River basin, Korea. Stochastic Environmental Research and Risk Assessment, Vol. 36, pp. 1615-1629 doi:10.1007/s00477-021-02094-x.
28. Kyoung, M, Kim, H.S, Sivakumar, B, Singh, V.P, and Ahn, K.S (2011) Dynamic characteristics of monthly rainfall in the Korean Peninsula under climate change. Stochastic Environmental Research and Risk Assessment, Vol. 25, No. 4, pp. 613-625 doi:10.1007/s00477-010-0425-9.
29. Lall, U, Sangoyomi, T, and Abarbanel, H.D.I (1996) Nonlinear dynamics of the Great Salt Lake: Nonparametric short-term forecasting. Water Resources Research, Vol. 32, No. 4, pp. 975-985 doi:10.1029/95WR03402.
30. Li, X, Gao, G, Hu, T, Ma, H, and Li, T (2014) Multiple time scales analysis of runoff series based on the chaos theory. Desalination and Water Treatment, Vol. 52, No. 13-15, pp. 2741-2749 doi:10.1080/19443994.2013.813667.
31. Liu, G, Tang, Z, Qin, H, Liu, S, Shen, Q, Qu, Y, and Zhou, J (2022) Short-term runoff prediction using deep learning multi-dimensional ensemble method. Journal of Hydrology, Vol. 609, pp. 127762 doi:10.1016/j.jhydrol.2022.127762.
32. Ombadi, M, Nguyen, P, Sorooshian, S, and Hsu, K.L (2021) Complexity of hydrologic basins: A chaotic dynamics perspective. Journal of Hydrology, Vol. 597, pp. 126222 doi:10.1016/j.jhydrol.2021.126222.
33. Packard, N.H, Crutchfield, J.P, Farmer, J.D, and Shaw, R.S (1980) Geometry from a time series. Physical Review Letters, Vol. 45, pp. 712-716 doi:10.1103/PhysRevLett.45.712.
34. Paik, K.R, Kim, J.H, Kim, H.S, and Lee, D.R (2005) A conceptual rainfall-runoff model considering seasonal variation. Hydrological Processes, Vol. 19, pp. 3837-3850 doi:10.1002/hyp.5984.
35. Porporato, A, and Ridolfi, L (1997) Nonlinear analysis of river flow time sequences. Water Resources Research, Vol. 33, No. 6, pp. 1353-1367 doi:10.1029/96WR03535.
36. Puente, C.E, and Obregon, N (1996) A deterministic geometric representation of temporal rainfall: Results for a storm in Boston. Water Resources Research, Vol. 32, No. 9, pp. 2825-2839 doi:10.1029/96WR01466.
37. Rodriguez-Iturbe, I, Power, B.F.D, Sharifi, M.B, and Georgakakos, K.P (1989) Chaos in rainfall. Water Resources Research, Vol. 25, No. 7, pp. 1667-1675 doi:10.1029/WR025i007p01667.
38. Salas, J.D, Kim, H.S, Eykholt, R, Burlando, P, and Green, T (2005) Aggregation and sampling in deterministic chaos: Implications on the dynamics of hydrological processes. Nonlinear Processes in Geophysics, Vol. 12, pp. 557-567 doi:10.5194/npg-12-557-2005.
39. Sang, Y.F, Singh, V.P, Wen, J, and Liu, C (2015) Gradation of complexity and predictability of hydrological processes. JGR Atmospheres, Vol. 120, No. 11, pp. 5334-5343 doi:10.1002/2014JD022844.
40. Sangoyomi, T.B, Lall, U, and Abarbanel, H.D.I (1996) Nonlinear dynamics of the Great Salt Lake: Dimension estimation. Water Resources Research, Vol. 32, No. 1, pp. 149-159 doi:10.1029/95WR02872.
41. Sharifi, M.B, Georgakakos, K.P, and Rodriguez-Iturbe, I (1990) Evidence of deterministic chaos on the pulse of storm rainfall. J. of Atmos. Sci, Vol. 47, pp. 888-893.
42. Sivakumar, B, and Singh, V.P (2012) Hydrologic system complexity and nonlinear dynamic concepts for a catchment classification framework. Hydrol. Earth Syst. Sci, Vol. 16, No. 11, pp. 4119-4131 doi:10.5194/hess-16-4119-2012.
43. Swagatika, S, Paul, J.C, Sahoo, B.B, Gupta, S.K, and Singh, P.K (2024) Improving the forecasting accuracy of monthly runoff time series of the Brahmani River in India using a hybrid deep learning model. Water & Climate Change, Vol. 15, No. 1, pp. 139-156 doi:10.2166/wcc.2023.487.
44. Takens, F (1981) Detecting strange attractors in turbulence. In: Rand D.A, Young L.S, eds. Lecture notes in mathematics, Vol. 898, pp. 336-381 doi:10.1007/BFb0091924.
45. Tsonis, A.A, and Elsner, J.B (1988) The weather attractor over very short time scales. Nature, Vol. 333, pp. 545-547 doi:10.1038/333545a0.
46. Wang, S, Liu, Y, Wang, W, Zhao, G, and Liang, H (2024) Interpretable machine learning guided by physical mechanisms reveals drivers of runoff under dynamic land use changes. Journal of Environmental Management, Vol. 367, pp. 121978 doi:10.1016/j.jenvman.2024.121978.
47. Wang, W.J, Yoo, Y.H, Lee, M.J, Bae, Y.H, and Kim, H.S (2019) Analysis of chaos characterization and forecasting of daily streamflow. Journal of Wetlands Research, Vol. 21, No. 3, pp. 236-243 doi:10.17663/JWR.2019.21.3.236.

48. Waqas, M, and Humphries, U.W (2024) A critical review of RNN and LSTM variants in hydrological time series predictions. MethodsX, Vol. 13, pp. 102946 doi:10.1016/j.mex.2024.102946.
49. Widiasari, I.R, and Efendi, R (2024) Utilizing LSTM-GRU for IoT-based water level prediction using multi-variable rainfall time series data. Informatics, Vol. 11, No. 4, pp. 73 doi:10.3390/informatics11040073.
50. Yu, S, Wang, W, Liang, H, Zhang, Y, and Liu, M (2025) Characterizing the nonlinear dynamics of hydrological systems based on global recurrence analysis. Journal of Hydrology, Vol. 654, pp. 132817 doi:10.1016/j.jhydrol.2025.132817.
51. Zhang, T, Zhang, R, Li, J, and Feng, P (2025) Deep learning of flood forecasting by considering interpretability and physical constraints. Hydrol. Earth Syst. Sci, hess-2024-393. doi:10.5194/hess-2024-393.

