Abstract
Cross-sample entropy (CSE) measures the level of association between two time series that are not necessarily stationary. The current criteria for estimating the CSE rely on the normality assumption, which is not necessarily satisfied in practice. In addition, the CSE calculation depends on a tolerance parameter and an embedding dimension parameter, both of which are chosen rather subjectively. In this paper, we define a new way of estimating the CSE with a nonparametric approach. Specifically, a residual-based bootstrap-type estimator is considered for long-memory and heteroskedastic models. The established criteria are then redefined for this approach for generalization purposes. Finally, a simulation study evaluates the performance of the estimation technique. An application to foreign exchange market data before and after the 1999 Asian financial crisis is used to study the synchrony level of the CAD/USD and SGD/USD foreign exchange rate time series. The bootstrap-type method yielded a more realistic estimate of the CSE statistics. Specifically, the estimated CSE differed slightly from that obtained in previous studies for both periods, but the synchrony level between the time series, as measured by the CSE, was higher after the 1999 Asian financial crisis.
Data availability
Data used in this paper will be made available upon reasonable request from the corresponding author.
References
Abramson, A., Cohen, I.: On the stationarity of Markov-switching GARCH processes. Econom. Theor. 23, 485–500 (2007)
Al-Eyd, A.J., Karasulu, M.: Ambition versus gradualism in disinflation horizons under bounded rationality: The case of Chile. NIESR discussion papers (2008-03) (2008)
Ali, A., Khan, S.A., Khalil, A.U., Khan, D.M.: Bootstrap prediction intervals for time series with hetroscedastic errors. Pak. J. Stat. 33, 1–13 (2017)
Anděl, J., Netuka, I., Zvára, K.: On threshold autoregressive processes. Kybernetika 20, 89–106 (1984)
Baillie, R.T., Bollerslev, T., Mikkelsen, H.O.: Fractionally integrated generalized autoregressive conditional heteroskedasticity. J. Econ. 74, 3–30 (1996)
Bhattacharyya, R., Hossain, S.A., Kar, S.: Fuzzy cross-entropy, mean, variance, skewness models for portfolio selection. J. King Saud Univ. Comput. Inf. Sci. 26, 79–87 (2014)
Bisaglia, L., Guégan, D.: A comparison of techniques of estimation in long-memory processes. Comput. Stat. Data Anal. 27, 61–81 (1998)
Bollerslev, T.: Generalized autoregressive conditional heteroskedasticity. J. Econ. 31, 307–327 (1986)
Contreras-Reyes, J.E.: Asymptotic form of the Kullback-Leibler divergence for multivariate asymmetric heavy-tailed distributions. Physica A 395, 200–208 (2014)
Contreras-Reyes, J.E.: Mutual information matrix based on asymmetric Shannon entropy for nonlinear interactions of time series. Nonlin. Dyn. 104, 3913–3924 (2021)
Contreras-Reyes, J.E., Palma, W.: Statistical analysis of autoregressive fractionally integrated moving average models in R. Comput. Stat. 28, 2309–2331 (2013)
Contreras-Reyes, J.E., Idrovo-Aguirre, B.J.: Backcasting and forecasting time series using detrended cross-correlation analysis. Physica A 560, 125109 (2020)
Dahlhaus, R.: Efficient parameter estimation for self-similar processes. Ann. Stat. 17, 1749–1766 (1989)
Efron, B., Tibshirani, R.J.: An Introduction to the Bootstrap. CRC Press, Boca Raton (1994)
Franco, G.C., Reisen, V.A.: Bootstrap techniques in semiparametric estimation methods for ARFIMA models: a comparison study. Comput. Stat. 19, 243–259 (2004)
Goldberger, A.L., Amaral, L.A., Glass, L., Hausdorff, J.M., Ivanov, P.C., Mark, R.G., Mietus, J.E., Moody, G.B., Peng, C.K., Stanley, H.E.: PhysioBank, PhysioToolkit, and PhysioNet: components of a new research resource for complex physiologic signals. Circulation 101, e215–e220 (2000)
Granger, C.W., Joyeux, R.: An introduction to long-memory time series models and fractional differencing. J. Time Ser. Anal. 1, 15–29 (1980)
Hall, P., Horowitz, J.L., Jing, B.Y.: On blocking rules for the bootstrap with dependent data. Biometrika 82, 561–574 (1995)
Hernández-Santoro, C., Contreras-Reyes, J.E., Landaeta, M.F.: Intra-seasonal variability of sea surface temperature influences phenological decoupling in anchovy (Engraulis ringens). J. Sea Res. 152, 101765 (2019)
Idrovo-Aguirre, B.J., Contreras-Reyes, J.E.: The response of housing construction to a copper price shock in Chile (2009–2020). Economies 9, 98 (2021)
Jamin, A., Humeau-Heurtier, A.: (Multiscale) Cross-entropy methods: a review. Entropy 22, 45 (2020)
Jeong, M.: Residual-based GARCH bootstrap and second order asymptotic refinement. Econom. Theor. 33, 779–790 (2017)
Karmakar, C., Udhayakumar, R., Palaniswami, M.: Entropy profiling: a reduced-parametric measure of Kolmogorov-Sinai entropy from short-term HRV signal. Entropy 22, 1396 (2020)
Lake, D.E., Richman, J.S., Griffin, M.P., Moorman, J.R.: Sample entropy analysis of neonatal heart rate variability. Amer. J. Physiol. Heart C 283, R789–R797 (2002)
Li, B., Han, G., Jiang, S., Yu, Z.: Composite multiscale partial cross-sample entropy analysis for quantifying intrinsic similarity of two time series affected by common external factors. Entropy 22, 1003 (2020)
Liu, L.Z., Qian, X.Y., Lu, H.Y.: Cross-sample entropy of foreign exchange time series. Physica A 389, 4785–4792 (2010)
Fisher, T.J., Gallagher, C.M.: New weighted portmanteau statistics for time series goodness of fit testing. J. Amer. Stat. Assoc. 107, 777–787 (2012)
MathWorks: MATLAB, version 9.2.0.538062. The MathWorks Inc., Natick, Massachusetts, USA (2017)
Palma, W.: Long-memory time series, theory and methods. Wiley, Hoboken (2007)
Palma, W.: Time Series Analysis. Wiley, Hoboken (2016)
R Core Team: R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria (2020). Available at http://www.R-project.org
Ray, S., Das, S.S., Mishra, P., Al Khatib, A.M.G.: Time series SARIMA modelling and forecasting of monthly rainfall and temperature in the South Asian countries. Earth Syst. Environ., in press (2021). https://doi.org/10.1007/s41748-021-00205-w
Richman, J.S., Moorman, J.R.: Physiological time-series analysis using approximate entropy and sample entropy. Amer. J. Physiol. Heart C. 278, H2039–H2049 (2000)
Shang, D., Shang, P., Zhang, Z.: Efficient synchronization estimation for complex time series using refined cross-sample entropy measure. Commun. Nonlinear Sci. 94, 105556 (2021)
Shi, W., Shang, P.: Cross-sample entropy statistic as a measure of synchronism and cross-correlation of stock markets. Nonlin. Dyn. 71, 539–554 (2013)
Shimizu, K.: Bootstrapping stationary ARMA-GARCH models. Vieweg+Teubner (2010)
Silverman, B.W.: Density estimation for statistics and data analysis, vol. 26. CRC Press, Boca Raton (1986)
Sun, Z., Fisher, T.J.: Testing for correlation between two time series using a parametric bootstrap. J. Appl. Stat., in press (2020)
Udhayakumar, R.K., Karmakar, C., Palaniswami, M.: Approximate entropy profile: a novel approach to comprehend irregularity of short-term HRV signal. Nonlin. Dyn. 88, 823–837 (2017)
Valipour, M., Bateni, S.M., Gholami Sefidkouhi, M.A., Raeini-Sarjaz, M., Singh, V.P.: Complexity of forces driving trend of reference evapotranspiration and signals of climate change. Atmosphere 11, 1081 (2020)
Wand, M.P., Jones, M.C.: Kernel Smoothing. CRC Press, Boca Raton (1994)
Wang, G.J., Xie, C., Han, F.: Multi-scale approximate entropy analysis of foreign exchange markets efficiency. Sys. Eng. Proc. 3, 201–208 (2012)
Wang, F., Zhao, W., Jiang, S.: Detecting asynchrony of two series using multiscale cross-trend sample entropy. Nonlin. Dyn. 99, 1451–1465 (2020)
Wasserman, L.: All of statistics: a concise course in statistical inference. Springer Science & Business Media (2013)
Whittle, P.: Estimation and information in stationary time series. Arkiv för Matematik 2, 423–434 (1953)
Xie, H.B., Zheng, Y.P., Guo, J.Y., Chen, X.: Cross-fuzzy entropy: A new method to test pattern synchrony of bivariate time series. Inform. Sci. 180, 1715–1724 (2010)
Yan, R., Yang, Z., Zhang, T.: Multiscale cross entropy: a novel algorithm for analyzing two time series. Proc. Int. Conf. Natural Comput. 1, 411–413. Tianjin, China, 14–16 August 2009 (2009)
Acknowledgements
The authors thank Sergio Contreras-Espinoza for additional suggestions and useful comments on an earlier draft of this paper. The authors also thank the editor and two anonymous referees for their helpful comments and suggestions. The R and MATLAB codes used in this work are available from the corresponding author upon request.
Funding
This study was funded by FONDECYT (Chile) grant No. 11190116.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendices
Appendix A: Cross-sample entropy variance
The CSE variance is obtained following Richman and Moorman [33]. Let \(\Upsilon =(N-m)^{2}{\bar{A}}_m(r)\) and fix \(\Psi =(N-m)^{2}{\bar{B}}_m(r)\) (i.e., \(\Psi \) is not a random variable), where \({\bar{A}}_m\) and \({\bar{B}}_m\) are defined in (1.2). Then we get \(CP=\Upsilon /\Psi ={\bar{A}}_m(r)/{\bar{B}}_m(r)\),
which corresponds to an estimate of the conditional probability that u and v coincide for \(m+1\) points, given that they coincide for m points. According to Liu et al. [26], we get
where
Considering \(\sigma _{CP}^{2}\) as the variance of the estimator CP, we get
since, for \(i=k\) and \(j=\ell \), the covariance between \(U_{i,j}\) and \(U_{k,\ell }\) is \(Cov(U_{i,j},U_{k,\ell })=Var(U_{k,\ell })=CP(1-CP)\). If \(i \ne k\) and \(j \ne \ell \), \(U_{i,j}\) and \(U_{k,\ell }\) are independent and \(Cov(U_{i,j},U_{k,\ell })=0\).
If vectors overlap, i.e., if \(\min \{ |i-k|,|j-\ell | \} \le m\), we get
Then, \(\sigma _{CP}^{2}\) is estimated by
where \(M_{\Upsilon }\) is the number of pairs of matching templates of length \(m+1\) whose vectors \(x_{i,m+1}\) and \(y_{j,m+1}\) overlap, and \(M_{\Psi }\) is the number of pairs of matching templates of length m whose vectors \(x_{i,m}\) and \(y_{j,m}\) overlap. Using the Delta method, the CSE variance \(\sigma _{CSE}^{2}\) is estimated as
where \(g(t)=-\log (t)\), \(t>0\).
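The quantities in this appendix can be illustrated with a minimal Python sketch. It assumes the usual template-matching definition of cross-sample entropy (Chebyshev distance between length-m templates, tolerance r), because Eq. (1.2) is not reproduced in this section. The function names `count_matches`, `cross_sample_entropy` and `delta_method_variance` are hypothetical. The last function only applies the Delta-method scaling implied by \(g(t)=-\log (t)\), for which \(g'(t)=-1/t\), to a value of \(\sigma _{CP}^{2}\) supplied by the user. It is not the authors' R or MATLAB code.

```python
import math


def count_matches(u, v, length, r, n_templates):
    """Count pairs (i, j) whose templates of the given length lie within
    tolerance r of each other under the Chebyshev (max-abs) distance."""
    count = 0
    for i in range(n_templates):
        for j in range(n_templates):
            dist = max(abs(u[i + k] - v[j + k]) for k in range(length))
            if dist <= r:
                count += 1
    return count


def cross_sample_entropy(u, v, m, r):
    """Return (CSE, CP, upsilon, psi) for two equal-length series.

    psi counts matches of length m and upsilon counts matches of length
    m + 1, both over N - m templates (an assumed reading of Eq. (1.2)).
    """
    n = len(u)
    if len(v) != n:
        raise ValueError("u and v must have the same length")
    n_templates = n - m
    psi = count_matches(u, v, m, r, n_templates)
    upsilon = count_matches(u, v, m + 1, r, n_templates)
    if psi == 0 or upsilon == 0:
        raise ValueError("no template matches; CSE is undefined")
    cp = upsilon / psi            # estimated conditional probability
    return -math.log(cp), cp, upsilon, psi


def delta_method_variance(cp, var_cp):
    """Delta method with g(t) = -log(t): Var[g(CP)] ~ g'(CP)^2 * Var(CP)."""
    return var_cp / cp ** 2
```

For example, `cross_sample_entropy(u, v, m=2, r=0.2)` on two standardized series returns the entropy together with the match counts, and `delta_method_variance(cp, var_cp)` converts a variance estimate for CP into one for the CSE.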
Appendix B: Whittle estimator
The methodology to approximate the maximum likelihood estimator (MLE) is based on the calculation of the periodogram by means of the fast Fourier transform and the use of the approximation of the Gaussian log-likelihood function due to Whittle [45] and Bisaglia and Guégan [7]. Suppose that the sample vector \(\mathbf{Y}=(y_1,y_2,\ldots ,y_n)\) is normally distributed with zero mean and autocovariance given by
\(k,j=1,\ldots ,n\), where \(f(\lambda )\) is the spectral density of the ARFIMA model (2.4) defined by
and is associated with the parameter set \(\mathbf{\Omega }=(\alpha _1,\ldots ,\alpha _p,d,\beta _1,\ldots ,\beta _q)\) of (2.4). The log-likelihood function of the process Y is given by
where \(\varvec{\Delta }=[\gamma (k-j)]\). For calculating (5.1), two asymptotic approximations are made for the terms \(\log |\varvec{\Delta }|\) and \(\mathbf{Y}^{\top }\varvec{\Delta }^{-1} \mathbf{Y}\) to obtain
as \(n\rightarrow \infty \), where
is the periodogram. Thus, a discrete version of (5.2) is the Riemann approximation of the integral given by
where \(\lambda _j=2\pi j/n\) are the Fourier frequencies. To find the estimator of the parameter vector \(\varvec{\Omega }\), we minimize \(L(\varvec{\Omega }|\mathbf{Y})\) with the nlm function of R [31]. This nonlinear minimization function minimizes \(L(\varvec{\Omega }|\mathbf{Y})\) using a Newton-type algorithm. Under regularity conditions according to Theorem 2 of Contreras-Reyes and Palma [11], the Whittle estimator \(\widehat{\varvec{\Omega }}\) that optimizes the approximate log-likelihood function given in (5.3) is consistent [5] and asymptotically normally distributed [13].
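The procedure described above can be sketched in Python for the simplest case, an ARFIMA(0, d, 0) model. This is a rough illustration under several assumptions that do not come from the paper: the spectral density is taken as \(f(\lambda )=\frac{\sigma ^{2}}{2\pi }(2\sin (\lambda /2))^{-2d}\), the sum runs over the Fourier frequencies \(j=1,\ldots ,\lfloor (n-1)/2\rfloor \), \(\sigma ^{2}\) is profiled out analytically, and a golden-section search replaces the Newton-type nlm routine of R. A naive discrete Fourier transform stands in for the fast Fourier transform to keep the sketch dependency-free. All function names are hypothetical.

```python
import cmath
import math
import random


def periodogram(y):
    """Periodogram I(lambda_j) at Fourier frequencies j = 1..floor((n-1)/2),
    computed with a naive DFT (an FFT would be used in practice)."""
    n = len(y)
    m = (n - 1) // 2
    freqs = [2 * math.pi * j / n for j in range(1, m + 1)]
    values = []
    for lam in freqs:
        s = sum(y[t] * cmath.exp(-1j * lam * t) for t in range(n))
        values.append(abs(s) ** 2 / (2 * math.pi * n))
    return freqs, values


def whittle_objective(d, freqs, pgram):
    """Profiled Whittle objective for ARFIMA(0, d, 0), up to a constant.

    With f = sigma2 * g, the minimizing sigma2 is mean(I / g), and the
    objective reduces to log(sigma2) + mean(log g).
    """
    log_g = [-2 * d * math.log(2 * math.sin(lam / 2)) - math.log(2 * math.pi)
             for lam in freqs]
    sigma2 = sum(i / math.exp(lg) for i, lg in zip(pgram, log_g)) / len(freqs)
    return math.log(sigma2) + sum(log_g) / len(freqs), sigma2


def estimate_d(y, lo=-0.49, hi=0.49, tol=1e-6):
    """Golden-section search for the d that minimizes the Whittle objective.
    Returns (d_hat, sigma2_hat)."""
    freqs, pgram = periodogram(y)
    ratio = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    while b - a > tol:
        c = b - ratio * (b - a)
        e = a + ratio * (b - a)
        if whittle_objective(c, freqs, pgram)[0] < whittle_objective(e, freqs, pgram)[0]:
            b = e   # minimum lies in [a, e]
        else:
            a = c   # minimum lies in [c, b]
    d_hat = (a + b) / 2
    return d_hat, whittle_objective(d_hat, freqs, pgram)[1]


if __name__ == "__main__":
    random.seed(1)
    noise = [random.gauss(0.0, 1.0) for _ in range(200)]
    print(estimate_d(noise))  # white noise, so d_hat should be near zero
```

The golden-section search is adequate here because the profiled objective is convex in d for this one-parameter model. With ARMA terms the parameter vector is multidimensional and a general-purpose optimizer would be needed.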
Appendix C: Quasi-maximum likelihood estimator
Let \(\{ y_t \}\) be a FIGARCH(p, d, q)-(a, b) process described in (2.6–2.8), with a parameter set \(\varvec{\Omega }=(\varvec{\Omega }_{1},\varvec{\Omega }_{2})^{\top }\), where \(\varvec{\Omega }_{1}=(\phi _{1},\ldots ,\phi _{p},d,\theta _{1},\ldots ,\theta _{q})^{\top }\) and \(\varvec{\Omega }_{2}=(\alpha _{0},\ldots ,\alpha _{a},\beta _{1},\ldots ,\beta _{b})^{\top }\), and let \(\{ y_t \}\) be associated with the sample vector \(\mathbf{Y}=(y_1,y_2,\ldots ,y_n)\). An approximate MLE, or quasi-MLE, \(\widehat{\varvec{\Omega }}\) for \(\{ y_t \}\) is obtained by maximizing the conditional log-likelihood function
Let \(\widehat{\varvec{\Omega }}=\widehat{\varvec{\Omega }}_n\) be the exact point where the function (5.4) is maximized. Under regularity conditions discussed in Baillie et al. [5] and Palma [30], we have that \(\widehat{\varvec{\Omega }}\) is a consistent estimator and
as \(n\rightarrow \infty \), where \({\mathcal {D}}\) denotes convergence in distribution, \(\kappa =p+q+a+b+2\) and \(\varvec{\Sigma }=\text {diag}(\varvec{\Sigma }_{1},\varvec{\Sigma }_{2})\) with
and
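As a rough illustration of the conditional log-likelihood that a quasi-MLE maximizes, the Python sketch below evaluates a Gaussian quasi-log-likelihood for the conditional-variance part only. It makes several assumptions that are not taken from the paper: the residuals \(\varepsilon _t\) of the mean equation are already available, the conditional variance follows the standard GARCH(a, b) recursion \(\sigma _t^{2}=\alpha _0+\sum _{i=1}^{a}\alpha _i\varepsilon _{t-i}^{2}+\sum _{j=1}^{b}\beta _j\sigma _{t-j}^{2}\), pre-sample values are set to the mean squared residual, and the additive constant is dropped. The normalization of (5.4) may differ. The function names are hypothetical.

```python
import math


def garch_variances(resid, alpha, beta):
    """Conditional variances from an assumed GARCH(a, b) recursion.

    alpha = [alpha_0, ..., alpha_a], beta = [beta_1, ..., beta_b].
    Pre-sample squared residuals and variances are replaced by the
    mean squared residual (a common but arbitrary initialization).
    """
    a = len(alpha) - 1
    b = len(beta)
    init = sum(e * e for e in resid) / len(resid)
    sigma2 = []
    for t in range(len(resid)):
        s = alpha[0]
        for i in range(1, a + 1):
            s += alpha[i] * (resid[t - i] ** 2 if t - i >= 0 else init)
        for j in range(1, b + 1):
            s += beta[j - 1] * (sigma2[t - j] if t - j >= 0 else init)
        sigma2.append(s)
    return sigma2


def quasi_loglik(resid, alpha, beta):
    """Gaussian quasi-log-likelihood, without the -(n/2) log(2 pi) constant."""
    sigma2 = garch_variances(resid, alpha, beta)
    return -0.5 * sum(math.log(s) + e * e / s for e, s in zip(resid, sigma2))
```

A quasi-MLE would be obtained by maximizing `quasi_loglik` over the parameters with a numerical optimizer, subject to positivity and stationarity constraints that this sketch does not enforce.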
Cite this article
Ramírez-Parietti, I., Contreras-Reyes, J.E. & Idrovo-Aguirre, B.J. Cross-sample entropy estimation for time series analysis: a nonparametric approach. Nonlinear Dyn 105, 2485–2508 (2021). https://doi.org/10.1007/s11071-021-06759-8
DOI: https://doi.org/10.1007/s11071-021-06759-8