Cross-sample entropy estimation for time series analysis: a nonparametric approach

  • Original paper
  • Nonlinear Dynamics

Abstract

Cross-sample entropy (CSE) makes it possible to analyze the level of association between two time series that are not necessarily stationary. Current criteria for estimating the CSE rest on a normality assumption, which is not necessarily satisfied in practice. Moreover, the CSE calculation depends on a tolerance and an embedding dimension parameter, both of which are chosen rather subjectively. In this paper, we define a new way of estimating the CSE with a nonparametric approach. Specifically, a residual-based bootstrap-type estimator is considered for long-memory and heteroskedastic models. Subsequently, the established criteria are redefined for this approach for generalization purposes. Finally, a simulation study serves to evaluate the performance of the estimation technique. An application to foreign exchange market data before and after the 1999 Asian financial crisis is considered to study the synchrony level of the CAD/USD and SGD/USD foreign exchange rate time series. The bootstrap-type method yielded a more realistic estimate of the CSE statistics. Specifically, the estimated CSE differed slightly from that obtained in previous studies, but for both periods the synchrony level using CSE between the time series was higher after the 1999 Asian financial crisis.

Data availability

Data used in this paper will be made available upon reasonable request from the corresponding author.

References

  1. Abramson, A., Cohen, I.: On the stationarity of Markov-switching GARCH processes. Econom. Theor. 23, 485–500 (2007)

  2. Al-Eyd, A.J., Karasulu, M.: Ambition versus gradualism in disinflation horizons under bounded rationality: The case of Chile. NIESR discussion papers (2008-03) (2008)

  3. Ali, A., Khan, S.A., Khalil, A.U., Khan, D.M.: Bootstrap prediction intervals for time series with hetroscedastic errors. Pak. J. Stat. 33, 1–13 (2017)

  4. Anděl, J., Netuka, I., Zvára, K.: On threshold autoregressive processes. Kybernetika 20, 89–106 (1984)

  5. Baillie, R.T., Bollerslev, T., Mikkelsen, H.O.: Fractionally integrated generalized autoregressive conditional heteroskedasticity. J. Econ. 74, 3–30 (1996)

  6. Bhattacharyya, R., Hossain, S.A., Kar, S.: Fuzzy cross-entropy, mean, variance, skewness models for portfolio selection. J. King Saud Univ. Comput. Inf. Sci. 26, 79–87 (2014)

  7. Bisaglia, L., Guégan, D.: A comparison of techniques of estimation in long-memory processes. Comput. Stat. Data Anal. 27, 61–81 (1998)

  8. Bollerslev, T.: Generalized autoregressive conditional heteroskedasticity. J. Econ. 31, 307–327 (1986)

  9. Contreras-Reyes, J.E.: Asymptotic form of the Kullback-Leibler divergence for multivariate asymmetric heavy-tailed distributions. Physica A 395, 200–208 (2014)

  10. Contreras-Reyes, J.E.: Mutual information matrix based on asymmetric Shannon entropy for nonlinear interactions of time series. Nonlin. Dyn. 104, 3913–3924 (2021)

  11. Contreras-Reyes, J.E., Palma, W.: Statistical analysis of autoregressive fractionally integrated moving average models in R. Comput. Stat. 28, 2309–2331 (2013)

  12. Contreras-Reyes, J.E., Idrovo-Aguirre, B.J.: Backcasting and forecasting time series using detrended cross-correlation analysis. Physica A 560, 125109 (2020)

  13. Dahlhaus, R.: Efficient parameter estimation for self-similar processes. Ann. Stat. 17, 1749–1766 (1989)

  14. Efron, B., Tibshirani, R.J.: An Introduction to the Bootstrap. CRC Press, Boca Raton (1994)

  15. Franco, G.C., Reisen, V.A.: Bootstrap techniques in semiparametric estimation methods for ARFIMA models: a comparison study. Comput. Stat. 19, 243–259 (2004)

  16. Goldberger, A.L., Amaral, L.A., Glass, L., Hausdorff, J.M., Ivanov, P.C., Mark, R.G., Mietus, J.E., Moody, G.B., Peng, C.K., Stanley, H.E.: PhysioBank, PhysioToolkit, and PhysioNet: components of a new research resource for complex physiologic signals. Circulation 101, e215–e220 (2000)

  17. Granger, C.W., Joyeux, R.: An introduction to long-memory time series models and fractional differencing. J. Time Ser. Anal. 1, 15–29 (1980)

  18. Hall, P., Horowitz, J.L., Jing, B.Y.: On blocking rules for the bootstrap with dependent data. Biometrika 82, 561–574 (1995)

  19. Hernández-Santoro, C., Contreras-Reyes, J.E., Landaeta, M.F.: Intra-seasonal variability of sea surface temperature influences phenological decoupling in anchovy (Engraulis ringens). J. Sea Res. 152, 101765 (2019)

  20. Idrovo-Aguirre, B.J., Contreras-Reyes, J.E.: The response of housing construction to a copper price shock in Chile (2009–2020). Economies 9, 98 (2021)

  21. Jamin, A., Humeau-Heurtier, A.: (Multiscale) Cross-entropy methods: a review. Entropy 22, 45 (2020)

  22. Jeong, M.: Residual-based GARCH bootstrap and second order asymptotic refinement. Econom. Theor. 33, 779–790 (2017)

  23. Karmakar, C., Udhayakumar, R., Palaniswami, M.: Entropy profiling: a reduced-parametric measure of Kolmogorov-Sinai entropy from short-term HRV signal. Entropy 22, 1396 (2020)

  24. Lake, D.E., Richman, J.S., Griffin, M.P., Moorman, J.R.: Sample entropy analysis of neonatal heart rate variability. Amer. J. Physiol. Heart C 283, R789–R797 (2002)

  25. Li, B., Han, G., Jiang, S., Yu, Z.: Composite multiscale partial cross-sample entropy analysis for quantifying intrinsic similarity of two time series affected by common external factors. Entropy 22, 1003 (2020)

  26. Liu, L.Z., Qian, X.Y., Lu, H.Y.: Cross-sample entropy of foreign exchange time series. Physica A 389, 4785–4792 (2010)

  27. Fisher, T.J., Gallagher, C.M.: New weighted portmanteau statistics for time series goodness of fit testing. J. Amer. Stat. Assoc. 107, 777–787 (2012)

  28. Matlab (2017). MATLAB, version 9.2.0.538062. The MathWorks Inc., Natick, Massachusetts, USA

  29. Palma, W.: Long-memory time series, theory and methods. Wiley, Hoboken (2007)

  30. Palma, W.: Time Series Analysis. Wiley, Hoboken (2016)

  31. R Core Team (2020). A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria. Available at http://www.R-project.org

  32. Ray, S., Das, S.S., Mishra, P., Al Khatib, A.M.G.: Time series SARIMA Modelling and forecasting of monthly rainfall and temperature in the south Asian countries. Earth Sys. Envir., in press (2021). https://doi.org/10.1007/s41748-021-00205-w

  33. Richman, J.S., Moorman, J.R.: Physiological time-series analysis using approximate entropy and sample entropy. Amer. J. Physiol. Heart C. 278, H2039–H2049 (2000)

  34. Shang, D., Shang, P., Zhang, Z.: Efficient synchronization estimation for complex time series using refined cross-sample entropy measure. Commun. Nonlinear Sci. 94, 105556 (2021)

  35. Shi, W., Shang, P.: Cross-sample entropy statistic as a measure of synchronism and cross-correlation of stock markets. Nonlin. Dyn. 71, 539–554 (2013)

  36. Shimizu, K.: Bootstrapping stationary ARMA-GARCH models. Vieweg+Teubner (2010)

  37. Silverman, B.W.: Density estimation for statistics and data analysis, vol. 26. CRC Press, Boca Raton (1986)

  38. Sun, Z., Fisher, T.J.: Testing for correlation between two time series using a parametric bootstrap. J. Appl. Stat., in press (2020)

  39. Udhayakumar, R.K., Karmakar, C., Palaniswami, M.: Approximate entropy profile: a novel approach to comprehend irregularity of short-term HRV signal. Nonlin. Dyn. 88, 823–837 (2017)

  40. Valipour, M., Bateni, S.M., Gholami Sefidkouhi, M.A., Raeini-Sarjaz, M., Singh, V.P.: Complexity of forces driving trend of reference evapotranspiration and signals of climate change. Atmosphere 11, 1081 (2020)

  41. Wand, M.P., Jones, M.C.: Kernel Smoothing. CRC Press, Boca Raton (1994)

  42. Wang, G.J., Xie, C., Han, F.: Multi-scale approximate entropy analysis of foreign exchange markets efficiency. Sys. Eng. Proc. 3, 201–208 (2012)

  43. Wang, F., Zhao, W., Jiang, S.: Detecting asynchrony of two series using multiscale cross-trend sample entropy. Nonlin. Dyn. 99, 1451–1465 (2020)

  44. Wasserman, L.: All of statistics: a concise course in statistical inference. Springer Science & Business Media (2013)

  45. Whittle, P.: Estimation and information in stationary time series. Arkiv för Matematik 2, 423–434 (1953)

  46. Xie, H.B., Zheng, Y.P., Guo, J.Y., Chen, X.: Cross-fuzzy entropy: A new method to test pattern synchrony of bivariate time series. Inform. Sci. 180, 1715–1724 (2010)

  47. Yan, R., Yang, Z., Zhang, T.: Multiscale Cross Entropy: A Novel Algorithm for Analyzing Two Time Series. Proc. Int. Conf. Natural Comput. 1, 411–413. Tianjin, China, 14–16 August 2009 (2009)

Acknowledgements

The authors thank Sergio Contreras-Espinoza for additional suggestions and useful comments on an earlier draft of this paper. The authors also thank the editor and two anonymous referees for their helpful comments and suggestions. The R and MATLAB codes used in this work are available from the corresponding author upon request.

Funding

This study was funded by FONDECYT (Chile) grant No. 11190116.

Author information

Corresponding author

Correspondence to Javier E. Contreras-Reyes.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 308 KB)

Appendices

Appendix A: Cross-sample entropy variance

The CSE variance follows the derivation of Richman and Moorman [33]. Let \(\Upsilon =(N-m)^{2}{\bar{A}}_m(r)\) and fix \(\Psi =(N-m)^{2}{\bar{B}}_m(r)\) (i.e., \(\Psi \) is treated as a constant rather than a random variable), where \({\bar{A}}_m\) and \({\bar{B}}_m\) are defined in (1.2). Then

$$\begin{aligned} CP=\frac{{\bar{A}}_m(r)}{{\bar{B}}_m(r)}=\frac{\Upsilon }{\Psi }, \end{aligned}$$

which estimates the conditional probability that u and v coincide in \(m+1\) points, given that they coincide in m points. According to Liu et al. [26], we get

$$\begin{aligned} \Upsilon =\displaystyle {\sum _{i,j=1}^{N-m}} U_{ij}, \end{aligned}$$

where

$$\begin{aligned} U_{ij}=\left\{ \begin{array}{lcc} 1, &{} \text {if} &{} d_{\infty }(x_{i,m+1},y_{j,m+1}) \le r, \\ 0, &{} \text {if} &{} d_{\infty }(x_{i,m+1},y_{j,m+1}) > r. \end{array}\right. \end{aligned}$$

Considering \(\sigma _{CP}^{2}\) as the variance of the estimator CP, we get

$$\begin{aligned} \sigma _{CP}^{2}= & {} Var\bigg (\frac{\Upsilon }{\Psi }\bigg )\\= & {} \frac{1}{\Psi ^{2}}\displaystyle {\sum _{i=1}^{N-m}\sum _{j=1}^{N-m}\sum _{k=1}^{N-m}\sum _{\ell =1}^{N-m}}Cov(U_{i,j},U_{k,\ell }), \end{aligned}$$

Here, for \(i=k\) and \(j=\ell \), the covariance between \(U_{i,j}\) and \(U_{k,\ell }\) is \(Cov(U_{i,j},U_{k,\ell })=Var(U_{k,\ell })=CP(1-CP)\). If \(i \ne k\) and \(j \ne \ell \), \(U_{i,j}\) and \(U_{k,\ell }\) are independent and \(Cov(U_{i,j},U_{k,\ell })=0\).

If the vectors overlap, i.e., if \(\min \{ |i-k|,|j-\ell | \} \le m\), the covariance is estimated by

$$\begin{aligned}&Cov(U_{i,j},U_{k,\ell })=U_{i,j} U_{k,\ell }-CP^{2}\\&\quad =\left\{ \begin{array}{ll} 1-CP^{2}, &{} \text {if both pairs coincide in }m+1\text { points}, \\ -CP^{2}, &{} \text {otherwise}. \end{array}\right. \end{aligned}$$

Then, \(\sigma _{CP}^{2}\) is estimated by

$$\begin{aligned} {\widehat{\sigma }}_{CP}^{2}=\frac{CP(1-CP)}{\Psi }+\frac{1}{\Psi ^{2}}\big [M_{\Upsilon }-M_{\Psi }(CP)^{2}\big ], \end{aligned}$$

where \(M_{\Upsilon }\) is the number of pairs of matching templates of length \(m+1\) for which the vectors \(x_{i,m+1}\) and \(y_{j,m+1}\) overlap, and \(M_{\Psi }\) is the number of pairs of matching templates of length m for which the vectors \(x_{i,m}\) and \(y_{j,m}\) overlap. Using the Delta method, the CSE variance \(\sigma _{CSE}^{2}\) is estimated as

$$\begin{aligned} \sigma _{CSE}^{2}\approx [g'(CP)]^{2}\sigma _{CP}^{2}=\frac{\sigma _{CP}^{2}}{CP^{2}}, \end{aligned}$$

where \(g(t)=-\log (t)\), \(t>0\).
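As an illustrative sketch (not the authors' R or MATLAB code), the quantities above can be computed directly from match counts. The function names, the use of an absolute tolerance r, and the choice to keep only the first (binomial) term of \({\widehat{\sigma }}_{CP}^{2}\) are assumptions made here for brevity; the overlap correction \([M_{\Upsilon }-M_{\Psi }(CP)^{2}]/\Psi ^{2}\) is omitted.

```python
import math


def count_matches(u, v, k, n_templates, r):
    """Count pairs (i, j) whose length-k templates lie within Chebyshev distance r."""
    count = 0
    for i in range(n_templates):
        for j in range(n_templates):
            dist = max(abs(u[i + t] - v[j + t]) for t in range(k))
            if dist <= r:
                count += 1
    return count


def cross_sample_entropy(u, v, m=2, r=0.2):
    """Return CP, CSE and a simplified Delta-method variance of CSE.

    Assumption: only the binomial term CP(1 - CP) / Psi of the variance
    estimator is used; the overlap correction term is left out.
    """
    n = len(u)
    if len(v) != n:
        raise ValueError("both series must have the same length")
    n_templates = n - m  # same index range for lengths m and m + 1
    psi = count_matches(u, v, m, n_templates, r)          # matches of length m
    upsilon = count_matches(u, v, m + 1, n_templates, r)  # matches of length m + 1
    if psi == 0 or upsilon == 0:
        raise ValueError("CSE is undefined when a match count is zero")
    cp = upsilon / psi
    cse = -math.log(cp)
    var_cp = cp * (1.0 - cp) / psi
    var_cse = var_cp / cp ** 2  # [g'(CP)]^2 * var_cp with g(t) = -log(t)
    return {"psi": psi, "upsilon": upsilon, "cp": cp, "cse": cse, "var_cse": var_cse}
```

For instance, with u = v = (0, 1, 0, 1, 0, 0), m = 2 and r = 0.1, the counts are \(\Psi =8\) and \(\Upsilon =6\), so \(CP=0.75\) and the simplified variance of the CSE is \(0.75\cdot 0.25/(8\cdot 0.75^{2})=1/24\).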

Appendix B: Whittle estimator

The methodology to approximate the maximum likelihood estimator (MLE) is based on the calculation of the periodogram by means of the fast Fourier transform and the use of the approximation of the Gaussian log-likelihood function due to Whittle [45] and Bisaglia and Guégan [7]. Suppose that the sample vector \(\mathbf{Y}=(y_1,y_2,\ldots ,y_n)\) is normally distributed with zero mean and autocovariance given by

$$\begin{aligned} \gamma (k-j)= & {} \int _{-\pi }^{\pi }f(\lambda )e^{i\lambda (k-j)}d\lambda , \end{aligned}$$

\(k,j=1,\ldots ,n\), where \(f(\lambda )\) is the spectral density of the ARFIMA model (2.4) defined by

$$\begin{aligned} f(\lambda )=\frac{\sigma ^2}{2\pi }\left( 2\sin \frac{\lambda }{2}\right) ^{-2d}\frac{|\Theta (e^{-i\lambda })|^2}{|\Phi (e^{-i\lambda })|^2} \end{aligned}$$

and is associated with the parameter set \(\mathbf{\Omega }=(\alpha _1,\ldots ,\alpha _p,d,\beta _1,\ldots ,\beta _q)\) of (2.4). The log-likelihood function of the process Y is given by

$$\begin{aligned} L(\varvec{\Omega }|\mathbf{Y})=-\frac{1}{2n}[\log |\varvec{\Delta }|+\mathbf{Y}^{\top }\varvec{\Delta }^{-1} \mathbf{Y}], \end{aligned}$$
(5.1)

where \(\varvec{\Delta }=[\gamma (k-j)]\). For calculating (5.1), two asymptotic approximations are made for the terms \(\log |\varvec{\Delta }|\) and \(\mathbf{Y}^{\top }\varvec{\Delta }^{-1} \mathbf{Y}\) to obtain

$$\begin{aligned} L(\varvec{\Omega }|\mathbf{Y})\approx -\frac{1}{4\pi }\left[ \int _{-\pi }^{\pi }\log [2\pi f(\lambda )]d\lambda + \int _{-\pi }^{\pi }\frac{I(\lambda )}{f(\lambda )}d\lambda \right] , \end{aligned}$$
(5.2)

as \(n\rightarrow \infty \), where

$$\begin{aligned} I(\lambda )=\frac{1}{2\pi n}\Bigg |\sum _{j=1}^{n}y_j e^{i\lambda j}\Bigg |^2 \end{aligned}$$

is the periodogram. Thus, a discrete version of (5.2) is the Riemann approximation of the integral given by

$$\begin{aligned} L(\varvec{\Omega }|\mathbf{Y})\approx -\frac{1}{2n}\left[ \sum _{j=1}^{n}\log f(\lambda _j) + \sum _{j=1}^{n}\frac{I(\lambda _j)}{f(\lambda _j)}\right] , \end{aligned}$$
(5.3)

where \(\lambda _j=2\pi j/n\) are the Fourier frequencies. To estimate the parameter vector \(\varvec{\Omega }\), we minimize \(-L(\varvec{\Omega }|\mathbf{Y})\) with the nlm function of the R software [31], which performs nonlinear minimization using a Newton-type algorithm. Under regularity conditions, according to Theorem 2 of Contreras-Reyes and Palma [11], the Whittle estimator \(\widehat{\varvec{\Omega }}\) that maximizes the log-likelihood function given in (5.3) is consistent [5] and asymptotically normally distributed [13].
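The discrete approximation (5.3) can be sketched in a few lines for the simplest case, ARFIMA(0, d, 0), where \(\Theta =\Phi =1\). This is only an illustration under stated assumptions: the paper uses R's nlm, whereas the hypothetical helper estimate_d below replaces the Newton-type search with a plain grid search over d, keeps \(\sigma ^2\) fixed, and sums over \(j=1,\ldots ,n-1\) because \(f(\lambda _n)\) is singular at \(\lambda _n=2\pi \) when \(d>0\).

```python
import cmath
import math


def periodogram(y):
    """Periodogram I(lambda_j) at lambda_j = 2*pi*j/n for j = 1, ..., n-1."""
    n = len(y)
    freqs = [2.0 * math.pi * j / n for j in range(1, n)]
    values = []
    for lam in freqs:
        # y[t] holds y_{t+1}, hence the exponent lam * (t + 1)
        s = sum(y[t] * cmath.exp(1j * lam * (t + 1)) for t in range(n))
        values.append(abs(s) ** 2 / (2.0 * math.pi * n))
    return freqs, values


def spectral_density(lam, d, sigma2=1.0):
    """Spectral density f(lambda) of ARFIMA(0, d, 0), for 0 < lambda < 2*pi."""
    return sigma2 / (2.0 * math.pi) * (2.0 * math.sin(lam / 2.0)) ** (-2.0 * d)


def whittle_loglik(freqs, pgram, d, sigma2=1.0):
    """Discrete Whittle log-likelihood (5.3), omitting the singular term j = n."""
    n = len(freqs) + 1
    total = 0.0
    for lam, i_val in zip(freqs, pgram):
        f = spectral_density(lam, d, sigma2)
        total += math.log(f) + i_val / f
    return -total / (2.0 * n)


def estimate_d(y, sigma2=1.0, grid_size=181):
    """Grid-search maximizer of (5.3) over d in [-0.45, 0.45] (assumed range)."""
    freqs, pgram = periodogram(y)
    grid = [-0.45 + 0.9 * k / (grid_size - 1) for k in range(grid_size)]
    return max(grid, key=lambda d: whittle_loglik(freqs, pgram, d, sigma2))
```

The direct O(n²) Fourier sum keeps the sketch dependency-free; for long series a fast Fourier transform, as mentioned in the text, would be the natural replacement.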

Appendix C: Quasi-maximum likelihood estimator

Let \(\{ y_t \}\) be a FIGARCH(p, d, q)-(a, b) process described in (2.6)–(2.8), with a parameter set \(\varvec{\Omega }=(\varvec{\Omega }_{1},\varvec{\Omega }_{2})^{\top }\), where \(\varvec{\Omega }_{1}=(\phi _{1},\ldots ,\phi _{p},d,\theta _{1},\ldots ,\theta _{q})^{\top }\) and \(\varvec{\Omega }_{2}=(\alpha _{0},\ldots ,\alpha _{a},\beta _{1},\ldots ,\beta _{b})^{\top }\), and \(\{ y_t \}\) is associated with the sample vector \(\mathbf{Y}=(y_1,y_2,\ldots ,y_n)\). An approximated MLE or quasi-MLE \(\widehat{\varvec{\Omega }}\) for \(\{ y_t \}\) is obtained by maximizing the conditional log-likelihood function

$$\begin{aligned} \begin{aligned} L(\varvec{\Omega }|\mathbf{Y}) = -\frac{1}{2n}\sum _{t=1}^{n}\Bigg [ \log \sigma _{t}^{2}+\frac{\varepsilon _{t}^{2}}{\sigma _{t}^{2}} \Bigg ]. \end{aligned} \end{aligned}$$
(5.4)

Let \(\widehat{\varvec{\Omega }}=\widehat{\varvec{\Omega }}_n\) be the exact point where the function (5.4) is maximized. Under regularity conditions discussed in Baillie et al. [5] and Palma [30], we have that \(\widehat{\varvec{\Omega }}\) is a consistent estimator and

$$\begin{aligned} \sqrt{n}(\widehat{\varvec{\Omega }}-\varvec{\Omega }_{0})\mathop {\buildrel {\mathcal {D}}\over \longrightarrow }N_{\kappa }(0,\varvec{\Sigma }^{-1}) \end{aligned}$$

as \(n\rightarrow \infty \), where \({\mathcal {D}}\) denotes convergence in distribution, \(\kappa =p+q+a+b+2\) and \(\varvec{\Sigma }=\text {diag}(\varvec{\Sigma }_{1},\varvec{\Sigma }_{2})\) with

$$\begin{aligned} \varvec{\Sigma }_{1}= & {} {\mathbb {E}}\Bigg [ \frac{1}{\sigma _{t}^{2}}\frac{\partial \varepsilon _{t}}{\partial \varvec{\Omega }_{1}}\frac{\partial \varepsilon _{t}}{\partial \varvec{\Omega }_{1}^{\top }}\\&+\frac{1}{2\sigma _{t}^{4}}\frac{\partial \sigma _{t}^{2}}{\partial \varvec{\Omega }_{1}}\frac{\partial \sigma _{t}^{2}}{\partial \varvec{\Omega }_{1}^{\top }} \Bigg ], \end{aligned}$$

and

$$\begin{aligned} \varvec{\Sigma }_{2}={\mathbb {E}}\Bigg [ \frac{1}{2\sigma _{t}^{4}}\frac{\partial \sigma _{t}^{2}}{\partial \varvec{\Omega }_{2}}\frac{\partial \sigma _{t}^{2}}{\partial \varvec{\Omega }_{2}^{\top }} \Bigg ]. \end{aligned}$$
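To make the objective (5.4) concrete, the following minimal sketch evaluates it for a plain GARCH(1,1) conditional variance, which is a deliberate simplification of the FIGARCH specification used in the paper. The function names, the GARCH(1,1) recursion, and the initialization of \(\sigma _1^2\) at the unconditional variance are assumptions made here; \(\varepsilon _t\) is taken to be an already computed residual series.

```python
import math


def garch11_variances(eps, alpha0, alpha1, beta1):
    """Conditional variances sigma_t^2 of a GARCH(1,1) model (simplifying assumption).

    sigma_1^2 is initialized at the unconditional variance alpha0 / (1 - alpha1 - beta1).
    """
    if alpha0 <= 0 or alpha1 < 0 or beta1 < 0 or alpha1 + beta1 >= 1:
        raise ValueError("parameters must satisfy positivity and stationarity")
    sigma2 = [alpha0 / (1.0 - alpha1 - beta1)]
    for t in range(1, len(eps)):
        sigma2.append(alpha0 + alpha1 * eps[t - 1] ** 2 + beta1 * sigma2[t - 1])
    return sigma2


def conditional_loglik(eps, sigma2):
    """Conditional log-likelihood (5.4): -(1/2n) * sum(log sigma_t^2 + eps_t^2 / sigma_t^2)."""
    n = len(eps)
    total = sum(math.log(s) + e * e / s for e, s in zip(eps, sigma2))
    return -total / (2.0 * n)
```

A quasi-MLE would then be obtained by maximizing conditional_loglik over the parameters with a numerical optimizer; with \(\alpha _1=\beta _1=0\) the variance is constant and (5.4) reduces to the Gaussian log-likelihood of homoskedastic residuals, up to an additive constant.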

Cite this article

Ramírez-Parietti, I., Contreras-Reyes, J.E. & Idrovo-Aguirre, B.J. Cross-sample entropy estimation for time series analysis: a nonparametric approach. Nonlinear Dyn 105, 2485–2508 (2021). https://doi.org/10.1007/s11071-021-06759-8

  • DOI: https://doi.org/10.1007/s11071-021-06759-8
