Abstract
In this paper, we propose a method to estimate the number and locations of change points, and further to estimate the parameters of each regime, for piecewise stationary vector autoregressive (VAR) models. The procedure decomposes the problems of change point detection and parameter estimation along the component series. By reformulating change point detection as a variable selection problem, we first apply the group Lasso method to obtain an initial estimate of the change points. From this preliminary estimate, a subset is then selected based on the Lasso loss function and a backward elimination algorithm. Finally, we propose a Lasso + OLS method to estimate the parameters of each segment in high-dimensional VAR models. Consistency of the estimators of the number and locations of the change points and of the VAR parameters is proved. Simulation experiments and real data examples illustrate the performance of the method.
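To make the variable-selection reformulation concrete, the following is a minimal sketch of the first stage for a single component series, assuming a univariate AR(p) model and a plain proximal-gradient group Lasso solver (the paper's experiments use the SLEP package of Liu et al. 2011; the solver, function names, penalty value and convergence settings below are illustrative stand-ins, not the paper's implementation):

```python
import numpy as np

def block_soft_threshold(v, t):
    """Proximal operator of t*||.||_2: shrink the whole block v toward zero."""
    norm = np.linalg.norm(v)
    return np.zeros_like(v) if norm <= t else (1.0 - t / norm) * v

def detect_change_points_group_lasso(y, p=1, lam=1.0, n_iter=500):
    """Stage-1 sketch: candidate change points of an AR(p) series via group Lasso.

    The AR coefficients are reparameterized by increments theta_s = b_s - b_{s-1},
    so a nonzero group theta_s (s >= 1) flags a candidate change point at time s.
    """
    y = np.asarray(y, dtype=float)
    n = len(y)
    T = n - p  # number of usable observations
    # Lagged design rows x_t = (y_{t-1}, ..., y_{t-p}).
    X_lag = np.column_stack([y[p - k - 1:n - k - 1] for k in range(p)])
    # Cumulative design: the coefficient increment at time s affects rows t >= s.
    Z = np.zeros((T, T * p))
    for s in range(T):
        Z[s:, s * p:(s + 1) * p] = X_lag[s:]
    target = y[p:]
    theta = np.zeros(T * p)
    step = 1.0 / np.linalg.norm(Z, 2) ** 2  # 1/L, L = Lipschitz const. of gradient
    for _ in range(n_iter):  # proximal gradient (ISTA) iterations
        theta -= step * (Z.T @ (Z @ theta - target))
        for s in range(1, T):  # group 0 is the unpenalized baseline coefficient
            theta[s * p:(s + 1) * p] = block_soft_threshold(
                theta[s * p:(s + 1) * p], step * lam)
    groups = theta.reshape(T, p)
    # Time indices (in the original series) of nonzero increment groups.
    return p + 1 + np.nonzero(np.linalg.norm(groups[1:], axis=1) > 1e-8)[0]
```

For instance, `detect_change_points_group_lasso(series, p=2, lam=5.0)` returns candidate break times; in the method proposed above, this first-stage estimate is subsequently pruned by backward elimination and the per-segment parameters refitted by Lasso + OLS.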
References
Andreou E, Ghysels E (2008) Structural breaks in financial time series. Handbook of financial time series. Springer, Berlin, pp 839–866
Angelosante D, Giannakis GB (2012) Group lassoing change-points in piecewise-constant AR processes. EURASIP J Adv Signal Process 70(1):1–16
Bai J (1999) Likelihood ratio tests for multiple structural changes. J Econom 91:299–323
Basu S, Michailidis G (2015) Regularized estimation in sparse high-dimensional time series models. Ann Stat 43(4):1535–1567
Bleakley K, Vert JP (2011) The group fused lasso for multiple change-point detection. arXiv preprint arXiv:1106.4199
Chan NH, Yau CY, Zhang RM (2014) Group Lasso for structural break time series. J Am Stat Assoc 109(506):590–599
Davis RA, Lee TCM, Rodriguez-Yam GA (2006) Structural break estimation for nonstationary time series models. J Am Stat Assoc 101(473):223–239
Ding X, Qiu ZY, Chen XH (2017) Sparse transition matrix estimation for high-dimensional and locally stationary vector autoregressive models. Electron J Stat 11(2):3871–3902
Doerr B, Fischer P, Hilbert A, Witt C (2017) Detecting structural breaks in time series via genetic algorithms. Soft Comput 21(16):4707–4720
Efron B, Hastie T, Tibshirani R (2004) Least angle regression. Ann Stat 32(2):407–499
Harchaoui Z, Levy-Leduc C (2008) Catching change points with Lasso. Advances in Neural Information Processing Systems. MIT Press, Vancouver, pp 161–168
Harchaoui Z, Levy-Leduc C (2010) Multiple change point estimation with a total variation penalty. J Am Stat Assoc 105(492):1480–1493
Kirch C, Muhsal B, Ombao H (2015) Detection of changes in multivariate time series with application to EEG data. J Am Stat Assoc 110(511):1197–1216
Li S, Lund R (2012) Multiple change point detection via genetic algorithms. J Clim 25(2):674–686
Liu HZ, Yu B (2013) Asymptotic properties of Lasso + mLS and Lasso + ridge in sparse high-dimensional linear regression. Electron J Stat 7:3124–3169
Liu J, Ji S, Ye J (2011) SLEP: sparse learning with efficient projections. http://www.public.asu.edu/~jye02/Software/SLEP
Liu J, Ye J (2010) Moreau–Yosida regularization for grouped tree structure learning. Int Conf Neural Inf Process Syst (NIPS) 23:1459–1467
Panigrahi S, Verma K, Tripathi P (2018) Land cover change detection using focused time delay neural network. Soft Comput. https://doi.org/10.1007/s00500-018-3395-3
Picard F, Robin S, Lavielle M, Vaisse C, Daudin JJ (2005) A statistical approach for array CGH data analysis. BMC Bioinformatics 6(27):1–14
Safikhani A, Shojaie A (2017) Joint structural break detection and parameter estimation in high-dimensional non-stationary VAR models. arXiv preprint arXiv:1708.02736
Shao X, Zhang X (2010) Testing for change-points in time series. J Am Stat Assoc 105(491):1228–1240
Sims CA (1980) Macroeconomics and reality. Econometrica 48(1):1–48
Tibshirani R (1996) Regression shrinkage and selection via the lasso. J R Stat Soc Ser B 58(1):267–288
Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. J R Stat Soc Ser B 68(1):49–67
Zhao P, Yu B (2006) On model selection consistency of Lasso. J Mach Learn Res 7:2541–2563
Zou H (2006) The adaptive Lasso and its oracle properties. J Am Stat Assoc 101(476):1418–1429
Acknowledgements
This work was supported by the National Natural Science Foundation of China under Grant No. 11601404, the National Statistical Research Program funded by the National Bureau of Statistics of China under Grant No. 2016LZ37, the Youth Innovation Team of Shaanxi Universities, the Yanta Scholars Foundation, and the Talent Development Foundation of Xi'an University of Finance and Economics.
Author information
Authors and Affiliations
Corresponding author
Ethics declarations
Conflict of interest
Author Wei Gao declares that she has no conflict of interest. Author Haizhong Yang declares that he has no conflict of interest. Author Lu Yang declares that she has no conflict of interest.
Human and animal rights
This article does not contain any studies with human or animal participants performed by any of the authors.
Informed consent
Informed consent was obtained from all individual participants included in the study.
Additional information
Communicated by X. Li.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendix: Proofs of Theorems
Since the estimation problem decomposes across the component series \(Y_{it}\), with regression coefficient vectors \(B_{i\cdot }^j\), numbers of change points \(m_i\) and change points \(t_k^i\), it suffices to show that the estimators \({\hat{B}}_{i\cdot }^j\), \({\hat{m}}_i\) and \({\hat{t}}_k^i\) are consistent for each component series, and then to apply a union bound to conclude that the full parameter matrix B and the change points are estimated consistently.
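To spell out the union-bound step: writing \(E_i\) for the event that the estimators of the i-th component series are inconsistent (a label introduced only for this display), we have
\[
P\Big(\bigcup _{i=1}^{d}E_i\Big)\le \sum _{i=1}^{d}P(E_i)\rightarrow 0,
\]
provided each \(P(E_i)\) vanishes fast enough relative to the growth of d.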
Lemma 1
Suppose Assumptions 1 and 2 hold. For any \(c_0>0\), there exists some constant \(C>0\) such that for \(n\ge C\log (nd^2p)\)
Proof
For any t, l, h, \(\mathrm{cov}(Y_{k,p+t},\epsilon _{i,p+t+l})=0\); it follows from Proposition 2.4(b) of Basu and Michailidis (2015) that for \(d\times 1\) vectors u, v with all entries zero except the k-th and i-th, respectively,
Let \(\eta =k_3\sqrt{\frac{1}{n}\log (nd^2p)}\) with \(k_3>0\) large enough to yield Eq. (32). \(\square \)
Lemma 2
Suppose Assumptions 1 and 2 hold. For any \(c_i>0,i=1,2,3\),
and
Proof
From Proposition 2.4 of Basu and Michailidis (2015),
and
Letting \(\eta =k_3\sqrt{\frac{\log (pd^2)}{n\gamma _n}}\), we obtain the results. \(\square \)
Lemma 3
Let \({\hat{\phi }}_{i}\) and \(\phi _i\) be defined as in Eq. (7), and let \(\mathbf X _{l-1}\) denote the row of the matrix \(\mathbf X \) in which \(X_{l-1}\) is located. Under the conditions of Theorem 1, we have
and
Lemma 3 concerns the KKT conditions of the group Lasso algorithm; its proof is direct and is therefore omitted.
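For the reader's convenience, for a generic group Lasso problem \(\min _{\theta }\frac{1}{2}\Vert y-\mathbf X \theta \Vert _2^2+\lambda \sum _g\Vert \theta _g\Vert \) (in the notation of Yuan and Lin 2006, not necessarily the specific scaling of Eq. (7)), the KKT conditions read
\[
\mathbf X _g^T(y-\mathbf X {\hat{\theta }})=\lambda \,\frac{{\hat{\theta }}_g}{\Vert {\hat{\theta }}_g\Vert }\ \ \text{if } {\hat{\theta }}_g\ne 0,\qquad \big \Vert \mathbf X _g^T(y-\mathbf X {\hat{\theta }})\big \Vert \le \lambda \ \ \text{if } {\hat{\theta }}_g=0.
\]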
Proof of Theorem 1
Following a proof strategy similar to that of Theorem 2.1 in Chan et al. (2014) (with Lemma 1 in place of their Lemma A.1), for some \(m_n=o(\lambda _n^{-1})\), with some \(c_0>0\) and with high probability as \(n\rightarrow \infty \),
where \(\max \limits _{\begin{array}{c} 1\le k\le d\\ 1\le j\le m+1 \end{array}}\vert B_{ik}^j\vert \le M_B^i\). Then,
\(\square \)
Proof of Theorem 2
Firstly, for \(i=1,\ldots ,d\), we prove
Let \(T_{ij}=\{\vert {\hat{t}}_j^i-t_j^i\vert > n\gamma _n\}\) for \(j=1,2,\ldots ,m_i\); then
Let \(C_n=\{\max \limits _{1\le j\le m_i}\vert {\hat{t}}_j^i-t_j^i\vert \le \min _j\vert t_j^i-t_{j-1}^i\vert /2\}\). To prove Eq. (42), it suffices to show that
where \(C_n^c\) is the complement of the set \(C_n\).
We outline only the proof that \(\sum _{j=1}^{m_i} P(T_{ij} C_n)\rightarrow 0\); the proof that \(\sum _{j=1}^{m_i}P(T_{ij}C_n^c)\rightarrow 0\) is similar.
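In detail, the reduction rests on the elementary decomposition
\[
P(T_{ij})=P(T_{ij}C_n)+P(T_{ij}C_n^c),\qquad \text{so that}\qquad \sum _{j=1}^{m_i}P(T_{ij})\le \sum _{j=1}^{m_i}P(T_{ij}C_n)+\sum _{j=1}^{m_i}P(T_{ij}C_n^c),
\]
and Eq. (42) follows once both sums on the right vanish.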
There are two cases for the number of change points: \(m_i\) fixed and \(m_i\rightarrow \infty \). For fixed \(m_i\), we further distinguish the cases \({\hat{t}}_j^i<t_j^i\) and \({\hat{t}}_j^i>t_j^i\).
From the KKT conditions, we have
which implies that
Then, for \({\hat{t}}_j^i<t_j^i\),
let \({\mathscr {B}}_j=\Big \Vert \sum _{t={\hat{t}}_j^i}^{t_j^i-1}X_{t-1}X_{t-1}^T\left( (B_{i\cdot }^{j})^T-(B_{i\cdot }^{j+1})^T\right) \Big \Vert \).
In a similar way to Theorem 2.2 of Chan et al. (2014), using Lemmas 2 and 3 in place of their Lemmas A.2 and A.3, respectively, we obtain \(P(T_{ij1})\rightarrow 0\), \(P(T_{ij2})\rightarrow 0\) and \(P(T_{ij3})\rightarrow 0\). Combining these results gives \(P(T_{ij}C_n\cap \{{\hat{t}}_j^i<t_j^i\})\rightarrow 0\); the proof that \(P(T_{ij}C_n\cap \{{\hat{t}}_j^i>t_j^i\})\rightarrow 0\) is similar. Hence \(P(T_{ij}C_n)\rightarrow 0\).
The case \(m_i\rightarrow \infty \) follows from Lemma 2: the convergence of \(P(T_{ij})\), \(j=1,\ldots ,m_i\), is fast enough that \(\sum _{j=1}^{m_i} P(T_{ij})\rightarrow 0\). This proves Eq. (42).
Theorem 2 then follows from the definition of \({\hat{t}}_j,j=1,\ldots ,{\hat{m}}\), and the bound \(\max \limits _{1\le j\le m}\vert {\hat{t}}_j-t_j\vert \le \max \limits _{1\le i\le d}\max \limits _{1\le j\le m}\vert {\hat{t}}_j^i-t_j^i\vert \). \(\square \)
Proof of Theorem 3
Firstly, we prove that for \(i=1,\ldots ,d\), if Assumptions 1 to 3 are satisfied, then
and
Applying Lemmas 2 and 3 in an argument similar to that of Theorem 2, and following the proof procedure of Chan et al. (2014), Eqs. (48) and (49) can be proved.
Then, from \({\mathscr {A}}_n=\cup _{i=1}^d{\mathscr {A}}_{ni}\), we get \(\left( \vert {\mathscr {A}}_{n}\vert \ge m\right) \subseteq \left( \vert {\mathscr {A}}_{ni}\vert \ge m_i\right) \) and \(d_H({\mathscr {A}}_{n}, {\mathscr {A}})=d_H({\mathscr {A}}_{ni}, {\mathscr {A}}_i)\), which completes the proof of Theorem 3. \(\square \)
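Here \(d_H\) denotes the usual Hausdorff distance between two finite sets of time points,
\[
d_H(A,B)=\max \Big \{\max _{a\in A}\min _{b\in B}\vert a-b\vert ,\ \max _{b\in B}\min _{a\in A}\vert a-b\vert \Big \}.
\]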
Proof of Theorem 4
In the proof of this theorem, we apply the conclusion of Lemma 4 in Safikhani and Shojaie (2017): for \(m_r<m\),
We prove the first conclusion by showing (a) \(P({\hat{m}}^*<m)\rightarrow 0\) and (b) \(P({\hat{m}}^*>m)\rightarrow 0\).
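Together, (a) and (b) give the first conclusion via the elementary bound
\[
P({\hat{m}}^*\ne m)\le P({\hat{m}}^*<m)+P({\hat{m}}^*>m)\rightarrow 0.
\]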
For (a), Theorem 3 implies that there are points \({\hat{t}}_{nj}\in {\mathscr {A}}_n\) such that \(\max _{1\le j\le m}\vert {\hat{t}}_{nj}-t_j\vert \le n\gamma _n\).
By arguments similar to those of Theorem 4 in Safikhani and Shojaie (2017), we get that
By Eq. (50), we get
The last inequality comes from the conditions \(m\omega _n/I_{\min }\rightarrow 0\) and \(\lim _{n\rightarrow \infty }n\gamma _nS^2/\omega _n\le 1\). Then, \(P({\hat{m}}^*<m)\rightarrow 0\).
To prove (b), given \({\hat{m}}^*>m\),
Combining with Eq. (51), we get
When \({\hat{m}}^*>m\), we have that
Combining Eqs. (54) and (55), by a discussion similar to that of Eq. (52), \(P({\hat{m}}^*>m)\rightarrow 0\). The second conclusion follows as in Theorem 4 of Safikhani and Shojaie (2017). \(\square \)
Proof of Theorem 5
From Theorem 4 in Zhao and Yu (2006), under the irrepresentable condition (Assumption 7) and Assumption 8, for \(\lambda _n/n\rightarrow 0\) and \(\lambda _n/n^{\frac{1+c}{2}}\rightarrow \infty \) with \(0\le c<1\), the probability that the Lasso selects a wrong model satisfies \(P({\hat{S}}_i\ne S_i)=O(e^{-n^{c_2}})\).
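For reference, in the notation of Zhao and Yu (2006) (which may differ from the scaling used in Assumption 7 here), the strong irrepresentable condition requires, for some \(\eta >0\),
\[
\big \Vert C_{21}^n (C_{11}^n)^{-1}\,\mathrm{sign}(\beta _{(1)})\big \Vert _{\infty }\le 1-\eta ,
\]
where \(C_{11}^n\) and \(C_{21}^n\) are the blocks of the scaled Gram matrix \(C^n=\frac{1}{n}\mathbf X ^T\mathbf X \) corresponding to the relevant and irrelevant predictors, respectively.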
The second part follows from Theorem 3 of Liu and Yu (2013). \(\square \)
Cite this article
Gao, W., Yang, H. & Yang, L. Change points detection and parameter estimation for multivariate time series. Soft Comput 24, 6395–6407 (2020). https://doi.org/10.1007/s00500-019-04135-8