Abstract
While matrices are usually used as the basic data structure for experiments with repeated measurements or longitudinal data, testing methods for the equality of two matrix distributions have not been fully discussed in the literature. In this article, we propose three methods to test the equality of two matrix distributions: the likelihood ratio test, the Frobenius norm methods and triangle tests. We present a simulation to compare their performance under the matrix normal distribution. We apply the testing methods to compare the US economy, as measured by closing prices of five market indices, before and after the US stock market crash of 2008.
References
Anderlucci L, Viroli C (2015) Covariance pattern mixture models for the analysis of multivariate heterogeneous longitudinal data. Ann Appl Stat 9(2):777–800
Anderson TW (2003) An introduction to multivariate statistical analysis. Wiley, New York
Banerjee T, Firouzi H, Hero AO (2015) Non-parametric quickest change detection for large scale random matrices. In: IEEE international symposium on information theory (ISIT), pp 146–150
Baringhaus L, Franz C (2004) On a new multivariate two-sample test. J Multivar Anal 88(1):190–206
Biswas M, Ghosh AK (2014) A nonparametric two-sample test applicable to high dimensional data. J Multivar Anal 123:160–171
Carroll JD, Arabie P (1980) Multidimensional scaling. Annu Rev Psychol 31:607–649
Chen JT, Gupta AK (2005) Matrix variate skew normal distributions. Statistics 39(3):247–253
Chen J, Gupta AK (2012) Parametric statistical change point analysis: with applications to genetics, medicine, and finance. Springer, Berlin
Dawid AP (1981) Some matrix-variate distribution theory: notational considerations and a Bayesian application. Biometrika 68:265–274
Dutilleul P (1999) The MLE algorithm for the matrix normal distribution. J Stat Comput Simul 64(2):105–123
Gallaugher MPB, McNicholas PD (2018) Finite mixtures of skewed matrix variate distributions. Pattern Recognit 80:83–93
Gallaugher MP, McNicholas PD (2017) A matrix variate skew-t distribution. Stat 6(1):160–170
Gupta AK, Nagar DK (1999) Matrix variate distributions, vol 104. CRC Press, Florida
Harrar SW, Gupta AK (2008) On matrix variate skew-normal distributions. Statistics 42(2):179–194
Hoeffding W, Robbins H (1948) The central limit theorem for dependent random variables. Duke Math J 15(3):773–780
Liu Z, Modarres R (2011) A triangle test for equality of distribution functions in high dimensions. J Nonparametr Stat 23(3):605–615
Lovison G (2006) A matrix-valued Bernoulli distribution. J Multivar Anal 97(7):1573–1585
Lu N, Zimmerman DL (2005) The likelihood ratio test for a separable covariance matrix. Stat Probab Lett 73(4):449–457
Maa JF, Pearl DK, Bartoszyński R (1996) Reducing multidimensional two-sample data to one-dimensional interpoint comparisons. Ann Stat 24:1069–1074
Mitchell MW, Genton MG, Gumpertz ML (2006) A likelihood ratio test for separability of covariances. J Multivar Anal 97(5):1025–1043
Naik DN, Rao SS (2001) Analysis of multivariate repeated measures data with a Kronecker product structured covariance matrix. J Appl Stat 28(1):91–105
Roy A (2007) A note on testing of Kronecker product covariance structures for doubly multivariate data. In: Proceedings of the American Statistical Association, statistical computing section, pp 2157–2162
Székely GJ, Rizzo ML (2004) Testing for equal distributions in high dimension. InterStat 5
Vermunt JK (2007) A hierarchical mixture model for clustering three-way data sets. Comput Stat Data Anal 51:5368–5376
Viroli C (2011) Finite mixtures of matrix normal distributions for classifying three-way data. Stat Comput 21(4):511–522
Viroli C (2012) On matrix-variate regression analysis. J Multivar Anal 111:296–309
Xia Y, Li L (2017) Hypothesis testing of matrix graph model with application to brain connectivity analysis. Biometrics 73(3):780–791
Appendix
Proof of Lemma 1
The squared Frobenius distance between \(X_1\) and \(X_2\) is \(||X_1-X_2||_{Fr}^2=||vec(X_1)-vec(X_2)||_{Eu}^2\).
By the definition of the \(\mathcal {MN}\) distribution, we have \(vec(X_1),vec(X_2)\sim N_{dm}(vec(M_1), { \varSigma }_1\otimes \varPsi _1)\). Since \(X_1\) and \(X_2\) are independent, \(vec(X_1)-vec(X_2)\sim N_{dm}(0, 2{ \varSigma }_1 \otimes \varPsi _1 )\). If \(\{\lambda _i\},i=1,\ldots ,d\) are the eigenvalues of \({ \varSigma }_1\) and \(\{\theta _j\},j=1,\ldots ,m\) are the eigenvalues of \(\varPsi _1\), then \(\{2\lambda _i\theta _j\},i=1,\ldots ,d,j=1,\ldots ,m\) are the eigenvalues of \(2{ \varSigma }_1\otimes \varPsi _1\). Therefore, \(||X_1-X_2||_{Fr}^2\) is distributed as a weighted sum of \(dm\) independent \(\chi ^2_1\) random variables with weights \(2\lambda _i\theta _j\).
Thus \(E||X_1-X_2||_{Fr}^2 =2\sum _{i=1}^d\sum _{j=1}^m \lambda _i\theta _j =2tr({ \varSigma }_1 \otimes \varPsi _1)\). It is not difficult to show that \(\text {var}(||X_1-X_2||_{Fr}^2)=8\, tr[({ \varSigma }_1 \otimes \varPsi _1)^2]\). Note that if \(\varPsi _1=I_m\), i.e. the columns of the observation matrix are i.i.d., then the expectation reduces to \(2m\sum _{i=1}^d \lambda _i\).
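The two moments can be traced through a spectral representation. The following is a sketch under the assumption that \(Z_{ij}\) denotes i.i.d. standard normal variables (notation introduced here for illustration), using \(E(Z_{ij}^2)=1\) and \(\text {var}(Z_{ij}^2)=2\):

```latex
\begin{aligned}
\|X_1-X_2\|_{Fr}^2
  &= \|vec(X_1)-vec(X_2)\|_{Eu}^2
   \overset{d}{=} \sum_{i=1}^{d}\sum_{j=1}^{m} 2\lambda_i\theta_j Z_{ij}^2, \\
E\|X_1-X_2\|_{Fr}^2
  &= \sum_{i=1}^{d}\sum_{j=1}^{m} 2\lambda_i\theta_j\, E(Z_{ij}^2)
   = 2\,tr(\varSigma_1)\,tr(\varPsi_1)
   = 2\,tr(\varSigma_1\otimes\varPsi_1), \\
\mathrm{var}\big(\|X_1-X_2\|_{Fr}^2\big)
  &= \sum_{i=1}^{d}\sum_{j=1}^{m} 4\lambda_i^2\theta_j^2\, \mathrm{var}(Z_{ij}^2)
   = 8\sum_{i=1}^{d}\sum_{j=1}^{m} \lambda_i^2\theta_j^2
   = 8\,tr\big[(\varSigma_1\otimes\varPsi_1)^2\big].
\end{aligned}
```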
Similarly, \(E||Y_1-Y_2||_{Fr}^2 =2tr({ \varSigma }_2 \otimes \varPsi _2)\). For \(X_1\) and \(Y_1\), we have \(vec(X_1)-vec(Y_1)\sim N_{dm}(vec(M_1-M_2), { \varSigma }_1 \otimes \varPsi _1+{ \varSigma }_2 \otimes \varPsi _2 )\). Since the expected squared norm of a normal vector equals the squared norm of its mean plus the trace of its covariance, \(E||X_1-Y_1||_{Fr}^2=||M_1-M_2||_{Fr}^2+tr({ \varSigma }_1 \otimes \varPsi _1)+tr({ \varSigma }_2 \otimes \varPsi _2)\).
Therefore, we have \(2E||X_1-Y_1||_{Fr}^2-E||X_1-X_2||_{Fr}^2-E||Y_1-Y_2||_{Fr}^2 =2||M_1-M_2||^2_{Fr}\ge 0\). Equality holds if and only if \(||M_1-M_2||^2_{Fr}=0\), that is, if and only if \(M_1=M_2\). \(\square \)
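The closed-form moments in this proof can be checked numerically. The following is a minimal Monte Carlo sketch, assuming NumPy is available; the covariance matrices, means, and sample size are hypothetical values chosen only for illustration. It compares the simulated mean and variance of \(||X_1-X_2||_{Fr}^2\) with \(2tr({ \varSigma }_1 \otimes \varPsi _1)\) and \(8\,tr[({ \varSigma }_1 \otimes \varPsi _1)^2]\), and the simulated gap \(2E||X_1-Y_1||_{Fr}^2-E||X_1-X_2||_{Fr}^2-E||Y_1-Y_2||_{Fr}^2\) with \(2||M_1-M_2||_{Fr}^2\), the value obtained by combining the three expectations.

```python
import numpy as np

rng = np.random.default_rng(0)
d, m, n = 3, 2, 200_000  # matrix dimensions and number of simulated pairs

# Hypothetical parameters, chosen only for illustration.
sigma1 = np.array([[2.0, 0.5, 0.0],
                   [0.5, 1.0, 0.3],
                   [0.0, 0.3, 1.5]])
psi1 = np.array([[1.0, 0.4],
                 [0.4, 2.0]])
sigma2, psi2 = np.eye(d), psi1
mean1 = np.zeros(d * m)       # vec(M_1)
mean2 = np.full(d * m, 0.5)   # vec(M_2)

# Covariance matrices of the vectorized observations.
k1 = np.kron(sigma1, psi1)
k2 = np.kron(sigma2, psi2)


def draw(mean, cov):
    """Draw n independent vectorized matrices from N(mean, cov)."""
    return rng.multivariate_normal(mean, cov, size=n)


def sq_dist(a, b):
    """Row-wise squared Euclidean distance, i.e. squared Frobenius distance."""
    return np.sum((a - b) ** 2, axis=1)


x1, x2 = draw(mean1, k1), draw(mean1, k1)
y1, y2 = draw(mean2, k2), draw(mean2, k2)
d_xx, d_yy, d_xy = sq_dist(x1, x2), sq_dist(y1, y2), sq_dist(x1, y1)

theory_mean = 2 * np.trace(k1)                   # 2 tr(Sigma_1 (x) Psi_1)
theory_var = 8 * np.trace(k1 @ k1)               # 8 tr[(Sigma_1 (x) Psi_1)^2]
theory_gap = 2 * np.sum((mean1 - mean2) ** 2)    # 2 ||M_1 - M_2||_Fr^2

gap = 2 * d_xy.mean() - d_xx.mean() - d_yy.mean()
print("mean:", d_xx.mean(), "theory:", theory_mean)
print("var: ", d_xx.var(), "theory:", theory_var)
print("gap: ", gap, "theory:", theory_gap)
```

The simulated values should agree with the closed forms up to Monte Carlo error, which shrinks as `n` grows.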
Proof of Theorem 1
Let \(vec(\cdot )\) be an operator that maps a matrix to a vector by stacking the columns of the matrix on top of one another. Let \({\varvec{u}}_1=vec(X_1)\), \({\varvec{u}}_2=vec(X_2)\), \({\varvec{v}}_1=vec(Y_1)\) and \({\varvec{v}}_2=vec(Y_2)\). Based on Proposition 1, one can show that
where \(\mu \) is the uniform distribution on \(S^{dm-1}=\{{\varvec{x}}\in \mathfrak {R}^{dm}{:} ||{\varvec{x}}||_{Eu}=1\}\), the surface of the unit sphere in \(\mathfrak {R}^{dm}\), and \(\gamma _{dm}\) is the constant defined in Proposition 1. Similarly, we have
Following Baringhaus and Franz (2004), one can show that for each \(a\in S^{dm-1}\),
Equality holds if and only if the distributions of \(a'{\varvec{u}}_1\) and \(a'{\varvec{v}}_1\) coincide. Integrating both sides of inequality (24) with respect to \(\mu \), we have
Equality holds if and only if, for almost all \(a\in S^{dm-1}\), the distributions of \(a'{\varvec{u}}_1\) and \(a'{\varvec{v}}_1\) coincide. Since for each \(t\in \mathfrak {R}\) the functions \(E(e^{ita' {\varvec{u}}_1})\) and \(E(e^{ita' {\varvec{v}}_1}) \) are continuous in \(a\), equality in (25) holds if and only if \({\varvec{u}}_1\) and \({\varvec{v}}_1\) have the same Fourier transform, which implies \(F=G\). By the definition of the Frobenius and Euclidean norms, we have \(\mu _{FF}=E||X_1-X_2||_{Fr}=E||{\varvec{u}}_1-{\varvec{u}}_2||_{Eu}\), \(\mu _{GG}=E||Y_1-Y_2||_{Fr}=E||{\varvec{v}}_1-{\varvec{v}}_2||_{Eu}\) and \(\mu _{FG}=E||X_1-Y_1||_{Fr}=E||{\varvec{u}}_1-{\varvec{v}}_1||_{Eu}\). Therefore, we have \(2\mu _{FG}-\mu _{FF}-\mu _{GG}\ge 0\), and equality holds if and only if \(F=G\). \(\square \)
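A sample analogue of \(2\mu _{FG}-\mu _{FF}-\mu _{GG}\) can be computed directly from pairwise Frobenius distances. The sketch below is illustrative only: the function names `mean_pairwise_frobenius` and `energy_gap` are hypothetical, the within-sample averages are taken over pairs \(i\ne j\) (an assumption, since the estimators in Eqs. (8)–(10) are not reproduced in this appendix), and the data are simulated. It also checks that the Frobenius norm of a matrix difference equals the Euclidean norm of the vectorized difference, as used at the end of the proof.

```python
import numpy as np


def mean_pairwise_frobenius(a, b=None):
    """Average Frobenius distance between matrices.

    a, b: arrays of shape (n, d, m). With b=None, the average runs over
    pairs i != j within a; otherwise over all pairs (a_i, b_j).
    """
    # Flatten each matrix; element order does not affect the norm.
    u = a.reshape(len(a), -1)
    if b is None:
        dist = np.linalg.norm(u[:, None, :] - u[None, :, :], axis=2)
        n = len(u)
        # The diagonal is zero, so the sum only counts pairs i != j.
        return dist.sum() / (n * (n - 1))
    v = b.reshape(len(b), -1)
    dist = np.linalg.norm(u[:, None, :] - v[None, :, :], axis=2)
    return dist.mean()


def energy_gap(xs, ys):
    """Sample analogue of 2*mu_FG - mu_FF - mu_GG."""
    return (2 * mean_pairwise_frobenius(xs, ys)
            - mean_pairwise_frobenius(xs)
            - mean_pairwise_frobenius(ys))


rng = np.random.default_rng(1)
xs = rng.normal(size=(40, 3, 2))            # sample from F
ys = rng.normal(loc=1.0, size=(50, 3, 2))   # sample from G, mean shifted by 1

gap_shift = energy_gap(xs, ys)
diff = xs[0] - ys[0]
fro = np.linalg.norm(diff, "fro")
eu = np.linalg.norm(diff.flatten(order="F"))  # vec() stacks columns
print("sample gap under a mean shift:", gap_shift)
print("Frobenius vs Euclidean norm:", fro, eu)
```

Because the within-sample averages exclude the zero diagonal, this estimate can be slightly negative when the two samples come from the same distribution; the population quantity in Theorem 1 is non-negative.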
Proof of Theorem 2
Let \(h(X_i,X_j;Y_p,Y_q)\) be a real-valued function such that
It is clear that h is symmetric within each pair of arguments \((X_i,X_j)\) and \((Y_p,Y_q)\). One can show that EN is the U-statistic corresponding to the kernel function h. That is,
If the distributions F and G are identical, by Theorem 1, we have \(E(h(X_1,X_2;Y_1,Y_2)) =0\). In addition, since \(E[h(X_1,X_2;Y_1,Y_2)|X_1={\mathbb {X}}_1,Y_1={\mathbb {Y}}_1]=0\) for almost all matrix realizations \(({\mathbb {X}}_1,{\mathbb {Y}}_1)\), EN is a U-statistic with a degenerate kernel. The asymptotic distribution of EN can be inferred from the work of Hoeffding and Robbins (1948) for the case \(k=1\), which shows that \(n\cdot EN\) has a non-degenerate limiting distribution \(\sum _{i=1}^\infty \lambda _i (Z_i^2-1)\), where the constants \(\lambda _i\) depend on F and the \(Z_i^2\) are independent \(\chi ^2_1\) random variables. \(\square \)
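Because the constants \(\lambda _i\) in the limiting distribution depend on F, one practical option is to calibrate the statistic by resampling. The sketch below uses a permutation test, which is a swapped-in technique and not necessarily the procedure used in the article. The function names are hypothetical, the statistic is the sample analogue of \(2\mu _{FG}-\mu _{FF}-\mu _{GG}\) with within-sample averages over pairs \(i\ne j\) (an assumption), and the data are simulated.

```python
import numpy as np


def pooled_distances(xs, ys):
    """Pairwise Frobenius distances for the pooled sample of matrices."""
    z = np.concatenate([xs, ys]).reshape(len(xs) + len(ys), -1)
    return np.linalg.norm(z[:, None, :] - z[None, :, :], axis=2)


def en_from_distances(dist, idx_x, idx_y):
    """Sample analogue of 2*mu_FG - mu_FF - mu_GG for a given split."""
    n1, n2 = len(idx_x), len(idx_y)
    d_xx = dist[np.ix_(idx_x, idx_x)].sum() / (n1 * (n1 - 1))
    d_yy = dist[np.ix_(idx_y, idx_y)].sum() / (n2 * (n2 - 1))
    d_xy = dist[np.ix_(idx_x, idx_y)].mean()
    return 2 * d_xy - d_xx - d_yy


def permutation_pvalue(xs, ys, n_perm=499, seed=0):
    """Permutation p-value: share of relabelings at least as extreme."""
    rng = np.random.default_rng(seed)
    dist = pooled_distances(xs, ys)   # computed once, reused for every split
    n1, n = len(xs), len(xs) + len(ys)
    observed = en_from_distances(dist, np.arange(n1), np.arange(n1, n))
    exceed = 0
    for _ in range(n_perm):
        perm = rng.permutation(n)
        stat = en_from_distances(dist, perm[:n1], perm[n1:])
        exceed += int(stat >= observed)
    return (exceed + 1) / (n_perm + 1)


rng = np.random.default_rng(2)
xs = rng.normal(size=(30, 3, 2))
ys_shift = rng.normal(loc=1.0, size=(30, 3, 2))   # different mean matrix
ys_same = rng.normal(size=(30, 3, 2))             # same distribution as xs

p_shift = permutation_pvalue(xs, ys_shift)
p_same = permutation_pvalue(xs, ys_same)
print("p-value, shifted mean:", p_shift)
print("p-value, same distribution:", p_same)
```

Precomputing the pooled distance matrix keeps each permutation cheap, since relabeling only changes which entries are averaged.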
Proof of Theorem 3
If \(\mu _{FF}= \mu _{GG}= \mu _{FG}\), we have \(2\mu _{FG}-\mu _{FF}-\mu _{GG}= 0\), which implies \(F=G\) by Theorem 1. Conversely, suppose \(F=G\). Then \(\mathbf{X}_1, \mathbf{X}_2,\mathbf{Y}_1\) and \(\mathbf{Y}_2\) are independent and identically distributed. Therefore, the distributions of \( ||\mathbf{X}_1-\mathbf{X}_2||_{Fr}, ||\mathbf{Y}_1-\mathbf{Y}_2||_{Fr}\) and \(||\mathbf{X}_1-\mathbf{Y}_1||_{Fr}\) are also equal, so that \(\mu _{FF}= \mu _{GG}= \mu _{FG}\). \(\square \)
Proof of Theorem 4
Following Biswas and Ghosh (2014) for vector distributions, we express \(n\cdot BG(\mathscr {A},\mathscr {B})\) as
where \({\hat{\mu }}_{FF} \) and \({\hat{\mu }}_{GG} \) are given in Eqs. (8)–(10). From Theorem 2, we have \({n}\cdot EN(\mathscr {A},\mathscr {B})=O_p(1)\), and hence \(\sqrt{n}\cdot EN(\mathscr {A},\mathscr {B}) \overset{p}{\rightarrow }0\), as \(n\rightarrow \infty \).
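The role of the two terms can be seen from an algebraic identity. The sketch below assumes, following Biswas and Ghosh (2014), that \(BG=({\hat{\mu }}_{FF}-{\hat{\mu }}_{FG})^2+({\hat{\mu }}_{FG}-{\hat{\mu }}_{GG})^2\) and that \(EN=2{\hat{\mu }}_{FG}-{\hat{\mu }}_{FF}-{\hat{\mu }}_{GG}\); both definitions are assumptions here, and the exact form used in the article may differ.

```latex
\begin{aligned}
(\hat{\mu}_{FF}-\hat{\mu}_{FG})^2+(\hat{\mu}_{FG}-\hat{\mu}_{GG})^2
  &= \tfrac{1}{2}\Big[(2\hat{\mu}_{FG}-\hat{\mu}_{FF}-\hat{\mu}_{GG})^2
     +(\hat{\mu}_{FF}-\hat{\mu}_{GG})^2\Big], \\
n\cdot BG(\mathscr{A},\mathscr{B})
  &= \tfrac{1}{2}\Big[\big(\sqrt{n}\,EN(\mathscr{A},\mathscr{B})\big)^2
     +\big(\sqrt{n}\,(\hat{\mu}_{FF}-\hat{\mu}_{GG})\big)^2\Big].
\end{aligned}
```

Under these assumptions, the first term vanishes in probability because \(\sqrt{n}\cdot EN(\mathscr {A},\mathscr {B}) \overset{p}{\rightarrow }0\), so the limiting behavior of \(n\cdot BG(\mathscr {A},\mathscr {B})\) is driven by \(\sqrt{n}({\hat{\mu }}_{FF}-{\hat{\mu }}_{GG})\).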
Let \(\mu _{FF}=E||\mathbf{X}_1-\mathbf{X}_2 ||_{Fr}\) and \(\mu _{GG}=E||\mathbf{Y}_{1}-\mathbf{Y}_{2} ||_{Fr}\). Under the null hypothesis, we have \(\mu _{FF}=\mu _{GG}\), and hence \(\sqrt{n}({\hat{\mu }}_{FF} -{\hat{\mu }}_{GG} )=\sqrt{n}[({\hat{\mu }}_{FF}-\mu _{FF})-({\hat{\mu }}_{GG} -\mu _{GG})]. \) Note that
is a U-statistic with symmetric kernel function \(h(\mathbf{X}_i,\mathbf{X}_j)=||\mathbf{X}_i-\mathbf{X}_j||_{Fr}-\mu _{FF}\). Therefore, we have \(R_1=\sqrt{n_1}({\hat{\mu }}_{FF} -\mu _{FF})\overset{d}{\rightarrow }N(0,4{\sigma }_0)\), where \({\sigma }_0=Var[E(||\mathbf{X}_1-\mathbf{X}_2||_{Fr}|\mathbf{X}_1)]\). Similarly, we have \(R_2=\sqrt{n_2}({\hat{\mu }}_{GG} -\mu _{GG})\overset{d}{\rightarrow }N(0,4{\sigma }_0)\). Since \({\hat{\mu }}_{FF} \) and \({\hat{\mu }}_{GG} \) are independent, one can show
Therefore, as \(\min (n_1,n_2)\rightarrow \infty \), we obtain
\(\square \)
Cite this article
Guo, L., Modarres, R. Testing the equality of matrix distributions. Stat Methods Appl 29, 289–307 (2020). https://doi.org/10.1007/s10260-019-00477-7
DOI: https://doi.org/10.1007/s10260-019-00477-7