Testing the equality of matrix distributions

Guo, Lingzhe; Modarres, Reza

doi:10.1007/s10260-019-00477-7

Original Paper
Published: 25 June 2019

Volume 29, pages 289–307, (2020)
Statistical Methods & Applications

Abstract

While matrices are usually used as the basic data structure for experiments with repeated measurements or longitudinal data, testing methods for the equality of two matrix distributions have not been fully discussed in the literature. In this article, we propose three methods to test the equality of two matrix distributions: the likelihood ratio test, the Frobenius norm methods and triangle tests. We present a simulation to compare their performance under the matrix normal distribution. We apply the testing methods to compare the US economy, as measured by closing prices of five market indices, before and after the US stock market crash of 2008.

References

Anderlucci L, Viroli C (2015) Covariance pattern mixture models for the analysis of multivariate heterogeneous longitudinal data. Ann Appl Stat 9(2):777–800
Anderson TW (2003) An introduction to multivariate statistical analysis. Wiley, New York
Banerjee T, Firouzi H, Hero AO (2015) Non-parametric quickest change detection for large scale random matrices. In: IEEE international symposium on information theory (ISIT), pp 146–150
Baringhaus L, Franz C (2004) On a new multivariate two-sample test. J Multivar Anal 88(1):190–206
Biswas M, Ghosh AK (2014) A nonparametric two-sample test applicable to high dimensional data. J Multivar Anal 123:160–171
Carroll JD, Arabie P (1980) Multidimensional scaling. Annu Rev Psychol 31:607–649
Chen JT, Gupta AK (2005) Matrix variate skew normal distributions. Statistics 39(3):247–253
Chen J, Gupta AK (2012) Parametric statistical change point analysis: with applications to genetics, medicine, and finance. Springer, Berlin
Dawid AP (1981) Some matrix-variate distribution theory: notational considerations and a Bayesian application. Biometrika 68:265–274
Dutilleul P (1999) The MLE algorithm for the matrix normal distribution. J Stat Comput Simul 64(2):105–123
Gallaugher MPB, McNicholas PD (2018) Finite mixtures of skewed matrix variate distributions. Pattern Recognit 80:83–93
Gallaugher MP, McNicholas PD (2017) A matrix variate skew-t distribution. Stat 6(1):160–170
Gupta AK, Nagar DK (1999) Matrix variate distributions, vol 104. CRC Press, Florida
Harrar SW, Gupta AK (2008) On matrix variate skew-normal distributions. Statistics 42(2):179–194
Hoeffding W, Robbins H (1948) The central limit theorem for dependent random variables. Duke Math J 15(3):773–780
Liu Z, Modarres R (2011) A triangle test for equality of distribution functions in high dimensions. J Nonparametr Stat 23(3):605–615
Lovison G (2006) A matrix-valued Bernoulli distribution. J Multivar Anal 97(7):1573–1585
Lu N, Zimmerman DL (2005) The likelihood ratio test for a separable covariance matrix. Stat Probab Lett 73(4):449–457
Maa JF, Pearl DK, Bartoszyński R (1996) Reducing multidimensional two-sample data to one-dimensional interpoint comparisons. Ann Stat 24:1069–1074
Mitchell MW, Genton MG, Gumpertz ML (2006) A likelihood ratio test for separability of covariances. J Multivar Anal 97(5):1025–1043
Naik DN, Rao SS (2001) Analysis of multivariate repeated measures data with a Kronecker product structured covariance matrix. J Appl Stat 28(1):91–105
Roy A (2007) A note on testing of Kronecker product covariance structures for doubly multivariate data. In: Proceedings of the American Statistical Association, statistical computing section, pp 2157–2162
Székely GJ, Rizzo ML (2004) Testing for equal distributions in high dimension. InterStat 5
Vermunt JK (2007) A hierarchical mixture model for clustering three-way data sets. Comput Stat Data Anal 51:5368–5376
Viroli C (2011) Finite mixtures of matrix normal distributions for classifying three-way data. Stat Comput 21(4):511–522
Viroli C (2012) On matrix-variate regression analysis. J Multivar Anal 111:296–309
Xia Y, Li L (2017) Hypothesis testing of matrix graph model with application to brain connectivity analysis. Biometrics 73(3):780–791

Author information

Authors and Affiliations

Department of Statistics, The George Washington University, Washington, DC, 20052, USA
Lingzhe Guo & Reza Modarres

Corresponding author

Correspondence to Reza Modarres.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

Proof of Lemma 1

The distance between $\mathbf X_1$ and $\mathbf X_2$ is

$$\begin{aligned} ||X_1-X_2||_{Fr}= & {} ||vec(X_1)-vec(X_2)||_{Eu}\\= & {} [{(vec(X_1)-vec(X_2))'(vec(X_1)-vec(X_2))}]^{1/2}. \end{aligned}$$
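The identity between the Frobenius norm of a matrix difference and the Euclidean norm of the vectorized difference can be checked numerically. The following is a minimal sketch, assuming NumPy; the helper name `vec` and the 3 x 4 test matrices are illustrative choices, not part of the article.

```python
import numpy as np

def vec(a: np.ndarray) -> np.ndarray:
    """Stack the columns of a matrix on top of one another (column-major order)."""
    return a.reshape(-1, order="F")

rng = np.random.default_rng(0)
x1 = rng.normal(size=(3, 4))  # illustrative d x m matrices
x2 = rng.normal(size=(3, 4))

# Frobenius norm of the matrix difference.
frobenius = np.linalg.norm(x1 - x2, ord="fro")

# Euclidean norm of the vectorized difference, written as sqrt(w'w).
w = vec(x1) - vec(x2)
euclidean = np.sqrt(w @ w)

print(frobenius, euclidean)
```

The two printed values should agree up to floating-point rounding, because both are the square root of the same sum of squared entries.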

By definition of the $\mathcal {MN}$ distribution, we have $vec(X_1),vec(X_2)\sim N_{dm}(vec(M_1), { \varSigma }_1\otimes \varPsi _1).$ Since $X_1$ and $X_2$ are independent, it follows that $vec(X_1)-vec(X_2)\sim N_{dm}(0, 2{ \varSigma }_1 \otimes \varPsi _1 )$. Suppose $\{\lambda _i\},i=1,\ldots ,d$ are the eigenvalues of ${ \varSigma }_1$, and $\{\theta _j\},j=1,\ldots ,m$ are the eigenvalues of $\varPsi _1$. One can show that $\{2\lambda _i\theta _j\},i=1,\ldots ,d,j=1,\ldots ,m$ are the eigenvalues of $2{ \varSigma }_1\otimes \varPsi _1$. Therefore, we have

$$\begin{aligned} ||X_1-X_2||_{Fr}^2 = (vec(X_1) -vec(X_2))'(vec(X_1)-vec(X_2))\overset{d}{=}2\sum _{i=1}^d\sum _{j=1}^m \lambda _i\theta _j\chi ^2_{1,ij}, \end{aligned}$$

where the $\chi ^2_{1,ij}$ are independent chi-square random variables with one degree of freedom. Thus $E||X_1-X_2||_{Fr}^2 =2\sum _{i=1}^d\sum _{j=1}^m \lambda _i\theta _j =2tr({ \varSigma }_1 \otimes \varPsi _1)$. It is not difficult to show that $\text {var}(||X_1-X_2||_{Fr}^2)=8 tr[({ \varSigma }_1 \otimes \varPsi _1)^2]$. Note that if $\varPsi _1=I_m$, i.e. the columns of the observation matrix are i.i.d., then the expectation reduces to $2m\sum _{i=1}^d \lambda _i$.
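Both the eigenvalue claim and the two moments can be checked by simulation. The sketch below assumes NumPy, uses made-up covariance factors `sigma` and `psi` (illustrative values, not taken from the article), and reads the variance as 8 times the trace of the squared Kronecker product.

```python
import numpy as np

rng = np.random.default_rng(2)
d, m, n_draws = 3, 2, 200_000

# Illustrative covariance factors (assumed values, not from the article).
sigma = np.array([[2.0, 0.5, 0.0],
                  [0.5, 1.0, 0.3],
                  [0.0, 0.3, 1.5]])   # d x d
psi = np.array([[1.0, 0.4],
                [0.4, 2.0]])          # m x m
kron = np.kron(sigma, psi)

# Eigenvalues of the Kronecker product are the pairwise products lambda_i * theta_j.
lam = np.linalg.eigvalsh(sigma)
theta = np.linalg.eigvalsh(psi)
kron_eigs = np.sort(np.linalg.eigvalsh(kron))
products = np.sort(np.outer(lam, theta).ravel())

# Draw vec(X1) and vec(X2); the common mean cancels in the difference,
# so zero-mean draws with covariance kron are sufficient.
chol = np.linalg.cholesky(kron)
u1 = rng.normal(size=(n_draws, d * m)) @ chol.T
u2 = rng.normal(size=(n_draws, d * m)) @ chol.T
sq_dist = np.sum((u1 - u2) ** 2, axis=1)

theory_mean = 2 * np.trace(kron)
theory_var = 8 * np.trace(kron @ kron)
print("mean:", sq_dist.mean(), "theory:", theory_mean)
print("var: ", sq_dist.var(), "theory:", theory_var)
```

With this many draws the simulated mean and variance should land within a few percent of the theoretical values; the exact agreement depends on the seed.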

Similarly, $E||Y_1-Y_2||_{Fr}^2 =2tr({ \varSigma }_2 \otimes \varPsi _2)$. Considering $X_1$ and $Y_1$, we have $vec(X_1)-vec(Y_1)\sim N_{dm}(vec(M_1-M_2), { \varSigma }_1 \otimes \varPsi _1+{ \varSigma }_2 \otimes \varPsi _2 )$. One can show that

$$\begin{aligned} E||X_1-Y_1||_{Fr}^2= & {} E||vec(X_1)-vec(Y_1)||_{Eu}^2\\= & {} tr[{ \varSigma }_1 \otimes \varPsi _1+{ \varSigma }_2 \otimes \varPsi _2]+||M_1-M_2||_{Fr}^2. \end{aligned}$$
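This step is the usual second-moment identity for a random vector. As a sketch, write $w=vec(X_1)-vec(Y_1)$, $\mu _w=vec(M_1-M_2)$ and $\varOmega ={ \varSigma }_1 \otimes \varPsi _1+{ \varSigma }_2 \otimes \varPsi _2$; the symbols $w$, $\mu _w$ and $\varOmega $ are shorthand introduced here for illustration. Expanding around the mean gives

$$\begin{aligned} E||w||_{Eu}^2= & {} E[(w-\mu _w)'(w-\mu _w)]+2\mu _w'E(w-\mu _w)+\mu _w'\mu _w\\= & {} tr(\varOmega )+0+||vec(M_1-M_2)||_{Eu}^2\\= & {} tr[{ \varSigma }_1 \otimes \varPsi _1+{ \varSigma }_2 \otimes \varPsi _2]+||M_1-M_2||_{Fr}^2, \end{aligned}$$

where the first term uses $E[(w-\mu _w)'(w-\mu _w)]=tr(E[(w-\mu _w)(w-\mu _w)'])=tr(\varOmega )$ and the cross term vanishes because $E(w-\mu _w)=0$.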

Therefore, we have $2E||X_1-Y_1||_{Fr}^2-E||X_1-X_2||_{Fr}^2-E||Y_1-Y_2||_{Fr}^2 =2||M_1-M_2||^2_{Fr}\ge 0$. The equality holds if and only if $||M_1-M_2||^2_{Fr}=0$, that is, if and only if $M_1=M_2$. $\square $

Proof of Theorem 1

Let $vec(\cdot )$ be an operator that maps a matrix to a vector by stacking the columns of the matrix on top of one another. Let ${\varvec{u}}_1=vec(X_1)$, ${\varvec{u}}_2=vec(X_2)$, ${\varvec{v}}_1=vec(Y_1)$ and ${\varvec{v}}_2=vec(Y_2)$. Based on Proposition 1, one can show that

$$\begin{aligned} ||{\varvec{u}}_1-{\varvec{u}}_2||_{Eu} = \gamma _{dm} \int _{S^{dm-1}}|a'({\varvec{u}}_1-{\varvec{u}}_2)| d\mu (a), \end{aligned}$$

where $\mu $ is the uniform distribution on $S^{dm-1}=\{{\varvec{x}}\in \mathfrak {R}^{dm}{:} ||{\varvec{x}}||_{Eu}=1\}$, the surface of the unit sphere in $\mathfrak {R}^{dm}$, and $\gamma _{dm}$ is the constant given in Proposition 1. Similarly, we have

$$\begin{aligned} ||{\varvec{v}}_1-{\varvec{v}}_2||_{Eu}&= \gamma _{dm} \int _{S^{dm-1}}|a'({\varvec{v}}_1-{\varvec{v}}_2)| d\mu (a),\\ ||{\varvec{u}}_1-{\varvec{v}}_1||_{Eu}&= \gamma _{dm} \int _{S^{dm-1}}|a'({\varvec{u}}_1-{\varvec{v}}_1)| d\mu (a). \end{aligned}$$
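The projection representation can be illustrated by Monte Carlo integration over the sphere. Proposition 1 is not reproduced in this appendix, so the closed form used for the constant below is an assumption: it is the reciprocal of $E|a_1|$ for $a$ uniform on the unit sphere, which is the value the identity forces when the vector is a unit coordinate vector. The function name `gamma_const` and the dimension `k = 6` are illustrative.

```python
import math
import numpy as np

def gamma_const(k: int) -> float:
    """Assumed constant: 1 / E|a_1| for a uniform on the unit sphere in R^k."""
    return math.sqrt(math.pi) * math.gamma((k + 1) / 2) / math.gamma(k / 2)

rng = np.random.default_rng(3)
k = 6                      # plays the role of d * m
w = rng.normal(size=k)     # stands in for u_1 - u_2

# Uniform points on the sphere: normalize standard normal vectors.
g = rng.normal(size=(400_000, k))
a = g / np.linalg.norm(g, axis=1, keepdims=True)

# Monte Carlo estimate of the integral of |a'w| with respect to mu.
mc_integral = np.mean(np.abs(a @ w))
reconstructed = gamma_const(k) * mc_integral
exact = np.linalg.norm(w)
print(reconstructed, exact)
```

The reconstructed norm should match the exact Euclidean norm up to Monte Carlo error, which shrinks at the usual square-root rate in the number of sampled directions.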

Following Baringhaus and Franz (2004), one can show that for each $a\in S^{dm-1}$,

$$\begin{aligned} 2E|a'({\varvec{u}}_1-{\varvec{v}}_1)|-E|a'({\varvec{u}}_1-{\varvec{u}}_2)|-E|a'({\varvec{v}}_1-{\varvec{v}}_2)|\ge 0. \end{aligned}$$

(24)

The equality holds if and only if the distributions of $a'{\varvec{u}}_1$ and $a'{\varvec{v}}_1$ coincide. Integrating with respect to $\mu $ on both sides of inequality (24), we have

$$\begin{aligned} 2E||{\varvec{u}}_1-{\varvec{v}}_1||_{Eu}-E||{\varvec{u}}_1-{\varvec{u}}_2||_{Eu}-E||{\varvec{v}}_1-{\varvec{v}}_2||_{Eu}\ge 0. \end{aligned}$$

(25)

The equality holds if and only if, for almost all $a\in S^{dm-1}$, the distributions of $a'{\varvec{u}}_1$ and $a'{\varvec{v}}_1$ coincide. Since, for each $t\in \mathfrak {R}$, the functions $a\mapsto E(e^{ita' {\varvec{u}}_1})$ and $a\mapsto E(e^{ita' {\varvec{v}}_1})$ are continuous, the equality in (25) holds if and only if ${\varvec{u}}_1$ and ${\varvec{v}}_1$ have the same Fourier transform, which implies $F=G$. By the definition of the Frobenius and Euclidean norms, we have $\mu _{FF}=E||X_1-X_2||_{Fr}=E||{\varvec{u}}_1-{\varvec{u}}_2||_{Eu}$, $\mu _{GG}=E||Y_1-Y_2||_{Fr}=E||{\varvec{v}}_1-{\varvec{v}}_2||_{Eu}$, $\mu _{FG}=E||X_1-Y_1||_{Fr}=E||{\varvec{u}}_1-{\varvec{v}}_1||_{Eu}$. Therefore, we have $2\mu _{FG}-\mu _{FF}-\mu _{GG}\ge 0$, and the equality holds if and only if $F=G$. $\square $

Proof of Theorem 2

Let $h(X_i,X_j;Y_p,Y_q)$ be a real-valued function such that

$$\begin{aligned} h(X_i,X_j;Y_p,Y_q)=||X_i-Y_p||_{Fr}+||X_j-Y_q||_{Fr}-||X_i-X_j||_{Fr}-||Y_p-Y_q||_{Fr}. \end{aligned}$$

(26)

It is clear that h is symmetric within each argument $(X_i,X_j)$ and $(Y_p,Y_q)$. One can show that EN is the U-statistic corresponding to the kernel function h. That is,

$$\begin{aligned} EN= \binom{n_1}{2}^{-1}\binom{n_2}{2}^{-1}\sum _{i=1}^{n_1-1}\sum _{j=i+1}^{n_1}\sum _{p=1}^{n_2-1}\sum _{q=p+1}^{n_2}h(X_i,X_j;Y_p,Y_q). \end{aligned}$$

(27)

If the distributions F and G are identical, by Theorem 1, we have $E(h(X_1,X_2;Y_1,Y_2)) =0$. In addition, since $E[h(X_1,X_2;Y_1,Y_2)|X_1={\mathbb {X}}_1,Y_1={\mathbb {Y}}_1]=0$ for almost all matrix realizations $({\mathbb {X}}_1,{\mathbb {Y}}_1)$, EN is a U-statistic with a degenerate kernel. The asymptotic distribution of EN can be inferred from the work of Hoeffding and Robbins (1948) for the case $k=1$, which shows that $n\cdot EN$ has a non-degenerate limiting distribution $\sum _{i=1}^\infty \lambda _i (Z_i^2-1)$, where the constants $\lambda _i$ depend on F and the $Z_i^2$ are independent $\chi ^2_1$ random variables. $\square $
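Because the weights $\lambda _i$ in the limiting distribution depend on the unknown F, one common way to calibrate a statistic of this type in practice is a permutation test. The sketch below is an illustration under stated assumptions, not the procedure given in the article: the plug-in estimates (pair averages within and between samples), the function names, and the use of permutation p-values are all choices made here, and NumPy is assumed.

```python
import numpy as np

def pairwise_frobenius(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Frobenius distances between every matrix in a (n_a, d, m) and b (n_b, d, m)."""
    diff = a[:, None, :, :] - b[None, :, :, :]
    return np.sqrt(np.sum(diff ** 2, axis=(2, 3)))

def energy_statistic(x: np.ndarray, y: np.ndarray) -> float:
    """Plug-in estimate of 2*mu_FG - mu_FF - mu_GG (assumed form of the estimates)."""
    n1, n2 = len(x), len(y)
    mu_ff = pairwise_frobenius(x, x)[np.triu_indices(n1, k=1)].mean()  # pairs i < j
    mu_gg = pairwise_frobenius(y, y)[np.triu_indices(n2, k=1)].mean()  # pairs p < q
    mu_fg = pairwise_frobenius(x, y).mean()                            # all (i, p)
    return float(2 * mu_fg - mu_ff - mu_gg)

def permutation_pvalue(x, y, n_perm=199, seed=0):
    """Permutation p-value: reshuffle the pooled sample and recompute the statistic."""
    rng = np.random.default_rng(seed)
    pooled = np.concatenate([x, y])
    n1 = len(x)
    observed = energy_statistic(x, y)
    exceed = 0
    for _ in range(n_perm):
        idx = rng.permutation(len(pooled))
        if energy_statistic(pooled[idx[:n1]], pooled[idx[n1:]]) >= observed:
            exceed += 1
    return (exceed + 1) / (n_perm + 1)

# Illustrative data: 30 matrices of size 3 x 4 per sample, second sample mean-shifted.
rng = np.random.default_rng(4)
x = rng.normal(size=(30, 3, 4))
y_shift = rng.normal(loc=1.0, size=(30, 3, 4))
stat_shift = energy_statistic(x, y_shift)
p_shift = permutation_pvalue(x, y_shift)
print(stat_shift, p_shift)
```

For a shift this large the statistic should be clearly positive and the permutation p-value small. Under the null the permuted and observed statistics are exchangeable, which is what justifies the p-value without knowing the $\lambda _i$.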

Proof of Theorem 3

If $\mu _{FF}= \mu _{GG}= \mu _{FG}$, we have $2\mu _{FG}-\mu _{FF}-\mu _{GG}= 0$, which implies $F=G$ by Theorem 1. Conversely, suppose $F=G$. Then $\mathbf{X}_1, \mathbf{X}_2,\mathbf{Y}_1$ and $\mathbf{Y}_2$ are identically distributed. Therefore, the distributions of $ ||\mathbf{X}_1-\mathbf{X}_2||_{Fr}, ||\mathbf{Y}_1-\mathbf{Y}_2||_{Fr}$ and $||\mathbf{X}_1-\mathbf{Y}_1||_{Fr}$ are also equal, which implies $\mu _{FF}= \mu _{GG}= \mu _{FG}$. $\square $

Proof of Theorem 4

Following Biswas and Ghosh (2014) for vector distributions, we express $n\cdot BG(\mathscr {A},\mathscr {B})$ as

$$\begin{aligned} n\cdot BG(\mathscr {A},\mathscr {B})=\frac{1}{2}([\sqrt{n}({\hat{\mu }}_{FF} -{\hat{\mu }}_{GG} )]^2+[\sqrt{n}\cdot EN(\mathscr {A},\mathscr {B})]^2), \end{aligned}$$

where ${\hat{\mu }}_{FF} $ and ${\hat{\mu }}_{GG} $ are given in Eqs. (8)–(10). From Theorem 2, we have ${n}\cdot EN(\mathscr {A},\mathscr {B})=O_p(1)$, and hence $\sqrt{n}\cdot EN(\mathscr {A},\mathscr {B}) \overset{p}{\rightarrow }0$, as $n\rightarrow \infty $.
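The decomposition rests on a purely algebraic identity, which can be checked numerically. The sketch below assumes that BG is the sum of squared differences $({\hat{\mu }}_{FF}-{\hat{\mu }}_{FG})^2+({\hat{\mu }}_{GG}-{\hat{\mu }}_{FG})^2$ and that $EN=2{\hat{\mu }}_{FG}-{\hat{\mu }}_{FF}-{\hat{\mu }}_{GG}$; neither definition is restated in this appendix, so both are labeled assumptions, and the function names are illustrative.

```python
import numpy as np

def bg_direct(mu_ff: float, mu_gg: float, mu_fg: float) -> float:
    """Assumed definition of BG: squared gaps between within- and between-sample means."""
    return (mu_ff - mu_fg) ** 2 + (mu_gg - mu_fg) ** 2

def bg_decomposed(mu_ff: float, mu_gg: float, mu_fg: float) -> float:
    """Half the sum of the squared within-sample gap and the squared energy term."""
    en = 2 * mu_fg - mu_ff - mu_gg  # assumed form of EN
    return 0.5 * ((mu_ff - mu_gg) ** 2 + en ** 2)

# Compare the two expressions on arbitrary positive triples (illustrative values).
rng = np.random.default_rng(5)
triples = rng.uniform(0.5, 5.0, size=(1000, 3))
direct = np.array([bg_direct(*t) for t in triples])
decomposed = np.array([bg_decomposed(*t) for t in triples])
print(np.max(np.abs(direct - decomposed)))
```

The identity is $p^2+q^2=\frac{1}{2}[(p-q)^2+(p+q)^2]$ with $p={\hat{\mu }}_{FF}-{\hat{\mu }}_{FG}$ and $q={\hat{\mu }}_{GG}-{\hat{\mu }}_{FG}$, so the printed maximum discrepancy should be at the level of floating-point rounding.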

Let $\mu _{FF}=E||\mathbf{X}_1-\mathbf{X}_2 ||_{Fr}$ and $\mu _{GG}=E||\mathbf{Y}_{1}-\mathbf{Y}_{2} ||_{Fr}$. Under the null hypothesis, we have $\mu _{FF}=\mu _{GG}$, and hence $\sqrt{n}({\hat{\mu }}_{FF} -{\hat{\mu }}_{GG} )=\sqrt{n}[({\hat{\mu }}_{FF}-\mu _{FF})-({\hat{\mu }}_{GG} -\mu _{GG})]$. Note that

$$\begin{aligned} {\hat{\mu }}_{FF} -\mu _{FF}=\binom{n_1}{2}^{-1}\sum _{i=1}^{n_1-1}\sum _{j=i+1}^{n_1}(||\mathbf{X}_i-\mathbf{X}_j||_{Fr}-\mu _{FF}) \end{aligned}$$

is a U-statistic with symmetric kernel function $h(\mathbf{X}_i,\mathbf{X}_j)=||\mathbf{X}_i-\mathbf{X}_j||_{Fr}-\mu _{FF}$. Therefore, we have $R_1=\sqrt{n_1}({\hat{\mu }}_{FF} -\mu _{FF})\overset{d}{\rightarrow }N(0,4{\sigma }_0)$, where ${\sigma }_0=Var[E(||\mathbf{X}_1-\mathbf{X}_2||_{Fr}|\mathbf{X}_1)]$. Similarly, we have $R_2=\sqrt{n_2}({\hat{\mu }}_{GG} -\mu _{GG})\overset{d}{\rightarrow }N(0,4{\sigma }_0)$. Since ${\hat{\mu }}_{FF} $ and ${\hat{\mu }}_{GG} $ are independent, one can show

$$\begin{aligned} \sqrt{n}({\hat{\mu }}_{FF} -{\hat{\mu }}_{GG} )=&\sqrt{n/n_1}R_1-\sqrt{n/n_2}R_2\overset{d}{\rightarrow }N\left( 0,\left( \frac{1}{\lambda }+\frac{1}{1-\lambda }\right) 4{\sigma }_0\right) . \end{aligned}$$

Therefore, as $\min (n_1,n_2)\rightarrow \infty $, we obtain

$$\begin{aligned} n\cdot BG(\mathscr {A},\mathscr {B}) =\frac{1}{2}([\sqrt{n}({\hat{\mu }}_{FF} -{\hat{\mu }}_{GG} )]^2+[\sqrt{n}\cdot EN(\mathscr {A},\mathscr {B})]^2) \overset{d}{\rightarrow }\frac{ 2{\sigma }_0}{\lambda (1-\lambda )}\chi _1^2. \end{aligned}$$

$\square $

About this article

Cite this article

Guo, L., Modarres, R. Testing the equality of matrix distributions. Stat Methods Appl 29, 289–307 (2020). https://doi.org/10.1007/s10260-019-00477-7

Accepted: 02 June 2019
Published: 25 June 2019
Issue Date: June 2020
DOI: https://doi.org/10.1007/s10260-019-00477-7
