Abstract
Given a p-dimensional random variable X, principal components analysis (PCA) defines its optimal representation in a lower-dimensional space. In this article we assume that X follows a mixture of two multivariate normal distributions, and we project it onto an optimal vector space. We propose an original combination of principal components and linear discriminant analysis in which the area under the ROC curve serves as the link between the two methods. We represent X in terms of a small number of independent factors that contribute maximally to the area under the ROC curve of an optimal linear discriminant function. A practical example illustrates how these factors describe the differences between two categories in a simple classification problem.
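The link between a linear discriminant function and the area under the ROC curve (AUC) can be illustrated numerically. The sketch below is not the chapter's algorithm. It only uses the standard binormal result, discussed by Su and Liu (1993), that for class-conditional distributions N(μ0, Σ0) and N(μ1, Σ1) the linear score a'X has AUC Φ(a'(μ1 − μ0) / sqrt(a'(Σ0 + Σ1)a)), which is maximised by a proportional to (Σ0 + Σ1)⁻¹(μ1 − μ0). The function names and the numerical values of the means and covariances are assumptions made for illustration.

```python
import numpy as np
from math import erf, sqrt


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def linear_auc(a, mu0, mu1, sigma0, sigma1):
    """AUC of the score a'X when X | class k ~ N(mu_k, sigma_k)."""
    a = np.asarray(a, dtype=float)
    shift = a @ (mu1 - mu0)             # difference of the projected means
    spread = a @ (sigma0 + sigma1) @ a  # variance of the difference of scores
    return normal_cdf(shift / sqrt(spread))


def best_linear_direction(mu0, mu1, sigma0, sigma1):
    """Direction proportional to (sigma0 + sigma1)^{-1} (mu1 - mu0)."""
    return np.linalg.solve(sigma0 + sigma1, mu1 - mu0)


# Illustrative (made-up) parameters for a two-component Gaussian mixture.
mu0 = np.array([0.0, 0.0])
mu1 = np.array([1.0, 0.5])
sigma0 = np.array([[1.0, 0.3], [0.3, 1.0]])
sigma1 = np.array([[1.5, -0.2], [-0.2, 0.8]])

a_star = best_linear_direction(mu0, mu1, sigma0, sigma1)
auc_star = linear_auc(a_star, mu0, mu1, sigma0, sigma1)
print("direction:", a_star)
print("AUC of the optimal linear score:", auc_star)
```

Because the AUC depends on a only through its direction, rescaling `a_star` by any positive constant leaves `auc_star` unchanged.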
Notes
1. Class conditional distributions
2. Receiver Operating Characteristic
3. But not necessarily orthogonal
4. Remember that \(\mathbf{B} = (\Sigma _{0} + \Sigma _{1})\)
5. This makes our proposal similar to the methodology applied in Caprihan et al. (2008).
6. This is less than the angle formed by the hands of a clock at five past twelve.
References
Bamber, D. (1975). The area above the ordinal dominance graph and the area below the receiver operating characteristic graph. Journal of Mathematical Psychology, 12, 387–415.
Caprihan, A., Pearlson, G. D., & Calhoun, V. D. (2008). Application of principal component analysis to distinguish patients with schizophrenia from healthy controls based on fractional anisotropy measurements. Neuroimage, 42(2), 675–682.
Chang, W. C. (1983). Using principal components before separating a mixture of two multivariate normal distributions. Applied Statistics, 32(3), 267–275.
Harville, D. A. (1997). Matrix algebra from a statistician’s perspective. New York: Springer.
Izenman, A. J. (2008). Modern multivariate statistical techniques: regression, classification and manifold learning. New York: Springer.
Krzanowski, W. J., & Hand, D. J. (2008). ROC curves for continuous data. Boca Raton, FL: CRC/Taylor and Francis/Chapman & Hall.
Kullback, S. (1968). Information theory and statistics. Mineola, NY: Dover.
Metz, C. E., & Pan, X. (1999). “Proper” binormal ROC curves: Theory and maximum likelihood estimation. Journal of Mathematical Psychology, 43, 1–33.
Su, J. Q., & Liu, J. S. (1993). Linear combinations of multiple diagnostic markers. Journal of the American Statistical Association, 88(424), 1350–1355.
Copyright information
© 2013 Springer International Publishing Switzerland
About this paper
Cite this paper
Cuevas-Covarrubias, C. (2013). Principal Components Analysis for a Gaussian Mixture. In: Lausen, B., Van den Poel, D., Ultsch, A. (eds) Algorithms from and for Nature and Life. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Cham. https://doi.org/10.1007/978-3-319-00035-0_17
DOI: https://doi.org/10.1007/978-3-319-00035-0_17
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-00034-3
Online ISBN: 978-3-319-00035-0
eBook Packages: Mathematics and Statistics, Mathematics and Statistics (R0)