Abstract
Many statistical techniques, particularly multivariate methodologies, focus on extracting information from data and proximity matrices. Rather than relying solely on numerical characteristics, matrix visualization allows one to reveal structure in a matrix graphically. This article reviews the history of matrix visualization, then describes its general framework in more detail, along with some extensions. Possible research directions in matrix visualization and information mining are sketched. Color versions of the figures presented in this article, together with software packages, can be obtained from http://gap.stat.sinica.edu.tw/.
© 2004 Springer-Verlag Berlin Heidelberg
Chen, CH. et al. (2004). Matrix Visualization and Information Mining. In: Antoch, J. (ed.) COMPSTAT 2004 — Proceedings in Computational Statistics. Physica, Heidelberg. https://doi.org/10.1007/978-3-7908-2656-2_6
DOI: https://doi.org/10.1007/978-3-7908-2656-2_6
Publisher Name: Physica, Heidelberg
Print ISBN: 978-3-7908-1554-2
Online ISBN: 978-3-7908-2656-2
eBook Packages: Springer Book Archive