Abstract
Consider the problem of classifying a number of objects into one of several groups or classes based on a set of characteristics. This problem has been extensively studied under the general subject of discriminant analysis in the statistical literature, or supervised pattern recognition in the machine learning field. Recently, dimension reduction methods, such as SIR and SAVE, have been used for classification purposes. In this paper we propose a regularized version of the SIR method which is able to gain information from both the structure of class means and class variances. Furthermore, the introduction of a shrinkage parameter allows the method to be applied in under-resolution problems, such as those found in gene expression microarray data. The REGSIR method is illustrated on two different classification problems using real data sets.
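To make the idea of a shrinkage parameter concrete, the following is a minimal sketch of sliced inverse regression for a categorical response, with the covariance matrix shrunk toward a scaled identity so that it stays invertible when there are more variables than observations. It is not the REGSIR estimator of the paper, whose details the abstract does not give: the function name `regularized_sir_directions`, the parameter `shrinkage`, and the identity shrinkage target are all assumptions made for illustration, and the sketch uses only the class means, not the class variances.

```python
import numpy as np


def regularized_sir_directions(X, y, shrinkage=0.1, n_directions=1):
    """Illustrative SIR directions for class labels y (hypothetical helper).

    Assumption: the pooled covariance is shrunk toward a scaled identity;
    the paper's actual regularization may differ.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    n, p = X.shape

    # Marginal covariance of the predictors.
    mean = X.mean(axis=0)
    Xc = X - mean
    sigma = Xc.T @ Xc / n

    # Convex combination with a scaled identity keeps sigma_reg
    # positive definite whenever shrinkage > 0, even if p > n.
    target = (np.trace(sigma) / p) * np.eye(p)
    sigma_reg = (1.0 - shrinkage) * sigma + shrinkage * target

    # Weighted covariance of the class means (the SIR kernel matrix).
    kernel = np.zeros((p, p))
    for label in np.unique(y):
        members = y == label
        d = X[members].mean(axis=0) - mean
        kernel += members.mean() * np.outer(d, d)

    # Solve kernel v = l * sigma_reg v through the symmetric inverse
    # square root of sigma_reg.
    evals, evecs = np.linalg.eigh(sigma_reg)
    inv_sqrt = evecs @ np.diag(evals ** -0.5) @ evecs.T
    k_evals, k_evecs = np.linalg.eigh(inv_sqrt @ kernel @ inv_sqrt)
    order = np.argsort(k_evals)[::-1][:n_directions]

    # Map the leading eigenvectors back to the original predictor scale.
    return inv_sqrt @ k_evecs[:, order]
```

Projecting the data with `X @ B`, where `B` is the returned matrix, gives low-dimensional scores that a standard classifier could then use. With `shrinkage=0` and fewer observations than variables the covariance matrix is singular and the inverse square root is not defined, which is the situation the shrinkage term is meant to avoid.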
Copyright information
© 2006 Springer-Verlag Heidelberg
About this paper
Cite this paper
Scrucca, L. (2006). Regularized Sliced Inverse Regression with Applications in Classification. In: Zani, S., Cerioli, A., Riani, M., Vichi, M. (eds) Data Analysis, Classification and the Forward Search. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-35978-8_7
DOI: https://doi.org/10.1007/3-540-35978-8_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-35977-7
Online ISBN: 978-3-540-35978-4
eBook Packages: Mathematics and Statistics, Mathematics and Statistics (R0)