Abstract
In this paper, we consider high-dimensional quadratic classifiers in non-sparse settings. The proposed quadratic classifiers draw information about heterogeneity effectively from both the differences of growing mean vectors and the differences of covariance matrices. We show that they enjoy a consistency property, in which the misclassification rates tend to zero as the dimension goes to infinity, under non-sparse settings. We also propose a quadratic classifier after feature selection based on both the differences of mean vectors and covariance matrices. We examine the performance of the classifiers in numerical simulations and real data analyses. Finally, we give concluding remarks on the choice of classifier for high-dimensional, non-sparse data.
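To illustrate the idea behind such classifiers, the sketch below implements a simple diagonal quadratic rule that scores a new observation against each class using both the estimated mean vector and the estimated (diagonal) variances, so that classes differing only in covariance structure can still be separated. This is a minimal textbook-style sketch, not the bias-corrected classifiers proposed in the paper; all function names and the simulated data are illustrative assumptions.

```python
import numpy as np

def quadratic_score(x, mean, var):
    # Per-class quadratic score: variance-scaled squared distance to the
    # class mean plus a log-variance penalty, so differences in both the
    # mean vector and the (diagonal) covariance contribute to the decision.
    return np.sum((x - mean) ** 2 / var) + np.sum(np.log(var))

def classify(x, means, variances):
    # Assign x to the class with the smallest quadratic score.
    scores = [quadratic_score(x, m, v) for m, v in zip(means, variances)]
    return int(np.argmin(scores))

# Two classes in a high-dimension, low-sample-size regime, differing
# in both mean (0.0 vs 0.3) and standard deviation (1.0 vs 2.0).
rng = np.random.default_rng(0)
n, p = 50, 200
X0 = rng.normal(0.0, 1.0, size=(n, p))
X1 = rng.normal(0.3, 2.0, size=(n, p))

means = [X0.mean(axis=0), X1.mean(axis=0)]
variances = [X0.var(axis=0, ddof=1), X1.var(axis=0, ddof=1)]

x_new = rng.normal(0.3, 2.0, size=p)  # drawn from class 1
print(classify(x_new, means, variances))
```

As the dimension p grows, the per-coordinate contributions accumulate, which is the intuition behind the consistency property: the gap between the two class scores grows with p, driving the misclassification rate toward zero.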
Change history
26 December 2021
A Correction to this paper has been published: https://doi.org/10.1007/s11009-021-09918-x
Acknowledgements
We would like to thank the reviewers for their constructive comments. The research of the first author was partially supported by Grants-in-Aid for Scientific Research (A) and Challenging Exploratory Research, Japan Society for the Promotion of Science (JSPS), under Contract Numbers 15H01678 and 26540010. The research of the second author was partially supported by Grant-in-Aid for Young Scientists (B), JSPS, under Contract Number 26800078.
Author information
Authors and Affiliations
Corresponding author
Additional information
The original online version of this article was revised due to a retrospective Open Access order.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Aoshima, M., Yata, K. High-Dimensional Quadratic Classifiers in Non-sparse Settings. Methodol Comput Appl Probab 21, 663–682 (2019). https://doi.org/10.1007/s11009-018-9646-z