Abstract
Real-life datasets in biomedicine often include missing values. When a Bayesian network classifier is learned from such a dataset, the missing values are typically filled in by an imputation method to obtain a complete dataset, which is then used to construct the classifier. When a selective classifier is learned, the selection of appropriate features is likewise based on the completed data. The resulting classifier, however, is likely to be used in the original real-life setting, where it is again confronted with missing values. Using a real-life dataset from the field of oesophageal cancer that includes a relatively large number of missing values, we argue that the wrapper approach to feature selection in particular may result in classifiers that are too selective for such a setting and that, in fact, some redundancy is required to arrive at a reasonable classification accuracy in practice.
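The effect described in the abstract can be illustrated with a minimal sketch. It is not the authors' method or data: it assumes a plain naive Bayes classifier over discrete features, a made-up complete training set standing in for an imputed one, and the simple convention that a feature missing at classification time is marginalised out (its factor is skipped). The feature names `a` and `b` and the functions `train_naive_bayes` and `posterior` are hypothetical.

```python
from collections import Counter, defaultdict


def train_naive_bayes(rows, labels, features, alpha=1.0):
    """Fit class counts and per-feature value counts on complete data."""
    class_counts = Counter(labels)
    # Observed value set of each feature, used for Laplace smoothing.
    values = {f: sorted({r[f] for r in rows}) for f in features}
    cond = defaultdict(Counter)  # (feature, class) -> counts of feature values
    for row, y in zip(rows, labels):
        for f in features:
            cond[(f, y)][row[f]] += 1
    return {"classes": class_counts, "values": values, "cond": cond,
            "features": list(features), "alpha": alpha, "n": len(labels)}


def posterior(model, case):
    """Class posterior for a case; a feature whose value is None is skipped."""
    scores = {}
    for y, n_y in model["classes"].items():
        p = n_y / model["n"]  # class prior
        for f in model["features"]:
            v = case.get(f)
            if v is None:
                continue  # missing value: marginalised out, contributes factor 1
            k = len(model["values"][f])
            p *= (model["cond"][(f, y)][v] + model["alpha"]) / (n_y + model["alpha"] * k)
        scores[y] = p
    total = sum(scores.values())
    return {y: s / total for y, s in scores.items()}


# Hypothetical complete (for example, imputed) training data in which
# features "a" and "b" carry the same information about the class.
rows = [{"a": 1, "b": 1}] * 4 + [{"a": 0, "b": 0}] * 4
labels = ["pos"] * 4 + ["neg"] * 4

# A selective classifier keeps only "a"; a redundant one keeps both features.
selective = train_naive_bayes(rows, labels, ["a"])
redundant = train_naive_bayes(rows, labels, ["a", "b"])

# At classification time the value of "a" is missing for this case.
case = {"a": None, "b": 1}
selective_pos = posterior(selective, case)["pos"]
redundant_pos = posterior(redundant, case)["pos"]
print(f"selective: P(pos) = {selective_pos:.3f}")
print(f"redundant: P(pos) = {redundant_pos:.3f}")
```

On the completed training data the two classifiers are equally accurate, so a wrapper that scores feature subsets by accuracy has no reason to keep `b`. Once `a` is missing, the selective classifier can only return the class prior, whereas the redundant one still uses `b`.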
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Blanco, R., van der Gaag, L.C., Inza, I., Larrañaga, P. (2004). Selective Classifiers Can Be Too Restrictive: A Case-Study in Oesophageal Cancer. In: Barreiro, J.M., Martín-Sánchez, F., Maojo, V., Sanz, F. (eds) Biological and Medical Data Analysis. ISBMDA 2004. Lecture Notes in Computer Science, vol 3337. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30547-7_22
DOI: https://doi.org/10.1007/978-3-540-30547-7_22
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-23964-2
Online ISBN: 978-3-540-30547-7
eBook Packages: Springer Book Archive