Selective Classifiers Can Be Too Restrictive: A Case-Study in Oesophageal Cancer

Blanco, Rosa; van der Gaag, Linda C.; Inza, Iñaki; Larrañaga, Pedro

doi:10.1007/978-3-540-30547-7_22

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 3337)

Included in the following conference series:

International Symposium on Biological and Medical Data Analysis

Abstract

Real-life datasets in biomedicine often include missing values. When a Bayesian network classifier is learned from such a dataset, the missing values are typically filled in by an imputation method to arrive at a complete dataset, and the completed dataset is then used to construct the classifier. When a selective classifier is learned, the selection of appropriate features is also based on the completed data. The resulting classifier, however, is likely to be used in the original real-life setting, where it is again confronted with missing values. Using a real-life dataset from the field of oesophageal cancer that includes a relatively large number of missing values, we argue that the wrapper approach to feature selection in particular may result in classifiers that are too selective for such a setting, and that some redundancy is in fact required to arrive at a reasonable classification accuracy in practice.
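
The effect described in the abstract can be illustrated with a small sketch. This is not the authors' method or data: the toy dataset, the function names `train_naive_bayes` and `predict`, and the choice to simply skip a missing feature at prediction time are all assumptions made for illustration. The sketch fits a naive Bayes classifier on complete (already imputed) training data, once with a single selected feature and once with a second, redundant feature, and then classifies a case whose first feature is missing.

```python
from collections import Counter, defaultdict
import math


def train_naive_bayes(rows, labels, features):
    """Fit a naive Bayes model on complete rows, using only the given feature indices."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (feature, class) -> counts of observed values
    values = defaultdict(set)            # feature -> set of observed values
    for row, y in zip(rows, labels):
        for f in features:
            value_counts[(f, y)][row[f]] += 1
            values[f].add(row[f])
    return class_counts, value_counts, values, list(features)


def predict(model, row):
    """Return the most probable class; features whose value is None are skipped."""
    class_counts, value_counts, values, features = model
    total = sum(class_counts.values())
    best, best_score = None, -math.inf
    for y in sorted(class_counts):
        score = math.log(class_counts[y] / total)
        for f in features:
            if row[f] is None:
                continue  # assumption: a missing feature contributes no evidence
            # Laplace-smoothed estimate of P(feature value | class)
            num = value_counts[(f, y)][row[f]] + 1
            den = class_counts[y] + len(values[f])
            score += math.log(num / den)
        if score > best_score:
            best, best_score = y, score
    return best


# Hypothetical toy data: both features are informative about the class.
train_rows = [(0, 0), (0, 0), (0, 0), (0, 0), (0, 1), (1, 1), (1, 1), (1, 1)]
train_labels = ["neg"] * 5 + ["pos"] * 3

selective = train_naive_bayes(train_rows, train_labels, features=[0])
redundant = train_naive_bayes(train_rows, train_labels, features=[0, 1])

patient = (None, 1)  # feature 0 is missing when the classifier is applied
print(predict(selective, patient))  # neg (falls back to the class prior)
print(predict(redundant, patient))  # pos (the second feature still carries evidence)
```

In this toy setting the single-feature classifier has no evidence left once its only feature is missing, so it returns the majority class, whereas the classifier that kept a redundant feature can still use it.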

Author information

Authors and Affiliations

Department of Computer Science and Artificial Intelligence, University of Basque Country, P.O. Box 649, E-20080, San Sebastián, Spain
Rosa Blanco, Iñaki Inza & Pedro Larrañaga
Institute of Information and Computing Sciences, Utrecht University, P.O. Box 80089, 3508 TB, Utrecht, The Netherlands
Linda C. van der Gaag

Editor information

Editors and Affiliations

Biomedical Informatics Group, Artificial Intelligence Lab, Polytechnical University of Madrid, Spain
José María Barreiro
Medical Bioinformatics Department, Institute of Health ‘Carlos III’, Ctra. Majadahonda-Pozuelo, km 2, 28220 Majadahonda, Madrid, Spain
Fernando Martín-Sánchez
Biomedical Informatics Group, Dep. Inteligencia Artificial, Facultad de Informática, Universidad Politécnica de Madrid, Spain
Víctor Maojo
GRIB, IMIM/UPF Barcelona, Spain
Ferran Sanz

Cite this paper

Blanco, R., van der Gaag, L.C., Inza, I., Larrañaga, P. (2004). Selective Classifiers Can Be Too Restrictive: A Case-Study in Oesophageal Cancer. In: Barreiro, J.M., Martín-Sánchez, F., Maojo, V., Sanz, F. (eds) Biological and Medical Data Analysis. ISBMDA 2004. Lecture Notes in Computer Science, vol 3337. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30547-7_22

DOI: https://doi.org/10.1007/978-3-540-30547-7_22
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-23964-2
Online ISBN: 978-3-540-30547-7
eBook Packages: Springer Book Archive
