Unsupervised Elimination of Redundant Features Using Genetic Programming

  • Conference paper
AI 2009: Advances in Artificial Intelligence (AI 2009)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5866)

Abstract

While most feature selection algorithms focus on finding relevant features, few take the redundancy issue into account. We propose a nonlinear redundancy measure which uses genetic programming to find the redundancy quotient of a feature with respect to a subset of features. The proposed measure is unsupervised and works with unlabeled data. We introduce a forward selection algorithm which can be used along with the proposed measure to perform feature selection over the output of a feature ranking algorithm. The effectiveness of the proposed method is assessed by applying it to the output of the Chi-square (χ²) feature ranker on a classification task. The results show significant improvements in the performance of decision tree and SVM classifiers.
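
The forward selection step described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the names `redundancy` and `forward_select` and the `threshold` value are hypothetical, and the genetic-programming redundancy measure is replaced here by a much simpler linear stand-in (the R² of a least-squares fit of the candidate feature from the features already selected). Like the measure in the abstract, the stand-in uses no class labels.

```python
import numpy as np


def redundancy(X_selected, y):
    """Stand-in redundancy score in [0, 1] (assumption, not the paper's measure).

    Returns the R^2 of a linear least-squares fit of the candidate feature y
    from the already selected features. The paper instead evolves nonlinear
    expressions with genetic programming.
    """
    if X_selected.shape[1] == 0:
        return 0.0  # nothing selected yet, so nothing can explain y
    # Add an intercept column so constant offsets do not hide redundancy.
    A = np.column_stack([X_selected, np.ones(len(y))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    residual = np.sum((y - A @ coef) ** 2)
    total = np.sum((y - y.mean()) ** 2)
    if total == 0.0:
        return 1.0  # a constant feature adds no information
    return max(0.0, 1.0 - residual / total)


def forward_select(X, ranking, threshold=0.9):
    """Walk the ranked features best-first and keep the non-redundant ones.

    `ranking` is the output of any feature ranker (for example chi-square),
    given as column indices of X. `threshold` is a hypothetical cut-off.
    """
    selected = []
    for j in ranking:
        cols = np.asarray(selected, dtype=int)
        if redundancy(X[:, cols], X[:, j]) < threshold:
            selected.append(j)
    return selected
```

With a data matrix whose second column is a linear function of the first, `forward_select(X, [0, 1, 2])` would be expected to drop that second column. A linear stand-in cannot detect nonlinear dependencies such as one feature being the square of another, which is the gap a nonlinear, GP-based measure is meant to close.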

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Neshatian, K., Zhang, M. (2009). Unsupervised Elimination of Redundant Features Using Genetic Programming. In: Nicholson, A., Li, X. (eds) AI 2009: Advances in Artificial Intelligence. AI 2009. Lecture Notes in Computer Science, vol 5866. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-10439-8_44

  • DOI: https://doi.org/10.1007/978-3-642-10439-8_44

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-10438-1

  • Online ISBN: 978-3-642-10439-8

  • eBook Packages: Computer Science (R0)