Cluster Ensemble Based on Co-occurrence Data

  • Conference paper
Advances in Data Analysis, Data Handling and Business Intelligence

Abstract

The ensemble approach, based on aggregated models, has been successfully applied in supervised learning to increase the accuracy and stability of classification. Recently, analogous techniques have been suggested for cluster analysis. Research has shown that combining a set of different clusterings can yield an improved solution. Traditionally, classifiers learned from a data set are built in a feature space. An alternative is to construct decision rules on dissimilarity representations, in which the objects are described by a matrix of their similarities or distances to the remaining training samples. This research focuses on exploiting the additional information provided by a collection of diverse clusterings to generate a co-occurrence (co-association) matrix. Taking the co-occurrence of a pair of patterns in the same cluster as a vote for their association, the data partitions are mapped into a co-association matrix. This n × n matrix represents a new similarity measure between patterns. The final data partition is obtained by clustering this matrix. The experiments study the behavior of partitions built on co-occurrence data.
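
The voting scheme described in the abstract can be illustrated with a minimal, dependency-free Python sketch. It is not the paper's implementation: the function names `co_association` and `consensus_partition` are hypothetical, and the final step (linking every pair whose co-association exceeds a threshold of 0.5, which amounts to a single-link cut) is only one assumed way of clustering the matrix. The abstract does not state which clustering algorithm is applied to it.

```python
from itertools import combinations


def co_association(partitions):
    """Map a list of partitions to an n x n co-association matrix.

    partitions: list of label sequences, one per base clustering,
    each assigning a cluster label to the same n objects.
    Entry [i][j] is the fraction of partitions that put objects
    i and j in the same cluster (each co-occurrence is one vote).
    """
    n = len(partitions[0])
    counts = [[0] * n for _ in range(n)]
    for labels in partitions:
        if len(labels) != n:
            raise ValueError("every partition must label the same n objects")
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    m = len(partitions)
    return [[c / m for c in row] for row in counts]


def consensus_partition(coassoc, threshold=0.5):
    """Cluster the co-association matrix (assumed rule, not the paper's).

    Objects i and j are linked when coassoc[i][j] > threshold; the
    connected groups of linked objects form the final partition.
    """
    n = len(coassoc)
    parent = list(range(n))

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, j in combinations(range(n), 2):
        if coassoc[i][j] > threshold:
            parent[find(i)] = find(j)

    # Relabel group roots as 0, 1, 2, ... in order of first appearance.
    relabel = {}
    return [relabel.setdefault(find(i), len(relabel)) for i in range(n)]


# Three illustrative clusterings of six objects (made-up labels).
partitions = [
    [0, 0, 0, 1, 1, 1],
    [0, 0, 1, 1, 2, 2],
    [1, 1, 1, 0, 0, 0],
]
matrix = co_association(partitions)
final_labels = consensus_partition(matrix)
```

In this toy input the second clustering splits the objects differently from the other two, yet the majority vote still separates the first three objects from the last three. The threshold is a free parameter of this sketch; a hierarchical method applied to 1 minus the co-association values would be another reasonable choice.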

Corresponding author

Correspondence to Dorota Rozmus.

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Rozmus, D. (2009). Cluster Ensemble Based on Co-occurrence Data. In: Fink, A., Lausen, B., Seidel, W., Ultsch, A. (eds) Advances in Data Analysis, Data Handling and Business Intelligence. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-01044-6_16
