A Novel Semi-supervised Clustering Algorithm for Finding Clusters of Arbitrary Shapes

  • Conference paper
Advances in Computer Science and Engineering (CSICC 2008)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 6)

Abstract

Recently, several algorithms have been introduced that improve clustering quality by using supervision in the form of constraints. These algorithms typically use pairwise constraints either to modify the clustering objective function or to learn the clustering distance measure. Very few of them can discover clusters of different shapes while also satisfying the provided constraints. In this paper, a novel semi-supervised clustering algorithm is introduced that uses side information and finds clusters of arbitrary shapes. The algorithm takes a two-stage approach that satisfies the pairwise constraints. In the first stage, the data points are grouped into a relatively large number of fuzzy ellipsoidal sub-clusters. In the second stage, connections between sub-clusters are established according to the pairwise constraints and the similarity of the sub-clusters. Experimental results demonstrate the ability of the proposed algorithm to find clusters of arbitrary shapes.
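The two-stage idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' method: the abstract does not specify the sub-clustering procedure or the similarity measure, so this sketch substitutes plain k-means (hard, spherical sub-clusters rather than fuzzy ellipsoidal ones) for the first stage and uses the smallest point-to-point gap between two sub-clusters as a stand-in similarity in the second stage. The function names `kmeans`, `subcluster_gap` and `merge_subclusters`, the `threshold` parameter, and the toy data are all hypothetical. Constraints are given as pairs of point indices.

```python
import math
from itertools import combinations


def kmeans(points, k, iters=20):
    """Stage 1 stand-in: plain k-means with farthest-first initialisation."""
    centers = [points[0]]
    while len(centers) < k:
        # Next seed: the point farthest from its nearest existing center.
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))

    def assign():
        return [min(range(k), key=lambda j: math.dist(p, centers[j])) for p in points]

    for _ in range(iters):
        labels = assign()
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:
                centers[j] = tuple(sum(x) / len(members) for x in zip(*members))
    return assign(), centers


def subcluster_gap(points, labels, i, j):
    """Assumed similarity: smallest distance between members of sub-clusters i and j."""
    a = [p for p, l in zip(points, labels) if l == i]
    b = [p for p, l in zip(points, labels) if l == j]
    return min((math.dist(p, q) for p in a for q in b), default=math.inf)


def merge_subclusters(points, labels, k, must_link, cannot_link, threshold):
    """Stage 2 stand-in: connect sub-clusters using constraints and similarity."""
    parent = list(range(k))  # union-find forest over sub-cluster ids

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Must-link pairs force their sub-clusters into the same group.
    for a, b in must_link:
        parent[find(labels[a])] = find(labels[b])

    def violates(ri, rj):
        # True if joining groups ri and rj would unite a cannot-link pair.
        return any({find(labels[a]), find(labels[b])} == {ri, rj} for a, b in cannot_link)

    # Join the most similar sub-clusters first, skipping forbidden merges.
    gaps = sorted((subcluster_gap(points, labels, i, j), i, j)
                  for i, j in combinations(range(k), 2))
    for gap, i, j in gaps:
        if gap > threshold:
            break
        ri, rj = find(i), find(j)
        if ri != rj and not violates(ri, rj):
            parent[ri] = rj
    return [find(l) for l in labels]


# Toy data: two parallel line segments, each too elongated for one spherical cluster.
points = [(x, 0) for x in range(10)] + [(x, 5) for x in range(10)]
labels, centers = kmeans(points, k=4)
final = merge_subclusters(points, labels, 4, must_link=[], cannot_link=[], threshold=1.5)
split = merge_subclusters(points, labels, 4, must_link=[], cannot_link=[(0, 9)], threshold=1.5)
joined = merge_subclusters(points, labels, 4, must_link=[(0, 10)], cannot_link=[], threshold=1.5)
```

On this toy data, each segment is first cut into two sub-clusters and then reassembled in the second stage. A cannot-link pair on one segment keeps its two halves apart, and a must-link pair across the segments joins everything. The sketch does not handle contradictory constraints or a cannot-link pair that falls inside a single sub-clusters; a real implementation would need a policy for both cases.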

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Soleymani Baghshah, M., Bagheri Shouraki, S. (2008). A Novel Semi-supervised Clustering Algorithm for Finding Clusters of Arbitrary Shapes. In: Sarbazi-Azad, H., Parhami, B., Miremadi, SG., Hessabi, S. (eds) Advances in Computer Science and Engineering. CSICC 2008. Communications in Computer and Information Science, vol 6. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-89985-3_123

  • DOI: https://doi.org/10.1007/978-3-540-89985-3_123

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-89984-6

  • Online ISBN: 978-3-540-89985-3

  • eBook Packages: Computer Science, Computer Science (R0)
