On Discrete Data Clustering

  • Conference paper
Advances in Knowledge Discovery and Data Mining (PAKDD 2008)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5012)

Abstract

Finite mixture models have been applied to a variety of data mining tasks. Most work on finite mixture models has focused on mixtures for continuous data. However, many applications involve and generate discrete data, for which discrete mixtures are better suited. In this paper, we investigate the problem of discrete data modeling using finite mixture models. We propose a novel mixture that we call the multinomial generalized Dirichlet mixture. We designed experiments involving the modeling and summarization of spatial color image databases to show the robustness, flexibility, and merits of our approach.

Editor information

Takashi Washio, Einoshin Suzuki, Kai Ming Ting, Akihiro Inokuchi

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Bouguila, N., ElGuebaly, W. (2008). On Discrete Data Clustering. In: Washio, T., Suzuki, E., Ting, K.M., Inokuchi, A. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2008. Lecture Notes in Computer Science, vol 5012. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-68125-0_44

  • DOI: https://doi.org/10.1007/978-3-540-68125-0_44

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-68124-3

  • Online ISBN: 978-3-540-68125-0

  • eBook Packages: Computer Science (R0)
