WWW-Newsgroup-Document Clustering by Means of Dynamic Self-organizing Neural Networks

  • Conference paper
Artificial Intelligence and Soft Computing – ICAISC 2008 (ICAISC 2008)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5097)

Abstract

The paper presents a clustering technique based on dynamic self-organizing neural networks and its application to a large-scale, high-dimensional WWW-newsgroup-document clustering problem. The clustered collection comprises 19 997 documents (e-mail messages from different Usenet-News newsgroups) available on the WWW server of the School of Computer Science, Carnegie Mellon University (www.cs.cmu.edu/ TextLearning/datasets.html). A broad comparative analysis with nine alternative clustering techniques has also been carried out, demonstrating the superiority of the proposed approach on the considered problem.
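
The abstract does not describe the dynamic network itself, so the following is only a minimal sketch of the underlying idea: a plain, fixed-size one-dimensional Kohonen chain trained on toy term-frequency vectors. It is not the authors' dynamic variant. The function names (`train_som_chain`, `best_matching_unit`), the parameter values, and the three-word vocabulary are all assumptions made for illustration.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return the index of the neuron closest to x (squared Euclidean distance)."""
    dists = [sum((w_k - x_k) ** 2 for w_k, x_k in zip(w, x)) for w in weights]
    return dists.index(min(dists))


def train_som_chain(data, n_neurons=4, epochs=50, lr0=0.5, radius0=2.0, seed=0):
    """Train a fixed-size 1-D Kohonen chain on a list of equal-length vectors.

    Hypothetical helper: a static chain, not the dynamic network of the paper.
    """
    rng = random.Random(seed)
    dim = len(data[0])
    # Initialise each neuron's weight vector from a randomly chosen sample.
    weights = [list(rng.choice(data)) for _ in range(n_neurons)]
    for epoch in range(epochs):
        # Learning rate and neighbourhood radius decay linearly over the epochs.
        frac = 1.0 - epoch / epochs
        lr = lr0 * frac
        radius = max(radius0 * frac, 1e-3)
        order = list(range(len(data)))
        rng.shuffle(order)
        for idx in order:
            x = data[idx]
            win = best_matching_unit(weights, x)
            for j in range(n_neurons):
                # Gaussian neighbourhood on the chain index, centred on the winner.
                h = math.exp(-((j - win) ** 2) / (2.0 * radius ** 2))
                for k in range(dim):
                    weights[j][k] += lr * h * (x[k] - weights[j][k])
    return weights


# Toy "documents": term-frequency vectors over an assumed 3-word vocabulary.
docs = [
    [5, 0, 1], [4, 1, 0], [6, 0, 0],   # dominated by term 0
    [0, 5, 1], [1, 4, 0], [0, 6, 1],   # dominated by term 1
]
weights = train_som_chain(docs)
# Each document is assigned to the neuron that wins for it.
labels = [best_matching_unit(weights, d) for d in docs]
print(labels)
```

Documents that share a winning neuron (or neighbouring neurons on the chain) are treated as one cluster. On real newsgroup data the input vectors would have thousands of dimensions, which is why the number of neurons and the chain structure are the difficult choices that a dynamic network is meant to adapt during training.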



Editor information

Leszek Rutkowski, Ryszard Tadeusiewicz, Lotfi A. Zadeh, Jacek M. Zurada

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Gorzałczany, M.B., Rudziński, F. (2008). WWW-Newsgroup-Document Clustering by Means of Dynamic Self-organizing Neural Networks. In: Rutkowski, L., Tadeusiewicz, R., Zadeh, L.A., Zurada, J.M. (eds) Artificial Intelligence and Soft Computing – ICAISC 2008. ICAISC 2008. Lecture Notes in Computer Science, vol 5097. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-69731-2_5

  • DOI: https://doi.org/10.1007/978-3-540-69731-2_5

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-69572-1

  • Online ISBN: 978-3-540-69731-2

  • eBook Packages: Computer Science, Computer Science (R0)
