A New Biomimetic Approach Based on Social Spiders for Clustering of Text

  • Chapter
Software Engineering Research, Management and Applications 2012

Part of the book series: Studies in Computational Intelligence (SCI, volume 430)

Abstract

Given the explosion in the volume of data circulating on the web (satellite data, genomic data ...), classifying these data with data mining techniques has become necessary. Here, clustering is performed by a bio-inspired method based on social spiders, because no current learning method can represent unstructured data (text) almost directly. A good classification therefore requires a good representation of the data. Each document is represented by a vector whose components are weights computed over the whole corpus (TF-IDF). To represent text documents we used a language-independent method, that of n-grams of characters and words. Several similarity measures were tested. To validate the classification, we used an evaluation measure based on recall and precision (the F-measure).
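
The representation and evaluation steps named in the abstract can be illustrated with a minimal sketch. It is not the chapter's implementation: the function names, the trigram size `n=3`, the raw-count TF with `log(N / df)` IDF weighting, and the choice of cosine as the similarity measure are all assumptions made for illustration, and the social-spider clustering step itself is not shown.

```python
import math
from collections import Counter


def char_ngrams(text, n=3):
    """Return the overlapping character n-grams of a lowercased text."""
    s = text.lower()
    return [s[i:i + n] for i in range(len(s) - n + 1)]


def tfidf_vectors(docs, n=3):
    """Build one sparse TF-IDF vector (a dict) per document.

    Assumed weighting: raw term frequency times log(N / df), where N is
    the number of documents and df the number of documents containing
    the n-gram. Other TF-IDF variants exist.
    """
    counts = [Counter(char_ngrams(d, n)) for d in docs]
    df = Counter()
    for c in counts:
        df.update(c.keys())  # each n-gram counted once per document
    total = len(docs)
    return [
        {g: tf * math.log(total / df[g]) for g, tf in c.items()}
        for c in counts
    ]


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(g, 0.0) for g, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def f_measure(precision, recall):
    """Harmonic mean of precision and recall (balanced F-measure)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical toy corpus, not data from the chapter.
    docs = [
        "the spider spins a web",
        "a spider weaves its web",
        "genomic data volume grows",
    ]
    vecs = tfidf_vectors(docs)
    print(cosine(vecs[0], vecs[1]), cosine(vecs[0], vecs[2]))
    print(f_measure(0.5, 1.0))
```

Because the n-grams are taken over raw characters, no tokenizer, stemmer, or stop-word list is needed, which is what makes this kind of representation language-independent.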

Author information

Corresponding author

Correspondence to Reda Mohamed Hamou.

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Hamou, R.M., Amine, A., Rahmani, M. (2012). A New Biomimetic Approach Based on Social Spiders for Clustering of Text. In: Lee, R. (ed.) Software Engineering Research, Management and Applications 2012. Studies in Computational Intelligence, vol 430. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-30460-6_2

  • DOI: https://doi.org/10.1007/978-3-642-30460-6_2

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-30459-0

  • Online ISBN: 978-3-642-30460-6

  • eBook Packages: Engineering, Engineering (R0)
