A Fuzzy Methodology for Clustering Text Documents with Uncertain Spatial References

  • Conference paper
Computational Intelligence, Cyber Security and Computational Models

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 412)

Abstract

A Fuzzy ERC (Extraction, Resolving and Clustering) architecture is proposed for handling uncertain spatial information in text. The uncertain information can be queried explicitly by the user, and the system can also cluster documents based on the spatial keywords present in them. This research applies fuzzy logic techniques together with information retrieval methods to resolve spatial uncertainty in text and to find the spatial similarity between documents, that is, the degree to which two or more documents talk about the same spatial location. An experimental analysis is performed on the Reuters data set. The results provide empirical evidence for clustering documents based on the spatial references present in them. It is concluded that the proposed work gives users a new way of retrieving documents that have similar spatial references.
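
The notion of spatial similarity described above can be illustrated with a small sketch. It is not the Fuzzy ERC method itself, whose details are not given in the abstract. It assumes each document has already been reduced to a fuzzy set of place names, with a membership degree in [0, 1] expressing how strongly the text refers to each place, and it uses fuzzy Jaccard similarity as a stand-in measure. The function name, the place names and the membership values are all hypothetical.

```python
from typing import Dict

# Assumed representation: place name -> membership degree in [0, 1],
# i.e. how strongly the document is judged to refer to that place.
FuzzyPlaces = Dict[str, float]


def fuzzy_spatial_similarity(a: FuzzyPlaces, b: FuzzyPlaces) -> float:
    """Fuzzy Jaccard similarity between two fuzzy sets of places.

    Intersection uses the minimum membership and union the maximum,
    the standard fuzzy-set operators. Returns a value in [0, 1].
    """
    places = set(a) | set(b)
    intersection = sum(min(a.get(p, 0.0), b.get(p, 0.0)) for p in places)
    union = sum(max(a.get(p, 0.0), b.get(p, 0.0)) for p in places)
    # Two documents with no spatial references are treated as dissimilar.
    return intersection / union if union > 0 else 0.0


# Illustrative, made-up membership values.
doc1 = {"Chennai": 0.9, "Tamil Nadu": 0.6}
doc2 = {"Chennai": 0.7, "Bangalore": 0.4}
doc3 = {"Paris": 1.0}

sim_12 = fuzzy_spatial_similarity(doc1, doc2)  # partial overlap on "Chennai"
sim_13 = fuzzy_spatial_similarity(doc1, doc3)  # no shared place
```

A pairwise score of this kind could then feed any similarity-based clustering routine; which clustering step the paper actually uses is not stated in the abstract.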



Author information

Corresponding author

Correspondence to V. R. Kanagavalli.

Copyright information

© 2016 Springer Science+Business Media Singapore

About this paper

Cite this paper

Kanagavalli, V.R., Raja, K. (2016). A Fuzzy Methodology for Clustering Text Documents with Uncertain Spatial References. In: Senthilkumar, M., Ramasamy, V., Sheen, S., Veeramani, C., Bonato, A., Batten, L. (eds) Computational Intelligence, Cyber Security and Computational Models. Advances in Intelligent Systems and Computing, vol 412. Springer, Singapore. https://doi.org/10.1007/978-981-10-0251-9_13

  • DOI: https://doi.org/10.1007/978-981-10-0251-9_13

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-10-0250-2

  • Online ISBN: 978-981-10-0251-9

  • eBook Packages: Engineering, Engineering (R0)
