Pattern Recognition Method for Classification of Agricultural Scientific Papers in Polish

  • Conference paper
Computer Vision and Graphics (ICCVG 2018)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 11114)

Abstract

Calculating text similarity is an essential task in text analysis and classification. It can be based, e.g., on the Jaccard, cosine, or other similar measures. Such measures treat the text as a bag of words and therefore lose some syntactic and semantic features of its sentences. This article presents a different measure based on the so-called artificial sentence pattern (ASP) method. The method was developed to analyze texts in Polish, a language with very rich inflection, so ASP uses syntactic and semantic rules of the Polish language. Nevertheless, we argue that it admits extensions to other languages. The analysis yields several hypernodes that contain the most important words. Each hypernode corresponds to one of the examined documents, which are published papers from the agriculture domain written in Polish. Experimental results obtained from that set of papers are described and discussed. The results are illustrated visually using graphs of hypernodes and compared with the Jaccard and cosine measures.
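
As a point of reference for the baseline measures named in the abstract, the following is a minimal sketch of Jaccard and cosine similarity over a bag-of-words representation. It is not the ASP method and is not taken from the paper; the whitespace tokenizer and the function names `tokenize`, `jaccard`, and `cosine` are assumptions made for illustration only.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: naive lowercase whitespace tokenization, no lemmatization.
    return text.lower().split()


def jaccard(a, b):
    # Jaccard similarity: |A ∩ B| / |A ∪ B| over the sets of distinct words.
    set_a, set_b = set(tokenize(a)), set(tokenize(b))
    if not set_a and not set_b:
        return 1.0  # convention chosen here for two empty texts
    return len(set_a & set_b) / len(set_a | set_b)


def cosine(a, b):
    # Cosine similarity between word-count vectors.
    count_a, count_b = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(count_a[w] * count_b[w] for w in count_a.keys() & count_b.keys())
    norm_sq_a = sum(v * v for v in count_a.values())
    norm_sq_b = sum(v * v for v in count_b.values())
    if norm_sq_a == 0 or norm_sq_b == 0:
        return 0.0  # convention chosen here when either text is empty
    return dot / math.sqrt(norm_sq_a * norm_sq_b)


# Two sentences with the same words in a different order.
s1 = "dog bites man"
s2 = "man bites dog"
print(jaccard(s1, s2), cosine(s1, s2))
```

Both measures score the two example sentences as identical, although they mean different things, which illustrates the loss of syntactic information that the abstract attributes to bag-of-words measures.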

Corresponding author

Correspondence to Piotr Wrzeciono.

Copyright information

© 2018 Springer Nature Switzerland AG

About this paper

Cite this paper

Wrzeciono, P., Karwowski, W. (2018). Pattern Recognition Method for Classification of Agricultural Scientific Papers in Polish. In: Chmielewski, L., Kozera, R., Orłowski, A., Wojciechowski, K., Bruckstein, A., Petkov, N. (eds) Computer Vision and Graphics. ICCVG 2018. Lecture Notes in Computer Science, vol 11114. Springer, Cham. https://doi.org/10.1007/978-3-030-00692-1_43

  • DOI: https://doi.org/10.1007/978-3-030-00692-1_43

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-00691-4

  • Online ISBN: 978-3-030-00692-1

  • eBook Packages: Computer Science, Computer Science (R0)
