Word Level Plagiarism Detection of Marathi Text Using N-Gram Approach

  • Conference paper
Recent Trends in Image Processing and Pattern Recognition (RTIP2R 2018)

Abstract

Plagiarism is increasing day by day, and detecting it is both complex and necessary. This paper addresses word-level plagiarism detection for Marathi text using an N-gram language model and a Marathi corpus. Although this is the simplest form of detection, it still provides good depth for understanding and emphasizing the detection of copy-paste and paraphrased plagiarism. It also forms the basis for sentence-level and paragraph-level processing.
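The abstract does not spell out the matching procedure, so the following is only a minimal sketch of one common way to compare texts at the word level with N-grams. It assumes whitespace tokenization and a containment score (the share of the suspicious text's N-grams that also occur in the source). The function names, the choice of bigrams, and the example sentences are illustrative assumptions, not the authors' method or corpus.

```python
def word_ngrams(text, n):
    """Return the set of word-level n-grams (as tuples) found in text."""
    words = text.split()  # assumption: plain whitespace tokenization
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def containment(suspicious, source, n=2):
    """Fraction of the suspicious text's n-grams that also occur in source."""
    susp = word_ngrams(suspicious, n)
    if not susp:
        return 0.0  # text shorter than n words: nothing to compare
    return len(susp & word_ngrams(source, n)) / len(susp)


# Hypothetical example sentences, not taken from the paper's corpus.
source = "मी आज शाळेत जात आहे"
copied = "मी आज शाळेत जात आहे"      # verbatim copy
reworded = "मी आज बाजारात जात आहे"  # one word replaced

print(containment(copied, source))    # every bigram is shared
print(containment(reworded, source))  # only some bigrams are shared
```

A verbatim copy shares every bigram with the source, while replacing a single word breaks the two bigrams that contain it, so the score drops. This is why exact N-gram overlap is a natural fit for copy-paste detection and only a partial signal for paraphrase.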



Acknowledgement

The authors would like to acknowledge and thank the CSRI DST Major Project (sanction No.SR/CSRI/71/2015(G)), the Computational and Psycholinguistic Research Lab Facility for supporting this work, and the Department of Computer Science and Information Technology, Dr. Babasaheb Ambedkar Marathwada University, Aurangabad, Maharashtra, India.

Author information

Corresponding authors

Correspondence to Ramesh R. Naik, Maheshkumar B. Landge or C. Namrata Mahender.

Copyright information

© 2019 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Naik, R.R., Landge, M.B., Mahender, C.N. (2019). Word Level Plagiarism Detection of Marathi Text Using N-Gram Approach. In: Santosh, K., Hegadi, R. (eds) Recent Trends in Image Processing and Pattern Recognition. RTIP2R 2018. Communications in Computer and Information Science, vol 1037. Springer, Singapore. https://doi.org/10.1007/978-981-13-9187-3_2

  • DOI: https://doi.org/10.1007/978-981-13-9187-3_2

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-13-9186-6

  • Online ISBN: 978-981-13-9187-3

  • eBook Packages: Computer Science, Computer Science (R0)
