Automatic Keyphrases Extraction from Document Using Neural Network

  • Conference paper
Advances in Machine Learning and Cybernetics

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3930)

Abstract

Keyphrase extraction is a task with many applications in information retrieval, text mining, and natural language processing. In this paper, a keyphrase extraction approach based on a neural network is proposed. To determine whether a phrase is a keyphrase, the following features of the phrase in a given document are used: its term frequency and inverse document frequency, whether it appears in the title or headings (subheadings) of the document, and how frequently it appears in the paragraphs of the document. The algorithm is evaluated using the standard information retrieval metrics of precision and recall, as well as human assessment.
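
The features and metrics named in the abstract can be illustrated with a small sketch. The exact feature definitions and the network architecture are not given on this page, so everything below is an assumption made for illustration: the function names, the normalised term frequency, the smoothed inverse document frequency, and the reading of "frequency in the paragraphs" as the fraction of paragraphs containing the phrase. The resulting vector is the kind of input a neural network classifier could be trained on; the classifier itself is not shown.

```python
import math
import re


def phrase_features(phrase, title, headings, paragraphs, doc_freq, n_docs):
    """Return [tf, idf, in_title, in_headings, paragraph_fraction] for a phrase.

    Hypothetical helper: the argument names and the feature encoding are
    illustrative assumptions, not the definitions used in the paper.
    doc_freq is the number of corpus documents containing the phrase and
    n_docs is the corpus size.
    """
    pattern = re.compile(r"\b" + re.escape(phrase.lower()) + r"\b")
    lowered = [p.lower() for p in paragraphs]
    n_words = max(sum(len(p.split()) for p in lowered), 1)

    # Term frequency: occurrences in the body, normalised by body length.
    tf = sum(len(pattern.findall(p)) for p in lowered) / n_words
    # Inverse document frequency with add-one smoothing (an assumed variant).
    idf = math.log((1 + n_docs) / (1 + doc_freq))
    # Binary indicators for appearance in the title and in any heading.
    in_title = 1.0 if pattern.search(title.lower()) else 0.0
    in_headings = 1.0 if any(pattern.search(h.lower()) for h in headings) else 0.0
    # Fraction of paragraphs that mention the phrase at least once.
    hits = sum(1 for p in lowered if pattern.search(p))
    paragraph_fraction = hits / max(len(lowered), 1)
    return [tf, idf, in_title, in_headings, paragraph_fraction]


def precision_recall(extracted, reference):
    """Precision and recall of extracted keyphrases against a reference set."""
    extracted, reference = set(extracted), set(reference)
    correct = len(extracted & reference)
    precision = correct / len(extracted) if extracted else 0.0
    recall = correct / len(reference) if reference else 0.0
    return precision, recall
```

In this sketch, precision is the share of extracted phrases that also appear in the reference set (for example, author-assigned keyphrases), and recall is the share of reference phrases that were extracted.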

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Wang, J., Peng, H., Hu, Js. (2006). Automatic Keyphrases Extraction from Document Using Neural Network. In: Yeung, D.S., Liu, ZQ., Wang, XZ., Yan, H. (eds) Advances in Machine Learning and Cybernetics. Lecture Notes in Computer Science, vol 3930. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11739685_66

  • DOI: https://doi.org/10.1007/11739685_66

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-33584-9

  • Online ISBN: 978-3-540-33585-6

  • eBook Packages: Computer Science, Computer Science (R0)
