Automatic Keyphrases Extraction from Document Using Neural Network

Wang, Jiabing; Peng, Hong; Hu, Jing-song

doi:10.1007/11739685_66

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3930)

Abstract

Keyphrase extraction is a task with many applications in information retrieval, text mining, and natural language processing. This paper proposes a keyphrase extraction approach based on a neural network. To decide whether a phrase is a keyphrase, the approach uses the following features of the phrase in a given document: its term frequency and inverse document frequency, whether it appears in the title or headings (subheadings) of the document, and its frequency of occurrence in the paragraphs of the document. The algorithm is evaluated with the standard information retrieval metrics of precision and recall, and by human assessment.
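
The features and metrics named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `Document` class, the `phrase_features` and `precision_recall` functions, the substring-based counting, the smoothed form of the inverse document frequency, and the reading of "frequency in the paragraphs" as the fraction of paragraphs containing the phrase are all assumptions made for illustration. The resulting vector is the kind of input a neural network classifier could take; the network itself is not shown here.

```python
import math
from dataclasses import dataclass


@dataclass
class Document:
    """Hypothetical container for a document split into its parts."""
    title: str
    headings: list[str]
    paragraphs: list[str]

    def full_text(self) -> str:
        return " ".join([self.title, *self.headings, *self.paragraphs]).lower()


def phrase_features(phrase: str, doc: Document, corpus: list[Document]) -> list[float]:
    """Return [tf, idf, in_title, in_heading, paragraph_fraction] for one phrase."""
    p = phrase.lower()
    text = doc.full_text()

    # Term frequency: substring occurrences divided by word count (crude, assumed).
    n_words = max(len(text.split()), 1)
    tf = text.count(p) / n_words

    # Inverse document frequency with add-one smoothing (an assumed variant).
    df = sum(1 for d in corpus if p in d.full_text())
    idf = math.log((1 + len(corpus)) / (1 + df))

    # Binary flags for appearance in the title and in any heading or subheading.
    in_title = 1.0 if p in doc.title.lower() else 0.0
    in_heading = 1.0 if any(p in h.lower() for h in doc.headings) else 0.0

    # Fraction of paragraphs that contain the phrase (assumed interpretation).
    n_par = max(len(doc.paragraphs), 1)
    par_fraction = sum(1 for q in doc.paragraphs if p in q.lower()) / n_par

    return [tf, idf, in_title, in_heading, par_fraction]


def precision_recall(extracted: set[str], gold: set[str]) -> tuple[float, float]:
    """Precision and recall of extracted keyphrases against a gold set."""
    hits = len(extracted & gold)
    precision = hits / len(extracted) if extracted else 0.0
    recall = hits / len(gold) if gold else 0.0
    return precision, recall
```

For example, `phrase_features("neural network", doc, corpus)` yields one five-element vector per candidate phrase, and `precision_recall` compares the phrases a classifier accepts with author-assigned keyphrases.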

Author information

Authors and Affiliations

School of Computer Science and Engineering, South China University of Technology, Guangzhou, 510641, China
Jiabing Wang, Hong Peng & Jing-song Hu

Editor information

Editors and Affiliations

Department of Computing, Hong Kong Polytechnic University, P.O. Box, Hong Kong, China
Daniel S. Yeung
School of Creative Media, City University of Hong Kong, China
Zhi-Qiang Liu
Department of Mathematics and Computer Science, Hebei University, 071002, Baoding, Hebei, P.R. China
Xi-Zhao Wang
School of Electrical and Information Engineering, University of Sydney, 2006, NSW, Australia
Hong Yan

About this paper

Cite this paper

Wang, J., Peng, H., Hu, Js. (2006). Automatic Keyphrases Extraction from Document Using Neural Network. In: Yeung, D.S., Liu, ZQ., Wang, XZ., Yan, H. (eds) Advances in Machine Learning and Cybernetics. Lecture Notes in Computer Science, vol 3930. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11739685_66

DOI: https://doi.org/10.1007/11739685_66
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-33584-9
Online ISBN: 978-3-540-33585-6
eBook Packages: Computer Science, Computer Science (R0)
