From Web Crawled Text to Project Descriptions: Automatic Summarizing of Social Innovation Projects

  • Conference paper
Natural Language Processing and Information Systems (NLDB 2019)

Abstract

In the past decade, social innovation projects have gained the attention of policy makers, as they address important social issues in an innovative manner. A database of social innovation projects is a valuable source of information that can expand collaboration between social innovators, drive policy and support research. Such a database needs its projects to be described and summarized. In this paper, we propose and compare several methods (e.g. SVM-based, recurrent neural network based, ensemble) for describing projects based on the text that is available on project websites. We also propose a new metric, based on topic modelling, for the automated evaluation of summaries.
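
The abstract does not define the topic-modelling-based metric, so the following is only a minimal sketch of the general idea, not the authors' method: map the source text and a candidate summary to topic vectors and compare them. The topic-word weights in `TOPICS`, the function names and the choice of cosine similarity are all assumptions made for illustration; in practice the topics would come from a fitted topic model such as LDA.

```python
import math
from collections import Counter

# Hypothetical toy topic-word weights (assumption, not from the paper).
# A real system would learn these with a topic model such as LDA.
TOPICS = {
    "health": {"health": 0.4, "care": 0.3, "patients": 0.3},
    "education": {"school": 0.4, "students": 0.3, "learning": 0.3},
    "environment": {"energy": 0.4, "recycling": 0.3, "climate": 0.3},
}


def topic_vector(text):
    """Return a normalised topic-weight vector for a text.

    Tokenisation is a plain lowercase whitespace split, so punctuation
    attached to a word is not stripped.
    """
    counts = Counter(text.lower().split())
    scores = [
        sum(weight * counts[word] for word, weight in words.items())
        for words in TOPICS.values()
    ]
    total = sum(scores)
    return [s / total for s in scores] if total else [0.0] * len(scores)


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def topic_similarity(source, summary):
    """Score a summary by how closely its topic mix matches the source's."""
    return cosine(topic_vector(source), topic_vector(summary))


if __name__ == "__main__":
    source = "the project improves health care for patients and supports school learning"
    print(topic_similarity(source, "health care for patients"))
    print(topic_similarity(source, "recycling and climate energy"))
```

Under these assumptions, a summary that covers the same topics as the source scores higher than one that drifts to unrelated topics, without requiring a human-written reference summary.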

Notes

  1. https://gate.ac.uk/projects/knowmak/

  2. https://ec.europa.eu/programmes/horizon2020/en/h2020-section/societal-challenges

  3. http://ec.europa.eu/growth/industry/policy/key-enabling-technologies_en

  4. https://esid.manchester.ac.uk/

  5. https://www.knowmak.eu/

Acknowledgements

The work presented in this paper is part of the KNOWMAK project that has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No. 726992.

Author information

Corresponding author

Correspondence to Nikola Milošević.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Milošević, N., Marinov, D., Gök, A., Nenadić, G. (2019). From Web Crawled Text to Project Descriptions: Automatic Summarizing of Social Innovation Projects. In: Métais, E., Meziane, F., Vadera, S., Sugumaran, V., Saraee, M. (eds) Natural Language Processing and Information Systems. NLDB 2019. Lecture Notes in Computer Science, vol 11608. Springer, Cham. https://doi.org/10.1007/978-3-030-23281-8_13

  • DOI: https://doi.org/10.1007/978-3-030-23281-8_13

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-23280-1

  • Online ISBN: 978-3-030-23281-8

  • eBook Packages: Computer Science (R0)
