A New Memetic Algorithm for Multi-document Summarization Based on CHC Algorithm and Greedy Search

  • Conference paper
Human-Inspired Computing and Its Applications (MICAI 2014)

Abstract

Multi-document summarization extracts the most relevant sentences from a set of documents so that the user can grasp their content more quickly. This paper treats the generation of extractive summaries from multiple documents as a binary optimization problem and proposes MA-MultiSumm, a memetic method based on the CHC evolutionary algorithm and greedy search, whose objective function optimizes a linear combination of coverage and redundancy factors. MA-MultiSumm was compared with other state-of-the-art methods using ROUGE measures. It outperforms all of them on the DUC2005 dataset, and on DUC2006 its results are very close to those of the best method. Furthermore, in a unified ranking, MA-MultiSumm was outperformed only by the DESAMC+DocSum method, which requires as many iterations of the evolutionary process as MA-MultiSumm. The experimental results indicate that the optimization-based approach to multi-document summarization is a promising research direction.
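The binary formulation mentioned in the abstract can be illustrated with a small sketch. It is not the paper's implementation: the abstract does not give the exact coverage and redundancy formulas, so the cosine-based definitions, the weight `lam`, the length budget `max_words`, and all function names below are assumptions chosen for illustration. A candidate summary is a 0/1 vector over sentences, and a greedy bit-flip search (standing in for the local-search step of a memetic algorithm) keeps any flip that raises the objective without exceeding the budget.

```python
from collections import Counter
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def objective(selected, vectors, centroid, lam=0.5):
    """Assumed objective: lam * coverage - (1 - lam) * redundancy.

    coverage   = similarity of each chosen sentence to the collection centroid
    redundancy = pairwise similarity among the chosen sentences
    """
    chosen = [i for i, bit in enumerate(selected) if bit]
    coverage = sum(cosine(vectors[i], centroid) for i in chosen)
    redundancy = sum(
        cosine(vectors[i], vectors[j])
        for k, i in enumerate(chosen)
        for j in chosen[k + 1:]
    )
    return lam * coverage - (1 - lam) * redundancy


def greedy_improve(selected, vectors, centroid, max_words, lam=0.5):
    """First-improvement bit-flip search under a summary length budget."""
    lengths = [sum(v.values()) for v in vectors]
    best = list(selected)
    best_score = objective(best, vectors, centroid, lam)
    improved = True
    while improved:
        improved = False
        for i in range(len(best)):
            candidate = best.copy()
            candidate[i] = 1 - candidate[i]  # flip one sentence in or out
            words = sum(n for n, bit in zip(lengths, candidate) if bit)
            if words > max_words:
                continue  # summary would exceed the length budget
            score = objective(candidate, vectors, centroid, lam)
            if score > best_score:
                best, best_score, improved = candidate, score, True
    return best, best_score


# Toy collection: the first two sentences are duplicates.
sentences = ["cats chase mice", "cats chase mice", "dogs bark loudly"]
vectors = [Counter(s.split()) for s in sentences]
centroid = sum(vectors, Counter())
summary, score = greedy_improve([0, 0, 0], vectors, centroid, max_words=6)
print(summary, round(score, 3))
```

In a memetic setting, a step like `greedy_improve` would be applied to individuals produced by the evolutionary operators; the CHC-specific parts (such as its crossover and restart mechanisms) are deliberately left out of this sketch.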


Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Mendoza, M., Cobos, C., León, E., Lozano, M., Rodríguez, F., Herrera-Viedma, E. (2014). A New Memetic Algorithm for Multi-document Summarization Based on CHC Algorithm and Greedy Search. In: Gelbukh, A., Espinoza, F.C., Galicia-Haro, S.N. (eds) Human-Inspired Computing and Its Applications. MICAI 2014. Lecture Notes in Computer Science, vol 8856. Springer, Cham. https://doi.org/10.1007/978-3-319-13647-9_14

  • DOI: https://doi.org/10.1007/978-3-319-13647-9_14

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-13646-2

  • Online ISBN: 978-3-319-13647-9

  • eBook Packages: Computer Science (R0)
