Identification of Conclusive Association Entities by Biomedical Association Mining

  • Conference paper
Intelligent Information and Database Systems (ACIIDS 2019)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11431)

Abstract

Conclusive association entities (CAEs) in the title and the abstract of an article a are those biomedical entities (e.g., genes, diseases, and chemicals) that are specific targets on which conclusive findings about their associations are reported in a. Identification of the CAEs is essential for the analysis of conclusive associations, which is a task routinely conducted by many biomedical scientists. However, CAE identification is challenging, as it is difficult to identify the specific entities and then estimate how conclusive the findings on the entities are. In this paper we present an association mining technique to improve CAE identification. The technique is based on a hypothesis: two candidate entities in an article are likely to be CAEs of the article if a strong association between them is mined from a collection of articles. Experimental results show that, by integrating the technique with representative keyword identification indicators, CAE identification can be significantly improved. The results are of technical and practical significance to the indexing, curation, and exploration of conclusive associations reported in biomedical literature.
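
The hypothesis above can be illustrated with a minimal sketch. The abstract does not say which association measure the technique uses, so the positive pointwise mutual information (PPMI) below is an assumption, as are the function names, the toy corpus, and the rule of scoring each candidate by its strongest association with another candidate in the same article.

```python
import math
from collections import Counter
from itertools import combinations


def mine_associations(articles):
    """Mine pairwise association strengths from a collection of articles.

    `articles` is a list of entity sets, one per article. The strength is
    positive PMI over article-level co-occurrence (an assumed measure, not
    necessarily the one used in the paper).
    """
    n = len(articles)
    entity_df = Counter()  # number of articles mentioning each entity
    pair_df = Counter()    # number of articles mentioning both entities
    for entities in articles:
        unique = set(entities)
        entity_df.update(unique)
        pair_df.update(frozenset(p) for p in combinations(sorted(unique), 2))

    strength = {}
    for pair, co in pair_df.items():
        a, b = tuple(pair)
        pmi = math.log((co * n) / (entity_df[a] * entity_df[b]))
        strength[pair] = max(0.0, pmi)  # clamp negatives: positive PMI
    return strength


def score_candidates(candidates, strength):
    """Score each candidate entity of one article by its strongest mined
    association with any other candidate in the same article."""
    scores = {}
    for entity in candidates:
        links = [strength.get(frozenset((entity, other)), 0.0)
                 for other in candidates if other != entity]
        scores[entity] = max(links, default=0.0)
    return scores


# Hypothetical toy collection: each set lists the entities found in one article.
corpus = [
    {"BRCA1", "breast cancer"},
    {"BRCA1", "breast cancer", "tamoxifen"},
    {"TP53", "lung cancer"},
    {"tamoxifen", "lung cancer"},
]
strength = mine_associations(corpus)

# Candidate entities in the title and abstract of a target article.
scores = score_candidates(["BRCA1", "breast cancer", "tamoxifen"], strength)
for entity, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{entity}: {score:.3f}")
```

In this toy collection, BRCA1 and breast cancer co-occur more often than chance would predict, so both receive a higher score than tamoxifen. A score of this kind could serve as one extra feature alongside keyword identification indicators. How the paper actually combines the two is not described in the abstract.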

Notes

  1. Update information of CTD can be found at http://ctdbase.org/help/faq/;jsessionid=92111C8A6B218E4B2513C3B0BEE7E63F?p=6422623 (accessed September 2018).

  2. A large number of biomedical scientists join the curation tasks of GHR; see http://ghr.nlm.nih.gov/ExpertReviewers (accessed September 2018).

  3. OMIM updates association information on a daily basis; see http://www.omim.org/about (accessed September 2018).

  4. MeSH (available at https://www.ncbi.nlm.nih.gov/mesh) is a controlled vocabulary for indexing biomedical articles.

  5. SVMrank is available at http://www.cs.cornell.edu/people/tj/svm_light/svm_rank.html.

  6. In one of our previous projects (ID: MOST 105-2221-E-320-004), we employed articles from CTD as experimental data.

  7. More information about the customized vocabulary is available at http://ctdbase.org/help/faq/;jsessionid=92111C8A6B218E4B2513C3B0BEE7E63F?p=6422623, http://ctdbase.org/help/geneDetailHelp.jsp, and http://ctdbase.org/help/diseaseDetailHelp.jsp (accessed May 2017).

Acknowledgment

This research was supported by the Ministry of Science and Technology, Taiwan (grant ID: MOST 107-2221-E-320-004).

Author information

Corresponding author

Correspondence to Rey-Long Liu.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Liu, RL. (2019). Identification of Conclusive Association Entities by Biomedical Association Mining. In: Nguyen, N., Gaol, F., Hong, TP., Trawiński, B. (eds) Intelligent Information and Database Systems. ACIIDS 2019. Lecture Notes in Computer Science, vol 11431. Springer, Cham. https://doi.org/10.1007/978-3-030-14799-0_9

  • DOI: https://doi.org/10.1007/978-3-030-14799-0_9

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-14798-3

  • Online ISBN: 978-3-030-14799-0

  • eBook Packages: Computer Science (R0)
