Identification of Conclusive Association Entities by Biomedical Association Mining

Liu, Rey-Long

doi:10.1007/978-3-030-14799-0_9

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11431)

Included in the following conference series:

Asian Conference on Intelligent Information and Database Systems

Abstract

Conclusive association entities (CAEs) in the title and the abstract of an article a are those biomedical entities (e.g., genes, diseases, and chemicals) that are specific targets on which conclusive findings about their associations are reported in a. Identification of the CAEs is essential for the analysis of conclusive associations, which is a task routinely conducted by many biomedical scientists. However, CAE identification is challenging, as it is difficult to identify the specific entities and then estimate how conclusive the findings on the entities are. In this paper we present an association mining technique to improve CAE identification. The technique is based on a hypothesis: two candidate entities in an article are likely to be CAEs of the article if a strong association between them is mined from a collection of articles. Experimental results show that, by integrating the technique with representative keyword identification indicators, CAE identification can be significantly improved. The results are of technical and practical significance to the indexing, curation, and exploration of conclusive associations reported in biomedical literature.
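The hypothesis above can be illustrated with a minimal sketch. It is not the chapter's actual method: the abstract does not state which association measure is used or how it is combined with the keyword identification indicators. The sketch assumes positive pointwise mutual information (PPMI) over article-level co-occurrence as the association strength, and scores each candidate entity by its strongest association with another candidate in the same article. All function names and the toy collection are hypothetical.

```python
from collections import Counter
from itertools import combinations
from math import log


def mine_associations(collection):
    """Count, over a collection of articles, how many articles mention
    each entity and each unordered entity pair.

    `collection` is a list of articles, each given as a list of entity names.
    """
    entity_df = Counter()
    pair_df = Counter()
    for entities in collection:
        unique = sorted(set(entities))          # one count per article
        entity_df.update(unique)
        pair_df.update(combinations(unique, 2))  # pairs in sorted order
    return entity_df, pair_df, len(collection)


def association_strength(a, b, entity_df, pair_df, n_docs):
    """Positive PMI of two entities (an assumed measure, not the paper's)."""
    joint = pair_df[tuple(sorted((a, b)))]
    if joint == 0:
        return 0.0
    pmi = log(joint * n_docs / (entity_df[a] * entity_df[b]))
    return max(0.0, pmi)


def score_candidates(candidates, entity_df, pair_df, n_docs):
    """Score each candidate entity of one article by its strongest mined
    association with any other candidate in the same article."""
    unique = set(candidates)
    return {
        e: max(
            (association_strength(e, o, entity_df, pair_df, n_docs)
             for o in unique if o != e),
            default=0.0,
        )
        for e in unique
    }


# Toy collection: each inner list holds the entities found in one article.
collection = [
    ["BRCA1", "breast cancer", "tamoxifen"],
    ["BRCA1", "breast cancer"],
    ["TP53", "lung cancer"],
    ["tamoxifen", "lung cancer"],
]
entity_df, pair_df, n_docs = mine_associations(collection)
scores = score_candidates(collection[0], entity_df, pair_df, n_docs)
for entity, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{entity}: {score:.3f}")
```

In this toy data, the two entities that repeatedly co-occur across the collection receive a higher score than the third candidate, which is the behavior the hypothesis predicts. In a fuller system such a score would be one feature among others (for example, alongside keyword indicators in a learned ranker) rather than a decision on its own.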

Notes

1. Update information of CTD can be found at http://ctdbase.org/help/faq/;jsessionid=92111C8A6B218E4B2513C3B0BEE7E63F?p=6422623 (accessed September 2018).
2. A large number of biomedical scientists join the curation tasks of GHR; see http://ghr.nlm.nih.gov/ExpertReviewers (accessed September 2018).
3. OMIM updates association information on a daily basis; see http://www.omim.org/about (accessed September 2018).
4. MeSH (available at https://www.ncbi.nlm.nih.gov/mesh) is a controlled vocabulary for indexing biomedical articles.
5. SVM^rank is available at http://www.cs.cornell.edu/people/tj/svm_light/svm_rank.html.
6. In one of our previous projects (ID: MOST 105-2221-E-320-004), we employed articles from CTD as experimental data.
7. More information about the customized vocabulary is available at http://ctdbase.org/help/faq/;jsessionid=92111C8A6B218E4B2513C3B0BEE7E63F?p=6422623, http://ctdbase.org/help/geneDetailHelp.jsp, and http://ctdbase.org/help/diseaseDetailHelp.jsp (accessed May 2017).

Acknowledgment

This research was supported by the Ministry of Science and Technology, Taiwan (grant ID: MOST 107-2221-E-320-004).

Author information

Authors and Affiliations

Department of Medical Informatics, Tzu Chi University, Hualien, Taiwan
Rey-Long Liu

Corresponding author

Correspondence to Rey-Long Liu.

Editor information

Editors and Affiliations

Ton Duc Thang University, Ho Chi Minh City, Vietnam
Ngoc Thanh Nguyen
Bina Nusantara University, Jakarta, Indonesia
Ford Lumban Gaol
National University of Kaohsiung, Kaohsiung, Taiwan
Tzung-Pei Hong
Wrocław University of Science and Technology, Wrocław, Poland
Bogdan Trawiński

About this paper

Cite this paper

Liu, RL. (2019). Identification of Conclusive Association Entities by Biomedical Association Mining. In: Nguyen, N., Gaol, F., Hong, TP., Trawiński, B. (eds) Intelligent Information and Database Systems. ACIIDS 2019. Lecture Notes in Computer Science, vol 11431. Springer, Cham. https://doi.org/10.1007/978-3-030-14799-0_9

DOI: https://doi.org/10.1007/978-3-030-14799-0_9
Published: 07 March 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-14798-3
Online ISBN: 978-3-030-14799-0
eBook Packages: Computer Science, Computer Science (R0)
