Evaluating and Improving Annotation Tools for Medical Forms

  • Conference paper
Data Integration in the Life Sciences (DILS 2017)

Abstract

The annotation of entities with concepts from standardized terminologies and ontologies is of high importance in the life sciences to enhance semantic interoperability, information retrieval and meta-analysis. Unfortunately, medical documents such as clinical forms or electronic health records are still rarely annotated despite the availability of some tools to automatically determine possible annotations. In this study, we comparatively evaluate the quality of two such tools, cTAKES and MetaMap, as well as of a recently proposed annotation approach from our group for annotating medical forms. We also investigate how to improve the match quality of the tools by post-filtering computed annotations as well as by combining several annotation approaches.
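
The abstract names two ways to improve match quality, post-filtering and combining annotators, without giving details in this preview. As a rough illustration only, and not the authors' method, the Python sketch below combines the annotation sets of several tools by a vote threshold and scores the result against a reference mapping. The function names, the item IDs and the concept IDs are hypothetical placeholders.

```python
from collections import Counter

# Hypothetical data: each annotation is an (item_id, concept_id) pair.
# Neither the IDs nor the tool outputs come from the paper.
tool_a = {("q1", "C1"), ("q1", "C2"), ("q2", "C3")}
tool_b = {("q1", "C1"), ("q2", "C3"), ("q2", "C4")}
tool_c = {("q1", "C1"), ("q2", "C4"), ("q3", "C5")}
reference = {("q1", "C1"), ("q2", "C3"), ("q3", "C5")}


def combine(annotation_sets, min_votes):
    """Keep annotations proposed by at least `min_votes` of the tools."""
    votes = Counter(pair for pairs in annotation_sets for pair in set(pairs))
    return {pair for pair, count in votes.items() if count >= min_votes}


def precision_recall_f1(predicted, gold):
    """Score a set of predicted annotations against a reference set."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    tools = [tool_a, tool_b, tool_c]
    for label, min_votes in [("union", 1), ("majority", 2), ("intersection", 3)]:
        combined = combine(tools, min_votes)
        p, r, f = precision_recall_f1(combined, reference)
        print(f"{label:12s} precision={p:.2f} recall={r:.2f} f1={f:.2f}")
```

A threshold of 1 yields the union of all tool outputs, which favors recall. A threshold equal to the number of tools yields the intersection, which favors precision.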

Notes

  1. https://medical-data-models.org.

  2. https://sites.google.com/site/shareclefehealth/.

  3. Clinical Text Analysis and Knowledge Extraction System http://ctakes.apache.org.

  4. Unstructured Information Management Architecture [16] https://uima.apache.org.

  5. Medical Dictionary for Regulatory Activities.

  6. Open-access and Collaborative (OAC) Consumer Health Vocabulary (CHV).

  7. US Extension to Systematized Nomenclature of Medicine-Clinical Terms.

References

  1. Abedi, V., Zand, R., Yeasin, M., Faisal, F.E.: An automated framework for hypotheses generation using literature. BioData Min. 5(1), 13 (2012)

  2. Aronson, A.R., Lang, F.M.: An overview of MetaMap: historical perspective and recent advances. J. Am. Med. Inform. Assoc. 17(3), 229–236 (2010)

  3. Campos, D., Matos, S., Oliveira, J.: Current methodologies for biomedical named entity recognition. In: Biological Knowledge Discovery Handbook: Preprocessing, Mining, and Postprocessing of Biological Data, pp. 839–868 (2013)

  4. Christen, V., Groß, A., Rahm, E.: A reuse-based annotation approach for medical documents. In: Groth, P., Simperl, E., Gray, A., Sabou, M., Krötzsch, M., Lecue, F., Flöck, F., Gil, Y. (eds.) ISWC 2016. LNCS, vol. 9981, pp. 135–150. Springer, Cham (2016). doi:10.1007/978-3-319-46523-4_9

  5. Christen, V., Groß, A., Varghese, J., Dugas, M., Rahm, E.: Annotating medical forms using UMLS. In: Ashish, N., Ambite, J.-L. (eds.) DILS 2015. LNCS, vol. 9162, pp. 55–69. Springer, Cham (2015). doi:10.1007/978-3-319-21843-4_5

  6. Dai, M., Shah, N.H., Xuan, W., Musen, M.A., Watson, S.J., Athey, B.D., Meng, F., et al.: An efficient solution for mapping free text to ontology terms. In: AMIA Summit on Translational Bioinformatics 21 (2008)

  7. Doan, S., Conway, M., Phuong, T.M., Ohno-Machado, L.: Natural language processing in biomedicine: a unified system architecture overview. In: Trent, R. (ed.) Clinical Bioinformatics. Methods in Molecular Biology (Methods and Protocols), vol. 1168, pp. 275–294. Humana Press, New York (2014)

  8. Dugas, M., Neuhaus, P., Meidt, A., Doods, J., Storck, M., Bruland, P., Varghese, J.: Portal of medical data models: information infrastructure for medical research and healthcare. Database: The Journal of Biological Databases and Curation, p. bav121 (2016)

  9. Friedman, C., Shagina, L., Lussier, Y., Hripcsak, G.: Automated encoding of clinical documents based on natural language processing. J. Am. Med. Inform. Assoc. 11(5), 392–402 (2004)

  10. Funk, C., Baumgartner, W., Garcia, B., Roeder, C., Bada, M., Cohen, K.B., Hunter, L.E., Verspoor, K.: Large-scale biomedical concept recognition: an evaluation of current automatic annotators and their parameters. BMC Bioinform. 15(1), 1–29 (2014)

  11. Heinemann, F., Huber, T., Meisel, C., Bundschus, M., Leser, U.: Reflection of successful anticancer drug development processes in the literature. Drug Discovery Today 21(11), 1740–1744 (2016)

  12. Humphrey, S.M., Rogers, W.J., Kilicoglu, H., Demner-Fushman, D., Rindflesch, T.C.: Word sense disambiguation by selecting the best semantic type based on Journal Descriptor Indexing: Preliminary experiment. J. Am. Soc. Inform. Sci. Technol. 57(1), 96–113 (2006)

  13. LePendu, P., Iyer, S., Fairon, C., Shah, N.H., et al.: Annotation analysis for testing drug safety signals using unstructured clinical notes. J. Biomed. Semant. 3(S-1), S5 (2012)

  14. McCray, A.T., Srinivasan, S., Browne, A.C.: Lexical methods for managing variation in biomedical terminologies. In: Proceedings of the Annual Symposium on Computer Application in Medical Care, pp. 235–239 (1994)

  15. Oellrich, A., Collier, N., Smedley, D., Groza, T.: Generation of silver standard concept annotations from biomedical texts with special relevance to phenotypes. PLoS ONE 10(1), e0116040 (2015)

  16. Savova, G.K., Masanz, J.J., Ogren, P.V., Zheng, J., Sohn, S., Kipper-Schuler, K.C., Chute, C.G.: Mayo clinical Text Analysis and Knowledge Extraction System (cTAKES): architecture, component evaluation and applications. J. Am. Med. Inform. Assoc. 17(5), 507–513 (2010)

  17. Shah, N.H., Bhatia, N., Jonquet, C., Rubin, D., Chiang, A.P., Musen, M.A.: Comparison of concept recognizers for building the open biomedical annotator. BMC Bioinform. 10(Suppl. 9), S14 (2009)

  18. Sohn, S., Kocher, J.P.A., Chute, C.G., Savova, G.K.: Drug side effect extraction from clinical narratives of psychiatry and psychology patients. J. Am. Med. Inform. Assoc. 18(Suppl. 1), i144–i149 (2011)

  19. Sohn, S., Savova, G.K.: Mayo clinic smoking status classification system: extensions and improvements. In: AMIA Annual Symposium Proceedings, pp. 619–623 (2009)

  20. Tanenblatt, M.A., Coden, A., Sominsky, I.L.: The ConceptMapper approach to named entity recognition. In: Proceedings of 7th Language Resources and Evaluation Conference (LREC), pp. 546–551 (2010)

  21. Tseytlin, E., Mitchell, K., Legowski, E., Corrigan, J., Chavan, G., Jacobson, R.S.: NOBLE-Flexible concept recognition for large-scale biomedical natural language processing. BMC Bioinform. 17(1), 32 (2016)

  22. University of Pittsburgh: TIES-Text Information Extraction System (2017). http://ties.dbmi.pitt.edu/

  23. Zheng, J., Chapman, W.W., Miller, T.A., Lin, C., Crowley, R.S., Savova, G.K.: A system for coreference resolution for the clinical narrative. J. Am. Med. Inform. Assoc. 19(4), 660 (2012)

  24. Zou, Q., Chu, W.W., Morioka, C., Leazer, G.H., Kangarloo, H.: Indexfinder: a knowledge-based method for indexing clinical texts. In: AMIA Annual Symposium Proceedings, pp. 763–767 (2003)

Acknowledgment

This work is funded by the German Research Foundation (DFG) (grant RA 497/22-1, “ELISA - Evolution of Semantic Annotations”), the German Federal Ministry of Education and Research (BMBF) (grant 031L0026, “Leipzig Health Atlas”) and the National Research Fund Luxembourg (FNR) (grant C13/IS/5809134).

Author information

Corresponding author

Correspondence to Ying-Chi Lin.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Lin, YC. et al. (2017). Evaluating and Improving Annotation Tools for Medical Forms. In: Da Silveira, M., Pruski, C., Schneider, R. (eds) Data Integration in the Life Sciences. DILS 2017. Lecture Notes in Computer Science, vol 10649. Springer, Cham. https://doi.org/10.1007/978-3-319-69751-2_1

  • DOI: https://doi.org/10.1007/978-3-319-69751-2_1

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-69750-5

  • Online ISBN: 978-3-319-69751-2

  • eBook Packages: Computer Science, Computer Science (R0)
