Extracting Protein-Protein Interaction Sentences by Applying Rough Set Data Analysis

  • Conference paper
Rough Sets and Current Trends in Computing (RSCTC 2004)

Abstract

In this paper, we apply rough set data analysis to the problem of extracting protein-protein interaction sentences from biomedical literature. Our approach builds on decision rules based on protein names, interaction words, and their mutual positions in sentences. To broaden the set of potential interaction words, we develop a morphological model that generates spelling and inflection variants of the interaction words. We evaluate the proposed method on a hand-tagged dataset of 1894 sentences and show a precision-recall break-even performance of 79.8% using leave-one-out cross-validation.
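The abstract does not spell out the rule format or the evaluation details, so the following is only a minimal sketch of how such a pipeline might look, not the authors' implementation. It assumes each sentence is reduced to a pair of condition attributes (an interaction word stem and its position relative to the two protein names), that a rule is scored with the standard rough-set certainty factor (the fraction of matching sentences tagged as interactions), and that break-even is read off at the rank equal to the number of positive sentences. The toy table, the attribute values, and all function names are hypothetical.

```python
from collections import Counter

# Hypothetical toy decision table, NOT data from the paper.
# Condition: (interaction word stem, position of the word relative to the protein names).
# Decision: True if the sentence was hand-tagged as describing an interaction.
TABLE = [
    (("bind", "between"), True),
    (("bind", "between"), True),
    (("bind", "between"), False),
    (("bind", "before"), False),
    (("activate", "between"), True),
    (("activate", "after"), False),
    (("activate", "after"), False),
    (("activate", "after"), True),
]


def rule_certainty(table):
    """Certainty of the rule 'condition -> interaction' for every condition.

    certainty = (# rows with this condition tagged True) / (# rows with this condition)
    """
    support = Counter(cond for cond, _ in table)
    positive = Counter(cond for cond, dec in table if dec)
    return {cond: positive[cond] / support[cond] for cond in support}


def leave_one_out_scores(table):
    """Score each row with rules induced from all the other rows.

    A condition never seen in the remaining rows gets score 0.0 (an assumption).
    """
    scores = []
    for i, (cond, _) in enumerate(table):
        rest = table[:i] + table[i + 1:]
        scores.append(rule_certainty(rest).get(cond, 0.0))
    return scores


def break_even(scores, labels):
    """Precision at rank R, where R is the number of positive rows.

    At that cutoff precision and recall share the same numerator and
    denominator, so they are equal. Ties are broken by input order.
    """
    n_pos = sum(labels)
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = sum(label for _, label in ranked[:n_pos])
    return hits / n_pos


if __name__ == "__main__":
    labels = [dec for _, dec in TABLE]
    scores = leave_one_out_scores(TABLE)
    print("leave-one-out scores:", scores)
    print("break-even:", break_even(scores, labels))
```

On a table this small the leave-one-out estimate is noisy, since removing a single row can flip a rule's certainty from 0.5 to 1.0. The sketch is meant to show the mechanics only, not to reproduce the reported 79.8%.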

Copyright information

© 2004 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Ginter, F., Pahikkala, T., Pyysalo, S., Boberg, J., Järvinen, J., Salakoski, T. (2004). Extracting Protein-Protein Interaction Sentences by Applying Rough Set Data Analysis. In: Tsumoto, S., Słowiński, R., Komorowski, J., Grzymała-Busse, J.W. (eds) Rough Sets and Current Trends in Computing. RSCTC 2004. Lecture Notes in Computer Science, vol 3066. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-25929-9_99

  • DOI: https://doi.org/10.1007/978-3-540-25929-9_99

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-22117-3

  • Online ISBN: 978-3-540-25929-9

  • eBook Packages: Springer Book Archive
