Knowledge Management, Data Mining, and Text Mining in Medical Informatics

Medical Informatics

Part of the book series: Integrated Series in Information Systems (ISIS, volume 8)

Chapter Overview

In this chapter we provide a broad overview of selected knowledge management, data mining, and text mining techniques and their use in emerging biomedical applications, setting the context for subsequent chapters. We first introduce five major paradigms for machine learning and data analysis: probabilistic and statistical models, symbolic learning and rule induction, neural networks, evolution-based algorithms, and analytic learning and fuzzy logic. We also discuss their relevance and potential for biomedical research. We then review example applications of knowledge management, data mining, and text mining research in the following order: ontologies; knowledge management for health care, biomedical literature, heterogeneous databases, information visualization, and multimedia databases; and data and text mining for health care, literature, and biological data. We conclude the chapter with a discussion of privacy and confidentiality issues relevant to biomedical data mining.

Suggested Readings

  • Shortliffe, E. H. and Perreault, L. E. (2002). Medical Informatics: Computer Applications in Health Care and Biomedicine, Springer.

  • Baldi, P. and Brunak, S. (2000). Bioinformatics: The Machine Learning Approach, The MIT Press.

  • Mitchell, T. (1997). Machine Learning, McGraw Hill.

  • Chen, H., Lally, A. M., Zhu, B., and Chau, M. (2003). “HelpfulMed: Intelligent Searching for Medical Information over the Internet,” Journal of the American Society for Information Science and Technology, 54(7), 683–694.

  • Eisen, M., Spellman, P., Brown, P., and Botstein, D. (1998). “Cluster Analysis and Display of Genome-wide Expression Patterns,” Proceedings of the National Academy of Sciences, 95, 14863–14868.

  • Swanson, D. R. (1986). “Fish Oil, Raynaud’s Syndrome, and Undiscovered Public Knowledge,” Perspectives in Biology and Medicine, 30(1), 7–18.

  • Yandell, M. D. and Majoros, W. H. (2002). “Genomics and Natural Language Processing,” Nature Reviews Genetics, 3(8), 601–610.

© 2005 Springer Science+Business Media, Inc.

About this chapter

Cite this chapter

Chen, H., Fuller, S.S., Friedman, C., Hersh, W. (2005). Knowledge Management, Data Mining, and Text Mining in Medical Informatics. In: Chen, H., Fuller, S.S., Friedman, C., Hersh, W. (eds) Medical Informatics. Integrated Series in Information Systems, vol 8. Springer, Boston, MA. https://doi.org/10.1007/0-387-25739-X_1

  • DOI: https://doi.org/10.1007/0-387-25739-X_1

  • Publisher Name: Springer, Boston, MA

  • Print ISBN: 978-0-387-24381-8

  • Online ISBN: 978-0-387-25739-6

  • eBook Packages: Medicine, Medicine (R0)