The EMILE 4.1 Grammar Induction Toolbox

  • Conference paper
Grammatical Inference: Algorithms and Applications (ICGI 2002)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 2484)

Abstract

The EMILE 4.1 toolbox is intended to help researchers analyze the grammatical structure of free text. The basic theoretical concepts behind the EMILE algorithm are expressions and contexts. The idea is that expressions of the same syntactic type can be substituted for each other in the same context. By performing a large statistical cluster analysis on the sentences of the text, EMILE tries to identify traces of expressions that stand in this substitutability relation. If there is enough statistical evidence for a grammatical type, EMILE creates that type. Fundamental notions in the EMILE 4.1 algorithm are the so-called characteristic expressions and contexts. An expression of type T is characteristic for T if it only appears in a context of type T. The notion of characteristic context and expression boosts the learning capacities of the EMILE 4.1 algorithm. The EMILE algorithm is relatively scalable: it can easily analyze texts of up to 100,000 sentences on a workstation. The EMILE tool has been used in various domains, including biomedical research [Adriaans, 2001b] and the identification of ontologies and semantic learning [Adriaans et al., 1993].
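
The expression/context idea described in the abstract can be illustrated with a small sketch. This is not the EMILE 4.1 implementation: the function names, the toy sentences, and the `min_shared` evidence threshold are all assumptions made for illustration, and a simple shared-context count stands in for the statistical cluster analysis the abstract mentions.

```python
from collections import defaultdict


def context_expression_pairs(sentence):
    """Yield ((left, right), expression) for every contiguous word span.

    The context is the pair of word sequences to the left and right of
    the expression within the sentence.
    """
    words = sentence.split()
    n = len(words)
    for i in range(n):
        for j in range(i + 1, n + 1):
            expression = " ".join(words[i:j])
            context = (" ".join(words[:i]), " ".join(words[j:]))
            yield context, expression


def contexts_by_expression(sentences):
    """Map each expression to the set of contexts it was observed in."""
    table = defaultdict(set)
    for sentence in sentences:
        for context, expression in context_expression_pairs(sentence):
            table[expression].add(context)
    return table


def substitutable(table, a, b, min_shared=2):
    """Hypothetical evidence test: a and b share at least min_shared contexts."""
    shared = table.get(a, set()) & table.get(b, set())
    return len(shared) >= min_shared


# Toy corpus (invented for this example).
sentences = [
    "John likes tea",
    "Mary likes tea",
    "John drinks coffee",
    "Mary drinks coffee",
]
table = contexts_by_expression(sentences)
print(substitutable(table, "John", "Mary"))  # share two contexts
print(substitutable(table, "John", "tea"))   # share no context
```

In this toy corpus "John" and "Mary" occur in the same two contexts, so the sketch treats them as candidates for one type, while "John" and "tea" never share a context. A real system would need a more robust statistical criterion than a fixed count.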

References

  1. Adriaans, P. (2001a). Learning shallow context-free languages under simple distributions. In Copestake, A. and (eds.), K.V., editors, Algebras, Diagrams and Decisions in Language, Logic and Computation. CSLI/CUP

  2. Adriaans, P. (2001b). Semantic induction with EMILE, opportunities in bioinformatics. In Vet, P. v. d. e. a., editor, TWLT19, Information Extraction in Molecular Biology, Proceedings Twente Workshop on Language Technology 19, ESF Scientific Programme on Integrated Approaches for Functional Genomics, Enschede, pages 1–6. Universiteit Twente, Faculteit Informatica.

  3. Adriaans, P., Janssen, S., and Nomden, E. (1993). Effective identification of semantic categories in curriculum texts by means of cluster analysis. In Adriaans, P., editor, ECML-93, European Conference on Machine Learning, Workshop notes Machine Learning Techniques and Text Analysis, Vienna, Austria, pages 37–44. Department of Medical Cybernetics and Artificial Intelligence, University of Vienna in cooperation with the Austrian Research Institute for Artificial Intelligence.

  4. Adriaans, W. P. (1992). Language Learning from a Categorial Perspective. PhD thesis, Universiteit van Amsterdam.

Copyright information

© 2002 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Adriaans, P., Vervoort, M. (2002). The EMILE 4.1 Grammar Induction Toolbox. In: Adriaans, P., Fernau, H., van Zaanen, M. (eds) Grammatical Inference: Algorithms and Applications. ICGI 2002. Lecture Notes in Computer Science, vol 2484. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45790-9_24

  • DOI: https://doi.org/10.1007/3-540-45790-9_24

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-44239-4

  • Online ISBN: 978-3-540-45790-9

  • eBook Packages: Springer Book Archive
