Syntactic parsing as a knowledge acquisition problem

  • Long Papers
  • Conference paper
Knowledge Acquisition, Modeling and Management (EKAW 1997)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 1319)

Abstract

Corpus linguistics involves the construction and annotation of large databases of text from spoken and written language. These corpora have applications in NLP and in grammar teaching. Annotating them is an instance of the knowledge acquisition (KA) “bottleneck” in a new application area. This paper introduces parse checking as a KA problem and compares it to other tree-oriented KA methodologies such as laddering and clustering. It argues that corpus linguistics represents a significant application area for KA. The laddering tools discussed here have been used to process thousands of tree structures. The paper compares two tools in use on the ICE-GB corpus. One tool, ICE Tree II, exploits the structure of grammatical trees more fully than the other. Timing results show a main learning effect that dominates any difference between the tools. However, the more integrated tool reduces the scope for error.

Editor information

Enric Plaza, Richard Benjamins

Copyright information

© 1997 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Wallis, S., Nelson, G. (1997). Syntactic parsing as a knowledge acquisition problem. In: Plaza, E., Benjamins, R. (eds) Knowledge Acquisition, Modeling and Management. EKAW 1997. Lecture Notes in Computer Science, vol 1319. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0026792

  • DOI: https://doi.org/10.1007/BFb0026792

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-63592-5

  • Online ISBN: 978-3-540-69606-3

  • eBook Packages: Springer Book Archive
