Syntactic parsing as a knowledge acquisition problem

Wallis, Sean; Nelson, Gerry

doi:10.1007/BFb0026792

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 1319)

Included in the following conference series:

International Conference on Knowledge Engineering and Knowledge Management

Abstract

Corpus linguistics involves constructing and annotating large databases of text drawn from spoken and written language. These corpora have applications in NLP and in the teaching of grammar. Annotating them raises the knowledge acquisition (KA) “bottleneck” in a new application area. This paper introduces parse checking as a KA problem and compares it to other tree-oriented KA methodologies such as laddering and clustering. It argues that corpus linguistics represents a significant application area for KA. The laddering tools discussed here have been used to process thousands of tree structures. The paper compares two tools in use on the ICE-GB corpus. One tool, ICE Tree II, exploits the structure of grammatical trees more fully than the other. Timing results show a main learning effect that dominates any difference between the tools. However, the more integrated tool reduces the scope for error.

Author information

Authors and Affiliations

Department of English (Survey of English Usage), University College London, Gower Street, London, UK
Sean Wallis & Gerry Nelson

Editor information

Enric Plaza, Richard Benjamins

About this paper

Cite this paper

Wallis, S., Nelson, G. (1997). Syntactic parsing as a knowledge acquisition problem. In: Plaza, E., Benjamins, R. (eds) Knowledge Acquisition, Modeling and Management. EKAW 1997. Lecture Notes in Computer Science, vol 1319. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0026792

DOI: https://doi.org/10.1007/BFb0026792
Published: 17 June 2005
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-63592-5
Online ISBN: 978-3-540-69606-3
eBook Packages: Springer Book Archive
