Abstract
The aim of this paper is to present the design of a partial syntactic annotation of the IPI PAN Corpus of Polish [22] and the corresponding extension of the corpus search engine Poliqarp [25,12], developed at the Institute of Computer Science PAS and currently employed in Polish and Portuguese corpora projects. In particular, we will argue for the need to distinguish between, and represent both, syntactic and semantic heads, and we will sketch the representation of coordination, an area traditionally controversial in both theoretical and computational linguistics. The annotation is designed in a way intended to maximise the usefulness of the resulting corpus for the task of automatic valence acquisition.
References
Barreto, F., Branco, A., Ferreira, E., Mendes, A., Nascimento, M.F., Nunes, F., Silva, J.: Open resources and tools for the shallow processing of Portuguese: The TagShare project. In: Proceedings of LREC (2006)
Beavers, J., Sag, I.A.: Coordinate ellipsis and apparent non-constituent coordination. In: Müller, S. (ed.) Proceedings of the HPSG04 Conference, pp. 48–69. CSLI Publications, Stanford (2004)
Bloomfield, L.: Language. Holt, New York (1933)
Böhmová, A., Hajič, J., Hajičová, E., Hladká, B.: The Prague Dependency Treebank: Three-level annotation scenario. In: Abeillé, A. (ed.) Treebanks: Building and Using Parsed Corpora, pp. 103–127. Kluwer, Dordrecht (2003)
Christ, O.: A modular and flexible architecture for an integrated corpus query system. In: COMPLEX'94, Budapest (1994)
Covington, M.A.: A 700-year-old argument for a syntactic transformation. http://www.ai.uga.edu/mc/trans700.html
Fast, J., Przepiórkowski, A.: Automatic extraction of Polish verb subcategorization: An evaluation of common statistics. In: Vetulani, Z. (ed.) Proceedings of the 2nd Language & Technology Conference, Poznań, Poland, pp. 191–195 (2005)
Fillmore, C.J., Baker, C.F., Sato, H.: Seeing arguments through transparent structures. In: Proceedings of LREC 2002, Las Palmas, Canary Islands, Spain, pp. 787–791. ELRA (2002)
Fillmore, C.J., Johnson, C.R., Petruck, M.R.L.: Background to FrameNet. International Journal of Lexicography 16(3), 235–250 (2003)
Huang, C.-R., Keh-Jiann, C., Feng-Yi, C., Keh-Jiann, C., Zhao-Ming, G., Kuang-Yu, C.: Sinica treebank: Design criteria, annotation guidelines, and on-line interface. In: Proceedings of 2nd Chinese Language Processing Workshop (Held in conjunction with the 38th Annual Meeting of the Association for Computational Linguistics, ACL-2000), Hong Kong, pp. 29–37 (2000)
Ide, N., Bonhomme, P., Romary, L.: XCES: An XML-based standard for linguistic corpora. In: Proceedings of the Linguistic Resources and Evaluation Conference, Athens, Greece, pp. 825–830 (2000)
Janus, D., Przepiórkowski, A.: Poliqarp 1.0: Some technical aspects of a linguistic search engine for large corpora. In: Waliński, J., Kredens, K., Goźdź-Roszkowski, S. (eds.) The proceedings of Practical Applications of Linguistic Corpora 2005, Peter Lang, Frankfurt am Main (2006)
Kallas, K.: Składnia współczesnych polskich konstrukcji współrzędnych. Wydawnictwo Uniwersytetu Mikołaja Kopernika, Toruń (1993)
Kosek, I.: Przyczasownikowe frazy przyimkowo-nominalne w zdaniach współczesnego języka polskiego. Wydawnictwo Uniwersytetu Warmińsko-Mazurskiego, Olsztyn (1999)
Lezius, W.: TIGERSearch – ein Suchwerkzeug für Baumbanken. In: Busemann, S. (ed.) Proceedings der 6. Konferenz zur Verarbeitung natürlicher Sprache (KONVENS 2002), Saarbrücken (2002)
Mel'čuk, I.A.: Levels of dependency in linguistic description: concepts and problems. In: Ágel, V., Eichinger, L., Eroms, H.-W., Hellwig, P., Heringer, H.-J., Lobin, H. (eds.) Dependenz und Valenz: Ein Internationales Handbuch Der Zeitgenössischen Forschung, pp. 188–229. De Gruyter, Berlin (2003)
Monz, C., de Rijke, M.: Tequesta: The University of Amsterdam's textual question answering system. In: Proceedings of Tenth Text Retrieval Conference (TREC-10), pp. 513–522 (2001)
Nivre, J.: Theory-supporting treebanks. In: Nivre, J., Hinrichs, E. (eds.) Proceedings of the Second Workshop on Treebanks and Linguistic Theories (TLT2003), Växjö, Sweden, pp. 117–128 (2003)
Pollard, C., Sag, I.A.: Information-Based Syntax and Semantics, vol. 1: Fundamentals. CSLI Lecture Notes, vol. 13. CSLI Publications, Stanford (1987)
Pollard, C., Sag, I.A.: Head-driven Phrase Structure Grammar. Chicago University Press, Chicago (1994)
Przepiórkowski, A.: Case Assignment and the Complement-Adjunct Dichotomy: A Non-Configurational Constraint-Based Approach. Ph.D. dissertation, Universität Tübingen (1999)
Przepiórkowski, A.: The IPI PAN Corpus: Preliminary version. Institute of Computer Science, Polish Academy of Sciences, Warsaw (2004)
Przepiórkowski, A.: On heads and coordination in a partial treebank. In: Hajič, J., Nivre, J. (eds.) Proceedings of the TLT 2006, Prague, pp. 163–174 (2006)
Przepiórkowski, A., Fast, J.: Baseline experiments in the extraction of Polish valence frames. In: Kłopotek, M.A., Wierzchoń, S.T., Trojanowski, K. (eds.) Intelligent Information Processing and Web Mining, Advances in Soft Computing, pp. 511–520. Springer, Berlin (2005)
Przepiórkowski, A., Krynicki, Z., Dębowski, Ł., Woliński, M., Janus, D., Bański, P.: A search tool for corpora with positional tagsets and ambiguities. In: Proceedings of LREC 2004, Lisbon, pp. 1235–1238. ELRA (2004)
Przepiórkowski, A., Woliński, M.: A flexemic tagset for Polish. In: Proceedings of Morphological Processing of Slavic Languages, EACL 2003, Budapest, pp. 33–40 (2003)
Przepiórkowski, A., Woliński, M.: The unbearable lightness of tagging: A case study in morphosyntactic tagging of Polish. In: Proceedings of the LINC-03, EACL 2003, pp. 109–116 (2003)
Sag, I.A., Gazdar, G., Wasow, T., Weisler, S.: Coordination and how to distinguish categories. Natural Language and Linguistic Theory 3, 117–171 (1985)
Saloni, Z., Świdziński, M.: Składnia współczesnego języka polskiego, 4th (changed) edn. Wydawnictwo Naukowe PWN, Warsaw (1998)
Sgall, P., Hajičová, E., Panevová, J.: The Meaning of the Sentence in Its Semantic and Pragmatic Aspects. Reidel, Dordrecht (1986)
Silberztein, M.: Finite-state description of the French determiner system. French Language Studies 13, 221–246 (2003)
Świdziński, M.: Realizacje zdaniowe podmiotu-mianownika, czyli o strukturalnych ograniczeniach selekcyjnych. In: Markowski, A. (ed.) Opisać słowa, pp. 188–201. Dom Wydawniczy Elipsa (1992)
Tesnière, L.: Éléments de Syntaxe Structurale. Klincksieck, Paris (1959)
Watson, R., Carroll, J., Briscoe, T.: Efficient extraction of grammatical relations. In: Proceedings of the Ninth International Workshop on Parsing Technology, Vancouver, British Columbia, pp. 160–170. Association for Computational Linguistics (2005)
Wright, A., Kathol, A.: When a head is not a head: A constructional approach to exocentricity in English. In: Kim, J.-B., Wechsler, S. (eds.) Proceedings of the 9th International Conference on Head-Driven Phrase Structure Grammar, pp. 373–389. CSLI Publications, Stanford (2003)
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Przepiórkowski, A. (2007). On Heads and Coordination in Valence Acquisition. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2007. Lecture Notes in Computer Science, vol 4394. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-70939-8_5
DOI: https://doi.org/10.1007/978-3-540-70939-8_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-70938-1
Online ISBN: 978-3-540-70939-8
eBook Packages: Computer Science, Computer Science (R0)