Abstract
The C-ORAL-BRASIL is a corpus of spontaneous Brazilian Portuguese speech, representative of the diatopy of the state of Minas Gerais (primarily the metropolitan area of the capital city, Belo Horizonte). The corpus was compiled following the same architecture and segmentation criteria adopted by the C-ORAL-ROM [1], and it uses the same alignment software, WinPitch [2]. It comprises 139 informal speech texts totalling 208,130 words and 21:08:52 hours of recording (6.1 GB of wav files). The mean number of words per text is approximately 1,500. The recordings were made with high-resolution, non-invasive wireless equipment, generally with clip-on, monodirectional microphones and a mixer whenever there were more than two interactants; on a few occasions omnidirectional microphones were used. The texts are transcribed following the CHAT format [3], implemented for prosodic annotation [4]. The main goal of the corpus architecture is to document diaphasic and diastratic variation in Brazilian Portuguese speech.
References
Cresti, E., Moneglia, M.: C-ORAL-ROM. Integrated Reference Corpora for Spoken Romance Languages. John Benjamins, Amsterdam/Philadelphia (2005)
Martin, P.: http://www.winpitch.com
Raso, T., Mello, H. (eds.): O Corpus C-ORAL-BRASIL. Editora UFMG, Belo Horizonte (2012)
MacWhinney, B.J.: The CHILDES Project. Tools for Analyzing Talk, vol. 2. Lawrence Erlbaum, Mahwah (2000)
Moneglia, M., Cresti, E.: L'intonazione e i criteri di trascrizione del parlato adulto e infantile. In: Bortolini, U., Pizzuto, E. (eds.) Il Progetto CHILDES-Italia: Contributi di ricerca sulla lingua italiana. Del Cerro, Pisa (1997)
Nencioni, G.: Di scritto e di parlato: Discorsi Linguistici. Zanichelli, Bologna (1983)
Bick, E.: The Parsing System Palavras - Automatic Grammatical Analysis of Portuguese in a Constraint Grammar Framework. Aarhus University Press, Aarhus (2000)
Cresti, E.: Corpus di italiano parlato, vol. 2. Accademia della Crusca, Firenze (2000)
Austin, J.: How to do things with words. Oxford University Press, Oxford (1962)
Moneglia, M.: Spoken Corpora and Pragmatics. In: Revista Brasileira de Linguística Aplicada, pp. 479–520 (2011)
Fleiss, J.L.: Measuring nominal scale agreement among many raters. Psychological Bulletin 76, 378–382 (1971)
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Raso, T., Mello, H. (2012). The C-ORAL-BRASIL I: Reference Corpus for Informal Spoken Brazilian Portuguese. In: Caseli, H., Villavicencio, A., Teixeira, A., Perdigão, F. (eds) Computational Processing of the Portuguese Language. PROPOR 2012. Lecture Notes in Computer Science, vol 7243. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-28885-2_40
DOI: https://doi.org/10.1007/978-3-642-28885-2_40
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-28884-5
Online ISBN: 978-3-642-28885-2
eBook Packages: Computer Science, Computer Science (R0)