Abstract
We consider the task of learning three verb classes: raising (e.g., seem), control (e.g., try) and ambiguous verbs that can be used either way (e.g., begin). These verbs occur in sentences with similar surface forms, but have distinct syntactic and semantic properties. They present a conundrum because it would seem that their meaning must be known to infer their syntax, and that their syntax must be known to infer their meaning. Previous research with human speakers pointed to the usefulness of two cues found in sentences containing these verbs: animacy of the sentence subject and eventivity of the predicate embedded under the main verb. We apply a variety of algorithms to this classification problem to determine whether the primary linguistic data is sufficiently rich in this kind of information to enable children to resolve the conundrum, and whether this information can be extracted in a way that reflects distinctive features of child language acquisition. The input consists of counts of how often various verbs occur with animate subjects and eventive predicates in two corpora of naturalistic speech, one adult-directed and the other child-directed. Simple proportions of the semantic frames are insufficient to separate the classes. A Bayesian attachment model designed for a related language learning task performs poorly. A hierarchical Bayesian model (HBM) gives significantly better results. We also develop and test a saturating accumulator that can successfully distinguish the three classes of verbs. Because the HBM and the saturating accumulator both succeed at the classification task using biologically realistic calculations, we conclude that subject animacy and predicate eventivity provide sufficient information to bootstrap the process of learning the syntax and semantics of these verbs.
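As a rough illustration of the kind of input described above, the following Python sketch tallies, for each verb, how often it occurs with each combination of subject animacy and predicate eventivity, and then converts the tallies to proportions. The observations are invented toy data, not counts from the corpora used in the paper, and treating a "semantic frame" as one of the four animacy-by-eventivity combinations is an assumption made here for illustration. The function names are likewise hypothetical.

```python
from collections import Counter

# Hypothetical toy observations: (verb, subject_is_animate, predicate_is_eventive).
# Invented for illustration only; NOT taken from the paper's corpora.
observations = [
    ("seem", False, False),
    ("seem", True, False),
    ("seem", False, False),
    ("try", True, True),
    ("try", True, True),
    ("begin", True, True),
    ("begin", False, True),
]

# Assumed frame inventory: every combination of animacy and eventivity.
FRAMES = [(animate, eventive)
          for animate in (True, False)
          for eventive in (True, False)]


def frame_counts(obs):
    """Count occurrences of each (animate, eventive) frame per verb."""
    counts = {}
    for verb, animate, eventive in obs:
        counts.setdefault(verb, Counter())[(animate, eventive)] += 1
    return counts


def frame_proportions(counts):
    """Normalize each verb's frame counts so they sum to 1."""
    props = {}
    for verb, c in counts.items():
        total = sum(c.values())
        # Counter returns 0 for frames the verb never occurred with.
        props[verb] = {frame: c[frame] / total for frame in FRAMES}
    return props


counts = frame_counts(observations)
proportions = frame_proportions(counts)
for verb in sorted(proportions):
    print(verb, proportions[verb])
```

A table of proportions like this corresponds only to the simplest representation mentioned in the abstract, which is reported to be insufficient on its own. The sketch does not implement the hierarchical Bayesian model or the saturating accumulator.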
About this article
Cite this article
Mitchener, W.G., Becker, M. Computational Models of Learning the Raising-Control Distinction. Res on Lang and Comput 8, 169–207 (2010). https://doi.org/10.1007/s11168-011-9073-6
DOI: https://doi.org/10.1007/s11168-011-9073-6