The Distribution of Data in Word Lists and its Impact on the Subgrouping of Languages

  • Conference paper
Data Analysis, Machine Learning and Applications

Abstract

This work reveals the reason for the bias in the separation levels computed for natural languages with only a small number of residues, as opposed to stochastically normally distributed test cases like those presented in Holm (2007a). It is shown how these biased data can be correctly projected to true separation levels. The result is a partly new chain of separation for the main Indo-European branches that fits well with the grammatical facts as well as with their geographical distribution. In particular, it strongly demonstrates that the Anatolian languages were not the first to split off, and thereby refutes the Indo-Hittite hypothesis.

References

  • CYSOUW, M. (2004): email.eva.mpg.de/ cysouw/pdf/cysouwWIP.pdf

  • CYSOUW, M., WICHMANN, S. and KAMHOLZ, D. (2006): A critique of the separation base method for genealogical subgrouping, with data from Mixe-Zoquean. Journal of Quantitative Linguistics, 13(2-3), 225-264.

  • EMBLETON, S.M. (1986): Statistics in historical linguistics [Quantitative Linguistics 30]. Brockmeyer, Bochum.

  • GRZYBEK, P. and KÖHLER, R. (Eds.) (2007): Exact Methods in the Study of Language and Text [Quantitative Linguistics 62]. De Gruyter, Berlin.

  • HAMP, E.P. (1998): “Whose were the Tocharians? Linguistic subgrouping and Diagnostic Idiosyncrasy” The Bronze Age and Early Iron Age Peoples of Eastern Central Asia. Vol. 1:307-46. Edited by Victor H. Mair. Washington DC: Institute for the Study of Man.

  • HOLM, H.J. (2000): Genealogy of the Main Indo-European Branches Applying the Separation Base Method. Journal of Quantitative Linguistics, 7(2), 73-95.

  • HOLM, H.J. (2003): The proportionality trap; or: What is wrong with lexicostatistics? Indogermanische Forschungen 108, 38-46.

  • HOLM, H.J. (2007a): Requirements and Limits of the Separation Level Recovery Method in Language Subgrouping. In: GRZYBEK, P. and KÖHLER, R. (Eds), Viribus Quantitatis. Exact Methods in the Study of Language and Text. Festschrift Gabriel Altmann zum 75. Geburtstag. Quantitative Linguistics 62. De Gruyter, Berlin.

  • HOLM, H.J. (to appear 2007b): The new Arboretum of Indo-European “Trees”. Journal of Quantitative Linguistics, 14(2).

  • KENDALL, D.G. (1950): Discussion following Ross, A.S.C., Philological Probability Problems. Journal of the Royal Statistical Society, Ser. B 12, p. 49f.

  • POKORNY, J. (1959): Indogermanisches etymologisches Wörterbuch. Francke, Bern.

  • RIX, H., KÜMMEL, M., ZEHNDER, Th., LIPP, R. and SCHIRMER, B. (2001): Lexikon der indogermanischen Verben. Die Wurzeln und ihre Primärstammbildungen. 2. Aufl. Reichert, Wiesbaden.

  • SWOFFORD, D.L., OLSEN, G.J., WADDELL, P.J. and HILLIS, D.M. (1996): “Phylogenetic Inference”. In: HILLIS, D.M., MORITZ, C. and MABLE, B.K. (Eds). Molecular Systematics, Second Edition. Sinauer Associates, Sunderland MA, Chapter 11.

  • WALDE, A. and POKORNY, J. (Ed.) (1926-1932): Vergleichendes Wörterbuch der indogermanischen Sprachen. de Gruyter, Berlin.

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

Cite this paper

Holm, H.J. (2008). The Distribution of Data in Word Lists and its Impact on the Subgrouping of Languages. In: Preisach, C., Burkhardt, H., Schmidt-Thieme, L., Decker, R. (eds) Data Analysis, Machine Learning and Applications. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-78246-9_74
