Using Stochastic Approximation Techniques to Efficiently Construct Confidence Intervals for Heritability

  • Conference paper
Research in Computational Molecular Biology (RECOMB 2017)

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 10229)

Abstract

Estimation of heritability is an important task in genetics. The use of linear mixed models (LMMs) to determine narrow-sense SNP-heritability and related quantities has received much recent attention, due to its ability to account for variants with small effect sizes. Typically, heritability estimation under LMMs uses the restricted maximum likelihood (REML) approach. Uncertainty in REML estimates is commonly reported using standard errors (SEs), which rely on asymptotic properties. However, the underlying asymptotic assumptions are often violated because of the bounded parameter space, statistical dependencies, and limited sample size, leading to biased estimates and inflated or deflated confidence intervals. In addition, for larger datasets (e.g., tens of thousands of individuals), computing SEs can itself take considerable time, as it requires expensive matrix inversions and multiplications.

Here, we present FIESTA (Fast confidence IntErvals using STochastic Approximation), a method for constructing accurate confidence intervals (CIs). FIESTA is based on parametric bootstrap sampling, and therefore avoids unjustified assumptions on the distribution of the heritability estimator. FIESTA uses stochastic approximation techniques, which accelerate the construction of CIs by several orders of magnitude, compared to previous approaches as well as to the analytical approximation used by SEs. FIESTA builds accurate CIs rapidly, e.g., requiring only several seconds for datasets of tens of thousands of individuals, making FIESTA a very fast solution to the problem of building accurate CIs for heritability for all dataset sizes.
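The general idea of combining parametric bootstrap sampling with stochastic approximation can be illustrated on a toy problem. The sketch below is not FIESTA itself and does not fit an LMM: it assumes a hypothetical one-parameter model in which the data are n draws from N(theta, 1) and the estimator is the sample mean, standing in for the REML heritability estimator. It runs a Robbins-Monro search, in the style of Garthwaite and Buckland, for the lower limit theta_L at which the probability of simulating an estimate above the observed one equals alpha/2. The function names, the step constant, and the offset are illustrative assumptions chosen by hand for this toy.

```python
import random
from statistics import NormalDist, fmean


def simulate_estimate(theta, n, rng):
    """One parametric bootstrap draw under the toy model (an assumption):
    simulate n observations from N(theta, 1) and re-estimate theta by the
    sample mean, which stands in for a real estimator such as REML."""
    return fmean(rng.gauss(theta, 1.0) for _ in range(n))


def robbins_monro_lower_limit(theta_obs, n, alpha=0.05, step=7.0,
                              offset=50, iterations=20000, seed=0):
    """Search for theta_L with P_{theta_L}(estimate > theta_obs) = alpha / 2.

    `step` and `offset` are hand-picked tuning constants for this toy,
    not values taken from the paper.
    """
    rng = random.Random(seed)
    tail = alpha / 2.0
    theta = theta_obs  # start the search at the observed estimate
    for i in range(1, iterations + 1):
        exceeded = simulate_estimate(theta, n, rng) > theta_obs
        # Move up slightly when the draw does not exceed the observed value,
        # and down more strongly when it does. The expected move is zero
        # exactly when the exceedance probability equals `tail`.
        theta += step / (i + offset) * (tail - (1.0 if exceeded else 0.0))
    return theta


theta_obs, n = 0.4, 25
approx = robbins_monro_lower_limit(theta_obs, n)
# For this toy model the limit is known in closed form, which gives a check.
exact = theta_obs - NormalDist().inv_cdf(0.975) / n ** 0.5
print(f"stochastic approximation: {approx:.3f}, closed form: {exact:.3f}")
```

Each iteration uses a single simulated estimate rather than a full bootstrap sample per candidate parameter value, which is where this kind of scheme saves work. The upper limit can be found in the same way by mirroring the tail condition.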

Acknowledgements

The authors would like to thank David Steinberg. R.S. is supported by the Colton Family Foundation. This study was supported in part by a fellowship from the Edmond J. Safra Center for Bioinformatics at Tel Aviv University to R.S. The Northern Finland Birth Cohort data were obtained from dbGaP: phs000276.v2.p1. This study makes use of data generated by the Wellcome Trust Case Control Consortium. A full list of the investigators who contributed to the generation of the data is available from www.wtccc.org.uk. Funding for the project was provided by the Wellcome Trust under award 076113.

Author information

Corresponding author

Correspondence to Regev Schweiger.

Appendix

The supplementary material, including additional figures, is available at https://github.com/cozygene/albi.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Schweiger, R., Fisher, E., Rahmani, E., Shenhav, L., Rosset, S., Halperin, E. (2017). Using Stochastic Approximation Techniques to Efficiently Construct Confidence Intervals for Heritability. In: Sahinalp, S. (eds) Research in Computational Molecular Biology. RECOMB 2017. Lecture Notes in Computer Science, vol 10229. Springer, Cham. https://doi.org/10.1007/978-3-319-56970-3_15

  • DOI: https://doi.org/10.1007/978-3-319-56970-3_15

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-56969-7

  • Online ISBN: 978-3-319-56970-3

  • eBook Packages: Computer Science (R0)
