Abstract
Current technologies generate a very large number of single nucleotide polymorphism (SNP) genotype measurements in case-control studies. The resulting multiple testing problem can be ameliorated by considering candidate gene regions. The minPtest R package provides the first widely accessible implementation of a gene region-level summary for each candidate gene using the min \(P\) test, a permutation-based method that can be built on different univariate tests per SNP. The package brings together three kinds of tests that were previously scattered over several R packages, and automatically selects the most appropriate one for the study design at hand. The implementation integrates two different parallel computing packages, so that available computing resources can be used to obtain results quickly.
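The general idea of a min \(P\) permutation test can be illustrated with a minimal sketch. This is not the minPtest package's API or implementation: it is written in Python rather than R, and the function names (`trend_p_value`, `min_p`, `min_p_test`) are hypothetical. It also makes two assumptions of its own: the univariate test is a 1-df trend-type statistic \(N r^2\) on genotypes coded 0/1/2, and the permutation p-value uses the common \((b + 1)/(B + 1)\) convention.

```python
import math
import random


def trend_p_value(genotypes, status):
    """P-value of a 1-df trend statistic N * r^2 for one SNP.

    genotypes: minor-allele counts (0, 1, 2); status: 0 = control, 1 = case.
    Assumption: an additive trend-type test stands in for the univariate test.
    """
    n = len(genotypes)
    mean_g = sum(genotypes) / n
    mean_y = sum(status) / n
    s_gy = sum((g - mean_g) * (y - mean_y) for g, y in zip(genotypes, status))
    s_gg = sum((g - mean_g) ** 2 for g in genotypes)
    s_yy = sum((y - mean_y) ** 2 for y in status)
    if s_gg == 0 or s_yy == 0:
        return 1.0  # monomorphic SNP or only one class: no evidence
    chi2 = n * s_gy ** 2 / (s_gg * s_yy)
    return math.erfc(math.sqrt(chi2 / 2))  # upper tail of chi-square(1)


def min_p(snps, status):
    """Smallest univariate p-value over all SNPs of one gene region."""
    return min(trend_p_value(g, status) for g in snps)


def min_p_test(snps, status, n_perm=1000, seed=0):
    """Return (observed min P, permutation p-value) for one gene region.

    Case-control labels are shuffled jointly for all SNPs, so the
    correlation between SNPs in the region is preserved.
    Assumption: the (hits + 1) / (n_perm + 1) convention is used.
    """
    rng = random.Random(seed)
    observed = min_p(snps, status)
    permuted = list(status)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(permuted)
        if min_p(snps, permuted) <= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)
```

Calling `min_p_test(snps, status)` with `snps` given as one genotype list per SNP returns a single region-level p-value per gene. Because the permutations are independent of each other, the loop is the natural place to distribute work across cores, which is presumably where parallel computing pays off.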
Hieke, S., Binder, H., Nieters, A. et al. minPtest: a resampling based gene region-level testing procedure for genetic case-control studies. Comput Stat 29, 51–63 (2014). https://doi.org/10.1007/s00180-012-0391-4
DOI: https://doi.org/10.1007/s00180-012-0391-4