Abstract
The US National Toxicology Program recommends the parametric multiple comparison procedures of Dunnett and Williams for the evaluation of repeated toxicity studies. For endpoints where both increasing and decreasing effects are of toxicological relevance, we recommend using the two-sided Dunnett test exclusively. For the many other endpoints where, a priori, only one direction is of toxicological relevance, we recommend combining the Dunnett and Williams tests. In particular, we recommend the so-called Umbrella-protected Williams test, which offers insights for all relevant monotone and non-monotone alternatives while suffering only a marginal loss in power compared with the Dunnett test. We illustrate the power difference analytically and, using three real data examples, compare the approach with the available alternative tests for different endpoint types. Nonparametric tests, which are suitable for the evaluation of skewed or score data, are also considered. Particular attention is given to the different interpretations of the findings revealed by the different tests. The R programs used for the analyses are provided.
Acknowledgments
This work was supported in part by the German Science Foundation grant DfG-HO1687 and the EC FP7 program project ESNATS for the last author (LAH).
Conflict of interest
The authors declare that there is no conflict of interest.
Appendix
In this section, we provide the simple R code used to analyze the examples. Comments are preceded by # and are given for every command used.
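As an illustration of how such an analysis might look, the following is a minimal sketch and not the authors' original code. It assumes the multcomp package is installed and uses simulated data. The names `tox`, `dose`, and `resp` are hypothetical, and the `"UmbrellaWilliams"` contrast type of multcomp is used here as a stand-in that may differ in detail from the procedure described in the article.

```r
# Load the multcomp package for multiple contrast tests (assumed to be installed)
library(multcomp)

# Fix the random seed so the simulated data are reproducible
set.seed(1)

# Simulated data (hypothetical): a control and three dose groups, 10 animals each
tox <- data.frame(
  dose = factor(rep(c("0", "10", "50", "100"), each = 10),
                levels = c("0", "10", "50", "100")),
  resp = rnorm(40, mean = rep(c(10, 10.5, 11.5, 12), each = 10), sd = 1)
)

# One-way ANOVA model with dose as a factor; the first level is the control
fit <- aov(resp ~ dose, data = tox)

# Two-sided Dunnett test: each dose group versus control
dunnett <- glht(fit, linfct = mcp(dose = "Dunnett"))
summary(dunnett)

# One-sided Williams-type contrast test for an increasing trend
williams <- glht(fit, linfct = mcp(dose = "Williams"),
                 alternative = "greater")
summary(williams)

# Umbrella-protected Williams-type contrasts, also covering downturns at high doses
umbrella <- glht(fit, linfct = mcp(dose = "UmbrellaWilliams"),
                 alternative = "greater")
summary(umbrella)

# Simultaneous one-sided confidence limits for the umbrella-protected contrasts
confint(umbrella)
```

For an endpoint where a decrease is the relevant direction, `alternative = "less"` would be used instead. The one-sided direction is a choice made before looking at the data, in line with the a priori distinction drawn in the abstract.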
Cite this article
Jaki, T., Hothorn, L.A. Statistical evaluation of toxicological assays: Dunnett or Williams test—take both. Arch Toxicol 87, 1901–1910 (2013). https://doi.org/10.1007/s00204-013-1065-x
DOI: https://doi.org/10.1007/s00204-013-1065-x