A semi-parametric method for transforming data to normality

Koekemoer, Gerhard; Swanepoel, Jan W. H.

doi:10.1007/s11222-008-9053-3

Published: 14 February 2008

Statistics and Computing, Volume 18, pages 241–257 (2008)

Abstract

A non-parametric transformation function is introduced to transform data to any continuous distribution. When transformation of data to normality is desired, the use of a suitable parametric pre-transformation function improves the performance of the proposed non-parametric transformation function. The resulting semi-parametric transformation function is shown empirically, via a Monte Carlo study, to perform at least as well as any parametric transformation currently available in the literature.
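
The two-stage idea in the abstract can be illustrated with a minimal sketch. The full text is not reproduced here, so every specific below is an assumption rather than the authors' method: the parametric pre-transformation is taken to be a Box–Cox transformation with a fixed (not estimated) parameter `lam`, the non-parametric step is taken to be $\Phi^{-1}(\hat F_h(x))$ with $\hat F_h$ a Gaussian-kernel distribution function estimate, and the bandwidth `h` defaults to an ad hoc rule of thumb. The function names are hypothetical.

```python
import math
from statistics import NormalDist, stdev

STD_NORMAL = NormalDist()  # standard normal: provides cdf() and inv_cdf()


def box_cox(x, lam):
    """Box-Cox power transformation for x > 0 (lam is fixed here, not estimated)."""
    if x <= 0:
        raise ValueError("Box-Cox requires strictly positive data")
    return math.log(x) if lam == 0 else (x ** lam - 1.0) / lam


def kernel_cdf(x, sample, h):
    """Gaussian-kernel estimate of the distribution function at x."""
    return sum(STD_NORMAL.cdf((x - xi) / h) for xi in sample) / len(sample)


def to_normal(sample, lam=0.0, h=None):
    """Pre-transform with Box-Cox, then map each point through Phi^{-1}(F_hat(.))."""
    pre = [box_cox(x, lam) for x in sample]
    n = len(pre)
    if h is None:
        # Assumed ad hoc bandwidth; not taken from the article.
        h = 1.06 * stdev(pre) * n ** (-1.0 / 3.0)
    eps = 1e-12  # keep probabilities strictly inside (0, 1) for inv_cdf
    out = []
    for y in pre:
        p = min(max(kernel_cdf(y, pre, h), eps), 1.0 - eps)
        out.append(STD_NORMAL.inv_cdf(p))
    return out
```

Calling `to_normal(data, lam=0.0)` on a positive-valued sample returns values on the standard normal scale in the same order as the input, since both stages are increasing maps. Replacing `STD_NORMAL.inv_cdf` with the quantile function of another continuous distribution would give the corresponding transformation to that target.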

Author information

Authors and Affiliations

Department of Statistics, North-West University, Private Bag X6001, Potchefstroom, 2520, South Africa
Gerhard Koekemoer & Jan W. H. Swanepoel

Corresponding author

Correspondence to Gerhard Koekemoer.

About this article

Cite this article

Koekemoer, G., Swanepoel, J.W.H. A semi-parametric method for transforming data to normality. Stat Comput 18, 241–257 (2008). https://doi.org/10.1007/s11222-008-9053-3

Received: 11 August 2005
Accepted: 28 January 2008
Published: 14 February 2008
Issue Date: September 2008
DOI: https://doi.org/10.1007/s11222-008-9053-3
