Regression Quantile Diagnostics for Multiple Outliers

  • Conference paper
Directions in Robust Statistics and Diagnostics

Part of the book series: The IMA Volumes in Mathematics and its Applications (IMA, volume 34)

Abstract

The concept of regression quantiles provides a natural approach to the analysis of the general linear model. In addition to providing methods of statistical inference, the regression quantile computation provides useful information concerning the presence of outliers. Two methods based on regression quantiles are suggested: one involves “peeling” observations fit exactly by extreme quantiles, and the other comes more directly from the computation of the quantile function. Simulated data sets containing outliers are used to compare these methods with Cook’s D diagnostic and with Rousseeuw’s method based on a high breakdown “least median of squares”-type estimator. Although all methods fare moderately well in these trials, the “peeling” method is clearly the most efficient at identifying outliers. In an effort to explain the relatively poorer performance of Rousseeuw’s method, it is shown that the “least median of squares” estimator is not an elemental solution (i.e., one that fits p observations exactly), but is instead determined by having exactly (p + 1) equal residuals when there are p parameters.

This research was partially supported by NSF grant DMS 88-02555 and Air Force grant AFOSR 87-0041.
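
The notion of an elemental solution in the abstract can be illustrated with a small sketch. The code below is not the paper's algorithm: it assumes a simple linear model y = a + b*x (so p = 2), uses the Koenker and Bassett check function, and finds a regression quantile by brute force over all pairs of observations, relying on the fact that a minimizer can be found among fits passing exactly through p points. The function names and the toy data (with one planted outlier at index 5) are hypothetical.

```python
from itertools import combinations


def check_loss(u, theta):
    """Check function rho_theta(u) = u * (theta - 1{u < 0})."""
    return u * (theta - (1.0 if u < 0 else 0.0))


def regression_quantile(x, y, theta):
    """Brute-force theta-th regression quantile for y = a + b*x (p = 2).

    Tries every line through two observations (an elemental fit) and
    keeps the one with the smallest total check loss.
    Returns (a, b, pair), where pair holds the indices fit exactly.
    """
    best = None
    for i, j in combinations(range(len(x)), 2):
        if x[i] == x[j]:
            continue  # equal x values do not determine a line
        b = (y[j] - y[i]) / (x[j] - x[i])
        a = y[i] - b * x[i]
        loss = sum(check_loss(yk - a - b * xk, theta) for xk, yk in zip(x, y))
        if best is None or loss < best[0]:
            best = (loss, a, b, (i, j))
    _, a, b, pair = best
    return a, b, pair


# Hypothetical toy data: roughly y = x, with an outlier at index 5.
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
y = [0.1, 0.9, 2.2, 2.8, 4.1, 15.0]

for theta in (0.5, 0.95):
    a, b, pair = regression_quantile(x, y, theta)
    print(theta, round(a, 3), round(b, 3), pair)
```

On this toy data the median fit (theta = 0.5) passes through two of the well-behaved points, while the extreme quantile (theta = 0.95) passes exactly through the planted outlier. That is the kind of observation a "peeling" scheme would examine, although the actual procedure studied in the paper may differ in its details.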

Abbreviations

AMS(MOS) subject classifications:

primary: 62J05, 62G35; secondary: 62F10

References

  • Atkinson, A.C. (1986), Masking unmasked, Biometrika, 73, 533–541.

  • Bassett, G.W., Koenker, R.W. (1982), An empirical quantile function for linear models with iid errors, J. Amer. Statist. Assoc., 77, 407–415.

  • Belsley, Kuh, Welsch (1980), Regression Diagnostics: Identifying Influential Data and Sources of Collinearity, Wiley, New York.

  • Cook, R.D., Weisberg, S. (1982), Residuals and Influence in Regression, Chapman and Hall, New York.

  • Hampel, F.R., Ronchetti, E.M., Rousseeuw, P.J., Stahel, W.A. (1986), Robust Statistics: the Approach Based on Influence Functions, Wiley, New York.

  • Hawkins, D.M., Bradu, D., Kass, G.V. (1984), Location of several outliers in multiple-regression data using elemental sets, Technometrics, 26, 197–208.

  • Joss, J., Marazzi, A. (1990), Probabilistic algorithms for least median of squares regression, Comp. Stat. Data Anal., 9, 123–133.

  • Jurečkova, J., Portnoy, S. (1987), Asymptotics for one-step M estimators in regression with application to combining efficiency and high breakdown point, Comm. Statist., Theory and Methods, 16, 2187–2200.

  • Koenker, R.W. (1987), A comparison of asymptotic testing methods for l1 regression, Statistical Data Analysis Based on the L1 Norm and Related Methods (ed: Y. Dodge), North Holland, Amsterdam, 287–295.

  • Koenker, R.W., Bassett, G.W. (1978), Regression quantiles, Econometrica, 46, 33–50.

  • Koenker, R.W., d’Orey, V. (1987), Computing regression quantiles, Applied Statist., 36, 383–393.

  • Koenker, R.W., Portnoy, S. (1987), L-estimation for the linear model, J. Amer. Statist. Assoc., 82, 851–857.

  • Portnoy, S. (1984), Tightness of the sequence of cdf processes defined from regression fractiles, Robust and Nonlinear Time Series Analysis (eds: Franke, Hardle, Martin), Springer-Verlag, New York, 231–246.

  • Portnoy, S. (1987), Using regression fractiles to identify outliers, Statistical Data Analysis Based on the L1 Norm and Related Methods (ed: Y. Dodge), North Holland, Amsterdam, 345–356.

  • Portnoy, S. (1988), Asymptotic behavior of the number of regression quantile breakpoints, to appear: J. Sci. Statist. Computing.

  • Portnoy, S., Koenker, R.W. (1989), Adaptive L-estimation of linear models, Ann. Statist., 17, 362–381.

  • Rousseeuw, P. (1984), Least median of squares regression, J. Amer. Statist. Assoc., 79, 871–880.

  • Rousseeuw, P., Leroy, A. (1987), Robust Regression and Outlier Detection, Wiley, New York.

  • Rousseeuw, P., Yohai, V. (1984), Robust regression by means of S-estimates, Proc. of Workshop on Robust and Nonlinear Meth. in Time Series Analysis, Lecture Notes in Statistics, 26, Springer, 256–272.

  • Ruppert, D., Carroll, R.J. (1980), Trimmed least squares estimation in the linear model, J. Amer. Statist. Assoc., 75, 828–838.

  • Siegel, A.F. (1982), Robust regression using repeated medians, Biometrika, 69, 242–244.

  • Souvaine, D.L., Steele, J.M. (1987), Time and space efficient algorithms for least median of squares regression, J. Amer. Statist. Assoc., 82, 794–801.

  • Yohai, V. (1987), High breakdown-point and high efficiency robust estimates for regression, Ann. Statist., 15, 642–656.

  • Yohai, V., Zaman, R. (1988), High breakdown-point estimates of regression by means of the minimization of efficient scale, J. Amer. Statist. Assoc., 83, 406–413.

Copyright information

© 1991 Springer-Verlag New York, Inc.

About this paper

Cite this paper

Portnoy, S. (1991). Regression Quantile Diagnostics for Multiple Outliers. In: Directions in Robust Statistics and Diagnostics. The IMA Volumes in Mathematics and its Applications, vol 34. Springer, New York, NY. https://doi.org/10.1007/978-1-4612-4444-8_8

  • DOI: https://doi.org/10.1007/978-1-4612-4444-8_8

  • Publisher Name: Springer, New York, NY

  • Print ISBN: 978-1-4612-8772-8

  • Online ISBN: 978-1-4612-4444-8

  • eBook Packages: Springer Book Archive
