Iterated Boosting for Outlier Detection

Conference paper in Data Science and Classification

Abstract

A procedure for detecting outliers in regression problems, based on information provided by boosting trees, is proposed. Boosting handles observations that are hard to predict by giving them extra weight. In this paper, such observations are considered possible outliers, and a procedure is proposed that uses the boosting results to diagnose which observations may be outliers. The key idea is to select the observation most frequently resampled along the boosting iterations and to reiterate boosting after removing it. Several well-known benchmark data sets are considered, and a comparative study against two classical competitors shows the value of the method.
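
The following sketch illustrates the idea described in the abstract; it is not the authors' exact procedure. It assumes an AdaBoost.R2-style reweighting with explicit resampling of regression trees, and the names `boost_resample_counts`, `iterated_boosting_outliers`, the number of rounds, the tree depth, and the `n_suspects` stopping rule are all illustrative choices.

```python
# Minimal sketch, assuming: boosting of regression trees with explicit
# resampling, AdaBoost.R2-style weight updates, and a fixed number of
# suspected outliers to extract. Parameters are illustrative, not the
# paper's tuned settings.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def boost_resample_counts(X, y, n_rounds=50, max_depth=3, rng=None):
    """Run boosting with explicit resampling and count how often each
    observation is drawn into a bootstrap training sample."""
    X, y = np.asarray(X), np.asarray(y)
    rng = np.random.default_rng(rng)
    n = len(y)
    weights = np.full(n, 1.0 / n)
    counts = np.zeros(n, dtype=int)
    for _ in range(n_rounds):
        idx = rng.choice(n, size=n, replace=True, p=weights)
        counts += np.bincount(idx, minlength=n)
        tree = DecisionTreeRegressor(max_depth=max_depth).fit(X[idx], y[idx])
        abs_err = np.abs(y - tree.predict(X))
        m = abs_err.max()
        if m == 0:                      # perfect fit: nothing left to reweight
            break
        loss = abs_err / m              # linear loss scaled to [0, 1]
        avg_loss = np.sum(weights * loss)
        if avg_loss >= 0.5:             # learner too weak: stop reweighting
            break
        beta = max(avg_loss, 1e-12) / (1.0 - avg_loss)
        weights *= beta ** (1.0 - loss) # hard-to-predict points keep large weight
        weights /= weights.sum()
    return counts

def iterated_boosting_outliers(X, y, n_suspects=5, **kwargs):
    """Repeatedly flag the most frequently resampled observation and
    rerun boosting on the remaining data."""
    X, y = np.asarray(X), np.asarray(y)
    remaining = np.arange(len(y))
    suspects = []
    for _ in range(n_suspects):
        counts = boost_resample_counts(X[remaining], y[remaining], **kwargs)
        worst = remaining[np.argmax(counts)]
        suspects.append(int(worst))
        remaining = remaining[remaining != worst]
    return suspects
```

Calling `iterated_boosting_outliers(X, y, n_suspects=5)` on a data matrix `X` and response `y` would return the indices of the five observations flagged in turn; in the paper the diagnosis is driven by the boosting results themselves rather than a fixed count, so the stopping rule here is purely for illustration.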

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Cheze, N., Poggi, JM. (2006). Iterated Boosting for Outlier Detection. In: Batagelj, V., Bock, HH., Ferligoj, A., Žiberna, A. (eds) Data Science and Classification. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-34416-0_23
