Classification and Regression Trees

  • Chapter
Applied Statistical Genetics with R

Part of the book series: Use R (USE R)

Abstract

Classification and regression trees (CARTs) are an approach to discovering relationships between a large number of independent (predictor) variables and a categorical or continuous trait. Classification trees are applied to categorical outcomes, while regression trees are applied to continuous traits. Both rely on a recursive algorithm that partitions individuals into groups so as to minimize within-group heterogeneity. CART was originally described by Breiman et al. (1993) and has gained popularity in recent years as a method for identifying structure in high-dimensional data settings. In the following sections, we begin by describing methods for constructing a tree. This involves defining a measure of heterogeneity, commonly referred to as node impurity, and determining how predictor variables are input into the model. Both components affect the resulting tree and need to be defined carefully to reflect the scientific questions at hand. We then describe methods for refining this tree to arrive at a final reproducible model. Further discussion of CART methods can be found in Breiman et al. (1993) and Zhang and Singer (1999). In Chapter 7, we describe extensions of the CART model, including random forests and logic regression trees, which offer some additional advantages.
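As an illustration of the partitioning idea described in the abstract, the following is a minimal sketch of a single classification split, not the chapter's own code. It assumes the Gini index as the node impurity measure, a single numeric predictor, and a binary split at a threshold. The function names (`gini`, `best_split`) and the toy data are hypothetical, and the sketch is written in plain Python rather than R so that it has no package dependencies.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels: 1 - sum_k p_k^2."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(x, y):
    """Search thresholds on one numeric predictor x and return the
    (threshold, weighted child impurity) pair that minimizes impurity.
    The threshold is None if no split improves on the unsplit node."""
    n = len(y)
    best = (None, gini(y))
    values = sorted(set(x))
    # Candidate thresholds are midpoints between adjacent observed values.
    for lo, hi in zip(values, values[1:]):
        t = (lo + hi) / 2
        left = [yi for xi, yi in zip(x, y) if xi <= t]
        right = [yi for xi, yi in zip(x, y) if xi > t]
        weighted = (len(left) * gini(left) + len(right) * gini(right)) / n
        if weighted < best[1]:
            best = (t, weighted)
    return best


# Hypothetical toy data: genotype coded as a variant-allele count (0, 1, 2)
# and a binary trait. These values are invented for illustration only.
genotype = [0, 0, 0, 0, 1, 1, 2, 2]
trait = ["ctrl", "ctrl", "ctrl", "case", "ctrl", "case", "case", "case"]

threshold, impurity = best_split(genotype, trait)
print(threshold, round(impurity, 4))
```

A full tree would apply `best_split` recursively to each resulting group, over all predictors, until a stopping rule is met. A regression tree would follow the same pattern with a within-node variance measure in place of the Gini index.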

Author information

Correspondence to Andrea S. Foulkes.

Copyright information

© 2009 Springer-Verlag New York

About this chapter

Cite this chapter

Foulkes, A. (2009). Classification and Regression Trees. In: Applied Statistical Genetics with R. Use R. Springer, New York, NY. https://doi.org/10.1007/978-0-387-89554-3_6
