The Traditional Approach: Gross Scoring

Michel, René; Schnakenburg, Igor; von Martens, Tobias

doi:10.1007/978-3-030-22625-1_2

Abstract

Model building and scoring have been known as a statistical methodology for decades, and a wide variety of literature is available on the subject. Rather than giving a complete introduction to model building and scoring techniques, this chapter explains the main predictive modeling techniques from an angle that allows the reader to understand the change in paradigm that comes with the transition from classical scores to net scores. First, the problem to be solved is explained and formalized. The second section introduces common scoring methods, such as decision trees and (logistic) regression, always with the generalization to net scoring in mind. The third section introduces well-known quality measures for scoring models. Although the facts presented in this chapter may be known to many readers, studying it is nevertheless recommended in order to become familiar with the way scoring methods are presented and described in this book.

Notes

1.
In direct marketing, tracking behavioral data is considered increasingly important, and particular effort is devoted to collecting and transmitting as much of this data as possible. Many mobile devices can transmit positioning data (Is the customer next to a store?), video or acoustic data, or information on websites visited. Loyalty cards enable the provider to assign IDs to customers in order to track their purchase behavior over an extended period across different channels, stores, or companies, even if the customer pays cash.
2.
Only observations in which the event could have occurred are relevant. Customers holding a certain product may be able to buy it again, but bank customers without credit cannot default, and males will not get pregnant.
3.
A model may be trained to predict responses in May from March data. This data can be split into training and validation datasets. Performance indicators can then be obtained by applying the model to April data, for which it predicts responses in June. Evaluating on data from a different time slice gives a very honest assessment of model quality, but the result may also be affected by seasonal effects.
4.
An example: Data from external providers (creditworthiness, social atlases, etc.) may yield better models whose improvement nevertheless does not cover the cost of the data.
5.
The subscript on \(\chi_1^2\) indicates a \(\chi^2\) distribution with one degree of freedom.
6.
Each point “pulls” the line slightly towards itself.
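
The evaluation scheme described in note 3 (train and validate on one time slice, then measure performance on a later one) can be sketched as follows. This is only an illustrative sketch under stated assumptions: the toy records, the segment names, the response rates, the 70/30 split, and the segment-rate "model" are all hypothetical and do not come from the chapter, and the Brier score is used here merely as a convenient example of a performance indicator.

```python
import random
from collections import defaultdict


def make_records(month, n, rate_by_segment, rng):
    """Simulate n customers observed in `month`; `responded` stands for
    the response recorded two months later (hypothetical toy data)."""
    records = []
    for _ in range(n):
        segment = rng.choice(sorted(rate_by_segment))
        responded = rng.random() < rate_by_segment[segment]
        records.append({"month": month, "segment": segment, "responded": responded})
    return records


def fit_segment_rates(records):
    """A deliberately trivial 'model': the observed response rate per segment."""
    counts = defaultdict(lambda: [0, 0])  # segment -> [responses, customers]
    for r in records:
        counts[r["segment"]][0] += r["responded"]
        counts[r["segment"]][1] += 1
    return {s: hits / total for s, (hits, total) in counts.items()}


def brier_score(model, records):
    """Mean squared difference between predicted rate and actual outcome."""
    return sum((model[r["segment"]] - r["responded"]) ** 2 for r in records) / len(records)


rng = random.Random(0)
rates = {"A": 0.05, "B": 0.20}  # assumed true response rates per segment

march = make_records("March", 2000, rates, rng)  # responses observed in May
april = make_records("April", 2000, rates, rng)  # responses observed in June

# Split the March slice into training and validation data.
rng.shuffle(march)
cut = int(0.7 * len(march))
train, validation = march[:cut], march[cut:]

model = fit_segment_rates(train)

# In-time check on held-out March data, out-of-time check on April data.
in_time = brier_score(model, validation)
out_of_time = brier_score(model, april)
print(f"validation (March -> May): {in_time:.4f}")
print(f"out-of-time (April -> June): {out_of_time:.4f}")
```

Because the toy data uses the same response rates in both months, the two scores should be close; with real data, a noticeable gap between them may hint at overfitting or at the seasonal effects mentioned in the note.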

Author information

Authors and Affiliations

Deutsche Bank AG, Frankfurt am Main, Germany
René Michel & Tobias von Martens
DeTeCon International GmbH, Berlin, Germany
Igor Schnakenburg

About this chapter

Cite this chapter

Michel, R., Schnakenburg, I., von Martens, T. (2019). The Traditional Approach: Gross Scoring. In: Targeting Uplift. Springer, Cham. https://doi.org/10.1007/978-3-030-22625-1_2

DOI: https://doi.org/10.1007/978-3-030-22625-1_2
Published: 10 September 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-22624-4
Online ISBN: 978-3-030-22625-1
eBook Packages: Mathematics and Statistics (R0)
