Parametric versus nonparametric methods in risk scoring: an application to microcredit

Hernandez, Manuel A.; Torero, Maximo

doi:10.1007/s00181-013-0703-8


Published: 09 May 2013

Volume 46, pages 1057–1079, (2014)

Empirical Economics


Abstract

The importance of credit access for improving economic opportunities in developing markets is well established in the literature. However, there is a strong need to mitigate adverse selection problems in microlending. A risk scoring model that more accurately predicts the likelihood of repayment of potential borrowers can help address this market imperfection and benefit both lenders and borrowers. This paper compares the performance of nonparametric versus semiparametric and traditional parametric risk scoring models based on default probabilities. We show the advantages of relying on less structured, data-driven methods for risk scoring using both simulated data and data from credit loans granted to small and microenterprises in rural Peru. The estimation results indicate that nonparametric methods lead to a better evaluation of creditworthiness and can help prevent including potential “bad” borrowers and excluding “good” borrowers from sensitive microcredit markets.


Notes

As of December 2010, microfinance institutions reported reaching more than 205 million borrowers worldwide (Maes and Reed 2012). A separate issue pertains to whether microcredit has been an effective tool to lift poor people out of poverty by funding their microenterprises and increasing their wealth, considering that a large number of small businesses have been created through microcredits but only a few have matured into larger businesses. Recent work evaluating the impact of microfinance using randomized field experiments provides mixed evidence regarding the effects of microcredit on household income and consumption (e.g., Banerjee et al. 2010; Dupas and Robinson 2009; Karlan and Zinman 2011).
There are also concerns that lending institutions have managed to sustain low interest rates and relatively high default rates due to subsidies and soft loans. Grameen Bank, for example, which charges an average real interest rate of 10 %, experienced losses close to 18 % of its outstanding loans from 1985 to 1996 after properly adjusting for its portfolio size (Armendariz and Morduch 2005).
See also Schreiner (2000) for additional discussion on credit scoring in microfinance.
Microfinance data in developing countries have been rather unexploited in general terms, in part due to the lack of information sharing across lending institutions.
We could also consider a continuous variable measuring the percentage of loan (installments) repaid by each individual.
The assumption that the threshold is zero is without loss of generality provided that X includes a constant.
An alternative estimator can be found in Ichimura (1993), but it is less efficient than the estimator proposed by Klein and Spady for binary choice models.
Klein and Spady add a trimming function to the log likelihood function, although trimming does not seem to matter in their simulations. Single index models further require two identification conditions under which the parameter vector \(\beta\) and function \(g(\cdot)\) can be sensibly estimated. First, the set of explanatory variables \(X\) must contain at least one continuous variable. Second, \(\beta\) cannot be identified without some location and scale restrictions (normalizations). One popular location normalization is to not include a constant in \(X\); one popular scale normalization is to assume that the first component of \(X\) has a unit coefficient and that this first component is a continuous variable. For further details on single index model estimation, refer to Li and Racine (2006).
An alternative selection method is the standard rule-of-thumb procedure in which the bandwidth for covariate \(X_s\) is defined as \(h_s = X_{s,sd}\, n^{-1/(4+q)}\), where \(X_{s,sd}\) is the sample standard deviation of \(X_s\), \(n\) is the number of observations in the working sample, and \(q\) is the total number of covariates in \(X\).
In this sense, the local linear estimator is similar to the standard linear probability model. We thank an anonymous referee for noting this.
See Racine (2008) for further details on nonparametric conditional mode models.
While the Probit model is implemented in Stata, the single index and nonparametric models are implemented in R using the np package.
The McFadden et al. (1977) performance measure is equal to \(p_{11} + p_{22} - p_{12}^2 - p_{21}^2\), where \(p_{ij}\) is the \(ij\)th entry (expressed as a fraction of the sum of all entries) in the 2 \(\times\) 2 confusion matrix of actual versus predicted (0,1) outcomes.
The Logit and linear probability models also perform very similarly to the Probit model. Details are available upon request.
Note also that the differences in the MSPEs across models are more pronounced for “high” asset values, largely explained by the much lower correct default classification rate of the Probit and single index models.
Of course, it is possible that the odds of defaulting are linear in all covariates; but even in this (implausible) scenario, data-driven methods will perform at least comparably to linear models.
The name of the bank is omitted due to confidentiality reasons.
Unfortunately, we only have information on asset (real estate) ownership but not on asset value. We also do not have information on debt ratio.
We estimate a random-effects Probit model since a client may be observed more than once in the database.
We also considered alternative data partitions (70–30 and 50–50 %) and obtained qualitatively similar results. The results are also not sensitive to repeated 60–40 % data partitions.
As indicated above, the local linear model may yield fitted values greater than one or less than zero. In this case, the fitted values range between \(-\)0.01 and 1.06, where 14 observations (out of 1,739) are greater than one and one observation is less than zero.
The predictive performance (both in-sample and out-of-sample) of the Logit and linear probability models is very similar to that of the Probit model. Further details are available upon request.
We also do not account for the probability of crop failure or climate conditions, but these variables are unlikely to explain default behavior in this case since the loans analyzed were granted to smallholder farmers operating in a particular rural area in Peru.
The nonparametric method also points toward a nonlinear relationship between the odds of defaulting and other covariates.
Recent studies have also shown the potential gains of establishing a credit bureau system in microlending (de Janvry et al. 2010; Luoto et al. 2007).
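Two of the formulas in the notes above, the rule-of-thumb bandwidth \(h_s = X_{s,sd}\, n^{-1/(4+q)}\) and the McFadden et al. (1977) performance measure, can be illustrated with a minimal Python sketch. This is not the authors' code (the notes state that they used Stata and the R np package); the function names, the example data, and the convention that rows are actual outcomes and columns are predicted outcomes are all assumptions made for illustration.

```python
from statistics import stdev


def rule_of_thumb_bandwidth(x, q):
    """Rule-of-thumb bandwidth for one covariate: sd(x) * n^(-1/(4+q)).

    x is the list of observed values of the covariate (hypothetical data),
    q is the total number of covariates in X.
    """
    n = len(x)
    # statistics.stdev is the sample standard deviation (n - 1 denominator).
    return stdev(x) * n ** (-1.0 / (4 + q))


def mcfadden_measure(confusion):
    """Performance measure p11 + p22 - p12^2 - p21^2.

    confusion is a 2 x 2 table of counts; each entry is converted to a
    fraction of the sum of all entries before the measure is computed.
    Assumed layout: rows are actual outcomes, columns are predicted outcomes.
    """
    total = sum(sum(row) for row in confusion)
    (p11, p12), (p21, p22) = [[c / total for c in row] for row in confusion]
    return p11 + p22 - p12 ** 2 - p21 ** 2


if __name__ == "__main__":
    # Hypothetical covariate values and a model with q = 3 covariates.
    assets = [1.0, 2.0, 3.0, 4.0, 5.0]
    print(rule_of_thumb_bandwidth(assets, q=3))

    # Hypothetical confusion matrix of counts.
    print(mcfadden_measure([[40, 10], [5, 45]]))
```

With the hypothetical counts above, the fractions are 0.40, 0.10, 0.05, and 0.45, so the measure is 0.85 - 0.01 - 0.0025, or about 0.8375; a classifier with no misclassified observations would score exactly 1.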

References

Armendariz B, Morduch J (2005) The economics of microfinance. MIT Press, Cambridge
Banerjee A, Duflo E, Glennerster R, Kinnan C (2010) The miracle of microfinance? Evidence from a randomized evaluation. Working paper, MIT Poverty Action Lab
Capon N (1982) Credit scoring systems: a critical analysis. J Market 46(2):82–91
Coleman B (2006) Microfinance in Northeast Thailand: who benefits and how much? World Dev 34(9):1612–1638
de Janvry A, McIntosh C, Sadoulet E (2010) The supply- and demand-side impacts of credit market information. J Dev Econ 93(2):173–188
Dupas P, Robinson J (2009) Savings constraints and microenterprise development: evidence from a field experiment in Kenya. NBER Working Paper No. 14693
Fan J, Gijbels I (1996) Local polynomial modeling and its applications. Chapman and Hall, London
Ghosh P, Mookherjee D, Ray D (2000) Credit rationing in developing countries: an overview of the theory. In: Mookherjee D, Ray D (eds) Readings in the theory of development economics. Blackwell, London
Hand D, Henley W (1997) Statistical classification methods in consumer credit scoring: a review. J R Stat Soc Ser A 160(3):523–541
Ichimura H (1993) Semiparametric least squares (SLS) and weighted SLS estimation of single-index models. J Econom 58(1–2):71–120
Karlan D, Zinman J (2011) Microcredit in theory and practice: using randomized credit scoring for impact evaluation. Science 332:1278–1284
Khandker S (2005) Microfinance and poverty: evidence using panel data from Bangladesh. World Bank Econ Rev 19(2):263–286
Klein R, Spady R (1993) An efficient semiparametric estimator for binary response models. Econometrica 61(2):387–421
Li Q, Racine J (2004) Nonparametric estimation of regression functions with both categorical and continuous data. J Econom 119(1):99–130
Li Q, Racine J (2006) Nonparametric econometrics: theory and practice. Princeton University Press, Princeton
Luoto J, McIntosh C, Wydick B (2007) Credit information systems in less developed countries: a test with microfinance in Guatemala. Econ Dev Cult Change 55(2):313–334
Maes J, Reed L (2012) State of the Microcredit Summit Campaign Report 2012. Microcredit Summit Campaign
McFadden D, Puig C, Kirschner D (1977) Determinants of the long-run demand for electricity. Proc Am Stat Assoc 1:109–117
Pregibon D (1979) Data analytic methods for generalized linear models. PhD dissertation, University of Toronto
Racine J (1997) Consistent significance testing for nonparametric regression. J Bus Econ Stat 15(3):369–378
Racine J (2008) Nonparametric econometrics: a primer. Found Trends Econom 3(1):1–88
Racine J, Hart J, Li Q (2006) Testing the significance of categorical predictor variables in nonparametric regression models. Econom Rev 25(4):523–544
Schreiner M (2000) Credit scoring for microfinance: can it work? J Microfinance 2(2):105–118
Tukey J (1949) One degree of freedom for non-additivity. Biometrics 5(3):232–242


Acknowledgments

We would like to thank Qi Li, Carlos Martins-Filho, Robert Kunst, and two anonymous referees for their valuable comments. We also thank Christopher Marciniak for his valuable research assistance.

Author information

Authors and Affiliations

Markets, Trade and Institutions Division, IFPRI, Washington, DC, 20006, USA
Manuel A. Hernandez & Maximo Torero


Corresponding author

Correspondence to Manuel A. Hernandez.

Appendix

Table 3 Description of variables

Table 4 Summary statistics

Table 5 Modeling the probability of default (dependent variable equal to one if client defaulted, zero otherwise)

Table 6 Predictive performance of different nonparametric regression models using loan data from SMEs in rural Peru

Table 7 Specification error test


About this article

Cite this article

Hernandez, M.A., Torero, M. Parametric versus nonparametric methods in risk scoring: an application to microcredit. Empir Econ 46, 1057–1079 (2014). https://doi.org/10.1007/s00181-013-0703-8


Received: 09 May 2012
Accepted: 05 February 2013
Published: 09 May 2013
Issue Date: May 2014
DOI: https://doi.org/10.1007/s00181-013-0703-8
