Testing exclusion restrictions and additive separability in sample selection models

Huber, Martin; Mellace, Giovanni

doi:10.1007/s00181-013-0742-1

Published: 14 September 2013

Empirical Economics, Volume 47, pages 75–92 (2014)

Abstract

Standard sample selection models with non-randomly censored outcomes assume (i) an exclusion restriction (i.e., a variable affecting selection, but not the outcome) and (ii) additive separability of the errors in the selection process. This paper proposes tests for the joint satisfaction of these assumptions by applying the approach of Huber and Mellace (Testing instrument validity for LATE identification based on inequality moment constraints, 2011) (for testing instrument validity under treatment endogeneity) to the sample selection framework. We show that the exclusion restriction and additive separability imply two testable inequality constraints that come from both point identifying and bounding the outcome distribution of the subpopulation that is always selected/observed. We apply the tests to two variables for which the exclusion restriction is frequently invoked in female wage regressions: non-wife/husband’s income and the number of (young) children. Considering eight empirical applications, our results suggest that the identifying assumptions are likely violated for the former variable, but cannot be refuted for the latter on statistical grounds.

Notes

It has already been noticed by Manski (2003) that the exclusion restriction is violated if the identification region defined by the bounds is empty.
In contrast, Mealli and Pacini (2008) consider identification (for binary treatment variables) when conditioning on a binary instrument directly rather than using $\Pr (S=1|X,Z)$ as a control function. In this case, point identification is not obtained in general, but requires additional assumptions.
This issue does not arise in the endogenous treatment framework of Huber and Mellace (2011), where all outcomes are observed.
For a similar result in the context of selection models see Lee (2009), who in contrast to this paper considers monotonicity of selection in a binary treatment.
Note that the instrument $Z$ and the type $T$ uniquely determine the value of the selection indicator $S$ such that conditioning on the latter is redundant.
In the Appendix section "Link to Kitagawa (2010)", we show how this result compares to Kitagawa (2010), who derives a related testable implication based on comparable model assumptions.
As discussed in Chen and Szroeter (2012), a sufficient condition for correct asymptotic size in the uniform sense is that the first four moments exist for each of the i.i.d. data points used to estimate the constraints.
Which number and definition of the subsets $A$ is optimal for testing is an unsolved issue. We therefore also considered both more and fewer subsets, but the results did not differ in an important way and are for this reason not reported here.

References

Ahn H, Powell J (1993) Semiparametric estimation of censored selection models with a nonparametric selection mechanism. J Econ 58:3–29
Angrist J, Bettinger E, Kremer M (2006) Long-term educational consequences of secondary school vouchers: evidence from administrative records in Colombia. Am Econ Rev 96:847–862
Angrist J, Evans W (1998) Children and their parents' labor supply: evidence from exogenous variation in family size. Am Econ Rev 88:450–477
Angrist J, Imbens G, Rubin D (1996) Identification of causal effects using instrumental variables. J Am Stat Assoc 91:444–472 (with discussion)
Angrist J, Lang D, Oreopoulos P (2009) Incentives and services for college achievement: evidence from a randomized trial. Am Econ J Appl Econ 1:136–163
Becker G (1981) A treatise on the family. Harvard University Press, Cambridge
Blundell R, Gosling A, Ichimura H, Meghir C (2007) Changes in the distribution of male and female wages accounting for employment composition using bounds. Econometrica 75:323–363
Chang S-K (2011) Simulation estimation of two-tiered dynamic panel Tobit models with an application to the labor supply of married women. J Appl Econ 26:854–871
Chen L-Y, Szroeter J (2012) Testing multiple inequality hypotheses: a smoothed indicator approach, CeMMAP working paper 16/12
Cosslett S (1991) Distribution-free estimator of a regression model with sample selectivity. In: Barnett W, Powell J, Tauchen G (eds) Nonparametric and semiparametric methods in econometrics and statistics. Cambridge University Press, Cambridge, pp 175–198
Crépon B (2006) Testing exclusion restrictions at infinity in the semiparametric selection model. IZA Discussion Paper no. 2035
Das M, Newey WK, Vella F (2003) Nonparametric estimation of sample selection models. Rev Econ Stud 70:33–58
Fleisher BM, Rhodes J (1979) Fertility, women’s wage rates, and labor supply. Am Econ Rev 69:14–24
Frangakis CE, Rubin DB (2002) Principal stratification in causal inference. Biometrics 58:21–29
Gallant A, Nychka D (1987) Semi-nonparametric maximum likelihood estimation. Econometrica 55:363–390
Gronau R (1974) Wage comparisons—a selectivity bias. J Political Econ 82:1119–1143
Heckman JJ (1974) Shadow prices, market wages, and labor supply. Econometrica 42:679–694
Heckman JJ (1976) The common structure of statistical models of truncation, sample selection and limited dependent variables and a simple estimator for such models. Ann Econ Soc Meas 5:475–492
Heckman JJ (1979) Sample selection bias as a specification error. Econometrica 47:153–161
Horowitz JL (1992) A smoothed maximum score estimator for the binary response model. Econometrica 60:505–531
Horowitz JL, Manski CF (1995) Identification and robustness with contaminated and corrupted data. Econometrica 63:281–302
Huber M, Mellace G (2011) Testing instrument validity for LATE identification based on inequality moment constraints, University of St Gallen, Dept. of Economics Discussion Paper no. 2011–43
Imbens GW, Rubin D (1997) Estimating outcome distributions for compliers in instrumental variables models. Rev Econ Stud 64:555–574
Kitagawa T (2010) Testing for instrument independence in the selection model. University College London (unpublished manuscript)
Lee DS (2009) Training, wages, and sample selection: estimating sharp bounds on treatment effects. Rev Econ Stud 76:1071–1102
Manski CF (2003) Partial identification of probability distributions. Springer, New York
Martins M (2001) Parametric and semiparametric estimation of sample selection models: an empirical application to the female labour force in Portugal. J Appl Econ 16:23–39
Mealli F, Pacini B (2008) Exploiting instrumental variables in causal inference with nonignorable outcome nonresponse using principal stratification, mimeo
Mroz T (1987) The sensitivity of an empirical model of married women’s hours of work to economic and statistical assumptions. Econometrica 55:765–799
Mulligan CB, Rubinstein Y (2008) Selection, investment, and women’s relative wages over time. Q J Econ 123:1061–1110
Nakosteen RA, Westerlund O, Zimmer MA (2004) Marital matching and earnings: evidence from the unmarried population in Sweden. J Hum Resour 39:1033–1044
Newey WK (2007) Nonparametric continuous/discrete choice models. Int Econ Rev 48:1429–1439
Newey WK (2009) Two-step series estimation of sample selection models. Econ J 12:S217–S229
Powell JL (1987) Semiparametric estimation of bivariate latent variable models. Unpublished manuscript, University of Wisconsin-Madison
Rubin DB (1974) Estimating causal effects of treatments in randomized and nonrandomized studies. J Educ Psychol 66:688–701
Schafgans MMA (1998) Ethnic wage differences in Malaysia: parametric and semiparametric estimation of the Chinese-Malay wage gap. J Appl Econ 13:481–504
Schochet PZ, Burghardt J, Glazerman S (2001) National job corps study: the impacts of job corps on participants’ employment and related outcomes, report. Mathematica Policy Research, Inc., Washington, DC
Vytlacil E (2002) Independence, monotonicity, and latent index models: an equivalence result. Econometrica 70:331–341
Zabel JE (1993) The relationship between hours of work and labor force participation in four models of labor supply behavior. J Labor Econ 11:387–416

Acknowledgments

We have benefited from comments by Alberto Abadie, Joshua Angrist, Guido Imbens, Toru Kitagawa, Alexa Tiemann, seminar participants at Harvard (seminar in econometrics, September 2011), and an anonymous associate editor. Martin Huber gratefully acknowledges financial support from the Swiss National Science Foundation Grant PBSGP1_138770.

Author information

Authors and Affiliations

Department of Economics, SEW, University of St. Gallen, Varnbüelstrasse 14, St. Gallen, 9000, Switzerland
Martin Huber & Giovanni Mellace

Corresponding authors

Correspondence to Martin Huber or Giovanni Mellace.

Additional information

An earlier version of this paper was circulated under the title “Testing instrument validity in sample selection models”.

Appendix

1.1 Link to Kitagawa (2010)

The subsequent discussion links the testable implications of Sect. 3 to Kitagawa (2010), who derives a testable implication based on comparable model assumptions. Considering only positive monotonicity, Kitagawa (2010) shows in his Proposition 2.3 that under Assumptions 1 and 2,

$$\begin{aligned} f(y,S=1|Z=0) \le f(y,S=1|Z=1) \hbox { for all }y \hbox { in the support of }Y, \end{aligned}$$

(13)

i.e., the joint density of $Y$ and $S=1$ given $Z=1$ must nest the joint density of $Y$ and $S=1$ given $Z=0$ for any value of $Y.$ Rearranging terms such that $f(y,S=1|Z=1)-f(y,S=1|Z=0) \ge 0$ gives the intuitive interpretation that the pdf of the compliers’ outcome cannot be smaller than zero, as densities must not be negative.
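The complier interpretation can be spelled out with a short decomposition. This is only a sketch under assumptions that are not stated in this excerpt: Assumptions 1 and 2 are taken to be independence of $Z$ from the selection types and outcomes, and positive monotonicity of selection in $Z$. The type labels $a$ (always selected) and $c$ (compliers) and the type shares $\pi _a$ and $\pi _c$ are illustrative notation. Under these assumptions, the selected observations with $Z=1$ consist of always-selected units and compliers, while those with $Z=0$ consist of always-selected units only:

$$\begin{aligned} f(y,S=1|Z=1)&= \pi _a\cdot f(y|T=a)+\pi _c\cdot f(y|T=c),\\ f(y,S=1|Z=0)&= \pi _a\cdot f(y|T=a),\\ f(y,S=1|Z=1)-f(y,S=1|Z=0)&= \pi _c\cdot f(y|T=c)\ \ge 0. \end{aligned}$$

Integrating the last line over $y$ returns the complier share $\pi _c,$ i.e., the difference in selection probabilities between $Z=1$ and $Z=0.$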

Note that (7) in Sect. 3 is equivalent to

$$\begin{aligned} \frac{\Pr (Y\in A,S=1|Z=1)}{P_{1|0}}-\frac{P_{1|1}-P_{1|0}}{P_{1|0}}&\le \frac{\Pr (Y\in A,S=1|Z=0)}{P_{1|0}} \nonumber \\&\le \frac{\Pr (Y\in A,S=1|Z=1)}{P_{1|0}} \end{aligned}$$

(14)

for all $A$ in the support of $Y,$ because

$$\begin{aligned} \frac{\Pr (Y\in A|Z=1,S=1)-(1-q)}{q}&= \frac{\Pr (Y\in A,S=1|Z=1)}{q\cdot \Pr (S=1|Z=1)}-\frac{(1-q)}{q} \\&= \frac{\Pr (Y\in A,S=1|Z=1)}{P_{1|0}}-\frac{P_{1|1}-P_{1|0}}{P_{1|0}},\\ \frac{\Pr (Y\in A|Z=1,S=1)}{q}&= \frac{\Pr (Y\in A,S=1|Z=1)}{q\cdot \Pr (S=1|Z=1)}\\&= \frac{\Pr (Y\in A,S=1|Z=1)}{P_{1|0}},\\ \Pr (Y\in A|Z=0,S=1)&= \frac{\Pr (Y\in A,S=1|Z=0)}{P_{1|0}}, \end{aligned}$$

by basic probability theory. In turn, (14) implies that, for all $A$ in the support of $Y,$

$$\begin{aligned} \Pr (Y\in A,S=1|Z=1)-(P_{1|1}-P_{1|0})&\le \Pr (Y\in A,S=1|Z=0)\nonumber \\&\le \Pr (Y\in A,S=1|Z=1), \end{aligned}$$

(15)

and, when applied to the pdf, that for all $y$ in the support of $Y,$

$$\begin{aligned} f(y,S=1|Z=1)-(P_{1|1}-P_{1|0})&\le f(y,S=1|Z=0)\nonumber \\&\le f(y,S=1|Z=1), \end{aligned}$$

(16)

i.e., (16) yields one additional testable implication compared to (13). If we rearrange the first part in (15) $\Pr (Y\in A,S=1|Z=1)-(P_{1|1}-P_{1|0})\le \Pr (Y\in A,S=1|Z=0)$ to be $\Pr (Y\in A,S=1|Z=1)-\Pr (Y\in A,S=1|Z=0)\le (P_{1|1}-P_{1|0}),$ our additional implication gets an intuitive interpretation: The joint probability of being a complier and having a particular value of the outcome (and any sum of joint probabilities defined by non-overlapping subsets $A$) must not be larger than the unconditional probability of being a complier, because

$$\begin{aligned} \int [f(y,S=1|Z=1)-f(y,S=1|Z=0)] dy = P_{1|1}-P_{1|0}. \end{aligned}$$

(17)

It is worth noting that if testing is based on subsets $A$ that are non-overlapping and jointly cover the entire support of $Y,$ then our additional testable implication in (16) is already taken into account by (13) and thus redundant. The occurrence of some $\Pr (Y\in A,S=1|Z=1)-\Pr (Y\in A,S=1|Z=0)>(P_{1|1}-P_{1|0})$ then necessarily implies the existence of at least one distinct $A'$ for which $\Pr (Y\in A',S=1|Z=1)-\Pr (Y\in A',S=1|Z=0)<0,$ so that (13) is violated, too. Therefore, power gains from the additional testable implication can be realized, if at all, only when using subsets $A$ that overlap (so that violations may be averaged out) and/or do not cover the entire support of $Y;$ see also the discussion in Huber and Mellace (2011).
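To illustrate the two inequalities in (15) and the redundancy argument, the following sketch computes sample analogs of $\Pr (Y\in A,S=1|Z=z)$ on non-overlapping bins and reports whether each inequality holds for the point estimates. It is not a statistical test and not the authors' implementation. The function name `check_constraints`, the bin edges, and the toy data are hypothetical choices made for this example.

```python
import numpy as np

def check_constraints(y, s, z, edges):
    """Evaluate the two inequalities in (15) at the point estimates.

    y: outcomes (values for s == 0 are ignored), s: selection indicator,
    z: binary instrument, edges: bin edges defining subsets A = [lo, hi).
    Returns (P_{1|1}, P_{1|0}, list of (lower_ok, upper_ok) per bin).
    """
    y, s, z = (np.asarray(a) for a in (y, s, z))
    p11 = s[z == 1].mean()  # sample analog of Pr(S=1|Z=1)
    p10 = s[z == 0].mean()  # sample analog of Pr(S=1|Z=0)
    checks = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_a = (s == 1) & (y >= lo) & (y < hi)
        j1 = in_a[z == 1].mean()  # Pr(Y in A, S=1 | Z=1)
        j0 = in_a[z == 0].mean()  # Pr(Y in A, S=1 | Z=0)
        lower_ok = bool(j1 - (p11 - p10) <= j0)  # first inequality in (15)
        upper_ok = bool(j0 <= j1)                # second inequality, cf. (13)
        checks.append((lower_ok, upper_ok))
    return p11, p10, checks

# Toy data: 10 observations per instrument value, 8 selected under Z=1
# and 5 selected under Z=0 (all numbers are made up for illustration).
z = np.repeat([1, 0], 10)
s = np.array([1] * 8 + [0] * 2 + [1] * 5 + [0] * 5)
edges = [0.0, 1.0, 2.0]

y_ok = np.array([0.5] * 3 + [1.5] * 5 + [0.0] * 2
                + [0.5] * 2 + [1.5] * 3 + [0.0] * 5)
y_bad = np.array([0.5] * 2 + [1.5] * 6 + [0.0] * 2
                 + [0.5] * 5 + [0.0] * 5)

res_ok = check_constraints(y_ok, s, z, edges)
res_bad = check_constraints(y_bad, s, z, edges)
```

In the second toy sample the bins partition the support, and the violation of the first inequality in the upper bin comes together with a violation of the second inequality in the lower bin, in line with the redundancy argument above.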

1.2 Chen and Szroeter’s test algorithm

This section provides the algorithm of the Chen and Szroeter (2012) test when testing the constraints on the mean outcome given in (12), but testing the probability constraints in (8) is analogous. Let $\hat{\theta }$ denote the sample analog of $\theta =(\theta ^m_{1},\theta ^m_{2})'.$ The algorithm can be sketched as follows:

1. Estimate the vector of parameters $\hat{\theta }$ and the asymptotic variance $\hat{J}$ of $\sqrt{n}\cdot (\hat{\theta }-\theta ).$
2. Let $\hat{\eta }_i=1/\sqrt{\hat{J}_{i}}, \ i=1,2,$ where $\hat{J}_{i}$ is the $i$th element of the main diagonal of $\hat{J},$ and compute the smoothing function $\hat{\Psi }_i(\delta _n^{-1}\cdot \hat{\eta }_i\cdot \hat{\theta }_i)=\Phi (\delta _n^{-1}\cdot \hat{\eta }_i\cdot \hat{\theta }_i),$ where $\Phi $ is the standard normal cdf and the tuning parameter $\delta _n$ is a sequence satisfying $\delta _n\rightarrow 0$ and $\sqrt{n}\cdot \delta _n\rightarrow \infty $ as $n\rightarrow \infty .$ In the applications, we choose $\delta _n=\sqrt{\frac{2\cdot \ln (\ln (n))}{n}}\cdot \hat{\sigma }_{\theta _i},$ where $\hat{\sigma }_{\theta _i}$ is the estimated standard deviation of the $i$th inequality constraint.
3. Compute the approximation term $\hat{\Lambda }_i=\phi (\delta _n^{-1}\cdot \hat{\eta }_i\cdot \hat{\theta }_i)\cdot \frac{1}{\delta _n\cdot \sqrt{n}}, \quad i=1,2,$ with $\phi $ being the standard normal pdf.
4. Define the vectors $\hat{\Psi }=\left( \hat{\Psi }_1(\delta _n^{-1}\cdot \hat{\eta }_1\cdot \hat{\theta }_1), \hat{\Psi }_2(\delta _n^{-1}\cdot \hat{\eta }_2\cdot \hat{\theta }_2)\right) ^T,\,\hat{\Lambda }=\left( \hat{\Lambda }_1, \hat{\Lambda }_2\right) ^T,\,\iota _2=(1,1)^T,$ and the matrix $\hat{\Delta }=\operatorname{diag}(\hat{J}_1, \hat{J}_2).$
5. Let $\hat{Q}_1=\sqrt{n}\cdot \hat{\Psi }^T\hat{\Delta }\hat{\theta }-\iota _2^T\hat{\Lambda }$ and $\hat{Q}_2=\sqrt{\hat{\Psi }^T\hat{\Delta }\hat{J}\hat{\Delta }\hat{\Psi }}.$
6. Compute the p-value as $\hat{p}=\left\{ \begin{array}{cc} 1-\Phi \left( \frac{\hat{Q}_1}{\hat{Q}_2}\right) & \hbox { if }\hat{Q}_2>0\\ 1 & \hbox { if }\hat{Q}_2=0. \end{array}\right. $
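The six steps can be sketched in a few lines of Python. This is a minimal illustration that follows the steps as printed here, not the authors' code and not a verified implementation of Chen and Szroeter (2012). The function names, the argument layout, and the choice to pass $\delta _n$ explicitly (one value per constraint) are assumptions made for this example, and the estimates $\hat{\theta }$ and $\hat{J}$ from step 1 are taken as given inputs.

```python
import math
import numpy as np

def norm_cdf(x):
    """Standard normal cdf (Phi)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def norm_pdf(x):
    """Standard normal pdf (phi)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def delta_rule(n, sigma):
    """Tuning parameter as in step 2: sqrt(2 ln(ln n) / n) * sigma_i."""
    return math.sqrt(2.0 * math.log(math.log(n)) / n) * np.asarray(sigma, float)

def smoothed_indicator_pvalue(theta_hat, J_hat, n, delta_n):
    """p-value from steps 2-6, given theta_hat and J_hat from step 1."""
    theta_hat = np.asarray(theta_hat, dtype=float)
    J_hat = np.asarray(J_hat, dtype=float)
    delta_n = np.broadcast_to(np.asarray(delta_n, dtype=float), theta_hat.shape)
    J_diag = np.diag(J_hat)
    eta = 1.0 / np.sqrt(J_diag)                          # step 2
    arg = eta * theta_hat / delta_n
    psi = np.array([norm_cdf(a) for a in arg])           # smoothing function
    lam = np.array([norm_pdf(a) for a in arg]) / (delta_n * math.sqrt(n))  # step 3
    delta_mat = np.diag(J_diag)                          # step 4, as printed
    q1 = math.sqrt(n) * (psi @ delta_mat @ theta_hat) - lam.sum()  # step 5
    q2 = math.sqrt(max(psi @ delta_mat @ J_hat @ delta_mat @ psi, 0.0))
    if q2 > 0:                                           # step 6
        return 1.0 - norm_cdf(q1 / q2)
    return 1.0

# Illustrative inputs only: identity variance matrix and n = 1000.
n = 1000
J = np.eye(2)
delta = delta_rule(n, sigma=[1.0, 1.0])
p_slack = smoothed_indicator_pvalue([-5.0, -5.0], J, n, delta)
p_binding = smoothed_indicator_pvalue([0.0, 0.0], J, n, delta)
p_violated = smoothed_indicator_pvalue([5.0, 5.0], J, n, delta)
```

With these made-up inputs, estimates far below zero make the smoothed indicators vanish, so that $\hat{Q}_2=0$ and the p-value is 1, while estimates far above zero give a p-value close to zero.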

About this article

Cite this article

Huber, M., Mellace, G. Testing exclusion restrictions and additive separability in sample selection models. Empir Econ 47, 75–92 (2014). https://doi.org/10.1007/s00181-013-0742-1

Received: 08 November 2011
Accepted: 27 June 2013
Published: 14 September 2013
Issue Date: August 2014
DOI: https://doi.org/10.1007/s00181-013-0742-1
