
Inference in Two-Step Panel Data Models with Time-Invariant Regressors: Bootstrap Versus Analytic Estimators

Festschrift in Honor of Peter Schmidt

Abstract

The primary advantage of panel data is the ability they afford to control for unobserved heterogeneity. The fixed-effects (FE) estimator is by far the most popular technique for exploiting this advantage, but it eliminates any time-invariant regressors in the model along with the unobserved effects. Their partial effects can be recovered easily in a second-step regression of residuals, constructed from the FE estimator, on the group means of the time-invariant variables. In this paper, we reconsider such a two-step estimation procedure, derive its correct asymptotic covariance matrix, and compare conventional inference based on the asymptotic formula to bootstrap alternatives. Bootstrapping has a natural appeal because of the complications associated with estimating the asymptotic covariance matrix and the inherent finite-sample bias of the resulting standard errors. We adapt the pairs and wild bootstrap to our two-step panel-data setup and show that both procedures are unbiased. Using Monte Carlo methods, we compare the error in rejection probability (ERP) of t-tests, measured as the difference between their actual and nominal size, and the power of such tests for the asymptotic and bootstrap estimators. Bootstrap estimators show small ERPs, even with a small number of cross-sectional observations, N, and clearly dominate the asymptotic method on both ERP and power. Bootstrap ERPs drop to nearly zero with N = 250, while those of the asymptotic methods remain substantial. This dominance diminishes but persists until N = 1,000 and the number of time-series observations equals 20, so that inference in smaller samples is problematic using the asymptotic formula.
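The two-step procedure described above can be made concrete on simulated data. The following is an illustrative sketch, not the authors' code; the scalar-regressor design, parameter values, and variable names are all assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
N, T = 200, 5
beta, gamma = 1.0, 2.0

# Simulated panel: x_it is time-varying, z_i time-invariant, and the
# unobserved effect c_i is drawn independently of z_i.
x = rng.normal(size=(N, T))
z = rng.normal(size=N)
c = rng.normal(size=N)
y = beta * x + (gamma * z + c)[:, None] + rng.normal(size=(N, T))

# Step 1: fixed-effects (within) estimation; demeaning over t removes
# both c_i and z_i, so only beta is identified here.
xd = x - x.mean(axis=1, keepdims=True)
yd = y - y.mean(axis=1, keepdims=True)
beta_fe = (xd * yd).sum() / (xd ** 2).sum()

# Step 2: regress group-mean residuals on the time-invariant regressor
# to recover its partial effect.
u_bar = y.mean(axis=1) - x.mean(axis=1) * beta_fe
gamma_fe = (z * u_bar).sum() / (z ** 2).sum()
```

The within transformation plays the role of the demeaning operator here, and `u_bar` collects the residual group means used in the second step.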


Notes

  1.

    Although the model setup assumes a balanced panel, this is not necessary. The asymptotic covariance matrix and bootstrap procedures can readily accommodate settings in which the number of time-series observations varies with the cross-section unit.

  2.

    Atkinson and Cornwell (2013) extend the analysis here to allow some of the elements of \(\mathbf{z}_{i}\) to be correlated with the unobserved effect.

  3.

    As discussed in Atkinson and Cornwell (2013), allowing some of the elements of \(\mathbf{z}_{i}\) to be correlated with the unobserved effect leads to the two-step “simple, consistent” instrumental variables estimator of Hausman and Taylor (1981). From this perspective, our two-step estimator can be viewed as an instrumental variables estimator using \([\mathbf{Q}_{T}\mathbf{X}_{i}, (\mathbf{j}_{T} \otimes \mathbf{z}_{i})]\) as instruments.

  4.

    Also, see Kapetanios (2008), who shows that if the data do not exhibit cross-sectional dependence but exhibit temporal dependence, then cross-sectional resampling is superior to block bootstrap resampling. Further, he shows that cross-sectional resampling provides asymptotic refinements. Monte Carlo results using these assumptions indicate the superiority of the cross-sectional method.

  5.

    Further, this transformation is needed to obtain a heteroskedasticity-consistent covariance matrix, as explained above.

  6.

    As indicated above, Davidson and Flachaire (2008) find that many other factors in addition to bias, especially heteroskedasticity, can increase the ERP of bootstrap and asymptotic estimators.

  7.

    See Davidson and MacKinnon (2006a,b) for further discussion.

References

  • Agee MD, Atkinson SE, Crocker TD (2009) Multi-input and multi-output estimation of the efficiency of child health production: a two-step panel data approach. South Econ J 75:909–927

  • Agee MD, Atkinson SE, Crocker TD (2012) Child maturation, time-invariant, and time-varying inputs: their interaction in the production of human capital. J Product Anal 35:29–44

  • Arellano M (1987) Computing robust standard errors for within-groups estimators. Oxf Bull Econ Stat 49:431–434

  • Atkinson SE, Cornwell C (2013) Inference in two-step panel data models with weak instruments and time-invariant regressors: bootstrap versus analytic estimators. Working paper, University of Georgia

  • Breusch T, Ward MB, Nguyen HTM, Kompas T (2011) On the fixed-effects vector decomposition. Polit Anal 19:123–134

  • Cameron AC, Trivedi PK (2005) Microeconometrics: methods and applications. Cambridge University Press, New York

  • Davidson R, Flachaire E (2008) The wild bootstrap, tamed at last. J Econom 146:162–169

  • Davidson R, MacKinnon JG (1999) Size distortion of bootstrap tests. Econom Theory 15:361–376

  • Davidson R, MacKinnon JG (2006a) The power of bootstrap and asymptotic tests. J Econom 133:421–441

  • Davidson R, MacKinnon JG (2006b) Bootstrap methods in econometrics. In: Mills TC, Patterson KD (eds) Palgrave handbook of econometrics, vol 1: econometric theory, chap 23. Palgrave Macmillan, Basingstoke, pp 812–838

  • Flachaire E (2005) Bootstrapping heteroskedastic regression models: wild bootstrap vs. pairs bootstrap. Comput Stat Data Anal 49:361–376

  • Greene WH (2011) Fixed effects vector decomposition: a magical solution to the problem of time-invariant variables in fixed effects models? Polit Anal 19:135–146

  • Hausman JA, Taylor W (1981) Panel data and unobservable individual effects. Econometrica 49:1377–1399

  • Horowitz J (2001) The bootstrap. In: Heckman JJ, Leamer E (eds) Handbook of econometrics, vol 5, chap 52. North-Holland, New York

  • Kapetanios G (2008) A bootstrap procedure for panel data sets with many cross-sectional units. Econom J 11:377–395

  • Knack S (1993) The voter participation effects of selecting jurors from registration. J Law Econ 36:99–114

  • MacKinnon JG (2002) Bootstrap inference in econometrics. Can J Econ 35:615–645

  • Murphy KM, Topel RH (1985) Estimation and inference in two-step econometric models. J Bus Econ Stat 3:88–97

  • Plümper T, Troeger VE (2007) Efficient estimation of time-invariant and rarely changing variables in finite sample panel analysis with unit fixed effects. Polit Anal 15:124–139

  • Wooldridge JM (2010) Econometric analysis of cross section and panel data, 2nd edn. MIT Press, Cambridge



Corresponding author

Correspondence to Scott E. Atkinson.


Appendix

5.1.1 Unbiasedness of the Wild First-Step Estimator, \(\hat{\beta }_{FE}^{w}\)

Lemma 1:

Since \(\epsilon _{i}\) is drawn independently and \(E(\epsilon _{i}) = 0\), \(E(\boldsymbol{\xi }_{i}^{w}\vert \mathbf{X}_{i}) = 0.\)

Proof of Lemma 1: From (5.17), \(\boldsymbol{\xi }_{i}^{w} =\boldsymbol{\hat{\xi }}_{i}\epsilon _{i}\vartheta\). Thus, \(E(\boldsymbol{\xi }_{i}^{w}\vert \mathbf{X}_{i}) = E(\boldsymbol{\hat{\xi }}_{i}\epsilon _{i}\vartheta \vert \mathbf{X}_{i}) = E(\boldsymbol{\hat{\xi }}_{i}\vert \mathbf{X}_{i})\vartheta E(\epsilon _{i}\vert \mathbf{X}_{i}) = E(\boldsymbol{\hat{\xi }}_{i}\vert \mathbf{X}_{i})\vartheta E(\epsilon _{i}) = 0,\) since \(\epsilon _{i}\) is independent of \(\boldsymbol{\hat{\xi }}_{i}\) and \(\mathbf{X}_{i}\), and in addition \(E(\epsilon _{i}) = 0\) by definition in (5.15).

Theorem 1:

Given the FE conditional-mean assumption in (5.3) and Lemma 1, the wild bootstrap first-step estimator \(\hat{\beta }_{FE}^{w}\) is unbiased for \(\hat{\beta }_{FE}\) .

Proof of Theorem 1: Writing the vector form of (5.16) as \(\mathbf{y}_{i}^{w} = \mathbf{X}_{i}\hat{\beta }_{FE} + \mathbf{z}_{i}\hat{\gamma }_{FE} +\boldsymbol{\xi }_{ i}^{w}\) and substituting into (5.18), the first-step wild estimator can be written as

$$\displaystyle{ \hat{\beta }_{FE}^{w} = \hat{\beta }_{FE} + \biggl (\sum _{i}\mathbf{X}_{i}^{\prime}\mathbf{Q}_{T}\mathbf{X}_{i}\biggr )^{-1}\sum _{i}\mathbf{X}_{i}^{\prime}\mathbf{Q}_{T}\boldsymbol{\xi }_{i}^{w}, }$$
(5.33)

where \(\boldsymbol{\xi }_{i}^{w}\) is a (T × 1) vector. Then

$$\displaystyle{ E(\hat{\beta }_{FE}^{w}\vert \mathbf{X}_{i}) = \hat{\beta }_{FE} + \biggl (\sum _{i}\mathbf{X}_{i}^{\prime}\mathbf{Q}_{T}\mathbf{X}_{i}\biggr )^{-1}\sum _{i}\mathbf{X}_{i}^{\prime}\mathbf{Q}_{T}E(\boldsymbol{\xi }_{i}^{w}\vert \mathbf{X}_{i}) = \hat{\beta }_{FE}, }$$
(5.34)

using Lemma 1. Further, \(E[E(\hat{\beta }_{FE}^{w}\vert \mathbf{X}_{i})] = E(\hat{\beta }_{FE}^{w}) = \hat{\beta }_{FE}\).
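The unbiasedness claim of Theorem 1 can be illustrated numerically. The sketch below is a toy, not the authors' implementation: it assumes Rademacher draws for \(\epsilon _{i}\) (one per cross-section unit) and \(\vartheta = 1\), with an invented design. Averaging the wild first-step estimates over many draws should land on \(\hat{\beta }_{FE}\):

```python
import numpy as np

rng = np.random.default_rng(1)
N, T, B = 100, 4, 500

# Toy panel: y_it = 1.5*x_it + c_i + e_it (all design values illustrative).
x = rng.normal(size=(N, T))
y = 1.5 * x + rng.normal(size=(N, 1)) + rng.normal(size=(N, T))

def within_beta(x, y):
    # Within (FE) slope; demeaning each unit's block applies Q_T.
    xd = x - x.mean(axis=1, keepdims=True)
    yd = y - y.mean(axis=1, keepdims=True)
    return (xd * yd).sum() / (xd ** 2).sum()

beta_fe = within_beta(x, y)
xi_hat = y - beta_fe * x          # first-step residual vector for each unit

draws = []
for _ in range(B):
    eps = rng.choice([-1.0, 1.0], size=(N, 1))   # one Rademacher draw per unit
    y_w = beta_fe * x + xi_hat * eps             # wild resample, cf. (5.16)-(5.17)
    draws.append(within_beta(x, y_w))

beta_w_mean = np.mean(draws)      # should sit very close to beta_fe
```

Drawing one multiplier per unit, rather than per observation, preserves any within-unit dependence in the residual vector, which is the point of the cross-sectional wild scheme here.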

5.1.2 Unbiasedness of the Wild Second-Step Estimator, \(\hat{\gamma }_{FE}^{w}\)

To show that the second-step wild estimator is unbiased, we substitute (5.19) into (5.20) to obtain

$$\displaystyle{ u_{i}^{w} =\bar{ y}_{ i}^{w} -\bar{\mathbf{x}}_{ i}\hat{\beta }_{FE}^{w} -\mathbf{z}_{ i}\hat{\gamma }_{FE}. }$$
(5.35)

Now average (5.16) over t to obtain

$$\displaystyle{ \bar{y}_{i}^{w} =\bar{ \mathbf{x}}_{ i}\hat{\beta }_{FE} + \mathbf{z}_{i}\hat{\gamma }_{FE} +\bar{\xi }_{ i}^{w} }$$
(5.36)

and substitute (5.36) into (5.35) to yield

$$\displaystyle{ u_{i}^{w} =\bar{ \mathbf{x}}_{ i}(\hat{\beta }_{FE} -\hat{\beta }_{FE}^{w}) +\bar{\xi }_{ i}^{w}. }$$
(5.37)

Lemma 2:

Since \(\epsilon _{i}\) is drawn independently and \(E(\epsilon _{i}) = 0\), \(E(\bar{\xi }_{i}^{w}\vert \mathbf{z}_{i},\bar{\mathbf{x}}_{i}) = 0.\)

Proof of Lemma 2: Use the definition of \(\xi _{it}^{w}\) in (5.17), average over t, and condition on \(\mathbf{z}_{i},\bar{\mathbf{x}}_{i}\). Then use the independence of \(\epsilon _{i}\) from \(\mathbf{z}_{i},\bar{\mathbf{x}}_{i}\), together with \(E(\epsilon _{i}) = 0\).

Theorem 2:

Given Theorem 1 and Lemma 2, the wild second-step estimator, \(\hat{\gamma }_{FE}^{w}\) , is unbiased for \(\hat{\gamma }_{FE}\) .

Proof of Theorem 2:

$$\displaystyle\begin{array}{rcl} E(\hat{\gamma }_{FE}^{w}\vert \mathbf{z}_{i},\bar{\mathbf{x}}_{i})& =& \hat{\gamma }_{FE} + \biggl (\sum _{i}\mathbf{z}_{i}^{\prime}\mathbf{z}_{i}\biggr )^{-1}\biggl (\sum _{i}\mathbf{z}_{i}^{\prime}E(u_{i}^{w}\vert \mathbf{z}_{i},\bar{\mathbf{x}}_{i})\biggr ) \\ & =& \hat{\gamma }_{FE} + \biggl (\sum _{i}\mathbf{z}_{i}^{\prime}\mathbf{z}_{i}\biggr )^{-1}\biggl (\sum _{i}\mathbf{z}_{i}^{\prime}E\{[\bar{\mathbf{x}}_{i}(\hat{\beta }_{FE} -\hat{\beta }_{FE}^{w}) +\bar{\xi }_{i}^{w}]\vert \mathbf{z}_{i},\bar{\mathbf{x}}_{i}\}\biggr ) \\ & =& \hat{\gamma }_{FE}, {}\end{array}$$
(5.38)

after substituting from (5.37) for u i w and then applying Theorem 1 and Lemma 2. Finally, \(E[E(\hat{\gamma }_{FE}^{w}\vert \mathbf{z}_{i},\bar{\mathbf{x}}_{i})] = E(\hat{\gamma }_{FE}^{w}) = \hat{\gamma }_{FE}\).
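Theorem 2 admits the same kind of numerical check as Theorem 1. In this illustrative sketch (again assuming Rademacher \(\epsilon _{i}\), \(\vartheta = 1\), and an invented scalar design), the average of the second-step wild estimates over many draws should land on \(\hat{\gamma }_{FE}\):

```python
import numpy as np

rng = np.random.default_rng(2)
N, T, B = 100, 5, 500

# Toy panel with a time-invariant regressor z_i (all values illustrative).
x = rng.normal(size=(N, T))
z = rng.normal(size=N)
y = 1.0 * x + (2.0 * z + rng.normal(size=N))[:, None] + rng.normal(size=(N, T))

def within_beta(x, y):
    xd = x - x.mean(axis=1, keepdims=True)
    yd = y - y.mean(axis=1, keepdims=True)
    return (xd * yd).sum() / (xd ** 2).sum()

# Original two-step estimates.
beta_fe = within_beta(x, y)
u_bar = y.mean(axis=1) - x.mean(axis=1) * beta_fe
gamma_fe = (z * u_bar).sum() / (z ** 2).sum()

xi_hat = y - beta_fe * x - (gamma_fe * z)[:, None]   # first-step residuals

draws = []
for _ in range(B):
    eps = rng.choice([-1.0, 1.0], size=(N, 1))       # Rademacher weight per unit
    y_w = beta_fe * x + (gamma_fe * z)[:, None] + xi_hat * eps
    b_w = within_beta(x, y_w)                        # first-step wild estimate
    u_w = y_w.mean(axis=1) - x.mean(axis=1) * b_w    # second-step residual, cf. (5.35)
    draws.append((z * u_w).sum() / (z ** 2).sum())   # second-step wild estimate

gamma_w_mean = np.mean(draws)     # should sit very close to gamma_fe
```

Note that each replication re-runs both steps, so the first-step variability \(\hat{\beta }_{FE} -\hat{\beta }_{FE}^{w}\) from (5.37) is carried into the second-step draws, exactly as the proof requires.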

5.1.3 Unbiasedness of the Pairs First-Step Estimator, \(\hat{\beta }_{FE}^{p}\)

To show the unbiasedness of the pairs first-step estimator, we need (5.3).

Lemma 3:

Given (5.3), \(E(\mathbf{Q}_{T}\boldsymbol{\hat{\xi }}_{i}\vert v_{i},\mathbf{X}_{i}) = 0\) .

Proof of Lemma 3: First,

$$\displaystyle\begin{array}{rcl} E(\mathbf{Q}_{T}\boldsymbol{\hat{\xi }}_{i}\vert v_{i},\mathbf{X}_{i})& =& E(\mathbf{M}_{i}\mathbf{Q}_{T}\boldsymbol{\xi }_{i}\vert v_{i},\mathbf{X}_{i}) \\ & =& E(\mathbf{Q}_{T}\boldsymbol{\xi }_{i}\vert v_{i},\mathbf{X}_{i}) -\mathbf{Q}_{T}\mathbf{X}_{i}\biggl (\sum _{i}\mathbf{X}_{i}^{\prime}\mathbf{Q}_{T}\mathbf{X}_{i}\biggr )^{-1}\mathbf{X}_{i}^{\prime}\mathbf{Q}_{T}E(\mathbf{Q}_{T}\boldsymbol{\xi }_{i}\vert v_{i},\mathbf{X}_{i}) \\ & =& E\{\mathbf{Q}_{T}[(\mathbf{j}_{T} \otimes c_{i}) + \mathbf{e}_{i}]\vert v_{i},\mathbf{X}_{i}\} \\ & & -\,\mathbf{Q}_{T}\mathbf{X}_{i}\biggl (\sum _{i}\mathbf{X}_{i}^{\prime}\mathbf{Q}_{T}\mathbf{X}_{i}\biggr )^{-1}\mathbf{X}_{i}^{\prime}\mathbf{Q}_{T}E\{\mathbf{Q}_{T}[(\mathbf{j}_{T} \otimes c_{i}) + \mathbf{e}_{i}]\vert v_{i},\mathbf{X}_{i}\},\quad {}\end{array}$$
(5.39)

using \(\mathbf{M}_{i} = \mathbf{I}_{T} -\mathbf{Q}_{T}\mathbf{X}_{i}\bigl (\sum _{i}\mathbf{X}_{i}^{\prime}\mathbf{Q}_{T}\mathbf{X}_{i}\bigr )^{-1}\mathbf{X}_{i}^{\prime}\mathbf{Q}_{T}\) and \(\boldsymbol{\xi }_{i} = (\mathbf{j}_{T} \otimes c_{i}) + \mathbf{e}_{i}\). Then, using (5.3) and the fact that \(\mathbf{Q}_{T}\) eliminates \(c_{i}\) completes the proof.

Theorem 3:

Given Lemma 3, the bootstrap pairs first-step estimator, \(\hat{\beta }_{FE}^{p},\) is unbiased for \(\hat{\beta }_{FE}.\)

Proof of Theorem 3: Substitute \(\mathbf{Q}_{T}\mathbf{y}_{i}\) in (5.28) and take expectations. Then

$$\displaystyle\begin{array}{rcl} E(\hat{\beta }_{FE}^{p}\vert v_{i},\mathbf{X}_{i})& =& \hat{\beta }_{FE} + \biggl (\sum _{i}v_{i}\mathbf{X}_{i}^{\prime}\mathbf{Q}_{T}\mathbf{X}_{i}\biggr )^{-1}\sum _{i}v_{i}\mathbf{X}_{i}^{\prime}\mathbf{Q}_{T}E(\mathbf{Q}_{T}\boldsymbol{\hat{\xi }}_{i}\vert v_{i},\mathbf{X}_{i}) \\ & =& \hat{\beta }_{FE}, {}\end{array}$$
(5.40)

using Lemma 3.
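The weights \(v_{i}\) in (5.40) can be read as resampling counts: drawing whole cross-section units with replacement and re-estimating is numerically identical to count-weighting the original units. The sketch below is illustrative, not the authors' code, and uses an invented design:

```python
import numpy as np

rng = np.random.default_rng(3)
N, T = 50, 4

# Toy panel (illustrative design).
x = rng.normal(size=(N, T))
y = 0.8 * x + rng.normal(size=(N, 1)) + rng.normal(size=(N, T))

def within_beta_weighted(x, y, v):
    # Within (FE) slope with cross-section weights v_i, as in (5.40).
    xd = x - x.mean(axis=1, keepdims=True)
    yd = y - y.mean(axis=1, keepdims=True)
    return (v[:, None] * xd * yd).sum() / (v[:, None] * xd ** 2).sum()

idx = rng.integers(0, N, size=N)                   # one pairs draw of whole units
v = np.bincount(idx, minlength=N).astype(float)    # v_i = resampling counts

b_resampled = within_beta_weighted(x[idx], y[idx], np.ones(N))
b_weighted = within_beta_weighted(x, y, v)         # identical by construction
```

Because each unit's whole (y_i, X_i) block is kept together, this cross-sectional pairs scheme leaves the within-unit dependence structure intact.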

5.1.4 Unbiasedness of the Pairs Second-Step Estimator, \(\hat{\gamma }_{FE}^{p}\)

Theorem 4:

Given (5.3), (5.4), and Theorem 3, the pairs second-step estimator, \(\hat{\gamma }_{FE}^{p}\) , is unbiased for \(\hat{\gamma }_{FE}\) .

Proof of Theorem 4: Substituting (5.29) into (5.30) we obtain

$$\displaystyle{ \hat{\gamma }_{FE}^{p} = \hat{\gamma }_{FE} + \biggl (\sum _{i}v_{i}\mathbf{z}_{i}^{\prime}\mathbf{z}_{i}\biggr )^{-1}\biggl (\sum _{i}v_{i}\mathbf{z}_{i}^{\prime}\hat{u}_{i}\biggr ). }$$
(5.41)

We can relate \(\hat{u}_{i}\) to u i as follows:

$$\displaystyle{ \hat{u}_{i} = u_{i} -\mathbf{z}_{i}\biggl (\sum _{i}\mathbf{z}_{i}^{\prime}\mathbf{z}_{i}\biggr )^{-1}\sum _{i}\mathbf{z}_{i}^{\prime}u_{i}. }$$
(5.42)

Then substitute (5.42) into (5.41) to obtain

$$\displaystyle{ \hat{\gamma }_{FE}^{p} = \hat{\gamma }_{FE} + \biggl (\sum _{i}v_{i}\mathbf{z}_{i}^{\prime}\mathbf{z}_{i}\biggr )^{-1}\sum _{i}v_{i}\mathbf{z}_{i}^{\prime}u_{i} -\biggl (\sum _{i}\mathbf{z}_{i}^{\prime}\mathbf{z}_{i}\biggr )^{-1}\sum _{i}\mathbf{z}_{i}^{\prime}u_{i}. }$$
(5.43)

Conditioning on \((\mathbf{z}_{i},\bar{\mathbf{x}}_{i},v_{i})\), we use (5.8) and take the expectation of both sides to obtain

$$\displaystyle{ E(u_{i}\vert \mathbf{z}_{i},\bar{\mathbf{x}}_{i},v_{i}) = E(\bar{\xi }_{i}\vert \mathbf{z}_{i},v_{i}) +\bar{ \mathbf{x}}_{i}E[(\hat{\beta }_{FE}-\beta )\vert \mathbf{z}_{i},\bar{\mathbf{x}}_{i},v_{i}]. }$$
(5.44)

The first term on the right-hand side of (5.44) is zero due to (5.3) and (5.4), while \(\hat{\beta }_{FE}\) is unbiased for β from Theorem 3.


Copyright information

© 2014 Springer Science+Business Media New York

About this chapter

Cite this chapter

Atkinson, S.E., Cornwell, C. (2014). Inference in Two-Step Panel Data Models with Time-Invariant Regressors: Bootstrap Versus Analytic Estimators. In: Sickles, R., Horrace, W. (eds) Festschrift in Honor of Peter Schmidt. Springer, New York, NY. https://doi.org/10.1007/978-1-4899-8008-3_5
