Abstract
Higher education researchers have applied increasingly sophisticated regression techniques to the study of many important issues. Historically, the statistical workhorse of this work has been linear regression, which has several desirable properties for analyzing continuous outcomes, and under certain assumptions yields unbiased coefficient estimates. However, several outcomes of interest to higher education scholars are categorical or limited in the values they assume, and using linear regression to study them may violate important assumptions. Herein we provide an overview of regression techniques often employed when studying categorical or limited dependent variables. We begin by discussing the modeling of binary outcomes, which are often studied using linear probability, logistic, or probit models. We then consider dependent variables with multiple categories, modeled using ordinal and multinomial regression methods. We also discuss the use of models for other limited dependent variables, including counts, fractions, and censored or truncated outcomes. Throughout the chapter, we apply these techniques to the study of students’ college choice using a relatively new data set available from the National Center for Education Statistics.
Notes
- 1.
Although we utilized a restricted version of the HSLS data, there is also a publicly available version (see https://nces.ed.gov/edat/).
- 2.
Defined as two- or four-year college enrollment as of November of 2013.
- 3.
For studies with few observations (e.g., fewer than 100), use exact logistic regression (Mehta & Patel, 1995).
- 4.
Our students were nested within high schools, which might suggest adjusting the standard errors for within-school heterogeneity through the use of a Stata option. However, there is a tradeoff. As Long and Freese (2014) discuss, once robust standard errors are used, maximum likelihood is no longer an appropriate estimator. After comparing our model with and without school-level clustered standard errors, we found little difference in our findings and decided to proceed without the robust errors. Models that include robust standard errors should rely on the Wald test rather than the likelihood ratio test (Sribney, n.d.).
- 5.
−2 × [(−6988) − (−5533)] = 2910.
- 6.
If using survey data, Archer and Lemeshow (2006) argue one should account for survey sampling design to calculate goodness-of-fit using the Stata command .
- 7.
Stata will automatically output odds ratios instead of raw coefficients by using the command, or one can obtain odds ratios by invoking the option when using the command.
- 8.
A logistic regression model with college enrollment as the outcome and gender as the only covariate will confirm that the odds ratio is indeed 1.41 (p < 0.001).
- 9.
In other words, there is a built-in nonlinearity to the relationship between each covariate and the outcome. However, even with this nonlinearity imposed by the functional form, researchers still need to consider whether any higher-order terms (i.e., polynomials) of the covariates are appropriate to account for nonlinear relationships in the logit (or log-odds).
- 10.
Computed by adding the option when using the Stata command.
- 11.
A formal statistical test can be applied to test the difference between two probabilities using and . For more, see Long and Freese (2014).
- 12.
The default classification threshold in Stata is a probability of 0.5 – observations with probabilities above 0.5 are classified as 1; 0 otherwise.
- 13.
To be clear, there is no absolute definition of an area under the curve value that indicates a “good fit,” only rule-of-thumb ranges: 0.5 indicates no discrimination (no better than chance); 0.5 to 0.7 is considered poor; 0.7 to 0.8 is acceptable; 0.8 to 0.9 is excellent; and greater than 0.9 is outstanding (Hosmer et al., 2013).
- 14.
An additional approach that is not discussed here but may be of use to higher education researchers is the sequential logit, which models events that individuals experience in sequence, for example course-taking (Algebra I, Algebra II, Pre-Calculus), admission stages (application, admission, enrollment), or tenure-track faculty positions (Assistant, Associate, Full).
- 15.
For a graphical representation of the cut points, see Long (1997).
- 16.
See Long (1997) for the derivation of the parallel regression assumption.
- 17.
The generalized ordered logit model does not assume that the \( \widehat{\beta} \)’s are equal. See Long and Freese (2014).
- 18.
For brevity, we do not include hypothesis tests of the ordered logit or probit, but refer readers to Long and Freese’s (2014) overview.
- 19.
For more on generalized ordered logit models, see Long and Freese (2014).
- 20.
Marginal effects are useful here in comparing across models.
- 21.
This is why this model is often called the “cumulative logit model.”
- 22.
Researchers should be careful to distinguish between the risk ratio and odds ratio, as they are not interchangeable terms. In particular, odds ratios and risk ratios are most dissimilar in the middle of a distribution (Menard, 2010). Only when J = 2 are the relative risk ratio and odds ratio equal. For a clear explanation, see https://www.stata.com/statalist/archive/2005-04/msg00678.html
- 23.
The IIA also applies to the conditional logit (not discussed here).
- 24.
For an explanation of odds and risk ratios/relative risks see: http://www.theanalysisfactor.com/the-difference-between-relative-risk-and-odds-ratios/
- 25.
Only the Wald test works when using robust standard errors or survey commands. See Long and Freese (2014) for a discussion of tradeoffs between the Wald and likelihood ratio tests.
- 26.
Note these are increases in probability, rather than odds.
- 27.
If overdispersion is the result of excess zeros, a zero-inflated Poisson model may be preferable over negative binomial regression (Long, 1997).
- 28.
Recall that if we were concerned about the 14% of students who do not apply to college (and thus have a count of zero), or if we wanted to understand the decision not to apply to college separately, we could estimate a zero-inflated count model.
- 29.
This may seem like a large number of goodness of fit tests to run, but in Stata the user-written command provides all of these results simultaneously.
- 30.
One could also use heteroskedastic probit to model the variance rather than the mean of a proportional outcome.
- 31.
Tobit models also work for censoring from above, such as when data from surveys top-code variables like income for privacy reasons.
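Several of the notes above rest on small calculations that are easy to check by hand. The following is a minimal Python sketch, not the authors' code, of three of them: the likelihood ratio statistic in note 5, the 0.5 classification rule in note 12, and the gap between risk ratios and odds ratios in note 22. It assumes that −6988 is the log-likelihood of the intercept-only model and −5533 that of the full model (the only ordering that yields a non-negative statistic). The function names and the probabilities used for the ratio comparison are hypothetical and chosen only for illustration.

```python
def lr_statistic(ll_null: float, ll_full: float) -> float:
    """Likelihood ratio chi-square: -2 * (lnL_null - lnL_full)."""
    return -2.0 * (ll_null - ll_full)


def classify(prob: float, threshold: float = 0.5) -> int:
    """Note 12: probabilities above the threshold are classified as 1, else 0."""
    return 1 if prob > threshold else 0


def odds(p: float) -> float:
    """Convert a probability strictly between 0 and 1 into odds."""
    return p / (1.0 - p)


def risk_ratio(p_group: float, p_ref: float) -> float:
    """Ratio of two probabilities (relative risk)."""
    return p_group / p_ref


def odds_ratio(p_group: float, p_ref: float) -> float:
    """Ratio of two odds."""
    return odds(p_group) / odds(p_ref)


# Note 5: log-likelihoods assumed to be null = -6988, full = -5533.
print(lr_statistic(-6988.0, -5533.0))  # 2910.0

# Note 22: hypothetical probabilities, one pair near the middle of the
# distribution and one pair for a rare outcome.
for p_ref, p_group in [(0.50, 0.60), (0.01, 0.02)]:
    rr = risk_ratio(p_group, p_ref)
    orr = odds_ratio(p_group, p_ref)
    print(f"p_ref={p_ref:.2f} p_group={p_group:.2f} RR={rr:.3f} OR={orr:.3f}")
```

In this illustration the two ratios nearly coincide for the rare outcome and diverge when the probabilities are near 0.5, which is the pattern note 22 warns about.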
References
Addo, F. R., Houle, J. N., & Simon, D. (2016). Young, black, and (still) in the red: Parental wealth, race, and student loan debt. Race and Social Problems, 8(1), 64–76. https://doi.org/10.1007/s12552-016-9162-0
Allison, P. D. (2002). Missing data: Quantitative applications in the social sciences. Thousand Oaks, CA: Sage.
Angrist, J. D., & Pischke, J.-S. (2009). Mostly harmless econometrics: An empiricist’s companion. Princeton, NJ: Princeton University Press.
Archer, K. J., & Lemeshow, S. (2006). Goodness-of-fit test for a logistic regression model fitted using survey sample data. The Stata Journal, 6(1), 97–105.
Arcidiacono, P. (2005). Affirmative action in higher education: How do admission and financial aid rules affect future earnings? Econometrica, 73(5), 1477–1524. https://doi.org/10.1111/j.1468-0262.2005.00627.x
Atkins, D. C., & Gallop, R. J. (2007). Rethinking how family researchers model infrequent outcomes: A tutorial on count regression and zero-inflated models. Journal of Family Psychology, 21(4), 726. https://doi.org/10.1037/0893-3200.21.4.726
Austin, J. T., Yaffee, R. A., & Hinkle, D. E. (1992). Logistic regression for research in higher education. In J. C. Smart (Ed.), Higher education: Handbook of theory and research, VIII (pp. 379–410). New York: Agathon Press.
Bahr, P. R. (2008). Does mathematics remediation work?: A comparative analysis of academic attainment among community college students. Research in Higher Education, 49(5), 420–450. https://doi.org/10.1007/s11162-008-9089-4
Bastedo, M. N., & Flaster, A. (2014). Conceptual and methodological problems in research on college undermatch. Educational Researcher, 43(2), 93–99. https://doi.org/10.3102/0013189X14523039
Bastedo, M. N., & Gumport, P. J. (2003). Access to what? Mission differentiation and academic stratification in US public higher education. Higher Education, 46(3), 341–359. https://doi.org/10.1023/A:1025374011204
Bastedo, M. N., & Jaquette, O. (2011). Running in place: Low-income students and the dynamics of higher education stratification. Educational Evaluation and Policy Analysis, 33(3), 318–339. https://doi.org/10.3102/0162373711406718
Baum, C. F. (2008). Stata tip 63: Modeling proportions. Stata Journal, 8(2), 299.
Belasco, A. (2013). Creating college opportunity: School counselors and their influence on postsecondary enrollment. Research in Higher Education, 54(7), 781–804. https://doi.org/10.1007/s11162-013-9297-4
Belasco, A. S., Rosinger, K. O., & Hearn, J. C. (2015). The test-optional movement at America’s selective liberal arts colleges: A boon for equity or something else? Educational Evaluation and Policy Analysis, 37(2), 206–223. https://doi.org/10.3102/0162373714537350
Bielby, R., House, E., Flaster, A., & DesJardins, S. L. (2013). Instrumental variables: Conceptual issues and an application considering high school coursetaking. In M. Paulsen (Ed.), Higher education: Handbook of theory and research, XXVIII (pp. 263–321). Dordrecht, The Netherlands: Springer.
Bielby, R., Posselt, J. R., Jaquette, O., & Bastedo, M. N. (2014). Why are women underrepresented in elite colleges and universities? A non-linear decomposition analysis. Research in Higher Education, 55(8), 735–760. https://doi.org/10.1007/s11162-014-9334-y
Blume, G. H. (2016). Application behavior as a consequential juncture in the take-up of postsecondary education. Doctoral dissertation, University of Washington.
Borooah, V. K. (2002). Logit and probit: Ordered and multinomial models. Thousand Oaks, CA: Sage.
Brasfield, D. W., Harrison, D. E., & McCoy, J. P. (1993). The impact of high school economics on the college principles of economics course. The Journal of Economic Education, 24(2), 99–111. https://doi.org/10.2307/1183159
Cabrera, A. F. (1994). Logistic regression analysis in higher education: An applied perspective. In J. C. Smart (Ed.), Higher education: Handbook of theory and research, X (pp. 225–256). Bronx, NY: Agathon Press.
Cameron, A. C., & Trivedi, P. K. (1998). Regression analysis of count data. New York: Cambridge University Press.
Carnevale, A. P., & Strohl, J. (2013). Separate and unequal: How higher education reinforces the intergenerational reproduction of white racial privilege. Washington, DC: Georgetown University Center on Education and the Workforce.
Carnevale, A. P., & Van der Werf, M. (2017). The 20% solution: Selective colleges can afford to admit more Pell grant recipients. Washington, DC: Georgetown University Center on Education and the Workforce.
Ceja, M. (2001). Understanding the role of parents and siblings as information sources in the college choice process of Chicana students. Journal of College Student Development, 47(1), 87–104. https://doi.org/10.1353/csd.2006.0003
Cha, K.-W., & Weagley, R. O. (2002). Higher education borrowing. Financial Counseling and Planning, 13, 61–74.
Cha, K.-W., Weagley, R. O., & Reynolds, L. (2005). Parental borrowing for dependent children’s higher education. Journal of Family and Economic Issues, 26, 299–321. https://doi.org/10.1007/s10834-005-5900-y
Chen, X., Ender, P., Mitchell, M., & Wells, C. (2003). Regression with Stata. Retrieved from https://stats.idre.ucla.edu/stata/webbooks/reg/chapter2/stata-webbooksregressionwith-statachapter-2-regression-diagnostics/
Cheng, S., & Starks, B. (2002). Racial differences in the effects of significant others on students’ educational expectations. Sociology of Education, 75(4), 306–327. https://doi.org/10.2307/3090281
Chung, A. S. (2012). Choice of for-profit college. Economics of Education Review, 31, 1084–1101. https://doi.org/10.1016/j.econedurev.2012.07.004
Clinedinst, M., Koranteng, A., & Nicola, T. (2015). The state of college admission. Arlington, VA: National Association for College Admission Counseling. Retrieved from: https://indd.adobe.com/view/c555ca95-5bef-44f6-9a9b-6325942ff7cb
Cragg, J. G. (1971). Some statistical models for limited dependent variables with application to the demand for durable goods. Econometrica, 39(5), 829–844. https://doi.org/10.2307/1909582
Cramer, J. S. (2003). The origins and development of the logit model. In J. S. Cramer (Ed.), Logit models from economics and other fields (pp. 149–158). Cambridge, UK: Cambridge University Press.
Cribari-Neto, F., & Zeileis, A. (2010). Beta regression in R. Journal of Statistical Software, 34(2), 1–24. https://doi.org/10.18637/jss.v034.i02
DesJardins, S. L. (2002). An analytic strategy to assist institutional recruitment and marketing efforts. Research in Higher Education, 43(5), 531–553. https://doi.org/10.1023/A:1020162014548
Dey, E. L., & Astin, A. W. (1993). Statistical alternatives for studying college student retention: A comparative analysis of logit, probit, and linear regression. Research in Higher Education, 34(5), 569–581. https://doi.org/10.1007/BF00991920
Doyle, W. (2007). Public opinion, partisan identification, and higher education policy. The Journal of Higher Education, 78(4), 369–401. https://doi.org/10.1080/00221546.2007.11772321
Dynarski, S. M. (2004). Does aid matter? Measuring the effect of student aid on college attendance and completion. The American Economic Review, 93(1), 279–288. https://doi.org/10.1257/000282803321455287
Eagan, K., Lozano, J. B., Hurtado, S., & Case, M. H. (2013). The American freshman: National norms fall 2013. Los Angeles: Higher Education Research Institute, UCLA.
Eagan, M. K., Hurtado, S., Chang, M. J., Garcia, G. A., Herrera, F. A., & Garibay, J. C. (2013). Making a difference in science education: The impact of undergraduate research programs. American Educational Research Journal, 50(4), 683–713. https://doi.org/10.3102/0002831213482038
Eliason, S. R. (1993). Quantitative applications in the social sciences: Maximum likelihood estimation. Thousand Oaks, CA: Sage.
Engberg, M. E., & Allen, D. J. (2011). Uncontrolled destinies: Improving opportunity for low-income students in American higher education. Research in Higher Education, 52(8), 786–807. https://doi.org/10.1007/s11162-011-9222-7
Engberg, M. E., & Gilbert, A. J. (2014). The counseling opportunity structure: Examining correlates of four-year college-going rates. Research in Higher Education, 55(3), 219–244. https://doi.org/10.1007/s11162-013-9309-4
Engberg, M. E., & Wolniak, G. C. (2010). Examining the effects of high school contexts on postsecondary enrollment. Research in Higher Education, 51(2), 132–153. https://doi.org/10.1007/s11162-009-9150-y
Federal Student Aid, U.S. Department of Education. (2016). Official cohort default rates for schools. Washington, DC: Author. Retrieved from https://www2.ed.gov/offices/OSFAP/defaultmanagement/cdr.html
Ferrari, S., & Cribari-Neto, F. (2004). Beta regression for modelling rates and proportions. Journal of Applied Statistics, 31(7), 799–815.
Freeman, K., & Thomas, G. (2008). Black colleges and college choice: Characteristics of students who choose HBCUs. The Review of Higher Education, 25(3), 349–358. https://doi.org/10.1353/rhe.2002.0011
Furquim, F., & Glasener, K. M. (2016). A quest for equity? Measuring the effect of QuestBridge on economic diversity at selective institutions. Research in Higher Education, 58, 646. https://doi.org/10.1007/s11162-016-9443-x
Furquim, F., Glasener, K. M., Oster, M., McCall, B. P., & DesJardins, S. L. (2017). Navigating the financial aid process: Borrowing outcomes among first-generation and non-first generation students. The Annals of the American Academy of Political and Social Science, 671(1), 69–91. https://doi.org/10.1177/0002716217698119
Goldrick-Rab, S. (2006). Following their every move: An investigation of social-class differences in college pathways. Sociology of Education, 79(1), 67–79. https://doi.org/10.1177/003804070607900104
Gonzales, R. G. (2011). Learning to be illegal: Undocumented youth and shifting legal context in the transition to adulthood. American Sociological Review, 76, 602–619. https://doi.org/10.1177/0003122411411901
Gonzalez, J. M., & DesJardins, S. L. (2002). Artificial neural networks: A new approach to predicting application behavior. Research in Higher Education, 43(2), 235–258. https://doi.org/10.1023/A:1014423925000
Greene, W. H. (2002). Econometric analysis (5th ed.). Upper Saddle River, NJ: Prentice Hall.
Hahn, E. D., & Soyer, R. (2005). Probit and logit models: Differences in the multivariate realm. The Journal of the Royal Statistical Society, Series B, 1–12.
Hart, N. K., & Mustafa, S. (2008). What determines the amount students borrow? Revisiting the crisis–convenience debate. Journal of Student Financial Aid, 38(1), 17–39.
Hillman, N. W. (2013). Economic diversity in elite higher education: Do no-loan programs impact Pell enrollments? The Journal of Higher Education, 84(6), 806–833. https://doi.org/10.1353/jhe.2013.0038
Hillman, N. W. (2014). College on credit: A multilevel analysis of student loan default. The Review of Higher Education, 37(2), 169–195. https://doi.org/10.1353/rhe.2014.0011
Horrace, W. C., & Oaxaca, R. L. (2006). Results on the bias and inconsistency of ordinary least squares for the linear probability model. Economics Letters, 90(3), 321–327. https://doi.org/10.1016/j.econlet.2005.08.024
Hosmer, D. W., Lemeshow, S., & Sturdivant, R. X. (2013). Applied logistic regression (3rd ed.). Hoboken, NJ: Wiley.
Howell, J. (2010). Assessing the impact of eliminating affirmative action in higher education. Journal of Labor Economics, 28(1), 113–166. https://doi.org/10.1086/648415
Hurtado, S., Inkelas, K. K., Briggs, C., & Rhee, B. S. (1997). Differences in college access and choice among racial/ethnic groups: Identifying continuing barriers. Research in Higher Education, 38(1), 43–75. https://doi.org/10.1023/A:1024948728792
Hurwitz, M. (2012). The impact of institutional grant aid on college choice. Educational Evaluation and Policy Analysis, 34(3), 344–363. https://doi.org/10.3102/0162373712448957
Ishitani, T. T., & McKitrick, S. A. (2016). Are student loan default rates linked to institutional capacity? Journal of Student Financial Aid, 46(1), 17–37.
Kelchen, R., & Li, A. Y. (2017). Institutional accountability: A comparison of the predictors of student loan repayment and default rates. The Annals of the American Academy of Political and Social Science, 671(1), 202–223. https://doi.org/10.1177/0002716217701681
Kim, J., DesJardins, S., & McCall, B. (2009). Exploring the effects of student expectations about financial aid on postsecondary choice: A focus on income and racial/ethnic differences. Research in Higher Education, 50(8), 741–774. https://doi.org/10.1007/S11162-009-9143-X
Kim, J., Kim, J., DesJardins, S. L., & McCall, B. P. (2015). Completing algebra II in high school: Does it increase college access and success? The Journal of Higher Education, 86(4), 628–662. https://doi.org/10.1353/jhe.2015.0018
Lin, T. F., & Schmidt, P. (1984). A test of the Tobit specification against an alternative suggested by Cragg. The Review of Economics and Statistics, 66(1), 174–177. https://doi.org/10.2307/1924712
Little, R. J., & Rubin, D. B. (2014). Statistical analysis with missing data. Hoboken, NJ: Wiley.
Long, J. S. (1997). Regression models for categorical and limited dependent variables. Thousand Oaks, CA: Sage.
Long, J. S., & Freese, J. (2014). Regression models for categorical dependent variables using Stata (3rd ed.). College Station, TX: Stata Press.
McDonough, P. M. (1994). Buying and selling higher education: The social construction of the college applicant. The Journal of Higher Education, 65(4), 427–446. https://doi.org/10.2307/2943854
McDonough, P. M. (1997). Choosing colleges: How social class and schools structure opportunity. Albany: State University of New York Press.
Mehta, C. R., & Patel, N. R. (1995). Exact logistic regression: Theory and examples. Statistics in Medicine, 14, 2143–2160. https://doi.org/10.1002/sim.4780141908
Menard, S. (2002). Applied logistic regression analysis (2nd ed.). Thousand Oaks, CA: Sage.
Menard, S. (2010). Logistic regression: From introductory to advanced concepts and applications. Thousand Oaks, CA: Sage.
Morrison, E., Rudd, E., Picciano, J., & Nerad, M. (2011). Are you satisfied? PhD education and faculty taste for prestige: Limits of the prestige value system. Research in Higher Education, 52(1), 24–46. https://doi.org/10.1007/s11162-010-9184-1
Myers, S. M., & Myers, C. B. (2012). Are discussions about college between parents and their high school children a college-planning activity? American Journal of Education, 118(3), 281–308. https://doi.org/10.1086/664737
Niu, S. X., & Tienda, M. (2008). Choosing colleges: Identifying and modeling choice sets. Social Science Research, 37(2), 416–433. https://doi.org/10.1016/j.ssresearch.2007.06.015
Norton, E. C., Wang, H., & Ai, C. (2004). Computing interaction effects and standard errors in logit and probit models. The Stata Journal, 4(2), 154–167.
O’Connor, N., Hammack, F. M., & Scott, M. A. (2010). Social capital, financial knowledge, and Hispanic student college choices. Research in Higher Education, 51(3), 195–219. https://doi.org/10.1007/s11162-009-9153-8
Office for Civil Rights, U.S. Department of Education. (2016). Securing equal opportunity: Report to the president and secretary of education. Washington, DC: Author. Retrieved from: https://www2.ed.gov/about/reports/annual/ocr/report-to-president-and-secretary-of-education-2016.pdf
Ospina, R., & Ferrari, S. L. (2012). A general class of zero-or-one inflated beta regression models. Computational Statistics & Data Analysis, 56(6), 1609–1623. https://doi.org/10.1016/j.csda.2011.10.005
Palardy, G. J. (2015). High school socioeconomic composition and college choice: Multilevel mediation via organizational habitus, school practices, peer and staff attitudes. School Effectiveness and School Improvement, 26(3), 329–353. https://doi.org/10.1080/09243453.2014.965182
Pallais, A. (2015). Small differences that matter: Mistakes in applying to college. Journal of Labor Economics, 33(2), 38. https://doi.org/10.1086/678520
Pampel, F. C. (2000). Logistic regression: A primer (Series Number 07-132). Thousand Oaks, CA: Sage.
Papke, L. E., & Wooldridge, J. M. (1996). Econometric methods for fractional response variables with an application to 401(k) plan participation rates. Journal of Applied Econometrics, 11, 619–632. https://doi.org/10.1002/(SICI)1099-1255
Peng, C. Y. J., So, T. S. H., Stage, F. K., & St. John, E. P. (2002). The use and interpretation of logistic regression in higher education journals: 1988–1999. Research in Higher Education, 43(3), 259–293. https://doi.org/10.1023/A:1014858517172
Perna, L. W. (2006). Studying college access and choice: A proposed conceptual model. In J. C. Smart (Ed.), Higher education: Handbook of theory and research, XXI (pp. 99–157). Dordrecht, The Netherlands: Springer.
Perna, L. W., & Titus, M. A. (2004). Understanding differences in the choice of college attended: The role of state public policies. The Review of Higher Education, 27(4), 501–525. https://doi.org/10.1353/rhe.2004.0020
Perna, L. W., & Titus, M. A. (2005). The relationship between parental involvement as social capital and college enrollment: An examination of racial/ethnic group differences. The Journal of Higher Education, 76(5), 485–518. https://doi.org/10.1080/00221546
Porter, S., & Umbach, P. (2006). College major choice: An analysis of person-environment fit. Research in Higher Education, 47(4), 429–449. https://doi.org/10.1007/sl1162-005-9002
Posselt, J. R., Jaquette, O., Bielby, R., & Bastedo, M. N. (2012). Access without equity: Longitudinal analyses of institutional stratification by race and ethnicity, 1972–2004. American Educational Research Journal, 49(6), 1074–1111. https://doi.org/10.3102/0002831212439456
Pryor, J. H., Hurtado, S., Saenz, V. B., Santos, J. L., & Korn, W. S. (2007). The American freshman: Forty year trends. Los Angeles: Higher Education Research Institute, UCLA. Retrieved from http://heri.ucla.edu/PDFs/40TrendsManuscript.pdf
Roderick, M., Coca, V., & Nagaoka, J. (2011). Potholes on the road to college: High school effects in shaping urban students’ participation in college application, four-year college enrollment, and college match. Sociology of Education, 84(3), 178–211. https://doi.org/10.1177/0038040711411280
Rowan-Kenyon, H. T., Bell, A. D., & Perna, L. W. (2008). Contextual influences on parental involvement in college going: Variations by socioeconomic class. The Journal of Higher Education, 79(5), 564–586. https://doi.org/10.1353/jhe.0.0020
Scott, M., Bailey, T., & Kienzl, G. (2006). Relative success? Determinants of college graduation rates in public and private colleges in the US. Research in Higher Education, 47(3), 249–279. https://doi.org/10.1007/s11162-005-9388-y
Scott-Clayton, J. (2011). On money and motivation: A quasi-experimental analysis of financial incentives for college achievement. Journal of Human Resources, 46(3), 614–646. https://doi.org/10.3368/jhr.46.3.614
Smith, J. (2014). The effect of college applications on enrollment. The B.E. Journal of Economic Analysis & Policy, 14(1), 151–188. https://doi.org/10.1515/bejeap-2013-0002
Smith, J., Pender, M., & Howell, J. (2013). The full extent of student-college academic undermatch. Economics of Education Review, 32, 247–261. https://doi.org/10.1016/j.econedurev.2012.11.001
Sribney, W. (n.d.). Why should I not do a likelihood-ratio test after an ML estimation (e.g., logit, probit) with clustering or pweights? Retrieved from http://www.stata.com/support/faqs/statistics/likelihood-ratio-test/
Stratton, L. S., O’Toole, D. M., & Wetzel, J. N. (2007). Are the factors affecting dropout behavior related to initial enrollment intensity for college undergraduates? Research in Higher Education, 48(4), 453–485. https://doi.org/10.1007/s11162-006-9033-4
Taggart, A., & Crisp, G. (2011). The role of discriminatory experiences on Hispanic students’ college choice decisions. Hispanic Journal of Behavioral Sciences, 33(1), 22–38. https://doi.org/10.1177/0739986310386750
Teranishi, R. T., & Briscoe, K. (2008). Contextualizing race: African American college choice in an evolving affirmative action era. The Journal of Negro Education, 77(1), 15–26.
Titus, M. A. (2007). Detecting selection bias, using propensity score matching, and estimating treatment effects: An application to the private returns to a master’s degree. Research in Higher Education, 48(4), 487–521. https://doi.org/10.1007/s11162-006-9034-3
Wells, R. S., Lynch, C. S., & Seifert, T. A. (2011). Methodological options and their implications: An example using secondary data to analyze Latino educational expectations. Research in Higher Education, 52(7), 693–716. https://doi.org/10.1007/s11162-011-9216-5
Winship, C., & Mare, R. D. (1984). Regression models with ordinal variables. American Sociological Review, 49(4), 512–525.
Wooldridge, J. M. (2002). Econometric analysis of cross section and panel data. Cambridge, MA: MIT Press.
Wooldridge, J. M. (2008). Introductory econometrics: A modern approach. Ontario, Canada: Nelson Education.
Appendix
/***********************************************************************************************
These are examples of commands used to estimate the models in the chapter.
The full code is not included here because of space constraints.
***********************************************************************************************/

**Set directories, open data, start log as needed.

*set macro vars (covariate list left blank here, as in the original)
global iv " "

*enrl_college is the outcome variable we created.

**** Goodness of Fit ******
* unconditional model
logit enrl_college, or

*full model
logit enrl_college $iv
estimates store loges
predict loges, pr

*describing the pred probs
predict pprob5
set scheme s2mono
histogram pprob5, title("", color(black) margin(zero) size(small)) ///
    xti("Predicted probability", size(small)) graphregion(color(white)) ///
    plotregion(color(white)) yti("Density", size(small))
summarize pprob5

*examining LR
fitstat

*examining classification
estat classification
lsens, title("", color(black) margin(zero) size(small)) ///
    graphregion(color(white)) plotregion(color(white)) xti(,size(small)) yti(,size(small))
lroc, title("", color(black) margin(zero) size(small)) ///
    graphregion(color(white)) plotregion(color(white)) xti(,size(small)) yti(,size(small))

/******LOGIT*****/
estimates restore loges
margins, dydx(*) post
estimates store loges_me

*graphing
estimates restore loges
margins, dydx(gpa) asobserved at(gpa=(1 (.25) 4))
set scheme s2mono
marginsplot, recastci(rarea) recast(line) ciopts(color(*.7)) ///
    graphregion(color(white)) plotregion(color(white)) ti("") ///
    yti("Change in Pr(Enroll)", size(small)) xti("GPA, 10th grade", size(small))

*look at a few populations of interest
mtable, rowname(1 Female first-gen low-inc ) ci clear ///
    at(student_gender==2 parental_ed==1 family_income==(1 2) ) atmeans

/******PROBIT*****/
probit enrl_college $iv
estimates store probes
predict probes, pr
margins, dydx(*) post
estimates store probes_me

/******LPM *****/
regress enrl_college i.student_gender $dems $acad $expct $netwk $sch
estimates store lpm
predict lpm, xb

*diagnostic of lpm
histogram lpm
set scheme s2mono
histogram lpm, title("", color(black) margin(zero) size(small)) ///
    xti("Predicted probability", size(small)) graphregion(color(white)) ///
    plotregion(color(white)) yti("Density", size(small)) ///
    xline(0 1, lstyle(foreground) lpattern("--"))

*plot residual v fitted
set scheme s2mono
rvfplot, yline(0, lstyle(foreground) lpattern("--")) graphregion(color(white)) ///
    plotregion(color(white)) xline(0 1, lstyle(foreground) lpattern("--")) ///
    xti(, size(small)) yti(, size(small))

*check for heteroskedasticity
estat imtest

************************ORDINAL/MULTINOMIAL***********************
*pse_enroll_sel is the dependent var we created.
ologit pse_enroll_sel i.student_gender $iv, or
estimates store ord

*get some marginal effects
estimates restore ord
margins, dydx(gpa) post
estimates store ord_me

*bring the ologit results back (margins, post replaced them) before predicting and testing
estimates restore ord
predict nocol_log lsel_log sel_log msel_log

*test if we need multinomial
oparallel, ic
brant, detail

***run it as multinomial
mlogit pse_enroll_sel $iv, rrr
estimates store multi

*get a marginal effect
margins, dydx(gpa) post
estimates store multi_me

*tests of IVs
estimates restore multi
mlogtest, lr
estimates restore multi
mlogtest, wald

*Test of categories - can we collapse them?
mlogtest, combine
estimates restore multi
mlogtest, lrcomb
estimates restore multi
mlogtest, hausman

*Interpretation
estimates restore multi
listcoef student_gender student_race_combo stugpa_10 stu_mathirt apcred, gt adjacent

*Pred Probs for select subgroups
estimates restore multi
mtable if student_gender==2 & parental_ed==1 & family_income==1, ///
    atmeans noci rowname(lowinc firstg) clear brief

************************COUNT************************
poisson apps $iv, irr
estimates store pois
estat ic
prcounts pois, max(20) plot
label var poispreq "Poisson"
label var poisobeq "Observed"
label var poisval "# of apps"

nbreg apps $iv, irr
estimates store nb
estat ic
countfit apps $iv, nbreg prm
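The appendix calls margins, dydx(*) after the logit and probit models. For a continuous covariate in a logit model, the quantity being averaged is the derivative of the predicted probability, β_k · p · (1 − p), evaluated at each observation's own covariate values. The following is a minimal Python sketch of that calculation, not the authors' code. The intercept, coefficient, and GPA values are hypothetical and are not estimates from the chapter. Factor variables are handled differently (as discrete differences) and are not covered here.

```python
import math


def logistic(z: float) -> float:
    """Inverse logit: maps a linear index to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def average_marginal_effect(intercept, coefs, rows, k):
    """Average over observations of beta_k * p * (1 - p) for covariate k.

    intercept : float, hypothetical logit intercept
    coefs     : list of floats, hypothetical logit coefficients
    rows      : list of covariate vectors, one per observation
    k         : index of the continuous covariate of interest
    """
    total = 0.0
    for x in rows:
        z = intercept + sum(b * v for b, v in zip(coefs, x))
        p = logistic(z)
        total += coefs[k] * p * (1.0 - p)
    return total / len(rows)


# Hypothetical single-covariate model: logit(p) = -2.0 + 0.8 * gpa
gpa_rows = [[2.0], [2.5], [3.0], [3.5]]
ame = average_marginal_effect(-2.0, [0.8], gpa_rows, k=0)
print(round(ame, 4))
```

Because p · (1 − p) peaks at 0.25 when p = 0.5, the marginal effect on the probability scale can never exceed one quarter of the coefficient, which is why these effects are read as changes in probability rather than in odds.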
Copyright information
© 2018 Springer International Publishing AG, part of Springer Nature
About this chapter
Cite this chapter
Rodriguez, A., Furquim, F., DesJardins, S.L. (2018). Categorical and Limited Dependent Variable Modeling in Higher Education. In: Paulsen, M. (eds) Higher Education: Handbook of Theory and Research. Higher Education: Handbook of Theory and Research, vol 33. Springer, Cham. https://doi.org/10.1007/978-3-319-72490-4_7
DOI: https://doi.org/10.1007/978-3-319-72490-4_7
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-72489-8
Online ISBN: 978-3-319-72490-4
eBook Packages: Education, Education (R0)