Abstract
In a regression or classification setting where we wish to predict Y from x1, x2, ..., xp, we suppose that an additional set of ‘coaching’ variables z1, z2, ..., zm is available in our training sample. These might be variables that are difficult to measure, and they will not be available when we predict Y from x1, x2, ..., xp in the future. We consider two methods of using the coaching variables to improve the prediction of Y from x1, x2, ..., xp. The relative merits of these approaches are discussed and compared in a number of examples.
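To make the setting concrete, the sketch below simulates a training sample in which both the predictors x and the coaching variables z are observed, and then predicts Y for new cases where only x is known. It is an illustrative plug-in baseline and should not be read as either of the two methods the paper studies, which the abstract does not specify. The function names (`fit_with_coaching`, `predict_without_coaching`), the linear models, and the simulated data are all assumptions made for this example.

```python
import numpy as np

def fit_linear(A, b):
    """Least-squares fit of b on A with an intercept column prepended."""
    A1 = np.column_stack([np.ones(len(A)), A])
    coef, *_ = np.linalg.lstsq(A1, b, rcond=None)
    return coef

def predict_linear(coef, A):
    """Apply coefficients from fit_linear to new rows of A."""
    A1 = np.column_stack([np.ones(len(A)), A])
    return A1 @ coef

def fit_with_coaching(X, Z, y):
    """Hypothetical two-stage baseline (not the paper's method):
    stage 1 models Z from X, stage 2 models y from (X, Z)."""
    coef_z = fit_linear(X, Z)                          # Z ~ X
    coef_y = fit_linear(np.column_stack([X, Z]), y)    # y ~ (X, Z)
    return coef_z, coef_y

def predict_without_coaching(model, X_new):
    """Predict y when only X is available: plug in the estimated Z."""
    coef_z, coef_y = model
    Z_hat = predict_linear(coef_z, X_new)
    return predict_linear(coef_y, np.column_stack([X_new, Z_hat]))

# Simulated data (an assumption for illustration): p = 3 predictors,
# m = 2 coaching variables observed only in the training sample.
rng = np.random.default_rng(0)
n, p, m = 200, 3, 2
X = rng.normal(size=(n, p))
Z = X @ rng.normal(size=(p, m)) + 0.1 * rng.normal(size=(n, m))
y = Z.sum(axis=1) + 0.1 * rng.normal(size=n)

model = fit_with_coaching(X, Z, y)

# At prediction time the coaching variables are unavailable.
X_new = rng.normal(size=(5, p))
y_hat = predict_without_coaching(model, X_new)
```

The point of the sketch is only the asymmetry between the two phases: `fit_with_coaching` receives X, Z and y, while `predict_without_coaching` receives X alone.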
Cite this article
TIBSHIRANI, R., HINTON, G. Coaching variables for regression and classification. Statistics and Computing 8, 25–33 (1998). https://doi.org/10.1023/A:1008815025242
DOI: https://doi.org/10.1023/A:1008815025242