From Euclidean distances to APC models

Quality & Quantity

Abstract

In this paper we show that a recently developed method for the study of “cultural” differences, called DBS-EM (Distance Between Strata estimated with the Expectation Maximization, or EM, algorithm), can also be used to circumvent the difficulties posed by APC (Age, Period, Cohort) models. The DBS-EM method produces an original measure of the distance (the dependent variable) between any two subsets of observations (strata) within a sample, where the stratification variables can be interpreted as regressors. When these stratification variables are age, period, and cohort, the result is an APC model that nonetheless proves immune to the “intrinsic collinearity problem” (C = P−A), albeit with a few limitations, which are discussed in the article. In our application to Italian data for the years 1993–2013, age and cohort strongly shape cultural consumption, while cohort and period affect political participation, but only up to a point.
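
The collinearity issue mentioned above can be illustrated with a small sketch. It is not the authors' code: the strata grid, the helper `rank`, and the variable names are hypothetical, chosen only to show that a design with age, period and cohort in levels is rank-deficient (because C = P−A), whereas a design built on absolute differences between pairs of strata need not be. The sketch says nothing about the further limitations that the article discusses.

```python
from fractions import Fraction
from itertools import combinations


def rank(rows):
    """Rank of a matrix (list of rows), by exact Gauss-Jordan elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0  # number of pivots found so far
    for col in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue  # no pivot in this column
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(len(m)):
            if i != r and m[i][col] != 0:
                factor = m[i][col] / m[r][col]
                m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r


# Hypothetical strata (assumption): 3 ages x 3 periods, cohort = period - age.
strata = [(a, p, p - a) for a in (20, 30, 40) for p in (1993, 2003, 2013)]

# Classic APC design in levels: intercept, A, P, C.
levels = [[1, a, p, c] for a, p, c in strata]
rank_levels = rank(levels)  # 3: one column is redundant, since C = P - A

# Distance-based design: intercept, |dA|, |dP|, |dC| for each pair of strata.
diffs = [
    [1, abs(a1 - a2), abs(p1 - p2), abs(c1 - c2)]
    for (a1, p1, c1), (a2, p2, c2) in combinations(strata, 2)
]
rank_diffs = rank(diffs)  # 4: no exact linear dependency among the columns

print(rank_levels, rank_diffs)
```

The absolute value is what breaks the exact dependency: the signed differences still satisfy dC = dP − dA, but |dC| is not a linear function of |dA| and |dP|.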

Fig. 1. Source: see Table 1.

Fig. 2. Source: see Table 1; see also Fig. 1.

Fig. 3. Source: see Table 1; see also Table 4.

Fig. 4.

Fig. 5. Source: see Table 1.

Fig. 6. Source: see Table 1.

Fig. 7. Source: see Table 1. Software: gretl ver. 1.10.2.

Fig. 8. Source: see Table 1. Software: gretl ver. 1.10.2.

Notes

  1. See Sect. 4 for a very brief overview of the literature.

  2. Unfortunately, Istat released the 2013 data in a different format: fewer observations, and with the respondents’ age grouped in classes, instead of by single year, with the youngest grouped as “14–17 years” and “18–19 years”. For the sake of comparability over time, we eventually decided to merge these two and keep the age bracket “14 to 19 years” also for the preceding years (1993 to 2008). Therefore, the youngest age group that we actually formed was 14–19, instead of 15–19: this is not fully consistent with the cohort perspective, but our checks indicate that the bias thus introduced is minor. At the other extreme, we stopped at age 64 because older respondents were both rare and, we feared, scarcely comparable with the rest of the group (e.g., in terms of health and memory).

  3. Which, however, is not our case. Cronbach’s alpha, perhaps the most widely used measure of internal consistency of a battery of indicators, considered acceptable for values over .6 and good for values over .7, is .69 for our former set of variables (cultural consumption) and .74 for the latter (socio-political participation). Cronbach’s alpha is \(\alpha = \frac{Z}{Z - 1}\left( 1 - \frac{\sum\nolimits_{z = 1}^{Z} \sigma_{Y_{z}}^{2}}{\sigma_{X}^{2}} \right)\), where the composite variable X is the sum of Z elementary (manifest) variables \(Y_{z}\), and \(\sigma_{X}^{2}\) and \(\sigma_{Y_{z}}^{2}\) are their variances.

  4. To be sure, there are countless alternatives to the Euclidean distance: see, e.g., Cha (2007) and Deza and Deza (2009). In the case of the DBS method discussed here, however, the dimensions are defined in such a way (proportions of units of a given stratum belonging to cluster 1, 2,…, K) that no alternative seems reasonable.

  5. In formal terms, if there are N respondents from S strata and these respondents can be classified in K mutually exclusive, homogeneous clusters, each stratum s can be characterized by the series \([p_{s,1}; p_{s,2}; \ldots; p_{s,k}; \ldots; p_{s,K}]\), where \(p_{s,k}\) is the proportion of respondents from stratum s who belong to cluster k, and where, for all s, \(\sum\nolimits_{k = 1}^{K} p_{s,k} = 1\). Note that this linear constraint reduces the dimensionality of the strata space to (K−1) (see Fig. 1). This has no practical implication for the method, except that it is probably preferable to make sure that K ≥ 3, to better characterize the strata.

  6. Not counting duplications (the distance from i to j coincides with the distance from j to i) and the distance of each dot (stratum) from itself (which is 0, by definition), with S strata there are S(S−1)/2 distances.

  7. With tenfold cross-validation, which is the standard way of measuring the error rate of a learning scheme on a (large) dataset.

  8. The standard here is to require an improvement of at least 5%, which is also the threshold that we adopted for our applications.

  9. The linear constraint discussed in footnote 5 makes it advisable to impose K > 2. This, however, did not prove necessary in our case.

  10. Except that, of course, with more observations, we are more confident that random effects in the proportional distributions of these observations among clusters play only a minor role.

  11. We remind our readers that, in this approach, the Euclidean distance is a measure of the cultural distance between subgroups with respect to the specific latent variable considered.

  12. Absolute value of the average distance, in years, between the members of the various age classes. In practice, in this case, this age distance can only be 5, 10, 15,…, 45 years.

  13. For an overview of the methodology up to the nineties, see Wilmoth (1990, 2001). Methodological advances have been proposed in recent years: e.g., the Intrinsic Estimator (Yang et al. 2008; Masters et al. 2014), which has, however, raised controversy (te Grotenhuis et al. 2016; Masters et al. 2016), or the Hierarchical Age-Period-Cohort (HAPC) model (Reither et al. 2009), which has also been criticized (Bell and Jones 2014, 2015) and defended (Reither et al. 2015).

  14. In other words, ours is not an APC model, but an A’P’C’ model, where age, period and cohort are treated as distances (or absolute differences).

  15. As we did in Sect. 3.2 for instance (see also Sect. 6).

  16. More precisely, of what remains of these cultural attitudes after transformation into Euclidean distances, along the lines of the DBS-EM method.

  17. With ordinary least squares (OLS) linear regression, one can model the relationship between a set of predictors and the mean of the response variable. By contrast, quantile regression models the relationship between a set of predictors and specific quantiles of the response variable (e.g., the median). The quantile regression parameters thus estimate the change in a specified quantile of the response variable (e.g., the median) produced by a one-unit change in the predictor. If some quantiles are affected by the predictors (age, period, and cohort) significantly more than others, the shape of the distribution of the dependent variable changes, depending on the values of the predictors.

  18. Of course, regression is possible on as many quantiles as one likes. In our case we verified that the results were not qualitatively different in the various cases, and decided to present only the simplest case, with three quantiles.

  19. When the Wald test (with a bootstrap approach, 1000 replications) does not detect any significant discrepancy between quantiles, the hypothesis of a constant relationship between the dependent variable (ED, or Euclidean distance, in this case) and the regressors cannot be rejected.
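
The construction described in notes 5 and 6 can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the cluster labels are taken as given (in the DBS-EM method they would come from the EM algorithm), and the toy records, stratum names and function names are hypothetical.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def stratum_profiles(records, n_clusters):
    """Map each stratum s to [p_s1, ..., p_sK], its cluster proportions.

    `records` is an iterable of (stratum, cluster) pairs, with cluster
    labels in 0..K-1 assumed to be already assigned.
    """
    counts = {}
    for stratum, cluster in records:
        counts.setdefault(stratum, Counter())[cluster] += 1
    profiles = {}
    for stratum, c in counts.items():
        total = sum(c.values())
        profiles[stratum] = [c[k] / total for k in range(n_clusters)]
    return profiles


def pairwise_distances(profiles):
    """Euclidean distance between every unordered pair of strata."""
    return {
        (s, t): sqrt(sum((a - b) ** 2 for a, b in zip(profiles[s], profiles[t])))
        for s, t in combinations(sorted(profiles), 2)
    }


# Hypothetical toy data: 3 strata, K = 3 clusters, 4 respondents per stratum.
records = [
    ("young", 0), ("young", 0), ("young", 1), ("young", 2),
    ("middle", 0), ("middle", 1), ("middle", 1), ("middle", 2),
    ("old", 1), ("old", 2), ("old", 2), ("old", 2),
]

profiles = stratum_profiles(records, n_clusters=3)
distances = pairwise_distances(profiles)

for pair, d in distances.items():
    print(pair, round(d, 4))
```

With S = 3 strata the dictionary holds S(S−1)/2 = 3 distances, and each profile sums to 1, as required by the linear constraint in note 5.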

References

  • Bell, A., Jones, K.: Another ‘futile quest’? A simulation study of Yang and Land’s Hierarchical Age-Period-Cohort model. Demogr. Res. 30, 333–360 (2014)

  • Bell, A., Jones, K.: Should age-period-cohort analysts accept innovation without scrutiny? A response to Reither, Masters, Yang, Powers, Zheng and Land. Soc. Sci. Med. 128, 331–333 (2015)

  • Cha, S.H.: Comprehensive survey on distance/similarity measures between probability density functions. Int. J. Math. Model. Method. Appl. Sci. 1(4), 300–307 (2007)

  • Dempster, A.P., Laird, N.M., Rubin, D.B.: Maximum likelihood from incomplete data via the EM algorithm. J. R. Stat. Soc. Ser. B 39, 1–38 (1977)

  • De Santis, G., Maltagliati, M., Salvini, S.: A measure of the cultural distance between countries. Social Indicators Research, online first (2015). doi: 10.1007/s11205-015-0932-7

  • Deza, M.M., Deza, E.: Encyclopedia of Distances. Springer, New York, p. 94 (2009). http://www.uco.es/users/ma1fegan/Comunes/asignaturas/vision/Encyclopedia-of-distances-2009.pdf

  • Easterlin, R.: What will 1984 be like? Socioeconomic implications of the recent twists in age structure. Demography 15(4), 397–432 (1978)

  • Inglehart, R.: Changing values among western publics from 1970 to 2006. West Eur. Polit. 31(1–2), 130–146 (2008)

  • Koenker, R.: Quantile Regression. Cambridge University Press, Cambridge (2005)

  • Masters, R.K., Hummer, R.A., Powers, D.A., Beck, A., Lin, S.F., Finch, B.K.: Long-term trends in adult mortality for U.S. blacks and whites: an examination of period- and cohort-based changes. Demography 51, 2047–2073 (2014)

  • Masters, R.K., Powers, D.A., Hummer, R.A., Beck, A., Lin, S.F., Finch, B.K.: Fitting age-period-cohort models using the Intrinsic Estimator: assumptions and misapplications. Demography 53(4), 1253–1259 (2016)

  • Mucciardi, M., De Santis, G.: Cultural versus objective distances: the DBS-EM approach. Social Indicators Research, online first (2015). doi: 10.1007/s11205-015-1213-1

  • Reither, E.N., Hauser, R.M., Yang, Y.: Do birth cohorts matter? Age-period-cohort analyses of the obesity epidemic in the United States. Soc. Sci. Med. 69, 1439–1448 (2009)

  • Reither, E.N., Masters, R.K., Yang, Y.C., Powers, D.A., Zheng, H., Land, K.C.: Should age-period-cohort studies return to the methodologies of the 1970s? Soc. Sci. Med. 128, 356–365 (2015)

  • Ryder, N.: The cohort as a concept in the study of social change. Am. Soc. Rev. 30(6), 843–861 (1965)

  • te Grotenhuis, M., Pelzer, B., Luo, L., Schmidt-Catran, A.W.: The Intrinsic Estimator, alternative estimates, and predictions of mortality trends: a comment. Demography 53, 1245–1252 (2016)

  • Wilmoth, J.: Variation in vital rates by age, period, and cohort. Sociol. Methodol. 20, 295–335 (1990)

  • Wilmoth, J.: Les modèles âge-période-cohorte en démographie. In: Démographie: analyse et synthèse, vol. I: La dynamique des populations, pp. 379–397 (2001)

  • Witten, I.H., Frank, E.: Data mining. Carl Hanser, München and Wien (2011)

  • Yang, Y., Schulhofer-Wohl, S., Fu, W.J., Land, K.C.: The Intrinsic Estimator for age-period-cohort analysis: what it is and how to use it. Am. J. Sociol. 113(6), 1697–1736 (2008)

Author information

Corresponding author

Correspondence to Massimo Mucciardi.

Additional information

We thank Ettore Recchi for his comments on an initial draft of this paper, and an anonymous referee. All remaining errors are our own.

About this article

Cite this article

De Santis, G., Mucciardi, M. From Euclidean distances to APC models. Qual Quant 51, 829–846 (2017). https://doi.org/10.1007/s11135-016-0442-y

  • DOI: https://doi.org/10.1007/s11135-016-0442-y
