From Euclidean distances to APC models

De Santis, Gustavo; Mucciardi, Massimo

doi:10.1007/s11135-016-0442-y

Published: 15 October 2016

Volume 51, pages 829–846, (2017)

Quality & Quantity

Gustavo De Santis¹ &
Massimo Mucciardi²

Abstract

In this paper we show that a recently developed method for the study of “cultural” differences, called DBS-EM, or Distance Between Strata estimated with the EM (Expectation Maximization) algorithm, can also be used to circumvent the difficulties posed by APC (Age, Period, Cohort) models. The DBS-EM method produces an original measure of the distance (dependent variable) between any two subsets of observations (strata) within a sample, where the stratification variables can be interpreted as regressors. When these stratification variables are age, period, and cohort, the result is an APC model that proves immune to the “intrinsic collinearity problem” (C = P−A), albeit with a few limitations, which are discussed in the article. In our application to Italian data over the years 1993–2013, age and cohort strongly shape cultural consumption, while cohort and period affect political participation, but only up to a point.

Notes

See Sect. 4 for a very brief overview of the literature.
Unfortunately, Istat released the 2013 data in a different format: fewer observations, and with the respondents’ age grouped in classes, instead of by single year, with the youngest grouped as “14–17 years” and “18–19 years”. For the sake of comparability over time, we eventually decided to merge these two and keep the age bracket “14 to 19 years” also for the preceding years (1993 to 2008). Therefore, the youngest age group that we actually formed was 14–19, instead of 15–19: this is not fully consistent with the cohort perspective, but our checks indicate that the bias thus introduced is minor. At the other extreme, we stopped at age 64 because older respondents were both rare and, we feared, scarcely comparable with the rest of the group (e.g., in terms of health and memory).
Which, however, is not our case. Cronbach’s alpha, perhaps the most widely used measure of the internal consistency of a battery of indicators, is considered acceptable for values over .6 and good for values over .7; it is .69 for our former set of variables (cultural consumption) and .74 for the latter (socio-political participation). Cronbach’s alpha is \(\alpha = \frac{Z}{Z - 1}\left( 1 - \frac{\sum_{z = 1}^{Z} \sigma_{Y_z}^{2}}{\sigma_{X}^{2}} \right)\), where the composite variable X is the sum of Z elementary (manifest) variables \(Y_z\), and \(\sigma_{X}^{2}\) and \(\sigma_{Y_z}^{2}\) are their variances.
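
As an illustration of the formula for Cronbach’s alpha given in this note, the following minimal sketch computes it with the Python standard library. The function name, the data layout (one list per manifest variable) and the toy answers are assumptions made for the example, not material from the article.

```python
from statistics import pvariance

def cronbach_alpha(items):
    """items: list of Z lists, each holding one manifest variable Y_z
    observed on the same respondents (hypothetical layout)."""
    z = len(items)
    if z < 2:
        raise ValueError("need at least two manifest variables")
    # Composite variable X = sum of the Z manifest variables, per respondent.
    composite = [sum(values) for values in zip(*items)]
    # Sum of the variances of the individual manifest variables.
    sum_item_var = sum(pvariance(y) for y in items)
    return z / (z - 1) * (1 - sum_item_var / pvariance(composite))

# Made-up binary answers (1 = yes, 0 = no) from six respondents to three questions.
toy_items = [
    [1, 1, 0, 0, 1, 0],
    [1, 1, 0, 0, 1, 1],
    [1, 0, 0, 0, 1, 0],
]
alpha = cronbach_alpha(toy_items)
print(round(alpha, 3))
```

Population variances are used here; since alpha depends only on a ratio of variances, using sample variances throughout would give the same value.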
To be sure, there are countless alternatives to the Euclidean distance: see, e.g., Cha (2007) and Deza and Deza (2009). In the case of the DBS method discussed here, however, the dimensions are defined in such a way (proportions of units of a given stratum belonging to cluster 1, 2, …, K) that no alternative seems reasonable.
In formal terms, if there are N respondents from S strata and these respondents can be classified in K mutually exclusive, homogeneous clusters, each stratum s can be characterized by the series \([p_{s,1}; p_{s,2}; \ldots; p_{s,k}; \ldots; p_{s,K}]\), where \(p_{s,k}\) is the proportion of respondents from stratum s who belong to cluster k, and where, for all s, \(\sum_{k = 1}^{K} p_{s,k} = 1\). Note that this linear constraint reduces the dimensionality of the strata space to (K−1) (see Fig. 1), but this has no practical implication for the method, except that it is probably preferable to make sure that K ≥ 3, to better characterize the strata.
Not counting duplications (the distance from i to j coincides with the distance from j to i) and the distance of each dot (stratum) from itself (which is 0, by definition), with S strata there are S(S−1)/2 distances.
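
The two preceding notes can be sketched in code: each stratum is summarized by its proportions across the K clusters, and Euclidean distances are then taken over the S(S−1)/2 unordered pairs of strata. This is only a sketch under stated assumptions: the cluster assignments are taken as given (in the article they come from the EM algorithm, which is not reproduced here), and the function names, stratum labels and toy records are hypothetical.

```python
from collections import Counter
from itertools import combinations
from math import dist

def stratum_profiles(records, n_clusters):
    """records: iterable of (stratum, cluster) pairs, clusters numbered 1..K.
    Returns {stratum: [p_s1, ..., p_sK]}, each list summing to 1."""
    counts = {}
    for stratum, cluster in records:
        counts.setdefault(stratum, Counter())[cluster] += 1
    profiles = {}
    for stratum, counter in counts.items():
        total = sum(counter.values())
        profiles[stratum] = [counter[k] / total for k in range(1, n_clusters + 1)]
    return profiles

def pairwise_distances(profiles):
    """Euclidean distance for every unordered pair of distinct strata."""
    return {
        (a, b): dist(profiles[a], profiles[b])
        for a, b in combinations(sorted(profiles), 2)
    }

# Made-up cluster assignments for S = 3 strata and K = 3 clusters.
toy_records = [
    ("young", 1), ("young", 1), ("young", 2), ("young", 3),
    ("middle", 1), ("middle", 2), ("middle", 2), ("middle", 3),
    ("old", 2), ("old", 3), ("old", 3), ("old", 3),
]
profiles = stratum_profiles(toy_records, n_clusters=3)
distances = pairwise_distances(profiles)
n_strata = len(profiles)
```

With three strata the dictionary holds 3 · 2 / 2 = 3 distances, one per unordered pair; self-distances and mirrored pairs are never generated.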
With tenfold cross-validation, which is the standard way of measuring the error rate of a learning scheme on a (large) dataset.
The standard, here, is to require an improvement of at least 5%, which is also the threshold that we adopted for our applications.
The linear constraint discussed in footnote 2 makes it advisable to impose K > 2. This, however, did not prove necessary in our case.
Except that, of course, with more observations, we are more confident that random effects in the proportional distributions of these observations among clusters play only a minor role.
We remind our readers that, in this approach, the Euclidean distance is a measure of the cultural distance between subgroups with respect to the specific latent variable considered.
Absolute value of the average distance, in years, between the members of the various age classes. In practice, in this case, this age distance can only be 5, 10, 15,…, 45 years.
For an overview of the methodology up to the nineties, see Wilmoth (1990, 2001). Methodological advances have been proposed in recent years: e.g., the Intrinsic Estimator (Yang et al. 2008; Masters et al. 2014), which has, however, raised controversy (te Grotenhuis et al. 2016; Masters et al. 2016), or the Hierarchical Age-Period-Cohort (HAPC) model (Reither et al. 2009), which has also been criticized (Bell and Jones 2014, 2015) and defended (Reither et al. 2015).
In other words, ours is not an APC model, but an A’P’C’ model, where age, period and cohort are treated as distances (or absolute differences).
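
To see why treating age, period and cohort as absolute differences changes the picture, the sketch below builds the three distances for every pair of strata. It is an illustration only, assuming that each stratum is identified by a central age and a survey year; the function name and the toy strata are hypothetical and not taken from the article.

```python
from itertools import combinations

def apc_distances(strata):
    """strata: list of (age, period) pairs; cohort is derived as period - age.
    Returns one row (|dA|, |dP|, |dC|) per unordered pair of strata."""
    rows = []
    for (a1, p1), (a2, p2) in combinations(strata, 2):
        c1, c2 = p1 - a1, p2 - a2
        rows.append((abs(a1 - a2), abs(p1 - p2), abs(c1 - c2)))
    return rows

# Hypothetical strata: central age of the class and survey year.
toy_strata = [(20, 1993), (30, 1993), (20, 2003), (30, 2003)]
rows = apc_distances(toy_strata)

# C = P - A holds exactly for each stratum, but after taking absolute
# differences |dC| = |dP - dA| (with signed dP, dA), which is not a
# linear function of |dA| and |dP|.
linear_identity_holds = all(dc == dp - da for da, dp, dc in rows)
```

In this toy case two pairs of strata share the same age and period distances (10 and 10 years) but have cohort distances of 0 and 20 years, so the cohort distance is not determined by the other two.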
As we did in Sect. 3.2 for instance (see also Sect. 6).
More precisely, of what remains of these cultural attitudes after transformation into Euclidean distances, along the lines of the DBS-EM method.
With ordinary least squares (OLS) linear regression, one can model the relationship between a set of predictors and the mean of the response variable. By contrast, quantile regression models the relation between a set of predictors and specific quantiles of the response variable (e.g., the median). The quantile regression parameters thus estimate the change in a specified quantile of the response variable (e.g., the median) produced by a unit change in the predictor. If some quantiles are affected by the predictors (age, period, and cohort) significantly more than others, the shape of the distribution of the dependent variable changes, depending on the values of the predictors.
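
The contrast between the mean and a quantile can be illustrated in the simplest possible setting, with no predictors at all: the constant that minimizes the squared loss is the mean, while the constant that minimizes the check (pinball) loss at τ = 0.5 is the median. The sketch below verifies this on made-up data with a plain grid search; it is not a quantile regression routine, and all names and values are assumptions made for the example.

```python
from statistics import mean, median

def pinball_loss(values, candidate, tau):
    """Average check (pinball) loss of a constant prediction at quantile tau."""
    total = 0.0
    for y in values:
        residual = y - candidate
        # Positive residuals are weighted by tau, negative ones by (1 - tau).
        total += tau * residual if residual >= 0 else (tau - 1) * residual
    return total / len(values)

def squared_loss(values, candidate):
    """Average squared error of a constant prediction."""
    return sum((y - candidate) ** 2 for y in values) / len(values)

# Made-up, right-skewed sample, so that mean and median differ.
toy_sample = [1, 2, 2, 3, 4, 5, 18]

# Search a grid of candidate constants (0.0, 0.1, ..., 20.0) for each minimizer.
grid = [i / 10 for i in range(0, 201)]
best_for_median = min(grid, key=lambda c: pinball_loss(toy_sample, c, 0.5))
best_for_mean = min(grid, key=lambda c: squared_loss(toy_sample, c))
```

A full quantile regression replaces the constant with a linear function of the predictors and minimizes the same loss for each chosen τ; Koenker (2005) covers the estimation details.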
Of course, regression is possible on as many quantiles as one likes. In our case we verified that the results were not qualitatively different in the various cases, and decided to present only the simplest case, with three quantiles.
When the Wald test (with a bootstrap approach, 1000 replications) does not detect any significant discrepancy between quantiles, the hypothesis of a constant relationship between the dependent variable (ED, or Euclidean distance, in this case) and the regressors cannot be rejected.

References

Bell, A., Jones, K.: Another ‘futile quest’? A simulation study of Yang and Land’s Hierarchical Age-Period-Cohort model. Demogr. Res. 30, 333–360 (2014)
Bell, A., Jones, K.: Should age-period-cohort analysts accept innovation without scrutiny? A response to Reither, Masters, Yang, Powers, Zheng and Land. Soc. Sci. Med. 128, 331–333 (2015)
Cha, S.H.: Comprehensive survey on distance/similarity measures between probability density functions. Int. J. Math. Model. Method. Appl. Sci. 1(4), 300–307 (2007)
Dempster, A.P., Laird, N.M., Rubin, D.B.: Maximum likelihood from incomplete data via the EM algorithm. J. R. Stat. Soc. 39, 1–38 (1977)
De Santis, G., Maltagliati, M., Salvini, S.: A measure of the cultural distance between countries. Soc. Indic. Res., online first (2015). doi:10.1007/s11205-015-0932-7
Deza, M.M., Deza, E.: Encyclopedia of Distances, p. 94. Springer, New York (2009). http://www.uco.es/users/ma1fegan/Comunes/asignaturas/vision/Encyclopedia-of-distances-2009.pdf
Easterlin, R.: What will 1984 be like? Socioeconomic implications of the recent twists in age structure. Demography 15(4), 397–432 (1978)
Inglehart, R.: Changing values among western publics from 1970 to 2006. West Eur. Polit. 31(1–2), 130–146 (2008)
Koenker, R.: Quantile Regression. Cambridge University Press, Cambridge (2005)
Masters, R.K., Hummer, R.A., Powers, D.A., Beck, A., Lin, S.F., Finch, B.K.: Long-term trends in adult mortality for U.S. blacks and whites: an examination of period- and cohort-based changes. Demography 51, 2047–2073 (2014)
Masters, R.K., Powers, D.A., Hummer, R.A., Beck, A., Lin, S.F., Finch, B.K.: Fitting age-period-cohort models using the Intrinsic Estimator: assumptions and misapplications. Demography 53(4), 1253–1259 (2016)
Mucciardi, M., De Santis, G.: Cultural versus objective distances: the DBS-EM approach. Soc. Indic. Res., online first (2015). doi:10.1007/s11205-015-1213-1
Reither, E.N., Hauser, R.M., Yang, Y.: Do birth cohorts matter? Age-period-cohort analyses of the obesity epidemic in the United States. Soc. Sci. Med. 69, 1439–1448 (2009)
Reither, E.N., Masters, R.K., Yang, Y.C., Powers, D.A., Zheng, H., Land, K.C.: Should age-period-cohort studies return to the methodologies of the 1970s? Soc. Sci. Med. 128, 356–365 (2015)
Ryder, N.: The cohort as a concept in the study of social change. Am. Soc. Rev. 30(6), 843–861 (1965)
te Grotenhuis, M., Pelzer, B., Luo, L., Schmidt-Catran, A.W.: The Intrinsic Estimator, alternative estimates, and predictions of mortality trends: a comment. Demography 53, 1245–1252 (2016)
Wilmoth, J.: Variation in vital rates by age, period, and cohort. Sociol. Methodol. 20, 295–335 (1990)
Wilmoth, J.: Les modèles âge-période-cohorte en démographie. In: Démographie: analyse et synthèse. Vol. I: La dynamique des populations, pp. 379–397 (2001)
Witten, I.H., Frank, E.: Data mining. Carl Hanser, München and Wien (2011)
Yang, Y., Schulhofer-Wohl, S., Fu, W.J., Land, K.C.: The Intrinsic Estimator for age-period-cohort analysis: what it is and how to use it. Am. J. Sociol. 113(6), 1697–1736 (2008)

Author information

Authors and Affiliations

DiSIA - Department of Statistics, Informatics, Application “G. Parenti”, University of Florence, Florence, Italy
Gustavo De Santis
Department of Economics, University of Messina, Messina, Italy
Massimo Mucciardi

Corresponding author

Correspondence to Massimo Mucciardi.

Additional information

We thank Ettore Recchi for his comments on an initial draft of this paper, and an anonymous referee. All remaining errors are our own.

About this article

Cite this article

De Santis, G., Mucciardi, M. From Euclidean distances to APC models. Qual Quant 51, 829–846 (2017). https://doi.org/10.1007/s11135-016-0442-y

Issue Date: March 2017
DOI: https://doi.org/10.1007/s11135-016-0442-y
