1 Introduction

Socioeconomic inequality in China has been rapidly rising, along with the country’s fast economic growth in recent decades. The Gini coefficient for income inequality, for example, has increased from about 0.30 in 1980 to the 0.53–0.55 range around 2010, well above the current levels in the U.S., around 0.45 (Xie and Zhou 2014), or in OECD countries, which range from 0.25 to 0.50 (OECD 2012). It is well known that health status varies systematically by socioeconomic status (SES), although empirical evidence has been gathered mainly from the developed world (Link and Phelan 1995; Phelan et al. 2010). The escalation in socioeconomic inequality may, in turn, drive up health disparities by SES in China. Unfortunately, empirical evidence concerning health inequality by SES in China remains limited, partly due to a lack of high-quality, population-representative data.

Even less is known about the relative importance of each SES dimension in shaping health status in reform-era China, a country in the midst of rapid transitions in multiple domains: socioeconomic, political, demographic, and epidemiologic. Findings from Western societies may not be readily generalizable to the Chinese population. For example, education, perhaps the most salient SES determinant of health in Western studies, has been reported to be unrelated to self-rated health among Chinese adults 18 years or older (Yang and Kanavos 2012) and to all-cause mortality among the elderly (Luo and Xie 2014). Another study reported a significant educational gap in self-rated health among Chinese adults 21 years or older, but this gap narrowed substantially across successive birth cohorts, a pattern opposite to that found in the U.S. (Chen et al. 2010). Higher income has been associated with unhealthy diet and increased risks of overweight and obesity in young Chinese adults (Du et al. 2004). What is particularly striking is that literate, higher-income Chinese elderly are more likely to die from diabetes than their illiterate, lower-income peers (Luo and Xie 2014). In addition, political resources are known to be important in contemporary China (Bian 2002) and thus should not be ignored in research on SES and health.

In this research, our purpose is to reexamine the relationship between SES and health for all Chinese adults aged 18 and above in recent years. Given the wide age range, selection of a suitable health indicator merits careful consideration. For obvious reasons, the use of mortality, a common health indicator for the elderly, is inappropriate for middle-aged or younger adults. Biomarker data are costly to collect and thus usually not available in general household surveys. Among low-cost self-reported indicators, self-rated health status is one of the most commonly used choices because of its robust predictive power for mortality and morbidity (Singh-Manoux et al. 2006; Goldberg et al. 2001; Benjamins et al. 2004; Idler and Benyamini 1997). However, people of varying levels of SES may adopt systematically different frames of reference in rating their health, thus introducing a reporting heterogeneity bias (King et al. 2004) into the observed relationship between SES and self-reported health. This may explain why, contrary to research on Western societies, several studies in various developing countries, including China, Thailand, and the Philippines, have found either no significant positive association between SES and self-rated health or even a negative relationship (Luo and Wen 2002; Pei and Rodriguez 2006; Whyte and Sun 2010; Zimmer and Amornsirisomboon 2001; Zimmer et al. 2000).

In this study, we fill these gaps in the literature by assessing the up-to-date SES disparities in self-rated health from the 2010–2012 waves of the China Family Panel Studies (CFPS). We capitalize on the richness of the CFPS data and examine the influences on health of a wide range of SES indicators, including cognitive ability and political capital, that were rarely available in previous research. This approach allows us to better adjudicate among the relative roles of different SES domains in affecting health. As for the health indicator, we take advantage of the newly available anchoring vignette data in the CFPS to correct for reporting heterogeneity in self-rated health and derive a more accurate statistical relationship between SES and health in China. Capitalizing on the panel design of the CFPS, we employ a lagged independent variable strategy by regressing self-rated health measured in 2012 on SES indicators measured in 2010.

2 Theoretical Background

2.1 Relationships Between SES and Health

To be comprehensive, we measure SES in three domains: human capital, material conditions, and political capital. Education has been used as the archetypical marker of human capital and shown to be a robust predictor of health. However, the specific pathways linking education to health remain unclear. Common candidates include material pathways in improving income and job opportunities (Link and Phelan 1995), behavioral pathways in promoting healthy behaviors and lifestyles (Lynch 2003; Mirowsky and Ross 2003), psychosocial pathways in cultivating greater senses of self-efficacy and self-control that facilitate coping with negative life events and chronic stressors (Lantz et al. 2005; Schnittker and McLeod 2005; Williams 1990), and cognitive pathways in accumulating health knowledge, improving problem-solving skills, and maneuvering mental resources to initiate behavioral changes and manage disease risks (Link and Phelan 1995; Grossman 1972; Gottfredson and Deary 2004).

Cognition is usually correlated with education and captures a distinct aspect of human capital in itself. Cognitive ability involves the mental capacity to reason, plan, think abstractly, comprehend complex ideas, learn quickly, and learn from experiences plus the aptitude to apply skills to solving problems (Gottfredson 1997). Although it is left unmeasured in many household surveys, it can be argued that cognitive ability may be an underlying common cause for both SES and health (Singh-Manoux et al. 2005; Daly 2011; Deary et al. 2005). Ignoring cognitive ability thus may dramatically overestimate the causal effect of education on health in observational data by 25–100 % (Auld and Sidhu 2005; Baker et al. 1997). Health production may require more cognitive effort now than it used to, given on-going rapid improvement in knowledge about medicine and health, explosive growth in new treatments and technologies, escalating complexity of medical regimens, and the critical importance of daily preventative self-care as opposed to curative care (Gottfredson 2004; Cutler et al. 2011; Schnittker 2005a). Cognitive ability is also crucial to appropriately performing many daily tasks that involve complex information acquisition and processing in a modern, literate society (e.g., dealing with banks or social service agencies, seeking job or financial opportunities, and exercising civil rights and duties). Small differences in performing these literacy tasks on a daily basis can cumulate and snowball into larger differences in socioeconomic and physical well-being over the long run (Gottfredson 2004; Ross and Wu 1996).

Material conditions also affect health, as having more economic resources allows an individual to purchase better nutrition, housing, transportation, and health care, all of which contribute positively to health (Gottfredson 2004). While income measures the flow of economic resources at any given point in time, wealth reflects a stock of financial assets accumulated over time (Keister 2000). Wealth can buffer the negative shock of low income on health during times of unemployment or illness (Pollack et al. 2007). Furthermore, net of income, wealth is a good overall indicator predictive of political power, social prestige, and educational and occupational opportunities (Hajat et al. 2010; Keister 2000), all of which positively correlate with health. However, wealth is used less often as an economic indicator in health research because it is more difficult to measure than income. In a handful of Western studies that measure both income and wealth, wealth proves to be a stronger predictor for health status than income (Adams et al. 2003; Boyle et al. 2006). In studies of developing countries, wealth is often approximated by a composite index of family possession of durable household goods (e.g., TV, car) and housing characteristics (e.g., water source, sanitation facility). Such a crude measure may result in underestimation of the true wealth gap in health. Capitalizing on the comprehensive measurements of net assets (including housing, financial, agricultural, and business assets) in the CFPS, we are among the first researchers to examine the association of total family wealth with health in a developing country setting.

Political capital is a unique SES indicator in China and other former or contemporary socialist countries. Since the market transition, human capital has gained importance in determining life chances (Hauser and Xie 2005; Song and Xie 2014; Nee 1989; Xie and Wu 2005). Nevertheless, key social institutions, under the dominant rule of the Communist Party of China (CPC), have remained unchanged. For example, during the latest health reform, state bureaucrats took advantage of their institutional power to gain better access to health care, especially when quality service was not distributed on an open market (Hsiao 2007; Akin et al. 2004). Party membership also remains a type of “structural social capital” that can either generate social support in buffering one’s own adverse life experiences (Chen 2011) or mobilize collective actions in promoting community-level health outcomes (Yip et al. 2007). Furthermore, since the market reform itself serves the purpose of preserving its legitimacy, the CPC has maintained its control over the national economy by occupying privileged positions in both planned and market sectors (Whyte 2009). Party members in these positions are entitled to a legitimate rising wage premium or an opportunity for extracting rents (Appleton et al. 2009), which in turn improves health. The rising wage premium, together with job security and privileged access to social services, may help empower party members (Carlsson et al. 2009) and boost their life satisfaction (Appleton and Song 2008) as labor market competition intensifies in the private sector and thus benefit their health conditions through a psychosocial pathway.

As detailed below, the rich CFPS data provide appropriate measures for different dimensions of SES, each of which may involve multiple interrelated but distinct components. By modeling these diverse SES components simultaneously, we are in a better position to avoid potential confounding biases.

2.2 Reporting Heterogeneity

In a conventional analysis of self-rated health, individuals from different social backgrounds are assumed, albeit implicitly, to use the same standard in their subjective health ratings. Technically speaking, this assumption is tantamount, for an ordered probit or logit model, to fixing the cut-points that divide respondents’ continuous latent health into discrete response categories (i.e., poor, fair, good, very good, or excellent health) at the same values. In reality, however, individuals from different social groups are likely to have had different health experiences and may thus vary in their subjective ratings of the same objective health status (Krause and Jay 1994). In particular, high-SES respondents tend to compare themselves to their better-off peers and hence adopt a higher standard for what is considered “excellent” health, whereas those of low SES may apply a lower standard, resulting in an inflated level of self-rated health relative to their objective health conditions (Dowd and Todd 2011; Schnittker 2005b). Thus, reporting heterogeneity, also known as differential item functioning (King et al. 2004), may lead to an underestimation of the true SES health gradient.
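The attenuation argument above can be illustrated numerically. The following is a minimal sketch, not the authors' estimation code: the cut-point values and the latent health level are made-up assumptions, and the function name `category_probs` is hypothetical. It shows that two respondents with identical latent health report "excellent" with different probabilities when only the top cut-point differs.

```python
from statistics import NormalDist

def category_probs(latent_mean, cutpoints, sd=1.0):
    """Return P(y = k), k = 1..K, for an ordered probit with K-1 interior cut-points."""
    dist = NormalDist(mu=latent_mean, sigma=sd)
    # Cumulative probabilities at -inf, each interior cut-point, and +inf.
    cdf = [0.0] + [dist.cdf(c) for c in cutpoints] + [1.0]
    return [cdf[k + 1] - cdf[k] for k in range(len(cutpoints) + 1)]

# Assumed values for illustration only (not estimates from the CFPS).
same_latent_health = 0.5
low_ses_cuts = [-1.5, -0.5, 0.5, 1.2]   # lower standard for "excellent"
high_ses_cuts = [-1.5, -0.5, 0.5, 1.8]  # higher standard for "excellent"

p_low = category_probs(same_latent_health, low_ses_cuts)
p_high = category_probs(same_latent_health, high_ses_cuts)

# Categories are ordered poor, fair, good, very good, excellent (index 4).
print(f"P(excellent | low-SES cut-points)  = {p_low[4]:.3f}")
print(f"P(excellent | high-SES cut-points) = {p_high[4]:.3f}")
```

Under these assumed values, the respondent with the higher standard is less likely to report excellent health despite having the same latent health, which is the mechanism by which a model with fixed cut-points would understate the SES gradient.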

In China, several studies have suggested that reporting heterogeneity by SES is a predominant rather than exceptional phenomenon in self-rated health among adults. For example, in a regional sample of three Chinese provinces surveyed during 2000–2001, Bago d’Uva et al. (2008a) found evidence of reporting heterogeneity in self-rated cognition, pain, self-care, and daily activity by education, income, and rural–urban residence. In a nationally representative sample interviewed in 2012, Xu and Xie (2015) found systematic variation by socioeconomic and demographic characteristics in thresholds used by respondents in rating their overall health status.

One method of adjusting for reporting heterogeneity is to use anchoring vignettes, brief descriptions of hypothetical people or situations that survey respondents are asked to evaluate on the same scale as they use to assess their own situations. This methodology has been applied to correct reporting bias in health inequalities by SES in predominantly American and European elderly populations (Grol-Prokopczyk et al. 2011; Bago d’Uva et al. 2008b; Dowd and Todd 2011). There is some concern that anchoring vignettes could be cognitively taxing to Chinese respondents, whose average educational attainment is lower compared with that of their Western counterparts (Bago d’Uva et al. 2008b). Nevertheless, Xu and Xie (2015) have demonstrated that anchoring vignettes can be an effective survey instrument in obtaining bias-adjusted estimates of health disparities not only for the original survey sample, but also for an independent sample from another, contemporaneous survey.

3 Data and Methods

3.1 Data Source and Measures

This study draws upon data from the CFPS, a nationally representative longitudinal survey of Chinese communities, families, and individuals. The CFPS focuses on the well-being of the Chinese population, with a wealth of information on economic activities, education outcomes, family dynamics and relationships, and health. The 2010 nation-wide CFPS baseline survey successfully interviewed 14,960 households from 635 communities, including 33,600 adults and 8990 children, located in 25 designated provinces. The approximate response rate was 81 %, the majority of the non-response being due to non-contact (Xie 2012). The CFPS’s stratified multi-stage sampling strategy ensures that the sample represents 95 % of the total population in China in 2010. The first full-scale follow-up survey was conducted in 2012 with 82 % of the baseline respondents re-interviewed.

We choose the CFPS as the data source in this study for its broader research agenda, wider population coverage, and national representativeness. By comparison, the China Health and Nutrition Survey (CHNS), for example, only covers nine provinces and does not provide sampling weights due to certain design problems that were present from the very beginning of the study (Popkin et al. 2010). The CFPS, however, has developed a novel rural–urban, integrated, multistage, probability-proportional-to-size sampling scheme with implicit stratification to ensure the validity and representativeness of its sample (Xie and Hu 2014). Another possibility is the China Health and Retirement Longitudinal Study (CHARLS), which tracks a national sample but focuses on individuals aged 45 years or older and their spouses. The CFPS, however, collects data on not just one but all members of a sampled family, through proxy reports when personal interviews are not feasible, whether they are at home or have left for various reasons (Xie and Hu 2014). This design allows the CFPS to capture greater variations in both health status and SES across a wider age range and to better measure family-level SES indicators (e.g., total family income, net wealth, and access to political capital by any family member).

The dependent variable in this study is self-rated health in the 2012 follow-up survey, in which anchoring vignettes were implemented to adjust for reporting heterogeneity. Self-rated health information was collected by asking respondents to rate their overall health status at the time of interview by selecting one of five categories: poor, fair, good, very good, or excellent. Every respondent who rated his/her own health was then administered two anchoring vignettes in random order, on the same response scale, about the health status of a hypothetical person with a typical Chinese male or female name corresponding to the respondent’s sex. The health vignettes were designed to represent two substantially different health conditions, thereby providing greater power to differentiate the varying cut-points applied by respondents to assessing their own health status. The English translations of the two vignettes can be found in Appendix 1. Evidently, the first vignette depicts a much healthier person compared to the second vignette.

For three reasons, we use lagged SES and control variables measured in 2010 and estimate their coefficients in regressions with self-rated health in 2012 as the dependent variable. First, although self-rated health was also measured in 2010, it used a different, symmetrical (very poor, poor, fair, good, or very good) scale and was thus incomparable to the response categories adopted in 2012. Second, anchoring vignettes were not introduced in the 2010 baseline survey to adjust for potential reporting heterogeneity. Third, lagging SES measures helps to mitigate potential reverse causal effects of health on SES when interpreting our regression estimates.

Education is measured by the highest degree attained and divided into four categories: no schooling, primary school, junior high school, and senior high school and above. Cognitive functioning is captured by scores from an adaptive 34-item verbal test designed specifically for the CFPS in 2010. The test score ranges from 0 (no correct answer) to 34 (perfect score). Economic resources are measured by annual family income per capita and total family wealth. Total family income is summarized across self-reports from multiple sources, including salary and wage, business income, property income, and transfer income (e.g., financial support from relatives and friends and government subsidy). We chose not to use individual income because many Chinese households, especially in rural areas, act as single economic entities. Total family wealth is measured in terms of net worth, which is the sum of land, housing, financial assets (including savings, stock, funds, bonds, financial derivatives, etc.), fixed assets for production (including agricultural machinery and business assets), and durable goods (valuables included), minus housing and non-housing liabilities (Xie and Jin 2015). Political capital is also measured at the family level by a dichotomous variable indicating whether any family member is a cadre, CPC member, or United Front Democratic Party member. We chose this family level measure for two reasons. First, the proportion of individual-level cadres or party members is very small (only 10 % in our analytical sample). Second, prior research suggests that the other family members’ cadre or party status matters more than one’s own political position for health (Xu and Xie 2015).
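The two economic measures described above reduce to simple arithmetic. The sketch below is illustrative only: the dictionary keys, the yuan amounts, and the helper names `net_worth` and `income_per_capita` are assumptions introduced here, and the exact definition of family size used as the divisor is not stated in this section.

```python
def net_worth(assets, liabilities):
    """Total family wealth: sum of asset components minus sum of liabilities."""
    return sum(assets.values()) - sum(liabilities.values())

def income_per_capita(total_family_income, family_size):
    """Annual family income per capita (family size definition is an assumption)."""
    return total_family_income / family_size

# Hypothetical family, amounts in yuan (not CFPS data).
family_assets = {
    "land": 40_000,
    "housing": 200_000,
    "financial": 30_000,          # savings, stock, funds, bonds, etc.
    "production_fixed": 10_000,   # agricultural machinery, business assets
    "durables": 20_000,           # durable goods, valuables included
}
family_liabilities = {"housing": 50_000, "non_housing": 5_000}

wealth = net_worth(family_assets, family_liabilities)
per_capita = income_per_capita(30_600, 3)
print(wealth, per_capita)
```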

We control for socio-demographic and health behavior variables, including age, gender, marital status, rural–urban residence, smoking, drinking, and physical exercise in regression models. Age is divided into five categories to capture potential nonlinearity in age trajectory of health. We also control for county-level gross domestic product (GDP) per capita and dummy variables for geographic regions, which helps to adjust for oversampling in the five provinces (or their administrative equivalents) of Shanghai, Liaoning, Henan, Gansu and Guangdong (Xie and Hu 2014). We use dummy variables in regression analysis for all variables that are categorical.

Among a total number of 30,168 adult respondents aged 18–70 in the 2010 baseline CFPS survey, 79.8 % were successfully followed up in 2012 and 77.2 % were administered personal interviews in which self-rated health was supposed to be assessed. We chose 70 as the upper age limit for two reasons. First, few respondents (<6 % of the sample after excluding missing data) were above 70 years old as the national average life expectancy was 74.8 years (72.4 years for males and 77.4 years for females) in 2010 (NBS 2012). Second, those who were above 70 years old had drastically different life experiences and were less comparable to their younger counterparts with respect to SES and health. Studies focusing on the elderly respondents above age 70 can and should use a variety of other health outcome variables available in the data, such as functional status, illness, and even mortality.

We further excluded 545 respondents who had missing data on self-rated health or at least one of the two vignettes, and about 15 % of the remaining sample who gave ratings inconsistent with the designed rank ordering of the two vignettes, and thereby were in violation of the vignette equivalence assumption underlying the vignette methodology (King et al. 2004). As a group, this 15 % of respondents had significantly lower SES (e.g., lower educational attainment, worse cognition, and lower income) and reported poorer health compared to those whose ratings of the vignettes were consistent with the survey design (results not shown). After we excluded these respondents, the sample size was 19,274, and it was further reduced to 17,877 after cases with missing data on covariates were further dropped.
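The exclusion rule described above can be sketched as a simple filter. This is a hedged illustration rather than the authors' procedure: ratings are assumed to be coded 1 (poor) to 5 (excellent), missing values are assumed to be `None`, and ties between the two vignettes are treated as consistent, which is an assumption because the text does not say how ties were handled. The function name `keep_for_hopit` and the toy records are hypothetical.

```python
def keep_for_hopit(self_rating, vignette1, vignette2):
    """Keep a respondent only if all three ratings are present and the
    healthier vignette (vignette 1) is not rated below vignette 2."""
    if any(r is None for r in (self_rating, vignette1, vignette2)):
        return False  # missing self-rated health or vignette rating
    # Assumption: ties do not violate the designed rank ordering.
    return vignette1 >= vignette2

# Toy records, coded 1 = poor ... 5 = excellent (not CFPS data).
respondents = [
    {"id": "a", "self": 3, "v1": 4, "v2": 2},     # consistent
    {"id": "b", "self": 2, "v1": 2, "v2": 4},     # reversed ordering
    {"id": "c", "self": None, "v1": 5, "v2": 1},  # missing self-rating
    {"id": "d", "self": 5, "v1": 3, "v2": 3},     # tie, kept under the assumption
]

kept = [r["id"] for r in respondents
        if keep_for_hopit(r["self"], r["v1"], r["v2"])]
print(kept)
```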

3.2 Methods

In this study, we capitalize on the CFPS vignette data to purge reporting heterogeneity and obtain bias-corrected estimates of health disparities by SES for the Chinese adult population. Since a vignette is a description of a hypothetical person’s health status presented to all respondents in the same way, we should expect no systematic variation (apart from random error) in the ratings of the vignette by different respondents, except that they may apply different cut-points. This is true if they perceive the vignette in the same way and on the same unidimensional scale, which is known as the vignette equivalence assumption (King et al. 2004). In other words, we assume that all respondents assess the health status of each vignette only on the basis of its true latent health level. Formally, let \(y_{i,j}^{v*}\) denote the continuous latent true health of vignette j as perceived by respondent i, and it can be modelled as a linear combination of an intercept \(\alpha_{j}\) and random measurement error \(\varepsilon_{i,j}^{v}\):

$$y_{i,j}^{v*} = \alpha_{j} + \varepsilon_{i,j}^{v}, \quad \varepsilon_{i,j}^{v} \sim N\left(0, 1\right)$$
(1)

with the normalization \(\alpha_{1} = 0\) for identification.

We assume response consistency between vignettes, meaning that respondents use the same cut-points to evaluate all vignettes (King et al. 2004). Respondents then translate the continuous latent health of vignette j into one of K ordered response categories, in this case poor (=1), fair (=2), good (=3), very good (=4), and excellent (=5), through a threshold model (Powers and Xie 2008):

$$y_{i,j}^{v} = k, \quad \text{if } \tau_{i}^{v,k-1} \le y_{i,j}^{v*} < \tau_{i}^{v,k}, \quad k = 1, \ldots, 5$$
(2)

where \(\tau_{i}^{v,k}\) denotes the cut-point for respondent i to rate the latent true health status of the vignettes as in one of the K categories; and \(\tau_{i}^{v,0} < \tau_{i}^{v,1} < \tau_{i}^{v,2} < \cdots < \tau_{i}^{v,5}\), \(\tau_{i}^{v,0} = - \infty\), and \(\tau_{i}^{v,5} = \infty\). Unlike a conventional ordered probit model that assumes no reporting heterogeneity and hence homogeneous cut-points, we allow the cut-points to vary as a linear function of covariates \(X_{i}\), plus individual heterogeneity \(u_{i}^{v,k}\):

$$\tau_{i}^{v,k} = \gamma_{0}^{v,k} + X_{i} \gamma^{v,k} + u_{i}^{v,k} , \quad k = 1, \ldots , 4$$
(3)

where \(\gamma_{0}^{v,k}\) are the intercepts in the respective cut-points for the vignettes. Equations (1)–(3) define the first component of a hierarchical ordered probit (HOPIT) model (King et al. 2004) that will be used to assess patterns of reporting heterogeneity, and obtain bias-adjusted estimates of health disparities.

The second component of a HOPIT model takes a similar form. Let \(y_{i}^{s*}\) denote the continuous latent true health for respondent i. We model it as a linear combination of the objective SES and other covariates, denoted together by vector \(X_{i}\), plus an independent residual \(\varepsilon_{i}\):

$$y_{i}^{s*} = \beta_{0} + X_{i} \beta + \varepsilon_{i}, \quad \varepsilon_{i} \sim N\left(0, \sigma^{2}\right)$$
(4)

where \(\beta_{0}\) is the intercept. The measurement model divides \(y_{i}^{s*}\) into K ordinal response categories of self-rated health \(y_{i}^{s}\) through a similar threshold model as Eq. (2):

$$y_{i}^{s} = k, \quad \text{if } \tau_{i}^{s,k-1} \le y_{i}^{s*} < \tau_{i}^{s,k}, \quad k = 1, \ldots, 5$$
(5)

where \(\tau_{i}^{s,k}\) denotes the cut-point for respondent i to report his/her health status as in one of the K categories; and \(\tau_{i}^{s,0} < \tau_{i}^{s,1} < \tau_{i}^{s,2} < \cdots < \tau_{i}^{s,5}\), \(\tau_{i}^{s,0} = - \infty\), and \(\tau_{i}^{s,5} = \infty\). Again, we allow the cut-points for self-rated health to vary as a linear function of observed covariates \(Z_{i}\), plus individual heterogeneity \(u_{i}^{s,k}\):

$$\tau_{i}^{s,k} = \gamma_{0}^{s,k} + Z_{i} \gamma^{s,k} + u_{i}^{s,k} , \quad k = 1, \ldots , 4$$
(6)

where \(\gamma_{0}^{s,k}\) are the intercepts in the respective cut-points, and \(Z_{i}\) can include the same covariates as \(X_{i}\). However, without the auxiliary information provided by the vignettes, the above model is under-identified since we cannot simultaneously estimate \(\beta\) (the effects of covariates on self-rated health), \(\gamma^{s}\) (the effects of covariates on cut-points), and \(\sigma^{2}\). Identification is achieved by assuming response consistency between vignette ratings and self-rated health, that is, respondents rate their own health in the same way as they assess each of the vignettes: \(\tau_{i}^{s,k} = \tau_{i}^{v,k}\).
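To make the role of the shared cut-points concrete, the sketch below evaluates one respondent's log-likelihood contribution under Eqs. (1)–(6) with \(\tau_{i}^{s,k} = \tau_{i}^{v,k}\). It is a simplified illustration under stated assumptions, not the estimation routine used in the paper: the individual heterogeneity terms \(u_{i}^{v,k}\) and \(u_{i}^{s,k}\) are omitted, \(Z_{i}\) is set equal to \(X_{i}\), the cut-points are assumed to come out correctly ordered, and all parameter values and function names are hypothetical.

```python
from math import inf, log
from statistics import NormalDist

STD_NORMAL = NormalDist()

def dot(x, coefs):
    return sum(xj * cj for xj, cj in zip(x, coefs))

def cut_points(x, gamma0, gamma):
    """Four interior cut-points: tau^k = gamma0[k] + x . gamma[k]."""
    return [g0 + dot(x, g) for g0, g in zip(gamma0, gamma)]

def category_prob(k, mean, sd, taus):
    """P(rating = k), k = 1..5, for a normal latent variable and ordered cut-points."""
    bounds = [-inf] + list(taus) + [inf]

    def cdf(b):
        if b == inf:
            return 1.0
        if b == -inf:
            return 0.0
        return STD_NORMAL.cdf((b - mean) / sd)

    return cdf(bounds[k]) - cdf(bounds[k - 1])

def loglik_respondent(y_self, y_vignettes, x, beta0, beta, sigma, alpha, gamma0, gamma):
    """Self-report and vignette ratings share the same respondent-specific cut-points."""
    taus = cut_points(x, gamma0, gamma)
    ll = log(category_prob(y_self, beta0 + dot(x, beta), sigma, taus))
    for alpha_j, y_v in zip(alpha, y_vignettes):
        ll += log(category_prob(y_v, alpha_j, 1.0, taus))  # vignette error variance is 1
    return ll

# Hypothetical parameter values for a single covariate (illustration only).
x = [1.0]
gamma0 = [-1.5, -0.5, 0.5, 1.5]
gamma = [[-0.1], [-0.05], [0.1], [0.2]]
alpha = [0.0, -2.0]  # alpha_1 = 0 by normalization; vignette 2 is less healthy

ll = loglik_respondent(3, [4, 1], x, beta0=0.2, beta=[0.3], sigma=1.2,
                       alpha=alpha, gamma0=gamma0, gamma=gamma)
print(round(ll, 4))
```

Because the vignette ratings depend only on \(\alpha_{j}\) and the cut-points, they pin down the cut-point parameters, which is what allows \(\beta\) and \(\sigma\) to be separated from the cut-point shifts in the self-report component.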

Throughout the analyses, we apply inverse probability weighting (IPW) to reduce the potential bias induced by panel attrition. We followed the same procedure as that adopted by the CFPS (Lu 2014) to implement IPW and chose the covariates tailored to our specific research aims. Specifically, we first estimated the baseline respondents’ probabilities of successful follow-up by regressing the binary follow-up outcome in 2012 on their observed characteristics in 2010. The regression estimates can be found in Appendix 2. In brief, the probability of successful follow-up was significantly higher for respondents who were older, female, married, less educated but cognitively superior in math, and living in a rural area than for their counterparts. We then calculated individual-specific weights by taking the inverse of their predicted probabilities of being followed up. By upweighting the respondents who are less likely to be tracked, IPW corrects for the bias due to differential attrition under the missing-at-random assumption (Little and Rubin 2014), i.e., no unobserved systematic difference between respondents who were followed up and those who were not, given observed characteristics. We also calculated robust standard errors to adjust for correlations among multiple members clustered within the same families.
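The two IPW steps described above (predict follow-up, then invert the predicted probability) can be sketched as follows. This is a toy illustration, not the CFPS weighting procedure: it assumes a single binary covariate, fits the logistic regression by plain gradient ascent for the sake of a dependency-free example, and uses made-up follow-up counts. The names `fit_logit`, `xs`, and `ys` are hypothetical.

```python
from math import exp

def sigmoid(z):
    return 1.0 / (1.0 + exp(-z))

def fit_logit(xs, ys, lr=0.5, steps=5000):
    """Fit P(y = 1 | x) = sigmoid(b0 + b1 * x) by gradient ascent
    on the mean log-likelihood."""
    b0 = b1 = 0.0
    n = len(xs)
    for _ in range(steps):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            resid = y - sigmoid(b0 + b1 * x)
            g0 += resid
            g1 += resid * x
        b0 += lr * g0 / n
        b1 += lr * g1 / n
    return b0, b1

# Toy baseline sample: x = 1 for rural, 0 for urban; y = 1 if followed up.
xs = [1] * 10 + [0] * 10
ys = [1] * 9 + [0] * 1 + [1] * 6 + [0] * 4  # 9/10 rural, 6/10 urban followed

b0, b1 = fit_logit(xs, ys)
p_followed = [sigmoid(b0 + b1 * x) for x in xs]

# Weights are defined only for respondents who were actually followed up.
weights = [1.0 / p for p, y in zip(p_followed, ys) if y == 1]
print(f"rural weight ~ {weights[0]:.3f}, urban weight ~ {weights[-1]:.3f}")
```

In this toy sample, followed-up urban respondents receive larger weights than rural ones because their group was less likely to be retained, and the weights sum approximately to the baseline sample size.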

4 Results

4.1 Descriptive Statistics

Table 1 presents frequency distributions of self-rated health and vignette ratings. We focus only on the weighted statistics, since the unweighted ones exhibit nearly identical patterns. The modal response to self-assessment was good health, accounting for about one third of the sample, while only 8.7 % considered themselves to be in excellent health. The rest of the sample was more or less evenly distributed across the three categories of fair, poor, and very good health, each around 19 %. As expected from the vignette design, the majority of respondents rated the hypothetical person in the first vignette as being in good, very good, or excellent health (over 95 %) and the person in the second vignette as being in poor or fair health (over 80 %).

Table 1 Summary statistics for self-rated health and ratings of vignettes in 2012 among Chinese adults (18–70 years old; N = 17,877)

Table 2 summarizes the distributions of the independent variables. We again describe only the weighted results given small differences between weighted and unweighted results. With respect to demographics, our analytical sample is almost evenly split between men and women and between different age groups, except for a smaller proportion of 61–70 year-olds (13 %) and a larger proportion of 41–50 year-olds (24.7 %). Nearly 85 % of the respondents were married. As for SES indicators, about a quarter of the sample graduated from senior high school or above, while one fifth, mostly elderly, received no formal education at all. On average, the respondents could answer only the first 18 out of a total of 34 verbal test items and earned an annual family income per capita of 10,200 Chinese yuan (roughly equivalent to US $1600; median = 6609 yuan, or US $1034), approximately 4.4 times as high as China’s rural poverty line (2300 yuan, see Zhang et al. 2012). The average total family wealth was about 303 thousand yuan (roughly equivalent to US $47,555; median = 123.6 thousand yuan, or US $19,333). As for political capital, about 28.3 % of the respondents had at least one family member who was a cadre or party member. In terms of residence, nearly half of the respondents lived in urban areas due to China’s rapid urbanization and rural-to-urban migration. Lastly, close to one third of the respondents were smokers at the time of interview. At the county level, the average GDP per capita was over 40,000 yuan, with an even larger standard deviation (over 50,000), confirming substantial geographic variation in economic development.

Table 2 Summary statistics for independent variables in 2010 in the regression analysis of self-rated health among Chinese adults (18–70 years old; N = 17,877)

4.2 Regression Results

4.2.1 Variations in Reporting Behaviors

Table 3 shows estimates of cut-point shifts from a HOPIT model with IPW, in which we allowed not only the SES indicators but also the other control variables to influence reporting heterogeneity. The results show that a number of factors are predictive of reporting behavior. Among these are SES measures such as education, verbal test score, and family wealth, but not family income or political capital. The roles of education and verbal test score in reporting behavior were non-monotonic, contingent upon the level of health. For example, better education was associated with upward cut-point shifts (i.e., higher standards) for ratings of very good (versus good) health or excellent (versus very good) health, but downward cut-point shifts (i.e., lower standards) for ratings of fair (versus poor) health and good (versus fair) health. Similarly, verbal test score was negatively related to the cut-point between fair and good health, but positively associated with the cut-point between good and very good health. In other words, respondents with higher education and higher cognition are less likely to use extreme values for rating health conditions than those with lower education and lower cognition. For a given level of true poor health, respondents with better education and verbal test score would be less likely to report poor health, but more likely to report fair health, thereby inflating the SES health gap. For a given level of true good health, however, respondents with higher education and higher cognition would be less likely than those with lower education and lower cognition to report very good (versus good) health or excellent (versus very good) health, thereby deflating the SES health gap. In contrast, certain group differences in reporting behavior are monotonic throughout the whole range of health conditions. For example, respondents from wealthier families held themselves to uniformly higher standards for health ratings.

Table 3 Associations of socioeconomic status in 2010 with cut-point shifts in self-rated health in 2012 among Chinese adults (18–70 years old; N = 17,877)

As for control variables, age, gender, and marital status were associated with reporting heterogeneity. In particular, men maintained consistently higher cut-points compared with women at different levels of health status. We do not observe much rural–urban difference, nor variation by health behavior, in reporting heterogeneity.
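The reporting mechanism described in this subsection can be illustrated with a minimal sketch of an ordered probit response rule. It is not the HOPIT model estimated here: the cut-point values, the function names, and the unit-variance error are assumptions chosen only to mimic the pattern reported for better-educated respondents (lower cut-points at the bottom of the scale, higher cut-points at the top), and none of the numbers come from Table 3.

```python
from math import erf, sqrt


def normal_cdf(x: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def category_probabilities(latent_health: float, cut_points: list[float]) -> list[float]:
    """Ordered-probit probabilities for each response category.

    K cut-points define K + 1 ordered categories (here: poor ... excellent).
    """
    cdf = [0.0] + [normal_cdf(c - latent_health) for c in cut_points] + [1.0]
    return [cdf[i + 1] - cdf[i] for i in range(len(cdf) - 1)]


# Hypothetical cut-points, for illustration only (not estimates from Table 3).
baseline_cuts = [-1.5, -0.5, 0.5, 1.5]
# Two lower cut-points shifted down and two upper cut-points shifted up,
# mimicking the non-monotonic pattern described for better-educated respondents.
shifted_cuts = [-1.8, -0.7, 0.7, 1.8]

labels = ["poor", "fair", "good", "very good", "excellent"]
latent_health = 0.0  # identical true health under both sets of cut-points

base = category_probabilities(latent_health, baseline_cuts)
shifted = category_probabilities(latent_health, shifted_cuts)

for label, p_base, p_shifted in zip(labels, base, shifted):
    print(f"{label:>10}: baseline {p_base:.3f}, shifted {p_shifted:.3f}")
```

Under these assumed values, the same latent health yields fewer responses in both extreme categories once the cut-points are spread outward, which is the sense in which reporting behavior alone can change observed ratings.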

4.2.2 Vignette-Adjusted SES Variations in Self-Rated Health

Given the findings above on reporting heterogeneity, we fitted a trimmed HOPIT model with only the SES indicators and control variables that significantly affect cut-point shifts (e.g., education, verbal test, and family wealth). Table 4 reports vignette-adjusted group differences in self-rated health as estimated from this more parsimonious HOPIT model, along with estimates from a conventional ordered probit model for comparison. Because the original coefficients were normalized within models and are thus incomparable in scale across models (see Footnote 1), we fixed the scale of the HOPIT model coefficients by dividing the estimated coefficients by the estimated variance terms (see Footnote 2), which is equivalent to imposing the same variance as in the ordered probit model (Jones et al. 2007).

Table 4 Associations between socioeconomic status in 2010 and self-rated health in 2012 among Chinese adults (18–70 years old; N = 17,877)

The comparison of coefficients between the conventional ordered probit model and the HOPIT model highlights the importance of considering reporting heterogeneity. After we control for reporting heterogeneity, health disparities by SES become more evident. For instance, the educational gradient in self-rated health would have been seriously underestimated had reporting heterogeneity been left uncorrected. In the conventional ordered probit model, only those who graduated from junior high school reported significantly better health than those without any formal education, whereas those who possessed a senior high school or higher degree reported no significant health advantage. However, after taking into account their higher standards for ratings of very good (versus good) health or excellent (versus very good) health, not only those who attended senior high school or above, but also those who only completed primary school reported significantly better health than those without any formal education. While family wealth was positively associated with self-rated health even without vignette adjustment, the estimated coefficient increased by more than one third (from 0.25 to 0.34) after vignette adjustment, because family wealth is positively associated with cut-point shifts for health ratings (Table 3).

For other SES indicators, reporting heterogeneity is less consequential, so vignette adjustment did not change results much. Family income was positively associated with self-rated health, anchored by vignettes or not, and the coefficient estimate remained largely unchanged after vignette adjustment. Similarly, in terms of political capital, respondents from families with at least one cadre or party member reported significantly better health than those without such family members, and the magnitude of such difference is about the same after vignette adjustment. Verbal test score, on the other hand, was not a significant predictor of self-rated health.

Men had better self-rated health than women, and the gender gap increased by 50% (from 0.16 to 0.24) after vignette adjustment. Urban residents reported significantly worse health than their rural peers, which is somewhat surprising but consistent with previous research (Whyte and Sun 2010; Tandon et al. 2006). One possible explanation is that urban residents are more likely to suffer from the health penalties of China’s unprecedented urbanization, including increased environmental pollution and overcrowding, new epidemic diseases (e.g., SARS, H1N1 flu), and rising obesity-related chronic diseases (Van de Poel et al. 2012). The two measures of health behaviors, drinking alcohol and regular physical exercise, were positively associated with self-rated health, although vignette adjustment made little difference in coefficient estimation.
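The relative changes quoted in this subsection for family wealth and gender can be checked with simple arithmetic. The short sketch below uses only the coefficient values stated in the text (0.25 and 0.34 for family wealth, 0.16 and 0.24 for gender); the function name is introduced here for illustration.

```python
def relative_change(before: float, after: float) -> float:
    """Proportional change from the unadjusted to the vignette-adjusted coefficient."""
    return (after - before) / before


# Coefficients quoted in the text: ordered probit versus vignette-adjusted HOPIT.
wealth_change = relative_change(0.25, 0.34)
gender_change = relative_change(0.16, 0.24)

print(f"Family wealth coefficient change: {wealth_change:.0%}")
print(f"Gender gap change: {gender_change:.0%}")
```

The results, roughly 36% for family wealth and 50% for gender, agree with the statements "more than one third" and "50%".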

4.2.3 Reporting Biases in SES Gradients in Self-Rated Health

To better understand the impact of reporting heterogeneity on the observed associations between SES and self-rated health, we carried out a counterfactual exercise as employed in prior research (Bago d’Uva et al. 2008b). In this exercise, we fixed the latent health status at that of a reference person with sample-average characteristics (the modal category for discrete characteristics and the mean for continuous characteristics). We then predicted the probability of reporting excellent health under the cut-points that respondents with different values on an SES dimension, such as level of education, would adopt, while holding everything else constant. We report the results as ratios of probabilities (i.e., relative probabilities) for the two SES groups being compared.

For this study, we focus on the effects of education and family wealth, not only because they are of primary research interest to us but also because they are correlated with reporting heterogeneity. Figure 1 plots the relative probabilities of reporting very good and excellent health with a fixed level of latent health and varying cut-points. The denominators are probabilities corresponding to the cut-points of respondents in the reference category. For the education results (the left panel), the reference category is no schooling. For the family wealth results (the right panel), the reference category is the value of family net worth at the 10th percentile. As shown in Fig. 1, reporting biases were substantial for both education and family wealth. For example, the relative probability of reporting excellent health ranged between 0.61 and 0.89 as the associated cut-points shifted between primary school graduates and senior high school or above. Similarly, the relative probability of reporting excellent health ranged between 0.76 and 0.95 as the associated cut-points varied between the 25th percentile and the 90th percentile of family wealth. This means that, given the same latent health of an average respondent (i.e., the reference person), the probability of giving an excellent health self-rating could be reduced by more than one-third simply because better-educated or wealthier respondents applied higher standards for self-rated health.

Fig. 1 Ratio of probabilities of reporting excellent health with fixed latent health and varying cut-points. Notes: Latent health is fixed for a reference person with characteristics set at mean (for continuous covariates) or mode (for categorical covariates). The ratio of two probabilities with different cut-points reflects the relative magnitude of the reporting effect
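The counterfactual exercise in this subsection can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the computation behind Fig. 1: it assumes a unit-variance ordered probit, a single top cut-point separating very good from excellent health, and entirely hypothetical cut-point values and group labels.

```python
from math import erf, sqrt


def normal_cdf(x: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def prob_excellent(latent_health: float, top_cut_point: float) -> float:
    """Probability of reporting the top category in an ordered probit."""
    return 1.0 - normal_cdf(top_cut_point - latent_health)


def relative_probability(latent_health: float, group_cut: float, reference_cut: float) -> float:
    """Ratio of P(excellent) under a group's cut-point to that under the reference cut-point."""
    return prob_excellent(latent_health, group_cut) / prob_excellent(latent_health, reference_cut)


# Latent health is held fixed at the reference person's level (hypothetical value).
latent_health = 0.0
# Hypothetical top cut-point for the reference category (e.g., no schooling).
reference_cut = 1.0
# Hypothetical top cut-points for comparison groups with increasingly high standards.
group_cuts = {"small upward shift": 1.05, "medium upward shift": 1.15, "large upward shift": 1.25}

ratios = {
    group: relative_probability(latent_health, cut, reference_cut)
    for group, cut in group_cuts.items()
}

for group, ratio in ratios.items():
    print(f"{group}: relative probability of reporting excellent health = {ratio:.2f}")
```

Because latent health is identical in the numerator and the denominator, any ratio below 1 reflects only the higher standard applied by the comparison group, which is how the relative probabilities in Fig. 1 are interpreted.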

4.2.4 Sensitivity Analysis

We undertook two sets of sensitivity analyses. First, we experimented with extending the age limit to 90 years. We obtained results similar to those using 70 years as the upper bound. Second, we investigated alternative coding schemes for the SES indicators, including a continuous measure of education (years of schooling) and categorical measures (quantiles) of verbal test score, family income per capita, and total family wealth. Again, this exercise yielded key findings similar to those presented above: (1) education and family wealth affect both reporting behavior and vignette-adjusted self-rated health; and (2) cognition is significantly associated with reporting behavior but barely related to health, whereas family income and political capital affect health but not reporting heterogeneity.

5 Discussion and Conclusion

In this study, we address a long-standing research question concerning the true association between SES and self-rated health in contemporary China, a research site where quality data are still limited and the social context differs from that of Western countries. Relative to prior research on the topic, our study has several strengths. First, we drew on recent data from the 2010–2012 CFPS, a newly launched, nationally representative survey in a period when socioeconomic inequality surged to an unprecedentedly high level (Xie and Zhou 2014; Gan et al. 2014). The CFPS covers a wide age range, from young adults to the elderly, so our results are generalizable to the entire Chinese adult population. In addition, the longitudinal nature of the CFPS data allows us to alleviate, if not eliminate, the potential issue of reverse causality in the relationship between SES measures and health status, because health status (2012) was measured two years after the SES indicators (2010).

Second, we proposed a new conceptual framework that encompasses distinct dimensions of SES, each of which is captured by one or more indicators in this study. The results support our approach, as not all the SES dimensions are associated with self-reported health in the same way, i.e., a positive SES-health gradient. Rather, the patterns are quite complex, with some associations contradicting our expectations and others being non-monotonic. For example, contrary to the emerging literature on cognitive epidemiology (Deary 2008; Deary and Batty 2007; Gottfredson and Deary 2004), we found no evidence of an association between cognition, measured through a verbal test, and self-rated health. On the other hand, education remains a robust predictor of health status even after adjustment for reporting heterogeneity. Because disparities in educational attainment have significantly declined during the reform era between men and women (Zeng et al. 2014) and between rural and urban areas (Ye 2015), this finding suggests a narrowing and overall decline over recent decades in health inequality between men and women and between rural and urban areas. However, given the importance of family income to health, and the increasing trend in income inequality over time (Xie and Zhou 2014), between-family differences, net of the rural and urban disparity, may have risen over time.

As for material conditions, we found that greater family wealth was associated with better self-rated health, independent of income. This finding is consistent with the growing health literature in developed countries showing that wealth is as important as income, if not more important, in affecting health (Pollack et al. 2007). By examining family income and wealth simultaneously and adopting a comprehensive measure of wealth summed across various family assets and liabilities, we improve upon prior research in developing countries that often relies on a composite measure of living standards as the sole economic indicator.

Unique to the Chinese context, political capital remains an influential factor for health, although the effect size is relatively small. This finding suggests that political elites continue to enjoy health advantages despite the ongoing market transition. More importantly, it seems that health benefits of political capital can also be extended to other family members who themselves do not possess it directly.

Third, we employed anchoring vignette methodology to adjust for reporting heterogeneity in self-rated health. We again uncovered complicated patterns of self-assessment by SES in that certain indicators such as income and political capital did not matter, while others such as education and verbal skill affected cut-point shifts in non-monotonic ways. We found family wealth to be associated with upward cut-point shifts except for the cut-point for ratings of fair (versus poor) health. We demonstrated that reporting heterogeneity could bias estimates of true health disparities by SES indicators by nearly 40%. To the extent that self-rated health is likely to remain a widely used health indicator in future household surveys in China or elsewhere, our findings on reporting heterogeneity highlight the crucial value of anchoring vignettes for reducing systematic measurement errors.

We acknowledge several limitations of our study. First, there is still abundant room for improving measurements, both for SES and health. We only considered verbal skill as a measure of cognition; future research should incorporate other cognitive abilities such as numeracy, logical reasoning, graphic skill, and memory function. Future research should also examine other health indicators beyond self-rated health, such as biomarkers. Second, we based our study on observational data in the CFPS and thus are limited in our capacity to draw causal inferences. The lack of anchoring vignette data in the 2010 baseline and inconsistent measures of self-rated health between the baseline and the 2012 follow-up survey prevent us from using such techniques as fixed-effects and difference-in-differences models to better control for individual-level heterogeneity. Third, related to the above point, our methodological approach in using IPW to adjust for sample nonresponse and panel attrition relies on the so-called missing at random assumption (Rubin 1976), i.e., that, conditional on observed covariates, no unobserved confounders affect both SES indicators and self-rated health. Of course, this assumption is unlikely to hold true in reality. The real question is to what extent likely violations of it would seriously alter the results we have presented in the paper. Unfortunately, we are not in a position to answer this question.

These limitations do not overshadow the significance of our study. Our study adds to an ongoing and expanding literature that documents the existence of SES inequalities in health among Chinese adults (Zhu and Xie 2007; Luo and Xie 2014; Yang and Kanavos 2012; Chen et al. 2010). However, SES health inequalities in China have not exactly followed those observed in Western societies. We have shown in this paper that the SES-health gradient in China varies in both direction and strength, depending on the specific indicator being examined. This complex pattern has been shaped by the unique social and economic experiences of China and should be understood in this specific societal context.