1 Introduction

Existing measures of health care access are frequently adapted to suit local contexts. While this produces indices that are well-suited to their study areas, it limits the generalizability of the results and their applicability to other study areas. This limited generalizability presents a challenge within HealthLink WV, a project to map and study access to primary health care within the state of West Virginia (Hong et al. 2020). The purpose of the project is to develop an improved measure of health care access to identify gaps in potential access, both spatially and non-spatially. This is a particular need for the state of West Virginia, as it is rural, mountainous, and relatively impoverished. The rural nature of the state means health care facilities are less numerous, while mountainous terrain impedes travel with winding roads that require slower speeds, and the greater potential for landslides leads to increased maintenance costs, which are exacerbated by the impoverished nature of the state. These challenges mean that traditional measures of health care access, which are designed for the entire United States, such as the Health Professional Shortage Area (HPSA), designate the entire state as lacking sufficient access to health care. Such a definition cannot be used to guide the state’s policymaking because there are not sufficient funds to address all the needs of the entire state. Therefore, areas of greater need must be differentiated from areas of lesser need, even if all areas are underserved to a degree. This improved definition is then to be used by the medical schools within West Virginia to inform the placement of medical students for their rural residency, so that the rural residency program can meet both the educational needs of the students and the access needs of rural areas with limited or no access to health care.

Measurements of access to health care have long followed a paradigm contrasting potential access versus realized access, and spatial access versus non-spatial access (Penchansky and Thomas 1981). The HealthLink WV interactive map allows users to examine the spatial barriers to health care access and the non-spatial barriers to access, and provides a map integrating both into a single index (Hong et al. 2020). The literature on health care access provided more guidance for the choice of the spatial measure of accessibility than for the choice of the non-spatial measure. The HealthLink WV spatial accessibility index uses a parameterization of the E2SFCA approach to measuring spatial access because of its prominence in the literature, as described below. However, there is no comparably dominant measure for assessing non-spatial barriers to accessing health care. Socio-economic barriers to accessing health care were incorporated into the HealthLink non-spatial accessibility index in two sets of factors: the ability to access primary care services, which is represented through the proportion of households without vehicle access or health insurance; and the ability to afford services, which is represented through the percent of households in poverty, the percent of residents without a high school diploma, the percent of households headed by females, and the percent of households who rent. These variables were selected with reference to the literature on health care access and local knowledge about the situation in West Virginia. As explained below, this selection is not as straightforward as choosing the E2SFCA approach for measuring spatial access because of the wide variety of socio-economic and demographic indicators present in the literature, often with little or no overlap in the choice of variables from one study to another.
The purpose of this paper is to assess the robustness of the choice of indices to measure non-spatial barriers to access to health care.

While there are many methods for measuring spatial access, the most common approach is the Enhanced 2-Step Floating Catchment Area (E2SFCA) method and methods derived from it (Luo and Qi 2009; Fransen et al. 2015; Delamater 2013; Luo and Whippo 2012; McGrail 2012). Most of the methods use this general framework but vary the definition of the floating catchments and the specification of the distance decay function employed by the gravity model underlying the E2SFCA. Donohoe et al. (2016a, b) compared methods of measuring spatial access, all of which are within the E2SFCA framework, as have Ngui and Apparicio (2011). They find similarities in the general spatial patterns of access, although there are some minor differences among spatial accessibility measures. This robustness within the spatial patterns of access, and the frequency with which the E2SFCA model is employed in the literature, enabled the HealthLink team to choose the E2SFCA approach for measuring spatial access.
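
For readers unfamiliar with the method, the two steps of the E2SFCA can be sketched in Python as follows. This is a minimal illustration of the general framework described by Luo and Qi (2009), not the HealthLink WV parameterization: the function names, the 10/20/30-minute travel-time zones, the zone weights, and the example data are all assumptions chosen for the illustration.

```python
def zone_weight(minutes, zones=((10.0, 1.0), (20.0, 0.68), (30.0, 0.22))):
    """Stepwise distance decay: weight of the first zone whose upper bound
    covers the travel time, or 0 outside the catchment (assumed zones)."""
    for upper_bound, weight in zones:
        if minutes <= upper_bound:
            return weight
    return 0.0


def e2sfca(population, supply, travel_time):
    """Return one accessibility score per population location.

    population[k]     -- residents at location k
    supply[j]         -- providers at facility j
    travel_time[k][j] -- minutes from location k to facility j
    """
    # Step 1: provider-to-population ratio for each facility, with demand
    # discounted by the distance-decay weight.
    ratios = []
    for j, providers in enumerate(supply):
        demand = sum(p * zone_weight(travel_time[k][j])
                     for k, p in enumerate(population))
        ratios.append(providers / demand if demand > 0 else 0.0)

    # Step 2: for each population location, sum the weighted ratios of
    # all facilities that fall within its catchment.
    return [sum(r * zone_weight(travel_time[i][j]) for j, r in enumerate(ratios))
            for i in range(len(population))]


# Hypothetical example: two communities and one clinic with 3 providers.
population = [1000, 2000]
supply = [3]
travel_time = [[5.0], [25.0]]
access = e2sfca(population, supply, travel_time)
```

In this toy case the community 5 minutes from the clinic receives a higher score than the one 25 minutes away, and the population-weighted scores sum back to the 3 providers. Variants within the framework differ mainly in how `zone_weight` and the catchment bounds are specified.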

However, while the E2SFCA is frequently used to measure spatial access, there is less consistency around how to measure non-spatial aspects of access to health care. Each study that develops an index to measure non-spatial aspects of health care access selects a distinct set of variables, often with limited overlap with variables in other studies. This limits the ability of existing literature to inform the selection of variables for developing the non-spatial component of the access index for HealthLink WV. In the absence of consistent guidance from existing literature, we initially developed a straightforward measure of non-spatial access based upon three likely socio-economic barriers: lack of vehicle access, lack of health insurance, and poverty. Recognizing the lack of consensus in non-spatial access variables, we then solicited feedback from representatives of the medical schools in meetings with them, anticipating suggestions of additional variables to incorporate based upon their experiences. However, the feedback we received was simply that the existing non-spatial measure was acceptable. With such limited feedback, and no clear guidance from the literature, we still had no indication that the index based upon these three variables is a valid measurement of socioeconomic barriers to accessing health care, so we expanded the index slightly to include the variables listed above. Furthermore, the extent to which there is agreement or disagreement among the various non-spatial indices remains an open question. While they use different socio-economic and demographic variables as inputs, there may be correlations among them because they all attempt to measure aspects of the same phenomenon: access to health care. If there is agreement, then the results of a health care access analysis are less sensitive to the choice of variables and thus choosing which metric to use is less critical.
However, if there is disagreement, the sensitivity can mean policies to address gaps in health care access may not target the areas of greatest need.

The aim of this paper is to assess the sensitivity of the choice of variables in developing the socio-economic component of the health care access index for HealthLink WV. To carry out this aim, we examine and implement a variety of non-spatial indices based upon different sets of socioeconomic and other demographic variables used in published literature. Then, following Donohoe et al.’s (2016a, b) comparisons of spatial measures of health care access, and LeSage and Pace’s (2014) comparison of different spatial weights matrices for spatial econometrics, we compare the indices to determine the extent to which they correlate.

2 Literature review

Much of the recent research on spatial access to health care centers on a single approach, the concept of catchment area considering supply, demand, and distance (Luo and Qi 2009), with variation in the specifications of how a catchment is defined and the form of the distance decay function. However, there is a wider variety of approaches for measuring non-spatial access to health care. All have the same general concept of identifying social, demographic, and economic barriers that can prevent a person from using a health care facility. However, the selection of which variables to use in this measurement is inconsistent, often with little or no overlap in the variables which are used. Table 1 provides an overview of the variables which are included in the indices analyzed in this paper. Some studies, such as Bascuñan and Quezada (2016), Domnich et al. (2016), and Yin et al. (2018), acknowledge the important rural/urban divides in health care access by using variables such as population density and USDA rural/urban codes. Others, such as Shah et al. (2015), Daly et al. (2019), and Chateau et al. (2012) recognize how economic barriers, such as low income and a lack of health insurance, can prevent people from accessing health care, even if it is physically located near their residence. Another set of limitations is derived from transportation issues, above and beyond the distance or travel time, such as rates of vehicle ownership or public transportation usage (Bascuñan and Quezada 2016; Shah et al. 2015; Paez et al. 2010; Asanin and Wilson 2008). Lastly, demographics such as family structure and education are recognized through variables like the single-parent household and the education level of the head of household, and are included in some indices (Paez et al. 2010; McGrail and Humphreys 2015; Gao et al. 2017). 
Most indices use variables from more than one of these categories, although in many cases there is little or no overlap of the specific variables used by any pair of indices.

Table 1 Variables and weights for implemented indices of socioeconomic barriers to health care access

Furthermore, how the indices or studies identify the non-spatial variables associated with a lack of health care access also varies. Many used regression analysis to find which variables are statistically significantly correlated with a dependent variable measuring spatial access to health care (Paez et al. 2010; Domnich et al. 2016; Gao et al. 2017). Others, however, used more qualitative means to select their variables (e.g. Shah et al. 2015; Asanin and Wilson 2008). For example, Shah et al. pulled potential factors, such as income and age structure, from the literature on health access, without conducting any additional analyses to identify variables which are significant within their study area. Meanwhile, Asanin and Wilson identified barriers to access through focus groups and interviews with their target population of immigrants. Despite this variation in the social, demographic, and economic attributes used to assess non-spatial barriers to health care access, as well as the means of selecting those variables, there is no systematic comparison of these measurements to determine the level of agreement among them.

3 Methods

To investigate how sensitive the map of access is to the choice of measurement of socio-demographic limitations, a total of 13 indices are calculated. These indices are derived from our preliminary work and 10 additional studies, most of which are regression-based analyses (Asanin and Wilson 2008; Bascuñan and Quezada 2016; Chateau et al. 2012; Daly et al. 2019; Domnich et al. 2016; Gao et al. 2017; McGrail and Humphreys 2015; Paez et al. 2010; Shah et al. 2015; Yin et al. 2018). The HealthLink index is the initial index we created, as described in the introduction, and the others are referred to by the first author of the paper in which they appear. The indices were calculated as closely as possible to the work from the papers, although because of limitations in data availability or differences in the data definitions from one country to another, some are not exact replicas. All of them are coded such that higher values represent less access, or more substantial socio-economic barriers to access. Missing variables occur when the US Census Bureau does not collect the attributes at the block group scale. If weights are provided in the article, we use those. If weights are not provided, we derive them from regression equations. If no means of assessing weights are available in an article, we use 1 for a variable that is included and 0 if it is excluded. Table 1 summarizes which attributes are used in which studies, and lists the weights used in each index as well.

The socio-economic data are from the United States Census Bureau’s American Community Survey, using the five-year estimates from 2011–2015 (U.S. Census Bureau 2017). Each variable was standardized to a range from 0 at its minimum to 1 at its maximum so that variables with larger ranges do not dominate the variation of the indices. Indices are calculated as weighted averages of these standardized variables using the Field Calculator in ArcGIS Pro, version 2.0 (ESRI 2018).
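
The standardization and weighting steps can be sketched as follows. The indices in this paper were computed with the ArcGIS Pro Field Calculator, so this Python version is only an illustrative equivalent: the function names, the column names, and the three block groups of data are hypothetical, and the sketch assumes the weights do not sum to zero.

```python
def min_max(values):
    """Rescale a variable so its minimum maps to 0 and its maximum to 1."""
    low, high = min(values), max(values)
    if high == low:
        # A constant variable carries no variation; map it to 0 throughout.
        return [0.0 for _ in values]
    return [(v - low) / (high - low) for v in values]


def weighted_index(columns, weights):
    """Weighted average of min-max standardized variables.

    columns -- dict: variable name -> raw values, one per block group
    weights -- dict: variable name -> weight (1 when a study gives no weight)
    """
    scaled = {name: min_max(columns[name]) for name in weights}
    total_weight = sum(weights.values())  # assumed to be non-zero
    n_units = len(next(iter(scaled.values())))
    return [sum(weights[name] * scaled[name][i] for name in weights) / total_weight
            for i in range(n_units)]


# Hypothetical values for three block groups (percentages).
columns = {
    "pct_no_vehicle": [0.0, 10.0, 20.0],
    "pct_uninsured": [5.0, 5.0, 15.0],
    "pct_poverty": [10.0, 30.0, 20.0],
}
weights = {"pct_no_vehicle": 1.0, "pct_uninsured": 1.0, "pct_poverty": 1.0}
index = weighted_index(columns, weights)
```

Because every input is rescaled to the same 0 to 1 range before weighting, a variable measured in large units cannot dominate the index simply through its scale, and higher index values continue to represent more substantial barriers.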

Once the indices are calculated, a correlation matrix is created using R (R Core Team 2018). This is presented in Table 2 and is visualized in Fig. 1, in which larger blue circles represent pairs of indices with a strong positive correlation, and in which red circles would represent pairs of indices which are inversely correlated, although there are no such pairs. Smaller and lighter circles represent pairs that have weaker correlations. Finally, a correlation-based heat map and a principal components analysis were produced in R to identify which groups of indices are more similar to each other. From these groups, we map one index from each group to identify a distinct set of indices and thereby illustrate the portions of the state where the different indices agree and where they disagree. This can highlight the portions of the state which are most sensitive to the choices of how socioeconomic barriers to health care access are measured, and are thus more susceptible to a continued lack of health care access if a policy choice is based upon a set of socioeconomic variables that is unsuited to the local context.
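
The comparison step can be illustrated with a short sketch. The analysis in this paper was carried out in R; the Python below is an assumption-laden stand-in that uses Pearson correlation (the correlation type is not stated in the text) and extracts only the first principal component of the correlation matrix by power iteration, a simpler technique substituted here for a full principal components analysis. The index names and values are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equally long lists of values."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def correlation_matrix(indices):
    """indices: dict of index name -> values per block group."""
    names = list(indices)
    matrix = [[pearson(indices[a], indices[b]) for b in names] for a in names]
    return names, matrix


def first_component(matrix, iterations=500):
    """Loadings and variance share of the first principal component,
    found by power iteration on the correlation matrix."""
    n = len(matrix)
    vec = [1.0 / math.sqrt(n)] * n
    for _ in range(iterations):
        nxt = [sum(matrix[i][j] * vec[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(v * v for v in nxt))
        vec = [v / norm for v in nxt]
    eigenvalue = sum(vec[i] * sum(matrix[i][j] * vec[j] for j in range(n))
                     for i in range(n))
    # The eigenvalues of a correlation matrix sum to n, so this is the share
    # of total variance captured by the component.
    return vec, eigenvalue / n


# Hypothetical index values for four block groups.
indices = {
    "index_a": [0.1, 0.4, 0.5, 0.9],
    "index_b": [0.2, 0.3, 0.6, 0.8],
    "index_c": [0.3, 0.1, 0.9, 0.7],
}
names, corr = correlation_matrix(indices)
loadings, variance_share = first_component(corr)
```

When all pairwise correlations are positive, as in this toy data, the first component has loadings of the same sign for every index, which is the pattern one would expect if the indices share a common underlying signal.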

Table 2 Correlation matrix among all indices

Fig. 1 A visualized correlation matrix of the indices

4 Results

Figure 1 presents a visualization of the correlation matrix in Table 2, in which more strongly correlated pairs of indices are represented with larger and darker dots, while weakly correlated pairs have smaller and lighter dots. Note that all dots are blue and all values in Table 2 are positive, indicating positive correlations, so all indices calculated here are directly correlated with each other, just to varying degrees. Correlation values range from 0.094 to 0.968. Figure 2 presents a heatmap with dendrograms, and Table 3 presents the loadings for the first three components from the principal components analysis. The first principal component represents 59.1% of the variance within the dataset, and has loadings that are similar for all indices, ranging from 0.203 to 0.339, again reinforcing the general correlation among all indices. The colors in the heatmap indicate the extent to which two indices are associated with each other. A red square indicates two indices strongly correlate with each other, while a green square indicates weaker correlation. Therefore, groups of indices which all share red boxes in Fig. 2, and which are connected in the dendrogram, are related to each other in how they represent socioeconomic barriers to health care access in West Virginia, whether or not they have many variables in common. For example, the three indices in the top left of this image, which are those by Domnich et al. (2016), Yin et al. (2018), and Bascuñan and Quezada (2016), form a bright red box, indicating strong agreement among all three of these indices. This is reinforced as the relevant dots in Fig. 1 are all large and dark blue, with correlation values ranging from 0.787 to 0.869. Likewise, all three of these indices are the only indices which have negative loadings for principal component 2, illustrating their distinction. 
Therefore, these three indices measure socioeconomic barriers to health care access in much the same way, even though there is little overlap in the selection of variables between any pair of these three indices, and no variable that is consistent across all three. Likewise, the dendrogram has these three as a group connected to each other, and disconnected from the remaining indices.
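
The grouping logic behind a dendrogram can be sketched with a small agglomerative clustering routine. The distance measure and linkage method behind Fig. 2 are not stated in the text, so this sketch assumes a distance of one minus the correlation and average linkage; the function name, the placeholder index labels, and the correlation values are invented for illustration and are not taken from Table 2.

```python
def average_linkage_groups(names, corr, n_groups):
    """Merge the two closest clusters repeatedly until n_groups remain.

    Distance between clusters is the mean of (1 - correlation) over all
    cross-cluster pairs of indices (average linkage, an assumption).
    """
    clusters = [[i] for i in range(len(names))]

    def distance(a, b):
        return sum(1.0 - corr[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > n_groups:
        # Find the closest pair of clusters and merge them.
        _, a, b = min((distance(clusters[a], clusters[b]), a, b)
                      for a in range(len(clusters))
                      for b in range(a + 1, len(clusters)))
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(names[i] for i in cluster) for cluster in clusters]


# Made-up correlations among four placeholder indices.
names = ["A", "B", "C", "D"]
corr = [
    [1.00, 0.85, 0.30, 0.20],
    [0.85, 1.00, 0.25, 0.30],
    [0.30, 0.25, 1.00, 0.80],
    [0.20, 0.30, 0.80, 1.00],
]
groups = average_linkage_groups(names, corr, n_groups=2)
```

With these invented values, A and B merge first, then C and D, leaving two groups. Indices can fall into the same group without sharing any input variables, since only their correlations enter the calculation.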

Fig. 2 “Heatmap” of similarity among the indices, with dendrograms linking the most similar indices to each other

Table 3 Loadings of the first three components from a PCA of the indices

The largest grouping is the larger red area in the lower right of Fig. 2, with the Daly et al. (2019) index, two indices from Chateau et al. (2012), and those from McGrail and Humphreys (2015), Gao et al. (2017), Paez et al. (2010), and Asanin and Wilson (2008). The correlation values within this group range from 0.557 to 0.968, and this area is not as solidly bright red in the heatmap, which indicates less consistency among the measures in this larger set of indices. The dendrograms show that these can be broken down into two subgroups, with the Daly index and the two Chateau indices in one subgroup and the remaining four indices in the other.

The other three indices have less agreement with these two groups of indices, with the social deprivation index from Chateau et al. (2012) having a weaker correlation with most other indices. Also, the HealthLink index from the introduction of this paper is combined in the dendrogram with the index from Shah et al. (2015), with a correlation value of 0.566, although both also have high correlation values with the index from Daly et al. (2019) at 0.739 and 0.796 for the HealthLink index and the Shah index respectively. These three indices feature the greatest positive loadings for principal component 3, reinforcing this similarity.

To map these groups, we select the Domnich index from the first group and present it in Fig. 3. The two subgroups of the larger set of indices are represented by the Paez index in Fig. 4 and by the material deprivation index from Chateau et al. (2012) in Fig. 5. Lastly, the group with weaker relationships with the others is represented by the HealthLink index in Fig. 6.

Fig. 3 Map of WV health care accessibility index based upon Domnich et al. (2016)

Fig. 4 Map of WV health care accessibility index based upon Paez et al. (2010)

Fig. 5 Map of WV health care accessibility index based upon Chateau et al. (2012) Material Deprivation Factor

Fig. 6 Map of WV health care accessibility using the original WV HealthLink Index

The first grouping places a higher emphasis on rural versus urban disparities, as reflected in the map of the Domnich index, presented in Fig. 3. In this map, the most rural and mountainous parts of the state, in the center and southeast, form the largest contiguous area of red in the map. Meanwhile, the areas with smaller block groups, and which are also the darker blue, are in urban areas. The additional red area along the southwestern border of the state is a historic coal mining region, which is now very impoverished.

Because the second group is the largest, two maps from it are presented, one from each of the two subgroups identified in the dendrogram: the Paez index (Fig. 4) and the material deprivation index by Chateau et al. (2012) (Fig. 5). The similarities between the two maps are evident, as is their dissimilarity from the map in Fig. 3. Both maps highlight more socio-demographic barriers to access in the southwest and parts of central West Virginia, again identifying the impoverished area in the southwest as facing especially severe socioeconomic barriers to health care access. The material deprivation index (Fig. 5) presents this pattern more clearly. However, the rural penalty that was apparent in Fig. 3, in the southeastern portion of the state, is not indicated here.

Lastly, the original HealthLink socioeconomic index is presented in Fig. 6. In Fig. 2, it is one of the indices which is not strongly in a group. It differs from the others in having a stronger presence in the negative end of the scale, which represents areas with fewer barriers to access. This is visible as the larger area represented as having fewer barriers to access in the northern and eastern parts of West Virginia. Even though it is not strongly in a group and has clear differences from the other maps, the same general trend is evident. There are more barriers in the southwest and central parts of the state than elsewhere. The southeastern area, which was identified as facing severe barriers by the Domnich index in Fig. 3, but having fewer barriers by the indices in Figs. 4 and 5, is mixed in the HealthLink index.

These results show that all indices are correlated and have the same general trend, with better access in urban areas, especially in the northern and eastern parts of the state, and worse access in the poorer areas in the central and southwestern areas. However, as the southeastern portion of the state demonstrates, this agreement among the indices is not consistent statewide.

5 Discussion

Even though they have a wide range of different variables, there is a positive correlation among all of the indices, as shown in Fig. 1 and Table 2. This suggests that, despite their differences, they are broadly effective at capturing the overall dynamics of where West Virginians face socio-economic barriers to accessing primary health care. The agreement among all the indices for the southwestern part of the state indicates that residents of this area experience more barriers to access than elsewhere. As mentioned in the results section, this is the most impoverished portion of the state, so it would show up strongly in economic limitations to access, and economic barriers are represented in all of the indices studied here through one variable or another. Meanwhile, urban areas, as they are generally wealthier areas of the state, fare better in all four maps.

This agreement occurs because of the common theme of incorporating some economic variables in the indices. The Domnich and Paez indices use per capita income, while the material deprivation index uses median household income. To capture the same general trend of income, the HealthLink index uses the proportion of the population that has an income less than the poverty line. Whether it is unemployment, income, or poverty, all but one of the indices incorporate at least one variable to measure employment and income. The outlier is the Bascuñan index, which may capture some of this dynamic with the variable measuring the proportion of the population that does not have a high school degree. Throughout all these studies, with varied study areas and health care services, this common factor emphasizes the importance of income in overcoming socioeconomic barriers to accessing health care. This is reinforced by the correlations among variables used to measure socioeconomic status in both urban and rural parts of the United States (Wang et al. 2012). That all indices are positively correlated even though some of them use different variables to measure this important factor illustrates the robustness of the indices to the variable chosen to represent this factor.

However, within this broad income-based agreement, there are still notable differences that emerge, owing to the different choices of variables measuring the other socioeconomic processes beyond the common theme of income and employment. The statewide agreement among all indices represented by the correlation matrix in Fig. 1 obscures local variations where there are regions within the state in which there is less agreement. Within the maps in Figs. 3–6, the most obvious discrepancy is within the southeastern and central parts of West Virginia, where the Domnich et al. (2016) index indicates a severe lack of access because it has population density and the USDA Rural–Urban Codes as variables, thus adding a strong rural versus urban component. Other indices that do not explicitly incorporate rurality through population density or Rural–Urban Codes place those areas in the middle or even slightly better than average with respect to non-spatial access to primary care. The factor common to the Paez and material deprivation indices is that both incorporate the proportion who are unmarried, acknowledging the effect of family structure on access to health care. Reflecting that they were in separate subgroups within the larger group of indices, the Paez index has variables related to age structure (proportion over 65 and under 18) while the material deprivation index has a heavy weighting on the proportion without a high school degree. Meanwhile, the HealthLink index is the only index examined here that explicitly includes the proportion of residents without health insurance, which could contribute to its lower correlation with other indices, as it is one of the indices that is not part of a distinct group within Fig. 2. Likewise, the social deprivation index places a stronger weight on the proportion of residents who live alone and on the proportion who moved in the past five years than any other index.
The different variables used here contribute to the regional differences in the access maps, and highlight that, even though there is broad statewide agreement of the general pattern, there should be care taken to ensure that the selection of variables is locally appropriate as well.

The choices of variables within these studies reflect the different origins of the studies used to create the indices. Some are for primary care (e.g., Daly et al. 2019), and others were conducted to measure access to more specialized facilities such as physiotherapy centers (e.g., Shah et al. 2015). Additionally, there are different parts of the world and different scales covered among these studies, ranging from the province of Saskatchewan (Shah et al. 2015) to the municipal scale in Concepcion, Chile (Bascuñan and Quezada 2016), to the entire country of China (Yin et al. 2018). Reflecting the unique situations of the study areas and the types of services being accessed, different processes promoting or inhibiting health care access are emphasized in these studies. This manifests as the different sets of variables mentioned in the preceding paragraph, with some highlighting the detrimental effects of rurality on health care access, while others focus more on family structure.

Because of the differences among the maps, we caution against the complete nonchalance in the choice of variables which was expressed in the initial meetings that prompted this research. This caution is especially true if the indices will be used to guide locally specific policies, as in this research, which will be used to guide the policies directing medical school residents to rural communities in need of improved health care access. For example, using the Domnich et al. (2016) index in Fig. 3, medical school residents would be sent to the southeastern and central parts of West Virginia, while that region of the state would not be as high a priority using other indices. Therefore, knowledge of the study area and the geographic processes that guide health care access in that study area is still valuable and potentially critical when guiding effective policy.

The main limitation of this work is that, while the indices it uses are derived from studies with a range of purposes and a range of study areas, it is limited to West Virginia, which may limit its generalizability to other parts of the world. By taking indices developed for one part of the world and applying them to another part of the world, which may have a vastly different health care system, it is also stretching the generalizability of those studies, as some variables which are important in one place may be irrelevant elsewhere. For example, as note b in Table 1 states, the McGrail index uses altitude and coastal status, which may be important factors in transportation networks in Australia. However, they are not relevant for West Virginia, which does not even have a coastline. Similarly, the HealthLink index incorporates the proportion without health insurance, but in countries with universal coverage, where everyone has health insurance, this cannot be a barrier.

This work is being combined with measurements of spatial access to primary health care facilities across West Virginia. Future work will aim to distinguish among the different facility types, both by measuring spatial access to distinct types of facilities, such as behavioral health care, and by identifying socioeconomic barriers specific to different types of health care facilities. These indices will also be compared against facility usage and health outcomes to measure realized access instead of just potential access to health care. These comparisons against actual usage will also inform the development of a more methodologically rigorous approach to constructing an index of health care access, as was done by Chateau et al. (2012). Most papers from which the indices were derived do not attempt to relate the potential barriers to realized barriers, instead associating the socioeconomic or demographic variables with potential barriers such as distance to the nearest health care facility. An exception to this is Chateau et al. (2012), who correlate their three indices (Socio-economic Factor, Material Deprivation, and Social Deprivation) with health outcomes such as life expectancy and self-rated health. Even this, though, does not directly relate to usage of health care facilities. Additionally, Asanin and Wilson (2008) presented their quantitative analyses alongside the results of focus groups that discuss the realized barriers to access. The data within the HealthLink system can also be analyzed with survey data collected about Appalachians’ beliefs and responses to the first several months of the COVID-19 pandemic to compare these potential barriers to access with the realized barriers to access measured in questions within this survey. This will illustrate the correlation between potential and realized barriers to access, at least within the context of the COVID-19 pandemic.

6 Conclusion

The general agreement among all the indices in this study suggests, in line with the conclusion of LeSage and Pace (2014), that despite the proliferation of indices for measuring and assessing non-spatial factors limiting (or promoting) access to health care, the identification of communities at risk is somewhat robust to the choice of variables used to measure socioeconomic barriers to health care access. The robustness comes from the common theme on the economic side of socioeconomic barriers. All indices measured the effect of a community’s economic status through variables such as income, employment, or the poverty rate. As long as economic status is incorporated somehow, there will be some robustness to these measures. However, this robustness is not complete, and statewide agreements reflected in the correlation matrix can hide local sensitivities to the choice of variables, especially with respect to how the social aspect of socioeconomic barriers is represented within variables. There was less agreement among the indices about which social aspects are relevant and thus included in the set of variables, and this corresponded to a decrease in agreement among the indices such that some of the correlations were weak, albeit still positive. As such, the indices have only a partial robustness from the common theme of economic status, as they are still sensitive to the choice of social and demographic variables, especially at local scales.