Introduction

Bikesharing is an affordable mode of transportation and a potential tool to reduce car usage in cities. However, in many cities, bikesharing seems to be used disproportionately by affluent populations. Disadvantaged areas are often underserved by bikesharing (Lin et al., 2018; Ogilvie & Goodman, 2012; Rixey, 2013), while bikesharing users tend to have a higher income than the average (Bernatchez et al., 2015; Fishman et al., 2014; Murphy & Usher, 2015).

A new tool that can increase bikesharing usage in disadvantaged areas is the electric bicycle (e-bike). Most of the underserved regions are located at the periphery of Indego’s service area. The e-bike’s motorized power and speed allow users to travel faster from these distant neighborhoods to the CBD and other distant key destinations. This may especially be relevant for persons without a car or those living in areas poorly served by public transport. The reduced effort required by e-bikes has the potential to attract new populations to Indego. However, there is no evidence that e-bikes increase bikesharing usage in disadvantaged areas. This study aims to fill that gap.

Indego, Philadelphia’s bikesharing system, joined the Better Bike Share Partnership (BBSP) to promote cycling among disadvantaged areas and populations in the city (BBSP, 2018). To promote equity, Indego offers a cash payment option for those without a credit card, gives discounts for food-stamp recipients, and has built additional docking stations in underserved regions (BBSP, 2018). However, these early efforts were not effective enough to equalize the usage between low, medium, and high income areas in Philadelphia (Caspi & Noland, 2019).

In the past year, Indego integrated e-bikes into its docked bikeshare. In November 2018, Indego started a pilot with ten e-bikes (Indego, 2019a), and in May 2019, the bikeshare added 400 e-bikes to its 1500 bicycle fleet (Indego, 2019b). Indego’s pass costs $17 a month or $156 a year for regular users and $5 a month or $48 a year for food-stamp holders. In addition to Indego’s membership fee, e-bike usage costs 15¢ per minute or 5¢ per minute for food-stamp holders. Penalties for missing or damaged equipment are $1000 for a bicycle and $2500 for an e-bike (Indego, 2019b). BBSP’s manager, Waffiyyah Murray, sees e-bikes as a vehicle for equity: “The introduction of e-bikes has been a game-changer in the bike share industry. This new technology will help address several barriers and open the door for new cyclists to try biking for the first time or use it more often as a regular form of active transportation” (Indego, 2019a).

In this study, I examine the influence that shared e-bikes have on usage in disadvantaged areas served by Indego. My research questions are: Has electric bikesharing increased Indego’s usage in disadvantaged areas? How many people in disadvantaged areas use shared bicycles and e-bikes in comparison to other regions? Do trips that start or end in disadvantaged areas have different patterns than trips in other areas? Finally, how are different spatial and demographic factors associated with bikesharing and e-bikesharing usage?

To answer these questions, I examined Indego’s usage in three months in 2019 – June, September, and December. I used descriptive statistics, Random Forest (RF) analysis, and regression analysis. In Sect. 2, I review the existing literature. In Sect. 3, I describe the methods and the data sources I used. In Sect. 4, I present the results of my analysis. In Sect. 5, I discuss the results of the different methods I used in relation to my questions and the existing literature, and in Sect. 6, I conclude this paper.

Literature review

The bicycle is an affordable mode of transportation, cheap to buy, use, and maintain. Naturally, it has great potential to attract underprivileged people. Bicycles have been in the world since the nineteenth century. They have had great usage in developed countries, such as Denmark and the Netherlands (Garrard et al., 2012) and developing economies such as China and India (Zuidgeest & Brussel, 2012). In the U.S., census tracts with the highest bicycle commute share have lower median income and higher poverty levels than other census tracts (Schneider & Stefanich, 2015).

However, some disadvantaged individuals seem to reject bicycle transportation (Gibson, 2015). Cycling has an image of a privileged activity and is seen as a symbol of gentrification and displacement of disadvantaged communities (Gibson, 2015; Wild et al., 2018), but it also carries an image of unsuccessful individuals (Hull Grasso et al., 2020). In addition, fear of robbery and assault is a major barrier to cycling among blacks and Hispanics (Brown, 2016).

Bikesharing is disproportionately used by affluent individuals (Bernatchez et al., 2015; Fishman et al., 2014; Murphy & Usher, 2015), and bikesharing stations in low-income areas are used less than others (Caspi & Noland, 2019; Lin et al., 2018; Ogilvie & Goodman, 2012; Rixey, 2013). Most bikesharing systems across the U.S. provide better service to advantaged communities (Brown et al., 2019), while disadvantaged areas tend to have fewer stations (Goodman & Cheshire, 2014; Smith et al., 2015). Nevertheless, lower income individuals are more likely to shift from being a non-cyclist to being a regular bikesharing user (Reilly et al., 2020). In Vancouver, lower income individuals are more likely to be among the most frequent 10% of users, who are responsible for more than half of the trips, than among less frequent users (Winters et al., 2019).

Race and gender also affect bikesharing usage (Hull Grasso et al., 2020); users are mainly white, educated, employed, young, and male (Buck et al., 2013; Fishman, 2016; Fishman et al., 2013; LDA Consulting, 2012; Virginia Tech, 2012). There is some evidence, however, that disadvantaged people of color use bikesharing more than disadvantaged whites (Dill & McNeil, 2021). Bikesharing may be unsuitable for disadvantaged areas as these are often too far from desirable destinations such as service and employment hubs. The long distance makes cycling uncomfortable, and users may exceed the bikesharing time limit (Hull Grasso et al., 2020). There is limited evidence that, once accessibility barriers are overcome, bikesharing might be used by disadvantaged individuals at the same rate as others (Dill & McNeil, 2021).

The integration of e-bikes into bikesharing fleets gained popularity in Europe and expanded to Asia and North America (Galatoulas et al., 2020). E-bikesharing systems can be docked (BiciMAD, 2014; Citi Bike NYC, 2019; Indego, 2018) or dockless (Guidon et al., 2019), integrated into a system with regular bicycles (Citi Bike NYC, 2019; Indego, 2018; Nice Ride Minnesota, 2018) or have their own system (Anzilotti, 2019; BiciMAD, 2014). Users prefer e-bikes to serve a wider variety of purposes, reach more distant destinations, deal with topography, and substitute for walking (Langford et al., 2013). In China, e-bikesharing attracts more car and transit users than non-powered bikesharing, as well as more young to middle-aged low-income males (Campbell et al., 2016). An inquiry into Zurich's e-bikesharing system shows that e-bikesharing riders travel the same or even shorter distances than regular cyclists. In Park City, Utah, however, users completed distant e-bikesharing trips (He et al., 2019). In urban contexts, the primary trip purpose of e-bikesharing is commuting, and it competes with public transportation and taxis (Guidon et al., 2019). In tourist destinations, e-bikesharing is used mostly for long round trips by casual users (He et al., 2019).

E-bikes are more expensive than non-motorized bicycles. E-bike usage is biased toward higher-income individuals (MacArthur et al., 2014; Simsekoglu & Klöckner, 2019). In e-bikesharing, however, the users do not need to buy the vehicle. E-bikesharing can benefit disadvantaged individuals by providing a faster and easier way to ride a bicycle to reach more distant destinations and make bikesharing suitable for their purposes. On the other hand, the higher price of e-bikes may deter them from using that service. To date, there is no evidence that the integration of e-bikes in a bikesharing system promotes usage equity, and I do not expect it to do so.

Methodology

In this study, I examined Indego’s usage in three periods in three steps of analysis. First, I used descriptive statistics to explore usage patterns in Indego’s trip logs. After calculating spatial variables, I created a Random Forest (RF) model to investigate the importance of several external influences on Indego’s ridership rates. One of the most significant benefits of RF is its ability to examine the importance of many explanatory variables without the need to account for multicollinearity. However, RF does not provide the direction (positive or negative) of the association. Therefore, in my last step, I used a spatial negative binomial regression model to utilize the benefits of the RF model and compensate for its shortcoming.

Data

The data in this study is an integration of Indego trip logs, census statistics, and other spatial components of Philadelphia. I retrieved Indego’s trip logs between April 2018 and March 2020 while later choosing to focus on a shorter period. The trip logs provide information for each trip conducted in the system and include the time, date, and location of departures and arrivals. They also include the type of shared vehicle – bicycle/e-bike, the type of subscription, and the bike ID. I used these datasets for a) descriptive statistics analysis, b) extracting the locations of active docking stations, and c) aggregating the total trips departing and arriving at each station in different periods. To reduce the risk of error records, I excluded trips longer than 12 h.
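The filtering and aggregation steps described above can be illustrated with a minimal Python sketch. It is not the processing pipeline used in this study; the field names (`start_time`, `end_time`, `start_station`, `end_station`) and the sample trips are hypothetical and chosen only for illustration.

```python
from collections import Counter
from datetime import datetime, timedelta

MAX_DURATION = timedelta(hours=12)  # exclusion threshold stated in the text


def station_trip_counts(trips):
    """Count departures and arrivals per station, skipping trips longer than 12 h.

    Each trip is a dict with hypothetical keys: 'start_time' and 'end_time'
    (datetime objects), 'start_station' and 'end_station' (station IDs).
    """
    departures, arrivals = Counter(), Counter()
    for trip in trips:
        if trip["end_time"] - trip["start_time"] > MAX_DURATION:
            continue  # treated as a likely erroneous record
        departures[trip["start_station"]] += 1
        arrivals[trip["end_station"]] += 1
    return departures, arrivals


# Made-up sample records, not real Indego data.
trips = [
    {"start_time": datetime(2019, 6, 1, 8, 0), "end_time": datetime(2019, 6, 1, 8, 20),
     "start_station": "A", "end_station": "B"},
    {"start_time": datetime(2019, 6, 1, 9, 0), "end_time": datetime(2019, 6, 2, 9, 0),
     "start_station": "A", "end_station": "C"},  # 24 h long, so it is excluded
    {"start_time": datetime(2019, 6, 1, 17, 0), "end_time": datetime(2019, 6, 1, 17, 30),
     "start_station": "B", "end_station": "A"},
]
departures, arrivals = station_trip_counts(trips)
```

Keeping departures and arrivals in separate counters mirrors the separate departure and arrival models examined later.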

After examining Indego’s annual usage pattern, I decided to focus on three periods: June, September, and December 2019 (Fig. 1). Focusing on three periods allowed me to examine the assimilation of e-bikes in Indego over the course of the year while controlling for seasonality. A full-year analysis was impossible as the service was interrupted by a temporary suspension of the e-bike fleet between mid-January and mid-February 2020. Moreover, the COVID-19 pandemic, which erupted in the U.S. in mid-March 2020, led to a sharp decline in Indego’s ridership. The three periods that I decided to examine represent three phases in the Indego service: June – Bikesharing usage is at peak while e-bikesharing usage is still growing; September – Bikesharing and e-bikesharing usage are at peak; and December – Bikesharing and e-bikesharing usage are at a low due to the winter.

Fig. 1 Daily usage of shared bicycles and e-bikes in Indego between April 2019 and March 2020

To examine the role of external influences on Indego’s usage, I geocoded the locations of the active docking stations during each of the examined periods and created service area polygons (Figs. 2 and 3). Some stations were introduced, and some were shut down during the examined period. The service area polygons are based on a 400 m street network distance from the stations, with any overlapping areas allocated to the closest station. Other research has used a wide variety of buffer distances, ranging from 200 m (El-Assi et al., 2017), 250 m (Faghih-Imani et al., 2014), 300 m (Faghih-Imani & Eluru, 2015; Zhang et al., 2017), 400 m (Caspi & Noland, 2019; Mateo-Babiano et al., 2016; Noland et al., 2016; Sun et al., 2018; Wang et al., 2016), and 500 m (Wang et al., 2018) to 800 m (Buck & Buehler, 2012). NACTO (2016) recommends a 3–5 min walk between stations (5 min is about 400 m). The service area polygons served as the study unit in the RF and the regression analysis.

Fig. 2 Daily bicycle trips (departures and arrivals) in September 2019 by docking station service area

Fig. 3 Daily e-bike trips (departures and arrivals) in September 2019 by docking station service area

Using ArcGIS 10.7, I calculated the spatial components of the service areas, which are listed in Table 1. The demographic variables were retrieved from the American Community Survey (ACS) 2014–2018 5-year estimates by the U.S. Census Bureau (U.S. Census Bureau, 2019). These are aggregated demographics at the census block group level. I used area-based weighted means to project the variable values from the census block group polygons to the service area polygons. Since Indego’s trip logs do not include demographic data, I had to use area-based demographics.
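As an illustration of the area-based weighting, the following Python sketch computes the weighted mean for a single service area. It assumes the overlap areas between the service area and each census block group have already been measured in a GIS; the function name and the sample numbers are hypothetical.

```python
def area_weighted_mean(pieces):
    """Area-weighted mean of a block-group variable over one service area.

    `pieces` is a list of (overlap_area, value) pairs, one pair for each
    census block group that intersects the service area.
    Returns None when the total overlap area is zero.
    """
    total_area = sum(area for area, _ in pieces)
    if total_area == 0:
        return None
    return sum(area * value for area, value in pieces) / total_area


# Hypothetical service area: 3000 m2 overlap with a block group whose median
# income is 40,000 and 1000 m2 overlap with one whose median income is 60,000.
income = area_weighted_mean([(3000, 40000), (1000, 60000)])
```

In this made-up case the larger overlap pulls the result toward the first block group, giving 45,000.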

Table 1 List of spatial variables

Five census block groups that overlay with eight Indego service areas had no residents and no demographic variables. Two of them are on the west bank of the Schuylkill River, in the area of the University of Pennsylvania. Three census block groups are located at the Philadelphia Naval Business Center at the city's southern shore. To avoid any bias, I excluded the affected service areas from the analysis.

Before examining bikesharing usage in disadvantaged areas, there is a need to define what is considered disadvantaged. The common tool across micromobility studies is income (Caspi & Noland, 2019; Goodman & Cheshire, 2014; Heinen et al., 2010; Ogilvie & Goodman, 2012; Smith et al., 2015). However, low income can misleadingly classify students, who tend to use micromobility more than other populations, as disadvantaged (Caspi et al., 2020; Schneider & Stefanich, 2015). Another indicator can be the rate of the population under the poverty line, which may reduce the false indication of students. Race is another important demographic indicator that was found to be correlated with bikesharing usage. Previous studies suggest that bikesharing is used more by whites (Biehl et al., 2018; Wang et al., 2016) while others, including blacks and Hispanics, avoid the service (Brown, 2016). I, therefore, used the measurements for the annual median income per household, the rate of households under the poverty line, the proportion of the white population, and the proportion of the black population. I also included the proportion of the student population to control for its effect.

To determine the disadvantaged areas, I ranked the value of each of the four variables for each docking station service area. I summed the ranks and flagged the lowest 25% of service areas (35 stations in each month) as disadvantaged areas. In the descriptive statistics section, I compared the disadvantaged areas to the rest of the areas. The differences in attributes between the disadvantaged stations and the other stations are detailed in Appendix A.
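The rank-sum classification can be sketched in Python as follows. Several details are assumptions made only for this illustration, since the text does not spell them out: income and white population share are ranked in ascending order, poverty rate and black population share in descending order (so that a low rank always means more disadvantaged), ties are broken by input order, and the number of flagged areas is rounded down. The dictionary keys are hypothetical names.

```python
def rank(values, reverse=False):
    """Return 1-based ranks; rank 1 goes to the smallest value (largest if reverse)."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=reverse)
    ranks = [0] * len(values)
    for position, index in enumerate(order, start=1):
        ranks[index] = position
    return ranks


def flag_disadvantaged(areas, share=0.25):
    """Flag the `share` of service areas with the lowest summed ranks."""
    # Assumed directions: low income / low white share rank first,
    # high poverty / high black share rank first.
    income = rank([a["median_income"] for a in areas])
    white = rank([a["white_share"] for a in areas])
    poverty = rank([a["poverty_rate"] for a in areas], reverse=True)
    black = rank([a["black_share"] for a in areas], reverse=True)
    totals = [sum(r) for r in zip(income, white, poverty, black)]
    n_flagged = int(len(areas) * share)  # assumption: round down
    lowest = sorted(range(len(areas)), key=lambda i: totals[i])[:n_flagged]
    flagged = set(lowest)
    return [i in flagged for i in range(len(areas))]


# Four made-up service areas; only the first should be flagged.
areas = [
    {"median_income": 20000, "white_share": 0.1, "poverty_rate": 0.40, "black_share": 0.80},
    {"median_income": 50000, "white_share": 0.5, "poverty_rate": 0.20, "black_share": 0.30},
    {"median_income": 80000, "white_share": 0.7, "poverty_rate": 0.10, "black_share": 0.10},
    {"median_income": 120000, "white_share": 0.9, "poverty_rate": 0.05, "black_share": 0.05},
]
flags = flag_disadvantaged(areas)
```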

While the focus of this study is on sociodemographic factors, I used a wide variety of variables to control for influences, which were found to be significant in previous bikesharing studies (Eren & Uz, 2020). These variables are listed in Table 1. Cervero and Kockelman (1997) coined the term 3D’s to indicate density, diversity, and design as major contributors to increasing non-motorized transportation. This study used population density to measure density, land use entropy to measure diversity, and intersection density to measure street design. Many bikesharing studies found that denser environments (Ahillen et al., 2016; Biehl et al., 2018; El-Assi et al., 2017; Faghih-Imani et al., 2014, 2017; Lin et al., 2018; Noland et al., 2016), mixed land uses (Lin et al., 2018; Zhang et al., 2017), and denser street networks (Li et al., 2018; Lin et al., 2018) increase bikesharing ridership.

I calculated the land use entropy, a measurement for the diversity of land uses in each service area, using an entropy formula: \(\text{Entropy} = -\left[ \sum_{i=1}^{k} p_{i} \ln p_{i} \right] / \ln k\), in which \(p_{i}\) is the proportion of land use \(i\) and \(k\) is the number of land uses measured (Song et al., 2013); scaled from zero to one, this index increases as land use mix increases within a polygon.
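A minimal Python sketch of this index is shown below. It assumes that the proportions of all measured land uses are passed in (including zeros) and that a land use with zero proportion contributes nothing to the sum, following the usual convention that p ln p tends to zero as p tends to zero.

```python
import math


def land_use_entropy(proportions):
    """Normalized land use entropy, between 0 (single use) and 1 (even mix).

    `proportions` holds the share of each measured land use in a service
    area, so k is simply the length of the list.
    """
    k = len(proportions)
    if k < 2:
        return 0.0
    # Zero proportions are skipped: p * ln(p) is taken as 0 when p == 0.
    total = sum(-p * math.log(p) for p in proportions if p > 0)
    return total / math.log(k)


even_mix = land_use_entropy([0.2, 0.2, 0.2, 0.2, 0.2])    # five equal land uses
single_use = land_use_entropy([1.0, 0.0, 0.0, 0.0, 0.0])  # fully residential, say
```

With the five land use categories used in this study, an even mix yields a value of about one and a single-use polygon yields zero.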

Other factors found to increase bikesharing usage include higher employment density (El-Assi et al., 2017; Lin et al., 2018; Sun et al., 2018; Wang et al., 2018), and shorter distance to the CBD (Biehl et al., 2018; Faghih-Imani et al., 2014; Li et al., 2018; Wang et al., 2016). Land use compositions have been found to influence ridership in different manners (Faghih-Imani & Eluru, 2015; Goodman & Cheshire, 2014; Noland et al., 2016; Wang et al., 2018). Residential land uses, for example, generate more trips in the mornings and attract more trips in the evenings (Mateo-Babiano et al., 2016; Zhang et al., 2017), while commercial land uses show the opposite direction (Buck & Buehler, 2012; Faghih-Imani et al., 2014). Other studies, however, found different patterns (Li et al., 2018; Sun et al., 2018). Based on Philadelphia’s official dataset, I divided the land use into five categories: residential, commercial, industrial, recreation, and others (which include infrastructure, transportation, water, cemeteries, and more).

In some cases, bikesharing stations in proximity to bus stops and metro and rail stations have higher ridership (Caspi & Noland, 2019; El-Assi et al., 2017; Faghih-Imani et al., 2014; Goodman & Cheshire, 2014; Li et al., 2018; Noland et al., 2016). Here, I examined the influence of bus stops, trolley stops, metro stations, and regional rail stations separately. In addition, I combined all the transit locations into one variable.

Bicycle infrastructure has been found to have a great influence on bikesharing and cycling in general (Buehler & Dill, 2016; Faghih-Imani & Eluru, 2015; Faghih-Imani et al., 2014; Fishman, 2016; Heinen et al., 2010; Mateo-Babiano et al., 2016; Wang et al., 2016, 2018; Zhang et al., 2017). Here I tested three different approaches to measuring bikeways, which include on and off-road trails dedicated to cycling: (a) a dummy variable for the existence of any length of bikeways in the service area, (b) the total length of bikeways in the service area, and (c) the ratio between bikeways and roads, which should compensate for variations in nature and size of the service areas.

The location of a docking station within the bikesharing system also influences its usage. Some studies found that docking station density increases ridership (Ahillen et al., 2016; El-Assi et al., 2017; Faghih-Imani et al., 2014, 2017; Li et al., 2018), while others found the opposite (Faghih-Imani et al., 2017; Wang et al., 2016, 2018; Zhang et al., 2017). I used two measurements for docking station density: the number of docking stations within 800 m and the distance to the closest docking station.

Pucher et al. (2010) claim that car availability greatly influences an individual’s decision to cycle, although the findings regarding the influence on bikesharing are mixed (Biehl et al., 2018; Buck & Buehler, 2012; Lin et al., 2018). Women use bikeshares substantially less than men (Beecham & Wood, 2014; Faghih-Imani & Eluru, 2015; Ogilvie & Goodman, 2012). Therefore, I added the number of cars available per person and the proportion of males in the population.

Random forest

To understand how different spatial factors are associated with bikesharing usage, I used RF analysis. RF is a machine learning technique based on a decision tree model. Decision tree analysis splits the sample based on the value of a selected explanatory variable (Xi) into two groups with similar response values (Yi). This process repeats until no more splits are possible (Hastie et al., 2009).

RF is an array of decision trees, where each decision tree is given a limited set of explanatory variables. The data are divided into a training set and a validation set. The training set is used to build the model, and the validation set is used to evaluate the model's accuracy. As a result, the RF provides an importance analysis of the explanatory variables (Hastie et al., 2009). The importance of each variable is measured for its contribution to the improvement in the split-criterion across the forest using two measurements: Increased Mean Squared Error (MSE) and Increased Node Purity. Increased MSE is calculated by subtracting the model's original MSE from the MSE obtained after permuting the values of a variable, and dividing the difference by the original MSE: \(\text{Increased MSE} = \left( \text{MSE}_{\text{permuted}} - \text{MSE} \right) / \text{MSE}\). The higher the Increased MSE value, the greater the importance of this variable. Increased Node Purity is based on the Gini impurity technique, used by the RF model to determine the node splits (Hoare, 2018). However, the Increased Node Purity may be biased (Strobl et al., 2007); thus, in this study, I use the Increased MSE measurement.
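To make the Increased MSE measurement concrete, the Python sketch below applies the ratio (MSE after permutation minus original MSE, divided by original MSE) to a single variable. It is only an illustration of the permutation idea: the prediction function is a fixed, hypothetical stand-in rather than a trained random forest, the data are made up, and it does not reproduce the internal averaging or scaling done by the randomForest package.

```python
import random


def mse(predict, rows, targets):
    """Mean squared error of `predict` over the given rows."""
    return sum((predict(r) - t) ** 2 for r, t in zip(rows, targets)) / len(rows)


def increased_mse(predict, rows, targets, column, rng):
    """Relative increase in MSE after shuffling one explanatory variable."""
    baseline = mse(predict, rows, targets)
    shuffled = [r[column] for r in rows]
    rng.shuffle(shuffled)
    permuted_rows = [r[:column] + [v] + r[column + 1:] for r, v in zip(rows, shuffled)]
    return (mse(predict, permuted_rows, targets) - baseline) / baseline


# Made-up data: the response depends on column 0 only; column 1 is irrelevant.
rows = [[float(i), float(i % 3)] for i in range(20)]
targets = [2.0 * r[0] + (0.5 if i % 2 == 0 else -0.5) for i, r in enumerate(rows)]


def predict(row):
    """Hypothetical stand-in model that uses column 0 and ignores column 1."""
    return 2.0 * row[0]


rng = random.Random(0)
importance_relevant = increased_mse(predict, rows, targets, 0, rng)
importance_irrelevant = increased_mse(predict, rows, targets, 1, rng)
```

Shuffling the variable the model relies on inflates the error, whereas shuffling the ignored variable leaves the error unchanged, which is why a higher value signals greater importance.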

The advantage of machine learning, and specifically RF, over regression analysis is the possibility of using a large number of explanatory variables. Unlike in regression analysis, the decision tree model does not use all the explanatory variables at once but algorithmically chooses the most effective one, the variable that splits the sample in the best way. Collinearity is not a concern in that case since the influence is not divided among the explanatory variables, as it is in a regression analysis. The RF analysis iteratively provides a random set of variables for each decision tree and makes sure that all the variables are considered for splitting the sample.

The RF method is useful in evaluating the importance of different variables in an explanatory process. However, RF does not provide the direction of the effect of explanatory variables on the dependent variable, as a regression model does. Moreover, RF's strength comes from large samples, and it is suitable for big data tasks. In this study, the sample is small (N = 126 to 135 stations, depending on the month), and the error is large. Therefore, I use regression analysis in addition to RF to strengthen my findings.

For this study, I created 48 RF models using the randomForest package in R. For the three periods, June, September, and December, I examined departing and arriving bicycle and e-bike trips in four temporal divisions: the entire sample, weekday mornings (7–10 AM), weekday evenings (4–7 PM), and weekends and holidays. For each RF model, I found the optimal number of trees and used the default number of variables in each tree, nine (the total number of explanatory variables divided by three). The findings are presented in Sect. 4.2.
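The count of 48 models follows from crossing the three periods, two trip directions, two vehicle types, and four temporal divisions. The short Python sketch below enumerates these combinations; the labels are illustrative only.

```python
from itertools import product

periods = ["June", "September", "December"]
directions = ["departures", "arrivals"]
vehicles = ["bicycle", "e-bike"]
time_windows = [
    "all trips",
    "weekday mornings (7-10 AM)",
    "weekday evenings (4-7 PM)",
    "weekends and holidays",
]

# One model specification per combination: 3 x 2 x 2 x 4 = 48.
model_specs = list(product(periods, directions, vehicles, time_windows))
```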

Spatial negative binomial regression

To get better insights into the usage of Indego’s bikesharing and e-bikesharing, I performed an additional regression analysis. Indego’s usage rate is represented by the number of trips conducted in each station. As is appropriate for count data, the usage rates follow a negative binomial distribution. Due to the nature of the bikesharing system, and as I found in a series of Moran’s I tests, usage rates are spatially autocorrelated, i.e., affected by the station’s location. Therefore, I adopted a conditional autoregressive (CAR) approach for spatial negative binomial (Poisson-Gamma) regression, using the INLA package in R. A negative binomial CAR regression model allows examining the effect of explanatory variables on a count response variable while accounting for the spatial distribution. INLA provides approximate Bayesian inference for Latent Gaussian Models (Rue et al., 2021). In addition, I performed a Moran’s I test for the residuals of each model and reported the results. Insignificant spatial autocorrelation indicates that the model successfully accounted for the spatial component.
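For readers unfamiliar with Moran's I, the following Python sketch computes the global statistic from a list of values and a spatial weights matrix. It covers the statistic only, not the significance test, and the binary adjacency weights for four stations on a line are an assumption made for the example rather than the weighting scheme used in this study.

```python
def morans_i(values, weights):
    """Global Moran's I for `values` given a square spatial weights matrix."""
    n = len(values)
    mean = sum(values) / n
    deviations = [v - mean for v in values]
    weight_total = sum(sum(row) for row in weights)
    cross = sum(
        weights[i][j] * deviations[i] * deviations[j]
        for i in range(n)
        for j in range(n)
    )
    squared = sum(d * d for d in deviations)
    return (n / weight_total) * (cross / squared)


# Four hypothetical stations in a row; neighbors are adjacent stations.
weights = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
clustered = morans_i([1, 2, 3, 4], weights)    # similar values sit side by side
alternating = morans_i([1, 4, 1, 4], weights)  # neighbors are always dissimilar
```

Positive values, as in the first case, indicate that nearby stations have similar usage; negative values indicate that neighbors differ.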

In this study, I use 25 different variables to measure the spatial components of the docking stations’ service areas. To reduce the risk of multicollinearity in the regression models, I did not include in the same model any two variables with a Pearson’s correlation higher than 0.3 (Appendix B). I also included only one variable from each series of variables representing the same phenomenon (e.g., bikeway (dummy), bikeway length, and bikeways to roads ratio). In each regression model, I included the variables with the higher Increased MSE values found in the RF analysis. Before running the regressions, I ensured that there was no multicollinearity by performing a series of variance inflation factor (VIF) tests. All the VIF results were below 2, so there was no concern of multicollinearity.
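One way to combine the two selection rules (prefer variables with higher Increased MSE, avoid pairs correlated above 0.3) is a greedy pass over the variables in order of importance. The Python sketch below shows this reading; it is an assumption about the procedure, not a description of the exact steps taken, and it compares the absolute value of the correlation. The variable names, values, and importance scores are made up.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def select_variables(columns, importance, threshold=0.3):
    """Greedily keep the most important variables that are not correlated
    (absolute Pearson r above `threshold`) with any variable already kept."""
    chosen = []
    for name in sorted(columns, key=lambda n: importance[n], reverse=True):
        if all(abs(pearson(columns[name], columns[other])) <= threshold
               for other in chosen):
            chosen.append(name)
    return chosen


# Hypothetical values for six service areas.
columns = {
    "distance_to_cbd": [1, 2, 3, 4, 5, 6],
    "employment_density": [6, 5, 4, 3, 2, 1],  # strongly correlated with distance
    "bus_stops": [1, 3, 2, 2, 3, 1],           # uncorrelated with distance
}
importance = {"distance_to_cbd": 30, "employment_density": 20, "bus_stops": 10}
selected = select_variables(columns, importance)
```

In this made-up case the second most important variable is dropped because it is almost perfectly correlated with the first.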

Results

Trip statistics

During its first half-year of operation, Indego’s e-bikes were used less than its standard bicycles. In May 2019, Indego added 400 e-bikes to its 1500 shared bicycle fleet, composing 26.7% of all the vehicles (Indego, 2019b). However, the e-bike usage share was lower than this proportion. In June, e-bike usage was only 12.7% of the total usage in the system; in September, the share of e-bike trips increased to 22.2%, and in December, it decreased to 18.5%. The share of round trips to the same station, which implies recreational usage, was almost twice as high for e-bikes as for bicycles when the service launched, but the difference diminished as time progressed (Table 2).

Table 2 Mean values for trips that started or ended in disadvantaged service areas versus the rest of the service areas

Compared to the previous year, bicycle usage was lower in June and December but higher in September (Table 3). However, the total usage of Indego increased across the city. E-bike usage was responsible for a tremendous increase in Indego usage in disadvantaged areas in June and September. In December, bicycle and e-bike usage decreased and led to only a slight increase in the total number of trips.

Table 3 Indego trip summary in June, September, and December 2019 compared to 2018, in disadvantaged areas versus the rest of the areas, and the percent change from the previous year

Disadvantaged areas comprise 25% of the stations. These stations were the origin or the destination of about 22–23% of Indego’s bicycle trips in 2018, while the rest of the trips remained entirely in non-disadvantaged areas. In 2019, the share of bicycle trips in disadvantaged areas remained similar or even decreased, but together with the e-bike ridership, disadvantaged areas were responsible for about 25% of Indego’s ridership in June and September. In December, Indego’s ridership share in disadvantaged areas was even lower than in the previous year.

The most significant advantage of an e-bike is its ability to go faster and further than bicycles. Indeed, e-bike trips were longer in length, a gap that intensified throughout the year. However, the average trip speed for e-bikes was lower than for bicycles in June, about the same in September, and faster only in December. E-bike trip duration was much longer than bicycle trip duration in June and September and only slightly longer in December. The increase in distance and duration is higher in disadvantaged areas, which indicates that users in those areas use e-bikes to reach more distant destinations. Most of the disadvantaged areas are in the periphery of the bikesharing system (Figs. 2 and 3); hence it is reasonable to observe longer trips.

Round trips indicate leisure trips. Indego’s pricing mechanism makes it unreasonable to use the service to reach destinations far from docking stations. Throughout the examined period, the share of bicycle and e-bike round trips was higher in non-disadvantaged areas. The only exception is a higher share of e-bike round trips in June in disadvantaged areas. These findings imply that leisure trips were less common in disadvantaged areas, but e-bike leisure trips were more common there in June.

In disadvantaged areas, Indego’s usage, as shown by the number of trips per station in Table 2, was slightly lower than the rest of the system. The usage of shared e-bikes in these areas, however, was higher. In June, e-bike trips that started or ended in disadvantaged areas almost doubled the trips between two non-disadvantaged stations. Thus, the overall increase of Indego usage was higher in disadvantaged areas than in the other areas. However, bicycle usage remained lower in disadvantaged areas. Figure 2 shows that bicycle usage in September was low in disadvantaged areas compared to other areas, while Fig. 3 shows that e-bike usage was as high as other parts of the system.

The temporal distribution of ridership on weekdays (Fig. 4) and weekends (Fig. 5) hints at the purposes of Indego trips. The two-peak pattern, which indicates commuting, is apparent for weekday bicycle trips in disadvantaged and non-disadvantaged areas in all months. E-bike usage, however, has a weaker two-peak pattern. Moreover, the pattern is weaker in disadvantaged areas and gets weaker as the year progresses. The distortions in these patterns indicate either a greater share of leisure trips, utilitarian trips for purposes other than commuting (such as errands, meeting friends, or participating in recreational activities), or commuting at different times. As expected, the weekend patterns show a single peak in all months and in all areas.

Fig. 4 Indego’s hourly usage patterns for bicycles and e-bikes on weekdays in disadvantaged regions versus the rest of the regions

Fig. 5 Indego’s hourly usage patterns for bicycles and e-bikes on weekends and holidays in disadvantaged regions versus the rest of the regions

Random forest results

The Increased MSE measurements for each explanatory variable are reported in Table 4 for the Total trips and in Appendix C for all the models by month. The most prominent explanatory variable across many of the models is the distance from the CBD. This variable accounts for the effect of location on the usage rate. Using Moran’s I tests, I found that all the usage variables are spatially correlated. Therefore, the docking station location plays a significant role in its usage patterns. This effect is slightly weaker for e-bike trips in some models; hence e-bike trips are slightly less related to the distance to the CBD. E-bikes may benefit areas farther away from the city center.

Table 4 Scaled increased MSE measurements for explanatory variables in Random Forest models for Indego trips in June, September, and December 2019

In most models, the sociodemographic variables have a low to medium effect on the number of trips. Among the four sociodemographic variables, the white population proportion has the strongest effect on bikesharing and e-bikesharing usage across 24 RF models. Median annual income per household has the strongest effect across 16 models. The proportion of the population under the poverty line is the strongest across eight models, and the black population proportion is the strongest in one model. For the models that examine the overall ridership, which are presented in Table 4, the Increased MSE is higher for the white population except for e-bikes in December, where it is higher for income. This suggests that a higher white population share contributes more to usage than the median income in that area. There are no major differences in these patterns between bicycle and e-bike trips. In all months, socio-economic status had a medium association with bicycle and e-bike usage with no clear trend.

The morning and evening trip models resemble commute patterns for both bikesharing and e-bikesharing. Bikesharing morning departures are associated with residential land use, population density, and intersection density (this association may be negative or positive), and arrivals are associated with commercial land use and employment density. Interestingly, e-bikesharing adopts this pattern only in September. Recreational land uses have a relatively low Increased MSE across the models. The scores are slightly higher for September weekend e-bike trips.

E-bike trips were more associated with the student population most of the time. The share of males in the population has a low effect in the RF models, although it is slightly higher for e-bike trips in June and September. The ratio of cars to people in the household has a low to medium effect throughout the models, although in some models it exceeds the association with other sociodemographic variables. Among the various transit variables, bus stops had a consistently high effect on e-bikesharing across the models but a low effect on non-motorized bikesharing. Because RF does not provide the direction of the association, this indicates that e-bikes are used more either in areas highly served by buses or in areas poorly served by them. Bicycle infrastructure has a low effect on bikesharing and e-bikesharing. Bikesharing density, represented by the existence of other bikesharing stations around the service area, has a medium effect in June and September and a medium–high effect in December for both bicycle and e-bike trips.

Negative binomial regression results

A conceptual summary of the regression results is presented in Table 5 for June, Table 6 for September, and Table 7 for December. These tables show only the significance levels (p-values) and the direction of the coefficients for the different explanatory variables, while the full results are presented in Appendix D. As discussed in Sect. 3, each model used a different set of explanatory variables, based on the RF results. Empty cells in the regression results tables represent variables that were excluded from the models.

Table 5 Conceptual summary of the negative binomial regression results for trips in June 2019
Table 6 Conceptual summary of the negative binomial regression results for trips in September 2019
Table 7 Conceptual summary of the negative binomial regression results for trips in December 2019
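A conceptual summary of this kind reduces each coefficient to its sign and significance level. The sketch below illustrates one way such a cell could be derived. The function name, the marker notation, and the thresholds (0.001, 0.01, 0.05) are assumptions chosen for illustration and are not taken from the tables themselves.

```python
def summarize(coef, p_value):
    """Return a signed significance marker such as '++' or '-', or 'n.s.'.

    The thresholds are conventional values assumed for this sketch.
    """
    if p_value < 0.001:
        strength = 3
    elif p_value < 0.01:
        strength = 2
    elif p_value < 0.05:
        strength = 1
    else:
        return "n.s."  # not significant at the 5% level
    sign = "+" if coef > 0 else "-"
    return sign * strength

# Hypothetical coefficients and p-values for three explanatory variables.
example = {
    "median_income": summarize(0.8, 0.0004),
    "poverty_rate": summarize(-0.3, 0.03),
    "recreational_land_use": summarize(0.1, 0.2),
}
```

A variable excluded from a model would simply have no entry, which corresponds to an empty cell in the summary tables.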

The regression results show that socio-economic status is related to the usage of bikesharing and e-bikesharing. Each of the 48 regression models has no more than one of the four variables that indicate the sociodemographic level: median income, poverty rate, white population rate, and black population rate. In 11 models, however, the number of cars per person, which is highly correlated with the sociodemographic variables, had a higher increased MSE than the four sociodemographic variables. Therefore, these models do not have any sociodemographic variable. Nevertheless, in all but one of the models that include the number of cars, this variable is not significant. Thus, models with car availability show a weak connection between Indego’s usage and sociodemographics.
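The selection rule described above can be sketched as follows. The sketch assumes that, among the four sociodemographic variables, the one with the highest increased MSE is the one retained, and that none is retained when cars per person scores higher than all four. The variable names and scores are hypothetical.

```python
SOCIODEMOGRAPHIC = ("median_income", "poverty_rate", "white_share", "black_share")

def pick_sociodemographic(inc_mse, cars_key="cars_per_person"):
    """Return the sociodemographic variable to keep, or None if cars wins.

    inc_mse maps variable names to increased-MSE scores from one RF model.
    """
    best = max(SOCIODEMOGRAPHIC, key=lambda name: inc_mse[name])
    # Cars per person replaces the sociodemographic variable when it scores higher.
    if inc_mse.get(cars_key, float("-inf")) > inc_mse[best]:
        return None
    return best

# Hypothetical increased-MSE scores for two models.
model_a = {"median_income": 12.0, "poverty_rate": 8.0, "white_share": 15.0,
           "black_share": 5.0, "cars_per_person": 10.0}
model_b = dict(model_a, cars_per_person=20.0)

kept_a = pick_sociodemographic(model_a)
kept_b = pick_sociodemographic(model_b)
```

Under these assumptions, the first model keeps the white population share, while the second keeps no sociodemographic variable because cars per person has the highest score.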

In June, socio-economic status was a significant factor for all the bicycle models, but not for the e-bike models, suggesting that e-bikes were used in neighborhoods at all sociodemographic levels. In September and December, socio-economic status was a significant factor for all the bicycle models but only for about half of the e-bike models. Throughout the models, when they were significant, median income and white population rate had a positive correlation with usage, while the poverty rate had a negative correlation with usage, all in alignment with expectations. These results indicate that bicycle usage was much more related to sociodemographics than e-bike usage.

Indego’s weekday usage patterns imply that users use both bicycles and e-bikes for commute trips. In all three examined months, most morning bicycle and e-bike trips start in a residential area (or an area with higher intersection density) and end in areas with high employment density. Recreational land uses are mostly not correlated or negatively correlated with Indego trips. The only exception is among weekend e-bike trips in September.

The rate of students in the population is positively correlated with usage in most of the e-bike and bicycle models in which it is present. In June, the student rate is significant only for bicycle trips. In September, the student rate is correlated with two of two bicycle trip models and one of two e-bike models.

Interestingly, Indego’s ridership is positively correlated with bus stops and with other transit stops and stations across many regression models. This is more apparent in the e-bike models. Trolley stops and regional train stations are mostly insignificant. Metro stations are positively significant in six out of nine models. Bicycle infrastructure is positively correlated with usage in many bicycle models, but in only three e-bike models. This suggests that e-bike trips took place in areas with fewer bikeways and implies that e-bike riders are less sensitive to infrastructure.

Discussion

Indego’s usage increased from June, September, and December 2018 to the same months in 2019. Many factors could have contributed to this growth, including the introduction of new stations, a possible increase in the bicycle fleet, and a greater acceptance of bikesharing in the city. However, the most significant change in the system in that period was the integration of 400 e-bikes, which led to an overall increase in Indego’s fleet. This study shows that bicycle trips decreased in June and December and increased by 9.5% in September, while total ridership, including e-bike trips, increased in all three examined months. It is also likely that the reduction in bicycle trips was caused by a shift to e-bikes by some users.

In disadvantaged areas, e-bike usage led to a greater increase in ridership than in other areas. Moreover, the average e-bike trip duration and distance in these areas were higher than in other areas, which suggests that people in disadvantaged areas used e-bikes to reach more distant destinations. Therefore, my conclusion is that the integration of e-bikes increased Indego’s usage in disadvantaged areas but did not increase the usage of its bicycle fleet.

While bicycle usage remains lower in disadvantaged areas, e-bike usage was relatively higher. In June, the average e-bike ridership at a disadvantaged station was almost double that at non-disadvantaged stations. However, the difference decreased to less than 1.5 times the e-bike ridership in September and only 12% more in December. Similarly, disadvantaged areas, which comprise a quarter of the study areas, were responsible for 25% of Indego’s ridership in June, 24.5% of the ridership in September, and only 22.1% of the ridership in December.
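The ridership shares above can be converted into a per-area comparison by dividing each group's share of trips by its share of areas. The sketch below works through that arithmetic using the percentages reported in this paragraph. It treats "a quarter of the study areas" as exactly 25%, which is an assumption, and the function name is hypothetical.

```python
def relative_ridership(trip_share, area_share):
    """Per-area ridership of a group relative to the rest of the system.

    A value of 1.0 means the group rides as much per area as the other areas.
    """
    group_rate = trip_share / area_share
    rest_rate = (1 - trip_share) / (1 - area_share)
    return group_rate / rest_rate

AREA_SHARE = 0.25  # assumption: "a quarter of the study areas" taken as exact

# Share of Indego's total ridership from disadvantaged areas, as reported above.
trip_shares = {"June": 0.25, "September": 0.245, "December": 0.221}

relative = {month: relative_ridership(share, AREA_SHARE)
            for month, share in trip_shares.items()}
```

Under this assumption, overall ridership per disadvantaged area is on par with other areas in June and falls to roughly 0.85 of the level in other areas by December.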

The RF results show no clear trend in the relationship between sociodemographics and ridership, but the regression results strengthen the notion of greater e-bike usage in disadvantaged areas. The coefficients remain highly significant for bicycle usage throughout the year, similar to the findings in previous Indego research (Caspi & Noland, 2019). For e-bikes, in June, none of the sociodemographic variables were correlated with e-bike ridership, while in September and December, only about half of the coefficients were significant. These findings suggest that the integration of e-bikes in Indego increased the overall ridership in disadvantaged areas by increasing e-bike usage but did not increase the usage of bicycles in these areas. The decline in ridership in disadvantaged areas in December may indicate that the e-bike effect was temporary; however, this study only examines three months, and a more extended period is required to determine that.

During the warm months, the share of e-bike trips was relatively high in disadvantaged neighborhoods, and those trips were much longer in duration than both bicycle trips in those areas and e-bike trips in other areas. The lower average speed of e-bike trips in disadvantaged areas implies that riders did not use the shortest route from their origin to their destination.

The hourly weekday e-bike trip distribution in disadvantaged areas (Fig. 4) shows a weak two-peak pattern. While bicycles and e-bikes are used for commuting across the city, the share of commute e-bike trips in disadvantaged areas was lower than in other areas. This strengthens the notion that e-bike trips in these areas were made less for commute purposes than bicycle trips in these neighborhoods and than e-bike and bicycle trips in other neighborhoods. Though the prevalence of non-standard working hours among low-income workers may explain part of the weakened peaks, bicycle usage in these areas in the same period does show a two-peak pattern.

When examining the entire network, Indego’s users use e-bikes for commuting, although less than they use bicycles for that purpose. The RF and the regression results show that for both bicycles and e-bikes, morning weekday trips run from residential areas to employment centers, and evening trips run in the other direction. This finding is interesting, as McKenzie (2018) suggests that dockless e-bikesharing services are not used for commuting but for short utilitarian trips. The difference in pricing (subscription in Indego versus pay per ride in dockless e-bikesharing) may be the source of the difference in trip purpose, although this requires further investigation.

The RF results suggest that race is slightly more strongly associated with Indego usage than income. This is interesting, as income represents sociodemographics by default in many bikesharing studies (Bernatchez et al., 2015; Eren & Uz, 2020; Fishman et al., 2014; Murphy & Usher, 2015). Bikesharing and e-bikesharing are more common where the share of the white population is larger. The median annual income and the proportion of the white population are positively associated with Indego usage, while the proportion of households under the poverty line and the proportion of the black population are negatively associated with ridership.

Findings from both the RF and regression analyses show that Indego usage has some correlation with a high rate of student population. A high correlation of shared micromobility usage with students is a recurring finding in micromobility research (Caspi et al., 2020; Schneider & Stefanich, 2015), and the student variable was included in this study to control for that effect. This correlation raises the concern that low-income users are students rather than disadvantaged people. However, in most of the RF models, income has a higher increased MSE value, and in all the regression models that include students, the sociodemographic variable is highly significant. Therefore, the disadvantaged populations that use Indego in this study are less likely to be students.

The presence of bus stops in the service area was found to be highly correlated with Indego usage in both the RF and the regression analyses. However, a correlation was not found with other transit modes, including metro, trolley, and regional train stations. Interestingly, the bus variable is not correlated with any other variable, including the distance to the CBD. It is unlikely that Indego users combine bike riding with the bus but not with other modes; however, I cannot offer another explanation for this finding, and further investigation of this matter is required.

Conclusions

The integration of e-bikes in Indego offers an interesting case of e-bikesharing in a city with vast social polarization. E-bikes benefit riders with easier and faster cycling and give an option to use the bikesharing to reach more distant locations. As many disadvantaged areas served by Indego are in the periphery of the system, the additional e-bikes can better fit the needs of the residents of these areas. In this study, I examined the usage of bicycles and e-bikes, comparing disadvantaged and non-disadvantaged areas.

I found that the integration of e-bikes in Indego increased the overall ridership in disadvantaged neighborhoods. The average e-bike usage in these areas was higher than in other areas, but bicycle usage remained lower. People in disadvantaged areas use e-bikes to make longer trips and reach more distant locations. The increase in ridership was high in the summer but weakened in the winter. A further study should examine the long-term effect of e-bikes on the overall ridership.

The findings of my analyses suggest that, like bicycles, e-bikes are being used for commuting in both disadvantaged and non-disadvantaged areas. The temporal analysis presents two-peak patterns in the mornings and evenings, and the RF and regression results delineate trips from residential to commercial areas in the mornings and from commercial to residential areas in the evenings. However, in disadvantaged areas, the share of e-bike commuters is lower, and riders use the vehicles for leisure and non-commute utilitarian trips more than in other areas.

The RF analysis also reveals that among the sociodemographic variables that I examined, the share of white population in the docking station’s service area has a greater association with ridership than the median annual income in almost all the models. This finding is interesting as many bikesharing studies use income to represent demographics (Bernatchez et al., 2015; Eren & Uz, 2020; Fishman et al., 2014; Murphy & Usher, 2015).

This study has some shortcomings in terms of the length and depth of the data. As mentioned, the sociodemographic analysis in this study is based on aggregated census data. Therefore, it reflects the characteristics of the residents around Indego’s docking stations rather than the characteristics of the actual users. Due to privacy concerns, Indego does not provide user information; hence, there is a need to contact users actively to address this limitation. In addition, due to the lack of consecutive trip logs free of the interruptions of shutdowns and pandemics, I only examined three months of Indego usage. Bikesharing usage changes all the time and is related to many factors along the way. To better view the integration of e-bikes in the long term, there is a need to examine a more extended period. Future studies should address these limitations.

This study generates a few possible policy implications. First, the integration of e-bikes in bikesharing is beneficial for promoting bikesharing equity. Second, e-bikes could connect distant neighborhoods with desired destinations and attract individuals who prefer a shorter and less exhausting bicycle ride. In addition, bikeshares should promote usage in areas with a smaller white population share rather than only in areas with lower income, as the former show less participation in bikesharing.

The study focuses on the case of Indego in Philadelphia, a city with an affluent white population in its center, surrounded by non-white, non-affluent neighborhoods. The findings of this study can be generalized to cities with a similar form and cultural context, such as North American cities, and possibly cities in Western countries. Cities with a different form and cultural context may benefit from the findings of this study with the appropriate adjustments.