1 Introduction

Head-mounted display (HMD) systems like the Oculus Rift and HTC Vive are bringing virtual reality to public consumers. Unfortunately, HMDs anecdotally have a higher rate of cybersickness, or motion-sickness-like symptoms due to visual stimuli, than virtual reality systems that employ large screens. These systems include single projection screens, large TVs, and multi-wall CAVEs, which normally track a user’s position. If the displays result in differences in cybersickness, then further study should separate the results into two distinct categories. There have been relatively few multi-display studies, and most alter other factors such as the field of view [1,2,3,4]. In our experiments, which considered the HMD’s weight, the perceived screen size, and the rendering mode, we controlled for additional external factors.

Cybersickness has a large number of potential factors, with Kolasinski [5] and Renkewitz and Alexander [6] proposing over 40 possible factors. Virtual reality applications that use large screen systems often differ from HMD applications in factors beyond the display. For example, the field of view, interaction paradigm, viewing position, and rendering mode regularly differ. This makes comparison difficult, as factors such as field of view have shown strong effects on cybersickness [7,8,9,10,11]. If the differences in the reported results are due to uncontrolled factors, future models could be simplified by decreasing the number of factors.

Cybersickness and simulator sickness have been under study for decades, but there is a lack of predictive models for frequency and severity. With over 40 factors listed and hardware limitations, it is often difficult to hold all factors constant across different experiments. While many factors have uncertain effects, some factors that tend to vary between HMDs and large screens, such as standing versus sitting, field of view, and independent visual backgrounds, are known to have an effect. Further difficulties occur since there are multiple measures of cybersickness and no conversions between their units.

While most factors not under analysis in our experiments were held constant, there remain a few that were affected. The refresh rate differed due to hardware limitations between the HMD and projection screen, and there was a small change in resolution. Neither appeared to affect our results. We could not control for the participants by selecting individuals that fit certain criteria, and Rebenitsch and Owen suggested the participants alone could account for 43% of the variance [12]. Therefore, participants were requested to attend both sessions of an experiment set.

Initially, we found there was a difference between display types, but after normalizing for the independent visual background, the difference was eliminated. This means that the differences between displays are due to other factors. This is fortunate, as it decreases the large number of factors that would need to be controlled in experiments. Specifically, we found no effect from weight or perceived screen size. Stereo is less clear, as any effect appears to be small or there may have been an interaction effect. The display did not have an effect when controlling for field of view, viewing position, application, and independent visual background. This also signifies that results from different displays can be compared, if normalized for different settings.

2 Background

Cybersickness results from visual stimuli, rather than physical motion as in traditional motion sickness. As most virtual reality requires a visual display, there is a risk of cybersickness, and the effect of the display is uncertain. One reason for the lack of multi-display studies is that such studies are costly in both time and resources. Also, the results on one display are assumed to transfer directly to another display. Unfortunately, the few existing studies suggest that results do not transfer directly.

Sharples, Cobb et al. compared four different displays [2]. The HMD had the highest symptoms of nausea, while a large projection screen and a large curved screen had similar symptoms. A desktop system had the lowest scores. Liu and Uang examined a standard monitor, a stereoscopic monitor, and an HMD [4]. The HMD had higher scores than the standard monitor. Smart, Otten, and Stoffregen examined the percentage of participants that became ill using different displays [1]. They used the same sinusoidal motion in all conditions, although the visuals changed. They found 23% became ill in a moving room, 43% became ill in a space travel simulator, 17% became ill in a projector system, and 42% became ill with a HMD.

Unfortunately, in the above studies, several factors other than the display were altered between conditions, and some of these factors have shown effects on cybersickness. These factors are the weight of the display, the perceived screen size, the rendering mode (stereo or mono), the viewing position while using the system, the field of view, the application, the participant, and the interaction paradigm.

2.1 Weight

The additional weight of a HMD has been assumed to be the source of higher symptoms. However, only one study, by Dizio and Lackner, was found that examined the effect of weight [10]. Weight was a secondary component in this study and while they reported there was no effect, no statistics were reported. This left the effect of this factor uncertain, and was the motive for our first experiment.

2.2 Perceived Screen Size

A screen’s distance to the eye will affect how a participant perceives the size of an object and how the eyes adjust to focus on an object. Vergence is the point where the eyes’ lines of sight cross, and it functions as normal in virtual reality. Accommodation is the lens’s adjustment for distance. Since the screen distance is constant in virtual reality, accommodation is no longer accurate. No studies were found that directly examined this effect. Distance and screen size have been varied to change the field of view, but since the field of view has a known effect, we maintained the same field of view for our second experiment in order to test for a possible cybersickness effect due to the vergence and accommodation discrepancy.

2.3 Stereo

Stereo rendering is associated with higher rates of illness. Therefore, this may have an effect in virtual reality as well. There have been few studies using cybersickness as a measurement. Hakkinen, Vuori, and Ehrlich reported only that nausea symptoms were higher during stereo rendering [13]. Kershavarz and Hecht compared four versions of a roller coaster ride: real stereoscopic video, a stereoscopic rendering of a three-dimensional model, a monoscopic video, and a monoscopic modeled roller coaster [14]. They found no significant differences between the conditions, although the cybersickness scores for the real stereoscopic roller coaster trended higher. Given these variable results, we included stereo as our third experiment.

2.4 Viewing Position

Large screen systems typically have participants standing, with the screen updating according to where they move, while HMD users are often seated for safety. However, past studies show that such a change in procedure can influence cybersickness. Merhi et al. found that all their standing participants withdrew early [15]. Moss and Muth altered their procedure midway to include a hand rail due to a high dropout rate [16]. In general, increased tactile information seems to decrease cybersickness. This means that when comparing systems, the viewing position should be the same between the two conditions.

2.5 FOV

The field of view (FOV) is traditionally assumed to be the horizontal or diagonal angle that a screen occupies in a person’s vision. FOV is one of the strongest cybersickness factors, with symptoms doubling when the FOV doubles. Seay et al. found that symptoms were higher for a 180° field of view than for a 60° field of view, even when changing user control and rendering mode [7]. Dizio and Lackner reported that halving the field of view also halved the symptoms of cybersickness [10]. Stoffregen et al. reported a similar doubling effect [11]. Therefore, having different FOV when comparing displays can dramatically affect the result. This is the second reason for our screen size experiment: to confirm that the changes in symptoms with changes of distance and size were not exclusively due to the change in FOV.

2.6 Application

The application can also have an effect, but metrics that fully quantify this factor are lacking. Application factors include speed, color, contrast, brightness, scene content, and independent visual backgrounds. If the same application is used for all conditions, most of these factors will remain constant. However, brightness and independent visual background (IVB) still often change. Independent visual backgrounds are objects that appear static relative to the real world in virtual reality. They include seeing the real world around the display in large screen environments, having the virtual display overlap with the real world [17], and including objects that never change their position on the screen, such as the vertical or horizontal bars used in Duh et al. [18, 19].

Large screen systems normally have a brighter environment than an HMD. Given that a human’s flicker threshold decreases with light, and that flicker can cause migraines, any possibility of flicker could affect symptoms. HMDs typically have persistent screens, but still can have a flicker-like effect if quickly turning the head does not result in a perceived blur due to a too low refresh rate.

Independent visual backgrounds (IVB) are normally available in large screens, but require intentional inclusion in HMDs. Duh et al. found that merely adding constant lines (much like wearing a mask) decreases symptoms [18, 19]. Prothero et al. used a transparent screen to decrease symptoms [17]. The most dramatic result is the study by Kershavarz, Hecht, and Zschutschke [3]. They tested an HMD, a standard projection screen, and a projection screen with the external environment blocked to mimic an HMD. Without the environment blocking, the HMD and the projection screen had a significant difference, as is normally reported. With the environment blocking, the HMD and the projection screen had the same symptoms.

2.7 Measuring Cybersickness

Cybersickness has diverse symptoms, and thus the measurements are also diverse. The most common methods are questionnaires, of which the simulator sickness questionnaire (SSQ) is in the broadest use [20]. This questionnaire asks for the severity of multiple symptoms and then groups them into nausea (e.g. stomach awareness, nausea, etc.), oculomotor (e.g. headache, eyestrain, etc.), and disorientation (e.g. vertigo, dizziness, etc.) subscales, which are abbreviated N, O, and D, respectively.
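
As an illustration of how the subscales and the total score (SSQ-T) are typically derived, the following minimal sketch scores one questionnaire. The item-to-subscale mapping and the weights (9.54, 7.58, 13.92, and 3.74) are the commonly cited values for the SSQ and are stated here as assumptions rather than taken from this paper; the item names and the function `ssq_scores` are hypothetical.

```python
# Sketch of SSQ scoring. The mapping and weights below are the commonly
# cited SSQ values and are an assumption, not something this paper specifies.
SSQ_ITEMS = {
    "general_discomfort": "NO",
    "fatigue": "O",
    "headache": "O",
    "eyestrain": "O",
    "difficulty_focusing": "OD",
    "increased_salivation": "N",
    "sweating": "N",
    "nausea": "ND",
    "difficulty_concentrating": "NO",
    "fullness_of_head": "D",
    "blurred_vision": "OD",
    "dizzy_eyes_open": "D",
    "dizzy_eyes_closed": "D",
    "vertigo": "D",
    "stomach_awareness": "N",
    "burping": "N",
}
WEIGHTS = {"N": 9.54, "O": 7.58, "D": 13.92}
TOTAL_WEIGHT = 3.74


def ssq_scores(ratings):
    """Score one questionnaire.

    ratings maps an item name to a severity from 0 (none) to 3 (severe);
    items that are not listed count as 0.
    """
    unknown = set(ratings) - set(SSQ_ITEMS)
    if unknown:
        raise ValueError(f"unknown SSQ items: {sorted(unknown)}")
    raw = {"N": 0, "O": 0, "D": 0}
    for item, subscales in SSQ_ITEMS.items():
        value = ratings.get(item, 0)
        if not 0 <= value <= 3:
            raise ValueError(f"severity for {item} must be between 0 and 3")
        # An item may contribute to more than one subscale.
        for subscale in subscales:
            raw[subscale] += value
    scores = {name: raw[name] * WEIGHTS[name] for name in raw}
    # The total is the sum of the three raw subscale sums times one weight.
    scores["T"] = sum(raw.values()) * TOTAL_WEIGHT
    return scores
```

Because some items count toward two subscales, the total is not the sum of the weighted subscale scores.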

Following the SSQ in popularity are numerous one-question scales. The SSQ is too long for monitoring participants during a session, and thus a single numeric response that represents the current feeling of wellness is employed instead. While there are many variations of the one-question survey, the question is normally on a 0–10 scale, with higher numbers meaning greater illness.

3 Hypotheses

Experiments were designed to determine if the cybersickness effects were due to the display itself, or to other factors that may be changed between two systems. The first experiment was done to test the assumption that weight would influence cybersickness. The second experiment was to determine if the perceived screen size, that is, changing the screen size while the field of view is held constant, would still have an effect across different displays. The third was done to determine if stereo had an effect if the application and field of view remained the same. This resulted in the following hypotheses:

  1. A heavier HMD will increase cybersickness.

  2. Changing the screen size will not have an effect, if the field of view is held constant.

  3. Stereo rendering will cause more cybersickness than mono rendering.

Since the experiments included both HMDs and large screens, we also considered whether there was an effect in changing the display, if the application, navigation paradigm, and field of view remained the same. This resulted in our last hypothesis:

  4. HMDs will have higher cybersickness than large screens.

4 General Methods

All participants were over 18, and signed a consent form before they began. To decrease the effects of habituation, repeat sessions were separated by a minimum of one week. Participants were then given an Xbox controller for navigation. The first session included a tutorial so that they could learn how to work the controls and play the game. Participants were then placed into the virtual treasure hunt environment. All experiments had the participants standing.

The participants were monitored for symptoms every three minutes using a one question scale dubbed an “immersion” rating. This rating asked, “On a scale of zero to ten, where zero is how you felt coming in, and ten is that you want to stop, where you are now?” The highest value during a session is called the “max immersion rating.” This is based on the scale from Bos et al. who showed good correlation with the SSQ-T [21]. However, we wished to allow participants to stop for any reason and avoid possible demand characteristics mentioned by Young et al. [22]. This proved necessary as only 30% of those that withdrew early specified nausea as their reason for stopping. Immediately following the session, the participants were given the SSQ.

Non-parametric statistical methods on the SSQ-T were employed as the results were decidedly non-Gaussian. The Wilcoxon test is a non-parametric test and has been shown to be robust with respect to outliers, which are typical in cybersickness data. Paired tests were used in consideration of the effect of individual variations.
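
To make the paired comparison concrete, the following is a minimal sketch of a two-sided Wilcoxon signed-rank test on paired scores. It is not the analysis code used in this study: it assumes the large-sample normal approximation without a continuity correction and drops zero differences, and the function name `wilcoxon_signed_rank` is hypothetical. For small samples, an exact test from a statistics package would be preferable.

```python
import math


def wilcoxon_signed_rank(x, y):
    """Two-sided Wilcoxon signed-rank test for paired samples.

    Returns (w, p), where w is the smaller of the positive and negative
    rank sums and p comes from the normal approximation (an assumption
    that is only reasonable for moderately large samples).
    """
    if len(x) != len(y):
        raise ValueError("paired samples must have the same length")
    # Zero differences carry no sign information and are dropped.
    diffs = [a - b for a, b in zip(x, y) if a != b]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")

    # Rank the absolute differences, giving tied values their average rank.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    tie_term = 0
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        t = j - i + 1
        tie_term += t ** 3 - t
        i = j + 1

    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    w = min(w_plus, w_minus)

    mean = n * (n + 1) / 4
    variance = n * (n + 1) * (2 * n + 1) / 24 - tie_term / 48
    z = (w - mean) / math.sqrt(variance)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail probability
    return w, p
```

Pairing each participant's two sessions means that only within-participant differences enter the statistic, which is why large individual variation does not mask a condition effect.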

4.1 Environment

To generalize the results, the experiments required an environment that could be seen in the home. Specifically, the environment needed to be interactive, be fully 3D, have at least some effects of gravity (no flying), and not be made with the intention to encourage cybersickness. A treasure hunt game was created to meet these conditions. The virtual environment consisted of five to nine rooms, two of which were mazes. The rooms were varied for each session, but always included one rectangular maze and one curved wall maze. The object of the game was to locate all the items given in a left-hand menu as quickly as possible. Example screen shots are provided in Fig. 1. The environment was created to scale, if possible, and most objects were approximately 80 cm from the floor.

Fig. 1. Screen shots from the experiment virtual environment

The same participants were employed within each experiment set. This was to minimize the effect of the wide range of individual factors, which could mask results. However, this does create the possibility of habituation and learning effects. To offset the latter, a different set of rooms and treasure list were provided in each session. Later analysis with the Kruskal-Wallis test (a non-parametric variant of ANOVA) displayed no effect based on the choice of room set (p < 0.75).
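
The room-set check can be sketched as a rank-based one-way comparison. The block below computes only the tie-corrected H statistic of the Kruskal-Wallis test; the function name `kruskal_wallis_h` is hypothetical, and the grouping of scores by room set is an assumption about how such a check might be arranged. The p-value would then be read from a chi-square distribution with one fewer degree of freedom than the number of groups.

```python
def kruskal_wallis_h(*groups):
    """Return the tie-corrected Kruskal-Wallis H statistic.

    Each argument is one group of scores, for example the SSQ totals
    recorded with one particular room set.
    """
    if len(groups) < 2 or any(len(g) == 0 for g in groups):
        raise ValueError("need at least two non-empty groups")

    # Pool all values, remembering which group each one came from.
    pooled = sorted((value, gi) for gi, g in enumerate(groups) for value in g)
    n = len(pooled)
    rank_sums = [0.0] * len(groups)
    tie_term = 0
    i = 0
    while i < n:
        j = i
        while j + 1 < n and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        average_rank = (i + j) / 2 + 1  # tied values share an average rank
        for k in range(i, j + 1):
            rank_sums[pooled[k][1]] += average_rank
        t = j - i + 1
        tie_term += t ** 3 - t
        i = j + 1

    h = 12 / (n * (n + 1)) * sum(
        r * r / len(g) for r, g in zip(rank_sums, groups)
    ) - 3 * (n + 1)
    correction = 1 - tie_term / (n ** 3 - n)
    if correction == 0:
        raise ValueError("all values are identical")
    return h / correction
```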

4.2 Hardware

The virtual environment was presented using Vizard 3.0 with 3-sample antialiasing and a 4:3 aspect ratio. Tracking was done with an Intersense IS900, which has a specified latency of 4 ms. Formal tracker-to-display latency calculations were not performed. If stereo was used, the software IPD was 6 cm. The two display technologies were a Glasstron LDI-D100B HMD, with an 800 × 600 resolution, a fixed interpupillary distance, and a 35° diagonal FOV, and a stereo projector with a maximum resolution of 1600 × 1200 and a refresh rate of 100 Hz due to the shutter glasses.

5 Experiments

Three main experiments were designed to determine how the display affects cybersickness: weight, perceived screen size, and rendering mode. Given that the HMD weight and screen size experiments used the same FOV, application, and navigation paradigm, a fourth cross-display analysis was also performed. This collated the results of the prior three experiments into a cohesive whole. The motivation for the cross-display analysis was to determine how much of the change in cybersickness was due entirely to the display and not other factors.

The motivation for the weight experiment was to test the assumption that the weight of the HMD caused the increase in symptoms.

The motivation for the perceived screen size experiment was to determine if the change in cybersickness when changing the display was due to the change in screen size, the change in display, or due to the change in FOV. A secondary motivation was that even if the FOV does remain the same, the discrepancy between the vergence and accommodation of the eyes would increase with the change in distance to the screen when assuring the same FOV. Therefore, this could still influence cybersickness.

The motivation for the stereo experiment was to establish a baseline difference for cybersickness with stereo and mono rendering with the SSQ measurement.

5.1 HMD Experiment Methods

Participants were recruited to test two different weight conditions with a HMD: the base condition with a weight of 340 g and the weighted condition with an additional 150 g. The additional weight was placed towards the front as is normal for HMDs as seen in Fig. 2. Participants were given the base and weighted condition at least a week apart, in random order. We recruited 24 participants for both weight conditions. Their average age was 19.8 years with a standard deviation of 2.5 years. There were 5 females and 19 males. We had 4 early withdrawals, all males, out of 48 sessions.

Fig. 2. The head mounted display

5.2 HMD Experiment Analysis

To our surprise, there was no effect of weight on cybersickness, with p < 0.88. This high p-value suggests that the weight of a modern HMD is unlikely to affect cybersickness, and that moving to extremely lightweight HMDs is unlikely to change symptoms. While the reliability of the mean and standard deviation is less certain with non-Gaussian data, they still provide trends in values. The mean for the non-weighted condition was 23.12 with a standard deviation of 21.7, and the mean for the weighted condition was 29.6 with a standard deviation of 31.3.

5.3 Perceived Screen Size Methods

We recruited 22 participants for the screen size experiments with an average age of 19.9 years and a standard deviation of 2.6 years. There were 5 females and 17 males. We had 2 early withdrawals out of 44 sessions, with one male and one female participant.

In the perceived screen size experiment, the participants were presented with a 113-centimeter screen or a 70-centimeter screen in random order. These two sizes were chosen to mimic a moderately sized 1-walled CAVE and a monitor. Under both conditions, the participants were placed at a specific distance from the screen so that they would have the same starting field of view as in the HMD experiment. Therefore, the smaller screen had participants closer to the screen. Participants were permitted a temporary step in each direction, so the average field of view was identical for all the participants. Hardware limitations required the smaller screen to have 80% of the resolution of the larger screen.
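
The geometry behind matching the field of view can be sketched as follows. The block assumes a flat screen viewed head-on from its center, and it assumes that the quoted 113 cm and 70 cm sizes are measured along the same axis as the HMD's 35° FOV, which the text does not state; the function names are hypothetical.

```python
import math


def fov_degrees(screen_size_cm, distance_cm):
    """Angle subtended by a flat screen viewed head-on from its center."""
    return math.degrees(2 * math.atan(screen_size_cm / (2 * distance_cm)))


def distance_for_fov(screen_size_cm, fov_deg):
    """Viewing distance at which the screen subtends the requested angle."""
    return screen_size_cm / (2 * math.tan(math.radians(fov_deg) / 2))


if __name__ == "__main__":
    # Sizes from the text; 35 degrees is the HMD FOV given under Hardware.
    for size_cm in (113, 70):
        d = distance_for_fov(size_cm, 35)
        print(f"{size_cm} cm screen: stand about {d:.0f} cm away")
```

Under these assumptions the larger screen would be viewed from about 179 cm and the smaller from about 111 cm, so the smaller screen places the participant closer while the subtended angle stays the same.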

5.4 Screen Size Analysis

As expected, there was no effect on cybersickness, with p < 0.66. The mean for the smaller screen was 18.5 with a standard deviation of 19.3, and the larger screen had a mean of 18.8 with a standard deviation of 15.7. This suggests that the differences reported earlier were due to changes in the field of view, rather than the change in screen size or distance.

5.5 Stereo Methods

In this experiment, the participants were permitted to move freely, and the screen size was increased to 225 cm on the diagonal. The diagonal FOV was typically between 60° and 90° during the session. A participant was presented with the stereo or mono condition, in random order, at least one week apart.

We had 28 participants, but 6 could not return due to time constraints. The remaining participants were on average 21.4 years old with a standard deviation of 3 years. There were 8 females and 14 males. We had 8 early withdrawals out of 50 sessions, consisting of 3 males and 3 females. Two male participants withdrew twice.

5.6 Stereo Analysis

There was no effect on cybersickness, with p = 0.22. The mean for the mono condition was 28 with a standard deviation of 27.3, while the stereo’s mean was 33.3 with a standard deviation of 24.1. This was somewhat surprising, but some prior studies, such as Howarth [23], found no effect, and Kershavarz and Hecht only showed a trend [14].

5.7 Cross-Display Analysis

Participants were asked to attend both the screen size and HMD weight experiments. Both experiments added the head-tracked viewpoint directly on top of the controller position so that the “forward” direction remained the same in both conditions, and both the HMD and screen size conditions permitted only one step in any direction. This encouraged the HMD participants to always face in the same direction as they would in a large screen environment.

Since the HMD and screen size experiments showed no effect, a participant’s scores within each of the two experiments were averaged. If a participant did not attend both sessions in an experiment set, only their single score from the set was employed.
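
The collapsing step described above can be sketched as follows. The data layout, the participant identifiers, and the function name `collapse_scores` are hypothetical; the sketch only illustrates averaging the available sessions per display and keeping participants who have at least one score for each display.

```python
from statistics import mean


def collapse_scores(sessions):
    """Reduce each participant to one (HMD, screen) pair of scores.

    sessions maps a participant id to a dict with the keys "hmd" and
    "screen", each holding that participant's scores for the sessions
    they attended. A participant with a single session in a set
    contributes that single score.
    """
    paired = {}
    for participant, by_display in sessions.items():
        hmd = by_display.get("hmd", [])
        screen = by_display.get("screen", [])
        # Only participants with data for both displays can be paired.
        if hmd and screen:
            paired[participant] = (mean(hmd), mean(screen))
    return paired


if __name__ == "__main__":
    # Made-up scores for illustration only.
    example = {
        "p1": {"hmd": [20, 30], "screen": [10, 14]},
        "p2": {"hmd": [15], "screen": [5, 7]},
        "p3": {"hmd": [40, 50], "screen": []},
    }
    print(collapse_scores(example))
```

The resulting pairs are in the form a paired non-parametric test expects, with one HMD value and one screen value per participant.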

We had 24 participants with data for at least one HMD and one screen size experiment. These participants had an average age of 19.7 years with a standard deviation of 2.5 years. There were 5 females and 19 males. We had 7 early withdrawals out of 92 sessions, consisting of 6 males and 1 female. We had four participants miss either one HMD session or one screen experiment, either due to request or scheduling issues.

Initially, we found a significant effect on cybersickness, with p < 0.02, with the HMD having higher symptoms. This suggests one or more of the following: there was an IVB effect, the slight change in interaction had an effect, there was sufficient transfer of habituation, or a combination of these effects was present.

The study by Kershavarz, Hecht, and Zschutschke suggested an IVB effect [3]. They also held the application, interaction, and field of view constant, but saw an effect between an HMD and a large screen display. However, they performed another experiment to clarify the effect. They removed the real room imagery in the screen condition so it would resemble that of an HMD, therefore eliminating the IVB effect. This rendered the screen and HMD cybersickness results statistically equivalent. If we normalize our data using the average change of 69.9% reported in the literature when an IVB is included [3, 16,17,18,19], there is no longer a statistical difference (p < 0.77).

6 Discussion

Cybersickness has numerous potential factors, and reducing the set of factors that must be held consistent to compare results is a desirable goal of research. The possibility of a display affecting cybersickness complicates matters, as there are numerous display types. However, there has been little research directly comparing displays.

HMDs and stereo anecdotally have higher cybersickness, but experimental results were lacking. In our experiments, we found no statistical effect of HMD weight. This is contrary to expectations, but it benefits HMD developers as it signifies that additional weight for hardware is not a cause for concern, with one caveat. While the HMD weight did not increase symptoms, the participants were vocal about the discomfort of the display. The heavier display placed more pressure on the bridge of the nose, which was near universally disliked.

There was no effect of the perceived screen size. This was expected, as the screen size is often changed in order to change the FOV, and FOV has a strong effect on its own. Since the FOV was not changed in this instance, any difference would be due to the eyes’ accommodation and vergence discrepancy. The lack of a statistically significant result suggests that the discrepancy between vergence and accommodation does not influence cybersickness, at least within a 1–3 m range. An effect with a wider difference may still be possible. Given the low amount of physical movement in our study, the effect of angular momentum is still uncertain. The lack of an effect of perceived screen size benefits cybersickness researchers, as it signifies that the results from monitor experiments can be compared directly with the results from large screen experiments, assuming the remaining factors are held constant.

There was no effect with stereo, which was surprising, but the results are tentative. Stereo rendering may interact with the application. The human visual system only relies on stereo fusion out to several feet, and primarily within arm’s reach. The visual system uses other visual cues to estimate the distance of farther objects. Our environment did not include many items within arm’s reach. Mon-Williams and Wann also mentioned an increased effect on cybersickness if the focal distance changes frequently [24]. Also, several participants only showed a temporary effect in the immersion scores and returned to their baselines within 10 min. This may have affected the results as the effect may be time variable, rather than simply increasing with usage as is typical. While more study is needed, one can conclude that the effect is likely to be small.

HMDs initially showed a greater amount of cybersickness. However, after normalizing for the independent visual background (IVB) effect that most single large screens possess, there was no longer an effect. Our initial scores suggested the HMD was worse than a projection screen if an IVB was available, while Kershavarz, Hecht, and Zschutschke suggested the opposite [3]. We theorize this is due to expectation on the part of the participant. We allowed user-directed interaction in an unknown world, while Kershavarz, Hecht, and Zschutschke did not permit user movement or interaction in the familiar environment of a car ride as a passenger. In our case, the users may have favored the familiar real world, while the users in Kershavarz, Hecht, and Zschutschke may have favored the familiar car ride. Ideally, another application holding these factors constant should be tested. This is to determine the effect of expectation, which is a likely source of habituation. Specifically, the experiment could include an environment that is foreign to the real world and an environment that is a common experience, to see which stimulus is preferred.

In summary, differences in symptoms between displays are due to other inconsistent factors. Some factors that are likely affecting the results are FOV, viewing position, and independent visual backgrounds. Stereo remains uncertain, as its effect appears to be small, but there also may be interaction effects. Removing weight and display from potential factors is beneficial to researchers as there are numerous other factors that may still have an effect. The results of the IVB are promising for both normalizing results across displays and as a method to decrease cybersickness.