
1 Introduction

Navigation is one of the important interactions in games and applications within a 3-D Virtual Environment (VE). Navigation is used to complete tasks by controlling the movement of a virtual character, which represents the user themselves, inside a VE of a terrain or a city, either in first-person view or in third-person view.

Movement through VE generally involves two kinds of controls, movement direction and movement velocity. Each application uses some combination to achieve full navigation in 3-D VE [1]. Direction control can be achieved using orientation of the head-mounted display (HMD) or some other technique such as keyboard arrows. This paper will focus on velocity control.

Different velocity control techniques have been compared [2]. Although velocity can be set automatically based on the scale of the nearby environment [3], active velocity control based on handheld input devices can provide users with greater freedom of control. Traditionally, the handheld input devices used for velocity control are common hardware such as a mouse, keyboard, touchpad or joystick. However, the emergence of cameras for capturing human posture and motion allows hand or body gestures to be used to control movement velocity. These touchless techniques, typically based on 3-D hand gestures, may provide a more natural user experience than the traditional handheld input device, especially in systems that use large displays in public exhibition places or HMDs [4, 5].

A natural technique for touchless velocity control is to map the user’s hand position to forward or backward velocity using a linear relationship, much like the velocity of an automobile is controlled with a foot pedal. Some systems have adopted this technique to allow large scale travel in a VE [6]. For this technique, the mathematical mapping of hand position to velocity is likely to influence user performance and user experience. Although different velocity control techniques have been compared, including touchless ones versus traditional ones, the user experience on various velocity mapping functions remains relatively unexplored.

In this paper, we evaluate 3 parameters to map the user’s forward/backward hand position to forward/backward movement velocity in VE: the hand position to velocity slope (simple sensitivity); the hand position to velocity linearity (complex sensitivity); and the size of the zero-velocity area around the resting hand position (dead zone size). In addition, we compare forward/backward hand movement with the hand floating versus the hand supported on a desk surface. Subjects completed low precision and high precision navigation tasks with different mapping parameter values. The outcome measures were performance, preferences and comfort. The task involved moving as rapidly as possible in a forward direction (1-D) to a visible target of different widths and distances, and has some similarities to the Fitts’ tapping task [7], in which user performance improves with target size and degrades with target distance.

In Sect. 2 of this paper we introduce the 3 parameters for the mapping of hand position to velocity. In Sect. 3 we describe experimental methods. In Sects. 4 and 5 we present some of the results and discuss interesting findings.

2 Velocity Curve Parameters

We define a 1-D mapping function that converts the forward/backward position of the hand to the velocity of forward or backward self-travel in a virtual environment. The absolute position of the hand is not directly used in the mapping function, because in practice the position where the user sits or stands and the position of the hand motion sensor device could differ every time the user engages with the system. Therefore, at the beginning of each engagement, we set a zero point, which is the position of the palm when the user feels comfortable and ready to control self-travel in the virtual environment. Typically, the upper arm is next to the torso, the elbow is bent to approximately 90°, the forearm is parallel to the floor, and the user assumes a loose fist or relaxed hand posture. The hand may float in air or be supported by the desk surface. D is defined as the forward-backward distance of the hand relative to the zero point, with positive values representing a hand position forward of the zero point, and negative values representing a hand position backward from the zero point.

Around the zero point there is a region called the dead zone, which is centered at the zero point and has a total width of \( \lambda \). When the hand is inside the dead zone, the velocity (V) remains 0. When the user’s hand moves beyond the dead zone, e.g., when D is greater than \( \lambda \)/2, movement in the VE begins. Forward velocity increases as the hand moves forward of the dead zone boundary. The width of the dead zone \( \lambda \) is one of the parameters of the velocity mapping function evaluated.

Beyond the dead zone, the shape of the hand position to velocity curve is defined by two sensitivity related parameters, simple sensitivity α and complex sensitivity \( \gamma \). Simple sensitivity α follows a classic meaning of sensitivity commonly used in many computer games, i.e. representing a linear change in velocity relative to hand displacement. Complex sensitivity \( \gamma \) represents a nonlinear relationship between hand position and velocity; it is the exponent of a power function applied to the hand displacement.

The mapping function of hand position D to velocity V with the 3 parameters is

$$ V = \alpha \left( {\frac{D - \lambda /2}{M}} \right)^{\gamma } \qquad (1) $$

where M is a constant value. The purpose of M is to define the position where multiple curves with different nonlinear parameter values meet; in other words, the velocity is independent of γ when the distance of the hand position to the dead zone boundary is M.

When D is less than \( - \lambda \)/2 (i.e., backward movement), the mapping function is a negative mirror of its positive counterpart. In addition, the ceiling for D is set to be 15 + \( \lambda \)/2 (cm); thus V does not increase further when D exceeds the ceiling threshold. The 15 cm distance was chosen to keep shoulder flexion within a comfortable range, approximately 0° to 45°, for a person of the 50th percentile [8]. We chose M to be 7.5 cm, half of the ceiling distance, so that different complex sensitivity values resulted in less overall difference in velocity over the full range of hand motion.
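The full mapping described above (dead zone, Eq. (1), backward mirror and ceiling) can be sketched in a few lines. This is an illustrative sketch rather than the implementation used in the study: the function name, the parameter names and the choice of centimetres for all lengths are assumptions, and the defaults are the middle parameter levels reported later in the paper.

```python
import math


def hand_position_to_velocity(d_cm, alpha=14.0, dead_zone_cm=6.0, gamma=1.4,
                              m_cm=7.5, ceiling_cm=15.0):
    """Map signed hand displacement D (cm) to velocity V (m/s) following Eq. (1).

    All names and defaults are illustrative assumptions; lengths are in cm.
    """
    half_dead_zone = dead_zone_cm / 2.0
    # Distance of the hand beyond the dead zone boundary, ignoring direction.
    beyond = abs(d_cm) - half_dead_zone
    if beyond <= 0.0:
        return 0.0  # inside the dead zone: velocity stays at zero
    # Ceiling: V stops growing once D exceeds 15 cm + lambda/2.
    beyond = min(beyond, ceiling_cm)
    speed = alpha * (beyond / m_cm) ** gamma
    # Backward displacement mirrors the forward curve with a negative sign.
    return math.copysign(speed, d_cm)
```

With these defaults, a hand 7.5 cm beyond the dead zone boundary (D = 10.5 cm) yields V = α regardless of γ, which is the meeting point that M defines.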

A nonlinear velocity curve (i.e. complex sensitivity \( \gamma \) > 1.0) provides an area just beyond the dead zone where users can operate at a slower velocity when approaching targets that require increased control. It also allows users to move with higher velocity between targets since the velocity increases more rapidly as the hand is further away from the dead zone.
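As a worked illustration of this trade-off, assume Eq. (1) is applied as written with α = 14 m/s and M = 7.5 cm, and write x = D − λ/2 for the distance of the hand beyond the dead zone boundary. The values of x below are chosen only for illustration. Close to the dead zone (x = 3 cm), a larger γ gives a lower velocity:

$$ V_{\gamma = 1.0} = 14 \left( \frac{3}{7.5} \right)^{1.0} = 5.6, \quad V_{\gamma = 1.4} = 14 \left( \frac{3}{7.5} \right)^{1.4} \approx 3.88, \quad V_{\gamma = 2.0} = 14 \left( \frac{3}{7.5} \right)^{2.0} = 2.24 \;\; \text{(m/s)} $$

At the ceiling (x = 15 cm), the ordering reverses:

$$ V_{\gamma = 1.0} = 14 \cdot 2^{1.0} = 28, \quad V_{\gamma = 1.4} = 14 \cdot 2^{1.4} \approx 36.9, \quad V_{\gamma = 2.0} = 14 \cdot 2^{2.0} = 56 \;\; \text{(m/s)} $$

At x = M = 7.5 cm all three curves give V = α = 14 m/s.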

3 Experiment Method

3.1 Virtual Environment

The virtual environment included a virtual character, controlled by the user, which moved forward to arrive at a visible target ahead of the user. Backward motion was allowed when a target was overshot. The virtual environment included an endless road, 3.7 m wide with side railings 0.8 m high, which disappeared into the distance. Trees were periodically visible just outside the railings to provide a sense of distance. The virtual character stood in the center of the road and provided the viewpoint, but the character was not visible to the user. Instead, a red line crossing the road indicated the position where the virtual character was standing. The red line was positioned 0.3 m above the road floor. A virtual camera was placed 1.5 m above the road floor and 2.5 m behind the virtual character, and followed the virtual character whenever the virtual character moved, thus capturing and rendering a third-person view. Movement was constrained to the center of the road. By flexing at the shoulder, movement of the hand/forearm segment controlled the forward or backward movement of the virtual character along the road. The position of the camera relative to the virtual character was fixed; therefore, the red line always appeared at a fixed location close to the bottom of the screen (Fig. 1).

Fig. 1. Screenshot of the VE which included an endless road for 1-D navigation wherein a red line indicated self-position. Two yellow lines plus a semi-transparent bubble indicated a target for subjects to arrive in (Color figure online).

Subjects moved forward or backward along the road to move the red line (i.e., self) to the next target, indicated by a sphere (bubble) that was marked by two yellow lines (Fig. 1). The distance between the two yellow lines, which equaled the diameter of the sphere, was the target width \( \rho \). The bubble sat on the road surface, and was meant to provide information about the distance to the target. Users could easily see when the red line was between the two yellow lines, which was the ultimate goal of the task.

3.2 Experiment Environment

The virtual environment was rendered on a desktop PC, and a monitor was placed on a table. The table and the chair were height adjustable and set to fit the subject’s anthropometry. A depth camera (RealSense F200, Intel) was attached to the end of a mechanical arm, and positioned in front of the right hand pointing horizontally toward the users’ hand, about 5 cm above the table surface.

3.3 Hand Position Detection

Hand position was detected with a depth camera. To improve stability of hand detection, an environment-robust hand detection algorithm was not used; instead, the object closest to the camera was assumed to be the hand. A histogram of the depth values of all the pixels was calculated for every frame captured by the depth camera. To minimize noise, the 0.1% of all the pixels that had the smallest depth values were ignored. After removal of the ignored pixels, the pixels that had a depth value in the range of [d, d + 150] were considered to belong to the hand, where d was the smallest remaining depth value. The depth of the hand, i.e. the distance of the hand to the camera \( D_h \), was calculated as the average depth value of all the pixels that belonged to the hand.

If the zero point had a depth value of \( D_z \), the hand position D used in the velocity mapping function was defined as \( D = D_z - D_h \), i.e. the signed distance between the hand and the zero point.
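The detection steps above can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' code: it sorts the depth values instead of building a histogram (which selects the same pixels), it assumes invalid depth readings have already been removed, and the function and parameter names are hypothetical. The value 150 is taken from the text, in whatever depth units the camera reports.

```python
def estimate_hand_depth(depths, ignore_per_thousand=1, hand_range=150):
    """Return D_h, the mean depth of the pixels assumed to belong to the hand.

    `depths` is a flat sequence of per-pixel depth values for one frame.
    """
    ordered = sorted(depths)
    # Ignore the 0.1% of pixels with the smallest depth values as noise.
    n_ignored = len(ordered) * ignore_per_thousand // 1000
    kept = ordered[n_ignored:]
    if not kept:
        raise ValueError("no depth pixels available")
    d_min = kept[0]  # nearest remaining surface, assumed to be the hand
    # Pixels within [d, d + 150] are attributed to the hand.
    hand_pixels = [d for d in kept if d <= d_min + hand_range]
    return sum(hand_pixels) / len(hand_pixels)


def hand_position(zero_depth, hand_depth):
    """Signed displacement D = D_z - D_h; positive when the hand moves toward the camera."""
    return zero_depth - hand_depth
```

Because the camera faces the hand from the front, a forward hand movement reduces the measured depth, so D = D_z − D_h is positive for forward displacement, matching the sign convention of Sect. 2.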

3.4 Subjects

Sixteen subjects with ages ranging from 18 to 35 years participated in the experiment. The experiment was approved by the University of California at Berkeley Committee on Human Research. Seven were male and 9 were female. Eight subjects reported their ethnicity as Asian, 6 as Caucasian, and 2 as Latino. The mean height was 1.74 m (range 1.57 m to 1.96 m). Two subjects reported using their left hand for writing, ten reported no experience in video gaming and ten reported using a tablet or smart phone more than 10 h per week.

3.5 Task

Subjects were instructed to move their right hand forward or backward in order to move the red line to the next target in the VE. When arriving at the target, indicated when the red line was between the two yellow lines, they pressed the space bar with their left hand. Arriving in the target was considered successful if the red line was inside the target and was not moving. After successfully arriving in a target, the target disappeared and the next target appeared. Subjects were instructed to move as rapidly as possible to acquire the next target. There were a total of 30 targets per trial. If the space bar was pressed when the red line was outside of the target, the subject heard a sound indicating failure and then continued until successfully reaching the target.
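The success criterion for a space bar press can be expressed as a small check. This is a sketch only: the function names are hypothetical, positions are measured in metres along the road, and treating the yellow lines themselves as inside the target is an assumption.

```python
def is_inside_target(position_m, target_center_m, target_width_m):
    """True when the red line lies between the two yellow lines (edges included)."""
    return abs(position_m - target_center_m) <= target_width_m / 2.0


def is_successful_arrival(position_m, velocity_mps, target_center_m, target_width_m):
    """A press counts only if the character is stationary and inside the target."""
    stationary = velocity_mps == 0.0  # the hand is back inside the dead zone
    return stationary and is_inside_target(position_m, target_center_m, target_width_m)
```

Requiring zero velocity means the subject must return the hand to the dead zone before pressing the space bar.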

The targets appeared at varying distances from the virtual character. To account for practice, the first 6 of the 30 targets in a trial were not considered in the analysis. Among the remaining 24 targets, 8 targets each were set at distances of 3 m, 9 m and 15 m from the last target. For each trial, the target size was either a small (1 m) or large (2 m) width.

3.6 Experiments on Different Parameter Levels

The three mapping function parameters were each tested at 3 different levels. The levels were chosen based on pilot tests. The levels tested for α were 7, 14 and 21 m/s; for \( \lambda \) were 40, 60 and 80 mm; and for γ were 1.0, 1.4 and 2.0. Examples of velocity curves of different complex sensitivity levels (\( \gamma \)) are presented in Fig. 2. When testing the levels of one parameter, the other two parameters were set at the middle value of the levels tested (i.e., α = 14 m/s; \( \lambda \) = 60 mm; \( \gamma \) = 1.4).
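The one-parameter-at-a-time design described above can be written out explicitly. The sketch below enumerates the three conditions of an experiment by varying one parameter while holding the other two at their middle levels; the dictionary keys and the function name are illustrative assumptions.

```python
# Levels reported in the text: alpha in m/s, dead zone width in mm, gamma unitless.
LEVELS = {
    "alpha": (7.0, 14.0, 21.0),
    "dead_zone_mm": (40.0, 60.0, 80.0),
    "gamma": (1.0, 1.4, 2.0),
}


def conditions_for(parameter):
    """Return the three parameter sets tested when `parameter` is varied."""
    # Every parameter starts at its middle level.
    middle = {name: values[1] for name, values in LEVELS.items()}
    # Override only the parameter under test with each of its levels.
    return [{**middle, parameter: level} for level in LEVELS[parameter]]
```

For example, `conditions_for("gamma")` keeps α = 14 m/s and λ = 60 mm while γ takes the values 1.0, 1.4 and 2.0.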

Fig. 2. Velocity curves of movement velocity V against hand position D with 3 different levels for complex sensitivity γ. In this example, the dead zone width is 6 mm.

For each of the 4 experiments, subjects completed 6 trials; one trial for each of the 3 parameter levels for both the large and small targets. For each experiment there were two levels of randomization; the first was target size followed by randomization of levels within each target size. After the parameter levels were randomized they were labeled A, B, and C so that subjects could reference them on the usability questionnaire. The order of experiments was simple sensitivity α, followed by dead zone width \( \lambda \), then complex sensitivity \( \gamma \). During the first 3 experiments, subjects rested their hand on the desk surface during hand movement. The 4th experiment repeated the simple sensitivity experiment but required subjects not to touch the desk surface with their hand or arm (i.e., free floating).
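One possible reading of this two-level randomization is sketched below: the order of the two target sizes is shuffled first, then the three levels are shuffled within each size and labeled A, B and C in presentation order. Whether the labels were assigned per target size or once per experiment is not stated, so the per-size labeling, the function name and the use of a seeded generator are all assumptions.

```python
import random


def trial_order(levels, rng, target_widths_m=(1.0, 2.0)):
    """Return six (target_width_m, label, level) trials for one experiment."""
    widths = list(target_widths_m)
    rng.shuffle(widths)  # first randomization: which target size comes first
    trials = []
    for width in widths:
        order = list(levels)
        rng.shuffle(order)  # second randomization: level order within this size
        # Labels follow presentation order so subjects can refer to A, B, C.
        for label, level in zip("ABC", order):
            trials.append((width, label, level))
    return trials
```

A call such as `trial_order((7.0, 14.0, 21.0), random.Random(1))` yields the six trials of a simple sensitivity experiment for one subject.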

Before starting the 4 experiments, subjects spent approximately 15 min practicing self-travel and target attainment with different target distances and sizes.

3.7 Questionnaires

Within each experiment, following each trial of a given level, subjects completed a questionnaire which rated the fatigue they felt following the completion of the trial using numbers 1 to 5 where 1 described no fatigue and 5 described extreme fatigue. After completing the 3 parameter levels within an experiment, subjects answered two questions: (1) “which level (A, B or C) did you prefer the most and the least for ease of stopping at the target” and (2) “which did you prefer the most and the least moving toward the target”. Open-ended comments comparing levels were also encouraged.

4 Results

For each target presented within a level, the performance time (duration from target appearance to successful movement to target) was recorded. The mean and the standard deviation of the performance time for the 24 analyzed targets at each level of each parameter were calculated (Tables 1, 2, 3 and 4).
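The per-cell summary behind Tables 1 to 4 can be computed as sketched below, grouping performance times by target width and target distance. The record layout and function name are hypothetical, and the use of the sample standard deviation is an assumption, since the text does not say which estimator was used.

```python
from statistics import mean, stdev


def summarize_times(records):
    """Group (target_width_m, target_distance_m, time_s) records and return
    {(width, distance): (mean_time, sd_time)}.

    Each group needs at least two times for the sample SD to be defined.
    """
    groups = {}
    for width, distance, time_s in records:
        groups.setdefault((width, distance), []).append(time_s)
    return {key: (mean(times), stdev(times)) for key, times in groups.items()}
```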

Table 1. Mean time in seconds (SD) to reach each target for different simple sensitivity levels by target diameter and target distance (N = 16), when hand touches desk.
Table 2. Mean time in seconds (SD) to reach each target for different dead zone width by target diameter and target distance (N = 16).
Table 3. Mean time in seconds (SD) to reach each target for different complex sensitivity levels by target diameter and target distance (N = 16).
Table 4. Mean time in seconds (SD) to reach each target for different simple sensitivity levels by target diameter and target distance (N = 16), when hand floats in air.

Within each experiment, differences in performance time between parameter levels were analyzed by ANOVA and the p-values are shown in Table 5. Dead zone width has no significant influence while the other two parameters both have significant influence.

Table 5. P-values on performance time for the different experiments (ANOVA).

Sometimes subjects overshot when trying to arrive in a target. The percentage of overshot targets within each experiment (parameter tested) is shown in Fig. 3.

Fig. 3. Overshot percentage with different parameter levels, when target size is 1 m and when target size is 2 m.

The average subjective rating of fatigue after each trial in time sequence (independent of parameter levels as the order of levels is block randomized) is shown in Fig. 4. Preference rankings for parameter levels are shown in Fig. 5 and were analyzed using the Friedman Test (Table 6).

Fig. 4. Average fatigue ratings after each trial. There were 6 trials in each experiment.

Fig. 5. Summary of preference rankings for each parameter level by subjects.

Table 6. P-values on subject preference for the different experiments (Friedman Test).

Preferences for moving were significantly different between levels for simple sensitivity and dead zone width, but not for complex sensitivity (Table 6). Differences in preference were more pronounced across trials of larger target sizes. Preferences for stopping were only significantly different for complex sensitivity with large target size.
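For readers unfamiliar with the Friedman test used for the preference rankings, the sketch below computes the standard Friedman statistic for untied ranks and its chi-square p-value for three levels. It is an illustration only: the input layout (one rank list per subject, 1 = most preferred) and the function names are assumptions, and the text does not say whether the reported p-values came from this chi-square approximation or from exact tables.

```python
import math


def friedman_statistic(rankings):
    """Friedman chi-square statistic for untied ranks.

    `rankings` holds one list per subject with the rank (1..k) given to each
    of the k parameter levels.
    """
    n = len(rankings)
    k = len(rankings[0])
    # Sum of the ranks each level received across all subjects.
    rank_sums = [sum(subject[j] for subject in rankings) for j in range(k)]
    sum_of_squares = sum(r * r for r in rank_sums)
    return 12.0 / (n * k * (k + 1)) * sum_of_squares - 3.0 * n * (k + 1)


def p_value_three_levels(chi2):
    """Chi-square p-value for k = 3 levels (2 degrees of freedom).

    With 2 degrees of freedom the chi-square survival function is exp(-x/2).
    """
    return math.exp(-chi2 / 2.0)
```

When every subject ranks the three levels identically, the statistic reaches its maximum of n(k − 1); when the rank sums are equal across levels, it is zero.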

5 Discussion

Subjects were able to rapidly adapt to using hand gestures for self-travel in a virtual environment. However, there were clear performance and preference differences for different levels of the parameters tested that map hand position to velocity control.

For example, the simple sensitivity parameter adjusted subjects’ control of speed. As the slope increased, the performance time decreased but at the expense of more overshooting. For distant targets and large targets, subjects preferred the larger slopes because the smaller slope was too slow. On the other hand, for stopping at small targets, the smaller slope was preferred because it improved precision of travel. For smaller target distances, the small slope provided greater precision of movement control with less overshoot.

Performance was not influenced by the size of the dead zone. Many subjects reported that the different dead zone widths were the most subtle of the three parameters tested. However, there was some preference for the narrow (40 mm) or middle dead zone width (60 mm). The wide dead zone was not favored because it required subjects to move their hand more than the other dead zone sizes and was associated with more shoulder fatigue.

With a larger complex sensitivity value, the rate of velocity increase was less when the hand was just beyond the dead zone and rose dramatically the further the hand was moved from the dead zone. This nonlinear (power-function) relationship was tested because it could provide more precise control in the slow velocity range when approaching a target, yet greater velocity during travel. It is evident that the different complex sensitivity levels influence subject performance, but the trends varied with different target sizes and target distances. The most non-linear curve (\( \gamma \) = 2.0) tended to have the worst performance and preference characteristics and was described as difficult when moving to small targets due to increased delay. On the other hand, the linear curve level (\( \gamma \) = 1.0) was described as having an abrupt change in speed with very little hand movement at the edge of the dead zone.

In the two experiments on simple sensitivity, comparing moving with the hand floating in air versus supporting the hand on the desk, performance was better when the hand was floating. This may have been a training effect related to the order of testing, since the floating condition always occurred after the supported hand conditions, and, therefore, subjects were more familiar with the experiment. However, the average fatigue ratings were substantially greater immediately following the free floating condition (experiment 4), indicating that subjects experienced more fatigue due to efforts to keep the hand floating in air versus supporting it on the desk. Despite the increase in fatigue, subjects varied in their preference for their upper extremity being supported versus free floating. Subjects did report more stress and fatigue in the shoulder during the free floating condition, yet reported that the hand friction while the upper extremity was supported may have impeded their control. Some subjects reported that although the free floating condition caused more fatigue, it was more comfortable. Despite the differences in fatigue and comfort, the preferences for parameter levels were similar for the two experiments. It was more difficult for subjects to remember the locations of the boundaries of the dead zone during the free floating condition, since touching the desk provided a reference for the position of the hand relative to the dead zone. Also, subjects tended to move their hand further forward and backward during the free floating condition.

Of interest is that two subjects found easier ways to stop forward travel than pulling their hand back to the dead zone. For example, one subject raised the hand vertically by flexing the elbow to stop and another subject closed the fingers to form a fist to stop. Some subjects controlled velocity with sudden movements of their hand while others moved their hand smoothly.

6 Conclusion

With new depth sensing cameras, hand position can be used to control self-travel velocity in a virtual environment. The shape of the mapping function for converting hand position to velocity influenced user performance and preference. Specifically, the slope of the hand-position-to-velocity curve and the shape of the curve will affect performance of arriving in targets of varying sizes and distances. In addition, users have a preference for certain levels of hand-position-to-velocity slope and dead zone width. The use of non-contacting hand gestures provides a new, naturalistic method of human-computer interaction in a virtual environment. With optimization, this method of interaction may augment or replace interaction using controllers or other physical devices.