1 Introduction

Interactive systems increasingly include three-dimensional environments as a means of achieving realism, along with appropriate interaction techniques that provide natural manipulation of those environments. 3D environments display objects, places, scenes and characters in a more lifelike and detailed manner than 2D visualizations. Virtual reality (VR) technologies are applied to 3D visualizations in order to enrich user experience and immerse users in the visualized environment [1, 10]. VR is an artificial environment that is created with software and presented to the user in such a way that the user suspends disbelief and accepts it as a real environment. The most prevalent devices supporting VR are head-mounted displays, including the Oculus Rift [4, 12, 15], the HTC Vive [8], etc. Another approach for realizing VR environments is the CAVE (Cave Automatic Virtual Environment) approach [3], which, however, is more expensive and difficult to set up, requiring users to be located at specific positions in the system in order to feel immersed in the VR environment.

When using head-mounted displays, immersion is accomplished through the stereoscopic vision and auditory feedback that the devices provide. Additionally, while wearing the headset, the user’s head movement maps directly to the viewing direction in the virtual world, allowing the scene to be inspected simply by moving or turning around; however, travelling cannot be accomplished without additional input. In order for users to be able to interact in VR environments, several approaches have been employed that facilitate the manipulation, grabbing or movement of virtual objects or of the user’s view in the virtual world, including not only the wireless controllers that accompany modern VR headsets such as the Oculus Rift and HTC Vive [8, 12], but also wired gloves [2] and computer vision techniques [6]. The main goal of these approaches is to enable the user to manipulate the VR environment, as well as to navigate, select objects and ultimately further explore the visualized data. User interaction modalities already used in VR environments include full-body kinesthetic interaction [11], hand gestures [9, 14] and tangible interaction [13].

Computer vision based approaches for user interaction in VR environments mostly focus on the processing of images acquired by depth sensors such as the Kinect [16] and the Leap Motion [7]. Each device serves different interaction requirements. The Kinect sensor is more appropriate for interaction from a distance and in front of large displays, making use of the whole body and hands [6]. On the other hand, the Leap Motion is commonly used in systems that require interaction close to the user and finger-based item selection. Gloves, wands (e.g. the Wii Remote) and remote controls can also support user interaction and navigation both at a distance and close to the user, but require the user to hold a device. Computer vision approaches are more unobtrusive and offer a more natural user interaction with the environment, since users interact with their bare hands.

This paper discusses the potential of using computer vision approaches for user interaction in VR environments, presenting four different techniques for navigation and interaction using the bare hands.

This work proposes the employment of mid-air gestures for head-mounted VR devices, allowing users to interact with the virtual world in a natural manner without additional equipment. The benefit of gestural interaction is twofold: it allows users both to select elements and to travel in space. Item selection is accomplished by pointing towards an item and pinching. In order to move in space, users perform a specific hand pose and move the hand towards the preferred direction, and their view flies accordingly in the virtual world. Flying in a virtual environment allows unobstructed movement in all three dimensions, providing a travelling technique that is applicable to the vast majority of virtual environments.

2 Interaction

The interaction techniques for VR environments presented in this paper are based on a Leap Motion sensor placed on the front of an Oculus Rift, which displays the virtual world to the users. This setup allows free user movement in space, enabling users to turn their head in any direction. Gesture recognition is accomplished with the camera placed in front of the user’s head; therefore the user’s hands are never occluded by the torso, which is a shortcoming of setups where the depth sensor is placed in a static position. In order to move in space, several alternatives were examined; for example, users can close their fist and move it towards the preferred direction, and the camera moves accordingly.

Four alternatives for navigating using gestural interaction are proposed. The underlying approach relies on performing a specified hand posture in order to travel in space, without interfering with the ability to point in the virtual environment. In all cases, the gesture is initiated when the user performs the posture and, while the posture is tracked, the view of the virtual world moves according to the offset vector defined by the starting point and the current hand position.

  • Closed fist (Fig. 1): the user’s hands are closed in order to travel in space. This approach employs the movement metaphor of Superman, as users are able to freely look in any direction and travel in virtual space along the direction of the offset vector. The underlying cognitive model is straightforward, as the users’ actions are augmented in a magical way through the familiar superhero ability of flying.

    Fig. 1. Closed fist

  • Open palm (Fig. 2): the user’s hands are kept open in order to travel in space. This approach is identical to the closed fist technique, but is performed with the hand open and all fingers extended.

    Fig. 2. Open palm

  • Open palm-normal vector (Fig. 3): the gesture is performed while the users keep their hands open, resulting in movement along the axis perpendicular to the open palm’s plane, towards either the front or the back side of the palm (positive and negative values). The concept of this gesture is that the users define the direction in which they want to travel by pointing with the open hand.

    Fig. 3. Open palm-normal vector

  • Open palm with all fingers extended (along palm’s normal vector, analyzed) (Fig. 4): the gesture performed is identical to the open palm-normal vector technique, but the palm’s normal vector is analyzed with respect to the coordinate system of the user’s head. The vector is decomposed into its components along the three axes (x, y and z) and movement is performed only along the dominant one, while movement along the other axes is ignored (a code sketch of this decomposition is provided below). As a result, the users are able to move in only one direction at a time (i.e. left-right, up-down or forward-backward), offering increased precision and eliminating accidental movements along the other axes, at the cost of requiring multiple gestures in order to move in two directions (e.g. front and right).

    Fig. 4. Open palm with all fingers extended

The users’ hands are rendered in the virtual world in a one-to-one mapping to the physical world, creating the feeling of a mixed reality environment, as the users perceive the hands that appear in their VR view as their own and are thus confident that they have full control of the system. Furthermore, the movement speed is defined by the length of the offset vector, allowing the users to increase the travelling speed by moving their hand further away from the starting point.
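To make the mechanics shared by the four alternatives concrete, the following Python-style sketch outlines how the offset vector, the palm-normal direction and the dominant-axis decomposition could drive the camera. The hand-tracking fields (palm_position, palm_normal, posture flags) and the controller structure are assumptions made for the sake of illustration, not the actual implementation.

```python
import numpy as np

# Hypothetical identifiers for the four travel alternatives described above.
FIST_OFFSET, PALM_OFFSET, PALM_NORMAL, PALM_NORMAL_AXIS = range(4)

class TravelController:
    """Moves the virtual camera while a travel posture is held."""

    def __init__(self, mode, speed_gain=2.0):
        self.mode = mode
        self.speed_gain = speed_gain   # scales the hand vector to camera speed
        self.anchor = None             # hand position when the posture started

    def update(self, camera_pos, hand, head_rotation, dt):
        # 'hand' is a hypothetical tracking frame with palm_position, palm_normal
        # and posture flags; 'head_rotation' maps head-local axes to world axes.
        posture_held = hand.is_fist if self.mode == FIST_OFFSET else hand.is_open_palm
        if not posture_held:
            self.anchor = None         # gesture ended: stop travelling
            return camera_pos

        if self.anchor is None:        # gesture just started: store the start point
            self.anchor = np.asarray(hand.palm_position, dtype=float)

        if self.mode in (FIST_OFFSET, PALM_OFFSET):
            # Offset vector from the starting point to the current hand position;
            # its length also defines the travelling speed.
            direction = np.asarray(hand.palm_position, dtype=float) - self.anchor
        else:
            # Movement along the axis perpendicular to the palm's plane.
            direction = np.asarray(hand.palm_normal, dtype=float)

        if self.mode == PALM_NORMAL_AXIS:
            # Express the normal in the head's coordinate system and keep only
            # the dominant component, restricting movement to one axis at a time.
            local = head_rotation.T @ direction
            dominant = int(np.argmax(np.abs(local)))
            axis = np.zeros(3)
            axis[dominant] = local[dominant]
            direction = head_rotation @ axis

        return camera_pos + self.speed_gain * direction * dt
```

In this sketch the travelling speed grows linearly with the offset length; other mappings between hand displacement and speed could equally be used.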

The proposed approach aids orientation by employing an arrow placed above the user’s hands, indicating the direction of the applied gesture and scaled according to the movement speed (Fig. 5). Even though the feedback provided by the movement itself may be sufficient when travelling in environments with nearby points of reference, such as the ground, walls or trees, when travelling at a distance from any displayed elements, for instance flying over a world, in space or in underwater environments, the movement speed and direction may be unclear.

Fig. 5. Navigation in VR
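A minimal sketch of how such an arrow indicator could be updated each frame is given below; the arrow object and its fields are hypothetical and serve only to illustrate the orientation and scaling behaviour.

```python
import numpy as np

def update_direction_arrow(arrow, hand_position, velocity, height_offset=0.15):
    """Place an arrow slightly above the hand, orient it along the travel
    direction and scale it with the movement speed (illustrative fields)."""
    speed = float(np.linalg.norm(velocity))
    arrow.visible = speed > 1e-3                      # hide the arrow when idle
    if arrow.visible:
        arrow.position = np.asarray(hand_position, dtype=float) + np.array([0.0, height_offset, 0.0])
        arrow.direction = np.asarray(velocity, dtype=float) / speed   # unit travel direction
        arrow.scale = 1.0 + speed                     # longer arrow when travelling faster
```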

In terms of item selection, pointing with the index finger is used for aiming at an element. The pointing direction is highlighted and a circular cursor is placed on the interactive element, if any, to designate that it can be selected (Fig. 6). Selection is accomplished through pinching, following the metaphor of clicking with a mouse.

Fig. 6. Item selection in VR
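The following sketch illustrates one possible implementation of this aiming-and-pinching scheme. The hand fields (index tip position, pointing direction, pinch strength) and the sphere-shaped hit test for interactive elements are assumptions made for the example, not the application’s actual code.

```python
import numpy as np

def update_selection(hand, interactive_elements, pinch_threshold=0.8):
    """Aim with the index finger and select by pinching (illustrative only)."""
    origin = np.asarray(hand.index_tip_position, dtype=float)
    direction = np.asarray(hand.index_direction, dtype=float)
    direction /= np.linalg.norm(direction)

    # Find the closest interactive element intersected by the pointing ray.
    target, best_t = None, np.inf
    for element in interactive_elements:
        offset = np.asarray(element.position, dtype=float) - origin
        t = float(np.dot(offset, direction))        # distance along the ray
        if t <= 0.0:
            continue                                # element is behind the finger
        closest = origin + t * direction
        if np.linalg.norm(closest - np.asarray(element.position, dtype=float)) < element.radius and t < best_t:
            target, best_t = element, t

    if target is not None:
        target.show_cursor()                        # circular cursor on the element
        if hand.pinch_strength > pinch_threshold:   # pinch acts like a mouse click
            target.select()
    return target
```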

3 3D Data Centre Environment

In order to experiment with the proposed gestural interaction approaches, a Data Centre 3D Visualization application [5] was used as a case study. The application aims at helping data centre experts get an intuitive overview of the current condition of a specific data centre room. Additionally, it facilitates the inspection of racks and servers and warns users, in an intuitive manner, about situations that need further investigation, such as an anomaly in a particular set of servers that may reveal malfunctions or degraded operation. The application was chosen as a case study since it entails both extensive travelling, when viewing the data centre room from a distance, and short movements, when examining racks near the room’s floor.

The main screen of the Data Centre 3D Visualization application, which comprises a virtual representation of a data centre room and the basic interactive UI components, is depicted in Fig. 7. All the room’s servers are grouped and displayed as 3D racks according to their physical location in space. Each rack may contain at most 40 servers, each displayed as a slice with a specific color annotating its current condition. The virtual data centre room is constructed as a grid in 3D space. The environment that encloses the scene is spherical and the server grid is placed at its centre, so that users can have a 360° overview.

Fig. 7. 3D data centre room monitoring
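As a rough illustration of this layout, the sketch below places racks on a grid and stacks up to 40 coloured server slices per rack. The data structures, spacing constants and colour mapping are assumptions for the example, not the application’s actual code.

```python
# Illustrative colour coding of a server's condition.
CONDITION_COLOURS = {"ok": "green", "warning": "orange", "critical": "red"}

MAX_SERVERS_PER_RACK = 40
RACK_SPACING = 2.0        # assumed distance between racks on the room grid
SLICE_HEIGHT = 0.05       # assumed height of one server slice

def layout_room(racks):
    """Return drawable slices: one coloured slice per server, stacked inside its
    rack, with racks placed on a grid according to their physical row/column."""
    slices = []
    for rack in racks:
        base_x = rack["column"] * RACK_SPACING
        base_z = rack["row"] * RACK_SPACING
        for i, server in enumerate(rack["servers"][:MAX_SERVERS_PER_RACK]):
            slices.append({
                "position": (base_x, i * SLICE_HEIGHT, base_z),   # stack slices vertically
                "colour": CONDITION_COLOURS.get(server["condition"], "grey"),
                "server_id": server["id"],
            })
    return slices
```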

Upon the selection of a server in a specific rack, the visualization changes and more detailed information per server is displayed. The close-up view (Fig. 8) contains historical data, which is updated when the user changes the time or switches to another server. The historical data is presented through line charts arranged in a spherical view, so that the user is enclosed in a spherical display of information.

Fig. 8. Close up view
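One way such a surrounding arrangement could be computed is sketched below, placing chart panels evenly on a horizontal ring of the enclosing sphere, all facing the viewer at the centre. This is purely illustrative; the actual panel layout of the application may differ.

```python
import math

def place_chart_panels(num_panels, radius=3.0):
    """Distribute chart panels on a ring around the user at the origin,
    each rotated to face the centre (illustrative layout only)."""
    panels = []
    for i in range(num_panels):
        angle = 2.0 * math.pi * i / num_panels
        position = (radius * math.sin(angle), 0.0, radius * math.cos(angle))
        yaw_towards_user = math.degrees(angle) + 180.0   # rotate panel to face the centre
        panels.append({"position": position, "yaw_degrees": yaw_towards_user})
    return panels
```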

4 Conclusion – Future Work

This paper has presented ongoing work regarding the potential use of gestural interaction as a means of travelling in virtual reality environments in a natural manner. Four alternatives were implemented and employed in the demanding context of a data centre room monitoring application, which requires both extensive travelling and fine adjustments of the viewpoint between racks. Early user comments on applying gestural interaction in combination with VR devices were very encouraging, as the approach proved natural, usable and efficient. The next planned steps involve conducting an extensive evaluation, assessing users’ preferences both among the proposed alternative gestural approaches and in comparison with more traditional input devices.

Virtual reality can be employed to provide an improved user experience in the domain of 3D visualizations. Existing VR headsets will be further enhanced with portable and potentially embedded devices such as the Leap Motion, providing an environment that supports natural interaction, captivates the human senses and offers improved perception of 3D spaces.