
1 Introduction

A head-mounted display (HMD) is a display device worn on the user's head that presents virtual objects to a user who wants to experience virtual reality (VR) or augmented reality (AR). In the past, a high-performance desktop had to be connected to display virtual objects on the HMD, but recently mobile HMDs have appeared that display virtual objects through a connection with a high-performance smartphone or through a high-performance chip built into the HMD itself, allowing the user to see virtual objects without limits of time and space. However, in order for a user to interact with virtual objects displayed on a mobile HMD, a portable input device is needed to manipulate those objects. In particular, to select a target virtual object, the input device can control ray-casting, which is one of the easiest and fastest selection methods in VR or AR [1].

Conventional input devices such as a mouse, a keyboard, and a wand are not suitable for a mobile environment because the user must carry these devices. Therefore, input devices embedded in the HMD itself have been proposed. Oculus Rift [2] and Gear VR [3] use an IMU sensor to track the orientation of the user's head. These devices allow the user to move a cursor at the center of the HMD screen by rotating his/her head, but when the user wants to move the cursor quickly or frequently, this can cause dizziness. FaceTouch [4] proposed touch interaction through a touch pad attached to the HMD, and HoloLens [5] proposed a method of recognizing the user's hand gestures with a camera attached to the HMD. When a user interacts with virtual objects for a long time using these methods, arm fatigue increases because the user must keep his/her hand raised. In addition, using these methods in a public space is uncomfortable because they draw attention from others.

Therefore, in order to support socially acceptable gestures in a public place, the input device should be independent of the HMD [6] and should be easy to carry by being attached to the user's body. There has been much research on socially acceptable devices and gestures. Belt [7] proposed a touch interface using a belt with multiple touch sensors. Although menus can be easily manipulated depending on the position on the belt, there are limitations in selecting and manipulating virtual objects in 3D. FingerPad [8] and uTrack [9] proposed methods of tracking the finger position using a magnet attached to one finger and a sensor, attached to another finger, that tracks magnetic field changes. However, these methods are cumbersome to use because two sensors need to be attached. LightRing [10] uses a ring-type device to track fingertip movements on a surface. However, this method has difficulty tracking the fingertip when the finger is unfolded and needs an additional sensor for selection.

In this paper, we propose AnywhereTouch, a finger tracking method using a nail-mounted inertial measurement unit (IMU) that allows a user to easily manipulate virtual objects on arbitrary surfaces. When a finger touches an arbitrary surface, AnywhereTouch calculates the normal vector of the touched surface and the angle between the surface and the finger using a defined calibration gesture. Next, the finger movement on the arbitrary surface is tracked based on an inverse kinematics model and the quaternion measured by the IMU, and finger tap gestures are recognized by analyzing the three-axis acceleration and three-axis angular velocity from the attached IMU sensor. A virtual object can then be selected by controlling the ray-casting based on the tracked finger movement and the recognized tap gestures.

2 Method

2.1 Hardware

Figure 1-A shows the nail-mounted 9-axis IMU. The 9-axis IMU senses 3-axis acceleration, angular velocity, and magnetic field, and the orientation of the IMU is calculated from the 9-axis IMU data [11]. Bending the finger on an arbitrary surface brings the nail toward the surface, so the IMU is attached to the middle of the nail, as shown in Fig. 1-B.
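As an illustration of this step, the following is a minimal sketch of fusing the 9-axis measurements into an orientation quaternion. It assumes the open-source Python `ahrs` package and its Madgwick filter rather than the specific fusion method of [11], and the sample rate is an assumed value.

```python
import numpy as np
from ahrs.filters import Madgwick  # assumed fusion filter, not the method of [11]

madgwick = Madgwick(frequency=100.0)          # assumed IMU sample rate (Hz)
q = np.array([1.0, 0.0, 0.0, 0.0])            # initial orientation (w, x, y, z)

def update_orientation(q, gyr, acc, mag):
    """Fuse one 9-axis sample (gyroscope in rad/s, accelerometer in m/s^2,
    magnetometer in the units expected by the filter) into the orientation."""
    return madgwick.updateMARG(q, gyr=gyr, acc=acc, mag=mag)
```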

Fig. 1. (A) An example of AnywhereTouch with a nail-mounted IMU on a surface. (B) The IMU is attached to the middle of the nail to prevent the sensor from touching the surface.

2.2 Calibration

When the index finger touches an arbitrary surface, AnywhereTouch calculates the slope of the surface and the angle between the surface and the finger. After the index finger is unfolded and the wrist is fixed as shown in Fig. 2-A, the wrist rotates from side to side to calculate the slope of the arbitrary surface. Figure 2-B illustrates how an arc on the unit sphere is created from the rotation angles measured by the IMU while the wrist moves side to side. The normal vector of the contacted surface is calculated from points A, B, and C of the created arc using Eq. (1). A and B are the two ends of the arc, and C is the point on the arc closest to the midpoint of A and B. \( \vec{n} \) is the normal vector of the arbitrary surface touched by the index finger. The slope of the surface is calculated from \( \vec{n} \) and \( \vec{z} \) using Eq. (2), where \( \vec{z} \) is the unit normal vector of the x-y plane in the global coordinate system and \( q_{plane} \) is a unit quaternion representing the rotation of the contact surface. \( \theta_{init} \), the angle between the surface and the index finger, is calculated from Eq. (3), where \( \overrightarrow{CO} \) is the direction of the finger from point C on the arc to the origin O.

Fig. 2. (A) The side-to-side movement of the index finger for calibration. (B) Illustration of the calibration.

$$ \vec{n} = \overrightarrow {CA} \times \overrightarrow {CB} $$
(1)
$$ q_{plane} = \left( {\vec{v}, w} \right) (\vec{v} = \vec{n} \times \vec{z}, w = 1 + \vec{n} \cdot \vec{z} ) $$
(2)
$$ \theta_{init} = 90 - {\text{acos}}\left(\frac{\overrightarrow {CO} \cdot \vec{n}}{\left\| \overrightarrow {CO} \right\| \left\| \vec{n} \right\|}\right) $$
(3)
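As a concrete illustration of Eqs. (1)-(3), the following is a minimal sketch of the calibration step, assuming the arc points A, B, and C are given as 3D coordinates on the unit sphere around the origin O. The explicit normalization of \( q_{plane} \) (so that it is a unit quaternion) is an added step, not written out in Eq. (2).

```python
import numpy as np

def calibrate(A, B, C):
    """Calibration sketch following Eqs. (1)-(3); A, B, C are points on the
    unit-sphere arc traced while the wrist rotates side to side."""
    A, B, C = map(np.asarray, (A, B, C))

    # Eq. (1): normal vector of the touched surface, n = CA x CB
    n = np.cross(A - C, B - C)
    n /= np.linalg.norm(n)

    # Eq. (2): shortest-arc quaternion aligning n with the global z axis,
    # normalized here so that q_plane is a unit quaternion
    z = np.array([0.0, 0.0, 1.0])
    v = np.cross(n, z)
    w = 1.0 + np.dot(n, z)
    q_plane = np.append(v, w)                    # (x, y, z, w)
    q_plane /= np.linalg.norm(q_plane)

    # Eq. (3): initial angle between the surface and the index finger
    CO = -C                                      # direction from point C to the origin O
    cos_a = np.dot(CO, n) / (np.linalg.norm(CO) * np.linalg.norm(n))
    theta_init = 90.0 - np.degrees(np.arccos(np.clip(cos_a, -1.0, 1.0)))
    return n, q_plane, theta_init
```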

2.3 Tracking the Finger Movement

The anatomical model of the index finger and the quaternions measured by the IMU are used to track the movement of the finger. The joints of the index finger are the metacarpophalangeal (MCP) joint, the proximal interphalangeal (PIP) joint, and the distal interphalangeal (DIP) joint, which have 2 degrees of freedom (DOF), 1 DOF, and 1 DOF, respectively [12]. The side-to-side movement of the index finger is tracked by the MCP joint's z-axis rotation angle, and the bending of the index finger is tracked by the y-axis rotation angles of the MCP, PIP, and DIP joints. However, because the range of the MCP joint's z-axis rotation is very small, moving the index finger from side to side using only this joint is difficult. Thus, as shown in Fig. 2-A, the side-to-side movement of the index finger includes the side-to-side movement of the wrist, and this combined movement can be tracked by the z-axis rotation angle from the IMU.
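As a small illustration, one way to extract the z-axis rotation angle from the IMU quaternion is sketched below; the (x, y, z, w) ordering and the Euler-angle convention are assumptions, not specified by the paper.

```python
import numpy as np

def yaw_from_quaternion(q):
    """Z-axis (yaw) rotation angle of the IMU quaternion (x, y, z, w), in degrees;
    a sketch of how the side-to-side finger/wrist movement could be tracked."""
    x, y, z, w = q
    # standard quaternion-to-Euler conversion, yaw about the global z axis
    yaw = np.arctan2(2.0 * (w * z + x * y), 1.0 - 2.0 * (y * y + z * z))
    return np.degrees(yaw)
```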

If no external force is applied to the fingertip, the PIP joint's y-axis rotation angle is anatomically (3/2)θ when the DIP joint's y-axis rotation angle is θ [12]. Because the index finger is bent on the surface as shown in Fig. 3-A, each joint angle of the index finger can be expressed in terms of α, the y-axis rotation angle from the IMU, and θ, the DIP angle, as shown in Fig. 3-B. θ is calculated from Eq. (4). H is the height from the surface to the hand and is calculated using \( \theta_{init} \); Eq. (4) expresses the constraint that H does not change while the index finger is bent. 39.8, 22.4, and 15.8 are the average lengths (mm) of the proximal phalanx, intermediate phalanx, and distal phalanx, respectively [13].

Fig. 3. (A) The bent index finger on the arbitrary surface. (B) Illustration of tracking the bent index finger based on the IMU's rotation angle and the anatomical model of the index finger.

$$ f\left( \theta \right) = H - \left( {15.8 \cdot \sin \left( \alpha \right) + 22.4 \cdot \sin \left( {\alpha - \theta } \right) + 39.8 \cdot \sin \left( {\alpha - \frac{5}{2}\theta } \right)} \right) = 0 $$
(4)
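As an illustration, a minimal sketch of solving Eq. (4) for θ by bisection is given below. The bisection bracket and iteration count are assumptions; the paper does not state how the root of f(θ) is found.

```python
import numpy as np

# average phalanx lengths (mm) from [13]
L_DISTAL, L_INTERMEDIATE, L_PROXIMAL = 15.8, 22.4, 39.8

def f(theta, alpha, H):
    """Eq. (4): vertical closure of the finger chain for DIP angle theta (rad),
    given the IMU y-axis rotation angle alpha (rad) and the hand height H (mm)."""
    return H - (L_DISTAL * np.sin(alpha)
                + L_INTERMEDIATE * np.sin(alpha - theta)
                + L_PROXIMAL * np.sin(alpha - 2.5 * theta))

def solve_dip_angle(alpha, H, lo=0.0, hi=np.radians(90.0), iters=50):
    """Solve f(theta) = 0 by bisection, assuming a sign change on [lo, hi]."""
    if f(lo, alpha, H) * f(hi, alpha, H) > 0:
        return None                     # no bracketed root for this sample
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if f(lo, alpha, H) * f(mid, alpha, H) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)              # DIP angle theta; PIP angle = 1.5 * theta
```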

2.4 Recognizing the Finger Tap Gestures

In this paper, we define two finger tap gestures, 'tap' and 'double tap', as shown in Table 1. The 3-axis acceleration and angular velocity from the IMU are used to recognize the finger tap gestures. The acceleration from the IMU includes the gravitational acceleration; the linear acceleration with the gravitational acceleration removed is calculated from Eq. (5), where \( q_{IMU} \) is the quaternion from the IMU and \( \vec{G} \) is the gravitational acceleration. Table 1 shows the angular velocity and the linear acceleration during the finger tap gestures. When a user performs a 'tap', the y-axis angular velocity shows a negative trough after a positive crest, and the z-axis linear acceleration shows a positive crest after a negative trough. Similarly, when a user performs a 'double tap', the angular velocity and linear acceleration signals generated in the 'tap' gesture are repeated twice.

Table 1. The defined finger tap gestures and the acceleration and angular velocity signals of each gesture
$$ \overrightarrow {a_{cal}} = q_{IMU} \cdot \overrightarrow {a_{IMU}} - \vec{G} $$
(5)
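As a concrete illustration of Eq. (5) and of the crest/trough pattern described above, the following sketch removes gravity from the measured acceleration and checks for a single 'tap' in a short window of samples. The rotation convention, the gravity magnitude, and the thresholds are assumptions, not values from the paper.

```python
import numpy as np

G = np.array([0.0, 0.0, 9.81])   # assumed gravity in the global frame (m/s^2)

def rotate(q, v):
    """Rotate vector v by unit quaternion q = (x, y, z, w)."""
    x, y, z, w = q
    u = np.array([x, y, z])
    return v + 2.0 * np.cross(u, np.cross(u, v) + w * v)

def linear_acceleration(q_imu, a_imu):
    """Eq. (5) sketch: express the measured acceleration in the global frame
    and subtract gravity to obtain the linear acceleration."""
    return rotate(q_imu, np.asarray(a_imu)) - G

def detect_tap(gyro_y, acc_z, gyro_thresh=1.0, acc_thresh=2.0):
    """Illustrative 'tap' detector over a window of samples (thresholds are
    assumptions): a positive crest followed by a negative trough on the y-axis
    angular velocity, and a negative trough followed by a positive crest on the
    z-axis linear acceleration."""
    gyro_y, acc_z = np.asarray(gyro_y), np.asarray(acc_z)
    crest_then_trough = (np.max(gyro_y) > gyro_thresh
                         and np.min(gyro_y) < -gyro_thresh
                         and np.argmax(gyro_y) < np.argmin(gyro_y))
    trough_then_crest = (np.min(acc_z) < -acc_thresh
                         and np.max(acc_z) > acc_thresh
                         and np.argmin(acc_z) < np.argmax(acc_z))
    return crest_then_trough and trough_then_crest
```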

3 Experiments and Results

Accuracy evaluations of the proposed method were conducted for tracking the index finger movements and for recognizing the finger tap gestures. To measure the accuracy of tracking the index finger movements, we displayed circle, square, and triangle pictures, and the participants drew along the displayed shapes using their index finger. To measure the accuracy of recognizing the finger tap gestures, participants were instructed to perform the finger tap gestures on a rectangle divided into 9 areas. After each experiment, we interviewed each participant.

Ten participants took part in the experiments. We explained how to use the proposed method for 5 min and then proceeded with the experiment. The experiment was conducted on a desktop PC with an Intel i5 4690 CPU, 8 GB RAM, and an NVIDIA GTX 960 graphics card, and we used the 'MyAHRS+' sensor as the 9-axis IMU attached to the index fingernail.

Figure 4 shows the accuracy results. The average accuracy of tracking the index finger movements is 93.19%. In detail, the tracking accuracy for the rectangle movement is 94.85% (standard deviation (stdev): 2.66, standard error (se): 0.84), for the circle movement 91.99% (stdev: 4.29, se: 1.36), and for the triangle movement 92.72% (stdev: 2.07, se: 0.65). After the experiment, 3 participants gave feedback that drawing a diagonal or curved line with the proposed method was slightly harder than drawing a straight line.

Fig. 4. Results of the evaluations. (A) Accuracy of recognizing the finger tap gestures. (B) Accuracy of tracking the finger movements.

The average accuracy of recognizing the finger tap gestures is 89.81%. In detail, the recognition accuracy of the 'tap' gesture is 91.11% (stdev: 1.67, se: 0.56), and that of the 'double tap' gesture is 88.52% (stdev: 2.42, se: 0.80). After the experiment, 7 participants gave feedback that it was difficult to perform the finger tap gestures in the lower part of the 9 divided areas.

4 Conclusion

We proposed AnywhereTouch, which tracks finger movements and recognizes finger tap gestures on arbitrary surfaces using a nail-mounted IMU. The experimental results demonstrated the accuracy of the proposed method. The proposed method has great potential for various VR and AR applications requiring socially acceptable interaction in public spaces. In future work, we will investigate how to easily manipulate virtual objects displayed on an HMD in a public space based on the proposed method.