
1 Introduction

Human Computer Interaction (HCI) is evolving continuously. Nowadays the “third wave” of HCI studies is dominated by user experience (UX), virtual environments and brain-machine or gesture-control interaction [4]. UX is concerned with how people interact with technologies and how these interactions are shaped by human values, contextual constraints, and society.

New technologies and interactions demand new methods to be properly understood and holistically evaluated. Authors like Sweetser and Wyeth [21] claimed that understanding game usability had priority over understanding game enjoyment, but nowadays studying the usability dimension alone would leave important factors out of the analysis. Although the UX field has offered a series of methods and tools, there are still other aspects of the interaction that need to be explored. Among these is user engagement, which aims to offer a more robust measure of experience [14]. According to Lalmas, O’Brien and Yom-Tov [10], user engagement is the emotional, cognitive, and behavioural experience of a user with a technological resource that exists at any point in time and over time, and that makes “the user want to be there” (Jones n.d.).

User engagement cannot replace user experience, or vice versa, and there is no direct correlation between them. The aim of interface design is to generate engaging experiences, although there is no clear path to guide the design process. User engagement has been recognized as a key factor in understanding user behaviour and overall efficacy in video games [6]; a positive video game experience provides an engaging and enjoyable time [5] that cannot be understood only in terms of usability analysis. There is a demand for a deep understanding of engagement and of what makes a game enjoyable to users [23], since users demand not only functional but also engaging applications.

Among the interfaces called upon to generate a more engaging game experience are Virtual Reality (VR) and Head-Mounted Displays (HMDs), which aim at full immersion of the user [3, 23]; however, it is also important to consider the way a user interacts with the VR game. One possibility is the Leap Motion Controller (LMC), which is designed to track hand movement inside a small workspace and is capable of fine movement tracking. It shows good performance in small virtual environments compared to the Kinect, which only provides the means to detect body gestures [7].

As a result, virtual game designers seek to generate engaging experiences, but they lack precise knowledge about how to produce and evaluate them. Most of the options available at the moment are questionnaires, and few of them focus on video games. One example is the Flow State Scale (FSS), which was originally developed for physical activities but is now being used in video games. There is therefore a need to further explore self-report methods in order to have a clear and agile instrument that can be used in the context of a video game.

O’Brien and Toms [15] have been developing the User Engagement Scale (UES), an interesting tool to report user engagement. Their approach explains engagement as a combination of six factors: Focused Attention (FA), Perceived Usability (PU), Aesthetics (AE), Endurability (EN), Novelty (NO) and Felt Involvement (FI) [16]. The scale consists of a 31-item self-report questionnaire that offers an overall evaluation of the experience as well as of user perceptions of each attribute. Even though the scale has been validated, mostly in shopping, social networking and news websites [2, 13, 14, 16], few studies have validated it as a helpful instrument for video game development [23].

The aim of this study is to contribute to the current knowledge of user engagement in games in the context of a virtual reality environment with an HMD-LMC interface, by focusing on validating the UES for this kind of system.

2 Methodology

2.1 Participants

The sample consisted of 164 participants who attended “Jalisco Campus Party 2016”, an international high-technology event held in Guadalajara, Mexico, from June 29 to July 3, which brought together young talent from the technological and entrepreneurial community. The survey was administered to attendants at the Universidad Panamericana stand who chose to play the game. The age range of the participants was 16 to 35 years (median of 21.7 years, standard deviation of 4.9). The sample was composed of 36 women and 124 men.

None of the participants had played the game before. It is also important to mention that, since the video game is controlled through a hand gesture interface (LMC), which is still little used in the Mexican technological community, most of the participants did not have previous experience with this controller.

2.2 Stimuli: Virtual Reality Game

The participants played a logic/ability virtual reality game named Fixer [19].

The game included a virtual reality environment with a hand-controlled interface. The goal of the game is to assemble two different puzzles into one (Fig. 1). Each puzzle was composed of small cubes arranged to form half of a bigger cube. Only four hand movements are recognized: vertical and horizontal cube rotations, and push and pull actions on the cubes. The cube is rotated by moving both hands in opposite directions while making a pinching gesture with the thumb and index finger. The pull and push moves are accomplished by a natural movement of pushing or pulling the cube with the extended hand. The gamers had one minute to play the game, with an unlimited number of attempts. If the two puzzle halves were not correctly aligned, the assembly failed and the user needed to separate the pieces and start again; after the pieces were correctly joined, a new and more complex level appeared automatically. There were 16 different levels in the first version of the game.

Fig. 1. Screenshot of the Fixer game, developed by UP Media Lab.
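
The control scheme described above can be summarized in a few lines. The sketch below is a purely hypothetical illustration of how the four actions could be discriminated from LMC hand data; the field names, thresholds and axis conventions are assumptions and do not come from the Fixer source code.

```python
# Illustrative logic (not the actual Fixer implementation) for mapping LMC hand data
# to the four supported actions. The per-hand pinch strength and palm velocity fields
# mirror the Leap Motion convention, but the threshold values are invented.

PINCH_THRESHOLD = 0.8      # hypothetical: both hands must pinch this strongly to rotate
PUSH_PULL_SPEED = 150.0    # hypothetical: mm/s along the depth (z) axis for push/pull

def classify_gesture(hands):
    """Return one of 'rotate_h', 'rotate_v', 'push', 'pull', or None."""
    if len(hands) == 2 and all(h["pinch_strength"] > PINCH_THRESHOLD for h in hands):
        # Two pinching hands moving in opposite directions rotate the cube; the dominant
        # axis of relative motion decides horizontal vs. vertical rotation.
        dx = hands[0]["velocity"][0] - hands[1]["velocity"][0]
        dy = hands[0]["velocity"][1] - hands[1]["velocity"][1]
        return "rotate_h" if abs(dx) >= abs(dy) else "rotate_v"
    if len(hands) == 1:
        vz = hands[0]["velocity"][2]  # depth axis: toward / away from the user
        if vz < -PUSH_PULL_SPEED:
            return "push"
        if vz > PUSH_PULL_SPEED:
            return "pull"
    return None

# Example: two pinching hands moving apart horizontally -> horizontal rotation.
hands = [{"pinch_strength": 0.90, "velocity": (120.0, -10.0, 0.0)},
         {"pinch_strength": 0.95, "velocity": (-110.0, 15.0, 0.0)}]
print(classify_gesture(hands))  # "rotate_h"
```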

2.3 Hardware

The virtual reality environment consisted of a Head-Mounted Display of the brand Oculus Rift 2. The hand controller interface was implemented with a Leap Motion Controller. This device employs two infrared cameras and three infrared emitters; according to the manufacturer, it has a \(100^{\circ}\) field of view and a refresh rate of 75 Hz [20]. Both devices were connected through USB ports to a desktop computer. The gamer sat in front of the computer; the depth distance from the camera to the user was calibrated each time a new gamer took place in front of the computer.

2.4 User Engagement Scale

The User Engagement Scale items were rated on a 7-point Likert scale. The scale was based on the adaptation by Wiebe, Lamb, Hardy and Sharek [22], which was modified to fit the video game context. From that base, the scale was translated into Spanish and presented on paper.

Table 1 shows the original scale, its adaptation to the game experience, and the Spanish version used in this study. The engagement factors related to each item are also included. Each item was named with the letters corresponding to its subscale and a consecutive number. Some items were originally formulated negatively; their scores were reversed before the analysis, and these cases are marked with a z in the name.

Table 1. User Engagement Scale adaptation.
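
The reverse-coding of the negatively worded items on a 7-point Likert scale is a one-line transformation. The sketch below is only illustrative: the responses are assumed to sit in a pandas DataFrame, and the column names and values are invented.

```python
import pandas as pd

# Hypothetical responses: 7-point Likert items; negatively worded items carry a "z" suffix.
responses = pd.DataFrame({
    "FA1":  [6, 5, 7],
    "PU1z": [2, 3, 1],   # negatively worded: a low raw score reflects a good experience
})

# Reverse-code every "z" item on a 1..7 scale: 1 <-> 7, 2 <-> 6, ...
reversed_items = [c for c in responses.columns if c.endswith("z")]
responses[reversed_items] = 8 - responses[reversed_items]
print(responses)
```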

2.5 Test Procedure

Participants were invited to play the game. Before the HMD was placed, a short verbal introduction to the game was given. At the beginning of the game, a screen with instructions was presented for a couple of seconds. During the game, no help was provided, but a timer and the level indicator were displayed above the puzzle cube on the game screen.

Following flow studies [10], the questionnaire was administered at the end of the game; otherwise, negative user engagement ratings could result from the interruption rather than from the actual experience. So, when the game was over, the participants left the seat and orally answered some demographic questions. Since people tend to give more positive answers when the questionnaire is completed on an electronic device [9], a paper-based instrument was selected for this study. The participants were required to respond to each of the UES items by marking a number on the scale; there was no time limit, and they could ask anything related to the scale, but no description of the items was given.

At the end of the event, the surveys were captured in a database, and the results were analyzed with the SPSS software.

2.6 Data Analysis

The data were analyzed to determine whether the UES could be used to measure engagement in a virtual reality video game with an LMC interface. The reliability analysis followed the steps proposed by O’Brien and Toms [16], which include examining the reliability of the subscales according to the guidelines of DeVellis [8] and applying a factor analysis to the data.
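
As a rough illustration of the reliability step (the actual analysis was run in SPSS), Cronbach’s alpha for a subscale can be computed directly from its item scores. The data in the sketch below are invented toy values.

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha for an (n_respondents x n_items) score matrix."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1).sum()   # sum of item variances
    total_var = items.sum(axis=1).var(ddof=1)     # variance of the summed scale
    return (k / (k - 1)) * (1 - item_vars / total_var)

# Toy example: three respondents answering a three-item subscale on a 7-point scale.
subscale = np.array([[6, 7, 6],
                     [4, 5, 4],
                     [2, 3, 2]])
print(round(cronbach_alpha(subscale), 3))
```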

3 Results

3.1 Reliability Analysis

Table 2 presents the descriptive statistics and the correlations between the subscales as defined by O’Brien and Toms [16]. Reliability was analyzed using Cronbach’s alpha coefficient, which is also included in Table 2. According to DeVellis’ guidelines [8], a Cronbach’s alpha coefficient between 0.7 and 0.8 is respectable, and between 0.8 and 0.9 is very good.

Table 2. General statistics of the sample

Table 2 also shows the correlations between the subscales. The EN, NO and FI subscales presented the highest correlations, with values over 0.7, suggesting some overlap among these subscales in the factor analysis. Another six moderate correlations were detected, with values between 0.4 and 0.6. The PU subscale exhibited the lowest correlations with the other subscales.

3.2 Factor Analysis

As mentioned above, the UES is originally divided into six subscales: Novelty, Focused Attention, Aesthetics, Felt Involvement, Endurability, and Perceived Usability.

As the objective of this study was to validate the UES for video games with an HMD-LMC interface, a factor analysis was applied to determine whether the subscales were still differentiable. Oblique rotation with Direct Oblimin was selected in SPSS [18], since oblique rotations are preferred in these generalizability studies. In an oblique rotation the factors are not assumed to be orthogonal, so the scales are not considered completely independent of each other. All the subscales are oriented to measure the engagement variable, so they are expected to present some correlation with each other.
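
The same kind of extraction can be reproduced outside SPSS. A minimal sketch with the Python factor_analyzer package is shown below, assuming the item responses sit in a DataFrame with one (reverse-coded) column per item and one row per participant; the file name is hypothetical, and this study itself used SPSS.

```python
import pandas as pd
from factor_analyzer import FactorAnalyzer  # assumes the factor_analyzer package is installed

# Hypothetical input: one column per UES item (reverse-coded where needed), one row per participant.
ues = pd.read_csv("ues_responses.csv")

fa = FactorAnalyzer(n_factors=6, rotation="oblimin")  # oblique (Direct Oblimin) rotation
fa.fit(ues)

# Pattern matrix of item loadings, analogous in spirit to Table 3.
loadings = pd.DataFrame(fa.loadings_, index=ues.columns)
print(loadings.round(2))
```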

The resulting component matrix of the factor analysis is presented in Table 3. The analysis converged after 25 iterations, and six factors were detected. The items of the NO and FI subscales were distributed across different factors.

Table 3. Factor analysis of the data

In the case of NO, two items (NO1 and NO3) loaded on factor 1 together with the EN and FI items, while the remaining item (NO2) loaded on factor 5 with the AE items. The items FI2 and FI3 also loaded on factor 1, while FI1 loaded on factor 3 with the FA items. Meanwhile, the items of the PU subscale loaded on factors 2, 4 and 6. The distribution of the PU subscale items is analyzed in more detail in the discussion section.

O’Brien and Toms [18] presented a comparative table of different studies, in which the AE, FA and PU subscales showed a stable behaviour across five studies. Meanwhile, in just three studies the FI, EN, and NO subscales loaded on the same factor; in the other studies, these subscales kept their structure and their items loaded on different factors. The merging of these three subscales was also present in this study. Table 4 shows the distribution of the original items proposed by O’Brien and Toms [16], the rearrangement found in three other studies, and the result of this study. All these studies included the merging of the EN, FI and NO items into one factor.

Table 4. Across studies comparison

Five principal factor groups were detected in this study. In addition, the NO2 item loaded with the AE subscale, whereas in the other studies it was associated with the factor that merged FI, EN and NO. However, this item also presented a loading high enough to associate it with the merged factor (Table 3). Three studies eliminated three items from the total scale, reducing the number of items from 31 to 28; some of the eliminated items loaded on the merged factor in our analysis. In all four studies, the item EN3z loaded on the same factor as the PU subscale. In all studies except the Wiki-Search one, the FI1 item loaded with the FA subscale.

Table 5 shows the groups generated by the factor analysis. GF6 is an alternative proposal that was obtained after different trials.

Table 5. Factors detected by FA

4 Discussion

4.1 General UES Performance

During the data collection of this study, the UES was overall well accepted by the people who answered the questionnaire. However, there were some complaints about the length of the test, and a couple of items were misunderstood. The latter point should be taken seriously, inasmuch as communication and consistency in defining constructs affect measurement [9]. Other studies that have discussed the duration of the test [2, 17, 18] have determined that it could be reduced to 19 items in the case of information searching [17], or even to 10 items, as used by Levesque et al. [11] when a haptic interface was tested. The high correlation between the FA and AE items suggests that it is possible to reduce the number of items in these factors without affecting the final measurement.

4.2 Reliability Analysis

The reliability of the scale was examined using Cronbach’s alpha coefficient, which measures the internal consistency of each subscale. The overall reliability of the scale was 0.860. All the factors presented an acceptable degree of reliability, with Cronbach’s alpha coefficients over 0.7. These values suggest that the UES results can be used with good confidence for an initial test of engagement in video games with an HMD and LMC.

4.3 Factor Analysis

The factor analysis produced a different construct structure, which is an important issue to discuss. It generated six factors that differed from the original scale [16]. A high similarity in the scale construction was observed with the Wiebe study [22], although, according to the factor analysis, the structure was divided into three factors in this study. The principal difference between the two studies was the interface used: while the video game Block Walk used a keyboard controller, Fixer used the LMC. In general, there is high consistency in the EN, FI and NO items that loaded on the same factor across all the studies; these items were drawn from EN (1, 2, 4, 5), NO (1, 3) and FI (2, 3). The result of the factor analysis agrees with [13], which presents two other case studies where the merging of the EN, NO and FI subscales is present and suggests redefining the UES as a four-factor experimental scale with FA, AE, PU, and EN/FI/NO subscales.

Another difference is related to the behaviour of Perceived Usability. All the studies consulted grouped all the PU items together, but in this case the result was divided into three groups: the first contained PU1, PU3, PU4 and PU5 (coefficient of 0.818), the second contained PU2 and PU8, and the third PU6 and PU7. After further regrouping trials, two possibilities were found. The first was to keep the first group and join the last two groups into one, with an alpha coefficient of 0.544. The second option was to put all the PU items together, as previous studies have done, but then the coefficient was reduced to 0.808 (Table 5). The first option was considered the best, since its coefficient was higher than that of the second option.
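
The comparison between these regrouping options amounts to recomputing alpha over different item subsets. The sketch below is hypothetical: the file name and column names are placeholders, and the alpha values quoted in the comments are the ones reported above.

```python
import numpy as np
import pandas as pd

def cronbach_alpha(items):
    """Cronbach's alpha for an (n x k) item-score matrix (same formula as in Sect. 2.6)."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    return k / (k - 1) * (1 - items.var(0, ddof=1).sum() / items.sum(1).var(ddof=1))

ues = pd.read_csv("ues_responses.csv")            # hypothetical: one reverse-coded column per item
group_satisfaction = ["PU1", "PU3", "PU4", "PU5"]  # reported alpha: 0.818
group_use = ["PU2", "PU6", "PU7", "PU8"]           # the two small groups merged: 0.544
for name, cols in [("satisfaction items", group_satisfaction),
                   ("use items", group_use),
                   ("all PU items", group_satisfaction + group_use)]:  # all PU together: 0.808
    print(name, round(cronbach_alpha(ues[cols]), 3))
```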

The four items that were put together in this new group relate to finding the game confusing to use (PU2), demanding (PU6), feeling in control (PU7) and not being able to do things (PU8). These items could be related to a particular aspect of usability. The general definition (ISO/IEC 9126) states that usability is the capacity of an interface to be understood, learned and used, and to be attractive. Accordingly, this group seems to relate to the understanding and use of the game. In contrast, the remaining group has items related to feeling frustrated (PU1), annoyed (PU3), discouraged (PU4) and mentally taxed (PU5), which seem to relate to the satisfaction/attraction dimension of usability.

This difference has not previously been shown in any other study, so there must be a particular factor causing it. There are two main possibilities: the first is the HMD-LMC interface [1], and the other is the duration of the stimulus. No other study has tested the UES in virtual reality environments, which can be a significant difference from the other studies. Even though the instrument demonstrated reliability in this context, further research must be conducted to identify which factors of the scale are more related to the virtual reality stimulus. Also, the duration of this test was significantly shorter than in other studies, which normally affects the efficiency dimension of usability, so future research with different experience durations is also suggested.

5 Conclusion

The reliability analysis showed respectable to very good alpha coefficients for all the subscales; these values show that the scale could be applied as is to video games with a virtual reality environment and hand-controlled interfaces.

After the factor analysis, the items were rearranged into five factors (Table 5). The reliability of each new factor was calculated: the Cronbach’s alpha coefficient increased for the first four factors to values over 0.8, while the fifth factor dropped to the acceptable value of 0.6. The factor merging the EN/NO/FI subscales increased from an average of 0.747 to 0.915, the second factor, related to PU, went from 0.787 to 0.818, the third factor, related to the FA items, went from 0.855 to 0.870, and the fourth factor, related to AE, went from 0.865 to 0.875. The items of the fifth factor are also related to the PU subscale, like factor two, but their behaviour is different.

The factor arrangement calculated in the factor analysis is consistent with the scale structure presented by Wiebe [22], where the UES was also applied in a video game context. This similarity in structure reinforces the scale’s validity in the video game context.

The results suggest that the HMD-LMC interface and the duration of the stimulus could be the cause of the rearrangement of the Perceived Usability factor, but further analysis is needed to identify the particular elements causing that phenomenon.

Finally, the results for the Perceived Usability factor suggest that it could be divided into two sub-dimensions, one related to the direct use of the game and the other associated with emotional or satisfaction aspects. Also, this last group (PU1z, PU3z, PU4z, and PU5z) seems to present redundancy among its items, so one of them could probably be removed to reduce the overall number of questions.

This study found that, in general, the UES can generate reliable results about engagement in video games with an HMD-LMC interface. However, it is desirable to modify some items of the UES to include new elements present in this kind of system. It is also necessary to carry out more studies on how new technologies such as the HMD and LMC could influence engagement in video games and software in general.