
1 Introduction

With increasing reliance on ever-more complex technology, it is important to revisit models of technology acceptance. [1] recently reviewed fourteen models and variants. Although his literature review identifies the Unified Theory of Acceptance and Use of Technology (UTAUT) [2], the Technology Acceptance Model (TAM) [3] and the Diffusion of Innovations (DOI) [4] models as the most frequently referred to, it makes no attempt at empirical or other evaluation. [5] initially compared TAM with the socio-psychological Theory of Reasoned Action (TRA) [6]. Based on quantitative self-reported responses to a relatively simple technology among university students, that initial comparison found Perceived usefulness and Perceived ease-of-use, both constructs in the original TAM, together with their influence on Behavioral intention (common to both TAM and TRA), to be the most powerful predictors of technology adoption [5]. Not only does the restricted demographic of participants cast doubt on the generalisability of the results; the empirical evaluation was also based solely on restricted-item quantitative instruments.

Recognising such procedural limitations, [2] implemented a rigorous quantitative evaluation of eight acceptance models to derive the UTAUT model as an extended and unified version of TAM. Generalisability is improved, they assert, by using employees rather than just students; timing should address the potential for future adoption rather than analysing an historical intention to use; and voluntary as opposed to mandatory adoption is more illuminating. They do not, however, explicitly look at the type of technology or its complexity, though they identify this as a significant factor. Instead, they introduce contextual variables and moderating factors. These may resonate with the social norms and subjective factors alluded to in TRA and possibly the Theory of Planned Behaviour (TPB) [7, 8]. The problem remains, though, that evaluation and validation are still based on quantitative survey data. Attitude and intention are known to relate to affect [9]. Further, a reported intention to act in a certain way does not necessarily mean that participants would actually behave in that way [10]. Indeed, Chuttur [11] concludes that the evaluation of acceptance models (in his case TAM) may have less relevance and practical application to information systems than claimed. Such evaluation, he argues, needs greater rigour and relevance.

In parallel to the literature on technology acceptance, some headway has been made in combining attitudinal, contextual, technological and subjective factors, at least for trust in technology. McKnight, Thatcher and their colleagues, for instance, evaluated trust constructs in both adoption and post-adoption studies [12, 13]. Their experimental cohort is once again confined to students. Pickering, Engen and Walland [14] extended that model with increased theoretical validation, though based on literature review only. Elsewhere, technology adoption has been associated with human adaptation to technology defects [15, 16]. Technology may also be embraced in support of human cognition, not simply via task automation [17, 18]. Recent application of affordance theory to information systems [19] refocuses attention on subjective responses in the acceptance and adoption process. It may well be that affordance perception has a role to play in the decision to adopt a technology.

Evaluation of acceptance models to date has therefore been quantitative and largely theoretical. Constructs such as social norms and emotional response tend to be ignored. Further, little attention has been paid to responses to complex technologies in a work environment. This exploratory study is a first step towards addressing this problem. Two separate cohorts of IT professionals operating in different countries were asked to discuss their experience when evaluating a modelling tool. We compare quantitative and qualitative measures of their responses to the tool in an attempt to provide a fuller indication of their intention to adopt the technology.

2 Method

To investigate usability, we initially used a standard quantitative instrument, the System Usability Scale (SUS). In addition, we monitored behaviours and performance qualitatively in a limited ethnographic study. The approach is described in more detail below.

2.1 Research Questions

For the purpose of the studies reported here, we seek to address two research questions:

  • RQ1: can qualitative methods help re-evaluate existing approaches to technology acceptance?

  • RQ2: how can qualitative methods improve existing quantitative approaches to technology acceptance?

SUS scores were elicited and analysed to provide some measure of perceived ease-of-use. These scores were compared with the outcome of the ethnographic study.

2.2 Design

We focus initially on two constructs from TAM, namely Perceived usefulness and Perceived ease-of-use. As highlighted above, they relate directly to similar constructs in the more ambitious and rigorously derived UTAUT. For the ease-of-use construct, we rely on a standard quantitative instrument, the original System Usability Scale (SUS) [20], as summarised in Table 1 below. Participants score each item on a 5-point Likert scale from Strongly Agree to Strongly Disagree. Even-numbered items are reverse coded to maintain attention.

Table 1. The original 10-item system usability scale

  1. I think that I would like to use this system frequently.
  2. I found the system unnecessarily complex.
  3. I thought the system was easy to use.
  4. I think that I would need the support of a technical person to be able to use this system.
  5. I found the various functions in this system were well integrated.
  6. I thought there was too much inconsistency in this system.
  7. I would imagine that most people would learn to use this system very quickly.
  8. I found the system very cumbersome to use.
  9. I felt very confident using the system.
  10. I needed to learn a lot of things before I could get going on this system.

Although [21] claim an 8-item scale gives sufficient reliability, and [22] cautions against its use for non-native speakers, we chose the original instrument for compatibility with other similar studies. This was complemented by a small-scale ethnographic study [23] of tool use for the qualitative part of the study. Researcher notes from the ethnographic study were analysed with a simple thematic approach [24]. Following [2], we focus on a pre-adoption scenario to avoid any routinized experience as well as adaptive behaviours. Further, in this exploratory research we focus on a specific IT design-time technology [25, 26] targeted at IT professionals developing and supporting systems in the healthcare environment. This represents an important security-sensitive environment where risk, especially to patient care and data, must be minimised, not least as a function of privacy by design [27,28,29]. At this time, the technology is under development within a research environment, though it is being evaluated as a proof-of-concept by various commercial partners.

2.3 Participants

Two small cohorts of IT professionals were recruited, one in Italy, the other in Spain. All participants had between 5 and 20 years' experience in their field, covering multiple specific roles including design, development and maintenance. Five Italian engineers and developers were recruited from the Ospedale San Raffaele (OSR) in Milan. Each of the five participants was from a different area of the hospital's IT department. They covered the following disciplines: Application Development & Management; Service Desk; Privacy, Procurement & Control; CRM, Business Intelligence & Process; and Enabling Services & Security. Four self-selecting Spanish participants were from the Biocruces Bizkaia Health Research Institute: a bioinformatics technician, a computer science engineer, a software developer and a database manager. As such, they represent different disciplines associated with the successful development and deployment of IT services. They need to work collaboratively such that individual requirements from each area can be represented and included. All participants in both countries were fully informed of the intention to validate a security modelling tool which provides support in identifying potential risks to individual components within a design as well as to the overall configuration. All agreed to participate.

2.4 Data Collection Method

The test context involved a scenario familiar to participants. Each cohort interacted with the tool in their respective environment, working independently of the other: the Spanish participants worked together as a single team; the Italian participants worked in two pairs, with the fifth participant working individually. After a brief introduction to the technology, participants were given a relevant task. They were asked to design a human-machine network similar to what they would expect when implementing a specific patient-data sharing service in their environment [25, 26]. The researchers in the two locations (the two co-authors MNJ and BLM) observed participants as they worked on the task, occasionally interacting with them if specific technology questions came up. Although limited as an ethnographic study in the traditional sense, this was nevertheless deemed appropriate at this stage to focus on how those who might use the technology would behave as part of a typical task for their professional role. Participants were encouraged to "think aloud" and verbalise their reactions to the technology. Following their initial experience with the tool on the task they were presented with, participants were asked to complete the SUS questionnaire [20]. The experimental set-up was thus almost identical in the two locations, with the exception of the groupings: in Italy, participants worked in pairs or alone, whereas in Spain they worked together throughout. The order of evaluation (task and ethnographic monitoring followed by SUS) may be significant, as discussed below.

2.5 Data Analysis Method

The SUS responses underwent the recommended scoring treatment to provide an overall measure of usability [20]. This was taken as a quantitative measure of the usability (or ease-of-use) of the tool being investigated.
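Concretely, the standard treatment [20] converts each raw 1–5 response into a 0–4 contribution (odd-numbered items contribute the score minus 1; even-numbered, reverse-coded items contribute 5 minus the score) and scales the sum by 2.5 to give a score out of 100. A minimal sketch of that computation (function and variable names are ours, for illustration only):

```python
def sus_score(responses):
    """Compute a System Usability Scale score from ten raw 1-5 responses.

    Following Brooke's scoring [20]: odd-numbered items contribute
    (score - 1), even-numbered (reverse-coded) items contribute
    (5 - score); the sum of contributions is scaled by 2.5 to 0-100.
    """
    if len(responses) != 10 or not all(1 <= r <= 5 for r in responses):
        raise ValueError("expected ten responses on a 1-5 scale")
    contributions = [
        (r - 1) if i % 2 == 0 else (5 - r)  # index 0, 2, ... = items 1, 3, ...
        for i, r in enumerate(responses)
    ]
    return 2.5 * sum(contributions)

# Illustrative responses only, not data from either cohort:
print(sus_score([3, 2, 4, 2, 3, 3, 4, 3, 3, 2]))  # 62.5
```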

Research notes made during the sessions were reviewed to identify common themes and patterns in how individual participants reported their experience with the technology. The primary constructs of Perceived ease-of-use and Perceived usefulness from the original TAM served as initial codes, with the addition of External variables, which was defined as:

“… provid[ing] the bridge between the internal beliefs, attitudes and intentions represented in TAM and the various individual differences, situational constraints and managerially controllable interventions impinging on behavior” [5, p. 988]

This latter code was used simply as a catchall for any other observations that might be deemed relevant. In this way, we hoped to capture other factors such as the related Social Influence and Facilitating Conditions proposed by [2]. The codebook is sketched below; the results are presented in the following section.
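As a schematic illustration of the codebook (the construct definitions paraphrase TAM [5] and the catchall above; the abbreviations and the data structure itself are ours, not artefacts of the study):

```python
# Initial deductive codes applied to the researcher notes. The first
# two follow the original TAM constructs [5]; the third is the
# catchall for External variables described above.
INITIAL_CODES = {
    "PEOU": "Perceived ease-of-use: the effort involved in using the tool",
    "PU": "Perceived usefulness: expected benefit to the participant's role",
    "EXT": ("External variables: situational, social or managerial "
            "factors outside the two core TAM constructs"),
}
```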

3 Analysis

3.1 Quantitative Results

Initial experience with the modeller tool appears to be poor, with an average SUS score of 41 for the Italian cohort and 52.5 for the Spaniards, well below the commonly cited target score of 68. Overall, the average across both cohorts was 46.1. The range of individual scores, however, went from 15 to 82.5. This would predict that the modeller would not be adopted, since ease-of-use as measured by the SUS instrument is so low. The item highlighted as problematic for non-native speakers, namely I found the system very cumbersome to use [22], was scored at 1.4 on average by the Italian engineers, and double that, 2.8, by the Spaniards. The same pattern holds for a related item, I needed to learn a lot of things before I could get going on this system; and almost the same for I think that I would need the support of a technical person to be able to use this system (scored on average at 1.4 by the Italians and 2.5 by the Spaniards). There appears to be some consistency in the scores, both internally (i.e., participants are consistent in their own judgements) and externally (i.e., the two cohorts report similar results).
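For transparency, the overall figure of 46.1 follows from weighting each cohort mean by its size (five Italian and four Spanish participants):

\[
\frac{5 \times 41 + 4 \times 52.5}{5 + 4} = \frac{415}{9} \approx 46.1
\]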

One question is what the SUS is actually measuring. [2], of course, replace the TAM labels Perceived ease-of-use and Perceived usefulness with Effort expectancy and Performance expectancy. It may well be that the users in these cohorts, as experienced professionals, already have an expectation of how technology should be presented, and that this may have influenced their perceptions. If the effort demanded exceeds what current user interfaces for office tools have led users to expect, Effort expectancy might therefore obscure any real Performance expectancy. Secondly, it should be remembered that individual participants responded independently to the SUS questionnaire: they did not seek any consensus amongst themselves, but simply gave their own individual responses.

3.2 Italian Ethnographic Study

Direct observation of participant interaction with the tool gave a different, more nuanced view. The Italian cohort reported multiple issues and concerns with Perceived ease-of-use. Overall, the researcher noted 27 specific problems, ranging from responsiveness to iconography, across twelve separate aspects of the user interface. For example, the modeller requires individual components ("assets") within the model to be connected appropriately. So, a connection between a browser and a host machine would identify that the browser runs on the host; a connection between two servers would support data transfer between them. But participants did not find the process of connecting assets obvious. They felt that there needed to be some description, either as a manual or perhaps one or more tooltips, to explain the rules and logic behind connections.
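To make that concrete, the behaviour participants wanted documented amounts to a set of typed connection rules between asset types. A purely hypothetical sketch (the asset and relation names are ours; the modeller's actual rule set is not documented here):

```python
# Hypothetical typed connection rules of the kind participants wanted
# explained; the vocabulary is illustrative, not the modeller's own.
ALLOWED_CONNECTIONS = {
    ("browser", "host"): "runs_on",         # a browser runs on a host machine
    ("server", "server"): "data_transfer",  # servers exchange data
}

def connect(source_type, target_type):
    """Return the relation implied by connecting two asset types, if any."""
    relation = ALLOWED_CONNECTIONS.get((source_type, target_type))
    if relation is None:
        raise ValueError(
            f"no rule allows connecting {source_type} to {target_type}")
    return relation

print(connect("browser", "host"))  # runs_on
```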

The modeller user interface also provides a canvas on which to draw a model, along with a palette of objects to drag and drop onto the canvas. Participants complained that they could not find basic functions like edit and save. Even so, one participant supported the rationale behind the tool. He felt that the idea of such a tool to help identify risks and mitigation strategies was not wrong per se, but concluded, significantly, that it was the implementation that was at fault.

For these participants, then, without a positive response to usability, TAM would predict negative effects on Perceived usefulness as well as on Behavioral intention; UTAUT would similarly predict a negative effect on intention to use. Not surprisingly, therefore, observations of the Italian participants concerning Perceived usefulness suggested that the security modeller tool as currently implemented seems to make simple operations more obscure. There seemed to be a general feeling that other tools would achieve the same results, but better. Clearly, therefore, an implicit comparison was being made with existing technology. This prior knowledge and experience would lead to expectations and obscure the potential usefulness of the modeller.

The External variables in TAM, or Facilitating conditions in UTAUT [2], for the Italian cohort include experience with other, competing offerings. As such, we can readily explain the poor SUS scores. To evaluate and assess the real potential of a prototype technology, the focus is really on searching out novel affordances which could motivate further investigation. Without a simple and intuitive user interface, the perception of those affordances is impossible [19]. Overall, only one of the five Italian participants reported any desire to continue investigating the technology.

For the Italian cohort, therefore, the SUS score is entirely consistent with a qualitative analysis of the discussion with participants. Evidence was found of specific aspects of Perceived ease-of-use which seem to militate against any exploration of ultimate technology adoption. This ties in with both the TAM and UTAUT models of technology acceptance.

3.3 Spanish Ethnographic Study

For the Spanish cohort, using the same technology provoked a very different response. Indeed, despite the low SUS scores (ranging from 37.5 to 65, with an average of 52.5), participants reported positive reactions to the user interface. During the ethnographic study, they stated that they had found the user interface intuitive and easy to use. Are they simply struggling with terminology [22] or are there other factors which the SUS score does not capture?

The Italian cohort, working in pairs or alone, reported multiple problems (27 individual issues across 12 areas). By contrast, the Spanish participants reported specific things which they found useful. For example, irrespective of overall system performance, they were pleased that they could develop an architecture relatively quickly (it took them about ten minutes). They also thought that results were provided in a format that made it easier to subdivide risks into different types, which would presumably make their respective jobs much easier. One participant even reported that they found the user interface “friendly”: specifically, the modeller tool presented threats directly with a description, making it easier to identify potential outcomes and to develop and implement appropriate control strategies. For the Spaniards, therefore, Perceived ease-of-use was poorly reflected in the SUS results. But the discussions they had among themselves moderated their views somewhat. In the context of Perceived usefulness, they seem to have re-assessed ease-of-use.

Their more positive attitude, although somewhat at odds with the SUS score, therefore leads to an increased perception of usefulness: three of the four participants stated that they could see specific opportunities for deriving benefit from the tool in their respective roles. For example, the database manager wanted to use the tool to evaluate some of the databases hosted in the hospital; the computer science engineer thought it could be used profitably each time the server architecture is changed; and the bioinformatics technician thought it particularly well adapted to structural analysis (access to structural alignment services). Only the software developer did not see any use for the tool, admitting that it was not so relevant for his job. His peers, therefore, could see beyond the usability of the user interface to identify potential benefits for themselves.

Beyond this, though, the participants in the Spanish cohort also began to explore possible scenarios where they could imagine that the modeller tool would have specific benefits. These may be summarised as follows:

  1. Mitigating enterprise-wide risk: participants saw potential for the security modeller tool to build up a picture of the whole enterprise security landscape.

  2. Keeping risk exposure status up to date: further, participants felt the tool would help them respond to unexpected attacks as well as plan for future risk.

  3. Promoting consistent enterprise-wide security policies: a comprehensive and consistent repository of known (and potential) threats, along with appropriate mitigation strategies, would help overall management of the site from a security perspective.

Participants have therefore not only looked beyond the specific problems they reported via their SUS scores but are furthermore identifying affordances in the technology. On that basis, they speculate about how to exploit those affordances, what [19] refer to as affordance perception and affordance actualization respectively. Whether these affordances would have been recognised without the Spaniards’ positive reaction to Perceived ease-of-use is a moot point. What is important is their willingness to explore potential. In effect, they begin to internalise their perception of usefulness to the point where they can see direct benefit to themselves within their own work context.

4 Discussion

In respect of our research questions, it is clear that quantitative methods alone may not be sufficient to capture all aspects of user response when testing technology (RQ1). In addition, a simple qualitative approach (ethnographic monitoring of technology-focused behaviours) can begin to reveal a much richer and more nuanced interpretation of user responses to technology (RQ2). At the very least, Perceived ease-of-use as measured by the SUS is only part of the overall technology acceptance landscape. Although validated in multiple studies, the individual items encourage a rather broad-brush approach to technology use. Participants, especially IT professionals, may well compare a test technology with their experience of, or expectations from, other technologies. Without specific design and implementation focus on the user interface, this is almost certainly bound to depress SUS scores.

Other factors may account for these results. Participants in both cohorts carried out the test task first, and then responded to the SUS. In so doing, there is a very real chance that an initially negative reaction to the modeller tool, for whatever reason (a user finding the user interface different from what they expect, and so forth), could lead to a negative memory of the technology and therefore a negative judgement. This type of priming has been well attested for some considerable time [30]. The Italian participants may therefore have become fixated on the sheer number of issues they had found. Working alone or in pairs, they would have had little opportunity for any diversionary discussion around these issues. Indeed, in the Italian pairs, it was reported that one member would frequently have to provide support to the other.

Given that, though, why would the Spanish cohort react differently? They too reported a negative response to the technology via the SUS questionnaire. But they did not focus on the negative aspects of the user interface. As outlined above, they began to think creatively about what benefit they could derive from the technology. One major difference in set-up between the two test sites was, of course, that the Spanish cohort worked together. Although they too responded negatively to the closed set of questions in the SUS, the collaborative nature of the group work may have primed them for increased creativity [31]. They were more open to thinking of different ways that they might exploit the technology in the context of their own jobs. Indeed, consistent with DOI, as one colleague begins to identify potential, so the others too perceive affordances in the modeller tool.

External variables [5] or Facilitating Conditions [2] seem to involve the collaborative environment within which the technology is being tested. Prior user experience and expectation may lead to an exaggerated focus on negative aspects (as seen in Italy) which prevents creative exploration of technology use. But if discussion and collaboration are allowed, then all parties may begin to see potential and respond more positively to the technology under test. There is a mediating effect, therefore, not of Perceived ease-of-use as predicted by TAM, but of a willingness to think creatively and collaboratively. Such willingness depends on the perception of affordances which, through actualisation, seem to increase perceived self-efficacy and agency: potential adopters begin to explore how the technology can support them in satisfying the requirements of their own responsibilities [14, 32, 33].

5 Limitations and Future Study

It should be noted first and foremost that the cohort of participants has some limitations beyond the total number who took part. We do not claim saturation at this point, even though recommendations may be ambiguous in this area [34]. This was an opportunity sample, dependent on our access as researchers to colleagues in other areas of our respective institutions. There may well have been some experimenter effect, with participants being more positive about a technology introduced to them by a colleague. The results may not therefore be generalizable. Further, because of our familiarity with technology acceptance models, there may have been some bias in our coding scheme, whereby common topics identified in the feedback from the Spanish cohort provided codes for the analysis of the Italian feedback.

What we believe we are finding here is increased complexity for technology adoption models. The perception and actualisation of affordances is related, we maintain, to a willingness to explore technology both pre- and post-adoption [12]. In turn, this leads to increased self-efficacy and agency [13, 14, 32]. We propose to revisit other validation activities we have been involved with in the past, to re-evaluate those results for a larger cohort and in connection with other technology in a different domain. In addition, we will be using qualitative methods in other technology validation exercises to establish what sort of conclusions may emerge, with a view to increasing the richness of the data itself [35]. We believe at least that qualitative methods provide a relatively straightforward mechanism to access constructs such as External variables and Facilitating conditions in a meaningful way. These are connected to self-efficacy and perceptions of agency when exploring technology potential.

6 Conclusion

The mixed-methods approach outlined in this preliminary study has begun to identify significant factors in assessing technology acceptance. In so doing, there is also some indication that the research protocol itself (that is, the order of presentation of task and surveys, and whether participants work individually or collectively) may influence the outcome of technology acceptance experimentation. Despite the limited scope and coverage reported here, and the inconsistent though explainable results for the different cohorts, there is a strong indication that quantitative instruments may be insensitive to important considerations in technology acceptance beyond the initial estimate of Perceived or Expected ease-of-use. Previous evaluation of TAM and its derivative, UTAUT, as well as the SUS, may well fail to ensure completeness and ecological validity for technology evaluation. In reconsidering technology acceptance research, we maintain that qualitative methods provide a standardised mechanism to access the subjective and social norms which have previously been highlighted in promising models such as TRA, TPB and possibly DOI, but left out of TAM/UTAUT as inaccessible.