1 Introduction

User interfaces have become more important over the last decades, as a growing number of users now perform a wide range of tasks with their devices [1]. Huge computers, first used for military and mathematical purposes, have become smaller, more personal and multitasking [7]. It is therefore necessary to use methods and processes from both Usability Engineering and Human-Computer Interaction (HCI) to design user-centered systems and to evaluate whether they meet that goal [2].

Among these methods are the usability evaluation methods, which aim to ensure that systems are suitable and satisfactory for users and help them achieve their goals. In the usability test method, for example, the interface designers who conduct the evaluation sessions observe users' behavior while they perform designed tasks on the system or its prototype. Another method is the questionnaire, used to obtain users' subjective impressions of the usability of the evaluated system, traditionally by completing paper forms [3].

However, studies cited in [4] show that when these evaluations are performed by different designers, usability findings can vary widely and the results hardly overlap, even when the same evaluation technique is used. This indicates a lack of systematicity and predictability in the evaluation results, which in addition cover only part of the possible actions users may take. Automating some aspects of usability evaluation can be a solution.

This has been done mainly to capture user data [4]. Questionnaires can be automated to reduce evaluation costs and to facilitate data collection and summarization, as well as the generation of statistics that give interface designers an objective summary of the data to support better design decisions throughout the software development process.

Therefore, the purpose of this article is to analyze the feasibility of automating a standardized usability questionnaire and, subsequently, to automate it. To this end, bibliographic research was carried out on HCI concepts, usability evaluation methods (including usability questionnaires) and their automation.

A systematic literature review was performed to identify automated usability questionnaires and how they were automated. Some of the usability questionnaires found are already available in tools. Some are standardized (scientifically reliable for usability evaluation) but require a paid license. Others are free but not scientifically reliable. The Post-Study System Usability Questionnaire (PSSUQ) was chosen to be automated because it is free, small, and highly reliable.

Best practices identified in the studies analyzed in the bibliographic research and systematic review were applied to the PSSUQ automation process. This work comprised: research on the users’ context, profile and tasks; low and medium fidelity prototypes of the system and their evaluation by a specialist; specification of the requirements and system architecture; and implementation of the tool. The PSSUQ implementation will be validated with two systems developed in the context of the Brazilian Army.

The remainder of the paper is organized as follows: Sect. 2, Background, describes the concepts on which this work is based; Sect. 3, Systematic Literature Review, presents the systematic review conducted to find automated usability questionnaires in the literature; Sect. 4, Tool Specification, describes the user analysis and presents the prototypes and their evaluations, as well as the requirements and architecture specifications; and Sect. 5, Conclusion, emphasizes the main ideas and results of this work and mentions future work.

2 Background

This section presents the concepts used as a basis for the development of the work described in this paper.

2.1 Human-Computer Interaction (HCI)

HCI is an interdisciplinary field concerned with the design, evaluation and development of interactive systems for users, considering their context of use [3].

2.2 Interface Design Processes

Interface design processes group the activities needed to design systems with user-friendly interfaces. In general, these activities are: user analysis (user profile, tasks and context), creation of alternative designs (prototyping) and evaluation of the interface [3].

2.3 Usability

Usability is a property of a system [3] that refers to how well users can make use of its functionalities [1]. The goal of usability is to ensure that systems are easy to use from the users’ perspective. Users must be involved in the whole process of system design and conception, so that they can provide information and feedback to improve the system continuously [5].

2.4 Usability Evaluation

Usability evaluation consists of methods to measure aspects of the usability of a system, ensuring a high-quality user experience with it [1]. Evaluation methods differ from each other in their objectives and in their approaches to discovering problems, but they should be used in a complementary way in evaluation sessions [4].

2.5 Automation of Usability Evaluation

Automating some aspects of usability evaluation is a trend due to its various benefits, such as saving time and project costs, increasing the consistency of error detection and the coverage of evaluated characteristics, assisting designers who are not experienced in performing evaluations, and comparing alternative designs. It is applied together with traditional evaluation methods, mainly automating data capture and analysis [4].

2.6 Standardized Usability Questionnaires

Standardized usability questionnaires are forms with questions that aim to assess user satisfaction with the perceived usability of a system. They were built according to standards to produce metrics based on the responses obtained from the participants [6]. Their reliability has been established scientifically through psychometric tests and is represented by the alpha coefficient (ranging from 0 to 1, where higher values indicate better reliability) [5].
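To make the reliability coefficient more concrete, the following minimal sketch computes it from a matrix of responses, assuming that the alpha coefficient mentioned above is Cronbach's alpha. The function name and the sample data are hypothetical and are not taken from the cited works.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of item scores.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total scores))
    """
    n_items = len(responses[0])
    if n_items < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two respondents")
    # Transpose the matrix so that each tuple holds all answers to one item.
    item_columns = list(zip(*responses))
    sum_item_variances = sum(variance(column) for column in item_columns)
    total_variance = variance([sum(row) for row in responses])
    if total_variance == 0:
        raise ValueError("total scores do not vary; alpha is undefined")
    return n_items / (n_items - 1) * (1 - sum_item_variances / total_variance)


# Hypothetical data: three respondents answering three perfectly consistent items.
sample = [[1, 1, 1], [2, 2, 2], [3, 3, 3]]
alpha = cronbach_alpha(sample)
```

With perfectly consistent items, as in the hypothetical sample, the coefficient reaches its upper bound of 1.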

These questionnaires were designed to be objective and to be used repeatedly. They are cited in national and international standards (ANSI 2001 and ISO 1998). Examples are the post-study questionnaires, applied after usability tests: the Questionnaire for User Interaction Satisfaction (QUIS), the Software Usability Measurement Inventory (SUMI), the System Usability Scale (SUS) and the Post-Study System Usability Questionnaire (PSSUQ) [5, 6].

Some of their characteristics are presented in Table 1, which shows that two of the four questionnaires are free: PSSUQ and SUS. The SUS questionnaire is smaller, with 10 items, while the PSSUQ has 16. However, PSSUQ has a higher reliability coefficient (0.94), so it was chosen to be automated.
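This paper does not describe how PSSUQ responses are turned into scores. The sketch below is therefore only an illustration, assuming the commonly published 16-item version of the PSSUQ: 7-point items on which lower values indicate better perceived usability, and subscales for System Usefulness (items 1 to 6), Information Quality (items 7 to 12) and Interface Quality (items 13 to 15). The function and dictionary names are hypothetical, and leaving unanswered items out of the averages is also an assumption.

```python
# Item numbers (1-based) per score; assumed from the published 16-item PSSUQ.
SUBSCALES = {
    "overall": range(1, 17),
    "system_usefulness": range(1, 7),
    "information_quality": range(7, 13),
    "interface_quality": range(13, 16),
}


def score_pssuq(answers):
    """Return the mean score per subscale for one completed questionnaire.

    answers: list of 16 values, each an integer from 1 to 7, or None when
    the participant left the item unanswered.
    """
    if len(answers) != 16:
        raise ValueError("the PSSUQ sketch expects exactly 16 items")
    for value in answers:
        if value is not None and not 1 <= value <= 7:
            raise ValueError(f"answer out of range: {value!r}")
    scores = {}
    for name, items in SUBSCALES.items():
        # Unanswered items are skipped rather than counted as zero.
        values = [answers[i - 1] for i in items if answers[i - 1] is not None]
        scores[name] = sum(values) / len(values) if values else None
    return scores


# Hypothetical participant: items 1-6 rated 2, 7-12 rated 3, 13-15 rated 4, item 16 rated 1.
example = score_pssuq([2] * 6 + [3] * 6 + [4] * 3 + [1])
```

Averaging rather than summing keeps the subscales comparable even though they contain different numbers of items.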

Table 1. Standardized usability questionnaires

2.7 Mobile Devices

Mobile devices have revolutionized the technology market and the daily lives of their users, who now have many features that were previously available only on desktops. They are very attractive because of their price (these devices have become cheaper) and high performance. Many services are delivered in the form of applications focused on specific functions and goals, which can be obtained through distribution channels such as the Play Store (Google) and the App Store (Apple Inc.) [7].

A survey conducted in 2012 evaluated the conversion rate, that is, the percentage of visitors who complete a desired action, across 100 million visits to the Monetate e-commerce website made from computers, tablets and mobile phones. While the conversion rate was 3.5% for desktop devices, it was 3.2% for iPads and only 1.4% for iPhones [7]. All conversion rates were low and, between the two mobile devices, the iPad presented the higher rate, almost equal to the desktop rate. For this reason, the iPad was chosen as the platform on which to develop the solution to the problem addressed in this work.

3 Systematic Literature Review

This section describes the Systematic Literature Review (SLR) conducted to identify scientific papers that automated usability questionnaires. SLR is a research technique that applies a rigorous methodology, through a pre-defined research protocol, to aggregate evidence on a research topic and provide background according to the defined strategy [8].

3.1 Systematic Review Protocol

In order to collect evidence on software solutions for automated usability questionnaires in an unbiased, objective and systematic way, the researchers defined a mapping study process adapted from [9, 10]. It consists of three phases: Planning, Conducting and Documenting the review, as shown in Fig. 1.

Fig. 1. Mapping study process.

The Planning Phase of the review aims to specify a mapping study protocol describing systematic activities to gather available evidence. The product of this phase is the detailed protocol to support the researchers in the review process. It provides a clear definition of the research questions, search strategy to gather relevant studies, inclusion and exclusion criteria of primary studies, as well as analysis of data extraction and synthesis. All details of the protocol as well as the results of the mapping study are documented in the technical report.

The Conducting Phase of the review involves the application of the research protocol specified to search for primary studies, extract data and synthesize relevant knowledge related to the automated usability questionnaires. The product of this phase is the evidence generated from all the activities of the protocol.

The Documentation Phase of the review reports the findings of the mapping study. The researchers involved consolidate all information, write the technical report, review and publish the results.

Need for This Review. This review is needed to identify and summarize, in a systematic and unambiguous way, scientific papers that describe software solutions for automated usability questionnaires. Few studies exist, and most of them propose new questionnaires without automating them.

Research Questions. To characterize the evidence on automated usability questionnaires, two questions were formulated addressing different aspects:

  • Question 1: In the literature, what tools have been developed to automate usability questionnaires?

  • Question 2: In the group of studies about usability questionnaires identified, how many of them are STANDARDIZED usability questionnaires?

Search Strategy. The search strategy brings together automatic and manual procedures. Before the evaluation of the studies, a pilot study validated the proposed strategy, with changes made whenever necessary. The search strings were constructed by combining the following terms in English: “questionnaire”, “tool”, “usability evaluation”, “user interface”, “user satisfaction” and “automation”. Since works in Portuguese were identified in the pilot study, the Portuguese translations of these terms were also used: “questionário”, “ferramenta”, “usabilidade”, “avaliação”, “interface de usuário”, “satisfação do usuário” and “automação”. After the pilot study with these initial combinations confirmed that relevant studies were retrieved, the search strings were refined. They are presented in Table 2 with the initial search results.

Table 2. Search strings and initial results

Databases. Three digital databases were selected to conduct the research: IEEE Xplore, Science Direct and Google Scholar. They were chosen because they are among the most relevant databases for Software Engineers [8]. The search strings in Table 2 were applied in these databases using the advanced search options to match the string terms in the article titles or abstracts.

Selection Criteria. To identify relevant primary studies in the research, the following study selection criteria were defined:

Inclusion Criteria (IC)

  1. Studies in English or Portuguese on automated usability questionnaires.

  2. Studies that contain the following terms in the title: questionnaire or both terms questionnaire and tool, or both terms usability and tool.

  3. Studies responding directly to Question 1.

Exclusion Criteria (EC)

  1. Studies out of the scope of this research.

  2. Studies that do not contain sufficient evidence for this research.

  3. Studies describing the conception of new usability questionnaires, but without automating them.

  4. Studies describing the automation of questionnaires for purposes other than usability evaluation.

  5. Studies dealing with the automation of usability evaluation methods other than the questionnaire.

  6. Duplicate studies. When a study is published in more than one research source, the most complete version will be used.

Screening Process. One of the main objectives of the mapping study is to determine the relevant papers (primary studies) that correctly address the research questions. According to the search strategy, researchers performed manual and automatic procedures, removing repeated entries. Once the search results were found, the researchers read the titles and abstracts to apply the inclusion criteria. Next, the exclusion criteria were applied during the complete reading of the articles, generating a list of primary studies, as shown in Fig. 2.
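The order of the screening steps described above can be illustrated with the following sketch. It is only an approximation under stated assumptions: the `Study` record, the use of the page count as a proxy for the "most complete version" of a duplicate, and the representation of the exclusion criteria as a list of reasons recorded by the researchers during full reading are all hypothetical. The title terms are the English and Portuguese terms listed in the search strategy.

```python
from dataclasses import dataclass, field


@dataclass
class Study:
    title: str
    language: str  # "en" or "pt" are accepted by the first inclusion criterion
    pages: int = 0  # hypothetical proxy for the most complete version
    # Exclusion criteria assigned by the researchers during full reading.
    exclusion_reasons: list = field(default_factory=list)


def title_matches(title):
    """Second inclusion criterion: required terms in the title."""
    text = title.lower()

    def has(*terms):
        return any(term in text for term in terms)

    return has("questionnaire", "questionário") or (
        has("usability", "usabilidade") and has("tool", "ferramenta")
    )


def remove_duplicates(studies):
    """Keep one record per normalized title, preferring the longest version."""
    best = {}
    for study in studies:
        key = " ".join(study.title.lower().split())
        if key not in best or study.pages > best[key].pages:
            best[key] = study
    return list(best.values())


def screen(studies):
    """Remove duplicates, apply inclusion criteria, then exclusion criteria."""
    unique = remove_duplicates(studies)
    included = [
        s for s in unique if s.language in {"en", "pt"} and title_matches(s.title)
    ]
    return [s for s in included if not s.exclusion_reasons]
```

The third inclusion criterion (answering Question 1 directly) requires human judgment and is not modeled here, so in this sketch it would be recorded by the researchers as an exclusion reason.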

Fig. 2. Search process.

Table 3 presents the final search results, after the inclusion and exclusion criteria were applied to the initial results. Four relevant studies were identified.

Evaluation of Study Quality. This study considered works indexed in academic databases that were appropriate for answering the research questions of this SLR.

Data Extraction. During the data extraction process, the researchers carefully read the primary studies. A peer review process was performed, in which the researchers extracted data from the same studies. A pilot data extraction test was conducted to align the researchers’ understanding of how to answer the research questions. The pilot was performed with all primary studies, and disagreements about the individual responses were discussed and resolved. All relevant data from each study were recorded on a form.

Table 3. Final search results

Data Summary. The data extracted from the studies were summarized according to the purpose, characteristics and results of use of the tools described in the works.

3.2 Systematic Review Results

The results of the SLR are described below:

  1. TOWABE - Usability Assessment Tool for Web Applications: a web tool to evaluate the usability of web systems. It automates three evaluation methods: a questionnaire (based on ISO 9241), inspection and card sorting. It has two user profiles: the Evaluator, who prepares the evaluation, and the Participant, who takes part in it. The 21 items in the questionnaire can be customized, deleted and commented on. Each questionnaire is saved under the name of the participant. The evaluation results and recommendations for improvement are presented in reports. The case study of the tool showed that users found it practical to receive the invitation to the evaluations by email. They suggested including online help and improving the structure of the questionnaire page, since it did not correspond to their mental model [11].

  2. WQT - Web Questionnaire Tool: a tool to measure the usability of interactive systems using a questionnaire and video recording. Developed in PHP, it allows the creation of a custom questionnaire by specifying the theme, evaluation criteria, quality factors and items, which can be added or deleted. Questionnaire templates are already implemented, and statistical reports are generated in textual (HTML or Excel) or graphical format. The results of using this tool showed that it is robust for users [12].

  3. Mugram: a tool to evaluate the usability of mobile systems. A project is created for each evaluated system. Evaluation rounds can be created to evaluate a version of the system, and the results of up to three trial versions can be compared. Items are divided into categories and more items can be added. The evaluation results are shown in tabs: the main tab shows the total score and a summary of the points of the main category, while the other tabs show detailed statistics for the results of each item. Reports are displayed in diagrams and dashboards. The experts who evaluated the tool praised these last two features and found it useful, easy to learn and clean in its interface. Their suggestions were to allow the exclusion and inclusion of entire categories of items and to integrate the tool with social networks to share results [13].

  4. VRUSE: a tool to measure the usability of augmented reality systems. It has 100 items divided according to 10 usability factors. Data are collected using a spreadsheet, which facilitates conversion. Participants can indicate their level of expertise in using such systems. There is a feedback field at the end of the questionnaire. The results are displayed on a chart with the score for each usability factor. The case study with VRUSE showed that it has high performance and that users felt in control when using it [14].

3.3 Comparative Analysis

To support a comparative analysis of the works described above, the main information about the tools is presented in Table 4.

Table 4. Questionnaires characteristics

Responding to Question 1. All tools run on a web platform. Each one evaluates the usability of specific kinds of systems, and most of them allow customization of the questionnaire items. The authors were concerned with evaluating the tools they developed, but only Mugram, which has a clean design, useful features and a very detailed results report, was evaluated by specialists. Most of the tools generate graphical reports to make the results easier to visualize.

Responding to Question 2. None of the analyzed tools automated a standardized usability questionnaire. According to the authors of the articles, standardized questionnaires allow only more general usability assessments, without considering specific characteristics of the systems, such as those of mobile systems. On the other hand, questionnaires that have not been scientifically tested and accepted, unlike the standardized ones, do not have the scientific reliability needed to perform evaluations. According to [7], only systematic efforts using established methods to evaluate systems can be considered Usability Engineering. Therefore, the automation of standardized usability questionnaires will allow better analysis and interpretation of evaluation results using a resource of scientific value. Good practices used in the analyzed works will be applied to the work developed in this paper. The tool was designed based on this research and on the factors involved in developing works of this nature.

4 Tool Specification

This section describes the development and usability processes used, as well as the modeling and specification of the requirements and architecture of the proposed tool.

4.1 Software Development Process

A Software Development Process defines the set of activities necessary to develop quality software. In this work, the process chosen was the Personal Software Process (PSP), which can be used by an individual. Its activities aim to define requirements, design the architecture, prototype, generate and test code, and verify the effectiveness of the project [15]. Since the focus of the tool is usability, an interface design process (Sect. 2.2) was used together with PSP.

4.2 User Analysis

Information about the target users of the tool, their profiles, tasks and context was obtained through bibliographic research and by interviewing professionals who work in the market in consulting and evaluation of system interfaces.

  1. User Profile: the target users are interface designers, who design the user experience and evaluate interfaces. They have knowledge about the usability of several types of systems. Usability evaluations are performed using established methods.

  2. Context: usability evaluations often take place in laboratories with cameras and data recording tools, such as Google Forms and Excel spreadsheets, which help generate charts. Evaluators instruct participants and then observe them, taking notes. Usability tests are planned in advance, defining their goal, the number of participants to recruit, the tasks to be performed and the test schedule. After each completed test scenario the participant fills in a post-task questionnaire and, upon completion of all scenarios, a post-study questionnaire such as PSSUQ to assess overall system usability [2, 16].

  3. User Tasks: the evaluator explains to the participant how to complete and hand in the questionnaire, and the following steps are performed:

    (a) The participant fills in some information to validate the recruitment, then fills in the items and returns the questionnaire to the evaluator;

    (b) The data filled in the questionnaire are validated by the evaluator, who checks them for errors, inconsistencies, blank items, erasures, and the like;

    (c) When the usability test is completed, the evaluator registers the data in a spreadsheet to generate charts; and

    (d) The evaluator generates a report classifying the data by score and observations, with the problems identified and recommendations for improvement for each one. Finally, spreadsheets and reports are stored.
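The manual check that the evaluator performs on a returned questionnaire is one of the steps an automated tool can take over. The following is a minimal sketch of such a check, assuming 16 items answered on a 7-point scale (the item count comes from the PSSUQ description in Sect. 2.6; the scale is an assumption). The function name and the message format are hypothetical.

```python
def validate_questionnaire(answers, n_items=16, scale=(1, 7)):
    """Return a list of problems found in one completed questionnaire.

    answers: list with one entry per item; None marks a blank item.
    An empty list means that no problem was detected.
    """
    problems = []
    if len(answers) != n_items:
        problems.append(f"expected {n_items} items, got {len(answers)}")
    low, high = scale
    for number, value in enumerate(answers, start=1):
        if value is None:
            # Blank items are flagged for the evaluator instead of rejected.
            problems.append(f"item {number}: blank")
        elif (
            not isinstance(value, int)
            or isinstance(value, bool)
            or not low <= value <= high
        ):
            problems.append(f"item {number}: invalid value {value!r}")
    return problems
```

Returning every problem at once, instead of stopping at the first one, mirrors the paper-based review in which the evaluator goes through the whole form before handing it back.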

4.3 Functional Requirements

After analyzing the users’ profile, their tasks and context, it is possible to define and model the functional requirements of the system in the Use Case Diagram [17] presented in Fig. 3.

Fig. 3. Functional requirements.

4.4 Low Fidelity Prototypes

The low fidelity prototype of all screens was made on paper and evaluated by an interface professional. Only a few screens are shown here.

Fig. 4. Questionnaire screen (low fidelity prototype).

Figure 4 shows the process of filling out the questionnaire by a participant, when an individual evaluation is created. On the first screen, brief instructions are given and, after selecting the start option, the participant answers the questions (one per screen). At the end, the participant can return to any question, if desired, and then finalize the questionnaire.

Prototype Evaluation. System requirements were evaluated through the prototype, which did not consider the choice of colors or the icon details for the app. The recommendations for improvement were: to include feedback messages for all actions performed by the participant and for their results; to provide a way for the evaluator to continue the session when the participant completes the questionnaire; and to divide the questionnaire items into smaller parts.

4.5 Medium Fidelity Prototype

After the low fidelity prototype was analyzed, another prototype was made with a dedicated prototyping tool, with more details such as icons and colors. Figure 5 shows the prototype of the questionnaire screen, with more detail in forms, colors, button patterns and actions.

Fig. 5. Questionnaire screen (medium fidelity prototype). (Color figure online)

Figure 6 shows the screen for comparing project rounds.

Fig. 6. Comparison screen (medium fidelity prototype).

In this prototype, all the recommendations given by the specialist on the previous prototype were applied.

4.6 Non-functional Requirements

Non-functional requirements express limitations and quality properties of the system [15]. For the tool proposed, the non-functional requirements specified are:

Platform: the system must be developed for the iOS platform, targeting iPad mobile devices.

Security: only users registered as evaluators have access to the features and data.

Installation: the system must be obtained from the App Store.

Standard: the system must be developed using the Model-View-Controller (MVC) architectural pattern.

4.7 Architecture

The architecture of a system defines the components of its structure. Details of algorithms are not defined here. The architectural style chosen is object-oriented, where the components encapsulate data and operations. Communication occurs by exchanging messages between objects [15].

Architectural patterns address problems in a specific context and serve as the basis for the architectural design. The MVC pattern was chosen for this project, as it is suitable for mobile applications. The application is separated into three layers: Model, which contains the entities and system data; View, which contains the logic of data presentation and event capture; and Controller, which handles the events captured by the View and fetches data from the Model to update the View [18]. In addition, there is a data layer that handles the logic for storing the application data. Figure 7 shows a simplified view of the relationships between the entities of the system.

Fig. 7. Entities diagram.
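The separation of responsibilities described above can be sketched as follows. This is not the code of the tool: all class and method names are hypothetical, an in-memory dictionary stands in for persistent storage, and Python is used here only for brevity in place of the iOS implementation.

```python
class InMemoryStore:
    """Data layer: storage logic (a dictionary stands in for persistence)."""

    def __init__(self):
        self._answers = {}

    def save(self, item, value):
        self._answers[item] = value

    def load_all(self):
        return dict(self._answers)


class QuestionnaireModel:
    """Model: entities and system data."""

    def __init__(self, store, n_items=16):
        self.store = store
        self.n_items = n_items

    def answer(self, item, value):
        if not 1 <= item <= self.n_items:
            raise ValueError(f"unknown item {item}")
        self.store.save(item, value)

    def progress(self):
        return len(self.store.load_all()), self.n_items


class QuestionnaireView:
    """View: data presentation and event capture."""

    def __init__(self):
        self.on_answer = None  # callback installed by the controller
        self.text = ""

    def tap_answer(self, item, value):
        # Simulates the participant selecting an answer on the screen.
        if self.on_answer is not None:
            self.on_answer(item, value)

    def render(self, answered, total):
        self.text = f"{answered}/{total} items answered"


class QuestionnaireController:
    """Controller: handles view events and refreshes the view from the model."""

    def __init__(self, model, view):
        self.model = model
        self.view = view
        view.on_answer = self.handle_answer
        self.refresh()

    def handle_answer(self, item, value):
        self.model.answer(item, value)
        self.refresh()

    def refresh(self):
        self.view.render(*self.model.progress())


view = QuestionnaireView()
controller = QuestionnaireController(QuestionnaireModel(InMemoryStore()), view)
view.tap_answer(1, 3)
```

In this arrangement the view never touches the stored data directly, so the storage mechanism can be replaced without changing the presentation code.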

4.8 Implementation

The tool, called iQuest, was implemented according to the defined requirements and the architectural specification. The following figures show some of the screens of the tool.

Fig. 8. Menu screen.

Fig. 9. Report screen.

Figure 8 represents the main Menu of the tool with the options offered to the Evaluator after logging in: Register new project, Query registered projects and Compare results of evaluations.

Figure 9 displays the general data in a report for a round of project evaluations.
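The kind of summary behind a round report and a comparison of rounds can be illustrated with the sketch below. The actual contents of the iQuest report are not detailed in the text, so the chosen statistics, the function names and the sample scores are all hypothetical. Ranking rounds by ascending mean assumes a scale on which lower scores indicate better usability.

```python
from statistics import mean, stdev


def summarize_round(overall_scores):
    """Summarize the overall questionnaire scores of one evaluation round."""
    if not overall_scores:
        raise ValueError("a round needs at least one completed questionnaire")
    return {
        "participants": len(overall_scores),
        "mean": mean(overall_scores),
        # The sample standard deviation needs at least two values.
        "stdev": stdev(overall_scores) if len(overall_scores) > 1 else 0.0,
        "lowest": min(overall_scores),
        "highest": max(overall_scores),
    }


def compare_rounds(rounds):
    """Order round names from best to worst mean (lower is assumed better).

    rounds: dict mapping a round name to its list of overall scores.
    """
    return sorted(rounds, key=lambda name: mean(rounds[name]))


# Hypothetical scores for two evaluated versions of the same system.
report = summarize_round([2.0, 3.0, 4.0])
ranking = compare_rounds({"version 1": [4.0, 5.0], "version 2": [2.0, 3.0]})
```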

5 Conclusion

In this work, the authors performed a systematic mapping study to investigate the state of the art in the automation of usability questionnaires. As a result, four primary studies were identified.

The results obtained from the studies showed that automating the PSSUQ standardized usability questionnaire is possible, since it is free (no license is required for its use) and small.

The requirements and the architecture of the tool were then specified using the tools, principles and techniques of Software and Usability Engineering.

Based on the design decisions, iQuest was implemented. The tool will be tested, and corrected where errors are found, in the SISDOT and SISBOL projects, which will be modernized under the cooperation agreement between UnB and EB.

As a result, the data collection and summarization process will be more efficient and less error-prone, and design professionals will be able to use a highly usable tool to perform usability assessments.