
1 Introduction

Users’ concern over the safety of their personal details has been a long-standing issue in privacy research. The methodologies offered to evaluate these concerns have been based on self-report and hence give a subjective account. They are mostly questionnaires, ranging from Westin’s Privacy Segmentation Index [13], which groups individuals into the broad categories of privacy fundamentalists, pragmatists, and unconcerned, to instruments more focused on online privacy, such as the Internet Users’ Information Privacy Concerns (IUIPC) scale [16].

Our work is motivated by two observations. First, Preibusch [20] observed that the evaluation of these measurement instruments, and of the methodology at large, has been “fragmented and ad-hoc.” We take this as a call to action to invest in the validation of tools for privacy research, especially tools suitable to support evidence-based contributions. Second, users’ privacy concerns, intentions, and subsequent behavior at the time of evaluation are under the influence of their current internal states. We believe that eliciting affect states provides an important dimension that impacts privacy concerns. We therefore set out to investigate the influence of users’ affect states on their privacy concerns.

We report on a pretest which evaluates users’ affect states when exposed to standardized video stimuli for happiness and sadness. We investigate two face-geometry-based affect analysis tools (Facereader and Emotion Recognition) and evaluate their properties systematically as components for future experiments. We validate these instruments against the Positive and Negative Affect Schedule (PANAS-X) [30], a well-vetted self-report questionnaire.

Contribution. Our pretest findings indicate that the two psycho-physiological tools can measure users’ affect states. Our findings provide not only a systematic comparison of the measurement tools, but also techniques for inducing and measuring affect states that are beneficial for other researchers. We also provide re-usable building blocks that can be plugged into further research. In addition, to the best of our knowledge, this is the first study employing affect induction and psycho-physiological tools in usable privacy research.

Outline. The paper is organized as follows: first, we provide background information on privacy and emotion; then we present our research model and hypotheses. Next, we report on the pretest experiment conducted and the results obtained. Subsequently, we present the structured abstract for the main experiment. We conclude the paper by discussing the implications and limitations of our work.

2 Background

In this section, we begin with the issues associated with defining privacy and its multidimensional characteristics, then review existing literature on privacy concerns and on affect states together with their measurement methods. Subsequently, we describe the use of stimuli from affective psychology to induce emotions and the use of affect measurements as manipulation checks. Finally, we report on existing measurement instruments for privacy concerns.

2.1 Information Privacy

Nissenbaum [17] proposed that privacy is a contextual concept that occurs in different spheres of life: legal, medical, and information technology, to mention a few. This has led scholars to propose different definitions, ranging from “the right of an individual to be left alone” [29] to “the claim of individuals, groups, or institutions to determine for themselves when, how, and to what extent information about them is communicated to others” [31], amongst other privacy definitions.

Privacy Definition. Burgoon et al., Clarke, and DeCew [4,5,6] are known for their multidimensional definitions of privacy. For the purpose of this paper, we adopt the definition by Smith et al. [24] as stated in Li [14]: information privacy refers “to the ability of individuals to personally control information about themselves.” Information privacy enables individuals, groups, or organizations to protect themselves against actual or perceived intrusions on the information about them [6, 31]. The possible occurrence of privacy intrusions can trigger a sense of panic or anxiety in users, which causes them to express their concerns over maintaining adequate access protection to their personal details.

Privacy Concerns. Privacy concerns can be described as “concerns about possible loss of privacy as a result of information disclosure to an online business” [32]. Scholars rely on users’ expressions of their privacy concerns to measure privacy levels and classify users [31]. Given the multidimensional properties of privacy, it is not surprising that different survey tools have been developed for measuring privacy concerns. Some of the survey tools that adopt a multidimensional approach to measuring privacy are considered validated and reliable; these include the Concerns for Information Privacy (CFIP) and the Internet Users’ Information Privacy Concerns (IUIPC) instruments [16, 24], which are widely used as standard surveys for privacy concerns.

Measuring Privacy Concerns. The development and use of differing scales have not been without issues. In his overview of the existing survey instruments used to measure privacy concerns, Preibusch pointed out that “approaches to measure privacy concerns are fragmented and often ad-hoc, at the detriment of reliable results” [20]. The survey results derived from these tools rely on users’ feedback, memories, and rated perceptions of subjective factors considered to affect privacy concerns [12].

2.2 Emotion and Affect

In this section, we first present the definitions of emotion and affect and highlight their differences. We then outline the differing views on the relationship between emotion and behavior, followed by a brief overview of the effect of emotion on behaviors, concerns, and decision making.

We adopt Baumeister et al.’s description of emotion as “a conscious feeling state” [2]. It is stimulated either by actual events that happen to the individual (“actions”) or by anticipated events that are yet to occur (“outcomes”).

Emotion has been classified based on the duration of the feeling state [23]. Affect has been described as the “faint whisper of emotion” [23] and is said to have more impact on behavior than emotion [2, 18]. Hence, in this paper we use the term affect state to describe the state of feeling experienced by the participants, because the stimulus films can trigger this kind of quick reaction within the individual.

The sole use of surveys as the main measurement tool for a multidimensional concept like privacy is inadequate. This is in line with the findings of Paine et al., who point out that “the concept of privacy is highly complex, therefore it is unlikely that surveys can accurately reflect respondents’ true concerns” [19]. We suggest the use of a complementary set of surveys and psycho-physiological tools such as facial and emotion recognition devices. We believe users’ privacy concerns are associated with non-verbal expressions, which are unconscious, facially expressed, and fleeting in nature [8]. Such expressions cannot be captured by survey tools, hence the need for psycho-physiological tools. Hall et al. [12] noted that “psycho-physiological measures are particularly sensitive to the fleeting and non-conscious nature of emotional experience.” In this paper, we discuss the use of Facereader and Microsoft Emotion Recognition in the studies presented.

The literature review we conducted revealed contrasting views on the causal relationship between emotion and behavior. Loewenstein et al. [15] suggest that “the idea that emotions exert a direct and powerful influence on behavior receives ample support in the psychological literature on emotions.” In a similar vein, Frijda [9] suggested that “emotion arouses behavior and drives it forth.” On the other hand, Baumeister et al. [2] take the opposing view. In their review of the direct causation theory, they argue that “if a given emotion does not consistently cause same specific behavior, then again the influence of emotion on behavior can hardly be considered as direct.” Rather, they suggest that behavior is indirectly influenced by anticipated emotional outcomes.

2.3 Theory and Research Model

The Theory of Planned Behavior (TPB) states that attitude, subjective norms, and perceived behavioral control have a direct influence on behavioral intention, which in turn influences actual behavior. However, scholars have argued that a subjective norm is “inadequate and rarely predicts behavioral intentions,” as stated in Armitage et al. [1]. Researchers have also highlighted the inefficacy of the TPB in influencing or predicting behaviors, especially in the health field; this can be extended to privacy research based on the observed privacy paradox [25].

Fig. 1. Research model for the experiment.

We present our research model in Fig. 1. We investigate the influence of stimuli films, \(\mathsf{S}\), on users’ affect states, and consequently the influence of affect states on their privacy concerns. We recognize that confounding factors \(\mathsf{F}_{1\ldots n}\), such as a user’s consumption of alcohol or recreational drugs, could have an influence on the affect state. To test our research model, we first explore the influence of stimuli films on users’ affect states by carrying out a pilot study, referred to in this paper as a pretest. We build on the outcomes of the pretest and present the design of an upcoming main experiment in Sect. 4.

3 Pretest

Affective psychology predicts that stimuli from films impact human affective states [22]. We designed a pretest study to assess and validate the manipulation achieved by such stimuli and its measurement.

RQ1: Manipulation Method. How do standardized stimuli films (for happiness and sadness) influence the user’s affect state?

 

\(\mathsf{H_{1,0}}\): There is no change in users’ happy and sad affect states under induced happy and sad stimuli films.

\(\mathsf{H_{1,1}}\): Users’ happy and sad affect states are impacted by induced standardized happy and sad stimuli films.

 

RQ2: Measurement Tools. We make a systematic comparison between the manipulation check based on the validated PANAS-X questionnaire and the psycho-physiological measurement tools. What are the tools’ sensitivities and confidence intervals, and what are their strengths and weaknesses? For the operationalization of the hypotheses, we define sensitivity as the effect size (difference between means) between measuring the affect state of a participant exposed to a happy stimulus and the affect state of the same participant exposed to a sad stimulus. We refer to the \(95\%\) confidence interval on this effect size.
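
In our own notation (a sketch, not taken verbatim from the tools’ documentation), with \(\bar{x}\) denoting a tool’s mean measurement under a stimulus and \(s_{\text{joint}}\) the joint standard deviation of that tool’s happy and sad measurements, the quantities we compare are:

\[
d = \frac{\bar{x}_{\text{happy}} - \bar{x}_{\text{sad}}}{s_{\text{joint}}},
\qquad
\mathrm{CI}_{95\%}(d) = d \pm 1.96 \cdot \widehat{\mathrm{SE}}(d).
\]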

 

\(\mathsf{H_{2,0}}\): There is no difference in the sensitivity and confidence intervals on happiness and sadness measurements of the tools PANAS-X, Facereader and Emotion Recognition.

\(\mathsf{H_{2,1}}\): The tools PANAS-X, Facereader and Emotion Recognition differ in either sensitivity or confidence interval when measuring happiness or sadness affect states.

 

3.1 Method

Participants were exposed to video stimuli to induce diametrical emotions in a within-subject design. They received a happiness as well as a sadness stimulus in random order. The observed affect was measured with PANAS-X [30], Noldus FaceReader and Microsoft Emotion Recognition and compared across conditions.

Participants. \(N = 9\) students from the Computing Science Department of Newcastle University, six male and three female, participated in the study. The participants’ ages ranged from 23 to 30 years (\(M = 26.43\), \(SD = 2.23\)).

Operationalization. We induced the independent variable (IV) affect with three levels: (a) neutral baseline, (b) happy, and (c) sad.

We checked this manipulation with a self-report questionnaire, a 60-item PANAS-X [30] (joviality and sadness) with a designated time horizon “at this moment.”

We measured the dependent variable (DV) affect (happiness and sadness) on a scale of \([0, \ldots , 1]\) with the psycho-physiological measurement tools (a) Facereader (FR) [3, 7] and (b) Emotion Recognition (ER). During the stimulus exposure, a video of the participant’s face is recorded. The video is fed into FR; a still image of the face, taken at the end of the corresponding stimulus, is fed into ER.

Procedure. The pretest proceeded in the following steps, where Fig. 2 illustrates the key elements of the experiment design:

  (a) A demographics questionnaire.

  (b) Neutral state.

    • Induction of a neutral baseline affect state.

    • Measurement of manipulation check (PANAS-X), ER and FR.

  (c) Affect State 1: either happy or sad, determined by random assignment.

    • Show video stimulus to induce affect.

    • Measurement of manipulation check (PANAS-X), ER and FR.

  (d) Affect State 2: complement of Affect State 1.

    • Show video stimulus to induce affect.

    • Measurement of manipulation check (PANAS-X), ER and FR.

  (e) A debriefing survey, which collects the participants’ feedback regarding the affect states experienced.

Fig. 2. Experiment design template for the pretest.

Inducing and Measuring Affect State. We adopted the induction of happiness and sadness from the standardized stimuli defined in Gross et al. [10]. For the induction of the happiness and sadness affect states, we used the restaurant scene from the movie When Harry Met Sally and the dying scene from the movie The Champ as stimuli films. Participants were exposed to both film clips in a within-subject experiment. Whether they received the happy or the sad film first was determined by random block assignment. After the neutral state and after each film, participants filled in a full 60-item PANAS-X questionnaire with a designated time horizon of “at this moment.”

During the neutral state and during watching each film, the faces of the participants were filmed with a high-resolution video camera. The video feeds constituted the inputs for the Facereader, which computed affect scores based on changes in face geometry. At the end of the stimulus exposure, a still image is taken from the video feed, which serves as input to Emotion Recognition. Both Facereader and Emotion Recognition compute scores on the scale \([0,\ldots , 1]\) for the variables anger, contempt, disgust, fear, happiness, neutral state, sadness and surprise. Only happiness and sadness were considered for further analysis.
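
To illustrate how such per-frame scores can be reduced to one value per participant and stimulus before analysis, we sketch a possible aggregation below in R. The data frame stands in for a per-frame FaceReader-style export; its column names and layout are illustrative assumptions on our part, not the tools’ actual export formats.

```r
# Hedged sketch: collapse hypothetical per-frame affect scores on [0, 1]
# into one value per stimulus condition (median over frames).
library(dplyr)

set.seed(7)
frames <- data.frame(
  stimulus  = rep(c("neutral", "happy", "sad"), each = 100),  # 100 frames per stimulus
  happiness = runif(300),
  sadness   = runif(300)
)

affect_summary <- frames %>%
  group_by(stimulus) %>%
  summarise(happiness = median(happiness),
            sadness   = median(sadness))

affect_summary
```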

3.2 Results

Figure 3 contains an overview of the results, in which we have normalized PANAS-X to the interval \([0,\ldots ,1]\) to put all tools on the same scale. All inferential statistics are computed with two-tailed tests and at an alpha level of .05. We report asymptotic significance values.
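
A minimal sketch of this normalization and of the comparative boxplots is given below, assuming that PANAS-X mean item scores (items rated 1 to 5) are rescaled linearly to \([0, 1]\); the exact normalization used for Fig. 3 is not spelled out in the text, and the data here are synthetic placeholders.

```r
# Hedged sketch: map a PANAS-X mean item score (1..5 Likert items) onto [0, 1].
normalize_panas <- function(item_scores) {
  (mean(item_scores) - 1) / 4
}

# Synthetic long-format data standing in for the normalized measurements.
set.seed(42)
df <- data.frame(
  value    = runif(54),                                  # placeholder scores on [0, 1]
  stimulus = rep(c("happy", "sad"), each = 27),
  tool     = rep(c("PANAS-X", "ER", "FR"), times = 18)
)

# Comparative boxplots per stimulus and tool, analogous to Fig. 3.
boxplot(value ~ stimulus + tool, data = df, ylim = c(0, 1),
        ylab = "Normalized score [0, 1]")
```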

Fig. 3. Comparative boxplots for happiness and sadness measurements, with stimuli “happy” and “sad” on the x-axes. The y-axes are normalized to \([0,\ldots ,1]\).

Assumptions. We tested the normality of the measurements from PANAS-X, ER and FR to assess the eligibility of parametric statistics. The Shapiro-Wilk test was statistically significant for PANAS-X Sadness and for all Emotion Recognition and Facereader measurements (all \(p < .001\)). The PANAS-X Joviality result was borderline, \(W = 0.92\), \(p = .087\). Consequently, we are not entitled to use parametric tests and opt for a two-tailed Wilcoxon signed-rank test.
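
A minimal sketch of this assumption check and of the fallback test is given below in R; the paired vectors are placeholder values for one tool’s happy/sad measurements, not the study data.

```r
# Placeholder paired measurements for N = 9 participants (illustrative only).
happy_scores <- c(0.71, 0.65, 0.80, 0.58, 0.74, 0.69, 0.77, 0.62, 0.70)
sad_scores   <- c(0.32, 0.28, 0.41, 0.25, 0.36, 0.30, 0.38, 0.27, 0.33)

# Shapiro-Wilk normality checks on the measurements.
shapiro.test(happy_scores)
shapiro.test(sad_scores)

# Non-parametric fallback: two-tailed Wilcoxon signed-rank test on the pairs;
# exact = FALSE requests the asymptotic significance values reported above.
wilcox.test(happy_scores, sad_scores, paired = TRUE,
            alternative = "two.sided", exact = FALSE)
```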

Manipulation Check: PANAS-X. A self-report-based manipulation check was carried out. We used the full 60-item PANAS-X questionnaire [30] as a manipulation check on the induced affect state, following the methodology endorsed by Rottenberg et al. [21]. Of the different variables PANAS-X provides, we focused on sadness and on joviality as the equivalent of happiness.

There is a statistically significant difference between the two video stimuli for both the joviality and the sadness measurements. We offer a comparison of the PANAS-X results for both stimuli in Table 1a. Consequently, we reject the null hypothesis \(\mathsf{H_{1,0}}\) and accept that the video stimuli have a measurable impact.

Table 1. Overview of results for measurement devices PANAS-X, FR, and ER.

Emotion Recognition. The measurements of the Emotion Recognition tool show statistically significant differences between the stimulus conditions, for the happiness as well as the sadness measurements. Table 1b contains an overview of the ER results. This informs \(\mathsf{RQ2}\): ER is a suitable measurement tool for affect comparisons with small samples.

Facereader. The differences in Facereader measurements across video stimuli were statistically significant neither for the happiness nor for the sadness measurements. Table 1c contains an overview of the FR results. This informs \(\mathsf{RQ2}\) in that Facereader-based measurements do not have sufficient power to differentiate between these emotions at the small sample size of the pretest.

3.3 Comparison of Measurement Tools

One of the key outcomes of the pretest is a systematic comparison of the measurement tools (PANAS-X, FR and ER) while ascertaining the overall effectiveness of the induction of emotions with standardized video stimuli.

Qualitative. We first made qualitative observations based on the boxplot comparison in Fig. 3. We are aware that one participant entered the experiment in a morose state, which shows as an outlier throughout the measurements. We observe that PANAS-X provides a clear distinction between the happiness and sadness stimuli in both measurements. As one can expect from a standardized and validated measurement instrument for affect, PANAS-X can be considered a sound benchmark.

Emotion Recognition (ER) offers a precise recognition of happiness. While it was able to distinguish the stimuli on the sadness measurement as well, this difference was less pronounced.

Facereader (FR) recognized happiness in the face of a participant exposed to a happy stimulus; however, FR does not use the full scale, reporting \( Mdn \sim .3\). The result of the FR sadness measurement is striking in that it only uses \(< .025\) of the scale \([0, \ldots , 1]\). At the same time, the interquartile range is tightly bracketed.

Meta Analysis. We compared the standardized mean differences for measuring either happiness or sadness across happy or sad stimuli. Figure 4 summarizes the outcome of this comparison in a meta-analysis forest plot. The meta-analysis was computed with the R package metafor [27].
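
The computation can be sketched as follows; the effect sizes and standard errors are placeholders rather than the study’s values, and the exact metafor call is our assumption, as the paper only states that the package was used.

```r
# Hedged sketch of a fixed-effect meta-analysis behind Fig. 4 (happiness panel).
# yi/sei values are illustrative placeholders, not the measured effects.
library(metafor)

smd <- data.frame(
  tool = c("PANAS-X", "ER", "FR"),
  yi   = c(1.10, 1.40, 0.60),   # standardized mean differences (happy - sad)
  sei  = c(0.35, 0.50, 0.30)    # corresponding standard errors
)

res <- rma(yi = yi, sei = sei, data = smd, method = "FE", slab = tool)
summary(res)   # combined fixed-effect estimate ("FE Model" line)
forest(res)    # forest plot analogous to Fig. 4a
```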

Let us consider the left-hand-side Fig. 4a, which contains measurements of happiness with the three tools in question. For each measurement tool, we computed the mean difference between the happy video stimulus and the sad video stimulus, standardized over the joint standard deviation of the respective tool’s measurements.

For happiness scores, we see that all tools measure a positive difference (i.e., a higher mean happiness in the case of the happy video vs. the case of the sad video). We observe that FR has the smallest mean difference, which can be interpreted as being least sensitive to measuring happiness differences. We observe further that ER shows the greatest mean difference, but it is also affected by the widest confidence interval. The line “FE Model” below the three measurement tools offers a combined fixed-effect model of all three measurements, which informs us how strongly the happiness induced by the given videos registers in our measurement apparatus. This model is weighted by the standard deviations of the respective tools. Overall, we expect to measure happiness with a standardized mean difference of about 1 SD, which is a large effect.

The right-hand side, Fig. 4b, compares the results for the sadness measurements. All tools measure a negative difference (i.e., a lower mean sadness in the case of the happy video vs. the case of the sad video). We notice that PANAS-X, even though it shows the greatest mean difference, also comes with the widest confidence interval. Again, FR reports a smaller mean difference than ER. Overall, the combined fixed-effect model shows a standardized mean difference of \(-0.72\), also a large effect.

In conclusion, we observe that all three measurement tools picked up happiness and sadness as expected from the video stimuli. Consequently, we know that the video stimuli work for inducing the emotions happiness and sadness, resulting in a medium to large effect size depending on the measurement device. This answers \(\mathsf{RQ1}\) and gives evidence to reject the null hypothesis \(\mathsf{H_{1,0}}\). FR as well as ER worked as psycho-physiological measurement tools, picking up the participant’s emotional state without the interference of a self-report questionnaire. ER obtained the largest effect sizes for measuring happiness as well as sadness. FR obtained the lowest effect size of the field, especially for measuring sadness. From these observations, we can answer \(\mathsf{RQ2}\) in terms of qualitative differences in sensitivity and confidence intervals. However, these differences are not statistically significant, hence we retain the null hypothesis \(\mathsf{H_{2,0}}\).

Fig. 4. Meta-analysis forest plot of measurement tools across induced emotions. The position of the square dot gives the effect size, the diameter of the dot shows the weight, and the whiskers the \(95\%\) confidence interval on it.

3.4 Discussion

We answer the research questions as follows. For \(\mathsf{RQ1}\), we observe that the standardized video stimuli [21] can indeed be employed to induce affect states. Our manipulation check with PANAS-X shows large effect sizes in the differences between video stimuli conditions. Consequently, we can use video stimuli to establish experiment conditions for true experiments in privacy and identity management, such as the main experiment we design in Sect. 4. We thereby recommend replicating existing manipulation apparatuses from affective psychology.

For \(\mathsf {RQ2} \), we observe that the different measurement tools at our disposal differ in sensitivity and confidence intervals even if the evaluation did not turn out to yield statistical significance. While the psycho-physiological measurements (ER and FR) both worked by and large, we observed weaknesses of FR in the measurement of sadness. In addition, the meta and power analyses will need to inform future experiment designs. In particular, FR had the least power to distinguish between happiness and sadness conditions, which directly translates to a higher required sample size.

The three measurement devices exhibit strengths and weaknesses which need to be taken into account in experiment design. PANAS-X has been validated and used frequently in psychology research. However, it is a self-report questionnaire, which takes about 10 min to fill in for the full 60-item version. Consequently, we need to expect that the induced emotional states wear off over the time the questionnaire is answered. Even if the time horizon is set to “at this moment,” the outcomes will not be as immediate as with psycho-physiological measurements. ER works on still images and can thereby be used to measure the momentary affect of the user. However, the decision of which time instant to use for the measurement then becomes crucial. FR operates on video streams and comes with the capability to track affect over time. This, however, comes at the cost of lower sensitivity in distinguishing between conditions.

4 Main Experiment

We took on board a comment from the IFIP workshop, which highlighted the necessity of assessing users’ privacy behavioral intentions whilst measuring privacy concerns. The reason given was that privacy concern questionnaires seem to be based on subjective norms, which are long-term and not easily influenced. This was confirmed by a pretest we conducted on privacy concern surveys and has led to the inclusion of a survey on behavioral intentions. The selected questionnaires are the same as those used by Yang and Wang [33]. A structured abstract of the upcoming experiment is presented below.

RQ3: Impact of Affect on Privacy Concern. The upcoming experiment will investigate to what extent an affect state causes differences in privacy concern. The research hypotheses being tested are:

 

\(\mathsf{H_{3,0}}\): There is no difference in privacy concern scores between cases with induced happiness and induced fear.

\(\mathsf{H_{3,1}}\): Privacy concern scores differ between cases of induced happiness and induced fear.

 

In particular, we hypothesize as a refinement of \(\mathsf {H_{3,1}} \) that users exhibit higher scores on privacy concerns when they feel fear than when they feel happiness. However, with \(\mathsf {H_{3,1}} \) we retain the capacity to evaluate two-tailed tests.

4.1 Method

A sample of \(N=60\) participants will be exposed to standardized video stimuli [10, 22] to induce emotions (happiness and fear) in a within-subjects design. The participants will receive the video stimuli in random order. Privacy concern and behavioral intention scores will be measured and compared across video conditions.

Operationalization. We will induce the independent variable (IV) affect with three levels: (a) neutral baseline, (b) happy, and (c) fearful.

We will check this manipulation against self-report and psycho-physiological measurement tools: (a) the 15-item PANAS-X [30] (joviality and fear) with a designated time horizon “at this moment,” (b) FR (happiness and fear), and (c) ER (happiness and fear). For the manipulation check, a video of the participant’s face will be recorded during the stimulus exposure. The video stream will serve as input for FR; a still image from the face recording will be taken at the end of the corresponding stimulus and used as input for ER. Three minutes will be allocated for filling in the PANAS-X after the stimulus exposure.

We will measure the DV, the user’s behavioral intentions regarding privacy concerns, using the same self-report questionnaires as Yang and Wang [33], because these have been rigorously tested and found reliable [26].

Participants. The sample size of \(N=60\) will be chosen following an a priori power analysis, informed by the pretest in Sect. 3. As one constraint, we have seen a minimum sample size of \(N' =39\) for a within-subject experiment using the Wilcoxon signed-rank test to reach \(95\%\) power across the board. We will therefore choose a larger sample size, because we are preparing for the use of a two-tailed test and are expecting a smaller effect size in the impact of affect on privacy concerns. With \(N=60\) we can expect a sensitivity of .49, a medium effect size.
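
The power figures above can be sanity-checked by simulation; the sketch below is our own (it assumes normally distributed paired differences and reuses the alpha level of .05 and the sensitivity of .49 from the text), not the power analysis actually performed.

```r
# Hedged sketch: simulated power of a two-tailed paired Wilcoxon signed-rank test.
power_wilcoxon <- function(n, d, nsim = 5000, alpha = .05) {
  hits <- replicate(nsim, {
    diffs <- rnorm(n, mean = d, sd = 1)               # paired differences under H1
    wilcox.test(diffs, mu = 0, exact = FALSE)$p.value < alpha
  })
  mean(hits)
}

set.seed(1)
power_wilcoxon(n = 60, d = 0.49)   # should come out roughly near the 95% power targeted above
```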

Procedure. The main experiment is designed to enable a comparison of the influence of affect states on privacy concerns and privacy behavioral intentions. The study will be spread over two days: on the first day, the participants will carry out the first three steps, i.e. (a)–(c); on the second day, they will first be induced to a neutral state and then complete steps (d) and (e). The reason for this is to minimize the carryover effects of the video stimuli and the effect of “questionnaire fatigue.”

The procedure consists of the following steps, where Fig. 5 illustrates the key elements of the experiment design:

  (a) Completion of a pre-task questionnaire on demographics, alcohol/recreational drug use, and the IUIPC and CFIP surveys.

  (b) Neutral state.

    • Induction of a neutral baseline affect state.

    • DV questionnaires on privacy behavioral intentions.

    • Manipulation check with PANAS-X, ER and FR.

  (c) Affect State 1: either happy or fearful, determined by random assignment.

    • Show video stimulus to induce affect.

    • DV questionnaire on privacy behavioral intentions.

    • Manipulation check with PANAS-X, ER and FR.

  (d) Affect State 2: complement of Affect State 1.

    • Show video stimulus to induce affect.

    • DV questionnaire on privacy behavioral intentions.

    • Manipulation check with PANAS-X, ER and FR.

  (e) A debriefing questionnaire, used to check for missed or misreported information and subjective thoughts during the study session.

The analysis compares the DV privacy concern measurements across the main levels of the IV (happy and fearful), using a two-tailed matched-pairs Wilcoxon signed-rank test.

Depending on the properties of the sample (e.g., normality, homogeneity of variances), further analysis of the impact of the IV on privacy concern as the target variable is possible with a univariate analysis of variance (ANOVA/GLM) or linear regression.
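
A sketch of how this analysis plan could be executed is shown below; the data frame, variable names, and placeholder scores are our own illustrative assumptions, not collected data.

```r
# Hedged sketch: primary Wilcoxon comparison and a possible parametric follow-up.
set.seed(2)
main_df <- data.frame(
  participant     = factor(rep(1:60, each = 2)),
  affect          = factor(rep(c("happy", "fearful"), times = 60)),
  privacy_concern = rnorm(120, mean = 3.5, sd = 0.8)    # placeholder scores
)

# Rows are ordered by participant, so the two subsets below are paired.
happy   <- subset(main_df, affect == "happy")$privacy_concern
fearful <- subset(main_df, affect == "fearful")$privacy_concern

# Two-tailed matched-pairs Wilcoxon signed-rank test.
wilcox.test(happy, fearful, paired = TRUE, exact = FALSE)

# Possible follow-up if parametric assumptions hold: repeated-measures ANOVA.
summary(aov(privacy_concern ~ affect + Error(participant/affect), data = main_df))
```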

Fig. 5. Design for the main experiment.

5 Conclusion

While Wakefield [28] and Nyshadham and Castano [18] have explored the relationship between affect, information disclosure, and online privacy concerns, we employ induced emotions and psycho-physiological tools in our empirical study of users’ affect states. To the best of our knowledge, there is currently no such endeavor in usable privacy research.

Our pretest results provide empirical evidence that the specific stimuli films used had a significant influence on users’ happiness and sadness. The pretest showed a successful manipulation of users’ affect states. The pretest results also indicate that ER, FR, and PANAS-X can measure users’ happiness and sadness, where ER is more sensitive, in particular at small sample sizes, due to its large effect size.

Our pretest has therefore systematically evaluated and validated the tools for the upcoming main experiment. It further yields a detailed analysis of the effect sizes and power of different psycho-physiological measurement tools, which is of independent interest for usable privacy research. Other researchers can glean insights from the pretest results and use the tools employed here as validated components to induce or measure affect states. Furthermore, with the design for the main experiment, we offer a template for true experiments that induce affect, control the manipulation tightly, and then measure the impact on privacy concerns. The reported effect sizes and power calculations can form the basis for the rigorous design of future experiments.