1 Introduction

Recent advances in biomedical engineering have enabled the measurement of cognitive and physiological state changes non-invasively, allowing for the quantification of user states that were not measurable even a few years ago. Researchers have used non-invasive brain measurement to successfully measure mental states with great relevance to the human-computer interaction (HCI) domain, such as cognitive workload, deception, trust, engagement, and emotion [14]. Others have used brain measurement to compare the brain activation between groups of people with different traits while they complete the same task. For example, experiments have found that novices and experts show different patterns of brain activation during difficult tasks [5], and it has also been shown that people with high IQ have different patterns of brain activation than lower IQ counterparts [6]. One construct that remains difficult to define and quantify is situation awareness (SA). SA remains an ill-defined buzzword, but Mica Endsley’s three-stage model of SA is the most widely accepted description used in SA research [7]. The model, which is discussed in further detail in the next section, breaks SA into three high-level cognitive stages. High SA aptitude is a necessary quality for many jobs, both in the military and civilian domains.

Measurement of SA is done primarily through surveys and/or observation of task performance. However, surveys are not reliable because they are subjective and must be administered after the task is completed. Similarly, task performance gives only a small piece of the overall information related to SA. No successful measurement of SA in the brain has been reported yet; this research provides a first step in that direction. The accurate, quantitative, real-time assessment of SA aptitude could be used to screen individuals for employment in both the military and civilian domains. We leveraged our expertise using functional near-infrared spectroscopy (fNIRS) to measure brain activity of users while they completed tasks that require high SA aptitude. The fNIRS tool is safe, portable, non-invasive, and can be implemented wirelessly, allowing for use in real-world environments.

The paper proceeds as follows: First, we describe background literature relating to our research goals. Then, we describe the challenges faced, guidelines established, and the hypotheses formulated while conducting this research. We then provide step-wise details of an experiment administered in our lab comparing high-SA and low-SA participants. The paper concludes with a thorough discussion of the next steps that follow naturally from our work to measure SA in the brain.

2 Background and Literature Review

There are several brain measurement devices available in medical and research domains. These devices monitor brain activation by measuring several biological metrics. When a stimulus is presented, neurons in the activated region(s) of the brain fire, causing an electric potential, an increase in cerebral blood flow in that region, an increase in the metabolic rate of oxygen, and an increase in the volume of blood flow. All of these factors contribute to the blood oxygen level dependent (BOLD) signal, which can be detected (in various forms) by a number of brain measurement techniques such as fMRI, fNIRS, and PET. Ideally, a brain measurement device suitable for measuring brain activity in typical HCI activities would be non-invasive and portable. It would have extremely fast temporal resolution (for use in adaptive systems) and it would have high spatial resolution, enabling the localization of brain activation in specific functional brain regions. Electroencephalography (EEG) and fNIRS are the two most popular techniques for non-invasive imaging of the brain. However, when compared to EEG, fNIRS has higher spatial resolution, lower set-up time, and a higher signal-to-noise ratio [2, 8]. We focus on fNIRS, as this is one of the best suited technologies for non-invasive brain measurement during naturalistic HCI.

2.1 Functional Near-Infrared Spectroscopy

FNIRS is a relatively new non-invasive technique that was introduced in the late 1980s [9] to overcome many of the drawbacks of other brain monitoring techniques. The tool, still primarily a research modality, uses light sources in the near infrared wavelength range (650–850 nm) and optical detectors to probe brain activity.

Light source and detection points are defined by means of optical fibers held on the scalp with an optical probe. Deoxygenated (Hb) and oxygenated hemoglobin (HbO) are the main absorbers of near infrared light in tissues during hemodynamic and metabolic changes associated with neural activity in the brain [10]. These changes can be detected by measuring the diffusely reflected light that has probed the brain cortex [10]. FNIRS has been used in recent years to measure a myriad of mental states such as workload, deception, trust, suspicion, frustration, types of multi-tasking, and stress [2]. With its high spatial resolution, fNIRS can also localize brain regions of interest in order to measure more granular cognitive states such as verbal working memory load, spatial working memory load, response inhibition load, visual search load, executive processing, or emotion regulation load [2, 3, 11]. We posit that fNIRS can also be used to measure the cognitive correlates of situation awareness.

2.2 Situation Awareness

The first stage of Endsley’s model focuses on one’s perception of the elements in the current situation. Many SA-demanding tasks involve extremely complex environments where the human operator must be able to perceive relatively small environmental changes while working on the complex task at hand. For example, an air traffic controller who is keeping track of a particular region of airspace must easily be able to perceive the entry of a new plane into that space. Stage 1 of SA is very important in a practical sense, as it can be responsible for many SA-related accidents. A review of SA-related aircraft accidents revealed that SA errors related to perception and attention (Stage 1) constituted the majority of such accidents [18]. If users successfully meet the multitasking-related demands of SA Stage 1 while working with their complex system, they will immediately perceive when their environment has changed. At this point, SA Stages 2 and 3 become important. SA Stage 2 involves one’s comprehension of the elements in the current situation. While comprehending the nature of a given scenario, someone with high SA must also be able to make projections about how the newly evolved situation will affect the future. This projection involves SA Stage 3, the ability to plan and project into the future.

Several techniques have been developed to assess SA. For example, SA has been measured subjectively, using the Situation Awareness Rating Technique (SART) and the Participant Situation Awareness Questionnaire (PSAQ). The questions in Table 1 are representative of the questions presented in the SART and PSAQ surveys [12, 13].

Table 1. Example questions from the SART and PSAQ surveys

One of the most widely used measures of SA is provided by Endsley’s Situation Awareness Global Assessment Technique (SAGAT) [14]. With SAGAT, task simulations are frozen at randomly-selected times, and the operators quickly answer questions about their current perceptions of the situation. The questions correspond to their situation awareness requirements as determined from the results of an SA requirements analysis. Operator perceptions are then compared to the real situation, based on simulation computer databases, to provide a qualitative measure of the operator’s SA.

Although Endsley’s model and the popular SA measurement techniques of SAGAT, SART, and PSAQ are widely accepted in the SA research domain, they all view SA on a very abstract level, and they involve distracting the subject from the task at hand in order to complete the questionnaire(s). From a human factors research point of view, it is acceptable to speak about user states such as SA or mental workload in high-level terms, and survey techniques such as SART, SAGAT, and PSAQ may suffice to measure these states abstractly. However, if we are to measure these states objectively and in real time, there is a need to understand the cognitive and physiological correlates that are related to SA. We use Endsley’s SA model throughout this next section as a general reference while we describe the low-level cognitive resources that we posit are involved in maintaining high SA.

2.3 Cognitive Correlates of Situation Awareness

Visual perception and search involves searching for items within a set of distracter items, and being able to notice subtle changes in a dynamic environment, which is directly related to Stage 1 in Endsley’s SA model. Visual search has been explored extensively in the neuroscience and psychology literature [15]. It is well known that people differ in their aptitude at conducting visual search tasks. We hypothesize that: H1: A person with superior SA will have a higher aptitude at visual perception and scanning than their lower SA counterparts.

A fundamental adaptive challenge for humans is that of regulating our emotions in order to successfully complete a given task. To meet this challenge, we have the ability to exert control over the emotions we experience. Emotion regulation entails controlling or changing one’s emotions through both extrinsic means (managing overt behaviors and social situations) and intrinsic means (recruiting cognitive and neurophysiological systems) [16]. It is well known that stress can cause catastrophic breakdowns in the ability to concentrate, pay attention to a task, or make complex decisions. Emotion regulation is a component for maintaining high SA during all three stages of Endsley’s SA model. Emotion regulation has been successfully measured with fNIRS [3]. Thus, we hypothesize that: H2: A person with superior SA will have a higher aptitude at emotion regulation than their lower SA counterparts, as measured by fNIRS.

SA Stage 3 considers one’s ability to take new information and to project how that information or event will affect the future. Prospection is the act of thinking about the future. The term prospection is very similar to the term projection, which Endsley uses in her third stage of SA. When we think about the future, we use a common functional brain region that includes frontal and medial temporal systems that are associated with planning and episodic memory [17]. As with all of the other brain functions mentioned in this paper, some people are better at prospection than others. Thus, we hypothesize that: H3: A person with superior SA will have a higher activation in brain regions responsible for prospection than their lower SA counterparts, as measured by fNIRS.

There is an interesting body of literature focusing on the neural efficiency debate that is of direct relevance to SA. The basic premise behind the neural efficiency approach is that individuals with higher IQs use less brain activation in order to complete the same tasks as their less apt counterparts. However, recent research has shown that the neural efficiency hypothesis is heavily dependent on task difficulty:

  1. When the underlying task is easy or of a moderate level of difficulty, the neural efficiency approach is correct; high IQ people have less, and more focused, brain activation, while completing the same task as their lower IQ counterparts.

  2. When the underlying task is very difficult and/or complex, the opposite of the neural efficiency hypothesis occurs: high IQ individuals show more brain activation than their lower IQ counterparts while doing the same task.

In both scenarios above, the high IQ groups outperform their lower IQ counterparts, but it is interesting to note that the amount of brain activity is directly related to the difficulty of the task. People with high SA aptitude would likely show brain activation that clearly fits with the neural efficiency hypothesis. Since many tasks that require high SA are complex, we hypothesize that: H4: When completing a complex multi-tasking scenario, a person with superior SA will have more brain activation than their lower SA counterparts, as measured with fNIRS.

3 Experiment

Before conducting our experiment, we conducted a set of exploratory pilot studies that were designed to help us determine the best testbed that would be complex enough to enable us to differentiate between participants with high and average SA aptitude. We worked with the Research Environment for Supervisory Control of Heterogeneous Unmanned Vehicles (RESCHU) [18], the Warship Commander Task [19], and the Air Force’s updated version of the Multi-Attribute Task Battery (MATB) [20]. In the end, we decided that the MATB testbed provided the ideal environment that would require aptitude at each of Endsley’s three SA stages in order to achieve high accuracy.

We also tested the PSAQ and SART surveys (described in the prior section) during pilot testing of our testbeds. During exit interviews, our pilot participants reported that these subjective surveys were difficult to understand. The sampling of PSAQ and SART questions from Table 1 supports this finding; the questions shown in Table 1 could be difficult for subjects to respond to accurately. One pilot participant noted that a participant can think that he or she understands the situation well, but this misconception may lead to high amounts of user error. These subjective surveys do not account for users who are not fully aware of the situation. The popular saying, “You don’t know what you don’t know”, rings very true in this situation. We purchased the SAGAT assessment tool and found that, within the context of our task, the SAGAT results were directly related to participants’ performance data. Performance data has been shown to directly relate to SA. One advantage of using performance data is that users do not have to be interrupted during the task, as must be done with SAGAT. Although performance is usually directly related to SA, Endsley found that high SA does not always lead to good performance, and low SA does not always result in poor performance [21]. However, in the context of the AF_MATB, our pilot data suggested that performance data did directly relate to SA (as assessed by SAGAT) as long as all participants had an equal amount of training time with the system.

With the information gained from our pilot testing, we designed an experiment to help us explore the four hypotheses listed above. The primary goal of the experiment was to use fNIRS to measure differences in the brains between people with high and average SA aptitude.

3.1 Experiment Testbed

Our task involved a complex multi-tasking scenario using a variation of the Multi-Attribute Task Battery (MATB) [20, 22]. This difficult task made it imperative that high achieving users not become overloaded or overly stressed, forcing them to prioritize their actions based on the most time sensitive or important needs of the task at the time. We used the Air Force’s updated version of the Multi-Attribute Task Battery (AF_MATB) [20], and we chose a difficulty level (based on pilot testing) that required a good deal of mental effort and multi-tasking.

With the difficulty of the task and the high level of multi-tasking required, the task was nearly impossible to complete perfectly, and our pilot tests showed that all subjects had to remain extremely engaged during the entire task to receive an adequate performance score. Like the original version, AF_MATB consists of six windows which provide information about four different subtasks (see Fig. 1); these subtasks include: System Monitoring, Communications, Resource Management, and Tracking. The last two windows, which contain Scheduling and Pump Status information, are resources that the user can use to improve performance during the task. The AF_MATB keeps track of, and outputs, a thorough report of each subject’s performance data on the various subtasks. In our pilot studies, we selected a MATB difficulty level that would result in the majority, if not all, of the subjects having difficulty executing every task perfectly. They would have to multi-task, prioritize, and accept that while their performance would likely be imperfect, they must keep from becoming frustrated in order to complete the demanding MATB scenario.

Fig. 1. A screen shot of the MATB testbed

3.2 Experimental Protocol

Eight subjects completed this experiment (3 female). After providing informed consent, subjects completed a Trail Making Test [23], as well as a visual perception and scanning aptitude test called the Finding A’s task from the Educational Testing Service [24]. The Trail Making Test is a neuropsychological test of visual attention and task switching. It consists of two parts in which the subject is instructed to connect a set of 25 dots as fast as possible while still maintaining accuracy. Subjects then spent 45 min learning about AF_MATB. We wanted to ensure that they would fully comprehend the many rules and goals included in the experiment task before beginning the experiment. Subjects then completed two experimental conditions in each of 5 trials. In each trial, the subject completed 2 min of working with the AF_MATB (MATB condition), and then rested for 2 min while making mouse movements similar to those caused by the AF_MATB (control condition).

We collected fNIRS data using the Hitachi ETG-4000 device. Subjects wore a cap with 52 channels that take measurements twice per second. As fNIRS equipment is sensitive to movement, subjects were placed at a comfortable distance from the keyboard and mouse and were asked to minimize movement throughout the experiment. Prior research has shown that this minimal movement does not significantly corrupt the fNIRS signal with motion artifacts [24]. We also gathered performance data throughout the experiments. Based on our pilot studies with the SAGAT, SART, and PSAQ surveys, we chose to use performance data as our primary metric to assess SA. This objective metric, which has been shown to directly relate to SA, could be measured in real-time while users worked with the MATB.

4 Results and Analysis

We gathered performance data for each of the 6 multi-tasks represented in the AF_MATB. Each of the performance outputs represented an amount of error on that task: therefore, lower numbers indicate better performance. We z-score normalized each of these measures, and then summed them up for each individual, resulting in the total score column in Table 2. A lower score is better in this case (representing less error across all 6 sub-tasks). We sorted our data by the subjects’ total score and selected the top three performers (subjects 7, 6, and 4) as the high SA group. The rest of the subjects were included in our average SA group.

Table 2. Performance results on MATB
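The scoring procedure described above (z-score each error measure across subjects, sum the z-scores per subject, then take the lowest totals as the high SA group) can be sketched as follows. This is a minimal illustration, not the analysis script used in the study: the function names, the use of the population standard deviation, and the example error values are all assumptions made for the sketch.

```python
from statistics import mean, pstdev

def zscores(values):
    """Return z-scores of a list. Assumes population standard deviation;
    the paper does not say which variant was used."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]

def total_scores(errors_by_subtask):
    """Sum z-scored error across subtasks for each subject.

    errors_by_subtask maps a subtask name to a list of error values,
    one per subject, in the same subject order for every subtask.
    """
    columns = [zscores(col) for col in errors_by_subtask.values()]
    return [sum(per_subject) for per_subject in zip(*columns)]

def split_groups(totals, n_high=3):
    """Return (high_sa, average_sa) as lists of 1-based subject ids.
    Lower total error ranks first, so the n_high lowest totals form
    the high SA group."""
    ranked = sorted(range(len(totals)), key=lambda i: totals[i])
    high = [i + 1 for i in ranked[:n_high]]
    average = sorted(i + 1 for i in ranked[n_high:])
    return high, average

# Hypothetical error values for 4 subjects on 2 subtasks (not study data).
example = {
    "tracking": [10.0, 20.0, 30.0, 40.0],
    "monitoring": [1.0, 4.0, 2.0, 3.0],
}
totals = total_scores(example)
high, average = split_groups(totals, n_high=1)
```

Because each measure is centered before summing, the totals across subjects sum to zero, so a negative total indicates less error than the group mean.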

We conducted a Pearson product-moment correlation on the visual scanning data and the total score, and on the Trail Making Test data and the total score. There was not a strong correlation between visual scanning test scores (the Finding A’s task) and the total score, but there was a strong, though marginally non-significant, relationship (r = −0.69, p = .058) between subjects’ scores on the Trail Making Test (neuropsychological test of visual attention and task switching) and their total score. Thus, it seems that the same high-level resources that caused one to be superior at the Trail Making Test were recruited to help them during the AF_MATB task. With the contradictory results from the Trail Making Test and the visual scanning test, we only had partial support for H1: A person with superior SA will have a higher aptitude at visual perception and scanning than their lower SA counterparts.
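For readers who want to check the reported correlation, the sketch below computes a Pearson coefficient from paired scores and converts a coefficient into the t-statistic used to test it against zero. The function names are illustrative assumptions, and only the reported summary values (r = −0.69, eight subjects) are used, since the per-subject scores are not reproduced here.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two lists of equal length with at least 3 items")
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def t_statistic(r, n):
    """t-statistic for testing r against zero, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))

# Reported summary values from the text: r = -0.69 with 8 subjects.
t_reported = t_statistic(-0.69, 8)
```

With six degrees of freedom, the resulting t of roughly −2.34 falls just short of the two-tailed .05 critical value of 2.447, which is consistent with a p-value slightly above .05.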

We used the NIRS_SPM Matlab suite of tools to analyze the fNIRS data [25]. We first converted our raw light intensity data into relative changes in oxygenated hemoglobin (HbO) concentration. We then preprocessed all data using a band-pass filter (between 0.01 and 0.1 Hz) to remove noise and motion artifacts. We used a general linear model (GLM) to fit our fNIRS data. Because the GLM analysis relies on the temporal variational pattern of signals, it is robust to differential path length factor variation, optical scattering, or poor contact on the head. By incorporating the GLM with the p-value calculation, NIRS-SPM not only enables calculation of activation maps of HbO but also allows for spatial localization. We used Tsuzuki’s 3D-digitizer-free method for the virtual registration of NIRS channels onto the stereotactic brain coordinate system. Essentially, this method allows us to place a virtual optode holder on the scalp by registering optodes and channels onto reference brains. Assuming that the fNIRS probe is reproducibly set across subjects, the virtual registration can yield as accurate a spatial estimate as the probabilistic registration method. Please refer to [26] for further information. Based on performance data, participants were placed into two groups: high SA and average SA. Statistical tests were run on the data from each group to determine significant regions of activation (p < .05) for each group when participants were completing MATB, as compared to the control condition. NIRS_SPM results are shown in Fig. 2.
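To make the GLM step concrete, the following sketch fits a single channel's HbO time course to a task-versus-control boxcar built from the protocol described earlier (2 samples per second, 2 min per condition, 5 trials). It is a deliberately simplified stand-in and not the NIRS-SPM procedure: a full analysis would typically convolve the boxcar with a hemodynamic response function and account for temporal autocorrelation, both of which are omitted here. The function names and the synthetic signal are assumptions for illustration only.

```python
import math

SAMPLE_RATE_HZ = 2    # from the text: two measurements per second
BLOCK_SECONDS = 120   # from the text: 2 min per condition
N_TRIALS = 5          # from the text: 5 trials

def boxcar_regressor():
    """1.0 during MATB blocks and 0.0 during control blocks."""
    per_block = SAMPLE_RATE_HZ * BLOCK_SECONDS
    one_trial = [1.0] * per_block + [0.0] * per_block
    return one_trial * N_TRIALS

def fit_glm(signal, regressor):
    """Ordinary least squares fit of signal = b0 + b1 * regressor + noise.

    Returns (b0, b1, t), where t is the naive t-statistic for b1.
    The t-value ignores autocorrelation, so it is illustrative only.
    """
    n = len(signal)
    mx = sum(regressor) / n
    my = sum(signal) / n
    sxx = sum((x - mx) ** 2 for x in regressor)
    sxy = sum((x - mx) * (y - my) for x, y in zip(regressor, signal))
    b1 = sxy / sxx
    b0 = my - b1 * mx
    rss = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(regressor, signal))
    se_b1 = math.sqrt(rss / (n - 2) / sxx)
    t = b1 / se_b1 if se_b1 > 0 else math.inf
    return b0, b1, t

# Synthetic channel (not study data): baseline 0.1, task effect 0.5,
# plus a small deterministic oscillation standing in for noise.
regressor = boxcar_regressor()
signal = [0.1 + 0.5 * x + 0.05 * math.sin(0.7 * i)
          for i, x in enumerate(regressor)]
b0, b1, t = fit_glm(signal, regressor)
```

In this toy setting, b1 estimates how much higher the signal is during MATB blocks than during control blocks; a per-channel map of such estimates and their test statistics is the kind of output that is thresholded at p < .05.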

Fig. 2. Results from the NIRS_SPM toolkit showing significant areas of activation at p < .05

It appears that the high SA group had more brain activation than the average SA group during the MATB tasks. All participants were undergoing a very challenging multi-tasking scenario that required constant vigilance, information processing, and projection of future events. Our results are in line with the recent neural efficiency research that suggests that high aptitude (in this case, high SA) subjects elicit more brain activity than their peers when undergoing highly complex tasks. The brain activation shown in the frontal view indicates that the high SA group had more brain activation in areas directly responsible for cognitive load processing (i.e., concentrating and thinking hard during the task) than the average SA group. These regions correspond with executive functioning, memory, and planning and coordinating activities. This supports H4: When completing a complex multi-tasking scenario, a person with superior SA will have more brain activation than their lower SA counterparts, as measured with fNIRS, which is in line with the neural efficiency research.

The left side of the high SA group’s brain shows activation in Brodmann’s area 9, or the dorsolateral prefrontal cortex (DLPFC). The DLPFC is recruited during workload-inducing tasks, and in particular, it is involved in emotion regulation. It is possible that the high SA group may have been expending cognitive effort to regulate their emotions, keeping them from becoming stressed by the demands of the MATB task. In stark contrast to the high SA group, there was no significant activation on the left side of the brain of the average SA group. This supports H2: A person with superior SA will have a higher aptitude at emotion regulation than their lower SA counterparts, as measured by fNIRS.

On the right side of the brain, both the high SA group and the average SA group had activation in the Superior Frontal Sulcus, which is heavily involved in working memory, but the high SA group had a good deal more activation than the average SA group in that region. The high SA group also had activation in the Inferior Frontal Sulcus and Supramarginal Gyrus. The Supramarginal Gyrus is involved in language processing and perception. The activation on the right side of the brain of the high SA group also has some overlap with regions responsible for prospection. As described previously, prospection involves thinking into the future about upcoming events and situations. It is interesting to note that high SA users may have been predicting future events in the task, which is a key element of Endsley’s SA Stage 3. This is in line with prior research, discussed in the literature review, that people with higher IQs spend more cognitive resources ‘planning ahead’ than their average IQ counterparts when doing the same task [17]. This supports H3: A person with superior SA will have a higher activation in brain regions responsible for prospection than their lower SA counterparts, as measured by fNIRS.

5 Study Limitations

One limitation of this study is that we used performance metrics to get our ‘ground truth’ values of SA. Although performance and SA often have a positive correlation, they do not always have a direct mapping. However, we did not choose performance as our ‘ground truth’ lightly; we evaluated SAGAT, PSAQ, and the SART SA assessment techniques before deciding to use objective performance measures as our ‘ground truth’. Future work should further explore the relationship between SA assessments like SAGAT, PSAQ, and SART with performance data.

6 Conclusion

Attempts have been made to evaluate people’s SA through subjective surveys and assessment of speed and accuracy data acquired during target tasks that require SA. However, it is well known in the SA domain that more systematic measurement is necessary to test theories of SA, to screen personnel for SA, and to find ways for individuals to maximize their SA in a variety of real-world contexts [7]. While preliminary attempts have been made to measure the psychophysiological correlates of SA, no research has looked deeply into this issue, even though the benefits of such research have been recognized [27]. Recent advances in biomedical engineering have enabled us to measure people’s cognitive and physiological state changes non-invasively, allowing us to quantify user states that were not measurable even a few years ago. While there are many non-invasive sensors available to take measurements of users during naturalistic HCI, fNIRS is a relatively new brain measurement device that holds great potential in the HCI domain [2, 3, 10, 28]. The fNIRS tool is safe, portable, and non-invasive, enabling use in real-world environments. As with many high-level user states (such as workload, trust, and flow), SA remains an oft-misunderstood buzz term. Although Mica Endsley’s well-respected model of SA [7] provides us with a useful guide for our work, we found that viewing SA in the brain requires a more precise view of SA than that described in past research. This research provides a first step in that direction.

In this paper, we described an experiment conducted with a 52-channel fNIRS device to measure differences in the brains between people with high and average SA aptitude. Our task involved a complex multi-tasking scenario using a variation of the Multi-Attribute Task Battery (MATB). This difficult task made it imperative that high achieving users not become overloaded or overly stressed, forcing them to prioritize their actions based on the most important needs of the task at the time. Our results suggest that, when completing the MATB, people with superior SA have more brain activation than their lower SA counterparts, which is in line with neural efficiency research. Also, people with superior SA have a higher aptitude at emotion regulation than their lower SA counterparts, as measured by fNIRS. We also found that people with superior SA have a higher activation in brain regions responsible for prospection than their lower SA counterparts, as measured by fNIRS. Lastly, we had partial support that people with superior SA have a higher aptitude at visual perception and scanning than their lower SA counterparts. Many military and civilian jobs require high SA aptitude. Screening for SA is already a part of many of these jobs, but these screenings are based largely on survey and performance data. An objective measure of SA, based on cognitive data during completion of complex tasks, would provide an unbiased way to get these measurements in real time. Our research makes measurable steps toward this goal.