Introduction

The ability to recognize facial expressions is crucial to adaptive social interaction. Recognizing another's expression of rage or fear, for example, could allow for a rapid response to escape danger. It has been shown that emotionally positive facial expressions are recognized substantially faster than emotionally negative facial expressions—the positive classification advantage (PCA). This effect is evident in the recognition of happiness, which is faster than the recognition of sadness (e.g., Crews & Harrison, 1994; Hanaya, 1992; Leppänen & Hietanen, 2004), anger (e.g., Billings, Harrison, & Alden, 1993; Hugdahl, Iversen, & Johnsen, 1993), disgust (Ducci, 1981; Stalans & Wedding, 1985), and emotional neutrality (Hugdahl et al., 1993; but see Leppänen & Hietanen, 2004). Although many electrophysiological and neuroimaging studies have investigated emotional face processing, the PCA itself has received surprisingly little attention.

Several previous studies explored the neural mechanisms of explicit facial expression categorization by recording event-related potentials (ERPs; e.g., Lynn & Salisbury, 2008; for a review, see Eimer & Holmes, 2007) and found that, as compared with neutral faces, fearful faces elicited an enhanced positivity starting at 160 ms poststimulus (Holmes, Kiss, & Eimer, 2005) and that facial expressions elicited significant differences in N170 amplitude (Lynn & Salisbury, 2008). Although facial expression classification modulated these ERP components, the tasks used in those studies did not tap the PCA: Participants responded whenever the fearful or neutral face presented on the preceding trial was shown again on the current trial (Holmes et al., 2005) or detected neutral facial expressions among five emotional expressions (happy, sad, fearful, angry, and disgusted; Lynn & Salisbury, 2008). Particularly relevant to the present study, Leppänen, Tenhunen, and Hietanen (2003) recorded response selection (onset of the lateralized readiness potential [LRP]) and response execution (LRP onset to response onset) times for positive (happy) and negative (disgusted/angry) faces and found shorter response selection times for positive than for negative faces, but no difference in response execution times, indicating that the PCA arises primarily at premotoric processing stages. The aim of the present study was to further map the time course of the PCA. To this end, we examined the sensitivity of a sequence of ERP components to facial expression while the participants categorized the stimuli according to their expressions.

The earliest component that has been consistently associated with faces is the N170, a negative ERP occurring between 140 and 180 ms after stimulus onset at occipito-temporal electrodes that is reliably larger for faces than for other stimulus categories (Bentin, Allison, Puce, Perez, & McCarthy, 1996). On the basis of data showing that the N170 is not sensitive to face identity (Anaki, Zion-Golumbic, & Bentin, 2007; Bentin & Deouell, 2000; Eimer, 2000), is larger (and delayed) for face components (particularly eyes) than for full faces (Bentin et al., 1996; Itier, Latinus, & Taylor, 2006), is larger (and delayed) for inverted faces (Bentin et al., 1996; Rossion & Gauthier, 2002), and is equally large for scrambled and normally configured faces (Zion-Golumbic & Bentin, 2007), it has been suggested that the N170 is triggered by the detection of global face structures, as well as of other face-related information in the visual field. Importantly, several studies found that the N170 component was entirely unaffected by any of the basic emotional expressions (e.g., Ashley, Vuilleumier, & Swick, 2003; Eimer & Holmes, 2002, 2007), implying that expression processing of faces occurs at a postperceptual stage. More recently, however, growing evidence has suggested that the N170 can be modulated by facial emotion; for example, neutral expressions elicit smaller N170 amplitudes than do emotional faces (e.g., Blau, Maurer, Tottenham, & McCandliss, 2007; Caharel, Courtay, Bernard, Lalonde, & Rebaï, 2005), and happy faces elicit smaller amplitudes than do other emotions (e.g., Caharel et al., 2005). These discrepancies might be related to differences in design and stimuli, since early visual evoked potentials are very sensitive to alterations in low-level perceptual features, which poses a problem for many studies. Importantly, in the above studies on emotional modulation of the N170, the participants did not perform explicit, speeded categorization of faces by expression and thus showed no PCA; hence, whether the PCA arises at the early processing stage indexed by the N170 remains unclear.

Another ERP component that, although not specifically associated with face perception, is modulated by face categorization is the posterior N2, which has a temporo-occipital scalp distribution. The posterior N2 is a negative deflection peaking between 200 and 300 ms that is modulated by factors affecting the categorization of visual stimuli, such as the amount of mutual information contained in an image patch, determined by the similarity between the fragment and the image (Harel, Ullman, Epshtein, & Bentin, 2007), or the difficulty of distinguishing between the categories (Philiastides, Ratcliff, & Sajda, 2006; Philiastides & Sajda, 2006). Additionally, a recent study suggested that processing within the 200- to 300-ms latency range—that is, the posterior N2—reflects early stimulus discrimination and response selection processes (Di Russo, Taddei, Apnile, & Spinelli, 2006). Evidence connecting this component with facial emotion processing comes from studies demonstrating its sensitivity to the emotional nature of faces. For example, as compared with the faces of a friend and of an unknown person, the face of a beloved person decreased the N2 (Langeslag, Jansma, Franken, & Van Strien, 2007). In addition, the N2 was smaller in response to faces with emotional expressions than to neutral faces (Eimer & Holmes, 2007). In contrast, Balconi and Pozzoli (2003) reported an emotion-specific posterior N2 to emotional as compared with neutral faces, with increased N2 amplitudes in response to angry, fearful, and surprised faces. Moreover, as compared with happy, neutral, and fearful faces, disgusted faces elicited a unique negative component at ~300 ms—that is, the posterior N2 (Ashley et al., 2003). These mixed results could be due to methodological differences among the studies, precluding strong inferences. More importantly, all of these studies involved implicit or explicit valence evaluation of facial emotion rather than speeded expression categorization per se; hence, whether the modulation of the posterior N2 by emotional expression is a source of the PCA remains an open question.

Finally, the third ERP component investigated in this study in conjunction with categorizing the expressions of faces is the P3, a generic name for a variety of relatively late positive components with a centro-parietal or centro-frontal midline distribution (Donchin, 1981; Polich, 2007). Since its initial discovery in response to task-relevant oddball (infrequent) stimuli (Sutton, Braren, Zubin, & John, 1965) and the finding that it is sensitive to the subjective probability assigned to the occurrence of the eliciting event (Duncan-Johnson & Donchin, 1977), many models of the cognitive mechanism(s) manifested by this neural event have been proposed (Donchin, 1987; Donchin & Coles, 1988, 1998; Verleger, Jaskowski, & Wascher, 2005). Notwithstanding these controversies, it is widely agreed that P3 latency reflects the duration of stimulus evaluation processes when a two-choice reaction time (RT) is required (e.g., McCarthy & Donchin, 1981) and that, other factors being kept constant, its amplitude is largely determined by stimulus relevance (Gray, Ambady, Lowenthal, & Deldin, 2004), by the amount of attention allocated to the stimulus (Kok, 2001), and by task complexity (Johnson, 1986). To date, converging evidence has indicated that the valence of emotional faces can modulate the P3 when arousal level is controlled (e.g., Conroy & Polich, 2007; Rozenkrants & Polich, 2008). In particular, pleasant face pictures elicit larger P3 amplitudes than do unpleasant pictures when the pictures are task relevant, such as when valence evaluation ensures that the participant is attending to the pictures (e.g., Carretié, Hinojosa, Albert, & Mercado, 2006; Delplanque, Lavoie, Hot, Silvert, & Sequeira, 2004). In the valence evaluation tasks of these studies, however, the participants rated the valence of each picture after a long stimulus onset asynchrony (e.g., longer than 1 s), which differs from speeded expression categorization (e.g., Schupp, Öhman, Junghöfer, Weike, Stockburger, & Hamm, 2004). Particularly relevant to the present study, using an oddball valence categorization task, Delplanque and colleagues found that the P3 was enhanced in response to pleasant target pictures as compared with unpleasant ones (Delplanque, Silvert, Hot, Rigoulot, & Sequeira, 2006) and attributed this valence effect to the context-updating process proposed by Donchin (1981). However, the mean RTs in that study did not differ across the three target conditions (unpleasant, neutral, and pleasant), and a selective categorization task performed against a background of frequent nontarget stimuli likely engages mechanisms distinct from those of simple two-choice classification. We therefore predicted that the latency and/or the amplitude of the P3 would correlate with the PCA.

In the present study, we used schematic face stimuli. Several studies have shown that schematic facial expressions elicit emotional effects comparable to those of photographic facial expressions (Babiloni et al., 2010; Boucsein, Schaefer, Sokolov, Schröder, & Furedy, 2001; Eger, Jedynak, Iwaki, & Skrandies, 2003; Krombholz, Schaefer, & Boucsein, 2007). Schematic faces were well suited to our purposes because they allowed us to fully control low-level physical features, to exclude identity-related information such as gender and race, and to minimize confounding effects of general arousal rather than of valence per se. In addition, a previous study indicated that even a schematic face made from simple line fragments triggers the face-sensitive N170 (Sagiv & Bentin, 2001). Importantly, significantly increased fMRI signal has been found in the amygdala, hippocampus, and prefrontal cortex in response to emotional versus neutral schematic faces, suggesting that schematic faces may be useful for studying brain responses to emotional stimuli because of their simplicity relative to human faces (Wright, Martis, Shin, Fischer, & Rauch, 2002).

Method

Participants

Eighteen healthy young adults participated in the study (10 female; 20–25 years of age; mean, 22.6 years). All participants were right-handed, had normal or corrected-to-normal visual acuity, and had no history of neurological or psychiatric disorder. They received payment for their participation and gave written informed consent before the experiment.

Stimuli

To discourage low-level processing of facial features, as well as boredom from excessive repetition of a single model, each facial expression category consisted of 18 different schematic face models, created by varying the distances among the facial features and the shapes of the features, particularly the mouths (Fig. 1). All stimuli were presented at the center of a cathode ray tube monitor and were viewed from a distance of 100 cm, subtending a visual angle of approximately 7.27° × 6.06°.
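
For readers who wish to check the viewing geometry, the reported visual angle can be converted back into a physical stimulus size; the short Python sketch below is ours (not from the article) and simply applies the exact tangent relation at the 100-cm viewing distance.

```python
# Convert a visual angle (deg) at a given viewing distance (cm) into the
# physical size (cm) of the stimulus; purely illustrative, not from the study.
import math

def size_from_angle(angle_deg, distance_cm=100.0):
    """Size subtending `angle_deg` at `distance_cm` (exact tangent formula)."""
    return 2 * distance_cm * math.tan(math.radians(angle_deg) / 2)

print(size_from_angle(7.27), size_from_angle(6.06))  # ~12.7 cm x ~10.6 cm
```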

Fig. 1 Example stimuli of schematic facial expressions

Procedure

Following electrode application, the participants were seated in a dimly lit and sound-attenuated cabin. They were instructed to classify each face by the expression it represented (happy, neutral, or sad) and to respond by pressing correspondingly labeled keyboard buttons with the left index finger (Z key), right index finger (N key), or right middle finger (M key). Speed and accuracy were equally emphasized. All 324 stimuli (3 facial expressions × 108 faces) were presented in random order across three blocks of 108 stimuli each, separated by short breaks, and the assignment of expressions to response buttons (happy–neutral–sad / sad–happy–neutral / neutral–sad–happy) was counterbalanced across participants. Each face was presented for 300 ms, with an intertrial interval varying randomly between 600 and 800 ms after the response. The participants first completed a practice sequence of 18 stimuli (6 of each type, equally representing the three facial expressions); these stimuli were not used in the main experiment, which lasted approximately 15 min.
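
To make the trial structure concrete, the following Python sketch generates a randomized, blocked stimulus sequence and the counterbalanced button labelings described above. It is an illustration under our own naming assumptions, not the authors' presentation software (which is not reported).

```python
# Illustrative trial-list construction (not the authors' code). Model IDs
# and the uniform ITI distribution are assumptions for the sketch.
import random

EXPRESSIONS = ["happy", "neutral", "sad"]

# 3 expressions x 108 faces = 324 trials, shuffled and split into 3 blocks.
trials = [(expr, model) for expr in EXPRESSIONS for model in range(108)]
random.shuffle(trials)
blocks = [trials[i:i + 108] for i in range(0, len(trials), 108)]

# The three counterbalanced button-label orders (Z, N, M keys).
LABEL_ORDERS = [("happy", "neutral", "sad"),
                ("sad", "happy", "neutral"),
                ("neutral", "sad", "happy")]

def next_iti_seconds():
    """Intertrial interval: 600-800 ms, starting after the response."""
    return random.uniform(0.6, 0.8)
```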

EEG recording

An electroencephalogram (EEG) was recorded continuously using an electrode cap with 64 sintered Ag/AgCl electrodes mounted according to the extended international 10–20 system and referenced to the tip of the nose. An electrooculogram (EOG) was recorded via two additional pairs of electrodes, one pair placed above and below the left eye and the other placed at the external canthi of the two eyes. The EEG and EOG were amplified and digitized with a Neuroscan SynAmps2 amplifier (Compumedics Ltd., Australia), using a band pass of 0.05–100 Hz and a sampling rate of 500 Hz. Electrode impedance was kept below 5 kΩ throughout the experiment.

After EOG artifact correction (Semlitsch, Anderer, Schuster, & Presslich, 1986), the EEG was segmented into epochs from 200 ms before to 800 ms after stimulus onset, and the 200-ms prestimulus baseline was subtracted from each epoch. Epochs with an incorrect response or with peak-to-peak deflections exceeding ±100 μV were excluded from averaging. Accepted trials were averaged separately for each facial expression, and the averaged waveforms were digitally low-pass filtered at 30 Hz (24 dB/octave) to reduce high-frequency noise.
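
For illustration, the epoching, artifact rejection, averaging, and filtering steps could be reproduced along the following lines in MNE-Python. This is a sketch under assumed file names and trigger codes, not the pipeline the authors used (the original data were acquired with Neuroscan hardware, and the filter implementation here differs from the reported 24 dB/octave characteristic).

```python
# Sketch of the preprocessing described above using MNE-Python. The file
# name and trigger codes are hypothetical; dropping incorrect-response
# trials (also required by the text) is omitted here for brevity.
import mne

raw = mne.io.read_raw_cnt("subject01.cnt", preload=True)  # hypothetical file
events, _ = mne.events_from_annotations(raw)
event_id = {"happy": 1, "neutral": 2, "sad": 3}  # hypothetical codes

# Epochs from -200 to 800 ms, baseline-corrected over the 200-ms prestimulus
# window; reject epochs whose peak-to-peak amplitude exceeds 100 microvolts.
epochs = mne.Epochs(raw, events, event_id, tmin=-0.2, tmax=0.8,
                    baseline=(-0.2, 0.0), reject=dict(eeg=100e-6),
                    preload=True)

# Average accepted trials per expression, then low-pass filter at 30 Hz.
evokeds = {cond: epochs[cond].average().filter(l_freq=None, h_freq=30.0)
           for cond in event_id}
```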

Data analysis

Accuracy rates and RTs (measured from stimulus onset) were recorded and analyzed with one-way repeated measures ANOVAs, with expression (happy, neutral, and sad) as the within-subjects factor.
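
As an illustration of this analysis, a one-way repeated measures ANOVA on per-subject condition means might be run as follows. The statistics package actually used is not reported, and the file and column names are hypothetical.

```python
# Sketch of the one-way repeated measures ANOVA on RTs (the same model
# applies to accuracy). Assumes a long-format table with one row per
# subject x expression cell.
import pandas as pd
from statsmodels.stats.anova import AnovaRM

df = pd.read_csv("behavior_means.csv")  # columns: subject, expression, rt
res = AnovaRM(df, depvar="rt", subject="subject",
              within=["expression"]).fit()
print(res)  # F, df, and p for the main effect of expression
```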

The amplitudes and latencies of each of the ERP components were measured and analyzed with similar repeated measures ANOVAs including two additional within-subjects factors: hemisphere (left, right) and site (see below). Visual inspection of the grand average waveforms revealed a posterior negativity peaking at around 270 ms, following the N170 component—that is, the posterior N2 (Fig. 2). Peak amplitudes and latencies were therefore measured as the most negative peaks within the 120- to 220-ms window for the N170 and within the 220- to 320-ms window for the posterior N2. In line with most previous studies of the N170, we analyzed this component at the lateral temporo-occipital sites P8, PO8, and CB2 over the right hemi-scalp and at the homologous sites over the left hemi-scalp; the posterior N2 was analyzed at the same locations. Visual inspection of the grand average waveforms (Fig. 3) showed different patterns for the P3 component as a function of electrode site. The P3 was therefore analyzed at midline sites (Fz, Cz, and Pz) and at lateral posterior sites (left: P7, PO7, and CB1; right: P8, PO8, and CB2). At midline sites, P3 peak amplitudes and latencies were measured from 300 to 600 ms post-stimulus-onset. At the posterior sites, because the P3 peak was not easily discernible in many participants, the analysis in each condition was based on the mean amplitude between 330 and 450 ms (based on the group-average waveforms). Degrees of freedom were corrected whenever necessary using the Greenhouse–Geisser epsilon correction.
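
The windowed peak measurements described here can be sketched with MNE-Python's built-in peak finder, continuing from the hypothetical `evokeds` dictionary in the preprocessing sketch above; the channel picks and windows follow the text, but the code itself is illustrative.

```python
# Sketch of peak measurement in the N170 window at the right posterior sites
# named in the text; `evokeds` comes from the earlier preprocessing sketch.
for cond, evoked in evokeds.items():
    ch, lat, amp = evoked.copy().pick(["P8", "PO8", "CB2"]).get_peak(
        tmin=0.12, tmax=0.22, mode="neg", return_amplitude=True)
    print(f"{cond}: N170 peak at {ch}, {lat * 1e3:.0f} ms, {amp * 1e6:.1f} uV")
# The posterior N2 would use tmin=0.22, tmax=0.32; the midline P3 would use
# mode="pos" between 0.30 and 0.60 s.
```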

Fig. 2 Grand average ERPs in response to happy, neutral, and sad faces, respectively

Fig. 3 Voltage topographies of the N170, posterior N2, and P3 components for different facial expressions

Results

Performance

A one-way ANOVA was conducted on the percentage of correct responses. The main effect of expression was significant, F(2, 34) = 7.95, p = .003, partial η² = .319. Post hoc comparisons showed that neutral faces were identified more accurately (97.4 %) than either happy faces (93.7 %, p = .002) or sad faces (94.2 %, p = .007), with no difference between the latter conditions (p > .9). For each participant, incorrect responses and responses with RTs more than 2 SDs from the condition mean were excluded from the RT analysis; on average, 8.7 % of the responses were removed. The RTs were analyzed with the same statistical model as the percentages of correct responses. There was a significant main effect of expression, F(2, 34) = 95.2, p < .001, partial η² = .849, showing that neutral face categorization was faster (551 ms) than happy face categorization (602 ms, p < .001), which in turn was faster than the classification of sad faces (656 ms, p < .001). To investigate the possible source of the PCA, a Pearson correlation analysis was conducted. There was a significant positive correlation between the RT to negative face stimuli and the size of the PCA (the sad-minus-happy RT difference), r = .66, p < .005 (two-tailed), but not between the RT to positive face stimuli and the PCA, r = .17, p > .05.
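
For illustration, the RT trimming and the PCA correlation might be implemented as follows. This is a sketch with hypothetical file and column names, and the definition of PCA size as each participant's sad-minus-happy mean RT is our reading of the text rather than a stated formula.

```python
# Sketch of the RT trimming (+/-2 SD per subject x condition) and the
# correlations between condition RTs and PCA size.
import pandas as pd
from scipy.stats import pearsonr

trials = pd.read_csv("trials.csv")  # columns: subject, expression, correct, rt
correct = trials[trials["correct"] == 1]

# Standardize RTs within each subject x expression cell and drop |z| >= 2.
z = correct.groupby(["subject", "expression"])["rt"].transform(
    lambda x: (x - x.mean()) / x.std())
trimmed = correct[z.abs() < 2]

# Per-subject mean RTs; PCA size assumed to be sad minus happy.
means = trimmed.groupby(["subject", "expression"])["rt"].mean().unstack()
pca = means["sad"] - means["happy"]
print(pearsonr(means["sad"], pca))    # RT to negative faces vs. PCA size
print(pearsonr(means["happy"], pca))  # RT to positive faces vs. PCA size
```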

ERPs

Grand average ERP waveforms and the distributions of the relevant components are presented in Fig. 2 (N170 and N2) and Fig. 3 (P3). The effects of expression, hemisphere, and site were analyzed with repeated measures ANOVAs.

N170 component

An ANOVA of N170 latencies revealed a significant main effect of expression, F(2, 34) = 12.5, p = .001, partial η² = .423: Neutral faces elicited a shorter N170 latency (150 ms) than did happy (155 ms, p = .002) and sad (156 ms, p = .001) faces, with no difference between the two emotional faces (p = .868). The main effect of hemisphere was also significant, F(1, 17) = 10.8, p = .004, partial η² = .389, showing a shorter N170 latency at right (150 ms) than at left (156 ms) occipito-temporal sites. A significant expression × hemisphere interaction indicated that the hemisphere effect was more pronounced for happy (p = .004) and sad (p = .004) faces than for neutral faces (p = .049).

For N170 amplitude, there was a significant main effect of expression, F(2, 34) = 4.16, p = .024, partial η² = .196. Post hoc comparisons showed that the N170 was larger for sad faces (−8.1 μV) than for happy faces (−7.2 μV; p = .030) but did not differ significantly from that for neutral faces (−7.5 μV; p = .274); happy and neutral faces did not differ (p = .945). The main effect of hemisphere was also significant, F(1, 17) = 27.75, p < .001, partial η² = .620, revealing right-hemisphere dominance (−6.0 and −9.2 μV for the left and right hemispheres, respectively). A significant expression × hemisphere × site interaction, F(4, 68) = 3.43, p = .029, partial η² = .168, showed that the expression effect was most pronounced at the most posterior electrodes (PO7/8, CB1/2).

Posterior N2 component

As shown in Fig. 2, a distinct N2 component with an occipito-temporal scalp distribution was elicited, peaking around 270 ms poststimulus. As compared with neutral and sad faces, happy faces elicited a more negative N2.

An ANOVA did not show any significant main or interaction effects for N2 latency (ps > .1). For N2 amplitude, there was a significant main effect of expression, F(2, 34) = 10.80, p = .001, partial η² = .388: The N2 was significantly larger for happy faces (−4.9 μV) than for neutral (−2.8 μV; p = .001) and sad (−3.6 μV; p = .005) faces, with no difference between the latter two conditions (p = .545). The main effect of hemisphere was also significant, F(1, 17) = 14.06, p = .002, partial η² = .453, revealing right-hemisphere dominance of the N2 (−3.1 and −4.4 μV for the left and right hemispheres, respectively). Moreover, the expression × hemisphere interaction was significant, F(2, 34) = 3.78, p = .047, partial η² = .182. Further analysis revealed that although the N2 to happy faces was larger than that to neutral faces in both hemispheres (p = .002 and p < .001 for the left and right hemispheres, respectively), it was larger than that to sad faces only in the left hemisphere (p = .003), not in the right (p = .072).

P3 component at midline sites

A two-way repeated measures ANOVA (RANOVA) was conducted on the P3 at midline sites, with expression (happy, neutral, sad) and site (Fz, Cz, Pz) as within-subjects factors.

For P3 latencies, there was a significant main effect of site, F(2, 34) = 42.37, p < .001, partial η² = .714, reflecting a shorter P3 latency at the parietal site (387 ms) than at the frontal (412 ms; p < .001) and central (418 ms; p < .001) sites, with no difference between the latter two (p = .091). Although the main effect of expression was not significant, F(2, 34) = 3.17, p = .055, partial η² = .157, there was a significant expression × site interaction, F(4, 68) = 6.44, p = .002, partial η² = .275. Further analysis revealed that at the frontal and central sites, the P3 was later for sad faces (431 ms) than for happy faces (404 ms; p = .020), with only a nonsignificant trend relative to neutral faces (410 ms; p = .130) and no difference between happy and neutral faces (p > .90).

For P3 amplitudes, there was a significant main effect of site, F(2, 34) = 7.78, p = .004, partial η² = .314, with the largest amplitude (8.6 μV) at Cz. Importantly, the expression × site interaction was significant, F(4, 68) = 7.66, p = .003, partial η² = .215: At Pz, sad faces elicited a smaller P3 (7.1 μV) than did either happy (8.5 μV; p = .001) or neutral (8.3 μV; p = .012) faces, with no difference between the latter conditions (p > .90).

P3 component at lateral posterior sites

A three-way RANOVA was conducted on P3 mean amplitudes at the lateral posterior sites, with expression (happy, neutral, sad), hemisphere (left, right), and site (P7/8, PO7/8, CB1/2) as within-subjects factors. There was a significant main effect of expression, F(2, 34) = 14.82, p < .001, partial η² = .46: The P3 was larger in response to neutral faces (2.9 μV) than to happy faces (1.4 μV; p = .045), which in turn elicited a larger P3 than did sad faces (0.35 μV; p = .021). The main effects of hemisphere, F(1, 17) = 21.09, p < .001, partial η² = .554, and site, F(2, 34) = 9.32, p = .004, partial η² = .354, were also significant, reflecting a larger P3 over the right (2.7 μV) than over the left (0.4 μV) hemisphere and the largest P3 amplitude (1.9 μV) at the PO7/8 sites. No interactions were significant (ps > .07).

In addition to the ANOVAs, since we assumed that the modulations of the P3 reflect stimulus evaluation and decision processes, we computed Pearson correlations between the RTs and the P3 measurements. Correlations between RTs and P3 amplitude were computed at the lateral posterior sites and at Pz, given the expression effects on P3 amplitude established at these sites. A significant negative correlation between RTs and P3 amplitude was found at the lateral posterior sites, r = −.28, p < .05, but not at Pz, r = −.17, p > .05; that is, the longer the RT, the lower the P3 amplitude. Intriguingly, a significant positive correlation was found between RT and P3 latency at Cz, r = .27, p < .05; that is, the longer the RT, the longer the P3 latency. We did not find any significant correlations between the RTs and the other ERP components modulated by facial expression (the N170 and posterior N2; ps > .10).

Discussion

The results of this experiment can be summarized as follows. Although neutral faces were classified more quickly and accurately than either happy or sad faces, a significant PCA—that is, faster classification of happy than of sad faces—was found for RTs but not for accuracy. In the ERP data, happy faces elicited a smaller N170 but a larger posterior N2 than did sad faces. The P3 was modulated by facial expression, with higher amplitudes and shorter latencies for happy and neutral than for sad stimuli; notably, the RTs correlated significantly with both the amplitude and the latency of the P3. Overall, these data reveal a robust PCA in expression classification that begins once the stimulus has been recognized as a face, as indexed by the N170 component.

The performance of the present participants showed that although, in line with previous reports, face classification by expression is faster for happiness than for sadness (e.g., Crews & Harrison, 1994; Hanaya, 1992; Leppänen & Hietanen, 2004), it is faster still for neutral faces. Leppänen and Hietanen reported shorter RTs to straight than to curved lines, which may provide an additional explanation for the present results; in their study, however, the difference between upward-curved and downward-curved lines was very small and nonsignificant. In other words, as Leppänen and Hietanen proposed, the happy face advantage (note that this is the PCA, not the classification advantage for neutral over emotional faces) genuinely reflects an advantage in decoding the positive emotional expression of the face. It should be noted that although the features varied across the different face models in the present experiment, the models still shared commonalities usable for emotion detection (e.g., an up-curved mouth line is always an up-curved line). This issue should be addressed in the future by using real faces with low-level visual features controlled or by manipulating the perceptual processing of faces (e.g., configural processing indexed by face inversion).

The faster recognition of neutral faces is contrary to previous results showing that happiness was recognized faster than emotional neutrality (e.g., Hugdahl et al., 1993). This discrepancy could be due to the different stimuli used: real face photographs in Hugdahl et al.'s study versus schematic face pictures in the present experiment. However, using face stimuli similar to ours, Leppänen and Hietanen found no difference in recognition times between happy and emotionally neutral facial expressions, although the PCA was evident. One possible explanation is that facial expressions of emotion are culture specific (e.g., Jack, Garrod, Yu, Caldara, & Schyns, 2012). Indeed, previous studies have reported that in intensity ratings of facial expressions of emotion, Caucasians perceived greater intensity in the display, whereas East Asians inferred greater intensity of subjective experience (e.g., Matsumoto, Kasri, & Kooken, 1999). Different strategies for classifying facial expressions across participants' ethnic backgrounds could thus be an important factor in the present pattern, a hypothesis that needs further investigation. Aside from cultural factors, methodological factors might explain the differences in RTs to schematic neutral faces between Leppänen and Hietanen's study and ours. The present study employed a fully counterbalanced assignment of response button labels, whereas in the Leppänen and Hietanen study, the neutral face was always assigned to the middle response button (only the order of happy and sad faces was counterbalanced), which may explain their nonsignificant comparison of response performance (RT and accuracy) between neutral and happy faces. Alternatively, the present findings may reflect not an advantage for positive faces but a response delay for negative face stimuli. Supporting this interpretation, the present correlation analysis indicated a significant positive correlation between the RT to negative face stimuli and the size of the PCA, but none between the RT to positive face stimuli and the PCA. This may relate to negative signals requiring more processing resources (e.g., Eimer, Holmes, & McGlone, 2003; Morris, Öhman, & Dolan, 1999).

In the present task, the N170 component was sensitive to emotional expression, as manifested by larger amplitudes to sad than to neutral or smiling faces. These data support previous findings of early processing of emotional expression (e.g., Caharel et al., 2005) and suggest that negative emotions engender a more intense emotional reaction than do positive ones. Moreover, converging evidence has shown that valence category reflects initial selective attention capture by salient image content (appetitive, threatening) and that unpleasant stimuli can produce stronger emotional effects than pleasant stimuli—that is, a negativity bias (e.g., Cacioppo et al., 1999; Crawford & Cacioppo, 2002). The present finding of an enhanced N170 for sad faces is in line with this view and further indicates that the negativity bias can occur at an early stage of face perception. The present pattern of N170 effects is also consistent with previous findings that the valence of affective pictures influences relatively early (100–250 ms) ERP components (Olofsson & Polich, 2007; for a review, see Olofsson, Nordin, Sequeira, & Polich, 2008), indicating that affective processing can be described as an automatic feature of perception (e.g., Fox, 1991; Öhman & Soares, 1998). However, although neutral faces were recognized faster than happy ones, we found no difference between the N170s elicited by happy and neutral expressions, in line with several previous studies (e.g., Ashley et al., 2003; Caharel et al., 2005; Eimer & Holmes, 2002, 2007; Eimer et al., 2003). In particular, the N170 effect did not correlate with the RTs. Consequently, the modulation of N170 amplitude by facial expression did not account for the PCA; the PCA appears to arise after the structural encoding stage of face perception indexed by the face-sensitive N170, even though that stage is itself modulated by facial expression.

In addition to the N170 component, we found a specific negativity at 270 ms over posterior electrode sites for happy faces rather than for sad or neutral faces. Converging evidence has suggested that pleasant and unpleasant pictures elicit an endogenous negative shift over occipito-temporal sites that starts around 150 ms after picture onset and peaks around 300 ms, and that this early posterior negativity (EPN) increases with the arousal level of the emotional pictures (Conroy & Polich, 2007). The topography and latency of the present N2 component elicited by happy faces suggest that it was in fact an EPN, replicating previous findings of selective EPN effects for positive, but not negative, pictures (e.g., Hinojosa, Carretié, Méndez, & Pozo, 2009; Schupp et al., 2004). One could argue that the enhanced EPN for happy schematic faces in the present study is related to the local processing of individual face features, namely an upturned mouth (Calvo, Nummenmaa, & Avero, 2010; Martin, Slessor, Allen, Phillips, & Darling, 2012). However, such feature dependence seems unlikely in our task, which used many face models, varying the distances among facial features and the shapes of the features, precisely to discourage low-level feature processing and the boredom of repeating a single model. Moreover, the task adopted in the present experiment was identical to that of a previous study in which a control experiment showed that participants did not categorize expressions on the basis of individual face features (Leppänen & Hietanen, 2004). It has recently been proposed that the EPN indexes natural selective attention, such that the evaluation of image features is guided by perceptual qualities that select affectively arousing stimuli for further processing (Dolcos & Cabeza, 2002; Schupp et al., 2004). Similarly, processing within the 200- to 300-ms latency range (i.e., the N2 or EPN) could reflect early stimulus discrimination (decoding facial expressions) and response selection processes (Di Russo et al., 2006). Therefore, more selective processing of happy than of sad faces could be one source of the PCA, although there was no correlation between the N2 and the RTs.

The delayed classification of sad faces was also evident in the latency and amplitude of the P3. In fact, the most conspicuous expression effects on the ERPs in the present experiment were associated with this late positive component, which, in line with the RTs, was larger in amplitude and peaked at a shorter latency for happy than for sad face stimuli. Apparently, the P3 reflected the expression classification performance more closely than did any earlier component, and its modulation is readily accounted for by the delayed classification of sad stimuli. The P3 latency was roughly in the same time range as the RTs, and the statistically significant positive correlation between the two measures supports the view that P3 latency is proportional to stimulus evaluation time (McCarthy & Donchin, 1981). The present expression effects on P3 amplitude replicate previous findings that image valence influences P3 amplitude, such that pleasant pictures elicit a larger P3 than do unpleasant pictures when the pictures are task relevant (e.g., Carretié et al., 2006; Delplanque et al., 2004). In contrast to the consensus interpretation of P3 latency, several factors affect P3 amplitude. Among these, most relevant to the present study is evidence suggesting that P3 amplitude correlates positively with the distinctiveness of the stimulus (Otten & Donchin, 2000) and with the ease of target identification (e.g., Johnson & Olshausen, 2005). This fits the pattern found here, with a larger P3 to happy than to sad face stimuli; on this interpretation, the higher P3 amplitudes reflect the easier detection of positive, as compared with negative, expressions. In addition, several studies have suggested that P3 amplitude correlates positively with the amount of attentional resources allocated to the eliciting stimulus (Strayer & Kramer, 1990; Wickens, Kramer, & Donchin, 1984; Wickens, Kramer, Vanasse, & Donchin, 1983) and with task complexity, which probably also modulates the attentional load (Kramer, Wickens, & Donchin, 1983; for a recent review, see Polich, 2007). From this perspective, the larger P3 amplitudes in response to positive stimuli might reflect the fact that happy faces attract more attention than do sad faces, a possibility supported by a growing literature showing an attentional advantage for happy faces over other facial expressions. For example, visual search studies have shown that happy faces are detected faster and more accurately than other emotional faces (e.g., Calvo & Marrero, 2009). Moreover, a recent study showed that the attentional blink effect was smaller for positive than for negative face icons, demonstrating that happy faces capture attention (Miyazawa & Iwasaki, 2010; see also Calvo et al., 2010).

Before concluding, it is worth noting that ratings of valence (unpleasant to pleasant) and arousal (calm to exciting) could influence emotional face processing. In the present study, we did not collect arousal and valence ratings for the happy, sad, and neutral schematic faces. However, the high recognition accuracy for each expression implies that participants could correctly classify the faces as happy, sad, or neutral. As for arousal, recent studies have revealed that arousal elicits a positive-going waveform from about 200 ms until stimulus offset, including the P3 and slow wave components (Rozenkrants & Polich, 2008; for a review, see Olofsson et al., 2008). On the basis of the present P3 findings, it is therefore conceivable that the present neutral schematic faces were perceived as more arousing than the happy faces, which in turn were perceived as more arousing than the sad faces. However, there is evidence that negative faces (e.g., threatening faces) are perceived as more arousing than either friendly or neutral faces, which do not differ from each other (Johnsen, Thayer, & Hugdahl, 1995; Schupp et al., 2004). In addition, in the study by Rozenkrants and Polich, target RTs did not differ between high- and low-arousal affective pictures, even though stimulus arousal level, rather than the valence category of affective images, produced the stronger ERP amplitude effects. Therefore, whether arousal ratings of facial expression pictures can account for the present PCA awaits further investigation.

In conclusion, the present investigation of the time course of the positive advantage in the expression classification of faces showed that the categorization processes begin with the basic-level classification of physiognomic stimuli reflected by the N170, which is influenced by the stimulus expression. As compared with sad faces, happy faces elicited an increased lateral-posterior N2 component, which is associated with the decoding of positive emotional expression. The most conspicuous differences among facial expressions were found in the modulation of the latency and amplitude of the P3. The negative correlation between RTs and P3 amplitudes and the positive correlation between RTs and P3 latencies reflect easier detection of, and faster stimulus evaluation for, positive as compared with negative facial expressions.