Introduction

The performance of goal-directed instrumental actions in both humans and animals is determined by knowledge regarding the contingency between actions and their outcomes as well as the current motivational value of the outcome (Adams & Dickinson, 1981; Balleine & Dickinson, 1998; Colwill & Rescorla, 1985; Dickinson & Balleine, 1994). Outcome-specific devaluation has become a gold standard in the assessment of goal-directed behavior (Colwill & Rescorla, 1985). Devaluation of the instrumental outcome, usually a palatable food reward, is typically achieved via one of two procedures: conditioned taste aversion (CTA; Adams, 1982; Garcia & Koelling, 1967) or sensory-specific satiety. Sensory-specific satiety refers to the decrease in preference for a food recently eaten to satiety relative to other foods (Hetherington & Rolls, 1996; Rolls, 1986, 1990; Rolls, Rolls, Rowe, & Sweeney, 1981; Young, 1940), and its use is often favored over CTA as it induces a temporary, reversible devaluation of the outcome.

In humans and animals, specific satiation reduces both the value of the food reward and the performance of actions to obtain that reward. Thus, following devaluation of a food reward, the performance of instrumental actions for that food is suppressed compared to responding for another food with distinct sensory properties (e.g., Balleine, 2005; Balleine & Dickinson, 1992, 1998; Dickinson & Balleine, 1994, 2002). Typically, the effects of outcome devaluation via specific satiety on instrumental behavior in humans (Alvares et al., 2013; Gottfried, O’Doherty, & Dolan, 2003; Hogarth, Chase, & Baess, 2012; Schwabe, Tegenthoff, Hoffken, & Wolf, 2012; Tricomi, Balleine, & O'Doherty, 2009; Valentin, Dickinson, & O'Doherty, 2007) and animals (Balleine & Dickinson, 1998; Balleine & O'Doherty, 2010) are assessed immediately or shortly after satiation. This is likely due to the assumption that sensory-specific satiety represents a temporary decline in the value of the food, and over time the value of the food will recover. In humans, the effects of sensory-specific satiety are observed from 2 minutes (Rolls et al., 1981) to 24 hours after satiation (Weenen, Stafleu, & de Graaf, 2005). However, despite its frequent use in the assessment of goal-directed behavior, the precise time course of satiety-induced devaluation has not been systematically studied in animals. Understanding the time course therefore has the strong potential to influence the practice of future studies using the outcome devaluation task.

Here, we evaluated the effect of sensory-specific satiety-induced devaluation on instrumental performance and consumption at 0, 2, or 5 hours following satiation. Experiments 1 and 2 revealed that both consumption and instrumental behavior are influenced by specific satiety-induced devaluation up to 2 hours after satiation. However, after a 5 hour delay, only consumption was affected by outcome-specific satiation. This result is especially interesting as it suggests that, while the prefed outcome was less desirable relative to the non-prefed outcome, rats did not respond less for the prefed than for the non-prefed outcome. One obvious difference between the consumption and instrumental tests was that the former occurred in the same context as satiety-induced devaluation, whereas the latter did not. Consistent with the literature, we observed no effect of context when the test was conducted shortly after satiation. However, there is evidence that, under some conditions, instrumental actions remain unaffected by devaluation when devaluation occurs outside the instrumental context (Holman, 1975; Wilson, Sherman, & Holman, 1981). It was therefore plausible that the failure to observe an effect of devaluation on instrumental responding with a 5 hour delay was because satiation occurred in a context different from that used for instrumental responding. To test this hypothesis, in Experiment 3, devaluation was performed either in a different context, as in Experiments 1 and 2, or in the instrumental context. We showed that the instrumental response was sensitive to devaluation 5 hours after satiation only if devaluation was performed in the instrumental context. This finding provides novel evidence that contextual cues can bias action selection by modulating value representations.

Materials and methods

Experiment 1

The aim of this experiment was to explore the time course and recovery of the effect of sensory-specific satiety on instrumental responding. Briefly, rats were trained on two actions for distinct food rewards. Then, all rats were allowed to consume one food reward for 1 h before a choice extinction test. Critically, the choice test was given either immediately or 2 h or 5 h after the devaluation session (Fig. 1). Rats were given a choice consumption test immediately after the instrumental test.

Fig. 1

Schematic representation of the test procedure used for the three groups in Experiment 1. Following instrumental training in the operant boxes, all groups received satiety-induced outcome devaluation for 1 h in plastic feeding cages (grey bars). Rats were then returned to the operant boxes either immediately (group 0-hr), 2 h (group 2-hr), or 5 h (group 5-hr) after devaluation for a 10 min unrewarded choice test with the two actions (black bars). Immediately after the instrumental test, all rats were given a choice consumption test with the two food rewards in the plastic feeding cages (white bars). Rats in groups 2-hr and 5-hr were returned to their home cages during the delay before testing

Subjects and apparatus

Twenty-four experimentally naïve male outbred Long Evans rats (Janvier, France) served as subjects. They were housed in plastic boxes (two rats per box) located in a climate-controlled colony room, and were maintained on a 12-h light/dark cycle (lights on at 7:00 a.m.). All behavioral procedures occurred during the light phase of the cycle. Rats were handled daily for 4 days before the behavioral procedures and were put on a food deprivation schedule 2 days before behavioral procedures to maintain them at approximately 90 % of their ad libitum feeding weight. All experiments were conducted in agreement with the French (council directive 2013-118, 1 February 2013) and international (directive 2010-63, 22 September 2010, European Community) legislations and received approval # 5012053-A from the local ethics committee.

Training and testing took place in eight operant chambers (40 cm wide × 30 cm deep × 35 cm high; Imetronic, Pessac, France) enclosed in sound- and light-resistant shells. Each chamber was equipped with two pellet dispensers that delivered grain or sugar pellets (45 mg) into a recessed magazine when activated. The chambers contained two retractable levers that could be inserted to the left and the right of the magazine. An infrared photobeam crossed the magazine opening, allowing for the detection of head entries. Four LED house lights provided illumination of the operant chamber.

Behavioral procedures

Instrumental training

On Days 1 and 2, rats were given two sessions of magazine training. During each session, rats were confined to the operant chamber while 45 mg grain (BioServ; 3.35 kcal/g) and sugar (Test Diet; 3.4 kcal/g) food pellets were delivered at random 60-s intervals. Forty outcomes were delivered per session, 20 of each outcome. On Days 3–11, rats underwent instrumental training in which two responses (left and right lever presses) were each trained with a different food pellet. Each session involved two presentations of each lever for a maximum of 10 min each or until 20 outcomes were earned; that is, rats could earn a maximum of 40 grain and 40 sugar pellets within each session. The inter-trial interval between lever presentations was 2.5 min. The order of the lever presentation was alternated and counterbalanced across rats and days. For the first 3 days, lever pressing was continuously reinforced. Then, the probability of the outcome given a response was gradually shifted over days using increasing random ratio (RR) schedules: an RR 5 schedule was used on Days 6–8 and an RR 10 schedule on Days 9–11.
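The reinforcement schedule described above can be illustrated with a minimal sketch. It assumes the standard reading of a random ratio schedule, in which each lever press is reinforced independently with probability 1/n under RR n. The function names and the number of presses are hypothetical and are not taken from the software used to control the chambers; the 10 min time limit on each lever presentation is not modelled.

```python
import random


def schedule_for_day(day):
    """Return the ratio in force on a given instrumental training day (Days 3-11)."""
    if 3 <= day <= 5:
        return 1   # continuous reinforcement: every press is rewarded
    if 6 <= day <= 8:
        return 5   # RR 5
    if 9 <= day <= 11:
        return 10  # RR 10
    raise ValueError("instrumental training ran on Days 3-11")


def simulate_lever_presentation(ratio, presses, max_outcomes=20, rng=None):
    """Count outcomes earned over `presses` presses (hypothetical number).

    Each press is reinforced with probability 1/ratio (assumed RR definition).
    The presentation ends early once `max_outcomes` pellets have been earned.
    """
    rng = rng or random.Random()
    p_reward = 1.0 / ratio
    earned = 0
    for _ in range(presses):
        if rng.random() < p_reward:
            earned += 1
            if earned == max_outcomes:
                break
    return earned


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the illustration is repeatable
    for day in (3, 6, 9):
        ratio = schedule_for_day(day)
        earned = simulate_lever_presentation(ratio, presses=150, rng=rng)
        print(f"Day {day}: ratio {ratio}, pellets earned {earned}")
```

Under this reading, a rat needs on average 5 presses per pellet on RR 5 and 10 presses per pellet on RR 10, while the 20-outcome cap per lever presentation keeps the maximum at 40 pellets of each type per session.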

Outcome-specific devaluation

On Day 12, rats were given their first outcome devaluation test. In this test, rats received ad libitum access to one of the two food outcomes (20 g) for 1 h in distinct, polycarbonate feeding cages (42 × 28 × 20 cm) located in a different room to that used for training. Half of the rats in each response-outcome assignment received grain pellets and the remaining rats received sugar pellets. Next, all rats were given a 10 min choice extinction test in which both levers were available but no outcome was delivered. Critically, we manipulated the delay between the end of the devaluation session and the start of the test. Rats were either tested immediately (group 0-hr; n = 8) or at 2 h (group 2-hr; n = 8) or 5 h (group 5-hr; n = 8) after the end of the devaluation session (see Fig. 1). Rats in groups 2-hr and 5-hr were returned to their home cages during the delay period. All rats were given 1 day of retraining on the RR 10 schedule, and on Day 14 rats were given a second test with the other outcome devalued. Twenty-four hours before the first devaluation session, all rats were familiarized with the plastic feeding cages for 1 h and were allowed to consume four grain and four sugar pellets.

Consumption test of specific satiety

Immediately after each extinction test (Days 12 and 14), rats were returned to the feeding cages and given a choice consumption test of satiety-induced devaluation. Rats received 10 min access to both of the food pellets (10 g each) and the total amount of each outcome (valued and devalued) was recorded.

Statistical analyses

All data were analysed using planned, orthogonal contrasts in a mixed-model analysis of variance (ANOVA) with alpha set at 0.05. Simple main effects analyses were used to establish the source of any significant interactions. Measures of effect size (partial η² for ANOVA and Cohen’s d for between-subjects contrasts with two groups, Experiment 3 only) are stated for each comparison and confidence intervals (CI; 95 % for the mean difference, standardized using the sample standard deviation units) are reported for each significant comparison. Data are presented as mean ± SEM.
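For readers who wish to check the reported effect sizes, partial η² can be recovered from an F ratio and its degrees of freedom using the standard identity ηp² = (F × df_effect) / (F × df_effect + df_error). The sketch below is only an illustration of that identity; the function name is hypothetical and this is not the analysis code used for the experiments.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared from an F ratio and its degrees of freedom."""
    numerator = f_value * df_effect
    return numerator / (numerator + df_error)


if __name__ == "__main__":
    # F values as reported in the Results: (label, F, df_effect, df_error)
    reported = [
        ("Exp. 1, training, session", 351.41, 1, 21),
        ("Exp. 1, test, lever", 44.13, 1, 21),
        ("Exp. 3, test, lever", 27.42, 1, 14),
    ]
    for label, f_value, df1, df2 in reported:
        print(f"{label}: {partial_eta_squared(f_value, df1, df2):.2f}")
```

Applied to these three contrasts, the identity gives 0.94, 0.68, and 0.66, matching the values reported in the Results.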

Experiment 2

As the groups of Experiment 1 were tested at different times of the day, in Experiment 2, we attempted to replicate the results of Experiment 1 with rats tested at the same time of the day.

Subjects and apparatus

Twenty-four experimentally naïve, male, Long-Evans rats (Janvier) served as subjects. The housing and training apparatus were the same as those described for Experiment 1.

Behavioral procedures

The training and testing procedures were identical to those described for Experiment 1 with one notable exception. For the outcome-specific devaluation tests, all groups were tested at the same time of day but received satiety-induced devaluation at different times of the day (see Fig. 2). Again, rats were given the devaluation treatment either immediately (group 0-hr; n = 8), 2 h (group 2-hr; n = 8) or 5 h (group 5-hr; n = 8) before the outcome-specific devaluation test.
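The scheduling logic of Experiment 2 can be made concrete with a short worked example. The sketch assumes that the delay is measured from the end of the 1 h devaluation session to the start of the test, as in Experiment 1. The test time of 15:00 is purely hypothetical, since the actual clock times are not stated here, and the function name is illustrative.

```python
from datetime import datetime, timedelta

DEVALUATION_DURATION = timedelta(hours=1)  # 1 h prefeeding session
DELAYS_H = {"0-hr": 0, "2-hr": 2, "5-hr": 5}  # delay between devaluation end and test


def devaluation_start(test_start, delay_hours):
    """Time at which prefeeding must begin so the test starts at `test_start`."""
    return test_start - timedelta(hours=delay_hours) - DEVALUATION_DURATION


if __name__ == "__main__":
    # Hypothetical common test time; only the clock time matters here.
    test_start = datetime(2000, 1, 1, 15, 0)
    for group, delay in DELAYS_H.items():
        start = devaluation_start(test_start, delay)
        print(f"group {group}: devaluation starts {start:%H:%M}, test at {test_start:%H:%M}")
    # group 0-hr: devaluation starts 14:00, test at 15:00
    # group 2-hr: devaluation starts 12:00, test at 15:00
    # group 5-hr: devaluation starts 09:00, test at 15:00
```

With this arrangement the time of the test is held constant across groups and only the time of satiation varies, which is the reverse of the confound present in Experiment 1.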

Fig. 2

Schematic representation of the test procedure used for the three groups in Experiment 2. The design was identical to that used for Experiment 1 except that rats were tested at the same time of day but received satiety-induced devaluation (grey bars) either immediately (group 0-hr), 2 h (group 2-hr), or 5 h (group 5-hr) before the instrumental (black bars) and consumption tests (white bars)

Experiment 3

The previous experiments show that instrumental responses are sensitive to satiety-induced outcome devaluation 2 h, but not 5 h, after satiation. However, consummatory responses were sensitive to devaluation following a 5 h delay. In the previous experiments, the instrumental test was conducted in the training cages (i.e., a context where the rat had experienced the outcomes as valuable) whereas the consummatory test was conducted in the devaluation cages (i.e., a context where the rat had experienced the outcome as devalued). The context may therefore have influenced which outcome representation was retrieved to guide behavior (Bouton, 1993). To test this hypothesis, in the current experiment devaluation was performed either in the instrumental context, i.e., the operant box (group Same), or in a different context (group Different), as in Experiments 1 and 2.

Subjects and apparatus

Sixteen experimentally naïve, male, Long-Evans rats (Janvier) served as subjects. The housing and training apparatus were the same as those described for Experiments 1 and 2.

Behavioral procedures

Rats were trained as above to press two levers for two distinct food rewards. Twenty-four hours after the final training session, rats were given an outcome devaluation test. For half of the rats, outcome-specific devaluation occurred in the same operant boxes used for training and testing (group Same) and for the remaining rats devaluation occurred in the feeding cages used in Experiments 1 and 2 (group Different). All rats were familiarized with the plastic feeding cages the day prior to the first devaluation test and were allowed to consume four grain and four sugar pellets in the cages. Rats were given 1 h access to one of the two food outcomes (20 g) in a small glass feeding dish; half of the rats in each group were given grain pellets and the remaining half was given sugar pellets. After devaluation, rats were returned to their home cages for 5 h and were then placed back into the operant cages for the instrumental test. Immediately following this test, all rats were given a choice consumption test in the operant cages.

Results

Experiment 1

Lever-pressing performance increased across instrumental training and did not differ between the planned groups (Fig. 3a). Statistical analyses confirmed a significant effect of session (F(1,21) = 351.41, p < 0.001, 95 % CI [4.02, 5.02], ηp² = 0.94), but no effect of group (largest F(1,21) = 0.17, p = 0.68, 95 % CI [−0.44, 0.66], largest ηp² = 0.02) nor any interaction between these factors (largest F(1,21) = 0.41, p = 0.53, largest ηp² = 0.02).

Fig. 3

Experiment 1. (a) Mean (+SEM) lever presses averaged across levers during training for the planned groups 0-hr (open circles), 2-hr (open squares), and 5-hr (closed squares). (b) Mean (+SEM) total lever presses during the unrewarded instrumental choice test. Responding on the devalued lever is shown in the black bars and responding on the valued lever is shown in the white bars. (c) Mean intake in grams (+SEM) of the devalued and valued foods during the choice consumption test. Consumption of the devalued (prefed) food is shown in the black bars and consumption of the valued (non-prefed) food is shown in the white bars

Performance during the outcome-specific devaluation choice extinction test is presented in Fig. 3b. Inspection of the figure indicates that rats in groups 0-hr and 2-hr pressed the lever associated with the valued outcome more than the lever associated with the devalued outcome. In contrast, rats that were tested 5 h after the end of the devaluation session failed to show selective devaluation. A mixed-model ANOVA conducted using factors of lever (valued vs. devalued) and group (0-hr, 2-hr, and 5-hr) found a significant effect of lever (F(1,21) = 44.13, p < 0.001, 95 % CI [1.37, 2.61], ηp² = 0.68) and an effect of group (F(1,21) = 6.17, p = 0.02, 95 % CI [0.12, 1.34], ηp² = 0.23), such that, overall, group 5-hr pressed more than groups 0-hr and 2-hr. The overall rate of lever pressing did not differ between group 0-hr and group 2-hr (F(1,21) = 1.79, p = 0.2, 95 % CI [−0.25, 1.16], ηp² = 0.08). There was also a significant lever by group interaction (F(1,21) = 8.02, p = 0.01, 95 % CI [0.33, 2.13], ηp² = 0.28). Simple-effects analyses conducted on the interaction found that rats in groups 0-hr and 2-hr pressed significantly more on the valued than the devalued lever [(F(1,21) = 21.54, p < 0.001, 95 % CI [1.33, 3.49], ηp² = 0.51) and (F(1,21) = 28.54, p < 0.001, 95 % CI [1.69, 3.85], ηp² = 0.58), respectively]; however, no such effect was found for rats in group 5-hr (F(1,21) = 2.32, p = 0.14, 95 % CI [−0.29, 1.87], ηp² = 0.1). The amount of the outcome consumed during the devaluation period did not differ between groups (largest F(1,21) = 0.46, p = 0.51, 95 % CI [−0.7, 1.38], ηp² = 0.02; mean consumption in grams: 0-hr = 7.6±0.91, 2-hr = 6.87±0.66, 5-hr = 6.92±0.7; data not shown). Magazine entries also did not differ between the groups during the instrumental test [largest F(1,21) = 0.06, p = 0.81, 95 % CI [−1.16, 0.92], ηp² = 0.003; mean magazine responses: 0-hr = 35±8, 2-hr = 38±11, 5-hr = 37±8; data not shown].

The results of the consumption test are shown in Fig. 3c. As shown, rats in all groups consumed more of the valued than the devalued food. Statistical analyses revealed a significant effect of food (valued vs. devalued; F(1,21) = 77.81, p < 0.001, 95 % CI [2.05, 3.31], ηp² = 0.79) and a significant effect of group such that, overall, rats in group 5-hr consumed more than rats in group 0-hr and group 2-hr (F(1,21) = 19.41, p < 0.001, 95 % CI [0.67, 1.88], ηp² = 0.48), and rats in group 2-hr consumed more than rats in group 0-hr (F(1,21) = 5.88, p = 0.02, 95 % CI [0.12, 1.51], ηp² = 0.22). Importantly, there was no significant food by group interaction, indicating that the magnitude of the devaluation effect was similar across groups (largest F(1,21) = 2.44, p = 0.13, 95 % CI [−1.58, 0.22], ηp² = 0.1).

Experiment 2

Experiment 1 demonstrated that sensory-specific satiety is observed up to 5 h post-satiation whereas the impact of satiety-induced devaluation on instrumental responding is present 2 h but not 5 h after satiation. However, the groups in Experiment 1 were tested at different times of the day. Therefore, in Experiment 2, we attempted to replicate the results of Experiment 1 with rats tested at the same time of the day.

Lever-pressing performance increased across instrumental training and did not differ between the planned groups (Fig. 4a). Statistical analyses confirmed a significant effect of session (F(1,21) = 323.22, p < 0.001, 95 % CI [4.17, 5.26], ηp² = 0.94), but no effect of group (largest F(1,21) = 0.003, p = 0.96, 95 % CI [−0.65, 0.69], ηp² < 0.0001) nor any interaction between these factors (largest F(1,21) = 0.34, p = 0.57, ηp² = 0.02).

Fig. 4

Experiment 2. (a) Mean (+SEM) lever presses averaged across levers during training for the planned groups 0-hr (open circles), 2-hr (open squares), and 5-hr (closed squares). (b) Mean (+SEM) total lever presses during the unrewarded instrumental choice test. Responding on the devalued lever is shown in the black bars and responding on the valued lever is shown in the white bars. (c) Mean intake in grams (+SEM) of the devalued and valued foods during the choice consumption test. Consumption of the devalued (prefed) food is shown in the black bars and consumption of the valued (non-prefed) food is shown in the white bars

Similar to the previous experiment, rats tested immediately and 2 h, but not 5 h, after the devaluation session pressed the lever associated with the valued outcome more than the lever associated with the devalued outcome (Fig. 4b). We found a significant effect of lever (F(1,21) = 47.53, p < 0.001, 95 % CI [1.39, 2.58], ηp² = 0.7) and an effect of group (F(1,21) = 4.46, p = 0.047, 95 % CI [0.01, 1.29], ηp² = 0.18), indicating that group 5-hr pressed more than groups 0-hr and 2-hr, which did not differ (F(1,21) < 0.01, p > 0.9, ηp² < 0.001). There was also a significant lever by group interaction (F(1,21) = 5.69, p = 0.03, 95 % CI [0.13, 1.93], ηp² = 0.21). Simple-effects analyses conducted on the interaction found that rats in groups 0-hr and 2-hr pressed significantly more on the valued than the devalued lever [(F(1,21) = 35.26, p < 0.001, 95 % CI [1.92, 4.0], ηp² = 0.63) and (F(1,21) = 15.76, p = 0.001, 95 % CI [0.94, 3.02], ηp² = 0.43), respectively]; however, no such effect was found for rats in group 5-hr (F(1,21) = 4.14, p = 0.06, 95 % CI [−0.02, 2.05], ηp² = 0.16). The amount of the outcome consumed during the devaluation period did not differ between groups (largest F(1,21) = 2.13, p = 0.16, 95 % CI [−0.27, 1.53], ηp² = 0.09; mean consumption in grams: 0-hr = 5.36±0.49, 2-hr = 5.56±0.58, 5-hr = 4.5±0.55; data not shown). Magazine entries also did not differ between the groups during the instrumental test [largest F(1,21) = 0.51, p = 0.48, 95 % CI [−1.21, 0.59], ηp² = 0.02; mean magazine responses: 0-hr = 37±7, 2-hr = 32±6, 5-hr = 42±9; data not shown].

All groups consumed more of the valued than the devalued food (Fig. 4c). Statistical analyses revealed a significant effect of food (valued vs. devalued; F(1,21) = 43.86, p < 0.001, 95 % CI [1.31, 2.51], ηp² = 0.68) and a significant effect of group such that, overall, rats in group 5-hr consumed more than rats in groups 0-hr and 2-hr (F(1,21) = 5.04, p = 0.04, 95 % CI [0.05, 1.32], ηp² = 0.19) and rats in group 2-hr consumed more than rats in group 0-hr (F(1,21) = 12.91, p = 0.002, 95 % CI [0.54, 2.01], ηp² = 0.38). Importantly, there was no significant food by group interaction, indicating that the magnitude of the devaluation effect was similar across groups (largest F(1,21) = 3.02, p = 0.097, 95 % CI [−1.91, 0.17], ηp² = 0.13).

Experiment 3

The previous experiments showed that, at a 5 h delay, rats show sensitivity to satiety-induced devaluation in consumption but not in instrumental responding. However, in Experiments 1 and 2, devaluation occurred outside the instrumental context. In Experiment 3, we investigated if selective devaluation could be restored 5 h after satiation if devaluation occurred in the instrumental context.

Lever-pressing performance increased across instrumental training and did not differ between the planned groups (Fig. 5a). Statistical analyses confirmed a significant effect of session (F(1,14) = 141.61, p < 0.001, 95 % CI [2.67, 3.84], ηp² = 0.91), but no effect of group (F(1,14) = 0.16, p = 0.7, 95 % CI [−0.66, 0.97], ηp² = 0.01) nor any interaction between these factors (F(1,14) = 0.05, p = 0.83, ηp² = 0.01).

Fig. 5

Experiment 3. (a) Mean (+SEM) lever presses averaged across levers during training for the planned groups Same (open squares) and Different (closed squares). (b) Mean (+SEM) total lever presses during the unrewarded instrumental choice test. Responding on the devalued lever is shown in the black bars and responding on the valued lever is shown in the white bars. (c) Mean intake in grams (+SEM) of the devalued and valued foods during the choice consumption test. Consumption of the devalued (prefed) food is shown in the black bars and consumption of the valued (non-prefed) food is shown in the white bars

Performance during the outcome-specific devaluation choice extinction test is presented in Fig. 5b. Similar to the previous experiments, rats that received devaluation in a different context from that used for training and testing (group Different) did not show selective devaluation. By contrast, rats that received devaluation in the context used for instrumental training and testing (group Same) pressed the lever associated with the valued outcome more than the lever associated with the devalued outcome. A mixed-model ANOVA conducted using factors of lever and group found a significant effect of lever (F(1,14) = 27.42, p < 0.001, 95 % CI [0.79, 1.89], ηp² = 0.66) but no effect of group (F(1,14) = 1.62, p = 0.22, 95 % CI [−0.38, 1.47], ηp² = 0.1), indicating that, overall, both groups pressed at a similar rate. There was also a significant lever by group interaction (F(1,14) = 7.96, p = 0.01, 95 % CI [0.34, 2.48], ηp² = 0.36). Simple-effects analyses conducted on the interaction found no significant difference between the devalued and valued lever for group Different (F(1,14) = 2.92, p = 0.11, 95 % CI [−0.16, 1.39], ηp² = 0.17) whereas rats in group Same pressed significantly more on the valued than the devalued lever (F(1,14) = 32.46, p < 0.001, 95 % CI [1.29, 2.84], ηp² = 0.7). The amount of the outcome consumed during the devaluation period did not differ between groups (F(1,14) = 1.29, p = 0.28, 95 % CI [−0.51, 1.64], d = 0.57; mean consumption in grams: Different = 6.76±0.98, Same = 5.78±0.73; data not shown) and magazine entries did not differ between the groups during the instrumental test [F(1,14) = 0.03, p = 0.9, 95 % CI [−1.16, 0.98], d = 0.09; mean magazine responses: Different = 55±14, Same = 58±13; data not shown].

The results of the consumption test are shown in Fig. 5c. As shown, rats in both groups consumed more of the valued than the devalued food. Statistical analyses revealed a significant effect of food (F(1,14) = 7.3, p = 0.02, 95 % CI [0.24, 2.10], ηp² = 0.34) but no significant effect of group (F(1,14) = 0.6, p = 0.45, 95 % CI [−0.34, 0.73], ηp² = 0.04) nor a significant interaction (F(1,14) = 0.6, p = 0.45, 95 % CI [−0.69, 1.48], ηp² = 0.04).

Discussion

These experiments show that instrumental actions are sensitive to specific satiety-induced outcome devaluation up to 2 hours after satiation. In a choice test, rats tested immediately or 2 hours after satiation responded significantly less on the lever associated with the sated (devalued) food than on the lever associated with the non-sated (valued) food. However, after a 5 hour delay, responding on the lever associated with the devalued versus valued food did not differ. Satiety-induced outcome devaluation is commonly used as a test of goal-directed behavior. Our results provide valuable insight into the time course of satiety-induced devaluation on instrumental behavior. To date, studies of goal-directed action that use satiation to devalue an outcome have been unable to temporally separate devaluation from test thereby making neural interventions during the devaluation phase, and not during the test, quite difficult. Indeed, many common neural manipulations, for example, pharmacology or chemogenetics (e.g., Armbruster, Li, Pausch, Herlitze, & Roth, 2007), lack the temporal specificity to selectively target the devaluation phase. The current results demonstrate that it is possible to temporally separate devaluation from test for up to 2 hours, and perhaps longer, if devaluation occurs in the instrumental context.

In contrast to the instrumental response, 5 hours after satiation, consumption remained sensitive to satiety-induced outcome devaluation. It therefore appears that two distinct outcome representations can exist simultaneously: devalued and valued. In the case of a 5 hour delay between devaluation and test, the devalued representation persists but sufficient time has elapsed to allow the outcome to regain some motivational value. Faced with a choice between two actions, the rat must recall one of these two competing representations to guide performance. Critically, in Experiments 1 and 2, the consumption test occurred in the same context as satiation, whereas the instrumental test occurred in a different context (the operant boxes). We therefore hypothesized that the context may have influenced which outcome representation was retrieved to control behavior (Bouton, 1993). In Experiment 3, we observed that selective devaluation was reinstated after a 5 hour delay if devaluation occurred in the same context used to test the instrumental response. This result provides clear evidence that, 5 hours after satiation, the outcome has two value representations. Which representation is retrieved to control performance depends on the contextual cues that are present during the test. When the outcome has been devalued in the same context as the test, the test context becomes associated with the devalued representation and therefore it is this representation that is used to guide behavior. In contrast, when the outcome is devalued in a different context, the test context retains its association with the valued representation of the outcome (established during training) and responding on the lever associated with the prefed (devalued) outcome is restored. Importantly, our results also indicate that, up to 2 hours after satiation, only one (devalued) outcome representation exists and therefore contextual manipulations will be without effect.

Here, we have argued that the context rescues sensitivity to devaluation via a retrieval mechanism. However, it should be noted that the restoration of selective devaluation was observed when rats received acquisition, devaluation and testing in the same context. It is therefore not clear from our design if the critical manipulation was conducting devaluation in the same context as acquisition or the same context as test, or, both. Indeed, theories of reconsolidation may predict that conducting devaluation in the same context as acquisition could result in “deeper” or more effective devaluation (e.g., Hupbach, Hardt, Gomez, & Nadel, 2008). Nevertheless, our results provide clear evidence that the context can restore selective devaluation following a long delay and future research will address the specific mechanism by which this occurs.

In all three experiments, we showed that selective devaluation was abolished when the instrumental response was tested 5 hours after devaluation. Insensitivity to outcome devaluation is a hallmark of habitual actions, which are mediated not by the incentive value of the instrumental outcome or the contingency between the response and the outcome but rather by stimulus-response (S-R) associations (Adams, 1982; Dickinson, Balleine, Watt, Gonzalez, & Boakes, 1995). However, there is no reason to expect a goal-directed action to shift to a habitual one when recovering from satiety. Similarly, a change in the devaluation context is not expected to alter the nature of the response to be tested. Therefore, the failure to observe selective devaluation after a 5 hour delay does not indicate that instrumental behavior is governed by habitual S-R processes. Instead, our results provide evidence that contextual cues may be required for the value of specific outcomes to control instrumental responding; a suggestion that is consistent with several recent reports that the context can modulate instrumental behavior (Gremel & Costa, 2013; Jonkman, Kosaki, Everitt, & Dickinson, 2010; Thrailkill & Bouton, 2015; Todd, Winterbauer, & Bouton, 2012).