Over the past decades, scholars have changed the way science looks at human memory because of a paradigm shift that has set aside the multistore model of memory (e.g., Atkinson & Shiffrin, 1968). The first theoretical repositioning occurred when the concept of network made it possible to supersede the idea that memory was a deposit of discrete immutable abstract items neatly stored in our brains, and proposed the semantic network model instead. The latter model depicted memory as a set of interrelated functions implemented in a network of interconnected nodes (e.g., Collins & Loftus, 1975). In both the semantic network and the multistore model, however, memory traces had nothing to do with action and perception, but were simply combinations of abstract symbols manipulated according to syntactical rules (Fodor, 1983; Pylyshyn, 1984). The most significant change, so to speak, was still to come.

This occurred about 30 years ago and changed the very ontology of human cognition and memory. The shift was prompted by the so-called grounding problem (Harnad, 1990): abstract symbol manipulation models cannot explain how symbolic systems connect with the world. Mental representations need to be grounded in perception and action, as they cannot be a “free-floating system of symbols” (Dijkstra & Zwaan, 2014, p. 296). Memories are not abstract items neatly stored in our brains, simply because they emerge from proximal sensory projections that include sensorimotor elements in their representations. The new somatic nature of memory, therefore, appears to be strictly linked to the symbol grounding hypothesis; that is, the idea that symbols only get their meaning through sensorimotor experience (Harnad, 1990). Since the seminal paper by Harnad (1990), an ever-increasing number of theoretical approaches have endorsed the view that the body is key to shaping higher level cognitive functions such as memory (see, e.g., Wilson, 2002). Robust evidence has substantiated the claim that, because mental representations are indeed grounded in both action and perception, no theory of cognition can bypass the grounding problem and be true to the facts. In particular, the claim maintained by Damasio (1989) has been borne out: information recognition and information recall both require activity in multiple brain areas near the sensory and motor regions. Consequently, scientists and researchers have come to conceptualize mental states and processes also in terms of embodied cognition and have committed to the view that memory trace activation recruits, at least partly, neural systems typically associated with the perceptual and motor mechanisms engaged in encoding input information (e.g., Barsalou, 1999; Dijkstra & Zwaan, 2014; Glenberg, 1997).
As argued by Glenberg (1997), previous theories simply presupposed that memory is a storage system and assumed that storing could be studied independently of how our own body and actions affect memory functioning and are in turn affected by that in real life situations. In point of fact, memory and (in general) cognitive functions have evolved to serve human agency and facilitate the interactions between us and our environment (Varela, Thompson, & Rosch, 1991; Barkow, Cosmides, & Tooby, 1992; Buss, 2005; Samson & Brandon, 2007; Callebaut & Rasskin-Gutman, 2005). Therefore, both cognition and memory processes are grounded in human experience and partake of a real-world environment impinging on perception and involving action. From this new perspective, a given stimulus is stored in the sensorimotor pathways that were activated, shaped, and strengthened by the item when it was initially processed: memory processes are no longer higher order cognitive activities totally detached from ordinary sensory processing; on the contrary, they reactivate it, albeit partially or, anyway, in a slightly different manner.

This theoretical change occurred within a wider paradigm shift in cognitive psychology that led to the so-called embodied cognition (EC) approach. One of the central tenets of this new perspective is that cognition “is strongly influenced by the body” (Glenberg, Witt, & Metcalfe, 2013, p. 573). This review first aims to collect evidence in favor of the claim that the memory system can actually be influenced by body manipulations. To this end, it first introduces the sensorimotor simulation model (SMM), according to which people register perceptual and motor information during encoding and reactivate these representations later, when the encoded event is recalled (see the next section). Second, it reviews studies inspired by the SMM from different research fields: eye movements, co-speech gestures, body posture, and bodily expression of emotion. The review focuses mainly on episodic memory rather than on semantic memory, for two reasons. First, there are already many reviews on concept representations and sensorimotor simulation (see, e.g., Meteyard, Cuadrado, Bahrami, & Vigliocco, 2012; Pulvermüller, 2013). Second, the present article aims to address broader and more general issues on memory by focusing on the effects of bodily manipulations at encoding and recall and on the role of the body in the encoding specificity principle.

What emerges is that a growing body of evidence seems to support the SMM and the general notion that the body is able to affect memory. In particular, sensorimotor reactivation is not an epiphenomenon, but a privileged component of the memory traces through which our cognitive system is able to retrieve information. Therefore, the EC approach seems effective. However, as pointed out by Goldinger, Papesh, Barnhart, Hansen, and Hout (2016), the claim that memory is influenced by the body is too vague. To better delineate the question of how events are represented and organized in our memory, one needs to outline the exact degree to which sensorimotor involvement determines the efficiency of memory processes. It is well established that when we remember, our body and its potential actions are involved in some nontrivial way, but at the same time, several questions still need to be answered: How, and how much, does the body affect cognition? Within which cognitive areas? Within what limits?

In the last part of the review, after presenting some conflicting studies, I defend an embodied approach to memory studies, while highlighting the limitations that emerge from a critical evaluation of the empirical evidence. Indeed, studies on the effect of body manipulations on memory have detected interference effects that lead to a heterogeneous set of memory impairments, but, at the current state of research, the majority of evidence concerns effects on memory processes rather than on memory representations. I will highlight how these studies seem to suggest that memory depends on sensorimotor processes, but only partially, and that in order to delineate the role of the body in memory, one needs to outline exactly to what extent memory depends on these sensorimotor processes. Limitations of the SMM arise regarding important areas that are still unexplored. In this regard, one of the most compelling future challenges will be to investigate the possible role of body manipulation in affecting the “sense of recollection” (i.e., the subjective sense of reliving the original event).

The sensorimotor simulation model (SMM)

Barsalou elaborated Glenberg’s idea further by developing a theory of knowledge called the perceptual symbol system (PSS; Barsalou 1999, 2008). The PSS assumes the existence of a perceptual memory system through which the association areas in the brain capture bottom-up activation patterns occurring in the sensorimotor areas, whereupon an opposite top-down process is initiated and the association areas reactivate the sensorimotor areas to create perceptual symbols (see also Damasio, 1989; Edelman, 1989). Thus, memory traces are better understood in terms of sensorimotor encoding: they store information on the neural states underpinning the perception of our environment, our body, and our movements. Since they are nothing but neural patterns, memory traces are dynamic—that is, plastic and bound to be modified by subsequent encodings—so that what is retrieved in the future will not fully match in every minute detail what was at first encoded (Barsalou, 1999; Edelman, 1987, 1989).

According to the SMM, a given event fundamentally consists in perceptual information so that a reactivation of the same sensorimotor circuitry originally involved in its perception is also at stake whenever the event is recalled or comes to mind. In this respect, remembering is tantamount to creating mental simulations of bodily experiences in modality-specific regions of the brain. Memory consists in partial (or covert) reenactments of sensory, motor, and introspective states, not in amodal redescriptions of these states as suggested by the digital computer-inspired theories of the mind that dominated cognitive science during the late 20th century (e.g., Fodor & Pylyshyn, 1988). A growing body of literature has brought to the fore how body movements and position can affect cognition, especially mental simulation (see Körner, Topolinski, & Strack, 2015). In particular, some authors have hypothesized that mental simulations depend on the brain’s sensorimotor system to such a great extent as to call them “sensorimotor simulations” (e.g., Dijkstra & Post, 2015). As stated by Barsalou (1999) in the PSS theory, the sensorimotor neural circuits responsible for encoding perceptual information also store that information. From this perspective, when we try to remember information, we “mentally simulate” the original event, and the cortical reactivation of the neural areas activated when first encoding that event inheres in this cognitive process as well (for a review, see Kent & Lamberts, 2008).

In fact, Kolers (1984) anticipated the SMM, arguing that the procedures by which information is encoded are also stored in memory and could be used to speed up retrieval. In this respect, it would be impossible to draw a clear distinction between storage and processing. Essentially, the SMM predicts that memory processes draw upon all of the perceptual modalities contributing to our experience and speculates that memory is distributed across the various modality-specific brain areas devoted to action and perception (e.g., Brunel, Labeye, Lesourd, & Versace, 2009; Niedenthal, Barsalou, Winkielman, Krauth-Gruber, & Ric, 2005), including those responsible for proprioception and introspection (for the role of introspection in cognition and the brain, see LeDoux, 2002; Solms & Turnbull, 2002). In short, memory traces are multimodal in nature, so their reactivation through mental simulation implies a multimodal reactivation. Recent theoretical views on memory mechanisms, such as the Act-In model (Versace et al., 2014), have stressed the pivotal role played by the body in memory functioning, contending that memory traces capture and reflect all the components of past experiences (see also Damasio, 1989; Edelman 1987, 1989); specifically, their sensory properties are captured by sensory receptors, regardless of whether they are visual, acoustic (see, e.g., Wheeler, Petersen, & Buckner, 2000), or motor (see, e.g., Nilsson et al., 2000) information.

Several neuroimaging and behavioral studies have gathered good evidence suggesting that shared modality-specific activation is what actually happens between encoding and recall. For example, Wheeler et al. (2000) found that retrieving vivid visual and auditory information reactivates some of the same sensory regions initially activated in its perception (i.e., the precuneus and the left fusiform cortex). The same pattern of results has been detected for encoding and retrieval of spatial information in the inferior parietal cortex (e.g., Persson & Nyberg, 2000). Such evidence is often cited to support the so-called reactivation hypothesis (see, e.g., Nyberg, 2002; Nyberg et al., 2001), which amounts to the idea that experiencing an event in the recognition phase and mentally reconstructing it at recall share the same brain modality-specific activation patterns.

To explore motor pattern reactivation, Nilsson et al. (2000) conducted an experiment in which the participants had to memorize a set of verbal commands in three separate experimental conditions: (1) in the verbal condition, they just had to listen to the commands; (2) in the imagery condition, they were invited to imagine the action described by the commands while staying still; and (3) in the enactment condition, they were asked to perform the action described by the commands. After each experimental condition, the participants had to engage in recall while being PET scanned. The results showed higher activation rates in the primary motor cortex (M1) for the enactment condition, lower rates for the verbal condition, and intermediate activation in the imagery condition (Nilsson et al., 2000). Similarly, Nyberg et al. (2001) ran an fMRI study to directly compare brain activity during learning with brain activity in subsequent recollections. They observed a substantial match between the clusters of regions activated in the learning and recall phases (markedly in the left ventral motor cortex), and concluded that memorization seems to depend on activating and reactivating motor information (for similar results, see Masumoto et al., 2006). This would suggest that recollecting memories is a sensorimotor simulation process: information retrieval occurs through simulating past events, and simulation reactivates the same sensorimotor areas originally activated at encoding.

In the SMM, episodic memory retrieval is related to the body because relevant sensorimotor aspects of the event and details on what it was about are reconstructed and packed together (Bietti, 2012). Although it is not the central focus of the present review, it is important to note that the SMM is not limited to episodic memory, because the perceptual components of past experiences turn out to be reactivated even in the case of semantic memory (e.g., Binder & Desai, 2011; Kiefer, Sim, Herrnberger, Grothe, & Hoenig, 2008; see also Meteyard et al., 2012; Pulvermüller, 2013). Semantic knowledge is coded and grounded in sensorimotor brain systems, similarly to what happens for episodic memories (Tettamanti et al., 2005).

Further, it has been recognized that such reactivation works both ways: retrieval mechanisms reactivate encoding mechanisms, prompting encoding mechanisms to, in turn, facilitate recall tasks (Dewhurst & Knott, 2010; Kolers, 1973). If one accepts that recall is a simulation of encoding processes and states (that is, a repacking together of the perceptual, affective, and somatic components of human experiences), then prompting mechanisms congruent with those affecting encoding will speed up retrieval processes, while incongruent feelings, sensations, or bodily movements will hinder them (see, e.g., Dijkstra & Zwaan, 2007; Dijkstra & Post, 2014; Ianì & Bucciarelli, 2018; Riskind, 1983).

In the next section, we shall focus on encoding specificity and show how the mutual relatedness of body and memory processes at the core of the sensorimotor simulation model of memory “implies that manipulations of the body and movement may result in memory changes” (Dijkstra & Zwaan, 2014, p. 298).

Sensorimotor memory pathways

According to the so-called principle of encoding specificity, encoding and recall are so interwoven that the more processing resources they share, the better information retrieval will be. The idea that retrieval and encoding processes are strictly linked to one another dates back to Tulving and Thomson (1973), who demonstrated that humans store very specific information about the context within which a stimulus is encoded. The literature has grown rapidly since then, and it has emerged that cognitive resource sharing is mainly due to memory. A classic example is the experiment by Baddeley and colleagues, in which participants were asked to learn a list of words either on land or under water in full scuba gear (Godden & Baddeley, 1975). During the recall phase, the participants showed better memory when tested under the same conditions as those under which they had learned the information (regardless of whether that was on land or in water) compared with when they were tested in a different environment. In other words, today we know that when a given stimulus is encoded into memory, the resulting stored representation contains information on the relevant stimulus plus information on any number of other situational and environmental cues present at the time the stimulus was encoded.

Kent and Lamberts (2008) argued that the SMM might shed light on the mechanisms underlying encoding specificity: the SMM holds that recalling perceptual information reactivates the neural networks responsible for first processing that information at encoding; so it quite naturally follows—and from memory architecture alone—that information available at encoding may also be available at recall (Barsalou, 2008; Slotnick, 2004; for semantic memory, see also Martin, 2007; Thompson-Schill, 2003; for domain specificity, see also Callebaut, Rasskin-Gutman, & Simon, 2005; Hirschfeld & Gelman, 1994). Indeed, at first, scientists considered the extent to which cognitive resources happen to be shared in memory processes with reference to environmental conditions only; that is, limited to situational and environmental external cues. However, based on evidence and hypotheses gathered over time, predictions have emerged that the encoding-recall match primarily and most importantly affects sensorimotor information regarding the body and its movement. Memory traces seem to contain detailed information on the posture, position, and movements the person underwent while encoding a given experience, and that very same sensorimotor information is deemed to be reactivated during the retrieval phase.

Two main predictions have been drawn from the mutual relatedness of body and memory processes:

  1. The behavioral reenactment of processes involved in the encoding phase facilitates information retrieval; more precisely, if memories are simulations reconstructing an original event along with its relevant sensorimotor components, then triggering those components at recall should speed up the retrieval processes (Dijkstra & Zwaan, 2014).

  2. Behavioral tasks drawing upon the same neural resources as those reenacted at recall should slow down the retrieval processes; in other words, sensorimotor simulation may be blocked by a concurrent task involving the same sensorimotor resources (Dijkstra & Post, 2015).

These predictions have been tested in several experimental settings.

Eye movements

There is growing consensus in the literature that the presence of eye movements during retrieval can indicate that a reenactment of the original experience is taking place, and, in addition, eye movements themselves seem to play a functional role in the retrieval process. For example, Laeng and Teodorescu (2002) found that oculomotor movements performed at encoding to explore given visual information were also reenacted at retrieval of the same stimulus, and, crucially, the process of image generation was disrupted by forcing participants to maintain a static oculomotor position (participants were asked to fixate a central cross while answering a question about a previously seen object): memory suffered when spontaneous fixation was blocked at recall. More recently, Laeng, Bloem, D’Ascenzo, and Tommasi (2014) reported three experiments in which the participants first inspected a stimulus and then were asked to retrieve it. In the first experiment, the authors found that imagining previously seen objects (e.g., plain triangles with different orientation and form) resulted in a pattern of eye fixations that mirrored the shape and orientation of each object over the same region of the screen where it had originally appeared. In the second experiment, they found that eye movements at recall substantially overlapped those used to scan the objects in the initial phase of the study, even when more complex shapes were used. Crucially, such an overlap predicted accuracy in memory tasks, in that those participants whose eye movements during recall more closely resembled the original movements also showed higher scores in spatial memory tasks. In Experiment 3, memory performance significantly decreased when gaze was forced to remain on a fixation point distant from the original fixations. Interfering with gaze during recall seems to decrease the quality of the memory.
Johansson and Johansson (2014) addressed the same fundamental issue using four direct eye manipulations in the retrieval phase of an episodic memory task: (a) free viewing on a blank screen, (b) maintaining central fixation, (c) looking inside a square congruent with the location of the to-be-recalled objects, and (d) looking inside a square incongruent with the location of the to-be-recalled objects. First, they found that a central fixation constraint perturbed retrieval performance by increasing the reaction times (RTs) needed for recalling such events. Second, memory retrieval was indeed facilitated when eye movements were directed toward a blank area that matched the original location of the to-be-recalled object. The results were robust with respect to both memory accuracy and RTs. Their findings provide novel evidence of an active and facilitatory role of gaze position in memory retrieval and demonstrate that memory for spatial relationships between objects is more readily affected than memory for intrinsic features of objects (Johansson & Johansson, 2014).

These results seem to suggest that there is a high gaze pattern correlation between perception and recall (see also Holm & Mäntylä, 2007). To develop empirically reliable theoretical frameworks in light of this convincing evidence, “it is essential that memory theorists . . . realize the importance of the encoding–retrieval relationship when designing experiments and building models of cognition” (Kent & Lamberts, 2008, p. 97).

Co-speech gestures: Simulating simulations

Hand and arm gestures are motor actions that often accompany speech and are intertwined with spoken content (Kelly, Manning, & Rodak, 2008; Krauss, 1998; McNeill, 1992). Several authors have considered gestures essentially as playing a pivotal communicative role (e.g., Kelly, Barr, Church, & Lynch, 1999): depending upon the context in which they are used, gestures accompanying speech can help interlocutors by adding information or disambiguating intentions. It is no coincidence that speakers’ motivation to communicate affects the size of the gestures they produce: when people are more motivated to communicate, they tend to perform larger gestures (Hostetter, Alibali, & Shrager, 2011). Similarly, individuals who are more empathic (and thus more motivated to communicate something clearly to their interlocutors) tend to produce more gestures in order to facilitate communication between the speaker and the listener (Chu, Meyer, Foulkes, & Kita, 2014). Indeed, listeners automatically incorporate the information coming from the speaker’s gesture in their mental representation of the communicative message, thereby facilitating comprehension and learning (e.g., Ping, Goldin-Meadow, & Beilock, 2014).

However, the results of several studies seem to suggest that the role of gestures is not exclusively communicative. Humans also tend to gesticulate in contexts in which their interlocutors are not able to observe their gestures (for instance, during a phone call; Rimé, 1982). Congenitally blind people use gestures too, even when they are knowingly speaking to a blind listener (see, e.g., Iverson & Goldin-Meadow, 1998). Moreover, arm gestures actually have other roles in addition to that of communication (Morsella & Krauss, 2004): indirectly, they facilitate the maintenance of spatial representations in working memory, and directly, they activate, through feedback from effectors or motor commands, the sensorimotor information that is part of the mental representations. Consistent with these claims, in a study by Wesp, Hesse, Keutmann, and Wheaton (2001), the participants were asked to describe a painting either from memory or with it visually present: the participants gestured more often when descriptions were made from memory than when the spatial information was visually available. The authors concluded that gestures helped the participants by sustaining spatial and motor information associated with the mental representations stored in memory (Wesp et al., 2001).

Many results reported in the literature on gestures could easily be explained from the so-called gestures as simulated action (GSA) perspective, according to which gestures arise from cognitive processes engaged in simulating perceptual and motor states (Hostetter & Alibali, 2008, 2019). In fact, recent findings have suggested that gestures may be considered as a type of simulated action that arises when motor activation due to mental simulation processes exceeds a certain threshold (Hostetter & Alibali, 2008; Kelly, Byrne, & Holler, 2011). Put more plainly, by simulating action, gestures reflect sensorimotor mental simulations: they are external placeholders for internal processes. In point of fact, a gesture is the overt motor activity that results when the activation produced by an embodied simulation exceeds the threshold. In line with this theoretical framework, the frequency of gestures during the verbal description of images seems to be influenced by the participants’ physical experiences with the stimuli portrayed in those images (Hostetter & Alibali, 2010): speakers gesture at a higher rate when they have specific motor experience with the information they are describing than when they do not. Gestures thus depend on the previous motor actions performed by participants: they represent the external sign of an internal motor simulation shaped by previous experiences. Further, several studies have found that the form of a gesture roughly mirrors the form of the underlying sensorimotor mental simulation: when explaining a previous lifting task, which could involve either physically lifting objects or using a computer mouse, speakers produced more pronounced arcs in their gestures when they had experience of physically lifting the objects (Cook & Tanenhaus, 2009). Thus, the form of a gesture presumably depends on the nature of the underlying sensorimotor simulation.

In addition, many studies in recent decades have highlighted that gestures can also be considered as a component able to affect and shape mental simulations. In this perspective, gestures can both reflect and trigger a “sensorimotor simulation” (Dijkstra & Post, 2015), thereby inducing a beneficial effect in terms of learning and memory. Although they are not able to directly change the world, they are able to change and affect our mnestic processes (Madan & Singhal, 2012), both at encoding and at retrieval. Because a stimulus is stored in the motor pathways that were involved when this stimulus was initially processed, gestures at encoding can reinforce such motor pathways, and gestures at recall can facilitate their reactivation.

Supporting this notion, a number of studies have found that memory is enhanced in participants who accompany the items to be recalled with gestures at encoding and in participants who observe a speaker who gestures while uttering the items (e.g., Cutica & Bucciarelli, 2008; Cutica, Ianì, & Bucciarelli, 2014). Cook, Yip, and Goldin-Meadow (2010) presented participants with a series of short vignettes asking them to give detailed descriptions. The vignettes were then classified as either eliciting gestures during their description or not. The participants were given surprise free recall tasks after a brief delay, and after a 3-week delay. No matter how long the delay span, recall rates were higher for vignettes associated with gesturing when first described. In a subsequent experiment, enhanced memory performance was found even when participants were explicitly instructed to either gesture or not, rather than being allowed to gesture spontaneously (Cook et al., 2010).

Because mental simulations appear to support memory performance during retrieval, and given that gestures can facilitate such mental processes, gestures could also be an effective way of improving memory at recall. In fact, a series of studies have revealed that, during the retrieval phase, verbal memory reports by both children and adults benefit from spontaneous gestures; in particular, children told to gesture when trying to recall an event remembered more about the event than children prevented from gesturing did (Stevanoni & Salmon, 2005). In a recent study (Ianì, Cutica, & Bucciarelli, 2016), individuals who had constructed an articulated mental simulation of a given text were more likely to accompany correct recollections with simultaneous gestures, compared with individuals who had constructed a less articulated model of the text; vice versa, individuals who had constructed a less articulated model were more likely to accompany correct recollections with anticipatory gestures (gestures for which the preparatory phase starts before the word they accompany). It is plausible that good learners have less need to produce anticipatory gestures to help them organize their thoughts. On the other hand, poor learners need to trigger their mental representation by using gestures. Similarly, Morrel-Samuels and Krauss (1992) investigated the relationship between the familiarity of a given image (i.e., its accessibility in memory) and the synchronicity of gestures during a task involving narrative descriptions of 13 photographs; again, the more familiar the image (and thus the associated word), the smaller the asynchrony. Vice versa, the less familiar the image, the more gesture onset preceded voice onset.
A possible interpretation of this pattern of findings is that lower familiarity with a given content is usually accompanied by anticipatory gestures, which can play a role in structuring the information when there is a need to compensate for lower accessibility in memory: gestures are part of the actual process of thinking (Clark, 2007).

Body postures and movements

Dijkstra, Kaschak, and Zwaan (2007) devised an experiment to assess how congruent body posture might facilitate retrieval of autobiographical memories. Autobiographical memories can also be considered as a form of sensorimotor simulation, an embodied model of the original event through which people relive the same visual, kinesthetic, spatial, and affective information of a given past experience. The authors argued that if a certain body position was assumed during a given past experience, the retrieval of that very same experience should be facilitated if the original body posture was reassumed compared with when an incongruent posture was assumed instead. That is exactly what they found. They asked the participants to retrieve autobiographical memories of specific events in the past while holding several different body positions that could be either congruent (e.g., lying down on a recliner while remembering the last dentist visit) or incongruent (e.g., lying down on a recliner while remembering the last football match) with the original one. The results demonstrated that response times were shorter in the congruent condition than in the incongruent one. Free recall after two weeks exhibited a similar effect: the participants retrieved memories from the first run better in the congruent condition than in the incongruent one. To conclude, having a memory-congruent body position appears to help participants gain access to their memories (Dijkstra et al., 2007). Similar body influences have also recently been detected in other cognitive areas. For instance, a recent study by Morse, Benitez, Belpaeme, Cangelosi, and Smith (2015) detected a role of body posture in word learning.

Body movements thus seem to facilitate or hinder the recall of past experiences by activating or inhibiting relevant sensorimotor aspects of the experience. A further question arises: besides affecting how information is retrieved (e.g., in terms of reaction times), does body manipulation influence what is remembered?

Casasanto and Dijkstra (2010) asked whether simple motor actions might influence the quality of emotional memories and consequently what people choose to remember. Lakoff and Johnson (1999) had already observed how, when people talk about their emotions, they usually use linguistic expressions that are related to upward movement when referring to a positive emotional value (e.g., “My spirits soared”) and, conversely, if the emotional tonality is negative, they make use of expressions that are related to downward movement (e.g., “I’m feeling low”): metaphors are grounded in embodied representations. Thus, upward actions are associated with positive notions (good, virtue, happiness, etc.), and downward actions with negative ones (bad, sadness, pain, etc.). Casasanto and Dijkstra (2010) investigated whether these associations might also have an effect in memory processes, by testing whether body movements were able to affect the retrieval of specific emotional events. Specifically, the participants in their study were asked to retell autobiographical memories with either positive or negative valence while moving marbles upward or downward: retrieval was faster when the secondary motor task was congruent with the valence of the memory (i.e., moving marbles upward for positive valenced memories and downward for negative ones). In a subsequent and crucial experiment, they did not prompt the participants to retell positive or negative memories, and found that they retrieved more positive memories when asked to move the marbles up, and more negative memories when asked to move them down. These results led the two authors to conclude that there is a direct and causal link between action and emotion, that positive and negative life experiences are associated with schematic movement representations, and, most importantly, that body movements can also affect what we remember (Casasanto & Dijkstra, 2010).

Seno, Kawabe, Ito, and Sunaga (2013) extended these findings by showing that what is crucial for the modulation effect of emotional valence on recollected memories is self-motion and not visual motion per se. Participants underwent illusions of self-motion (i.e., vection) by viewing upward and downward grating motion stimuli. Usually, observers illusorily perceive self-motion in the direction opposite to the observed motion stimuli; thus, vection was supposed to help disentangle the effect of visual motion from the effect of self-motion. Indeed, the participants recollected positive episodes more often while perceiving upward vection. Further, no modulation of emotional valence was detected when grating motion was reduced to the point of not inducing any illusion of vection. It could therefore be inferred that vertical vection, that is, illusory self-motion perception, can modulate human mood. Väljamäe and Seno (2016) further examined this possibility by testing memory recognition using positive, negative, and neutral emotional images with high and low arousal levels. Those images were encoded incidentally while the participants performed visual dummy tasks, and were presented again later, together with novel images, during vertical vection-inducing or neutral visual stimuli. The results showed that downward vection facilitated the recognition of negative images and inhibited the recognition of positive ones. These findings on the modulation of incidental memory tasks provide additional evidence for the influence of vection on cognitive and emotional processing (Väljamäe & Seno, 2016).

Interference effects

While we are engaged in a sensorimotor mental simulation, performing a different secondary action that involves the same sensorimotor resources needed for the mental simulation (the action must be different from the simulated action, and thus incongruent with the mental simulation) can have an interference effect on the mnestic processes, both at encoding and recall.

For instance, Ping et al. (2014) suggested that performing movements that are incongruent with mental simulations that are active during the formation of mental representations can interfere with the formation of memory traces. In Experiment 1, the participants had to carefully observe a series of videos in which an actor uttered a series of sentences while performing a specific gesture. Immediately after each video, the participants saw a picture of an object that could be in a position that was congruent or incongruent with the gesture observed in the video (for instance, a nail in a vertical or horizontal position). Their task was to respond “yes” if the name of the object in the picture was mentioned in the sentence and “no” if it was not (filler trials). When the object in the picture was in a congruent position (i.e., when the information conveyed through the gesture matched the information in the picture), the participants were faster at responding correctly compared with when the object in the picture was in an incongruent position. This might demonstrate that participants automatically incorporated the information coming from the actor’s hand gestures into their mental representation of the speaker’s message. In Experiment 2, Ping et al. (2014) investigated whether the sensorimotor simulation triggered by the gesture involved the listener’s motor system. The participants performed the same task as in Experiment 1, with the exception that they were invited to engage in a secondary motor task during the observation phase, which was to be performed either with their arms and hands (i.e., the same effectors used by the actor in the video) or with their legs and feet (i.e., different effectors from those used by the actor). The rationale of the dual task was to discover whether involving the participants’ motor system by asking them to move their arms and hands would hinder the creation of mental simulation and result in disappearance of the congruence effect. 
Consistent with the somatotopic organization of the premotor activation during action observation, such interference should be specific to motor resources controlling the effectors used in producing gestures, arms and hands in the case at hand. To test this prediction, half of the participants in Experiment 2 were asked to plan and perform movements with their arms and hands, whereas the other half were asked to use their legs and feet instead. The results revealed that the participants in the arm movements condition did not show the congruence effect in the picture judgment task because they were prevented from creating the mental simulation originally associated with the stimuli. In contrast, the congruence effect persisted when the participants were asked to move their legs and feet: they responded to pictures that were congruent with the speaker’s gestures more quickly than to those that were incongruent. Crucially for memory conceptualization, Ianì and Bucciarelli (2018) highlighted that the advantage of observing gestures on memory for action phrases disappeared when at recall the listeners moved the same motor effectors as those moved by the speaker at encoding (i.e., their arms and hands), but was present when the listeners moved different motor effectors (i.e., their legs and feet). The results seem to suggest that the listener’s motor system is involved both during the encoding of actions and during their subsequent retrieval.

A similar effect of a secondary motor task involving the same effector used to simulate the relevant action was observed by Yang, Gallo, and Beilock (2009). Past studies have illustrated that the perception of letters automatically activates a typing action motor program in skilled typists, so the authors set out to explore whether the fluency arising from the motor system could affect recognition memory. The results showed that expert typists made more false recognition errors for letter dyads that were easier and more fluent to type than for nonfluent dyads. To generalize, memory appears to be influenced by covert simulation of actions (e.g., typing) associated with the items being judged (e.g., letter dyads) to such an extent that the concurrent reactivation of motor programs leads to false recognitions. Crucially, a second experiment by Yang et al. (2009) showed that this effect disappeared when participants were asked to move the fingers they would use to type the dyads in an unrelated but concurrent secondary motor task, whereas it persisted when participants were asked to perform a noninterfering motor task. In other words, the participants whose secondary task engaged the fingers they would use to type the presented dyads could not simulate the action usually needed to type them because the relevant cognitive resources were occupied by the secondary motor task.

Bodily expressions of emotions

Nonverbal emotional components like facial expressions might be critically implicated in the process whereby representing, during retrieval, experimental conditions and cues that were already present at encoding has an impact on memory. The study by Riskind (1983) was amongst the first on the topic and concerned the effect of facial expressions. It tested the congruence hypothesis about the priming effects of facial and body posture patterns on memory retrieval, suggesting that an individual should be more likely to recollect pleasant experiences when smiling and assuming an expansive physical posture. The hypothesis predicted that the accessibility of pleasant experiences from one’s own life history would increase when nonverbal expressive patterns were positive in valence rather than negative. Likewise, the accessibility of unpleasant experiences from one’s life history would increase when nonverbal expressive patterns were negative in valence, such as when an individual frowns and/or assumes a slumped posture (see also Laird, Wagener, Halal, & Szegda, 1982). The latencies with which the participants recalled pleasant and unpleasant life experience memories supported the congruence hypothesis: recalling an autobiographical memory with a smile and in an upright position improved access to pleasant memories.

These pioneering studies have revealed that positive/negative nonverbal expressions can affect memory retrieval. The findings reflect the importance of both sensory and motor functions and affective valence in memory retrieval, and are consistent with a view that conceptualizes cognitive processes as an integral part of the sensorimotor environment in which they are embedded. Bodily experience is more than just an emotional state exceeding a given threshold: it is part of that emotional state. Not surprisingly, “in depressive patients the body becomes conspicuous, heavy and solid” (Zatti & Zarbo, 2015, pp. 2–3). According to the so-called somatic marker hypothesis (Damasio, 1996), an emotion is a change in the representation of body state, and the results of emotions are primarily represented in the brain in the form of “transient changes in the activity pattern of somatosensory structures . . . the somatic states” (Damasio, 1996, p. 1414). Empirical findings also highlight the role of body posture and body image in the expression and communication of mood. A study by Canales, Cordás, Fiquer, Cavalcante, and Moreno (2010) revealed that during depressive episodes, patients showed a series of postural changes such as increased head flexion, increased thoracic kyphosis, a trend toward left pelvic retroversion, and abduction of the left scapula. These alterations were not found during the remission phase. Memories, with their cognitive, affective, and sensorimotor information, then influence body expression.

However, it is possible to deduce from the studies mentioned above that the manipulation of body expressions may in turn influence mood. On a clinical level, this means investigating, for example, whether people with depression or other mood disorders, who tend to show negative thinking and a greater reenactment of negative memories, might benefit from the manipulation of body posture and movements.

Experimental research has demonstrated this bidirectional link between nonverbal behavior and human feeling. Assuming a stooped rather than a straight sitting posture may lead people to develop an increased level of helplessness (Riskind & Gotay, 1982). Similarly, inducing an upright or slumped posture in the laboratory setting can influence the amount of pride people express (Stepper & Strack, 1993): success at achieving an outcome led to greater feelings of pride if the outcome was received in an upright position rather than in a slumped posture.

Recently, several experiments demonstrated that expansive compared with contractive nonverbal displays (high-power-pose or low-power-pose condition) produced subjective feelings of power and increased risk tolerance (e.g., Carney, Cuddy, & Yap, 2010). In relation to memory processes, it has been demonstrated that sitting in a slumped posture while imagining positive or negative events associated with positive or depression-related words leads people to refer to more negative than positive words in an incidental free recall task: relatively minor changes in the motoric system can affect emotional memory retrieval (Michalak, Mischnat, & Teismann, 2014). It is not just slumped sitting that can induce negative emotional processing but also slumped walking or generally walking with a forward-leaning posture that can negatively affect memory. Using an unobtrusive biofeedback technique, Michalak, Rohde, and Troje (2015) manipulated participants to change their walking patterns to either reflect the characteristics of depressed patients or a particularly happy walking style. During walking, the participants first encoded and later recalled a series of emotionally loaded terms. The difference between positive and negative words retrieved at recall was lower in the participants who had adopted a depressed walking style compared with those who had walked as if they were happy: walking style affects memory functions (Michalak et al., 2015).

Because straight posture is related to feelings of power and pride (see, e.g., Oosterwijk, Rotteveel, Fischer, & Hess, 2009), an upright body posture might make it easier for people to recover from negative feelings. Conversely, recovery from negative mood might be impaired when people assume a stooped rather than a straight sitting posture. This was established in a study by Veenstra, Schneider, and Koole (2017), who asked participants to imagine a negative or a neutral situation; later on, they manipulated the participants’ body posture, telling them to adopt a stooped, straight, or self-chosen body posture. Afterward, they asked them to list their thoughts in order to support spontaneous mood regulation. The results revealed that a stooped posture, compared with control and straight postures, prevented effective mood recovery in the negative imagination condition and increased negative affect in the neutral imagination condition. In addition, stooped posture overall induced more negative thoughts compared with straight or control postures (Veenstra et al., 2017). Body posture can thus play an important role in recovering from negative mood. Arminjon et al. (2015) asked participants to first read a sad story in order to induce a negative emotional memory, and then to self-rate their emotions and memories about the text. One day later, some of the participants were asked to assume a predetermined facial expression (smiling) while reactivating their memory of the sad story. The participants were once again asked to fill in emotional and memory questionnaires about the text. The results showed that participants who had smiled during memory reactivation rerated the text less negatively than control participants who were not asked to assume any specific facial expression. 
In short, manipulating somatic states can also modify the emotional aspects accompanying a given memory; thus, the body and its morphology also appear to play an instrumental role in emotional information processing (Arminjon et al., 2015). In other words, there is a reciprocal relationship between the bodily expressions of emotion and the way in which emotional information is experienced.

In this view, people’s bodily states can be considered as an opportunity for emotion regulation because it is no longer seen as a matter of the higher cognitive mechanisms controlling lower level systems—rather, it is conceptualized as an activity relying on (and emerging from) a well-established interplay between the mind and the body (that is to say, the emotional brain, namely, the person; Damasio, 1994; LeDoux, 1996; Pessoa, 2017).

Depressed patients might be in a “trap,” in which negative thinking leads them to adopt a specific body expression, which, in turn, reactivates, in most cases, the sensorimotor circuits that were activated in the perceptual coding of negative events. Changing posture might break this vicious circle: it is a simple, highly acceptable, and low-risk intervention that, applied together with other interventions, might counter the depressive symptomatology. Wilkes, Kydd, Sagar, and Broadbent (2017) directly investigated this hypothesis. They tested whether changing posture (upright vs. usual posture) could reduce negative affect and fatigue in people with mild to moderate depression undergoing a stressful task. They found that, compared with usual posture, the postural manipulation significantly increased high-arousal positive affect and reduced fatigue. Moreover, on the Trier Social Stress Test speech task (during which participants are asked to deliver a free speech in front of an audience), the upright group spoke significantly more words than the usual posture group. Further, upright shoulder angle was associated with lower negative affect and lower anxiety across both groups. Manipulations of body posture can also be useful for counteracting fatigue in sleep-deprived individuals (Caldwell, Prazinko, & Caldwell, 2003).

Evidence against the SMM

Evidence in favor of the SMM is not uncontroversial. Readers should not interpret the described effects as always well-replicable and accepted. A series of conflicting studies seem to cast doubts on the reliability of such embodiment effects. For instance, a recent replication attempt (Wagenmakers et al., 2016) of a pioneering study by Strack, Martin, and Stepper (1988) failed to replicate the original results about the role of body in emotional processes. Specifically, in order to demonstrate a direct influence of body on emotional processing, Strack et al. (1988) asked participants to hold a pen in their mouths so that a smile was either facilitated or inhibited. At the same time, they were asked to rate the funniness of several cartoons. Although the participants were not aware of the meaning of the particular muscle contractions (authors inhibited or facilitated the muscles associated with smiling without explicitly requiring subjects to smile), their reported amusement had been affected by the induced expressions. However, the results of 17 independent direct replication studies (Wagenmakers et al., 2016) were inconsistent with the original pattern of results. Out of 34 Bayes factor analyses (two analyses for each replication attempt; i.e., the default BF10 and the replication BFr0), only one provided evidence in favor of the alternative hypothesis.

Besides this failed replication attempt, some studies on short-term memory show no evidence supporting a role of the motor system and sensorimotor simulations in memory. For instance, Pecher (2013) investigated the role of motor affordances in memory for objects. Participants were asked to observe a series of pictures of manipulable and nonmanipulable objects. Because several studies showed that perceiving manipulable objects (or their names) automatically triggers simulations of interacting with them (e.g., Tucker & Ellis, 2004), as well as motor area activations (e.g., Cardellicchio, Sinigaglia, & Costantini, 2011; Hauk, Johnsrude, & Pulvermüller, 2004), this manipulation was devised in order to elicit or not to elicit motor activation in the observers. For each trial (i.e., each picture), after a 5,000-ms retention interval, participants observed a test stimulus and were asked to decide whether or not it was the very same as the previous stimulus. Crucially, during the interval between the two stimuli, participants had to perform a motor-interference task, a verbal-interference task, both tasks, or no task. Because only manipulable objects were associated with motor actions, the author predicted that a motor interference task would be detrimental particularly for these objects. However, no interaction was found. Similar nonsignificant results were obtained by Pecher et al. (2013) and by Quak, Pecher, and Zeelenberg (2014) by using an n-back task instead of a classic recognition task. Similarly, when participants had to memorize the names of a series of objects (rather than pictures), either manipulable or not, and were asked to run a secondary motor task (tapping with hands or feet), results did not reveal any interference effect (see Zeelenberg & Pecher, 2016).

Recently, Canits, Pecher, and Zeelenberg (2018) investigated whether motor representations triggered by the processing of manipulable objects may play a role in long-term memory. In their experiment, participants were first asked to run a categorization task, which was followed by a surprise free recall task. In the categorization task, they had to decide whether a series of objects were natural or artificial. Crucially, they were instructed to respond by grasping either a small cylinder (precision grasp) or a large cylinder (power grasp). Because objects could be large or small (thus eliciting a power or a precision grasp affordance), this manipulation led to four conditions, two of which were “compatible” (large objects/power grasp responses; small objects/precision grasp responses) and two of which were “incompatible” (large objects/precision grasp; small objects/power grasp). Later on, participants were asked to recall all the objects they had previously seen (this task was unexpected). Results indicated that in the categorization task, responses were faster when the affordance triggered by the object was compatible with the type of grasp required by the response. Nevertheless, in the free recall task, the authors did not find better memory for objects for which the grasp affordance was compatible with the grasp response. They concluded that there is no evidence in support of the hypothesis that motor action plays a role in long-term memory.

However, because these studies did not manipulate body movements during the retrieval phase, and thus did not create a strong mismatch between the sensorimotor information involved at encoding and at retrieval, such results may not represent strong evidence against a role of the motor system in memory processes. In this regard, it has been found that what is crucial is the matching of “motor context” at encoding and retrieval (Halvorson, Bushinski, & Hilverman, 2019). Halvorson and colleagues (2019) found that when participants are engaged in an incongruent unrelated motor task both at encoding and recall, they do not suffer any interference effect. Such results do not rule out a motor interpretation; rather, they suggest that the secondary motor task at encoding may have just changed the motor representation associated with the stimuli. Interference effects occurred only when at recall the secondary motor task was different from the encoding phase task. It is worth noting that a sensorimotor mismatch can also be induced during a recognition task by showing stimuli triggering motor information incongruent with that involved at encoding (e.g., Ping et al., 2014). In other words, because the matching of the encoding and recall motor context matters, it could be that in the study by Canits et al. (2018), producing a grasping movement incongruent with the object’s affordance did not cancel the observer’s motor representations, but rather just biased them. In a free recall task, during which no secondary motor instructions or incongruent stimuli were given, it is likely that participants had no difficulty in reactivating the same motor representations.

More generally, all these studies have investigated a specific kind of memory, namely, memory for objects or words. When people observe a given object or a word denoting an object, they activate relevant information about it, including the possible use of this object and the resulting motor programs. Thus, memory for a given object or a given word encompasses multimodal and sensorimotor knowledge gained by humans through their personal experiences (Conca & Tettamanti, 2018). As pointed out by Rosch (1978), inseparable from the concept of a given object are the ways in which humans habitually use or interact with that object. For instance, for concrete objects, such interactions take the form of motor movements: the word chair triggers the action of sitting down on a chair and a sequence of typical body and muscle movements that are inseparable from the nature of the attributes of chairs (legs, seat, back, and so forth). At the same time, objects and words activate much more than motor information (i.e., the propositional and phonological representations or the semantic information; for a discussion see, e.g., Mahon & Hickok, 2016). Further, as pointed out by Mahon and Caramazza (2008), the majority of findings revealing a motor activation during language processing may be interpreted as a secondary and indirect activation, which would occur only after the concept has already been understood. Language comprehension relies on the propositional representation resulting from parsing processes and then on the construction of a kinematic mental model of the state of affairs described starting from the propositional representation (see also Kintsch, 2005). Instead, when we observe an agent acting in the world, we mentally simulate what he or she is doing through an “automatic, implicit, and non-reflexive simulation mechanism” (Gallese, 2005, p. 117). This mental simulation is implicit and is triggered by the stimulus itself (the kinematics of the observed movements). 
Conversely, the mental simulation stemming from language comprehension is indirect in that its input is the propositional representation resulting from the previous parsing processes. In other words, when a phrase is perceived, both the linguistic and simulation systems are active, but the linguistic system’s activation peaks before the simulation system’s activation (see, e.g., Barsalou, Santos, Simmons, & Wilson, 2008). Thus, the sensorimotor simulation triggered by language processing is somehow indirect. Consistent with this claim, a recent study by Ianì, Foiadelli, and Bucciarelli (2019) highlighted how simulation from action observation prevails over simulation from action phrases when their effects are contrasted. Specifically, when at the encoding phase the action phrases to be remembered were paired with pictures portraying actions whose kinematics was incongruent with the implied kinematics of the actions described in the phrases, memory for phrases was impaired. However, the reverse did not hold: when participants were asked to remember the pictures portraying actions, their memory was not affected by the presentation of phrases representing actions whose implied kinematics was incongruent with the kinematics of the actions portrayed in the picture. Therefore, the motor simulation stemming from language comprehension would seem more mediated compared with motor simulation triggered by action production or action observation (i.e., the contents of an episodic memory). A possible reason could be that the memory system evolved primarily to process perceptual and motor aspects of experience (Barsalou, 2008). Therefore, the processing of experience is more central to human cognition than the processing of words for objects. In other words, although the motor component leads to a richer memory trace, a memory for an object is not confined solely to it. 
Different representational formats (e.g., the propositional or phonological ones) can easily compensate for the lacking motor information.

Put plainly, motor simulations could play a crucial role in remembering events, and a less pivotal role in remembering objects. Remembering episodes means substantially remembering actions (performed or observed), which, in most cases, have not been labeled with a propositional format before.

Indeed, in contrast to Pecher’s studies, Pezzulo, Barca, Lamberti-Bocconi, and Borghi (2010) found evidence in favor of a role of motor affordance in memory. In particular, the stimulus to be remembered was not a word or an object but instead a climbing route. Participants were asked to memorize three different routes: an easy-climb route, a hard-climb route, and a route impossible to climb. Each route was indicated twice by a trainer on the climbing wall. Participants were either novice climbers or expert climbers. After the trainer’s indication, they were asked to perform a brief distractor task, and later on they had to mark on a climbing-wall picture the correct sequence of the learned route. Whereas for the impossible and easy routes the two groups did not differ, for the hard route the expert climbers performed significantly better than the novices. These results have been interpreted as being in favor of a motor involvement in memory because only expert climbers, the only ones to possess the necessary skills to actually climb the hard route, would have been able to create a sensorimotor simulation of the route.

Further, even some studies about memory for objects or verbs have detected sensorimotor interference effects. For instance, Shebani and Pulvermüller (2013) reported that rhythmic movements of either the hands or the feet led to a differential impairment of working memory for concordant arm- and leg-related action words. Specifically, hand/arm movements impaired working memory for words used to speak about arm actions, whereas foot/leg movements hindered memory for leg-related words (see also Dutriaux & Gyselinck, 2016; Van Dam, Rueschemeyer, Bekkering, & Lindemann, 2013).

Conclusions and directions for future research

A first, necessary conclusion that can be drawn from these results is that memory traces contain detailed sensorimotor information, for instance, about the body posture and movements the person adopted while encoding a given experience. The very same sensorimotor information involved at encoding is thought to be reactivated during the retrieval phase. Mnemonic traces are not fully amodal mental representations, independent of the body. Rather, they are at least partly reenactments of the original bodily and somatic states, which are simulated through the same sensorimotor pathways involved when the event was encoded. The collection of these studies highlights the importance of the motor system in memory retrieval and supports the assumption that memory is for action, action is for memory (Dijkstra & Zwaan, 2014). Thus, episodic memory, classically considered as a form of declarative knowledge (e.g., Tulving, 1995), seems also to contain procedural information. The declarative and procedural memory systems have been intensively studied in humans, and the demonstration of numerous double dissociations has shown that the two systems are largely independent of each other (e.g., Eichenbaum & Cohen, 2001). However, in light of the results highlighted in the previous sections, the boundaries between declarative and procedural memory appear to be thinner. The evidence suggests that the two systems can, to some extent, acquire the same or analogous knowledge (e.g., Ullman, 2004) and mutually interact in a number of ways (Poldrack & Packard, 2003). Procedural information can trigger declarative information: for instance, the procedural information conveyed by gestures can activate declarative knowledge about an event that occurred previously, thereby triggering episodic memory (Ianì & Bucciarelli, 2017; Ianì et al., 2016). 
Further, many studies highlighted how the motor system and procedural information provided by performing or observing gestures can improve declarative memory (Engelkamp & Zimmer, 1985). Action helps in remembering verbal information in a procedural way, without the use of intentional encoding strategies (e.g., Earles & Kersten, 2002). At the same time, incongruent procedural information seems able to interfere with memory processes (e.g., Dijkstra et al., 2007).

Because memory is the reenactment of perceptual, motor, and somatic states acquired through experience with the world, it can be conceptualized as a “sensorimotor mental simulation.” The notion of “mental simulation” has been widely used in cognitive psychology. Although it underlies a variety of other cognitive processes, such as mechanical reasoning (e.g., Hegarty, 2004), deductive and abductive reasoning (Khemlani, Mackiewicz, Bucciarelli, & Johnson-Laird, 2013), mental imagery (e.g., Moulton & Kosslyn, 2009), and empathy (e.g., Niedenthal et al., 2005), it has been used primarily in relation to social cognition (e.g., Di Pellegrino, Fadiga, Fogassi, Gallese, & Rizzolatti, 1992; Keysers & Gazzola, 2007; Rizzolatti, Fadiga, Gallese, & Fogassi, 1996) and motion inference (Carlini, Actis-Grosso, Stucchi, & Pozzo, 2012). Understanding other people’s actions involves the activation of the same areas devoted to the production of the same actions both in the agents and in the observers. Indeed, a network of brain regions (i.e., the mirror-neuron system) is activated both when an action is performed and when it is observed in others (Buccino et al., 2001; Rizzolatti & Craighero, 2004; see, e.g., Prinz, 1990). Several authors (e.g., Gallese, 2007) have argued that such activation provides an internal representation of the observed motor programs that, in the absence of overt movements, is usually called “action simulation” (Jeannerod, 2001). As an action is observed, the motor system constructs a forward simulation of the action in order to predict and anticipate the action goal (Wilson & Knoblich, 2005). What is the relationship between this kind of simulation and those implicated in memory processes? The literature has not yet clearly differentiated the nature of these simulations or specified the relationship between them.
It is likely that these types of cognitive processes involve qualitatively different kinds of simulations and different kinds of phenomenological experiences (e.g., Kent & Lamberts, 2008; Moulton & Kosslyn, 2004). At the same time, both of these mental processes rely heavily on the motor system, and both contain perceptual and sensorimotor information. Such simulations are usually defined as situated, because the situated nature of experience in the environment is reflected in the situated nature of the mental representations underlying simulations (Barsalou, 2009). Further, simulations arising from action observation are modulated by the expertise and previous experience of the observer (Calvo-Merino, Glaser, Grèzes, Passingham, & Haggard, 2004): neural activation during action observation is greater when participants are familiar with the observed action compared to the neural activation arising from either unusual actions or actions outside the reach of ordinary human motor skills. In addition, when participants are trained in a specific movement, they are better than untrained participants at visually recognizing similar movements (Casile & Giese, 2006). These latter findings seem to suggest that the effectiveness of simulations triggered by impinging stimuli is strictly dependent on previous experiences, stored in the sensorimotor system of memory. Barsalou (2009) proposed a unique “simulator” process underlying both these simulations: the brain can be viewed as a coordinated system that generates a continuous stream of multimodal simulation (Barsalou et al., 2007), which is the reenactment of perceptual, motor, and introspective states acquired during experience with the world, body, and mind (Barsalou, 2008). Such a process has two main phases: storage in long-term memory of multimodal states that arise across the brain’s systems for perception, and partial reenactment of these multimodal states for later representational use, including prediction.
In this view, simulation triggered by action observation is also substantially a memory process. Thus, we could conceptualize the simulation deriving from the observation of the action as a particular case of memory, in which fundamental aspects related to the input stimulus are automatically reactivated. Recently, evidence for a causal role of those brain regions in action comprehension has been reported (Michael et al., 2014).

A second corollary conclusion that can be drawn from the reviewed literature is that human body manipulations are able to affect memory processes by favoring or interfering with them. In this regard, sensorimotor representations and their modality-specific cortical activations seem to be not just an epiphenomenon but rather constitutive elements of cognitive processes. Focusing on the causal influences by which body manipulation affects memory, it appears that retrieval of memory traces involves the activation of sensorimotor brain areas and, crucially, that interfering with them by performing a secondary motor task leads to interference in memory processes. Body manipulation results in changes in memory performance. It follows that memory is not fully independent of the body. In other words, the body, along with its sensorimotor states, is at least partly a constitutive and inseparable part of the cognitive processes involved in the encoding and retrieval of mnestic traces.

Toward a more circumstantial outline

Having established that the body is involved in memory processing, the question that arises is to what extent memory depends on these sensorimotor processes. To address this issue, one needs to outline the exact degree to which sensorimotor involvement determines the efficiency of memory processes. Although it is clear how the body shapes memory traces, there is a risk of overstating the role of motor processes in memory (see, e.g., Mahon & Caramazza, 2008). In this regard, the first consideration touches upon the most obvious limit of body manipulations: none of these, even the most incongruous with the original experience or the most challenging for our motor system, is able to completely erase a memory trace, either an episodic or a semantic one. For example, being relaxed or lying down, as is usual at a dental visit, facilitates the recovery of the memory trace concerning my last dental visit, but adopting a different posture does not cancel the memory itself. Similarly, keeping our arms and hands behind our back does not imply an incapacity to grasp the meaning of a graspable object (e.g., a mug). In fact, several studies seem to suggest that patients with impaired object-directed reaching and grasping due to motor area lesions show intact object identification (e.g., Marotta, Behrmann, & Goodale, 1997). More generally, patients with sensorimotor impairments due to lesions in motor areas do not show great impairment in action understanding or action remembering (for a discussion, see, e.g., Hickok, 2014; Mahon, 2015). In other words, a nucleus of memory that is a “purely cognitive activity” seems to exist (Goldinger et al., 2016).

Second, the facilitation provided by a body position compatible with the content of the stimulus materials to be remembered seems to be only the result of a greater availability of processing resources. In fact, most of the abovementioned studies evaluated these congruence/incongruence effects in terms of access speed or access facilitation (e.g., Casasanto & Dijkstra, 2010; Dijkstra et al., 2007; Ianì & Bucciarelli, 2018), in terms of changes in the sensation of familiarity of a given motor sequence (e.g., Yang et al., 2009), or in terms of how manipulating somatic states can modify the emotional aspects accompanying memories (e.g., Arminjon et al., 2015; Veenstra et al., 2017). In all these examples, manipulations of the body seem to lead to a modulation of the memory process, but not to a suppression thereof, nor to a change in memory representations. Body manipulations are not able to reset memory processes to zero. They seem to be able to influence how humans feel or process information, or to change the accessibility of concepts associated with a given bodily state. This would seem to suggest that, although somatosensory elements come heavily into play and, in part, constitute the ontology of a memory, the latter is not confined solely to them; there is a nucleus in which the memory processes remain independent of somatosensory ones. In other words, it seems that sensory and motor information does not exhaust memory contents (Meteyard et al., 2012). Further, in order to describe the consequences of body manipulations in more detail, three possible areas need to be considered.

Effect of body manipulations on the quality of memory traces

In order to outline the extent to which the body affects memory, one should differentiate how sensorimotor interferences operate on the retrieval processes rather than on the representation of the memory traces. In other words, the first question still under debate concerns whether the congruence/incongruence effects can also affect the quality of a memory trace and, more precisely, the accuracy with which some details are reevoked, especially those that convey motor and spatial information. In the reviewed literature, this issue appears to have been addressed only in eye-movement experimental settings: when, at recall, the participants were asked to maintain fixation away from the original location of the stimulus shape, so as to interfere with the gaze reenactment process, recall accuracy decreased (e.g., Laeng et al., 2014). Oculomotor manipulations resulted in measurable costs in the quality of memory. Would a similar cost emerge using body and movement manipulations? For instance, if during the retrieval of a previously performed action participants are invited to perform a similar but not totally identical action (e.g., grasping a bottle at a different contact point), would they be less accurate in reporting the original action (e.g., is the estimation of where I grasped the bottle influenced by the action performed at recall)? In other words, does performing a secondary task (involving the same sensorimotor resources involved in the mental simulation during retrieval) result in decreased precision with which participants report spatial and motor information?

To sum up, in the quoted literature, evidence for an effect on the accuracy of the memory trace arises only in eye-movement studies. In the other cases, at the current state, sensorimotor information seems able to shape cognitive processing (i.e., the access to a given memory trace) rather than mental representation. Because oculomotor manipulations seem able to interfere with the quality of memory, other kinds of body manipulations might also result in similar costs. More recently, the results of two studies on co-speech gestures seem to suggest that body manipulations at the recall phase may affect the way in which participants manipulate a given memory trace. In a study by Kamermans et al. (2019), participants learned a bistable figure through touch for 30 seconds. Later on, they were introduced to the idea of bistability and were asked to reinterpret the figure, that is, to find the alternative interpretation of the learned figure by mentally simulating a rotation. During the test phase, participants were divided into three groups. In the gesture condition, participants were asked to move their hands and arms as if actually having the figure in their hands. In the no gesture condition, participants were asked to keep their hands still on the table. In the manual interference condition, they were asked to drum their fingers with both hands continuously on the table. Results revealed that participants who were engaged in the secondary motor task were less successful in reinterpreting the figure, thereby indicating that by loading up the motor system it is possible to interfere with the ability to mentally rotate the figure stored in memory (Kamermans et al., 2019). Likewise, Nathan and Martinez (2015) found that restricting gesture production during the test phase, by asking participants to tap a particular spatial pattern with one hand, results in a significant reduction in the ability to make inferences starting from the learned material.
However, these results seem to suggest that body manipulations may affect the way by which participants manipulate a given memory trace, rather than the memory trace itself. In other words, they speak in favor of a causal role of the motor system in handling a given memory trace.

Remembering another person’s actions through sensorimotor simulation

A second open question concerns the specific memory of other people’s actions. Access and retrieval of a given episodic memory can be facilitated (and memory content affected) when specific action patterns are executed favoring the sensorimotor simulation of the event that initially caused the memory. However, this effect has always been observed when considering actions performed by the subject. There are situations in which the gist of the event/memory is an action by another person. This is the usual situation in witness contexts, in which the memory to be recollected focuses on another person’s actions, as in the kind of memories typically involved in eyewitness reports (e.g., remembering having been mugged). Do similar mental simulations play the same facilitating role during the retrieval of memory traces for actions performed by another person as they do in the case of actions performed by the subjects themselves? And does the manipulation of body movements/posture affect the process?

There is some preliminary evidence in favor of this hypothesis. On the one hand, as previously mentioned, the literature on action observation suggests that even when a person who is staying still observes an action by another individual, their motor system is automatically activated (see, e.g., Buccino et al., 2001). According to the reactivation hypothesis, that activation might play a pivotal role in recalling the event. Further, although using a specific experimental setting involving pantomime gestures rather than real, everyday actions, Ianì and Bucciarelli (2018) found that memory for sentences involving actions previously performed by an actor is greater when the actor moves compared with when he stays still while the sentence is uttered. On the other hand, involving the listener’s motor system during gesture observation cancelled the beneficial effect only when the motor task involved the same effectors used by the speaker, as the beneficial effect indeed persisted when the motor task involved different effectors. Now, supposing that these results were to go beyond this specific experimental setting, we could also expect the motor-based information to be used to simulate and reconstruct another person’s actions. But, then, a secondary motor task at recall would interfere with such memories, too.

Effects of body manipulations on sense of recollection

A third area in which it is likely that sensorimotor interference paradigms might affect memory processes is that of the phenomenal characteristics associated with memory. Since the sensorimotor simulations convey mostly perceptual and sensory information linked to the original event, it is likely that they contribute to the sensation of “reliving” the event itself. When we retrieve a given memory trace, we engage in a mental simulation rich in details because not only visual information but also relevant motor states come into play. Memory does not keep a trace of experience as if it were an abstract idea. Nyberg (2002) claimed that both perceptual and sensorimotor information is part of the memory trace and that brain regions storing such information are spontaneously reactivated at recall in order to reactivate the same perceptual and sensorimotor feelings. In other words, the sensorimotor simulation might support memory in terms of perceptual characteristics, motor states, emotional richness, and the feeling of “reliving,” thus in terms of its phenomenological characteristics (Mazzoni, Scoboria, & Harvey, 2010). For instance, when remembering an event in the past, we may “see” in our mind the place where the event took place as well as the objects and the people who were present, and relive the thoughts and the feelings we had during that event. All these details lead to the subjective experience of mentally reliving a past event (D’Argembeau & Van der Linden, 2006), a feeling of “warmth and intimacy” (James, 1890/1950). Therefore, involving the motor system should result in changing the motor details reactivated at recall, thereby modifying the phenomenological characteristics associated with recollection.
In this respect, a growing body of literature has revealed that such a “remembering” system can be dissociated from the “believing” one: the sensation of recollection relies on cognitive mechanisms different from those underlying the framework of beliefs about such an event (Mazzoni et al., 2010). In an anecdotal report, Jean Piaget (1951) described how, for much of his life, he had a detailed memory of having almost been abducted in a park when he was 2 years old, while in his stroller with his nurse. Piaget described his memory in a vivid and detailed way. Many years later, Piaget’s former nurse confessed that she had completely fabricated the story, and so he stopped believing it. Crucially, Piaget could not stop “remembering” it with a strong sense of recollection, even when he was certain that the event had not, in fact, occurred. What Piaget described is a nonbelieved memory (NBM), a counterintuitive phenomenon in which people report a vivid memory of an autobiographical event although they believe the event did not occur. Despite the newfound knowledge, Piaget remained able to “remember” the scene, which continued to feel very much like a true memory. This is a kind of “pure memory” for an event not accompanied by the belief that it really occurred: the sensation of being able to mentally travel back in time and reexperience events with vivid perceptual and sensorimotor information, even though the belief about its veracity has been lost. Indeed, mental representations tend to be labeled as “memories” or as “recollected” when they are associated with high levels of vivid perceptual and contextual information, thereby inducing a strong sense of reexperiencing the past (Moskovitch, 2012).

Scoboria, Mazzoni, Kirsch, and Relyea (2004) were among the first scholars to discover the existence of this kind of “pure” memory. The participants in their study were asked to fill out a series of questionnaires about memory, belief, and plausibility in relation to 10 events that might have occurred in their childhood. Although for 96% of items, memory implied belief and belief implied plausibility (according to the so-called nested constructs hypothesis), the authors found that in a small percentage of cases (4%), the memory rating exceeded the belief rating. These results indicate that memories are not always nested within beliefs (Scoboria et al., 2004). People can maintain a memory trace for an event despite the autobiographical belief being lost. Subsequent studies confirmed that such dissociation between memory and belief can occur in natural contexts. In particular, Mazzoni et al. (2010) examined the frequency with which NBMs usually occur, the factors that lead individuals to withdraw their belief that the events portrayed in these memories occurred, and the phenomenological features of these memories. Approximately 20% of a large sample of participants reported having at least one NBM in their life. Those who reported having at least one NBM were then asked to fill out a memory inventory assessing the reasons why these memories were no longer believed and the characteristics of these mental experiences. The participants reported having stopped believing in their memories either because of negative social feedback or because they perceived the event as no longer plausible (for a recent and more detailed analysis, see Scoboria, Boucher, & Mazzoni, 2015). Further, the results revealed that NBMs are subjectively experienced much like normal believed memories.
Although NBMs were rated less personally important and less connected to other memories, the phenomenological ratings were similar to those of believed memories, sharing with them many phenomenological features such as “strong sense of recollection” (see also, e.g., Clark, Nash, Fincham, & Mazzoni, 2012). These phenomenological features provide a “memory-like” quality to mental experiences, regardless of whether or not these mental representations were believed. These memories enabled participants “to travel back in time mentally and relive the event, reexperience the same intense emotions, clearly recall visual and other perceptual details, and form a clear idea of where objects and people were in the original event” (Mazzoni et al., 2010, p. 1339).

NBMs can also be induced in the laboratory. The most common technique to create NBMs is to induce false memories and then to provide social feedback that disconfirms them. Inducing a false memory in a participant and then telling them the false event did not actually occur is very likely to undermine the belief that it happened. But do the mental image and the feeling of “reliving” the event disappear as well? Several false memory paradigms have been devised to address this question. For example, Otgaar, Scoboria, and Smeets (2013), using the classical memory-implantation procedure (e.g., Loftus & Pickrell, 1995) and a subsequent debrief, found that 40% of participants having a predebriefing false memory reported an NBM at postdebriefing (see also Clark et al., 2012; Otgaar et al., 2015). A study by Mazzoni, Clark, and Nash (2014) showed that experimentally inducing NBMs does not necessarily require first implanting false memories. In their study, when participants received negative feedback about true experiences, they stopped believing in them but at the same time maintained a strong sense of recollection.

Whereas it is widely accepted that humans can have autobiographical belief about a given event without any memory of it, at first glance the reverse pattern sounds very odd. Piaget’s anecdote, as well as the results of this series of studies, suggest that “belief” and “memory” are instead two independent and fully dissociable cognitive systems (see, e.g., Scoboria et al., 2014) and that the belief system is more responsive to social feedback than the memory system is (Otgaar, Scoboria, & Mazzoni, 2014). Whereas several studies have detected the factors able to affect and interfere with the belief system (i.e., negative social feedback; Mazzoni et al., 2010; Scoboria et al., 2015), little attention has been paid to the factors that are able to influence the memory system. Several experimental paradigms are able to affect either belief ratings alone or both belief and memory ratings, but not memory judgments alone. In this respect, the SMM would apply well to the “memory” system and would predict that manipulation of body movements might affect the memory system, while leaving the belief system intact. Sensorimotor simulation could contribute to the phenomenological features associated with memory. This would be crucial for the NBM literature, as a double dissociation is needed in order to speak in favor of the independence of two cognitive processes (see, e.g., Shallice, 1988). Further, the effects of body manipulations may also have some clinical implications. For instance, the possibility that body manipulations may affect the sense of recollection may have potential applications to the study of traumatic memories. Can individuals, through specific body manipulations, cope with intrusive memories by reducing the degree to which they “feel” they are reliving the original events?
In other words, asking people during the recollection of traumatic experiences to move their bodies in an incongruent way compared with the actions involved in the traumatic memory trace should reduce the sensation of “reliving” the event, thereby helping people decrease negative emotions associated with it. To shed light on this issue, further research is needed.

Closing comments on EC

Following the tenets of the EC approach, high-level cognition is based on lower level bodily and neural processes. Higher cognitive functions, such as the memory system, are supposed to be based on the same neural system that controls action and perception (Glenberg, 1997). Such a view is supported by the findings of the cited literature: memory is embodied because events stored in our memory are, at least in part, a reenactment of sensorimotor information. Crucially, the results of the presented literature suggest that motor information associated with a memory trace is not merely an epiphenomenon that may or may not be reactivated but part of the memory process, to such an extent that it is able to trigger and facilitate the retrieval process. When sensorimotor information is inaccessible, our memory system uses different and compensatory processes to reach the target, but such processes come at a cost (detectable, for instance, in reaction time analyses). It is not the case that events are represented in an amodal format, with the associated neural activation only subsequently spreading to the sensorimotor representations to which they are connected. If that were true, sensorimotor interferences would not lead to a cost in retrieval of the memory trace. The motor activity during memory retrieval is a direct reflection of the event access itself, and in this sense is a constitutive part of the retrieval process. Put more plainly, it is not an additional part of a memory trace, but rather a privileged component, through which our cognitive system retrieves information.

Therefore, results from the described literature seem to support the views according to which cognitive processes are influenced by the body. At the same time, a memory trace is not totally constituted by motor information: the sensorimotor influence seems to be partial, and, further, its exact extent is still unknown. I have argued that the evidence in favor of the SMM concerns the effect of body manipulations on memory processes rather than memory representations. The aim of this review, besides collecting evidence in favor of the SMM, was to raise important issues that remain unresolved and identify important areas for future research, such as the study of the phenomenology associated with memory retrieval. For instance, interfering with the sensorimotor system may not cancel a given memory, but it may affect the way in which the trace is retrieved and experienced by the subject. If a memory can change its reliance on different kinds of sensory information, it would be reasonable to speculate that interfering with the motor system leads to representations that lean more heavily on other modality-specific information, such as visual information.

Outlining to what extent memory depends on sensorimotor processes would also be crucial in regard to the wider debate on EC. As pointed out by Mahon and Caramazza (2008), although it is clear how the body shapes memory traces, there is a risk of overstating the role of motor processes in memory. Similarly, Zeelenberg and Pecher (2016) pointed out that, given the highly flexible nature of the human cognitive system, it is reasonable to assume that evidence of motor system involvement does not have implications for all cognitive components, or at least not to the same extent. Therefore, it is reasonable to support the view according to which the embodied cognition approach cannot entirely replace cognitive psychology (for a discussion, see, e.g., Chemero, 2009). One of the strongest and most controversial interpretations of embodied cognition, according to which cognition may occur without internal representations, “may appear reasonable when confined to specific feature and domains, but it appears deeply flawed when extended to broader analysis” (Goldinger et al., 2016, p. 961). Rather, what is still unexplored is the effect of body manipulation on such representations, and specifically on the phenomenology associated with them.