Integrating Speech and Iconic Gestures in a Stroop-like Task: Evidence for Automatic Processing


Spencer D. Kelly(1), Peter Creigh(2), and James Bartolotti(1)
(1) Colgate University, (2) University of Pittsburgh

© 2009 Massachusetts Institute of Technology. Journal of Cognitive Neuroscience 22:4, pp. 683–694

Abstract
Previous research has demonstrated a link between language and action in the brain. The present study investigates the strength of this neural relationship by focusing on a potential interface between the two systems: cospeech iconic gesture. Participants performed a Stroop-like task in which they watched videos of a man and a woman speaking and gesturing about common actions. The videos differed as to whether the gender of the speaker and gesturer was the same or different and whether the content of the speech and gesture was congruent or incongruent. The task was to identify whether a man or a woman produced the spoken portion of the videos while accuracy rates, RTs, and ERPs were recorded to the words. Although not relevant to the task, participants paid attention to the semantic relationship between the speech and the gesture, producing a larger N400 to words accompanied by incongruent versus congruent gestures. In addition, RTs were slower to incongruent versus congruent gesture–speech stimuli, but this effect was greater when the gender of the gesturer and speaker was the same versus different. These results suggest that the integration of gesture and speech during language comprehension is automatic but also under some degree of neurocognitive control.

INTRODUCTION
Over the past decade, cognitive neuroscientists have become increasingly interested in the relationship between language and action in the human brain (for a recent review, see Willems & Hagoort, 2007). This interest is fueled by the discovery that language shares neural substrates that are involved in more basic processes such as the production and the perception of action (Nishitani, Schurmann, Amunts, & Hari, 2005; Rizzolatti & Craighero, 2004; Rizzolatti & Arbib, 1998). These findings have led many researchers to view language as an ability that grew out of action systems in our evolutionary past (Armstrong & Wilcox, 2007; Corballis, 2003; Kelly et al., 2002), or as Elizabeth Bates so eloquently put it, "[as] a new machine that nature has constructed out of old parts" (MacWhinney & Bates, 1989; Bates, Benigni, Bretherton, Camaioni, & Volterra, 1979). Taking this embodied view that language is inextricably tied to action in present day communication, the current article investigates the strength of this relationship by focusing on a potential interface between the two systems: cospeech iconic gesture.

Cospeech iconic gestures are spontaneous hand movements that naturally, pervasively, and unconsciously accompany speech and convey visual information about object attributes, spatial relationships, and movements (McNeill, 2005; Kendon, 2004; Goldin-Meadow, 2003; Clark, 1996). These gestures are so tightly tied to speech that theorists have argued that the two together constitute a single integrated system (McNeill, 1992, 2005; Kita & Özyürek, 2003; Özyürek & Kelly, 2007). For example, McNeill (2005) argues that because speech and iconic gesture temporally overlap but convey information in two very different ways—speech is conventionalized and arbitrary, whereas iconic gestures are idiosyncratic and imagistic—the two together capture and reflect different aspects of a unitary underlying cognitive process. For example, imagine someone saying, "They were up late last night," while making a drinking gesture. From this example, it should be clear that the iconic gesture and the accompanying speech combine to reveal meaning about an activity that is not fully captured in one modality alone.

Exploring this relationship between speech and iconic gesture may provide a unique opportunity to better understand the extent to which language and action are connected in the brain. Indeed, if language is inextricably linked to action, as many have argued, one might expect iconic gesture to be inextricably linked to speech. The present study investigates the strength of this neural link by exploring the extent to which the two channels are automatically integrated during language comprehension.

The Integration of Gesture and Speech

The first attempts to address this question of whether gesture and speech form an integrated system came from researchers in psychology, linguistics, and education using behavioral methods. Although some researchers have recently attempted to support this claim by investigating
processes involved in language production (Goldin-Meadow, 2003; Kita & Özyürek, 2003), the majority of experimental work has focused on language comprehension, showing that people typically pay attention to gesture when processing speech (Beattie & Shovelton, 1999; Cassell, McNeill, & McCullough, 1999; Kelly, Barr, Church, & Lynch, 1999; Kelly & Church, 1998; Goldin-Meadow, Wein, & Chang, 1992). However, these studies have been criticized for not conclusively demonstrating that gesture and speech are truly integrated during language comprehension. For example, researchers such as Robert Krauss have argued that this work on gesture comprehension demonstrates, at best, a trivial relationship between speech and gesture at comprehension (Krauss, 1998; Krauss, Morrel-Samuels, & Colasante, 1991). That is, gesture may be used as "add-on" information to comprehend a communicator's meaning only after the speech has been processed. In other words, in cases where gesture does influence comprehension, the attention to gesture is post hoc and occurs well after the semantic processing of speech. So according to this view, when people do pay attention to gesture during language comprehension, it is an afterthought.

Recently, researchers have turned to techniques in cognitive neuroscience to better understand the relationship between speech and gesture during language comprehension. Many of the initial studies used ERPs to investigate the time course of gesture–speech integration (Bernardis, Salillas, & Caramelli, 2008; Holle & Gunter, 2007; Özyürek, Willems, Kita, & Hagoort, 2007; Wu & Coulson, 2007a, 2007b; Kelly, Kravitz, & Hopkins, 2004). For example, Wu and Coulson (2007b) presented gesture–speech utterances followed by pictures that were related either to gesture and speech or to just the speech alone. When pictures were related to gesture and speech, participants produced a smaller N400 component (the traditional semantic integration effect, as in Kutas & Hillyard, 1980) than when the pictures were related to just the speech. This suggests that the visuospatial aspects of gestures combined with speech to build stronger and more vivid expectations of the pictures than just speech alone. This finding makes sense in light of other ERP research demonstrating that the semantic processing of a word is affected by the presence of cospeech gestures that convey either congruent or incongruent visual information (Kelly et al., 2004).

So where are gesture and speech semantically integrated in the brain? Using functional neuroimaging techniques (mostly fMRI), researchers have discovered that this integration occurs in brain regions implicated in the production and comprehension of human action (Hubbard, Wilson, Callan, & Dapretto, 2009; Holle, Gunter, Rüschemeyer, Hennenlotter, & Iacoboni, 2008; Wilson, Molnar-Szakacs, & Iacoboni, 2008; Montgomery, Isenberg, & Haxby, 2007; Skipper, Goldin-Meadow, Nusbaum, & Small, 2007; Willems, Özyürek, & Hagoort, 2007). For example, Skipper et al. (2007) found that inferior frontal regions processed speech and gesture differently when the gestures had meaningful versus nonmeaningful relationships to speech. Moreover, Willems et al. (2007) showed that gestural information and spoken information are both processed in the same brain regions—inferior frontal areas—during language comprehension. This inferior frontal region is an interesting site for the integration of gesture and speech because it is traditionally seen as a language region (as in Broca's area), but more recently it has been described as a key component of the human mirror neuron system (Rizzolatti & Craighero, 2004). Other researchers have identified more posterior regions, such as the superior temporal sulcus and the inferior parietal lobule, as sites for the integration of gesture and speech (Holle et al., 2008; Wilson et al., 2008). Again, these areas are noteworthy because they, like the more anterior regions, are independently implicated in language processing and action processing.

This research has made good progress toward showing that gesture and speech are indeed integrated during language comprehension and that this integration occurs in brain regions involved with human action. Still, it does not fully address the critique by Krauss et al. (1991) regarding the extent to which gesture and speech are integrated during comprehension. That is, gesture may be linked to speech, but—to borrow an earlier description—not inextricably linked. One way to address this concern is to determine whether people cannot help but integrate gesture and speech.

Automatic and Controlled Processes

To determine the strength of the gesture–speech relationship, it may be highly revealing to test whether the integration of the two modalities is automatic or controlled. This distinction was first introduced within the field of cognitive psychology (Schneider & Shiffrin, 1977; Posner & Snyder, 1975; but for a more recent perspective, see Bargh & Morsella, 2008). The traditional distinction is that automatic processes are low level, fast, and obligatory, whereas controlled processes are high level, slow, and under intentional guidance. For example, recognizing that someone who reaches for an object must want that object is an automatic process—understanding why they want that object, a controlled process. So when someone is presented with gesture and speech, do they automatically integrate the two modalities or do they more consciously and strategically choose to integrate them?

Researchers have found evidence in other domains that certain nonverbal or paralinguistic behaviors (such as facial expression, body posture, and tone of voice) are automatically processed in the brain (de Gelder, 2006; Winston, Strange, O'Doherty, & Dolan, 2002; Pourtois, de Gelder, Vroomen, Rossion, & Crommelinck, 2000). For example, de Gelder (2006) provides evidence for a subcortical neurobiological detection system (which primarily includes the amygdala) that obligatorily and unconsciously processes emotional information conveyed through facial expressions and body posture. Focusing specifically
on manual actions, there is evidence that noncommunicative hand movements, such as reaches and physical manipulations of objects, also elicit automatic processing (Iacoboni et al., 2005; Blakemore & Decety, 2001). For example, Iacoboni et al. (2005) found that the inferior frontal gyrus and the premotor cortex (regions traditionally associated with the human mirror neuron system) differentiated actions that were intentionally versus unintentionally produced. Interestingly, this effect held even when people were not explicitly instructed to attend to the goals of the actions, a finding that led the authors to conclude that people automatically process the meaning of human action.

There have been a handful of studies that have specifically explored the automatic versus controlled nature of gesture processing (Holle & Gunter, 2007; Kelly, Ward, Creigh, & Bartolotti, 2007; Langton & Bruce, 2000). On the one hand, Langton and Bruce (2000) demonstrated that people cannot help but pay attention to pointing and emblematic gestures (in the absence of speech) even when their task was to attend to other information. On the other hand, there is some evidence that when it comes to processing cospeech gestures, attention to gesture may have elements of a controlled process. For example, Holle and Gunter (2007) showed that the presence of nonmeaningful gestures (e.g., grooming behaviors) modulates the extent to which meaningful gestures set up semantic expectations for speech later in a sentence. In addition, Kelly et al. (2007) found that the integration of gesture and speech was influenced by explicit instructions about the intended relationship between speech and gesture. Specifically, when processing speech in the presence of an incongruent versus congruent gesture, there was a larger N400 effect in bilateral frontal sites when people were told that the gesture and speech belonged together, but this N400 effect was not present in the left hemisphere region when people were told that gesture and speech did not belong together. In other words, people appeared to have some control over whether they integrated gesture and speech, at least under circumstances of explicit instructions about whether to integrate the two modalities.

This finding from Kelly et al. (2007) is provocative because it suggests that high-level information such as knowledge about communicative intent modulates the neural integration of gesture and speech during language comprehension. However, the study leaves open a number of unresolved issues. First, because it used very direct and heavy-handed (as it were) instructions not to integrate gesture and speech, one wonders what might happen in the absence of explicit task demands. Indeed, if people strategically choose (on their own) not to integrate gesture and speech, it would allow for much stronger claims about the inherent controlled nature of gesture–speech integration. Second, the task required people to consciously attend to the semantics of the gesture–speech utterances, which could have inherently encouraged controlled processing. A much more powerful test would be not requiring any overt or conscious attention to the semantics of the stimuli.

The third point is more general and requires some explanation. Previous ERP research on the time course of integrating speech and cospeech gestures has conflated iconicity and indexicality (Kelly et al., 2004, 2007).1 These previous studies used stimuli of people gesturing to objects—for example, gesturing the tallness of a glass and verbally describing either the tallness of that glass (a congruent pair) or the shortness of an adjacent dish (an incongruent pair). The main finding was that incongruent pairs produced a larger N400 component than congruent pairs. However, it is impossible to determine whether the N400 effect arose from the incongruent iconic information of the gesture (i.e., tall vs. short) or whether it arose from the incongruent indexical information (i.e., glass vs. dish). It is one thing to say that indicating one object with a gesture makes it difficult to simultaneously process speech referring to another object, but it is another to say that processing the iconic meaning of a gesture interacts with the simultaneous semantic processing of the accompanying speech. If one wants to make strong and definitive claims about the automatic or controlled integration of speech and cospeech gesture in the brain, it is very important to precisely identify what aspects of gesture—its indexicality or its iconicity—are part of that integrated system.

The Present Study

The present study used improved methods to explore the automatic integration of gesture and speech during language comprehension. First, to address the issue of "iconicity versus indexicality," we used gestures that conveyed no indexical information. The gestures in the present study conveyed iconic information about various actions in the absence of actual objects. These types of iconic gestures are very common in natural communication (McNeill, 1992), but their meaning is usually unclear or ambiguous in the absence of accompanying speech (unlike indexical gestures). Therefore, better understanding how they are temporally integrated with co-occurring speech is important for theories of gesture and speech, specifically, and action and language, more generally.

Second, to address the problem of heavy-handed task instructions that require conscious analysis of gesture and speech, we used a Stroop-like paradigm to test the integration of the two modalities. The classic Stroop technique presents people with color words that are written in different colored fonts, and the "Stroop effect" arises when the meaning of the written word influences how quickly and accurately people can name the color of the font (Stroop, 1935). This effect has traditionally been viewed as a hallmark for automatic processing (Logan, 1978; Schneider & Shiffrin, 1977; Posner & Snyder, 1975), but more recent research has shown that certain attentional and contextual factors can modulate the effect (Bessner
& Stolz, 1999; MacLeod, 1991). By using a variant of this Stroop procedure, we avoided the problems of explicitly drawing attention to gesture and speech—attention that may unintentionally encourage conscious and strategic processing of the two modalities.

In our modified version of the classic Stroop paradigm, participants watched videos of a man and a woman gesturing and speaking about common actions. The videos differed as to whether the gender of the gesturer and speaker was the same or different and whether the content of the gesture and speech was congruent or incongruent. The task was to simply identify whether the man or the woman produced the spoken portion of the videos while accuracy rates and RTs (behavioral measures) and ERPs (electrophysiological measure) were recorded to the spoken targets. In this way, participants were required to make very superficial judgments about the acoustic properties of the spoken message, that is, whether a word was produced by a man or a woman. Importantly, the actual content of the gesture and speech was irrelevant.

Using this Stroop-like paradigm, we made two predictions. The rationale of the predictions is that although the relationship between gesture and speech was not relevant to the participant's task (i.e., to identify the gender of the spoken portion of the videos), it nevertheless should affect task performance if the integration of gesture and speech is automatic. On the basis of this logic, the first prediction is that participants should be slower to identify the gender of a speaker when gesture and speech have an incongruent versus congruent relationship. The second prediction is that there will be a larger N400 component (indexing semantic integration difficulties) to words accompanied by incongruent versus congruent gestures.

METHODS

Participants

Thirty-two right-handed (measured by the Edinburgh Handedness Inventory) Caucasian college undergraduates (12 males, 20 females; mean age = 20 years) participated for course credit. All participants signed an informed consent approved by the institutional review board. A total of three participants were not analyzed due to excessive artifacts in brain wave data, and one was removed for not following task instructions.

Materials

Participants watched digitized videos of a man and a woman (only the torso was visible) uttering common action verbs while producing iconic gestures also conveying actions. To test our two predictions, the videos were designed in a Stroop-like fashion. Specifically, there was a gender manipulation that varied the relationship between the gesturer and the speaker. In the gender-same condition, the gesturer and the speaker were the same person (a man or a woman), but in the gender-different condition, the gesturer and the speaker were two different people (e.g., a man gesturing but a woman speaking, and vice versa). The gender-same and gender-different conditions were created by digitally inserting the speech of the man or the woman into the video containing the gestures of the woman or the man.

In addition to the gender relationship, the stimuli varied as to whether the gesture and the speech had a congruent or an incongruent relationship. In the gesture-congruent condition, the gesture and the speech were semantically related (e.g., gesturing cut and saying "cut"), and in the gesture-incongruent condition, they were semantically unrelated (e.g., gesturing stir and saying "cut"). The congruent gesture stimuli were normed before the experiment (n = 5) to ensure that the words and the gestures were indeed semantically related. The norming procedure presented video clips to participants who had to rate (on a scale of 1 to 7) whether the speech and the gesture were related. There were initially 26 gesture–speech pairs, but because two of them had ambiguous gesture–speech relationships, we dropped these two and were left with 24 pairs (that includes one practice pair). The mean congruence rating of the remaining 24 stimuli was 6.61 (SD = 0.63). We created the incongruent pairs by inserting the audio portion into gesture videos that conveyed semantically unrelated information. See Appendix A for all the gesture–speech pairs used as stimuli, and refer to Figure 1 for an example of the gender and gesture conditions for the "cut" item.

Figure 1. Still frame examples of the congruent and incongruent stimuli for the gender-same and gender-different videos.
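To make the design concrete, the following Python sketch assembles the four cells of the 2 (Gender: same, different) × 2 (Gesture: congruent, incongruent) design from a few of the verb pairings listed in Appendix A. It illustrates the crossing only, not the original stimulus-editing procedure; the incongruent pairings are taken from Appendix A, but the file-naming scheme is hypothetical.

    from itertools import product

    # A few of the gesture-speech pairings from Appendix A: each spoken verb
    # is paired with its own gesture (congruent) and with a semantically
    # unrelated gesture (incongruent).
    INCONGRUENT_GESTURE = {"cut": "stir", "chop": "mow", "saw": "type", "hammer": "twist"}

    SPEAKERS = ("man", "woman")   # gender of the recorded voice
    GESTURERS = ("man", "woman")  # gender of the videotaped gesturer

    def build_stimuli():
        """Cross speaker gender, gesturer gender, and gesture congruence."""
        stimuli = []
        for word, (speaker, gesturer) in product(INCONGRUENT_GESTURE, product(SPEAKERS, GESTURERS)):
            for congruence in ("congruent", "incongruent"):
                gesture = word if congruence == "congruent" else INCONGRUENT_GESTURE[word]
                stimuli.append({
                    "word": word,                    # spoken action verb
                    "gesture": gesture,              # action depicted by the hands
                    "speaker": speaker,
                    "gesturer": gesturer,
                    "gender": "same" if speaker == gesturer else "different",
                    "congruence": congruence,
                    # hypothetical file names; the audio is dubbed onto the gesture video
                    "video": f"{gesturer}_{gesture}.avi",
                    "audio": f"{speaker}_{word}.wav",
                })
        return stimuli

    if __name__ == "__main__":
        for stim in build_stimuli()[:4]:
            print(stim)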

The stimuli were designed to create a Stroop-like test that required explicit attention to the gender of the speaker, with the ultimate goal of determining whether people automatically process the semantic relationship between speech and gesture (more below).

Each audiovisual segment was 1 sec in duration. Different from previous research on the integration of speech and cospeech gesture (cf. Kelly et al., 2007), the video began immediately with the stroke portion of the action gesture (i.e., there was no preparation phase as there was in previous research). Following 200 msec after gesture onset, the spoken portion of the video began and lasted for 800 msec. Throughout this 800 msec, the gesture continued. In this way, gesture preceded speech for 200 msec and overlapped with speech for 800 msec. All of the ERP data were time locked to speech onset.2 The variable ISI was 1.5 to 2.5 sec. The video was 25 min in length.

Procedure

After participants were fitted with the ERP net (see below), the experimenter explained that participants would view videos of a man and a woman making short, one-word spoken utterances. The task was to identify, as quickly as possible, the gender of the verbal portion of the videos (i.e., whether the word was produced in a male or a female voice). A computer recorded these responses, and their latencies were used in the RT analyses. In addition, the computer simultaneously recorded brain wave data to the same responses. Note that this task does not require participants to respond to the semantic content of the speech, nor does it require any attention to gesture. Moreover, the gender of the gesturer is also irrelevant to the task. In this way, the task is a modified Stroop test, with one piece of information relevant to the task—the gender of the speaker—and other pieces of information irrelevant to the task—the semantic relationship between speech and gesture and the physical relationship between the speaker and the gesturer.

This Stroop-like design allowed us to test our two predictions: if gesture and speech are automatically integrated, participants should (1) be slower to identify the gender of a speaker when gesture and speech have an incongruent versus congruent relationship and (2) produce a larger N400 component to words accompanied by incongruent versus congruent gestures.

ERP Setup and Analysis

Participants were fitted with a 128-electrode Geodesic ERP net. The EEG was sampled at 250 Hz using a band-pass filter of 0.1–30 Hz, and impedances were kept below 40 kΩ (the Geonet system uses high-impedance amplifiers). The ERPs were vertex referenced for recording and linked-mastoid referenced for presentation. Following rereferencing, the brain waves were baseline corrected to a 100-msec prestimulus window. Eye artifacts during data collection were monitored with 4 EOG electrodes, with voltage shifts above 70 μV marked as bad (for more on the EOG algorithm, see Miller, Gratton, & Yee, 1988; Gratton, Coles, & Donchin, 1983). Non-EOG channels were marked as bad if there were shifts within the electrode of greater than 200 μV for any single trial. If over 20% of the channels were bad for a trial, the whole trial was rejected. In all, 16% (SD = 22%) of the trials were rejected.

The behavioral data were analyzed with a 2 (Gesture, congruent and incongruent) × 2 (Gender, same and different) repeated measures ANOVA with accuracy scores and response latencies as the dependent measures. These behavioral responses were taken to the onset of the verbal portion of the videos.

The ERP data were analyzed with a 2 (Gesture, congruent and incongruent) × 2 (Gender, same and different) × 5 (central, frontal, occipital, parietal, and temporal electrode region) × 2 (left and right hemisphere) repeated measures ANOVA. The present article focuses on one electrophysiological measure—the N400 component, which indexes semantic integration (Kutas & Hillyard, 1980). The N400 window was created by calculating the average amplitude from 250 to 550 msec (N400) postspeech onset.3

The electrode manipulation refers to the location of clusters of electrodes on the scalp. On the basis of previous studies, the 128 electrodes were broken up into five clusters of channels that corresponded roughly to basic anatomical structures of the brain. Refer to Figure 2 for a diagram of the clusters for the 128 electrodes (for more on the clusters, see Kelly et al., 2004). The purpose of these electrode clusters was to test whether the N400 effect was stronger or weaker in particular scalp regions. Specifically, the N400 effect is typically largest in central regions (Kutas & Hillyard, 1980) but is more anterior for stimuli that are comprised of gesture and speech (Kelly et al., 2007; Wu & Coulson, 2007a).

Figure 2. Ten electrode clusters for the 128-electrode geodesic net. For more on the rationale for the clusters, see Kelly et al. (2004).

All repeated measures analyses were adjusted for sphericity by using a Greenhouse–Geisser correction (Howell, 2002). Because all contrasts were planned and orthogonal, Student's t tests followed up on significant interaction effects.
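To make the trial-rejection and measurement rules above concrete, here is a minimal NumPy sketch. It assumes the epochs are stored as a (trials × channels × samples) array in volts, time locked to speech onset with a 100-msec prestimulus baseline, and it interprets "voltage shift" as the peak-to-peak range within an epoch; the EOG channel indices in the usage comment are hypothetical.

    import numpy as np

    FS = 250            # sampling rate (Hz)
    EPOCH_START = -0.1  # epoch begins 100 msec before speech onset

    def sample(t):
        """Convert a latency in seconds (relative to speech onset) to a sample index."""
        return int(round((t - EPOCH_START) * FS))

    def reject_trials(epochs, eog_idx, eog_thresh=70e-6, chan_thresh=200e-6, max_bad=0.20):
        """epochs: (n_trials, n_channels, n_samples) in volts. Returns a boolean keep-mask."""
        ptp = epochs.max(axis=2) - epochs.min(axis=2)      # per-trial, per-channel voltage shift
        bad = np.zeros(ptp.shape, dtype=bool)
        bad[:, eog_idx] = ptp[:, eog_idx] > eog_thresh     # EOG channels: 70 microvolt criterion
        non_eog = np.setdiff1d(np.arange(epochs.shape[1]), eog_idx)
        bad[:, non_eog] = ptp[:, non_eog] > chan_thresh    # other channels: 200 microvolt criterion
        return bad.mean(axis=1) <= max_bad                 # drop trials with >20% bad channels

    def n400_amplitude(epochs):
        """Mean amplitude 250-550 msec post speech onset, averaged over trials and channels."""
        window = slice(sample(0.250), sample(0.550))
        return epochs[:, :, window].mean()

    # Hypothetical usage: 4 EOG channels at indices 124-127 of the 128-channel montage.
    # keep = reject_trials(epochs, eog_idx=np.arange(124, 128))
    # n400 = n400_amplitude(epochs[keep])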

RESULTS

Behavioral Data

The ANOVA on the accuracy data revealed no significant main effects of Gesture, F(1,27) = 0.48, ns, or Gender, F(1,27) = 2.03, ns. In addition, there was not a significant interaction of Gesture × Gender, F(1,27) = 1.56, ns. These null results are likely due to a ceiling effect in the accuracy scores (refer to Table 1).

The ANOVA on the RT data did not reveal a significant main effect of Gender, F(1,27) = 0.01, ns, but it did uncover a significant main effect of Gesture, F(1,27) = 32.60, p < .001. In addition, there was a significant interaction of Gesture × Gender, F(1,27) = 9.04, p = .006. Simple effects analyses demonstrated that within the gender-same condition, participants were slower when gestures conveyed incongruent information to speech (M = 818 msec, SD = 198 msec) compared with when they conveyed congruent information (M = 772 msec, SD = 207 msec), t(27) = 5.34, p < .001; and within the gender-different condition, participants were also slower—but to a lesser degree—when gesture and speech were incongruent (M = 805 msec, SD = 198 msec) versus congruent (M = 786 msec, SD = 208 msec), t(27) = 3.39, p < .001 (refer to Table 1). Note that although the gesture-incongruent condition was slower than the gesture-congruent condition in both gender conditions, the significant interaction reveals that this difference was greater when the gender of the gesturer and speaker was the same versus different.

These results confirm our first prediction: Participants were slower to identify the gender of the speaker when gesture and speech had an incongruent versus congruent relationship, even when the relationship between the two modalities was not relevant to the task.

Table 1. Accuracy Scores and RTs across the Gesture and Gender Conditions

                      Mean Proportion Correct (SD)                    Mean RT in msec (SD)
                      Gesture Congruent   Gesture Incongruent   Gesture Congruent   Gesture Incongruent
Gender same           0.99 (0.03)         0.98 (0.03)           772 (207)           818 (208)
Gender different      0.97 (0.03)         0.98 (0.03)           786 (198)           805 (199)
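The logic of these follow-up tests can be reproduced with paired t tests on per-participant condition means; in a 2 × 2 within-subject design, the Gesture × Gender interaction is equivalent to a paired t test on the two congruency-effect difference scores (F = t²). The sketch below uses simulated RTs (n = 28, loosely matching Table 1) purely to show the computation; it is not the original dataset.

    import numpy as np
    from scipy.stats import ttest_rel

    rng = np.random.default_rng(0)
    n = 28  # participants retained in the analyses

    # Simulated mean RTs (msec) per participant and condition.
    rt = {
        ("same", "congruent"):        rng.normal(772, 40, n),
        ("same", "incongruent"):      rng.normal(818, 40, n),
        ("different", "congruent"):   rng.normal(786, 40, n),
        ("different", "incongruent"): rng.normal(805, 40, n),
    }

    # Simple effects: congruency effect within each gender condition.
    for gender in ("same", "different"):
        t, p = ttest_rel(rt[(gender, "incongruent")], rt[(gender, "congruent")])
        print(f"gender-{gender}: t({n - 1}) = {t:.2f}, p = {p:.4f}")

    # Interaction: is the congruency effect larger when gesturer and speaker match?
    effect_same = rt[("same", "incongruent")] - rt[("same", "congruent")]
    effect_diff = rt[("different", "incongruent")] - rt[("different", "congruent")]
    t_int, p_int = ttest_rel(effect_same, effect_diff)
    print(f"Gesture x Gender interaction: F(1,{n - 1}) = {t_int ** 2:.2f}, p = {p_int:.4f}")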

Electrophysiological Data

The ANOVA on the N400 time window revealed two main effects. Although not part of the predictions of the present study, there was a significant main effect of Gender, F(1,27) = 4.76, p = .038, with gender-different stimuli producing a greater negativity than gender-same stimuli.4 In addition, in accordance with the predictions, there was a significant main effect of Gesture, F(1,27) = 6.95, p = .013, with gesture-incongruent stimuli producing a more negative-going N400 component than gesture-congruent stimuli. Refer to Figure 3 for the main effect of Gesture, Figure 4 for the effects at different electrode sites, and Figure 5 for a topographical map of the N400 effect (peak = 408 msec) across all electrode sites on the scalp. Beyond these main effects, there were no significant two-way interactions for Gesture × Electrode Region, F(4,108) = 1.23, ns, Gender × Electrode Region, F(4,108) = 2.62, ns, Gesture × Hemisphere, F(1,27) = 1.02, ns, Gender × Hemisphere, F(1,27) = 0.12, ns, or Gesture × Gender, F(1,27) = 0.01, ns. Nor were there any three- or four-way interactions involving Gesture and Gender variables.

Figure 3. Grand averaged brain waves (collapsed across all electrode regions) for the gesture main effect. The gesture-incongruent condition produced a larger N400 component than the gesture-congruent condition. Microvolts are plotted negative up.

From observation of Figures 3 and 4, it appears that the difference between the gesture-congruent and the gesture-incongruent conditions may be the result of a carryover from differences earlier at the P2 peak.5 To address this possibility, we subtracted the P2 window (150–250 msec) from the N400 window and reran the ANOVA. Even after using this new baseline, the gesture-incongruent stimuli produced a more negative-going N400 component than the gesture-congruent stimuli, F(1,27) = 4.99, p = .034. Interestingly, this new analysis also yielded a marginally significant interaction of Gesture × Electrode Region, F(4,108) = 2.49, p = .103. This effect was driven by the gesture-incongruent stimuli producing a larger negativity than gesture-congruent stimuli in bilateral central, F(1,27) = 5.88, p = .022, and parietal regions, F(1,27) = 8.40, p = .007. However, there was not a significant effect in bilateral frontal regions, F(1,27) = 1.26, ns.

Figure 4. Topographical plots for the 10 electrode regions for the gesture condition. The significant N400 effects are boxed in bilateral central and parietal electrode regions. Microvolts are plotted negative up.
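The re-baselining described above simply expresses the N400 measure relative to the mean amplitude of the immediately preceding P2 window (150–250 msec) rather than the prestimulus period. A minimal sketch, reusing the assumed epoch layout from the earlier snippet (channels × samples, time locked to speech onset):

    import numpy as np

    FS = 250            # sampling rate (Hz)
    EPOCH_START = -0.1  # epoch begins 100 msec before speech onset

    def mean_window(erp, start, stop):
        """Mean amplitude of a per-channel ERP (n_channels, n_samples) between two latencies (sec)."""
        a = int(round((start - EPOCH_START) * FS))
        b = int(round((stop - EPOCH_START) * FS))
        return erp[:, a:b].mean(axis=1)

    def n400_rebaselined(erp):
        """N400 mean amplitude (250-550 msec) with the P2 window (150-250 msec) subtracted."""
        return mean_window(erp, 0.250, 0.550) - mean_window(erp, 0.150, 0.250)

    # Hypothetical usage on condition-average ERPs:
    # diff = n400_rebaselined(erp_incongruent) - n400_rebaselined(erp_congruent)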

Figure 5. Topographical maps for the N400 peaks (408 msec) for the two gesture conditions. The top of each map is toward the front of the head. Hot colors are positive, and cool colors are negative. The figure can be viewed in color online.

These results from the electrophysiological measure confirm our second prediction: Although the relationship between gesture and speech was irrelevant to the participants' task, gestures that were semantically incongruent to the content of the speaker's words produced a larger N400 component than gestures that were semantically congruent.

DISCUSSION

The results of the experiment support our two predictions: Participants were slower to process the gender of the speaker's voice when the content of gesture and speech—information not relevant to the task—was incongruent versus congruent (Prediction 1), and participants produced a larger N400 component when gesture and speech were incongruent versus congruent (Prediction 2). These findings are consistent with the claim that gesture and speech are automatically integrated during language comprehension.

Gesture–Speech Integration as an Automatic Process

The traditional hallmarks of automatic processing are that it is low level, fast, and obligatory (Logan, 1978; Schneider & Shiffrin, 1977; Posner & Snyder, 1975). Using a modified Stroop paradigm, we provided definitive evidence for the third aspect of automaticity—the obligatory nature of gesture–speech processing. Our procedure did not require any overt attention to the semantic content of speech and gesture, yet the content of gesture influenced the superficial acoustic analysis of speech in both the behavioral and the electrophysiological measures. That is, when participants simply had to process the auditory differences between a male and a female voice—differences that were extremely salient—they could not do so without also processing irrelevant information conveyed through gesture and speech. This interference resulted in slower RTs and a larger N400 component to incongruent versus congruent gesture–speech stimuli, even when the semantic relationship of gesture and speech had nothing to do with the task at hand.

Regarding the other two hallmarks—low level and fast—our results are less conclusive, but they are at least suggestive of automatic processing. In some recent neuroimaging research on the neural mechanisms of gesture–speech integration, Hubbard et al. (2009) argued that the planum temporale integrates the prosodic elements of speech and gesture (specifically beat gestures that convey no iconic content), suggesting that these types of gesture influence low-level acoustic analysis of spoken input. More generally, Iacoboni et al. (2005) have argued that low-level and phylogenetically old neural mechanisms (e.g., the mirror neuron system, which overlaps with the planum temporale) are designed to automatically process manual hand actions. Considered in this context, one can speculate that iconic hand gestures may interact with low-level, perceptual aspects of speech processing in a similarly automatic fashion.

When it comes to the speed of gesture–speech integration, the present results are consistent with other ERP studies on gesture–speech processing that have identified early semantic components (Bernardis et al., 2008; Holle & Gunter, 2007; Özyürek et al., 2007; Wu & Coulson, 2007a, 2007b). Although we did not report ERP effects earlier than the N400 effect, previous research has uncovered evidence for even earlier integration of verbal and nonverbal information (Willems, Özyürek, & Hagoort, 2008; Kelly et al., 2004). In the Kelly study, the presence of hand gesture influenced the P2 component and even earlier sensory stages of speech processing (within the first 150 msec). Although the present article did not explore such early effects, the two studies together suggest that gesture and speech may be automatically integrated very quickly in the brain.

In general, this claim about the automaticity of gesture processing is consistent with theories that the human brain has evolved specialized neural mechanisms to automatically process nonverbal behaviors, such as bodily gestures, facial expressions, and manual actions (de Gelder,
2006; Iacoboni et al., 2005; Blakemore & Decety, 2001). Building on this, if hand gestures served as a foundation for the evolution of spoken language, as some have argued (Corballis, 2003; Bates & Dick, 2002; Kelly et al., 2002; Rizzolatti & Arbib, 1998), it makes sense that these hand movements would be automatically linked to speech processes. Thus, in contrast to views that gesture is a mere afterthought in language processing, gesture appears to be a real contender in communication, one that has an inextricable link with the speech that it naturally and ubiquitously accompanies.

Importantly, the present study extends previous research on the automatic integration of gesture and speech (Kelly et al., 2007). In this previous work, participants were explicitly instructed to attend to the meaning of speech. In contrast, by using a Stroop-like paradigm in the present study, participants were not required to attend to the meaning of speech—or gesture—but did so anyway. More importantly, the present results shed light on the role of iconicity in the integration of gesture and speech. Unlike previous research that conflated indexicality and iconicity, the present results demonstrated that the representational content (i.e., the iconic meaning) of gesture is automatically connected to the representational content of speech. This finding strongly supports theories that speech and iconic gesture—at least iconic action gestures—are integrated on a deep conceptual level (McNeill, 1992, 2005; Kita & Özyürek, 2003). To further explore this relationship, it would be interesting for future research to investigate the automatic integration of speech and iconic nonaction gestures (e.g., gestures about object attributes or spatial relationships) that have an even more ambiguous meaning in the absence of accompanying speech.

Gesture–Speech Integration Is Not an Exclusively Automatic Process

Although we found strong support for the obligatory nature of gesture–speech integration, the RT data suggest that it may not be an exclusively automatic process. Recall that there were larger RT differences between the two gesture–speech pairs in the gender-same versus gender-different condition. This suggests that participants were sensitive to the context within which they processed gesture and speech. That is, when the context encouraged integration of gesture and speech (i.e., when the same person produced the gesture and word), participants were more likely to combine the two modalities than when the context did not encourage integration (i.e., when different people produced the gesture and word). This finding fits with claims that the Stroop effect can be modulated by higher level contextual variables (Bessner & Stolz, 1999).

Moreover, this modulation is consistent with previous research on the controlled nature of gesture–speech integration (Holle & Gunter, 2007; Kelly et al., 2007). For example, Holle and Gunter (2007) showed that the presence of nonmeaningful gestures influences the extent to which meaningful gestures are integrated with speech during sentence processing. Moreover, Kelly et al. (2007) argued that explicit knowledge about a communicator's intent modulates the neural integration of gesture and speech during word comprehension. Similarly, it is possible that on some level, participants in the present study were aware that gesture and speech produced by different people did not "belong" together, and consequently, they integrated the two modalities differently than when gesture and speech did belong together.

This connection to previous research bears further discussion. In the study by Kelly et al. (2007), there were very similar patterns of behavioral data—in both studies, participants integrated gesture and speech to a greater extent when the two modalities were meant to go together. However, unlike that previous research, the present study did not find a comparable pattern with the electrophysiological data. Specifically, in the previous work, the N400 effect for congruent and incongruent gesture–speech pairs was eliminated in left hemisphere regions when participants believed that the gesture and speech were not intentionally coupled (the no-intent condition). However, in the present study, there was a strong bilateral N400 effect for congruent and incongruent gesture–speech pairs even when gesture and speech did not belong together (the gender-different condition). One possible explanation for this difference is that in previous research, participants were explicitly told that gesture and speech did not belong together, whereas in the present study, they had to determine this for themselves. In fact, participants may have assumed that gesture and speech were meant to go together in the present study. After all, people have experience with video media that artificially combine language and action (e.g., as in watching a dubbed movie or documentary footage with a narrator). Perhaps participants applied this sort of previous experience to the stimuli used in the present study, and this is why the ERP data did not correspond to the previous research. In any event, it will be important for future research to determine the conditions under which people naturally integrate gesture and speech and when this integration is modulated by higher level contextual information.

Focusing just on the present study, it is important to address another inconsistency—why did the gender context modulate the behavioral but not the electrophysiological integration of gesture and speech? This sort of inconsistency between the behavioral and the electrophysiological findings is actually consistent with previous ERP research on the N400 effect (Ruz, Madrid, Lupiánez, & Tudela, 2003; Brown & Hagoort, 1993; Holcomb, 1993). For example, Brown and Hagoort (1993) found that incongruent masked primes produced slower RTs to targets than congruent masked primes, but this effect was not carried over in the electrophysiological data—there was no corresponding N400 effect for masked incongruent items. This suggests that behavioral measures can occasionally reveal differences that electrophysiological measures cannot.

Although this inconsistency tempers strong claims that context modulates the integration of gesture and speech, it encourages future work to explore the circumstances under which controlled mechanisms play a definitive role in the processing of gesture and speech during language comprehension.

Local versus Global Integration

Previous research has made a distinction between "local" and "global" integration of gesture and speech (Özyürek et al., 2007). Local integration concerns how a gesture is integrated with a temporally overlapping word (i.e., when a gesture and a word are presented at roughly the same time). In contrast, global integration concerns how a gesture is integrated with speech over larger spans of discourse (e.g., when a gesture precedes a word in a sentence or vice versa). The majority of ERP research on gesture–speech integration has found evidence for global integration (Bernardis et al., 2008; Holle & Gunter, 2007; Özyürek et al., 2007; Wu & Coulson, 2007a, 2007b), but the present study has provided evidence for local integration.

On the surface, this finding is inconsistent with the work by Özyürek et al. (2007) who, using gestures in larger sentence contexts, found evidence for global—but not local—integration. One explanation for this inconsistency is that there were different SOAs (the timing difference between the onset of speech and gesture) when measuring the local integration of gesture and speech in the two studies. Whereas the present study found an N400 effect for incongruent gesture–speech pairs using SOAs of 200 msec, the work by Özyürek et al. had a simultaneous onset of gesture and speech and they found no N400 effect for local gesture–speech incongruencies. One possible explanation for this inconsistency is that local integration occurs only for gesture–speech pairs that are removed from natural discourse (as in the present study). An equally plausible possibility is that gesture and speech are indeed locally integrated in natural discourse, but not when they are simultaneously presented (as in the study by Özyürek et al., 2007). Indeed, gesture slightly precedes speech onset in natural discourse (McNeill, 1992), and presenting them simultaneously may disrupt their natural integration. It will be important for future ERP research to directly compare different gesture–speech SOAs to more thoroughly understand how the timing of gesture and speech affects the neural integration of the two modalities.

Conclusion

By using a variation on the Stroop procedure, we demonstrated that people obligatorily integrate speech and iconic gestures even when that integration is not an explicit requirement. This suggests that the neural processing of speech and iconic gesture is to some extent automatic. However, there is also evidence (at least with the behavioral measure) that people are sensitive to whether speech and gesture "belong" together, suggesting a potential controlled mechanism as well. Although it will be important to build on these findings using more natural stimuli in larger discourse contexts, they serve as a preliminary foundation for the claim that gesture and speech are inextricably linked during language comprehension. And more generally, by viewing gesture as a unique interface between actions and words, this work will hopefully lend a hand to the growing research on the relationship between body and language in the human brain.

APPENDIX A

List of the 23 Congruent and Incongruent Gesture–Speech Pairs

             Congruent                 Incongruent
Speech       Gesture         Speech       Gesture
Beat         Beat            Beat         Wipe
Brush        Brush           Brush        Shake
Chop         Chop            Chop         Mow
Close        Close           Close        Screw
Cut          Cut             Cut          Stir
Hammer       Hammer          Hammer       Twist
Knock        Knock           Knock        Write
Lift         Lift            Lift         Stab
Mow          Mow             Mow          Turn
Roll         Roll            Roll         Hammer
Saw          Saw             Saw          Type
Screw        Screw           Screw        Cut
Shake        Shake           Shake        Close
Squeeze      Squeeze         Squeeze      Lift
Stab         Stab            Stab         Sweep
Stir         Stir            Stir         Break
Sweep        Sweep           Sweep        Tear
Tear         Tear            Tear         Roll
Turn         Turn            Turn         Chop
Twist        Twist           Twist        Knock
Type         Type            Type         Brush
Wipe         Wipe            Wipe         Squeeze
Write        Write           Write        Saw

Practice
Wring        Wring           Wring        Scrub

Acknowledgments                                                         Bessner, D., & Stolz, J. A. (1999). Unconsciously controlled
                                                                           processing: The Stroop effect reconsidered. Psychonomic
We would like to thank the anonymous reviewers for their help-             Bulletin & Review, 6, 99–104.
ful suggestions on previous versions of this manuscript. In ad-         Blakemore, S. J., & Decety, J. (2001). From the perception of
dition, we are grateful to Ron Crans for technical assistance in
                                                                           action to the understanding of intention. Nature Reviews
software creation. Finally, we thank the Colgate University for            Neuroscience, 2, 561–567.
providing generous support for undergraduate research.
                                                                        Brown, C., & Hagoort, P. (1993). The processing nature of
Reprint requests should be sent to Spencer Kelly, Department               the N400: Evidence from masked priming. Journal of
of Psychology, Neuroscience Program, Colgate University, 13                Cognitive Neuroscience, 5, 34–44.
Oak Dr., Hamilton, NY 13346, or via e-mail: skelly@colgate.edu.         Cassell, J., McNeill, D., & McCullough, K. E. (1999).
                                                                           Speech–gesture mismatches: Evidence for one underlying
                                                                           representation of linguistic and nonlinguistic information.
                                                                           Pragmatics & Cognition, 7, 1–34.
Notes

1. This, of course, is not to say that these are the only two ERP studies to find an N400 effect using speech and gesture stimuli (see Bernardis et al., 2008; Holle & Gunter, 2007; Özyürek et al., 2007; Wu & Coulson, 2007a, 2007b). However, they are the only two studies to find an N400 effect to speech stimuli that were simultaneously coupled with incongruent versus congruent gestures.
2. Because the gestures preceded speech by such a short interval (200 msec) and because the ERPs were time-locked to speech that co-occurred with gesture, the brainwaves had a positive skew, especially in frontal regions. This skew is common in priming studies that employ very short SOAs (e.g., Kiefer & Brendel, 2006), and the positive slow wave most likely reflects the electrophysiological response to speech riding on top of an electrophysiological response to the gesture. Importantly, this positive skew is uniform across all stimuli, so it should not confound any of our treatment conditions (see the illustrative sketch following these notes).
3. Note that 250 msec is relatively early for the start of the N400 component, but there is precedent in the literature for this early onset (e.g., Van Berkum, van den Brink, Tesink, Kos, & Hagoort, 2008). A post hoc analysis showed that the early negativity was not a separate N300 component (McPherson & Holcomb, 1999).
4. Although not central to the present study, this gender effect is noteworthy because it is related to a recent ERP study demonstrating a large N400 component when people hear utterances that are unusual for a speaker (e.g., a man saying that he wished he looked like Britney Spears; Van Berkum et al., 2008). The present findings suggest that simply seeing the gender of a speaker sets up a "semantic" expectation about the gender of the accompanying voice.
5. We do not explore the P2 effect in the present manuscript. For coverage of the early integration of gesture and speech, see Kelly et al. (2004).
                                                                           3, 529–535.
REFERENCES

Armstrong, D. F., & Wilcox, S. E. (2007). The gestural origin of language. New York: Oxford University Press.
Bargh, J. A., & Morsella, E. (2008). The unconscious mind. Perspectives on Psychological Science, 3, 73–79.
Bates, E., Benigni, L., Bretherton, I., Camaioni, L., & Volterra, V. (1979). The emergence of symbols: Cognition and communication in infancy. New York: Academic Press.
Bates, E., & Dick, F. (2002). Language, gesture, and the developing brain. Developmental Psychobiology, 40, 293–310.
Beattie, G., & Shovelton, H. (1999). Do iconic hand gestures really contribute anything to the semantic information conveyed by speech? An experimental investigation. Semiotica, 123, 1–30.
Bernardis, P., Salillas, E., & Caramelli, N. (2008). Behavioural and neurophysiological evidence of semantic interaction between iconic gestures and words. Cognitive Neuropsychology, 25, 1114–1128.
Besner, D., & Stolz, J. A. (1999). Unconsciously controlled processing: The Stroop effect reconsidered. Psychonomic Bulletin & Review, 6, 99–104.
Blakemore, S. J., & Decety, J. (2001). From the perception of action to the understanding of intention. Nature Reviews Neuroscience, 2, 561–567.
Brown, C., & Hagoort, P. (1993). The processing nature of the N400: Evidence from masked priming. Journal of Cognitive Neuroscience, 5, 34–44.
Cassell, J., McNeill, D., & McCullough, K. E. (1999). Speech–gesture mismatches: Evidence for one underlying representation of linguistic and nonlinguistic information. Pragmatics & Cognition, 7, 1–34.
Clark, H. H. (1996). Using language. Cambridge, England: Cambridge University Press.
Corballis, M. C. (2003). From hand to mouth: The origins of language. Princeton, NJ: Princeton University Press.
de Gelder, B. (2006). Towards the neurobiology of emotional body language. Nature Reviews Neuroscience, 7, 242–249.
Goldin-Meadow, S. (2003). Hearing gesture: How our hands help us think. Cambridge, MA: Belknap Press.
Goldin-Meadow, S., Wein, D., & Chang, C. (1992). Assessing knowledge through gesture: Using children's hands to read their minds. Cognition and Instruction, 9, 201–219.
Gratton, G., Coles, M. G. H., & Donchin, E. (1983). A new method for off-line removal of ocular artifacts. Electroencephalography and Clinical Neurophysiology, 55, 468–484.
Holcomb, P. (1993). Semantic priming and stimulus degradation: Implications for the role of the N400 in language processing. Psychophysiology, 30, 47–61.
Holle, H., & Gunter, T. C. (2007). The role of iconic gestures in speech disambiguation: ERP evidence. Journal of Cognitive Neuroscience, 19, 1175–1192.
Holle, H., Gunter, T., Rüschemeyer, S. A., Hennenlotter, A., & Iacoboni, M. (2008). Neural correlates of the processing of co-speech gestures. Neuroimage, 39, 2010–2024.
Howell, D. C. (2002). Statistical methods for psychology (5th ed.). Belmont, CA: Duxbury Press.
Hubbard, A. L., Wilson, S. M., Callan, D. E., & Dapretto, M. (2009). Giving speech a hand: Gesture modulates activity in auditory cortex during speech perception. Human Brain Mapping, 30, 1028–1037.
Iacoboni, M., Molnar-Szakacs, I., Gallese, V., Buccino, G., Mazziotta, J. C., & Rizzolatti, G. (2005). Grasping intentions of others with one's own mirror neuron system. PLoS Biology, 3, 529–535.
Kelly, S. D., Barr, D., Church, R. B., & Lynch, K. (1999). Offering a hand to pragmatic understanding: The role of speech and gesture in comprehension and memory. Journal of Memory and Language, 40, 577–592.
Kelly, S. D., & Church, R. B. (1998). A comparison between children's and adults' ability to detect children's representational gestures. Child Development, 69, 85–93.
Kelly, S. D., Iverson, J., Terranova, J., Niego, J., Hopkins, M., & Goldsmith, L. (2002). Putting language back in the body: Speech and gesture on three timeframes. Developmental Neuropsychology, 22, 323–349.
Kelly, S. D., Kravitz, C., & Hopkins, M. (2004). Neural correlates of bimodal speech and gesture comprehension. Brain and Language, 89, 253–260.
Kelly, S. D., Ward, S., Creigh, P., & Bartolotti, J. (2007). An intentional stance modulates the integration of gesture and speech during comprehension. Brain and Language, 101, 222–233.

Kendon, A. (2004). Gesture: Visible action as utterance. Cambridge: Cambridge University Press.
Kiefer, M., & Brendel, D. (2006). Attentional modulation of unconscious "automatic" processes: Evidence from event-related potentials in a masked priming paradigm. Journal of Cognitive Neuroscience, 18, 184–198.
Kita, S., & Özyürek, A. (2003). What does cross-linguistic variation in semantic coordination of speech and gesture reveal?: Evidence for an interface representation of spatial thinking and speaking. Journal of Memory and Language, 48, 16–32.
Krauss, R. M. (1998). Why do we gesture when we speak? Current Directions in Psychological Science, 7, 54–59.
Krauss, R. M., Morrel-Samuels, P., & Colasante, C. (1991). Do conversational hand gestures communicate? Journal of Personality and Social Psychology, 61, 743–754.
Kutas, M., & Hillyard, S. A. (1980). Reading senseless sentences: Brain potentials reflect semantic incongruity. Science, 207, 203–204.
Langton, S. R. H., & Bruce, V. (2000). You must see the point: Automatic processing of cues to the direction of social attention. Journal of Experimental Psychology: Human Perception and Performance, 26, 747–757.
Logan, G. D. (1978). Attention in character classification: Evidence for the automaticity of component stages. Journal of Experimental Psychology: General, 107, 32–63.
MacLeod, C. M. (1991). Half a century of research on the Stroop effect: An integrative review. Psychological Bulletin, 109, 163–203.
MacWhinney, B., & Bates, E. (1989). The crosslinguistic study of sentence processing. New York: Cambridge University Press.
McNeill, D. (1992). Hand and mind: What gestures reveal about thought. Chicago: University of Chicago Press.
McNeill, D. (2005). Gesture and thought. Chicago: University of Chicago Press.
McPherson, W. B., & Holcomb, P. J. (1999). An electrophysiological investigation of semantic priming with pictures of real objects. Psychophysiology, 36, 53–65.
Miller, G. A., Gratton, G., & Yee, C. M. (1988). Generalized implementation of an eye movement correction procedure. Psychophysiology, 25, 241–243.
Montgomery, K. J., Isenberg, N., & Haxby, J. V. (2007). Communicative hand gestures and object-directed hand movements activated the mirror neuron system. Social Cognitive & Affective Neuroscience, 2, 114–122.
Nishitani, N., Schurmann, M., Amunts, K., & Hari, R. (2005). Broca's region: From action to language. Physiology, 20, 60–69.
Özyürek, A., & Kelly, S. D. (2007). Gesture, brain, and language. Brain and Language, 101, 181–184.
Özyürek, A., Willems, R. M., Kita, S., & Hagoort, P. (2007). On-line integration of semantic information from speech and gesture: Insights from event-related brain potentials. Journal of Cognitive Neuroscience, 19, 605–616.
Posner, M. I., & Snyder, C. R. R. (1975). Attention and cognitive control. In R. L. Solso (Ed.), Information processing and cognition: The Loyola symposium (pp. 55–85). Hillsdale, NJ: Lawrence Erlbaum.
Pourtois, G., de Gelder, B., Vroomen, J., Rossion, B., & Crommelinck, M. (2000). The time-course of intermodal binding between hearing and seeing affective information. NeuroReport, 11, 1329–1333.
Rizzolatti, G., & Arbib, M. A. (1998). Language within our grasp. Trends in Neurosciences, 21, 188–194.
Rizzolatti, G., & Craighero, L. (2004). The mirror-neuron system. Annual Review of Neuroscience, 27, 169–192.
Ruz, M., Madrid, E., Lupiáñez, J., & Tudela, P. (2003). High density ERP indices of conscious and unconscious semantic priming. Cognitive Brain Research, 17, 719–731.
Schneider, W., & Shiffrin, R. M. (1977). Controlled and automatic human information processing: Detection, search, and attention. Psychological Review, 84, 1–66.
Skipper, J. I., Goldin-Meadow, S., Nusbaum, H. C., & Small, S. L. (2007). Speech-associated gestures, Broca's area, and the human mirror system. Brain and Language, 101, 260–277.
Stroop, J. R. (1935). Studies of interference in serial verbal reactions. Journal of Experimental Psychology, 18, 643–662.
Van Berkum, J. J. J., van den Brink, D., Tesink, C., Kos, M., & Hagoort, P. (2008). The neural integration of speaker and message. Journal of Cognitive Neuroscience, 20, 580–591.
Willems, R. M., & Hagoort, P. (2007). Neural evidence for the interplay between language, gesture, and action: A review. Brain and Language, 101, 278–289.
Willems, R. M., Özyürek, A., & Hagoort, P. (2007). When language meets action: The neural integration of gesture and speech. Cerebral Cortex, 17, 2322–2333.
Willems, R. M., Özyürek, A., & Hagoort, P. (2008). Seeing and hearing meaning: ERP and fMRI evidence of word versus picture integration into a sentence context. Journal of Cognitive Neuroscience, 20, 1235–1249.
Wilson, S. M., Molnar-Szakacs, I., & Iacoboni, M. (2008). Beyond superior temporal cortex: Intersubject correlations in narrative speech comprehension. Cerebral Cortex, 18, 230–242.
Winston, J. S., Strange, B. A., O'Doherty, J., & Dolan, R. J. (2002). Automatic and intentional brain responses during evaluation of trustworthiness of faces. Nature Neuroscience, 5, 277–283.
Wu, Y. C., & Coulson, S. (2007a). Iconic gestures prime related concepts: An ERP study. Psychonomic Bulletin & Review, 14, 57–63.
Wu, Y. C., & Coulson, S. (2007b). How iconic gestures enhance communication: An ERP study. Brain and Language, 101, 234–245.
