
M. K. Tanenhaus

the listener’s attention would be drawn to the larger of the two apples, because a scalar adjective signals a contrast among two or more entities of the same semantic type (Sedivy et al., 1999). Thus apple will be immediately interpreted as the misshapen green apple, even though a more prototypical red apple is present in the display. And, when the apple is encountered in the second clause, the red apple would be ignored in favor of the large green apple, because that apple has been introduced into the discourse, but not as the most salient entity (Dahan, Tanenhaus, & Chambers, 2002).

The standard account does not make sense when we try to generalize it to processing in the context of concrete referents. In contrast, the account that emerges from visual world research does generalize to processing in the absence of a more specific context. In particular, the scalar big would still be interpreted as the member of a contrast set picked out by size; it is just that the contents of the contrast set are not instantiated until the noun, apple, has been heard. And any increase in processing difficulty when the apple is processed would not reflect an inference to establish that the referent is the big apple, but rather the shift in discourse focus from the previously focused entity (the pencil) to a previously mentioned entity (the apple). In this example, then, the display clearly changes processing, but in ways that clarify (but do not distort) the underlying processes.

Now consider the discourse: The man returned home and greeted his pet dog. It/The beast/A beast then began to lick/attack him. Well-understood principles of reference assignment mandate that it should refer to the dog, and a beast to an animal other than the dog. Now imagine the same discourse in the context of a display containing a man standing in front of an open door to a hut in a jungle village, a dog with a collar, a tiger, and a rabbit. Compared to appropriate control conditions, we would expect a pattern of looks indicating that it was interpreted as the dog, and a beast as the tiger, regardless of whether the verb was lick or attack. If, however, it were interpreted as the tiger when the verb was attack, or if a beast were interpreted as the dog for lick, then we would have a clear case of the display distorting the comprehension process. This conclusion would be merited because these interpretations would violate well-understood principles of reference resolution. Now consider the definite noun phrase, the beast. This referential expression could either refer to the mentioned entity, the pet dog, or it could introduce another entity. In the discourse-alone condition, a listener or reader would most likely infer that the beast refers to the dog, because no other entity has been mentioned. However, in the discourse-with-display condition, a listener might be more likely to infer that the beast refers to the tiger. Here the display changes the interpretation, but it does not change the underlying process; the display simply makes accessible a potential unmentioned referent, which is consistent with the felicity conditions for the type of definite reference used in the discourse. Indeed, we would expect the same pattern of interpretation if the tiger had been mentioned in the discourse.1

1 This observation is due to Gerry Altmann. The example presented here is adapted from one presented by Simon Garrod at the 2003 Meeting of the CUNY Sentence Processing Conference.

Ch. 20: Eye Movements and Spoken Language Processing


Thus far investigations of the effects of using a display and using a task have not uncovered any evidence that the display or the task is distorting the underlying processes. To the contrary, the results have been encouraging for the logic of the visual world approach. However, it will be crucial in further work to explore the nature of the interactions between the display, the task, and linguistic processing in much greater detail. Moreover, the ability to control and understand the context in which the language is being produced and understood, which is one of the most powerful aspects of the visual world paradigm, depends in large part on developing a better understanding of these interactions.

5. Conclusion

This chapter has provided a general introduction to how psycholinguists are beginning to use eye movements to study spoken language processing. We have reviewed some of the foundational studies, discussed issues of data analysis and interpretation, and considered issues that arise in comparing eye movement reading studies to visual world studies. We have also discussed some of the issues that arise because the visual world introduces a context for the utterance.

The following chapters each contribute to issues that we have discussed and illustrate the wide range of questions to which the visual world paradigm is now being applied. Dahan, Tanenhaus, and Salverda examine how preview affects the likelihood that a cohort competitor will be looked at as a target word unfolds, contributing to our understanding of the effects of the display on the inferences we can draw about spoken word recognition in visual world studies. Bailey and Ferreira (2005) show how the visual world paradigm can be used to investigate expectations introduced by disfluency, extending investigations of spoken language processing to the kinds of utterances one frequently encounters in real life, but infrequently encounters in psycholinguistic experiments. Wheeldon, Meyer, and van der Meulen (this book) extend work on eye movements in production by asking whether fixations can shed light on the locus of speech errors in naming. The surprising results have important theoretical and methodological implications for the link between fixations and utterance planning. Finally, Knoeferle explores how thematic role assignment is affected by when information in a scene becomes relevant, contributing to our understanding of the interplay between the scene and the unfolding language. She also provides a general framework for understanding these interactions.

Acknowledgments

This work was supported by NIH grants DC005071 and HD 27206. Thanks to Delphine Dahan, Anne Pier Salverda, and Roger Van Gompel for helpful comments.


References

Allopenna, P. D., Magnuson, J. S., & Tanenhaus, M. K. (1998). Tracking the time course of spoken word recognition: Evidence for continuous mapping models. Journal of Memory and Language, 38, 419–439.

Altmann, G. T. M., & Kamide, Y. (1999). Incremental interpretation at verbs: Restricting the domain of subsequent reference. Cognition, 73, 247–264.

Altmann, G. T. M., & Kamide, Y. (2004). Now you see it, now you don’t: Mediating the mapping between language and the visual world. In J. M. Henderson, & Ferreira, F. (Eds.), The interface of language, vision, and action: Eye movements and the visual world. New York: Psychology Press.

Altmann, G. T. M., & Steedman, M. J. (1988). Interaction with context during human sentence processing. Cognition, 30, 191–238.

Arnold, J. E., Eisenband, J., Brown-Schmidt, S., & Trueswell, J. C. (2000). The rapid use of gender information: Evidence of the time course of pronoun resolution from eyetracking. Cognition, 76, B13–B26.

Arnold, J. E., Tanenhaus, M. K., Altmann, R. J., & Fagnano, M. (2004). The old and thee, uh, new: Disfluency and reference resolution. Psychological Science, 15, 578–582.

Bailey, K. G. D., & Ferreira, F. (2003). Disfluencies affect the parsing of garden-path sentences. Journal of Memory and Language, 49, 183–200.

Bailey, K. G. D., & Ferreira, F. (2005). The disfluent hairy dog: Can syntactic parsing be affected by non-word disfluencies? In J. Trueswell, & M. K. Tanenhaus (Eds.), Approaches to studying world-situated language use: Bridging the language-as-product and language-as-action traditions. Cambridge, MA: MIT Press.

Bock, J. K., Irwin, D. E., & Davidson D. J. (2004). Putting first things first. In J. M. Henderson, & F. Ferreira (Eds.), The interface of language, vision, and action: Eye movements and the visual world. New York: Psychology Press.

Bock, J. K., Irwin, D. E., Davidson, D. J., & Levelt, W. J. M. (2003). Minding the clock. Journal of Memory and Language, 48, 653–685.

Boland, J. E. (2005). Visual arguments. Cognition, 95, 237–274.

Brown-Schmidt, S., Campana, E., & Tanenhaus, M. K. (2005). Real-time reference resolution in a referential communication task. In J. C. Trueswell, & M. K. Tanenhaus (Eds.), Processing world-situated language: Bridging the language-as-action and language-as-product traditions. Cambridge, MA: MIT Press.

Brown-Schmidt, S., & Tanenhaus, M. K. (2006). Watching the eyes when talking about size: An investigation of message formulation and utterance planning. Journal of Memory and Language, 54, 592–609.

Chambers, C. G., Tanenhaus, M. K., Eberhard, K. M., Filip, H., & Carlson, G. N. (2002). Circumscribing referential domains during real-time language comprehension. Journal of Memory and Language, 47, 30–49.

Chambers, C. G., Tanenhaus, M. K., & Magnuson, J. S. (2004). Action-based affordances and syntactic ambiguity resolution. Journal of Experimental Psychology: Learning, Memory, and Cognition, 30, 687–696.

Clark, H. H. (1992). Arenas of language use. Chicago: University of Chicago Press.

Clifton, C., Traxler, M., Mohamed, M. T., Williams, R., Morris, R., & Rayner, K. (2003). The use of thematic role information in parsing: Syntactic processing autonomy revisited. Journal of Memory and Language, 49, 317–334.

Cooper, R. M. (1974). The control of eye fixation by the meaning of spoken language: A new methodology for the real-time investigation of speech perception, memory, and language processing. Cognitive Psychology, 6, 84–107.

Dahan, D., Magnuson, J. S., & Tanenhaus, M. K. (2001). Time course of frequency effects in spoken-word recognition: Evidence from eye movements. Cognitive Psychology, 42, 317–367.

Dahan, D., Magnuson, J. S., Tanenhaus, M. K., & Hogan, E. M. (2001). Subcategorical mismatches and the time course of lexical access: Evidence for lexical competition. Language and Cognitive Processes, 16, 507–534.

Dahan, D., & Tanenhaus, M. K. (2004). Continuous mapping from sound to meaning in spoken language comprehension: Evidence from eye movements. Journal of Experimental Psychology: Learning, Memory and Cognition, 30, 498–513.

Dahan, D., & Tanenhaus, M. K. (2005). Looking at the rope when looking for the snake: Conceptually mediated eye movements during spoken-word recognition. Psychonomic Bulletin & Review, 12, 455–459.


Dahan, D., Tanenhaus, M. K., & Chambers, C. G. (2002). Accent and reference resolution in spoken language comprehension. Journal of Memory and Language, 47, 292–314.

Eberhard, K. M., Spivey-Knowlton, M. J., Sedivy, J. C., & Tanenhaus, M. K. (1995). Eye-movements as a window into spoken language comprehension in natural contexts. Journal of Psycholinguistic Research, 24, 409–436.

Ferreira, F., & Bailey, K. G. D. (this volume).

Ferreira, F., & Henderson, J. M. (2004). The interface of vision, language, and action. In J. M. Henderson & F. Ferreira (Eds.), The interface of language, vision, and action: Eye movements and the visual world. New York: Psychology Press.

Frazier, L., & Rayner, K. (1982). Making and correcting errors during sentence comprehension: Eye movements in the analysis of structurally ambiguous sentences. Cognitive Psychology, 14, 178–210.

Griffin, Z. M. (2004a). Why look? In J. M. Henderson, & F. Ferreira (Eds.), The interface of language, vision, and action: Eye movements and the visual world. New York: Psychology Press.

Griffin, Z. M. (2004b). The eyes are right when the mouth is wrong. Psychological Science, 15, 814–821.

Griffin, Z. M., & Bock, J. K. (2000). What the eyes say about speaking. Psychological Science, 11, 274–279.

Hanna, J. E., Tanenhaus, M. K., & Trueswell, J. C. (2003). The effects of common ground and perspective on domains of referential interpretation. Journal of Memory and Language, 49, 43–61.

Hayhoe, M., & Ballard, D. (2005). Eye movements in natural behavior. Trends in Cognitive Sciences, 9, 188–194.

Henderson, J. M., & Ferreira, F. (Eds.). (2004). The interface of language, vision, and action: Eye movements and the visual world. New York: Psychology Press.

Huettig, F., & Altmann, G. T. M. (2004). The online processing of ambiguous and unambiguous words in context: Evidence from head-mounted eye tracking. In M. Carreiras, & C. Clifton (Eds.), The on-line study of sentence comprehension: Eyetracking, ERPs and beyond. New York: Psychology Press.

Ju, M., & Luce, P. A. (2004). Falling on sensitive ears: Constraints on bilingual lexical activation. Psychological Science, 15, 314–318.

Kaiser, E., & Trueswell, J. C. (2004). The role of discourse context in the processing of a flexible word-order language. Cognition, 94, 113–147.

Kamide, Y., Altmann, G. T. M., & Haywood, S. L. (2003). Prediction and thematic information in incremental sentence processing: Evidence from anticipatory eye movements. Journal of Memory and Language, 49, 133–156.

Keysar, B., Barr, D. J., Balin, J. A., & Brauner, J. S. (2000). Taking perspective in conversation: The role of mutual knowledge in comprehension. Psychological Science, 11, 32–38.

Knoeferle, P., Crocker, M. W., Scheepers, C., & Pickering, M. J. (2005). The influence of the immediate visual context on incremental thematic role-assignment: Evidence from eye-movements in depicted events. Cognition, 95, 95–127.

Lucas, M. (1999). Context effects in lexical access: A meta-analysis. Memory and Cognition, 27, 375–398.

Luce, R. D. (1959). Individual choice behavior. New York: Wiley.

Luce, P. A., & Pisoni, D. B. (1998). Recognizing spoken words: The neighborhood activation model. Ear and Hearing, 19, 1–36.

Magnuson, J. S. (2005). Moving hand reveals dynamics of thought. Proceedings of the National Academy of Sciences, 102, 9995–9996.

Magnuson, J. S., Dixon, J. F., Tanenhaus, M. K., & Aslin, R. N. (in press). Dynamic similarity in spoken word recognition. Cognitive Science.

Marslen-Wilson, W. (1987). Functional parallelism in spoken word recognition. Cognition, 25, 71–102.

Marslen-Wilson, W. (1990). Activation, competition, and frequency in lexical access. In G. T. M. Altmann (Ed.), Cognitive models of speech processing: Psycholinguistic and computational perspectives. Hove, UK: Erlbaum.

Marslen-Wilson, W. (1993). Issues of process and representation in lexical access. In G. T. M. Altmann, & R. Shillcock (Eds.), Cognitive models of speech processing: The second Sperlonga meeting. Hove, UK: Lawrence Erlbaum Associates.

Matin, E., Shao, K., & Boff, K. (1993). Saccadic overhead: information processing time with and without saccades. Perception & Psychophysics, 53, 372–380.


McClelland, J. L., & Elman, J. L. (1986). The TRACE model of speech perception. Cognitive Psychology, 18, 1–86.

McMurray, B., Aslin, R. N., Tanenhaus, M. K., Spivey, M. J., & Subik, D. (2005). Gradient sensitivity to sub-phonetic variation in voice-onset time in words and syllables. Manuscript submitted for publication.

McMurray, B., Tanenhaus, M. K., & Aslin, R. N. (2002). Gradient effects of within-category phonetic variation on lexical access. Cognition, 86, B33–B42.

Meyer, A. S., Sleiderink, A. M., & Levelt, W. J. M. (1998). Viewing and naming objects. Cognition, 66, B25–B33.

Novick, J. M., Trueswell, J. C. & Thompson-Schill, S. L. (2005). Cognitive control and parsing: Reexamining the role of Broca’s Area in sentence comprehension. Cognitive, Affective, & Behavioral Neuroscience, 5, 263–281.

Rayner, K. (1998). Eye movements in reading and information processing: Twenty years of research. Psychological Bulletin, 124, 372–422.

Rayner, K., & Duffy, S. A. (1986) Lexical complexity and fixation times in reading: Effects of word frequency, verb complexity, and lexical ambiguity. Memory & Cognition, 14, 191–201.

Rossion, B., & Pourtois, G. (2004). Revisiting Snodgrass and Vanderwart’s object pictorial set: The role of surface detail in basic-level object recognition. Perception, 33, 217–236.

Runner, J. T., Sussman, R. S., & Tanenhaus, M. K. (2003). Assignment of reference to reflexives and pronouns in picture noun phrases: Evidence from eye movements. Cognition, 89, B1–B13.

Runner, J. T., Sussman, R. S., & Tanenhaus, M. K. (2006). Assigning referents to reflexives and pronouns in picture noun phrases: Experimental tests of Binding Theory. Cognitive Science, 30, 1–49.

Salverda, A. P., & Altmann, G. (2005, September). Cross-talk between language and vision: Interference of visually-cued eye movements by spoken language. Poster presented at the AMLaP Conference, Ghent, Belgium.

Sedivy, J. C., Tanenhaus, M. K., Chambers, C. G., & Carlson, G. N. (1999). Achieving incremental interpretation through contextual representation: Evidence from the processing of adjectives. Cognition, 71, 109–147.

Sereno, S. C., & Rayner, K. (1992). Fast priming during eye fixations in reading. Journal of Experimental Psychology: Human Perception and Performance, 18, 173–184.

Simpson, G. B. (1984). Lexical ambiguity and its role in models of word recognition. Psychological Bulletin, 96, 316–340.

Snodgrass, J. G., & Yuditsky, T. (1996). Naming times for the Snodgrass and Vanderwart pictures. Behavioral Research Methods, Instruments & Computers, 28, 516–536.

Spivey-Knowlton, M. J. (1996). Integration of visual and linguistic information: Human data and model simulations. Ph.D. dissertation, University of Rochester, New York.

Spivey, M., & Geng, J. (2001). Oculomotor mechanisms activated by imagery and memory: Eye movements to absent objects. Psychological Research, 65, 235–241.

Spivey, M., Grosjean, M., & Knoblich, G. (2005). Continuous attraction toward phonological competitors. Proceedings of the National Academy of Sciences, 102, 10393–10398.

Spivey, M., & Marian, V. (1999). Cross talk between native and second languages: Partial activation of an irrelevant lexicon. Psychological Science, 10, 281–284.

Spivey, M. J., Richardson, D. C., & Fitneva, S. A. (2004). Thinking outside the brain: Spatial indices to visual and linguistic information. In J. M. Henderson, & Ferreira, F. (Eds.), The interface of language, vision, and action: Eye movements and the visual world. New York: Psychology Press.

Spivey, M. J., Tanenhaus, M. K., Eberhard, K. M., & Sedivy, J. C. (2002). Eye movements and spoken language comprehension: Effects of visual context on syntactic ambiguity resolution. Cognitive Psychology, 45, 447–481.

Swinney, D. (1979). Lexical access during sentence comprehension: (Re)consideration of context effects. Journal of Verbal Learning and Verbal Behavior, 18, 645–660.

Tanenhaus, M. K. (2004). On-line sentence processing: Past, present, and future. In M. Carreiras, & C. Clifton, Jr. (Eds.), The on-line study of sentence comprehension: Eyetracking, ERPs and beyond (pp. 371–392). New York: Psychology Press.


Tanenhaus, M. K., & Brown-Schmidt, S. (in press). Language processing in the natural world. To appear in Moore, B. C. M., Tyler, L. K., & Marslen-Wilson, W. D. (Eds.), The perception of speech: from sound to meaning. Theme issue of Philosophical Transactions of the Royal Society B: Biological Sciences.

Tanenhaus, M. K., Chambers, C. G., & Hanna, J. E. (2004). Referential domains in spoken language comprehension: Using eye movements to bridge the product and action traditions. In J. M. Henderson, & F. Ferreira (Eds.), The interface of language, vision, and action: Eye movements and the visual world. New York: Psychology Press.

Tanenhaus, M. K., Leiman, J. M., & Seidenberg, M. S. (1979). Evidence for multiple stages in the processing of ambiguous words in syntactic contexts. Journal of Verbal Learning and Verbal Behavior, 18, 427–441.

Tanenhaus, M. K., Spivey-Knowlton, M. J., Eberhard, K. M., & Sedivy, J. C. (1995). Integration of visual and linguistic information in spoken language comprehension. Science, 268, 1632–1634.

Tanenhaus, M. K., & Trueswell, J. C. (2005). Using eye movements to bridge the language as action and language as product traditions. In J. C. Trueswell, & M. K. Tanenhaus (Eds.), Processing world-situated language: Bridging the language-as-action and language-as-product traditions. Cambridge, MA: MIT Press.

Trueswell, J. C., Sekerina, I., Hill, N., & Logrip, M. (1999). The kindergarten-path effect: Studying on-line sentence processing in young children. Cognition, 73, 89–134.

Trueswell, J. C., & Tanenhaus, M. K. (Eds.), (2005). Processing world-situated language: Bridging the language-as-action and language-as-product traditions. Cambridge, MA: MIT Press.

Trueswell, J. C., Tanenhaus, M. K., & Garnsey, S. M. (1994). Semantic effects in parsing: Thematic role information in syntactic ambiguity resolution. Journal of Memory and Language, 33, 285–318.

Wheeldon, L. R., Meyer, A. S., & van der Meulen, F. F. (this volume).

Yee, E., Blumstein, S., & Sedivy, J. C. (2000). The time course of lexical activation in Broca’s aphasia: Evidence from eye movements. Poster presented at the 13th Annual CUNY Conference on Human Sentence Processing, La Jolla, California.


Chapter 21

THE INFLUENCE OF VISUAL PROCESSING ON PHONETICALLY DRIVEN SACCADES IN THE “VISUAL WORLD” PARADIGM

DELPHINE DAHAN

University of Pennsylvania, USA

MICHAEL K. TANENHAUS and ANNE PIER SALVERDA

University of Rochester, USA

Eye Movements: A Window on Mind and Brain

Edited by R. P. G. van Gompel, M. H. Fischer, W. S. Murray and R. L. Hill Copyright © 2007 by Elsevier Ltd. All rights reserved.


D. Dahan et al.

Abstract

We present analyses of a large set of eye-movement data that examine how factors associated with the processing of visual information affect eye movements to displayed pictures during the processing of the referent's name. We found that phonetically driven fixations are affected by display preview, by the ongoing uptake of visual information, and by the position of pictures in the visual display. Importantly, the lexical frequency associated with a picture's name affects the likelihood of refixating that picture and the timing of initiating a saccade away from it, thus supporting the use of eye movements as a measure of lexical processing.


Eye movements have increasingly become a measure of choice in the study of spoken-language comprehension, providing fine-grained information about how the acoustic signal is mapped onto linguistic representations as speech unfolds. Typically, participants see a small array of pictured objects displayed on a computer screen, hear the name of one of the pictures, usually embedded in an utterance, and then click on the named picture using a computer mouse. Participants' gaze location is monitored. Of interest are the saccadic eye movements observed as the name of the picture unfolds until the appropriate object is selected. Early research revealed that, as the initial sounds of the target picture's name are heard and processed, people are more likely to fixate on an object with a name that matches the initial portion of the spoken word than on an object with a non-matching name. Moreover, fixations to matching pictures are closely time-locked to the input, with signal-driven fixations occurring as quickly as 200 ms after the onset of the word (Allopenna, Magnuson, & Tanenhaus, 1998; also see Cooper, 1974; Tanenhaus, Spivey-Knowlton, Eberhard, & Sedivy, 1995).

Subsequent research has established that eye movements are a powerful tool for investigating the processes by which speech is perceived and interpreted, especially the time course of these processes. Allopenna et al. (1998) showed that the proportion of looks to each picture in the display over time can be closely predicted by the strength of the evidence that the name of the object is being heard. Strength of evidence for each object’s name was computed by transforming word activations predicted by a connectionist model of spoken-word recognition, TRACE (McClelland & Elman, 1986), into fixation proportions over time using the Luce choice rule (Luce, 1959) over the set of four word alternatives. Subsequent work has shown that eye movements are extremely sensitive to fine-grained phonetic and acoustic details in the spoken input (Dahan, Magnuson, Tanenhaus, & Hogan, 2001; McMurray, Tanenhaus, & Aslin, 2002; Salverda, Dahan, & McQueen, 2003).
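The core of this linking hypothesis can be sketched roughly as follows. This is a minimal illustration, not Allopenna et al.'s actual model fit: the scaling constant `k` and the activation values below are made-up placeholders, and the real analysis applied the rule to TRACE activations at each time step.

```python
import math

def luce_choice(activations, k=7.0):
    """Map word activations onto predicted fixation proportions using an
    exponentiated Luce choice rule: each alternative's response strength is
    exp(k * activation), and its predicted fixation proportion is its share
    of the summed strengths over the displayed alternatives."""
    strengths = [math.exp(k * a) for a in activations]
    total = sum(strengths)
    return [s / total for s in strengths]

# Hypothetical activations for the four displayed alternatives at one time
# step, e.g. target, cohort competitor, rhyme competitor, unrelated picture:
proportions = luce_choice([0.8, 0.6, 0.3, 0.1])
```

Run over successive time steps of the model's activations, this yields predicted fixation-proportion curves that can be compared against the observed looks to each picture; the proportions at each step sum to one and preserve the ordering of the activations.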

The use of eye movements to visual referents as a measure of lexical processing requires the use of a circumscribed “visual world”, which is most often perceptually available to listeners before the target picture’s name is heard. This world provides the context within which the input is interpreted because, in most studies, the referent object is present on the display. Furthermore, for eye movements directed to visual referents to reflect processing of the spoken signal, information extracted from each type of stimulus must interact at some level. These aspects raise two interrelated questions: At what level does information extracted from the visual display interact with processing of the spoken input, and does the influence of the display limit the degree to which the results will generalize to less constrained situations?

Here, we lay out three possible ways by which visual and spoken stimuli might interact to constrain gaze locations. One possibility is that previewing the display before the spoken input begins provides a closed set of phonological alternatives against which a phonological representation of the speech input is later evaluated. This view assumes that the phonological forms associated with the pictured objects have been accessed before the spoken input begins, either because listeners implicitly prename the pictures to themselves or because the pictures automatically activate their names. When the spoken signal becomes available, no lexical processing per se is initiated. Instead, participants