PART 4
MODELLING OF EYE MOVEMENTS
Edited by
WAYNE S. MURRAY
Chapter 11
MODELS OF OCULOMOTOR CONTROL IN READING: TOWARD A THEORETICAL FOUNDATION OF CURRENT DEBATES
RALPH RADACH
Florida State University, USA
RONAN REILLY
National University of Ireland, Maynooth, Ireland
ALBRECHT INHOFF
State University of New York, Binghamton, USA
Eye Movements: A Window on Mind and Brain
Edited by R. P. G. van Gompel, M. H. Fischer, W. S. Murray and R. L. Hill Copyright © 2007 by Elsevier Ltd. All rights reserved.
Abstract
This chapter begins with a review and classification of the range of current approaches to the modeling of eye movements during reading, discussing some of the controversies and important issues arising from the variety of approaches. It then focuses on the role and conceptualization of visual attention inherent in these models and how this relates to spatial selection, arguing that it is important to distinguish visual selection for the purpose of letter and word recognition from visual selection for the purpose of movement preparation. The chapter concludes with some proposals for model testing and evaluation and some challenges for future model development.
1. Introduction: Models of oculomotor control in reading
The study of complex cognitive processes via the measurement and analysis of eye movements is a flourishing field of scientific endeavor. We believe that three reasons can account for this continuing attractiveness (Radach & Kennedy, 2004). First, reading provides a domain in which basic mechanisms of visual processing (e.g. “selective attention” or “intersaccadic integration”) can be studied in a highly controlled visual environment during a meaningful and ecologically valid task. Second, reading can be studied as an example of complex human information processing in general. This involves the explicit description of the relevant levels and modules of processing and the specification of their dynamic interactions. Third, understanding links between cognition and eye-movement control in reading is one of the backbones of modern psycholinguistics, where oculomotor methodology has become a standard tool for studying the processing of written language at the level of words, sentences and integrated discourse. Excellent discussions of theoretical and methodological issues in this important subfield of reading research can be found in Murray (2000) and Clifton, Straub, and Rayner (Chapter 15, this volume).
In the context of the description given above, the development of computational models is relevant for all three levels of ongoing research on reading. An important theoretical starting point for any model of this kind is the idea that continuous reading involves two concurrent streams of processing. Quite obviously, the primary task is the processing of written language, where the acquisition of orthographically coded linguistic information feeds into the construction of a cognitive text representation. At the same time, the targeting and timing of saccadic eye movements serves to provide adequate spatio-temporal conditions for the extraction of text information. To understand the coordination and integration of these two processing streams is the main motivating force for the development of computational models of oculomotor control in reading.
This view of modeling reading differs slightly from the position taken in the commentary chapter by Grainger (2003) in the book edited by Hyönä, Radach, & Deubel (2003). In his thoughtful discussion of relations between research on single-word recognition and continuous reading, he emphasizes a distinction between a “model of reading” and a “model of a task” used to study reading. In this context, measuring eye movements is one particular task, with a role similar to a lexical decision or perceptual identification task. All possible tasks emphasize different aspects of the process and the “functional overlap” between them will be instrumental in understanding the true nature of reading. Although this view provides a useful strategy for research on specific hypotheses about word processing, it neglects one fundamental point: Eye movements are not just an indicator of cognition, they are part and parcel of visual processing in reading, just as “active vision” (Findlay & Gilchrist, 2003) is part and parcel of perception and information processing in general. The eyes are virtually never static and words always compete with other visual objects for processing on multiple levels and for becoming the target of the ensuing saccade. Thus, coordinating eye and mind is a key part of the natural process of reading rather than a level of complexity added by just another task.1
In the mid-1990s, the arena of (pre-computational) models was dominated by the debate about the degree to which “eye” and “mind” are linked during reading. In this discussion, the prototypical adversary on the cognitive side was the attention-based control model originally proposed by Morrison (1984) and reformulated by Rayner & Pollatsek (1989). On the other, visuomotor end of the spectrum was the “strategy and tactics” theory by O’Regan (1990), claiming that the eyes were driven by a global scanning strategy in combination with local (re)fixation tactics. However, it soon became clear that neither extreme could account for the full range of phenomena, so that both sides acknowledged the need to include ideas from both points of view (O’Regan, Vitu, Radach, & Kerr, 1994; Rayner & Raney, 1996). As is evident from the description of the current E-Z reader model in Chapter 12, the model implements both the refixation tactics suggested by O’Regan and the metrical principles of saccade-landing positions proposed by McConkie, Kerr, Reddix, & Zola (1988). On the other hand, Glenmore, a model that has been developed out of the tradition of low-level theories (Reilly & O’Regan, 1998; Radach & McConkie, 1998), incorporates a word-processing module that, together with visual processing and oculomotor constraints, determines the dynamics of eye movements during reading.
Today the most comprehensive computational models of visuomotor control in reading are E-Z reader and the SWIFT model, which, on a comparable level of model complexity, account for an impressively wide range of empirical phenomena. These include not just basics such as effects of word frequency and word length on spatial and temporal parameters, but also such intricate phenomena as the modulation of parafoveal preprocessing by foveal processing, the generation of regressions and the so-called “inverted optimal viewing position effect”. Interestingly, there are a number of similarities between the two families of models, such as the idea of a labile and a non-labile phase of saccade programming and the implementation of saccade amplitude generation based on McConkie et al. (1988). Major differences concern two central questions about the nature of the eye–mind link. The first is how saccades are triggered: While in E-Z reader every interword saccade is assumed to be triggered by a specific word-processing event, saccades are triggered in SWIFT by an autonomous generator which in turn is modulated (delayed) by the mental load of foveal linguistic processing. The second important difference is the degree of spatial and temporal overlap in the processing of words within the perceptual span. In all sequential attention shift (SAS) models, a one-word processing beam, referred to as “attention”, moves in strictly sequential fashion. In contrast, linguistic processing encompasses several words within a gradient of attention or “field of activation” in SWIFT.
1 Measuring eye movements in continuous reading situations is certainly not the only way to understand the processing of written language. Single-word recognition paradigms make extremely valuable contributions to the understanding of word processing under rigorously controlled conditions. These paradigms have, in many respects, laid the foundation for experimental reading research as a whole (see Jacobs & Grainger, 1994, for an informative review).
The idea that there is a certain degree of parallel word processing is shared in one way or another by a number of recent models, which are often referred to as PG (processing gradient) models (Reilly & Radach, 2006; Inhoff, Eiter, & Radach, 2005) or GAG (guidance by attentional gradient) models (e.g. Engbert, Nuthmann, Richter, & Kliegl, 2006). We prefer the term “processing gradient” because it avoids reference to the notion of “attention”, which, as we will discuss below in some detail, may be applied to a variety of diverse phenomena and hence might lead to misunderstandings.
Looking at the recent models listed above, it is clear that all can be readily classified along the two axes just mentioned: autonomous saccade generation vs cognitive control on the one hand and sequential vs parallel word processing on the other. It is also interesting to note that both dimensions are indeed necessary for a meaningful classification. This can be illustrated using the example of Mr. Chips, an ideal observer model developed by Legge, Klitz, & Tjan (1997; see also Legge, Hooven, Klitz, Mansfield, & Tjan, 2002). Here, letter and word processing is parallel across word boundaries within a fixed visual span, but saccade control is exclusively determined by cognitive processing.
Within the family of PG models, there is substantial variation in the degree to which (more or less) parallel processing is associated with cognitive modulation of saccade triggering. Today, nearly everyone in the field acknowledges the overwhelming evidence that linguistic processing is reflected in temporal and spatial aspects of saccade control. However, it is nonetheless fascinating to see how far one can go with models that rely almost exclusively on low-level visual processing and oculomotor constraints. One such model is the SERIF model (McDonald, Carpenter, & Shillcock, 2005); another is the Competition-Interaction model (Yang, 2006; Yang & McConkie, 2001). In both models, there is little (and rather indirect) cognitive influence on saccade control, while, in contrast, ongoing linguistic processing has a substantial impact on saccade triggering in SWIFT and Glenmore.
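The two-dimensional classification discussed in the preceding paragraphs can be written down as a small lookup table. What follows is only an illustrative sketch: the names `Trigger`, `WordProcessing`, `MODELS` and `axes_that_differ` are hypothetical, and the coarse three-level coding of the triggering axis is our simplifying assumption based on the verbal descriptions given here, not a feature of any published implementation.

```python
from enum import Enum


class Trigger(Enum):
    """Coarse, assumed coding of what drives saccade triggering."""
    LOW_LEVEL = "mostly low-level visuomotor"
    MODULATED = "substantially modulated by linguistic processing"
    COGNITIVE = "triggered by cognitive processing"


class WordProcessing(Enum):
    """Whether words are processed one at a time or in parallel."""
    SEQUENTIAL = "sequential"
    PARALLEL = "parallel"


# Placement follows the verbal descriptions in the text; it is a
# simplification made for illustration, not an authoritative coding.
MODELS = {
    "E-Z Reader": (Trigger.COGNITIVE, WordProcessing.SEQUENTIAL),
    "Mr. Chips": (Trigger.COGNITIVE, WordProcessing.PARALLEL),
    "SWIFT": (Trigger.MODULATED, WordProcessing.PARALLEL),
    "Glenmore": (Trigger.MODULATED, WordProcessing.PARALLEL),
    "SERIF": (Trigger.LOW_LEVEL, WordProcessing.PARALLEL),
    "Competition-Interaction": (Trigger.LOW_LEVEL, WordProcessing.PARALLEL),
}


def axes_that_differ(model_a, model_b):
    """Return the names of the axes on which two models are coded differently."""
    trigger_a, processing_a = MODELS[model_a]
    trigger_b, processing_b = MODELS[model_b]
    differing = []
    if trigger_a != trigger_b:
        differing.append("trigger")
    if processing_a != processing_b:
        differing.append("word_processing")
    return differing


if __name__ == "__main__":
    for other in ("E-Z Reader", "SWIFT"):
        print("Mr. Chips vs", other, "->", axes_that_differ("Mr. Chips", other))
```

In this coding, Mr. Chips differs from E-Z Reader only on the word-processing axis and from SWIFT only on the triggering axis, which is why a single dimension would not suffice to separate the three.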
As mentioned above, at the other end of the spectrum there is the assumption by the authors of the E-Z reader model that each saccadic eye movement is triggered by a single cognitive processing event. Given the number and diversity of opposing approaches, it may seem that the sequential attention assumption is a “minority position” or an extreme viewpoint. However, such an impression is unjustified for two reasons. First, the E-Z reader model is actually not at the real end of the spectrum, as it incorporates a lot of low-level machinery. Much more extreme versions of cognitive control have been proposed in the past (e.g. Just & Carpenter, 1988) but did not receive much empirical support. Second, and more importantly, when conceiving a spatially distributed mode of processing, there are many ways by which one can design and implement such a system. However, in the case of a sequential processing system, there are much tighter constraints such as the need for a precisely defined trigger event for saccades, driving the design of a model in a certain direction. It is therefore no surprise that there is only one family of E-Z reader models and a relatively large number of competitors. Engbert, Nuthmann, Richter, & Kliegl (2005) have recently expressed this relation of opposing approaches by referring to a sequential control mechanism as one special case within a space of possible solutions along a dimension from massive to very limited parallel processing.
By occupying one of the extreme positions on this continuum, the E-Z reader model gains a unique quality: It becomes so specified that falsification of core mechanisms within the model becomes feasible (see Jacobs, 2000, for a detailed discussion). We believe that it is primarily for this reason that this type of model has provoked an enormous amount of empirical work. Therefore, even if some of the central assumptions of the sequential architecture turn out to be incorrect, the model will have contributed more to the field than many of its less traceable competitors.
Our goal for this commentary chapter is not to discuss the state of the art in the entire field of computational modeling of continuous reading. An excellent overview has been provided in the review by Reichle, Rayner, & Pollatsek (2003), summarizing key features of virtually all existing types of models. Publications on the most recent versions of the SWIFT model (Engbert, Nuthmann, Richter, & Kliegl, 2005) and the E-Z reader model (Pollatsek, Reichle, & Rayner, 2006) include detailed discussions of their background and also point to some controversial points in the ongoing theoretical debate. A special issue of Cognitive Systems Research edited by Erik Reichle includes new or updated versions of no less than six different models: the SWIFT model (Richter, Engbert, & Kliegl, 2006), the E-Z Reader model (Reichle, Pollatsek, & Rayner, 2006), the Glenmore model (Reilly & Radach, 2006), the Competition/Interaction model (Yang, 2006) and the SHARE model (Feng, 2006).
Rather than giving another descriptive overview concerning design principles, mechanisms and implementations of the existing models, this chapter will attempt to provide a relatively detailed discussion of some rather fundamental theoretical ideas that are implicit in many modeling approaches but only rarely made explicit. We will focus this discussion on aspects of visual processing often subsumed under the concept of “attention”. This discussion will include quite a few references to research on oculomotor control outside the domain of reading, which appears particularly appropriate in a volume that reflects the state of the art in oculomotor research as a whole. The idea is to explore the extent to which current modeling in the field of reading is grounded in basic visuomotor research and how it corresponds to evidence from neighboring domains. Since there is only one integrated human information processing system, any model of reading should be seen as a special case of a more general theory of visual processing and oculomotor control. Processing mechanisms proposed for reading should be in harmony with the mainstream of visuomotor research, and models should, in principle, be able to generalize to other domains such as scene perception and visual search. After looking in some detail at answers to a number of key questions about “visual attention”, we will consider consequences for models of reading. In the final part of the chapter we will discuss some issues for future model developments and point to important problems for model comparison and evaluation.
2. The role of visual attention for theories and models of continuous reading
2.1. Definitions of visual selective attention
Until recently, the notion of “attention” was in most cases used in the literature on eye movements in reading without providing or referring to any explicit definition. The reason for this lack of precision may be that some authors have shared the popular view that “everyone knows what attention is” (James, 1890), so that a definition was considered unnecessary.2 Alternatively, authors may have avoided definitions because they were aware of the problems that still exist in the present literature on attention in providing a precise and unambiguous specification of the concept. In recent publications on the E-Z reader model, Reichle et al. have specified their understanding of attention by referring to the classic work by Posner (1980, see below). Other authors have linked their theoretical ideas to a gradient conception of attention (e.g. Engbert, Nuthmann, Richter, & Kliegl, 2005) or deliberately avoided the term “attention” altogether (Reilly & Radach, 2006). In this section we intend to deepen the ongoing discussion by considering some fundamental questions of research on “attention” that should form the basis of more detailed consideration within the framework of specific models of eye-movement control in reading.
There is an enormous body of literature on attention, and for non-specialists it is quite difficult to keep track of the dynamic and complex developments in the field (see, e.g., Chun & Wolfe, 2001; Egeth & Yantis, 1997; and Pashler, 1998, for reviews). Following Groner (1988), we can distinguish several facets of “attention” as follows: On a very abstract level, one can first distinguish general attention in terms of alertness, arousal or general activation from specific attention. Within the domain of specific attention, the notion is used to describe issues related to “divided” attention (the processing of information on competing channels), the “orienting of attention” and, finally, problems related to the “allocation of attentional resources”. Zooming in on attentional orienting, the distinction of overt vs covert orienting refers to whether “orienting” is observable in terms of a behavioral act (e.g. an eye movement) or is hypothetical, for example as an “attentional movement” inferred indirectly from behavioral data. A further distinction can be drawn between orienting that is controlled externally, for example by a sudden visual onset “capturing” attention, and orienting that is controlled internally, for example by inducing expectations about the location of a to-be-displayed stimulus. In this chapter we will concentrate primarily on attentional orienting, and, more specifically, visual selection,
2 The explanation given by James (1890) reads as follows: “It is the taking possession by the mind, in clear and vivid form, of one out of what seem several simultaneously possible objects or trains of thought. Focalization, concentration, of consciousness are of its essence. It implies withdrawal from some things in order to deal effectively with others, and is a condition which has a real opposite in the confused, dazed, scatterbrained state which in French is called distraction, and Zerstreutheit in German” (http://psychclassics.yorku.ca/James/Principles/prin11.htm). Interestingly, although this definition appears straightforward, it captures only one aspect of attentional processing, which is of rather peripheral importance to our discussion.
