

Chapter 4

FIXATION STRATEGIES DURING ACTIVE BEHAVIOUR: A BRIEF HISTORY

MICHAEL F. LAND

University of Sussex, UK

Eye Movements: A Window on Mind and Brain

Edited by R. P. G. van Gompel, M. H. Fischer, W. S. Murray and R. L. Hill

Copyright © 2007 by Elsevier Ltd. All rights reserved.


Abstract

The study of the relationships of fixation sequences to the conduct of everyday activities had its origins in the 1950s but only started to flourish in the 1990s as head-mounted eye-trackers became readily available. The main conclusions from a decade of study are as follows: (i) eye movements are driven not by the intrinsic salience of objects but by their relevance to the task in hand; (ii) appropriate fixations typically lead manipulations by up to a second, and the eye often leaves the fixated object before manipulation is complete; (iii) many fixations have identifiable and often surprising roles in providing information for locating, guiding and checking activities. The overall conclusion is that, in contrast to free viewing, the oculomotor system is under tight top-down control, and eye movements and actions are closely linked.


1. Introduction: before 1990

Eye-movement recordings have been made for over a century (Wade & Tatler, 2005), but until comparatively recently such recordings were confined to tasks in which the head could be held stationary in a laboratory setting. These early studies included tasks such as reading aloud (Buswell, 1920), typing (Butsch, 1932), and even playing the piano (Weaver, 1943). An important outcome of all these investigations was that eye movements typically lead actions, usually by about a second. Vision is thus in the vanguard of action, and not just summoned up as specific information is required.

The studies just quoted all used bench-mounted devices of various kinds to record the eye movements of restrained subjects. For tasks involving active movements of participants a different system is required, and all the devices that have been used successfully for such recordings over the last 50 years have been head-mounted. The first of these mobile eye-trackers was made by Norman Mackworth in the 1950s (Thomas, 1968). His device consisted of a head-mounted ciné camera which made a film of the view from the subject’s head, onto which was projected, via a periscope, an image of the corneal reflex which corresponded to the direction of the visual axis. Mackworth and Thomas (1962) used a TV camera version of the device to study the eye movements of drivers. However, the development of more user-friendly devices had to wait until the 1980s, when video cameras became smaller and computers could be enlisted in the analysis of the image from the eye. A review of the methodologies currently employed is given by Duchowski (2003).

The general question that mobile eye-trackers could begin to answer was the nature of the relation between patterns of fixations and the actions that they are associated with. In this chapter we will mainly be concerned with gaze direction, that is the points to which the fovea is directed. Shifts of gaze usually involve movements of the eyes and head, and often of the whole body. To record the whole pattern of movement that results in a gaze shift requires the separate measurement of the eye, head and body contributions. The interrelations of the different components are quite complex. For example, head movements augment eye movements during gaze shifts, but during fixations the effects of head movement are cancelled out by the vestibulo-ocular reflex, which rotates the eye with a velocity equal and opposite to the head movement (Guitton & Volle, 1987). A similar reciprocity exists for head and trunk rotations, mediated by the vestibulo-collic reflex (Land, 2004). Mounting the eye-tracker on the head obviates the need to record head and body movements separately, and only eye-in-head movements have to be measured in order to retrieve gaze direction on the image from the (also head-mounted) scene camera (see Figures 6 and 8).
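The relation between these components can be made concrete with a small numerical sketch. It assumes a single (horizontal) axis of rotation and an idealised vestibulo-ocular reflex with a gain of exactly 1; the function names and the example values are illustrative assumptions, not taken from the studies cited.

```python
def gaze_angle(eye_in_head_deg, head_in_space_deg):
    """Gaze direction in space is the sum of eye-in-head and head-in-space angles."""
    return eye_in_head_deg + head_in_space_deg


def vor_eye_velocity(head_velocity_deg_s, gain=1.0):
    """Idealised VOR: eye velocity equal and opposite to head velocity.

    A gain of exactly 1 is an assumption made for illustration.
    """
    return -gain * head_velocity_deg_s


def simulate_fixation(head_velocity_deg_s, duration_s, dt=0.001):
    """Integrate head and eye angles during a fixation; return the gaze trace."""
    eye, head = 0.0, 0.0
    trace = []
    steps = int(round(duration_s / dt))
    for _ in range(steps):
        head += head_velocity_deg_s * dt
        eye += vor_eye_velocity(head_velocity_deg_s) * dt
        trace.append(gaze_angle(eye, head))
    return trace


# Hypothetical example: the head turns at 50 deg/s during a 0.3 s fixation.
trace = simulate_fixation(head_velocity_deg_s=50.0, duration_s=0.3)
print(max(abs(g) for g in trace))  # gaze deviation stays at zero
```

With a head-mounted tracker only the eye-in-head term is measured; because the scene camera turns with the head, the head-in-space term is already accounted for in the recorded image.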

Such eye-trackers allow us to address a variety of questions. Which points in the scene are selected, given that the fovea can only view one point at a time? What kinds of information does each fixation supply? And what are the time relations between eye movements and actions?

Although not concerned with active tasks, a key figure in the development of current ideas about active vision was Alfred Yarbus. In his book Eye movements and vision (Yarbus, 1967) he demonstrated convincingly that the kinds of eye movements people make when viewing a scene depend on what information they are trying to get from it, and not just on the eye-catching power (“intrinsic salience”) of the objects in that scene (Figure 1). He provided an account of eye movements in which central control, related to the task in hand, was seen as being more important than reflex-like responses to stimulus objects. In fact Guy Buswell had somewhat anticipated Yarbus’ approach in his book How People Look at Pictures, which also demonstrated that different patterns of eye movement result when different questions are asked (Buswell, 1935).

Figure 1. Eye movements made by subjects while examining I. E. Repin’s painting “An Unexpected Visitor”, with different questions in mind (adapted from Yarbus, 1967). (a) The picture. (b) “Remember the clothes worn by the people.” (c) “Remember the positions of the people and objects in the room.” (d) “Estimate how long the ‘unexpected visitor’ had been away from the family.” Saccades are the thin lines; fixations are the knot-like interruptions.

During manipulative activities of various kinds, we require different kinds of visual information from the world at different times (“Where’s the hammer?” “Is the kettle boiling?”). We may thus expect that, like Yarbus’ subjects, our eye-movement patterns reflect the information needs of the tasks we are engaged in.

2. The block copying task of Dana Ballard: Two useful maxims

One of the first detailed studies of eye movements in relation to manipulative activity was by Ballard, Hayhoe, Li, and Whitehead (1992). They used a task in which a model consisting of coloured blocks had to be copied using blocks from a separate pool. Thus the task involved a repeated sequence of looking at the model, selecting a block, moving it to the copy and setting it down in the right place (Figure 2). The most important finding was that the operation proceeds in a series of elementary acts involving eye and hand, with minimal use of memory. Thus a typical repeat unit would be as follows: fixate (block in model area); remember (its colour); fixate (a block in source area of the same colour); pickup (fixated block); fixate (same block in model area); remember (its relative location); fixate (corresponding location in copy area); move block; drop block. The eyes have two quite different functions in this sequence: to guide the hand in lifting and dropping the block, and, alternating with this, to gather the information required for copying (the avoidance of memory use is shown by the fact that separate glances are used to determine the colour and location of the model block). The only times that gaze and hand coincide are during the periods of about half a second before picking up and setting down the block (as with other tasks the eyes have usually moved on before the pickup or drop is complete).

Figure 2. The block copying task of Ballard et al. (1992). A Copy (bottom left) of the Model is assembled from randomly positioned blocks in the Source area. Typical movements of hand and eye are shown, together with their timing in a typical cycle. The eyes not only direct the hands, but also perform checks on the Model to determine the colour and location of the block being copied. [Figure labels: Model, Source; traces: Eye, Hand; scale bar: 1 s.]

The main conclusion from this study is that the eyes look directly at the objects they are engaged with, which in a task of this complexity means that a great many eye movements are required. Given the relatively small angular size of the task arena, why do the eyes need to move so much? Could they not direct activity from a single central location? Ballard et al. (1992) found that subjects could complete the task successfully when holding their gaze on a central fixation spot, but it took three times as long as when normal eye movements were permitted. For whatever reasons, this strategy of “do it where I’m looking” is crucial for the fast and economical execution of the task. This strategy seems to apply universally. With respect to the relative timing of fixations and actions, Ballard, Hayhoe, and Pelz (1995) came up with a second maxim: the “just in time” strategy. In other words the fixation that provides the information for a particular action immediately precedes that action; in many cases the act itself may occur, or certainly be initiated, within the lifetime of a single fixation. It seems that memory is used as little as possible.

3. Everyday life tasks: making tea and sandwiches

Activities such as food preparation, carpentry or gardening typically involve a series of different actions, rather loosely strung together by a flexible “script”. They provide examples of the use of tools and utensils, and it is of obvious interest to find out how the eyes assist in the performance of these tasks.

Land, Mennie, and Rusted (1999) studied the eye movements of subjects whilst they made cups of tea. When made with a teapot, this common task involves about 45 separate acts (defined as “simple actions that transform the state or place of an entity through manual manipulation”; Schwartz, Montgomery, Palmer, & Mayer, 1991). Figure 3 shows the 26 fixations made during the first 10 s of the task. The subject first examines the kettle (11 fixations), picks it up and looks towards the sink (3 fixations), walks to the sink whilst removing the lid from the kettle (inset: 4 fixations), places the kettle in the sink and turns on the tap (3 fixations), then watches the water as it fills the kettle (4 fixations). There is only one fixation that is not directly relevant to the task (to the sink tidy on the right).


Figure 3. Fixations and saccades made during the first 10 s of the task of making a cup of tea (lifting the kettle and starting to fill it). Note that fixations are made on the objects that are relevant at the time (kettle, sink, lid, taps, water stream) and that only one fixation is irrelevant to the task (the sink tidy on the right). Two other subjects showed remarkably similar fixation patterns (from Land et al., 1999).

Two other subjects showed remarkably similar numbers of fixations when performing the same sequence. The principal conclusions from this sequence are as follows:

1. Saccades are made almost exclusively to objects involved in the task, even though there are plenty of other objects around to grab the eye.

2. The eyes deal with one object at a time. The time spent on each object corresponds roughly to the duration of the manipulation of that object, and may involve a number of fixations on different parts of the object.

There is usually a clear “defining moment” when the eyes leave one object and move on to the next, typically with a combined head and eye saccade. These saccades can be used to “chunk” the task as a whole into separate “object-related actions”, and they can act as time markers to relate the eye movements to movements of the body and manipulations by the hands. In this way the different acts in the task can be pooled, to get an idea of the sequence of events in a “typical” act. The results of this are shown in Figure 4. Perhaps surprisingly, it is the body as a whole that makes the first movement in an object-related action. Often the next object in the sequence is on a different work surface, and this may necessitate a turn or a few steps before it can be viewed and manipulated. About half a second later the first saccade is made to the object, and half a second later still the first indications of manipulation occur. The eyes thus lead the hands. Interestingly, at the end of each action the eyes move on to the next object about half a second before manipulation is complete. Presumably the information that they have supplied remains in a buffer until the motor system requires it.
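The chunking procedure described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the analysis used in the original study: it assumes that every fixation has already been hand-labelled with the object it falls on, and the class names, function names and timings are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Fixation:
    onset_s: float
    offset_s: float
    target: str  # hand-coded label of the fixated object (assumed available)


@dataclass
class ObjectRelatedAction:
    target: str
    gaze_start_s: float  # onset of the first fixation on the object
    gaze_end_s: float    # offset of the last fixation before gaze moves on


def chunk_by_object(fixations):
    """Group consecutive fixations on the same object into one action."""
    actions = []
    for fix in fixations:
        if actions and actions[-1].target == fix.target:
            # Same object as before: extend the current action.
            actions[-1].gaze_end_s = fix.offset_s
        else:
            # Gaze has moved to a new object: start a new action.
            actions.append(ObjectRelatedAction(fix.target, fix.onset_s, fix.offset_s))
    return actions


def eye_hand_lead(action, hand_start_s, hand_end_s):
    """Return (lead at start, lead at end): how far gaze precedes the hands, in seconds."""
    return hand_start_s - action.gaze_start_s, hand_end_s - action.gaze_end_s


# Invented timings, for illustration only.
fixations = [
    Fixation(0.0, 0.5, "kettle"),
    Fixation(0.5, 1.0, "kettle"),
    Fixation(1.25, 1.75, "sink"),
    Fixation(2.0, 2.5, "tap"),
]
actions = chunk_by_object(fixations)
print([a.target for a in actions])
print(eye_hand_lead(actions[0], hand_start_s=0.5, hand_end_s=1.5))
```

Aligning every action on its first fixation in this way is what allows acts of different kinds and durations to be pooled into a single "typical" sequence.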

Almost identical results were obtained by Mary Hayhoe (Hayhoe, 2000; Hayhoe, Shrivastava, Mruczek, & Pelz, 2003) in a study of students making peanut butter and jelly sandwiches. She found the same attachment of gaze to task-related objects and the same absence of saccades to irrelevant objects as with the tea-making. Gaze led manipulation, although by a somewhat shorter interval. This difference is probably attributable to the fact that the sandwich-making was a sit-down task only involving movements of the arms. Two other differences that may have the same cause are the existence of more short-duration (<120 ms) fixations than in the tea-making study and the presence of more “unguided” reaching movements (13%), mostly concerned with the setting down of objects. There was a clear distinction in both studies between “within object” saccades, which had mean amplitudes of about 8° in both, and “between object” saccades, which were much larger: up to 30° in the sandwich-making on a restricted table top, and 90° in tea-making in the less restricted kitchen (Land & Hayhoe, 2001).

Figure 4. The average sequence of events during the 40 or so “object related actions” that comprise the tea-making task (3 subjects). When a trunk movement is required (for example from one work surface to another) this precedes the first fixation on the next object by about half a second. Similarly the first fixation precedes the first movement of the hands by about the same amount. At the end of the action, the eyes have already moved to the next object in the sequence about half a second before manipulation is complete, implying that information is retained in a buffer. [Two histogram panels; vertical axis: number of observations, 0 to 20; horizontal axis: time (s), −2 to 4.]

From their tea-making study Land et al. (1999) concluded that individual fixations had four main functions: locating (an object for future use), directing (hand to object), guiding (one object with respect to another, e.g., lid to pan), and checking (that some condition is met). The last three (directing, guiding, checking) comply with the maxims (do it where I’m looking and just in time) of Ballard et al. (1992, 1995), but the first (locating) does not. There is no action at the time, and information is stored for future use. In a study of hand washing, Pelz and Canosa (2001) found a small number of similar “look-ahead fixations” to objects to be contacted a few seconds later, as did Hayhoe et al. (2003) during sandwich-making. These fixations show that, in contrast to the apparent outcome of some “change blindness” studies, positional information is sometimes retained across time intervals corresponding to many fixations (see Tatler, 2002; Tatler, Gilchrist, & Land, 2005).

4. Ball games

Some ball sports are so fast that there is barely time for the player to use his normal oculomotor machinery. Within less than half a second (in baseball or cricket) the batter has to judge the trajectory of the ball and formulate a properly aimed and timed stroke. The accuracy required is a few cm in space and a few ms in time (Regan, 1992). Half a second gives time for one or at the most two saccades, and the speeds involved preclude smooth pursuit for much of the ball’s flight. How do practitioners of these sports use their eyes to get the information they need?

Part of the answer is anticipation. Ripoll, Fleurance, and Cazeneuve (1987) found that international table-tennis players anticipated the bounce and made a pre-emptive saccade to a point close to the bounce point. Land and Furneaux (1997) confirmed this (with more ordinary players). They found that shortly after the opposing player had hit the ball the receiver made a saccade down to a point a few degrees above the bounce point, anticipating the bounce by about 0.2 s. At other times the ball was tracked around the table in a normal non-anticipatory way: tracking in this case was almost always performed by means of saccades rather than smooth pursuit. The reason why players anticipate the bounce is that the location and timing of the bounce are crucial in the formulation of the return shot. Up until the bounce, the trajectory of the ball as seen by the receiver is ambiguous. Viewed monocularly, the same retinal pattern in space and time would arise from a fast ball on a long trajectory or a slow ball on a short one (Figure 5). (Whether either stereopsis or looming information is fast enough to provide a useful depth signal is still a matter of debate.) This ambiguity is removed the instant the timing and position of the bounce are established. Therefore the strategy of the player is to get gaze close to the bounce point (this need not be exact) before the ball does, and lie in wait. The saccade that effects this is interesting in that it is not driven by a “stimulus”, but by the player’s estimate of the location of something that has yet to happen.

In cricket, where – unlike baseball – the ball bounces before reaching the batsman, Land and McLeod (2000) found much the same thing as in table tennis. With fast balls the batsmen watched the delivery and then made a saccade down to the bounce point, the eye arriving 0.1 s or more before the ball (Figure 6). They showed that with a knowledge of the time and place of the bounce the batsman has the information he needs to judge where and when the ball will reach his bat. Slower balls involved more smooth pursuit. With good batsmen this initial saccade had a latency of only 0.14 s, whereas poor or non-batsmen had more typical latencies of 0.2 s or more.