
274 Andrew Derrington

together. When the differentiation that calculates the spatial and temporal gradients is combined with filtering, as it generally is in any biologically plausible motion-detecting model, it is possible to express the two approaches in very similar ways.

There are two reasons for this. First, differentiating an image and then filtering it with a linear filter gives exactly the same result as differentiating the filter and then using it to filter the original image. Second, differentiating a filter converts an even filter into an odd filter, introducing the same phase shifts as exist between the quadrature pairs of filters in the linear motion sensor. Differentiation also changes the amplitude spectrum of the filter, making it more high pass. One consequence of this is that combining a low-pass or mildly band-pass blurring function, such as might be produced by retinal processing, with differentiation produces a band-pass filter that becomes narrower with subsequent differentiation operations.
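Both points can be checked numerically. The sketch below is only an illustration under stated assumptions: the one-dimensional random "image", the Gaussian blurring filter, its width, and the central-difference approximation to differentiation are all choices made here, not details taken from any of the models discussed.

```python
import numpy as np

# Hypothetical 1-D "image": a random luminance profile, treated as periodic.
rng = np.random.default_rng(0)
n = 256
image = rng.standard_normal(n)

# Hypothetical even-symmetric blurring filter: a Gaussian centred on sample 0 (wrapped).
x = np.arange(n)
x = np.where(x > n // 2, x - n, x)              # signed sample offsets
blur = np.exp(-0.5 * (x / 4.0) ** 2)

def circ_conv(a, b):
    """Circular convolution computed via the FFT."""
    return np.real(np.fft.ifft(np.fft.fft(a) * np.fft.fft(b)))

def derivative(a):
    """Central-difference derivative with periodic boundaries (itself a linear filter)."""
    return (np.roll(a, -1) - np.roll(a, 1)) / 2.0

# First point: the order of differentiation and filtering does not matter.
route_a = circ_conv(derivative(image), blur)    # differentiate the image, then filter
route_b = circ_conv(image, derivative(blur))    # differentiate the filter, then filter
same_result = np.allclose(route_a, route_b)

# Second point: differentiating an even filter gives an odd filter.
d_blur = derivative(blur)
mirror = (-np.arange(n)) % n                    # index of the sample at -x
blur_is_even = np.allclose(blur, blur[mirror])
d_blur_is_odd = np.allclose(d_blur, -d_blur[mirror])

# The differentiated filter has no response at zero frequency (it is more high pass).
dc_gain = abs(d_blur.sum())
```

Circular convolution is used only to avoid edge effects; the same equivalence holds for ordinary convolution away from the image borders.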

Mark Georgeson has shown that the motion energy computation can be done using input filters that have these spatial and temporal profiles and that the motion energy velocity computation, which is done by dividing the output of the motion energy stage by the output of a spatially matched static contrast energy stage, is identical to the velocity computation based directly on spatial and temporal derivatives (Bruce, Green, & Georgeson, 1996). This blurs the distinction between the multichannel gradient model and energy models of motion analyzers.
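As a rough illustration of a velocity computation based directly on spatial and temporal derivatives, the sketch below recovers the speed of a drifting grating from the ratio of its temporal and spatial gradients. It is a generic least-squares gradient estimate, not Georgeson's formulation; the stimulus parameters and sampling steps are assumptions chosen for the example.

```python
import numpy as np

# Hypothetical stimulus: a sinusoidal grating drifting rightwards at a known speed.
true_speed = 2.0                  # deg/s (assumed)
spatial_freq = 1.0                # cycles/deg (assumed)
x = np.arange(0.0, 4.0, 0.01)     # spatial positions, deg
t = np.arange(0.0, 0.5, 0.001)    # time, s
X, T = np.meshgrid(x, t, indexing="ij")
luminance = np.sin(2 * np.pi * spatial_freq * (X - true_speed * T))

# Numerical spatial and temporal derivatives (axis 0 is space, axis 1 is time).
d_dx, d_dt = np.gradient(luminance, x, t)

# Drop the borders, where np.gradient uses less accurate one-sided differences.
ix = d_dx[1:-1, 1:-1]
it = d_dt[1:-1, 1:-1]

# Least-squares solution of it + v * ix = 0, pooled over space and time.
estimated_speed = -np.sum(it * ix) / np.sum(ix * ix)
```

Pooling the products over space and time, rather than dividing the derivatives point by point, avoids division by values of the spatial gradient that are close to zero.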

B. Experimental Data

The preceding section shows that, although the various approaches to analyzing the direction of motion have different starting points, most of them can be brought to a common end point by appropriate choice of filter characteristics and of subsequent processing. This overlap between the different models has the consequence that psychophysical experiments may not reveal which of the approaches described in the previous section provides the most appropriate description of the mechanisms that enable us to see motion.

In this section we shall see that psychophysical experiments allow us not only to demonstrate that some motion percepts cannot be derived from analysis of the correspondence between image features over time, and so must result from some sort of filtering operation applied to the raw spatiotemporal luminance profile, but also to infer the spatial and temporal characteristics of the filters involved. They may yet allow us to distinguish between the different nonlinear operations that follow the filtering stages in the models outlined in the previous section. We shall also see that physiological experiments on single neurones can determine which of the different classes of model best applies to the neurone in question.

1. Psychophysics

a. Filtering versus Correspondence

In order to demonstrate that a motion percept does not depend on an analysis of how features or objects in the image change their position with time, it is sufficient to show either that motion is perceived when features are not displaced or do not appear to be displaced, or that the perceived motion and the perceived feature displacement may have different speeds or go in different directions. I shall discuss three clear demonstrations of this kind of dissociation between perceived motion and the perceived displacement of features, indicating that some motion percepts must be extracted directly from the image. In most images, however, motion analysis by tracking of features and by filtering gives the same result, and special techniques are necessary to distinguish motion percepts based on filtering from those based on correspondence.

6 Seeing Motion

Perhaps the most straightforward demonstration of a dissociation between perceived motion and the perceived positions of objects arises in the motion aftereffect, or waterfall illusion, first described in the modern literature in the nineteenth century (Addams, 1834). If, after looking at an object or pattern in continuous motion for some time, the motion is stopped or the gaze is shifted, motion in the opposite direction to that which had actually been occurring is seen in the same part of the visual field. However, static objects in that part of the visual field do not appear to change their positions with time.

The fact that the motion aftereffect does not involve any changes in perceived position is clear evidence that the perception of motion can be dissociated from any changes in the position (or the perceived position) of image features, and thus must result from activity in special-purpose mechanisms for the perception of motion. Barlow and Hill suggested that the motion aftereffect arises because the perception of motion in a given direction arises when there is an excess of activity in neurones selective for motion in that direction, relative to those selective for motion in the opposite direction. They illustrated the point by showing recordings from a direction-selective neurone recorded in the retina of the rabbit. Prolonged stimulation with motion in the preferred direction was followed by a depression of its firing rate below the normal resting level (Barlow & Hill, 1963). Although it is unlikely that direction-selective mechanisms in the human visual system are exactly the same as those in the rabbit retina, the motion aftereffect does point to the existence of motion sensors in the human visual system that do not depend on changes in position.

A second situation in which motion and position changes are dissociated occurs when subjects attempt to detect oscillating relative motion in random-dot fields (Nakayama & Tyler, 1981). The stimulus consists of a rectangular field of random dots in which each row of dots oscillates to and fro along a horizontal path. The horizontal velocity of each dot is given by the product of a sinusoidal function of its vertical position and a sinusoidal function of time. If the motion of such a pattern were sensed by a mechanism that sensed changes in the horizontal positions of dots, one would expect that the motion would be detectable whenever the amplitude of the oscillating displacement exceeded the smallest detectable displacement.

FIGURE 9 Displacement thresholds for detecting oscillatory shearing motion in a pattern of random dots. From approximately 0.1 Hz to 2 Hz the threshold declines in proportion to the temporal frequency of oscillation, which indicates that, expressed as a velocity, it is unchanging over this range. (Reprinted from Vision Research, 21, Nakayama, K., & Tyler, C. W. Psychophysical isolation of movement sensitivity by removal of familiar position cues, pp. 427–433. Copyright 1981, with permission of Elsevier Science and the author.)

In fact, as Figure 9 shows, for temporal frequencies up to about 2 Hz, the threshold displacement in an oscillating random-dot display declines almost exactly in proportion to the temporal frequency, indicating that threshold is reached at a constant velocity rather than at a constant displacement. Although it is not possible a priori to define how the sensitivity of motion detectors of different types should vary with temporal frequency, it seems likely that in this case the limit is not set by a mechanism that tracks displacements, since one would expect the performance of such a mechanism to be limited by displacement, and to decline rapidly at high temporal frequency, since one might expect that the encoding of each position displacement would take a more or less fixed time.
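The step from a threshold displacement that falls with temporal frequency to a constant threshold velocity can be written out briefly. The symbols are assumptions introduced here for illustration: A is the amplitude of a dot's sinusoidal displacement, f its temporal frequency, and c a constant.

```latex
\[
x(t) = A \sin(2\pi f t), \qquad
v(t) = \frac{dx}{dt} = 2\pi f A \cos(2\pi f t), \qquad
v_{\mathrm{peak}} = 2\pi f A .
\]
\[
A_{\mathrm{thr}}(f) = \frac{c}{f}
\;\Longrightarrow\;
v_{\mathrm{peak,thr}} = 2\pi f \cdot \frac{c}{f} = 2\pi c
\quad \text{(independent of } f\text{)} .
\]
```

A threshold set by displacement alone would instead give a constant threshold amplitude, and hence a threshold velocity that rises in proportion to f.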

A third situation in which motion judgments and displacement judgments are dissociated arises when human subjects are asked to discriminate the direction of motion of complex grating patterns that contain a high spatial-frequency (about 3 cycles/deg) moving sinusoid added to a low spatial-frequency (1 cycle/deg) sinusoid that does not move. At long durations the motion of such a pattern is seen correctly, but when it is presented for less than about 100 ms the pattern appears to move in the opposite direction to the actual motion of the 3 cycle per degree component, both when the motion is continuous and when it is part of a two-frame apparent motion sequence (Derrington & Henning, 1987b; Henning & Derrington, 1988). However, if subjects are asked to discriminate the direction of vernier offset between the two frames of the apparent motion sequence presented one above the other, they perform correctly both at short durations and at long durations, as shown in Figure 10.

It is not clear what makes this pattern appear to move in the wrong direction when it is presented for a short duration; however, the fact that it reverses its direction of motion without reversing the offset in perceived position indicates that the motion signal is derived independently of any sense of spatial position. It does not depend on a correspondence-based mechanism.

If we make the assumption that the reversal of the motion percept at short durations is some intrinsic property of the motion filters, such as an interaction between filters tuned to different spatial frequencies, it follows that the most likely explanation for the recovery in performance at long durations is that the correspondence-based mechanism for sensing motion is able to provide a veridical signal at long durations that overcomes the erroneous signal derived from motion filters. From this it is tempting to infer that the stimulus duration at which veridical motion is first seen, about 200 ms, represents a lower limit on the operation of the correspondence-based motion-sensing mechanism. It suggests that one way to isolate motion mechanisms based on spatiotemporal filters is to use stimuli shorter in duration than this.

FIGURE 10 Performance in judging direction of motion and direction of vernier offset in a pattern that consisted of the sum of a 3 cycle/degree grating that was displaced either between frames (in the motion task) or between the top of the frame and the bottom of the frame (in the vernier task) and a 1 cycle/degree grating that was not displaced. The motion discrimination is reliable but incorrect (i.e., observers see reversed motion) at short durations and correct at long durations. The vernier discrimination is correct at long and short durations. (Reprinted from Vision Research, 27, Derrington, A. M., & Henning, G. B. Errors in direction-of-motion discrimination with complex stimuli, 61–75. Copyright 1987, with permission of Elsevier Science.)


Motion discriminations that depend on correspondence-based mechanisms can be identified by adding to the stimulus a mask that prevents a correspondence-based mechanism from extracting a motion signal but does not affect the motion filter. Lu and Sperling (1995) have shown that adding a pedestal, a high-contrast static replica of itself, to a moving sinusoidal grating should have no effect on an elaborated Reichardt detector, while making it impossible for a correspondence-based analysis to extract a motion signal.

The logic of the pedestal test is straightforward. First, the elaborated Reichardt detector is immune to the pedestal because its response to the sum of several different temporal frequencies is the sum of the responses to the individual temporal frequencies (Lu & Sperling, 1995). The pedestal simply adds an extra temporal frequency component (0 Hz, which generates no output from the Reichardt detector) to the moving stimulus. Accordingly, the pedestal should not affect the response of the Reichardt detector. In fact, we can expect that when the contrast of the pedestal is high it will reduce sensitivity by activating gain-control mechanisms, but this should not happen until it is several times threshold contrast.
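The argument can be illustrated with a toy simulation. The sketch below is a minimal two-input correlator with a pure delay, not the elaborated Reichardt detector itself (it has no spatial or temporal prefilters), and every parameter value is an assumption chosen for convenience. It compares the time-averaged opponent output with and without a static pedestal of twice the contrast of the moving grating.

```python
import numpy as np

def reichardt_mean_output(moving_contrast, pedestal_contrast):
    """Time-averaged output of a minimal two-input Reichardt-style correlator.

    All parameter values are illustrative assumptions, not fitted values.
    """
    spatial_freq = 1.0       # cycles/deg
    temporal_freq = 4.0      # Hz
    left = 0.1               # deg, position of the first input
    right = left + 0.25      # deg, a quarter of a spatial period away
    delay = 1.0 / 16.0       # s, a quarter of a temporal period

    # Sample a whole number of temporal cycles so that the average is exact.
    t = np.linspace(0.0, 1.0, 4000, endpoint=False)

    def stimulus(x, time):
        moving = moving_contrast * np.sin(
            2 * np.pi * (spatial_freq * x - temporal_freq * time))
        pedestal = pedestal_contrast * np.sin(2 * np.pi * spatial_freq * x)  # static, 0 Hz
        return moving + pedestal

    # Each half-detector multiplies a delayed copy of one input by the other input.
    rightward = stimulus(left, t - delay) * stimulus(right, t)
    leftward = stimulus(left, t) * stimulus(right, t - delay)
    # Opponent stage (subtraction) followed by integration over time.
    return np.mean(rightward - leftward)

without_pedestal = reichardt_mean_output(0.1, 0.0)
with_pedestal = reichardt_mean_output(0.1, 0.2)
```

In this sketch the two averages agree because the purely static products cancel in the opponent stage and the cross-products between pedestal and moving grating average to zero over whole cycles. The instantaneous output does differ, so the equality relies on the integration over time.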

On the other hand, even if the pedestal is only slightly higher in contrast than the moving pattern, it will prevent features from moving consistently in any one direction. Instead, as Figure 11 illustrates, the features oscillate backwards and forwards over a range that depends on the relative contrasts of the moving pattern and the pedestal. In the presence of the pedestal any mechanism that depends solely on changes in the positions of features in the image to compute a motion signal will be prevented from extracting a consistent motion signal.
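The oscillation of the features can be made concrete with a short calculation. The sketch below is illustrative only: it uses the identity that the sum of two sinusoids of the same spatial frequency is a single sinusoid whose spatial phase gives the position of its features, and it assumes contrasts in arbitrary relative units.

```python
import numpy as np

def feature_phase(moving_contrast, pedestal_contrast, cycles):
    """Spatial phase (radians) of the compound grating at each moment in time.

    moving*sin(kx - wt) + pedestal*sin(kx) equals R(t)*sin(kx - phase(t)),
    where R(t)*exp(-1j*phase(t)) = pedestal + moving*exp(-1j*w*t).
    """
    wt = 2 * np.pi * cycles
    phasor = pedestal_contrast + moving_contrast * np.exp(-1j * wt)
    return -np.angle(phasor)

cycles = np.linspace(0.0, 2.0, 2001)       # two temporal cycles of the moving grating

# Moving grating alone: the phase (feature position) advances without limit.
alone = np.unwrap(feature_phase(1.0, 0.0, cycles))
total_drift_alone = alone[-1] - alone[0]   # 2 cycles, i.e. 4*pi radians

# Pedestal of twice the contrast, as in Figure 11: the phase only oscillates.
with_pedestal = feature_phase(1.0, 2.0, cycles)
max_excursion = np.max(np.abs(with_pedestal))   # bounded by arcsin(1/2), i.e. 30 degrees
```

The bound on the excursion is arcsin of the ratio of moving contrast to pedestal contrast, which is one way of expressing the dependence on relative contrast described above.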

Psychophysical measurements of contrast thresholds show that pedestals of moderate contrast do not affect thresholds for judgments of direction of motion of simple luminance patterns, but several more complex motion stimuli are affected. Adding a pedestal to the moving stimulus raises the contrast required to discriminate direction of motion of patterns defined by variations in binocular disparity or direction of motion (Lu & Sperling, 1995).

FIGURE 11 Space–time plots of a moving sinusoidal grating, a static pedestal of twice its contrast, and their sum. Adding the pedestal to the grating prevents the continuous displacement of features that occurs during movement. Instead the features oscillate and change their contrast over time.

These results raise the possibility that we may be able to divide motion stimuli into two classes according to whether or not they are susceptible to pedestals. However, although such a classification is attractive, it is not necessarily as straightforward to interpret as Lu and Sperling (1995) suggest. In fact, there are three specific reasons that we should not leap to the conclusion that all motion stimuli that are vulnerable to pedestals are analyzed by correspondence-based mechanisms (feature trackers) and those that are not vulnerable are analyzed by motion filters.

First, the Reichardt detector’s immunity to pedestals depends on an assumption that the detector’s response is integrated over time. The space–time plot in Figure 11 shows that the addition of a pedestal to a moving sinusoidal grating gives rise to a stimulus that moves backwards and forwards over time. When the motion is forwards (i.e., in the same direction as when there is no pedestal) the contrast is higher; however, the grating spends almost half its time moving in the reverse direction. Brief stimuli, or stimuli that are not integrated over time, could well give rise to a motion signal in the opposite direction, resulting in a deterioration in performance in the presence of the pedestal. Thus relatively minor variations in the detailed architecture of the Reichardt motion detector might make it vulnerable to pedestals. In addition, Lu and Sperling (1995) suggest that high-contrast pedestals are likely to impair the performance of a Reichardt motion analyzer simply by activating a contrast gain-control mechanism.

Second, the assertion that correspondence-based or feature-tracking motion analyzers are vulnerable to pedestals depends on an assumption that the contrast of a feature has no effect on the ability to analyze its location or to track it. It might well be that when different features signal opposite directions of motion, or when the same feature signals opposite directions of motion at different times, the features with higher contrast are more likely to determine the perceived direction of motion. If this were to happen we would expect that feature-tracking motion-analyzing mechanisms would be resistant to pedestals.

Third, even if all feature-tracking motion mechanisms are vulnerable to pedestals and all motion filters are resistant to them, we should acknowledge that in principle any feature can be tracked, whether or not its motion is normally analyzed by a motion filter or Reichardt detector. Thus many moving stimuli will be analyzed by both types of mechanisms and the effect of interfering with one or other mechanism will depend on which is the more sensitive.

It follows that the discovery that under a particular set of circumstances our ability to analyze the motion of a particular stimulus is resistant to a pedestal does not mean that under normal circumstances feature tracking may not make an important contribution to the analysis of the motion of that particular stimulus. In my own lab we have found that the same sinusoidal grating moving at the same speed can become vulnerable to a pedestal simply by making its motion less smooth by causing it to move in jumps of period (Derrington & Ukkonen, unpublished observations).

This change can easily be explained by assuming that when the grating moves in smaller steps, spatiotemporal filters are more sensitive than the feature-tracking mechanism, and when it moves in large steps, the reverse is true. This kind of change in relative sensitivities seems quite reasonable in two respects. First, as the jump size increases and the average speed remains constant, tracking features should become easier because the features spend more time stationary in between jumps. Second, changing the jump size while keeping the speed constant affects the responsiveness of motion detectors based on quadrature pairs of spatial filters because the temporal frequency spectrum of the stimulus becomes contaminated by sampling artifacts (Watson, 1990).
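The second point can be illustrated by comparing the temporal spectrum at one point in space for a grating that drifts smoothly and for one that moves at the same average speed in discrete jumps. The sketch is an assumption-laden toy: the jump size used here (a quarter of a period) is hypothetical and is not the value used in the experiments mentioned above, and the temporal sampling is arbitrary.

```python
import numpy as np

samples_per_cycle = 64        # temporal resolution of the simulation (assumed)
n_cycles = 8                  # length of the record in temporal cycles (assumed)
i = np.arange(samples_per_cycle * n_cycles)

def off_fundamental_power(phase_index):
    """Fraction of temporal power at frequencies other than the drift frequency."""
    # Luminance over time at a single point in space.
    luminance = np.sin(-2 * np.pi * phase_index / samples_per_cycle)
    power = np.abs(np.fft.rfft(luminance)) ** 2
    fundamental = power[n_cycles]   # bin index equals the number of cycles in the record
    return 1.0 - fundamental / power.sum()

# Smooth motion: the phase advances on every sample.
smooth = off_fundamental_power(i)

# Jumpy motion, same average speed: the phase is held and then advanced in steps.
jump = samples_per_cycle // 4       # hypothetical jump of a quarter period
jumpy = off_fundamental_power((i // jump) * jump)
```

With these settings the smooth grating has essentially all of its temporal power at the drift frequency, whereas a substantial fraction of the power of the jumping grating appears at other temporal frequencies.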

In sum, although the pedestal test represents a promising potential technique for distinguishing between different types of motion mechanisms, it is appropriate to be cautious both in interpreting the results and in extrapolating from them.

b. Characteristics of Motion Filters

The spatial and temporal frequency selectivity of the mechanisms subserving motion perception can be analyzed by the same techniques as have been used to study the mechanisms of spatial vision (Braddick, Campbell, & Atkinson, 1978). The most widely used techniques are adaptation (also known as habituation), in which the aftereffect of viewing a moving stimulus is a selective elevation of threshold for subsequently presented stimuli that are moving in the same direction, and masking, in which a high-contrast moving “mask” selectively elevates the threshold of concurrently presented stimuli that are similar to the mask. One of the most complete descriptions of the spatial and temporal frequency selectivity of mechanisms responsible for the detection of moving stimuli comes from a study in which observers adjusted the contrast of a moving grating until it was just visible (Burr, Ross, & Morrone, 1986). High-contrast masking gratings that flickered in temporal counterphase but did not move were added to the test and elevated its threshold.³ When plotted as functions of spatial frequency, the threshold elevation curves always peaked at the spatial frequency of the test grating. When plotted as functions of temporal frequency, however, the threshold elevation curves peaked at a frequency close to that of the test when the test had a high temporal frequency (8 Hz), regardless of the spatial frequency, and were low-pass with constant height from 10 Hz down to 0.3 Hz when the test grating had a low temporal frequency (0.3 Hz).

By measuring how threshold elevation varied with the contrast of the mask, Burr et al. (1986) were able to infer the threshold sensitivity of the mechanisms responsible for detecting the test. They made the assumption that detection was determined by a linear filter, which was followed by a compressive nonlinearity. Consequently, by using the way threshold elevation changes with mask contrast to factor out the nonlinearity, they were able to calculate the spatiotemporal frequency sensitivity of the filter; they were also able to calculate its profile in space–time. Figure 12 shows space–time profiles of the filters responsible for detecting test stimuli of 0.1, 1, and 5 cycles/deg moving at 8 Hz, and 5 cycles/deg moving at 0.3 Hz. In the first three cases the characteristics of the filter are well matched both to the spatial structure of the stimulus and to the changes over time as it moves. In the last case, where the stimulus moves at 0.3 Hz, the spatial structure of the filter matches that of the grating, but its temporal structure shows no selectivity either for moving stimuli against static stimuli or for one direction of motion against the other.

³ A counterphase flickering grating is the sum of two sinusoidal gratings of the same spatial frequency and contrast moving in opposite directions.

FIGURE 12 Space–time contour plots of the spatiotemporal impulse responses of motion detectors in the human visual system inferred from masking experiments. Areas of opposite sign are indicated by dashed and continuous lines respectively. The straight line through the center of each plot represents the speed of the test pattern. Note that the horizontal scale changes by factors of 4 from b to c and again from c to d. (Data from Burr et al., 1986, Figure 6 with permission of the Royal Society of London and authors.)

Burr and his colleagues suggested that it is the combined spatial and temporal selectivity of the motion-selective filters that allows us to analyze the spatial structure of moving objects and to integrate the energy from a moving spot without smearing it; however, the relationship between the outputs of such filters and our perception of the motion of complex patterns is not straightforward. When a simple, briefly presented, moving grating is added to a static grating of lower spatial frequency, the resulting stimulus appears to move in the opposite direction from that in which it actually moves. Subjects are perfectly reliable in discriminating between opposite directions of motion, but consistently wrong in their decision about which direction is which. This illusory reversed motion is at its strongest when the contrast of the moving pattern and that of the static pattern are roughly equal, and well above threshold (Derrington & Henning, 1987a). This suggests that there is some kind of nonlinear interaction between the filters tuned to different spatial frequencies that creates a motion metamer, that is, a stimulus that is moving in one direction but appears as if it is moving in the opposite direction.

2. Physiology

The fact, discussed in section III.A, that different schemes for generating direction selectivity can be rendered exactly equivalent to one another makes it difficult to conceive of psychophysical experiments that would reveal the principles of operation of direction-selective mechanisms. Part of the difficulty here is that in a typical psychophysical experiment, the observer makes a single, usually binary, decision based on a large number of mechanisms. There is no access to the outputs of individual mechanisms. However, in physiological experiments on the mammalian visual cortex, it is possible to record the outputs of single cells that can be represented as direction-selective spatiotemporal frequency filters (Cooper & Robson, 1968; Movshon, Thompson, & Tolhurst, 1978a, 1978b).

Emerson et al. (1987) have shown that the responses of a complex cell to spatiotemporal sequences of bars flashed in different parts of the receptive field can be used to distinguish between different nonlinear filtering operations that might give rise to direction selectivity. Figure 13 shows an example of the responses of a complex cell in cat striate cortex. The pattern of these responses is consistent with what would be produced by the nonopponent level of the motion energy filter, but not the Reichardt detector.

A plausible physiological implementation of the motion energy filter in the complex cell receptive field uses two direction-selective subunits with receptive fields of opposite sign to represent each direction-selective filter in the quadrature pair (Emerson, 1997). Each subunit has a nonlinear output stage that half-wave rectifies and then squares the signal (Heeger, 1991), so that adding together the two complementary subunits gives the effect of a linear filter followed by a squarer. A second pair of filters in quadrature spatial and temporal phase relationship to the first pair completes the model and renders the receptive field model formally identical to the motion-energy filter (Adelson & Bergen, 1985; Emerson, 1997).
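The arithmetic behind this construction can be sketched in a few lines. The example assumes idealized filter outputs: the responses of the two filters of a quadrature pair to a drifting grating are taken to be a cosine and a sine of equal amplitude, and the temporal frequency is an arbitrary choice.

```python
import numpy as np

def half_square(signal):
    """Half-wave rectify, then square (the subunit output nonlinearity)."""
    return np.maximum(signal, 0.0) ** 2

# Idealized responses of a quadrature pair of linear filters to a drifting grating:
# equal amplitude, 90 degrees apart in phase (assumed for illustration).
t = np.linspace(0.0, 1.0, 1000, endpoint=False)
temporal_freq = 4.0                                    # Hz (assumed)
first_filter = np.cos(2 * np.pi * temporal_freq * t)
second_filter = np.sin(2 * np.pi * temporal_freq * t)

# Each linear filter is represented by two subunits of opposite sign.
first_squared = half_square(first_filter) + half_square(-first_filter)
second_squared = half_square(second_filter) + half_square(-second_filter)

# Complementary half-squaring subunits together act like a full squarer...
matches_squarer = np.allclose(first_squared, first_filter ** 2)

# ...and summing over the quadrature pair gives an unmodulated response.
energy = first_squared + second_squared
is_unmodulated = np.allclose(energy, energy[0])
```

Either squared output on its own is modulated at twice the temporal frequency of the grating; only the sum over the quadrature pair is constant over time.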

Simple cells respond to a moving grating with a modulated response whose temporal frequency matches the temporal frequency of the moving grating (Movshon et al., 1978b), so direction-selective simple cells could not possibly be based on the motion energy filter, which gives an unmodulated response to a moving grating (Adelson & Bergen, 1985). However, a number of features of the responses of simple-cell receptive fields suggest that their selectivity for direction of motion could be based, at least in part, on linear filtering like that underlying the motion energy filter.

FIGURE 13 Space–time contour plot of nonlinear motion-selective spatiotemporal interactions in the receptive field of a cortical cell. The plot is produced by measuring the increase (continuous line) or decrease (dashed line) in the response to a conditioning flash produced by a test flash that is presented at a different time and position. The interaction is plotted as a function of the spatial and temporal separation between the test flash and the conditioning flash. The spatiotemporal pattern of facilitation and inhibition represents a nonlinear contribution to direction selectivity. (Replotted from Emerson, Citron, Vaughn, & Klein, 1987, with permission from the American Psychological Society.)

The temporal phase of a simple cell’s response to a flickering sinusoidal grating varies with the spatial phase of the grating, and the temporal impulse response varies with the location of the stimulus in the receptive field in a way that suggests that the linear receptive field is oriented in space–time like the linear motion sensor (Albrecht & Geisler, 1991). Detailed nonlinear analyses of simple cell responses to flickering gratings and to flashing bars in different parts of the receptive field suggest that in fact the simple-cell receptive field may contain a single pair of linear motion filters that have a quadrature spatial and temporal phase relationship with one another (Emerson, 1997; Emerson & Huang, 1997). The resulting receptive fields are very similar to those of the single subunits in linear receptive-field models, except that they give a more sustained response to a stimulus moving in the preferred direction.

The receptive field model of the simple cell contains exactly half the components from the model of the complex receptive field. Missing from the simple cell model are the complementary negative replicas of each motion-filtering subunit
