64 Robert Shapley
son (1966) paper: the existence of a class of nonlinearly summing retinal ganglion cells, the Y cells. Y cells have a classical Center–Surround organization but also receive mainly excitatory input from another retinal mechanism. This third mechanism is excitatory but nonlinear. That is, a Y cell receives a broadly spatially distributed excitatory input from spatial subregions within which there is linear spatial summation but between which signals are added together only if the signals exceed a threshold. Thus these many mechanisms can be thought of as nonlinear subunits within the Y cell’s receptive field (Hochstein & Shapley, 1976). The multiplicity of nonlinear subunits confers upon the Y cell the unusual property of spatial phase invariance at high spatial frequencies. This means that a Y cell’s response to a flashed (or contrast reversed) sine grating is invariant with the spatial phase, that is, the position, of the grating with respect to the receptive field midpoint. A sketch of the array of nonlinear subunits and of a Y cell’s receptive field organization is shown in Figure 5. The nonlinear subunits make a Y cell very sensitive to temporal contrast variation and motion of textured backgrounds. Understanding the nonlinear subunit mechanism in Y-type retinal ganglion cells is useful because similar sorts of nonlinear subunit mechanisms appear also in the visual cortex, in the receptive fields of “complex” cells.
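The phase invariance described above can be illustrated with a toy one-dimensional calculation. This is only a sketch under stated assumptions, not the Hochstein and Shapley model itself: the Gaussian widths, the subunit spacing, the uniform weighting of the subunit pool, and the grating frequency are all hypothetical values chosen for illustration. A single broad linear mechanism gives a response that depends strongly on grating phase (including a null phase), whereas the summed output of many small, half-wave rectified subunits is nearly the same at every phase.

```python
import math

DX = 0.05
XS = [-5.0 + DX * i for i in range(201)]   # 1-D space, arbitrary units

def linear_drive(center, sigma, freq, phase):
    """Linearly sum a sine grating under a Gaussian sensitivity profile."""
    return DX * sum(
        math.exp(-((x - center) / sigma) ** 2)
        * math.sin(2 * math.pi * freq * x + phase)
        for x in XS
    )

def center_response(freq, phase):
    """Response amplitude of one broad, purely linear mechanism."""
    return abs(linear_drive(0.0, 1.0, freq, phase))

def subunit_pool_response(freq, phase):
    """Sum of half-wave rectified outputs of many small linear subunits."""
    centers = [-3.0 + 0.1 * i for i in range(60)]   # hypothetical layout
    return sum(max(0.0, linear_drive(c, 0.2, freq, phase)) for c in centers)

phases = [k * math.pi / 8 for k in range(8)]
linear = [center_response(0.5, p) for p in phases]
pooled = [subunit_pool_response(0.5, p) for p in phases]

for p, lin, pool in zip(phases, linear, pooled):
    print(f"phase {p:5.2f} rad  linear {lin:7.4f}  subunit pool {pool:7.4f}")
```

In this sketch the linear amplitude falls to essentially zero at one phase, while the pooled subunit output changes by only about a percent across phases.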
C. Measuring Receptive Fields—Systems Analysis
The alert reader will have noticed that I have used a few equations to describe the spatial distributions of sensitivity in ganglion cells, but that I have not yet addressed the issue of the theoretical analysis of receptive fields. Analysis and measurement go hand in hand, and so one needs also to address the question, are there preferred stimuli with which to measure the receptive field properties of neurons? This is a controversial and somewhat emotional issue, for reasons that seem to have more to
FIGURE 5 Model of the receptive field of a cat Y-type retinal ganglion cell (Hochstein & Shapley, 1976). There is a conventional Center–Surround receptive field like the DOG model of Figure 2, but also there is an array of nonlinear subunits that excite the Y cell. Within each subunit there is linear spatial and temporal summation of light-evoked signals, but between subunits there is nonlinear summation.
2 Receptive Fields of Visual Neurons |
65 |
do with human history than with science. Perhaps we can lighten up a little. In Molière’s play, Le Bourgeois Gentilhomme, the title character is surprised when he learns that he has been speaking prose all his life! There is an analogy with neurophysiologists who use intuitively simple stimuli like flashing spots, or drifting bars, to study visual receptive fields. Perhaps unknown to themselves, they have been doing systems analysis all their lives! Usually, by counting spikes, or measuring response waveforms averaged with respect to the stimulus, they are implicitly performing a kind of linear systems analysis because, as I show below, they are measuring a first-order correlation with the stimulus. One cannot escape some kind of systems analysis technique if one is attempting to characterize a receptive field—it is like trying to speak without speaking in prose.
The major theoretical question in the area of analyzing receptive fields is how to characterize the way signals are combined by visual neurons. One major question is whether there is linearity of signal summation within receptive fields: whether the elegant equations of the DOG model have any predictive validity. The classical measurements of Hartline (1940), Rodieck and Stone (1965), and Enroth-Cugell and Robson (1966) were all designed to ask whether the ganglion cells summed signals linearly, but they were not stringent tests of the hypothesis of linearity. There have been several studies that have used more rigorous methods derived from Wiener analysis (reviewed in Victor, 1992). In general such studies are based on the use of “dense” stimuli like white noise. To measure the spatial as well as the temporal signal transfer properties, spatiotemporal white noise (like the “snow” on a television set that is receiving no broadcast signal), or some convenient variant, has been used as a stimulus. Cross-correlation of such a white-noise stimulus and the neural response has led to estimates of the spatial and temporal transfer characteristics of the neurons (see for example, Reid et al., 1997; Sakuranaga, Ando, & Naka, 1987; Sakai & Naka, 1995). This technique rests on the theory that in a linear system, the impulse response of the system can be recovered by cross-correlation with white noise. In fact, if we write that the white noise input is W(x,t) and the response of the system (neuron) is R(t), and the spatiotemporal impulse response that characterizes the system (neuron) is h(x,t), then the following equation is true:
$$h(x,t) = \langle W(x,t')\, R(t'+t) \rangle, \qquad \{\mathrm{D}\}$$
where $\langle \cdots \rangle$ denotes the average over $t'$.
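As a numerical illustration of equation {D}, the sketch below drives a known linear spatiotemporal filter with Gaussian white noise of unit variance and recovers the filter by cross-correlating stimulus and response. The kernel values, the number of spatial positions, and the record length are hypothetical; the lag convention assumed here is that the response at time t' + t is correlated with the stimulus at time t'.

```python
import random

random.seed(0)
N_X, N_TAU, N_T = 3, 5, 20000

# Hypothetical "true" impulse response h[x][tau] (3 positions, 5 time lags).
h_true = [
    [0.0, 0.5, 1.0, 0.4, 0.1],
    [0.0, -0.3, -0.6, -0.2, 0.0],
    [0.2, 0.0, 0.0, 0.0, 0.0],
]

# Spatiotemporal white noise W[x][t], zero mean and unit variance.
W = [[random.gauss(0.0, 1.0) for _ in range(N_T)] for _ in range(N_X)]

def linear_response(h, w):
    """R(t) = sum over x and tau of h[x][tau] * w[x][t - tau]."""
    out = []
    for t in range(len(w[0])):
        total = 0.0
        for x in range(len(h)):
            for tau in range(len(h[x])):
                if t - tau >= 0:
                    total += h[x][tau] * w[x][t - tau]
        out.append(total)
    return out

def cross_correlate(w, r, n_tau):
    """Estimate h[x][tau] as the average over t of w[x][t] * r[t + tau]."""
    n_t = len(r)
    return [
        [sum(w_x[t] * r[t + tau] for t in range(n_t - tau)) / (n_t - tau)
         for tau in range(n_tau)]
        for w_x in w
    ]

R = linear_response(h_true, W)
h_est = cross_correlate(W, R, N_TAU)

for x in range(N_X):
    print([round(v, 2) for v in h_est[x]])
```

The estimate approaches the true kernel as the record lengthens. With noise of variance other than one, the cross-correlation would need to be divided by that variance.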
The governing equation {D} for characterizing receptive fields with white noise can be generalized to nonrandom, nonwhite stimuli. In general, the transfer properties of a neuron can be characterized by cross-correlating stimulus with response, even if the stimulus is a flashing spot, or a drifting bar, or a drifting or contrast modulated sine grating. (Technically, one has to take into account the autocorrelation of the stimulus.) This is the reason for writing that all measurements of receptive fields are a kind of systems analysis, whether intentional or not.
Spekreijse (1969) was among the first to use noise correlation techniques to study the visual system, and he used them to characterize the temporal response properties of retinal ganglion cells in the goldfish. In general, such measurements indicate large first-order correlations between stimulus and response, consistent with a mainly linear transduction system. However, the responses of Y cells, and other nonlinear ganglion cell types, cannot be accounted for simply by this first-order analysis, and the full power of nonlinear system identification theory is required to say anything meaningful about the responses of Y cells or similarly nonlinear cells. The reason is that responses in these neurons are not simply the result of processing one stimulus at a time independently of all others, as in a linear system. Rather, nonlinear interaction (for example, coincidence of stimuli, crosstalk between two or more stimuli, and distortion of stimulus waveforms) is characteristic of such neurons. The concept of receptive fields begins to break down here, because the receptive field for a given stimulus is not well defined without specifying the entire stimulus context (Victor & Shapley, 1979; Shapley & Victor, 1979; Victor, 1988, 1992). Volterra or Wiener functional expansions may be of some use in identifying the nature of the nonlinearity and in testing models of functional architecture of such nonlinear systems (Victor, 1992).
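A toy example may make the limits of first-order analysis concrete. In the sketch below, a hypothetical nonlinear unit filters Gaussian white noise linearly and then squares the result. This is an assumed stand-in for a strongly nonlinear cell, not a model taken from the studies cited. The first-order cross-correlation with the stimulus is close to zero even though the unit is driven entirely by the stimulus, whereas a second-order correlation, which measures the interaction between stimuli at two different times, recovers the product of the filter weights.

```python
import random

random.seed(1)
N_T = 20000
kernel = [1.0, 0.5]   # hypothetical filter weights k[0], k[1]
w = [random.gauss(0.0, 1.0) for _ in range(N_T)]

# Nonlinear unit: linear filtering of the noise followed by squaring.
y = [(kernel[0] * w[t] + kernel[1] * w[t - 1]) ** 2 for t in range(1, N_T)]
w_now = w[1:]     # stimulus at the same time as each response sample
w_prev = w[:-1]   # stimulus one time step earlier
n = len(y)
y_mean = sum(y) / n

# First-order correlations at lags 0 and 1: expected to be near zero.
first_order = [
    sum(a * b for a, b in zip(w_now, y)) / n,
    sum(a * b for a, b in zip(w_prev, y)) / n,
]

# Second-order cross term: expected to be near 2 * k[0] * k[1] = 1.0
# for unit-variance Gaussian noise.
second_order_cross = sum(
    (yy - y_mean) * a * b for yy, a, b in zip(y, w_now, w_prev)
) / n

print("first order:", [round(v, 3) for v in first_order])
print("second order cross term:", round(second_order_cross, 3))
```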
D. Lateral Geniculate Nucleus Cell Receptive Fields
The extension of these ideas to neurons in the LGN has shown to what a great extent the LGN neurons’ visual properties are inherited from their retinal excitors (Cleland, Dubin, & Levick, 1971; Derrington & Lennie, 1984; Kaplan & Shapley, 1984). What I would like to emphasize in the space available is how the same ideas that worked so well for retinal ganglion cells have been applied to neurons in the visual cortex. This leads naturally to a discussion of the validity of the use of the receptive field concept in the cerebral cortex.
III. VISUAL CORTEX
Modern neurophysiology of the visual cortex begins with Hubel and Wiesel’s study of visual receptive fields of neurons in cat primary visual cortex (Hubel & Wiesel, 1962). There are three main functional results of this study: (a) cortical neurons respond most vigorously to the motion of elongated contours or bars aligned with a particular orientation in space; (b) there are classes of cortical cells, simple and complex, with simple cells obeying at least qualitatively the rules of linear spatial summation, while complex cells are fundamentally nonlinear; (c) the receptive fields of simple cortical cells, mapped with small flashing spots as Hartline (1940) did in frog retinal ganglion cells, are elongated along the preferred orientation axis. Many of these original findings were extended to the monkey’s striate cortex, and are presumably relevant to the function also of human primary visual cortex, which seems a lot like the monkey’s (De Valois, Morgan, & Snodderly, 1974; De Valois, Albrecht, & Thorell, 1982; Hubel & Wiesel, 1968; Skottun et al., 1991).
A. Simple and Complex Cells
The explanation of the functional importance of the simple–complex classification remains a difficult unsolved problem. Simple cells exhibit qualitative properties of linear summation (De Valois et al., 1982; Maffei & Fiorentini, 1973; Movshon, Thompson, & Tolhurst, 1978; Skottun et al., 1991; Spitzer & Hochstein, 1985), responding, for example, mainly at the fundamental frequency of modulation of the contrast of a pattern. Complex cells are nonlinear in several different ways, mainly in responding to drifting patterns with an elevated mean spike rate, and in responding in a frequency-doubled (“on–off”) manner to contrast modulation (De Valois et al., 1982; Movshon et al., 1978; Spitzer & Hochstein, 1985). These specific characteristics of complex cell responses have been accounted for by fairly simple network models that include a threshold nonlinearity after some spatial filtering. In fact, models of complex cells resemble qualitatively the model of cat retinal ganglion cells of the Y-type, discussed before, in that they all postulate summation of responses from nonlinear subunits of the receptive field. However, the complex cell models have to account also for orientation selectivity and spatial position effects that are more complicated than those observed in Y cells. It certainly must be the case that the neural network that produces the complex cells is a cortical network, but it has some resemblance to the Y cell’s retinal network in its architecture. Much more could be written about what has been done to comprehend complex cells, but for simplicity’s sake, I will focus on what is known about simple cells.
FIGURE 6 Receptive field maps in cat area 17 neurons (after Hubel & Wiesel, 1962). Cortical cells in area 17 of the cat cortex were mapped with small flashing spots in the manner of Hartline (1940) and Kuffler (1953). (A) Map of an LGN response as a function of position. (B, C) Maps of cortical receptive fields of simple cells. Points marked with a + sign indicate regions where incrementing the spots excited the neuron; those marked with a − sign indicate that decrements excited the neuron at that location.
Though I write “simple cells for simplicity’s sake,” in reality there is nothing simple at all about simple cells in the visual cortex. As has been discussed previously (Jones & Palmer, 1987; Tolhurst & Dean, 1990; Shapley, 1994), both in cat and monkey cortex the linear spatial summation that characterizes simple cells cannot be inherited simply from convergence of excitation from many LGN cells onto a single simple cell. The reason is that LGN cells’ responses to effective spatial patterns of moderate contrast are distorted, clipped at zero, by the threshold nonlinearity of the spike-encoding mechanism. They cannot modulate as much below their mean rate as they modulate above it. Therefore, the excitatory input to simple cortical cells from the LGN is highly nonlinear. One possible way to deal with this problem is to postulate direct LGN → cortex inhibition that will linearize the simple cell’s response by a “push–pull” mechanism (Jones & Palmer, 1987; Tolhurst & Dean, 1990). Another is by disynaptic cortico-cortical lateral inhibition (Shapley, 1994). This latter mechanism has already been invoked to explain in part the phenomenon of orientation selectivity in the cortex (Ben-Yishai, Bar-Or, & Sompolinsky, 1995; Bonds, 1989; Sillito, 1975; Somers, Nelson, & Sur, 1995, among others). Whatever the explanation, one should keep in mind the fact that the linearity, or quasi-linearity, of simple cortical cells is the emulation of a linear system by a nonlinear system with a highly nonlinear input from the LGN. One important question for which there is yet no completely satisfactory answer is, why does the cortex go to such lengths to emulate a linear transducer? This is related to the similarly unanswered question about the retina I raised previously, and may be answered by one of the speculations offered there about color and motion computations.
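The clipping problem and the push–pull remedy mentioned above can be sketched with a toy calculation. The rectifying rate function, the spontaneous rate, and the stimulus amplitude below are hypothetical, and the subtraction of an opposite-sign input is only one simple reading of "push–pull". The sketch shows that a single clipped input carries a large second-harmonic distortion, whereas the difference of two clipped inputs of opposite sign modulates symmetrically and has no even-harmonic distortion. With a nonzero spontaneous rate the difference is still not exactly linear, so this illustrates the direction of the effect rather than full linearization.

```python
import math

def lgn_rate(drive, spontaneous=5.0):
    """Hypothetical LGN firing rate: spontaneous rate plus drive, clipped at zero."""
    return max(0.0, spontaneous + drive)

N = 64
stimulus = [20.0 * math.sin(2 * math.pi * k / N) for k in range(N)]

# Excitation from one clipped input versus excitation minus inhibition
# from an input driven by the opposite contrast sign.
on_only = [lgn_rate(s) for s in stimulus]
push_pull = [lgn_rate(s) - lgn_rate(-s) for s in stimulus]

def harmonic_amplitude(signal, n):
    """Amplitude of the n-th harmonic over one full stimulus cycle."""
    size = len(signal)
    c = sum(v * math.cos(2 * math.pi * n * k / size) for k, v in enumerate(signal))
    s = sum(v * math.sin(2 * math.pi * n * k / size) for k, v in enumerate(signal))
    return 2.0 * math.hypot(c, s) / size

print("second harmonic, single clipped input:", harmonic_amplitude(on_only, 2))
print("second harmonic, push-pull:", harmonic_amplitude(push_pull, 2))
```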
B. Orientation Selectivity
Orientation selectivity and its relation to receptive field organization remains an outstandingly important issue in cortical neurophysiology. In simple cells one should expect a direct relationship between receptive field structure and orientation tuning. If simple cells were linear, one would be able to predict orientation tuning selectivity from measurements of the receptive field sensitivity distributions, just as Rodieck (1965) predicted the responses to drifting bars from the sensitivity profiles of retinal ganglion cells’ receptive fields. It was the qualitative agreement of receptive field maps with orientation tuning selectivity that inspired the Hubel–Wiesel feed-forward model (Hubel & Wiesel, 1962). As shown in Figure 6, these maps, derived from responses elicited by flashing spots, were elongated and aligned with the preferred axis of orientation selectivity of the neuron. But the question is, can linear summation of sensitivity explain not just the preferred axis but the observed orientation tuning? This question was asked directly by Jones and Palmer (1987) using a sophisticated quantitative technique.
FIGURE 7 Reverse correlation maps of receptive fields of cat visual cortical neurons (Jones & Palmer, 1987). Reverse correlation maps of cortical receptive fields in cells of cat area 17. (a,b,c) Data from three different neurons. Data are shown in the left-hand column. The “fit” in the middle column is the best fitting 2-D Gabor function (see text). The column on the right-hand side of the figure is the difference between the fit and the data. The data in this figure were measured with a reverse correlation procedure. Every 50 ms a small square of light was flashed in the visual field at random positions with respect to the midpoint of the neuron’s receptive field. Cross-correlation of the spike trains with the stimulus yielded the receptive field maps indicated in the left-hand column (as 3-D projection plots and beneath them, in the boxes, as contour plots).
Jones and Palmer (1987) performed a detailed quantitative test of linearity in cat simple cells by measuring 2-D receptive field properties and predicting the orientation and spatial frequency tuning. They cross-correlated the neuron’s response with a quasi-random input signal as in equation {D}, in order to calculate the neuron’s 2-D spatial impulse response. In these experiments the time variable was suppressed: they took the peak spatial response as characterizing the neuron’s response. In terms of equation {D}, they measured h(x, t_max), at the time t_max that gave maximum response. This experiment was done to test the hypothesis that the neuron was acting as a linear transducer of contrast signals, so the test came when Jones and Palmer attempted to account for the spatial frequency versus orientation tuning surfaces they measured independently on the same neurons with drifting grating patterns. The comparison was done by fitting h(x, t_max), the neural spatial sensitivity distribution, with what they called a 2-D Gabor function, which was an elliptical Gaussian function of space, that is,
$$\exp\{-[(x - x_0)/a]^2 - [(y - y_0)/b]^2\},$$
multiplied by a sinusoidal function of space. They also fit the spatial frequency versus orientation tuning surfaces with the Fourier transform of a 2-D Gabor function. Then they compared the parameters of the fitted functions to test the linearity hypothesis. Though in some cells there was good agreement between the fitted curves, in a majority of their cells there was a clearly visible discrepancy that indicated a significant nonlinearity of spatial summation (this is contrary to the authors’ conclusions from the same data, so I encourage readers to go back to the original paper). Jones and Palmer’s data were replicated by DeAngelis, Ohzawa, and Freeman (1993) for the preferred orientation only; they found what I believe to be a significant discrepancy in many cells between the shape of the predicted (from spatial sensitivity distribution) versus measured spatial frequency responses. There was a systematic narrowing of the spatial frequency response in the data measured with drifting gratings compared with the linear prediction from the spatial impulse response. Thus, although the cortex attempts to emulate a linear system with the neural network that drives a simple cell, it cannot hide the nonlinearity of the cortical network from these very precise experimental tests. This is a second indication that simple cells are not so simple. A third indication comes from the different time courses of cortical responses from different parts of the receptive field, and this is related to the cortical cell property of directional selectivity.
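The logic of the linear prediction can be sketched numerically. The code below builds a 2-D Gabor function of the form given above (an elliptical Gaussian multiplied by a sinusoid) and computes the amplitude of its Fourier transform at the spatial frequency and orientation of a grating, which is what a strictly linear neuron's response amplitude to a drifting grating would follow. All parameter values (a, b, the carrier frequency, the sampling grid) are hypothetical and are not fitted to any data in Jones and Palmer (1987). Here the angle is that of the grating's wave vector, so 0 degrees corresponds to bars parallel to the long axis of the field.

```python
import math

def gabor(x, y, x0=0.0, y0=0.0, a=1.0, b=2.0, freq=0.5, phase=0.0):
    """Elliptical Gaussian envelope multiplied by a sinusoid along x."""
    envelope = math.exp(-((x - x0) / a) ** 2 - ((y - y0) / b) ** 2)
    return envelope * math.cos(2 * math.pi * freq * (x - x0) + phase)

STEP = 0.25
GRID = [-5.0 + STEP * i for i in range(41)]   # sample points, arbitrary units

def predicted_amplitude(freq, theta_deg):
    """Linear prediction: Fourier amplitude of the receptive field at the
    grating's spatial frequency and wave-vector angle."""
    th = math.radians(theta_deg)
    kx, ky = freq * math.cos(th), freq * math.sin(th)
    re = im = 0.0
    for x in GRID:
        for y in GRID:
            weight = gabor(x, y)
            arg = 2 * math.pi * (kx * x + ky * y)
            re += weight * math.cos(arg)
            im += weight * math.sin(arg)
    return math.hypot(re, im) * STEP ** 2

orientations = list(range(-90, 91, 15))
tuning = [predicted_amplitude(0.5, th) for th in orientations]
best = orientations[tuning.index(max(tuning))]
print("predicted preferred angle (deg):", best)
```

Comparing such a predicted tuning curve with the tuning measured directly with drifting gratings is the kind of test described in the text. A systematic mismatch in width, rather than in the preferred value, is what points to nonlinearity.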
C. Direction Selectivity
The specificity of cortical cells for direction of motion is also an emergent property of the visual cortex that could be due to specific receptive field properties. It has been shown that visual neurons can give directionally selective responses if different spatial locations within the receptive field have different time courses of response. Technically, this means that the spatiotemporal impulse response h(x,t) is not factorable into two parts h_x(x) and h_t(t) but rather cannot be separated; hence it is called spatiotemporally inseparable. Spatiotemporal inseparability does cause direction selectivity in completely linear neural models (Watson & Ahumada, 1985; Adelson & Bergen, 1985; Burr, Ross, & Morrone, 1986; Reid, Soodak, & Shapley, 1987, 1991). Because simple cells in visual cortex are often direction selective, it is natural to ask whether spatiotemporal inseparability in the receptive field causes direction selectivity in these neurons. McLean, Raab, and Palmer (1994) answered this directly for cat cortical cells using a reverse correlation approach just like Jones & Palmer’s (1987), with the following improvements: they used a briefer stimulus presentation, and they measured the time evolution of h(x,t) instead of just measuring h(x, t_max). Their results are displayed in Figure 8, which shows h(x,t) for two different cells: a spatiotemporally separable, nondirection-selective neuron, and a spatiotemporally inseparable, directionally selective neuron. Separability is indicated in an x-t plot like Figure 8 as vertical symmetry of the main envelope of h(x,t). Inseparability is indicated by an oblique axis of symmetry in the x-t graph. In general, McLean et al. found direction selectivity and inseparability were associated in cat directional simple cells. This confirmed the previous work of Reid et al. (1987), who demonstrated spatiotemporal inseparability’s correlation with direction selectivity by means of experiments with contrast reversal sine gratings (rather than with the reverse correlation measurements of h(x,t) as was done by McLean et al., 1994).
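Separability has a simple numerical meaning: a sampled h(x,t) is separable exactly when the matrix of its values has rank one. The sketch below quantifies this with the fraction of the kernel's total energy captured by its best rank-one approximation, obtained by power iteration. This index is an illustrative choice and is not claimed to be the measure used by McLean et al. (1994) or Reid et al. (1987); the two kernels and all of their parameters are hypothetical.

```python
import math

XS = [-2.0 + 0.25 * i for i in range(17)]   # space, arbitrary units
TS = [0.02 * j for j in range(11)]          # time after stimulus onset, s

def envelope(x, t):
    """Gaussian envelope in space and time, peaking at x = 0, t = 0.1 s."""
    return math.exp(-x ** 2) * math.exp(-((t - 0.1) / 0.05) ** 2)

# Separable kernel: a spatial profile multiplied by a temporal profile.
separable = [[envelope(x, t) * math.cos(2 * math.pi * 0.5 * x) for t in TS]
             for x in XS]

# Inseparable kernel: the spatial phase shifts with time (slanted x-t plot).
inseparable = [[envelope(x, t)
                * math.cos(2 * math.pi * (0.5 * x - 5.0 * (t - 0.1)))
                for t in TS] for x in XS]

def separability_index(h, iterations=200):
    """Fraction of total energy captured by the best rank-one fit to h."""
    n_t = len(h[0])
    v = [1.0 + 0.1 * j for j in range(n_t)]
    for _ in range(iterations):
        u = [sum(row[j] * v[j] for j in range(n_t)) for row in h]
        v = [sum(h[i][j] * u[i] for i in range(len(h))) for j in range(n_t)]
        norm = math.sqrt(sum(c * c for c in v))
        v = [c / norm for c in v]
    u = [sum(row[j] * v[j] for j in range(n_t)) for row in h]
    top_energy = sum(c * c for c in u)
    total_energy = sum(c * c for row in h for c in row)
    return top_energy / total_energy

print("separable kernel:  ", round(separability_index(separable), 3))
print("inseparable kernel:", round(separability_index(inseparable), 3))
```

An index near one indicates a kernel that factors into h_x(x) and h_t(t); values well below one indicate the kind of space-time slant associated with direction selectivity.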
The relation of the receptive field properties of visual neurons to their direction selectivity is analogous to the relation of orientation tuning to receptive field properties discussed above. There is a qualitative association between a receptive field property—in this case spatiotemporal inseparability—and a visual property, namely direction selectivity. But again, a true test of linearity would be the prediction of the quantitative characteristics of directional selectivity from first-order sensitivity distributions [in this case, the spatiotemporal sensitivity distribution h(x,t)]. When this is done, the preferred direction can be accounted for from measurements of h(x,t), but the measured directional selectivity is usually quite a bit more than that predicted from linear synthesis (McLean et al., 1994; Reid et al., 1987, 1991; Tolhurst & Heeger, 1997). Thus, some nonlinearity in cortical processing is needed to account for the discrepancy, and this is reminiscent of orientation and spatial frequency tuning. It has been proposed in the case of direction selectivity that known cortical nonlinearities (which are known but hardly understood) could account for the nonlinear stage that is needed (Tolhurst & Heeger, 1997). However, as argued below with respect to orientation tuning, it seems that an adequate understanding requires a distributed network model of intracortical interactions and cannot simply be explained by the sort of fairly simple nonlinear box models proposed previously.
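The linear prediction of directional selectivity can also be sketched. Under linearity, the response amplitude to a grating drifting with spatial frequency k and temporal frequency w follows the Fourier amplitude of h(x,t) at (k, w), and reversing the sign of w reverses the direction of drift. The direction index used below, (preferred − null)/(preferred + null), the kernels, and all parameter values are assumptions for illustration only. The text's point is that measured selectivity usually exceeds such a linear prediction.

```python
import math

XS = [-2.0 + 0.25 * i for i in range(17)]   # space, arbitrary units
TS = [0.02 * j for j in range(11)]          # time, s

def h(x, t, slant):
    """Hypothetical kernel; slant = 0 is separable, slant != 0 is not."""
    env = math.exp(-x ** 2 - ((t - 0.1) / 0.05) ** 2)
    return env * math.cos(2 * math.pi * (0.5 * x - slant * (t - 0.1)))

def linear_amplitude(kernel, k, w):
    """Fourier amplitude of the sampled kernel at spatial frequency k
    and temporal frequency w."""
    re = im = 0.0
    for x in XS:
        for t in TS:
            value = kernel(x, t)
            arg = 2 * math.pi * (k * x + w * t)
            re += value * math.cos(arg)
            im += value * math.sin(arg)
    return math.hypot(re, im)

def direction_index(kernel, k=0.5, w=5.0):
    """(preferred - null) / (preferred + null) from the linear prediction."""
    r_a = linear_amplitude(kernel, k, -w)
    r_b = linear_amplitude(kernel, k, +w)
    preferred, null = max(r_a, r_b), min(r_a, r_b)
    return (preferred - null) / (preferred + null)

di_separable = direction_index(lambda x, t: h(x, t, 0.0))
di_slanted = direction_index(lambda x, t: h(x, t, 5.0))
print("separable kernel DI:  ", round(di_separable, 3))
print("inseparable kernel DI:", round(di_slanted, 3))
```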
FIGURE 8 Space-time coupling in cat cortical cell receptive fields (McLean, Raab, & Palmer, 1994). Reverse correlation maps of cortical receptive fields in cells of cat area 17, as in Figure 7, but here briefer stimuli (usually 20 ms in duration) were used, so the time evolution of the sensitivity distribution was measured. The sensitivity distributions are represented as contour plots. In (a) and (b) the data are from a nondirectionally selective neuron, whereas (d) and (e) are data from a directionally selective cell. (a) and (d) are x-y plots of sensitivity as a function of position, at the peak time about 50 ms after stimulus onset. (b) and (e) are x-t plots of the sensitivity as a function of x position (averaged over the y-dimension) and of time after stimulus onset. The point of the figure is to show that the directionally selective cell [data from it are in (d), (e), and (f)] has a slanted x-t plot, indicating spatiotemporal inseparability in the directional neurons.
In general, all explanations offered so far for the nonlinearities of cortical visual processing include a major role of lateral interaction or feedback interaction. Nonlinear feedback models cannot be accommodated easily within the usual concepts of receptive fields because a nonlinear feedback is usually adaptive in a complex manner. With such a feedback term added, the “receptive field” becomes less useful as an explanatory concept, because which receptive field is meant: the one under condition A or the one under condition B? So, to the degree that nonlinear
