The Neuropsychology of Vision (Fahle and Greenlee, 2003)
Chapter 1
Vision, behaviour, and the single neuron
Gregor Rainer and Nikos K. Logothetis
Introduction
The single neuron is the fundamental computational element of the brain. By correlating the activity of single neurons with sensory and behavioural events, vast amounts of knowledge about different brain areas and their function in cognition have been gathered. It has proved possible to uncover neural representations of sensory stimuli such as oriented line segments, moving patterns, or complex objects. Neural correlates of cognitive operations such as short-term memory, attention, and planning have also been found. In this chapter we begin by describing some of the historical developments that represent the origins of single-neuron recording and the first attempts to relate the observed neural activity to behaviour. We then outline more specifically the progress that has been made in our understanding of different regions of the monkey brain by recording the activity of single neurons. In many cases, single neurons signal task-relevant information at levels comparable to the behavioural performance of the monkey, and there are remarkable parallels between neural and behavioural performance parameters. While much has been accomplished, much work remains to be done, in particular in providing mechanistic accounts of how the observed phenomena arise. The incorporation of knowledge from other areas of neuroscience such as human neuropsychology and brain imaging, in vitro physiology, or computational modelling promises further progress in unravelling how single neurons underlie cognitive functions and ultimately our mental life.
Historical origins of single-neuron recording
Neurons transmit information to each other by sending unitary events called action potentials down their axons. The action potential was discovered by Edgar Adrian in 1919, working on a preparation consisting of a muscle mechanoreceptor and the associated sensory neuron in the frog. He described properties of the action potential that are still central to our interpretation of neural activity today. Adrian found that stimulation of the receptor caused a series of action potentials to be transmitted down the axon of the associated nerve fibre. Stronger stimulation did not change the properties of
individual action potentials, but rather caused more of them to be transmitted. This is known as a rate code, because information about stimulus intensity is represented by how many action potentials are transmitted (the firing rate of the neuron). Although we have realized since then that the precise timing of action potentials can and does play a role in information transmission and encoding, the idea of the rate code is still a central concept of systems neuroscience today. Adrian also noticed that neurons sometimes emitted action potentials in the absence of any sensory stimulation. He called this background activity, and such activity also represents a general property of neurons in the mammalian brain.
When examining the activity of peripheral sensory neurons, we can be sure that their action potentials represent activations of the corresponding receptors. The question as to what sensory attributes neurons represent becomes more difficult to answer as we enter the central nervous system. In vision, the concept of the receptive field introduced by Haldan Hartline (1938) represented a major advance. Recording from neurons in the optic nerve of the frog, Hartline defined the receptive field of a neuron as that area of visual space where stimulation leads to an increase in the neuron’s firing rate. The receptive field remains an important concept today, and much has been learned about the functions of different areas by comparing the receptive fields of their neurons. The receptive field can be thought of as a window through which a neuron has access to information in the visual field. However, it provides only a basic characterization of a neuron’s response properties. Neurons are also selective for features, in that their firing rate varies as a function of which particular stimulus is presented in their receptive field.
In the 1950s, pioneering work on feature selectivity was carried out by Horace Barlow (1953) and Jerome Lettvin and colleagues (1959). Recording from frog retinal ganglion cells, they put forward the idea that neurons actually communicate information about behaviourally relevant features and not merely about local differences in illumination. For example, they described direction-selective ganglion cells that might be used by the frog to detect flies. This represents perhaps the first attempt to relate the response of single neurons to behaviour, an enterprise that has been a major focus of systems neuroscience since then. Unlike in the frog, retinal ganglion cells in mammals are not direction-selective and code only for local differences in illumination. In mammals, the vast majority of these neurons project to primary visual cortex via the thalamus. A breakthrough in our understanding of primary visual cortex came when, in 1959, David Hubel and Torsten Wiesel discovered that oriented line segments or bars appeared to be the features represented by the primary visual cortex. Working on anaesthetized cats, they found that a given neuron will respond vigorously to a bar oriented in its preferred orientation, but not to an orthogonally oriented bar. Different neurons are optimally activated by bars of different orientations so that across a population of neurons many different orientations are represented. Primary visual cortex can thus be viewed as a filtering device that extracts edge information from the retinal
input. This represents the first and best-understood feature extraction process of the cortical visual system.
Meanwhile, Barlow continued to study the retina and did some of his most influential work on the quantification of neural responses in the retina of the cat (Barlow et al. 1971), using a similar approach to that which Vernon Mountcastle had employed in the somatosensory system (Werner and Mountcastle 1963). Barlow and colleagues were interested in how well single retinal ganglion cells signalled small differences in illumination (Fig. 1.1). To do this, they employed receiver-operating-characteristic (ROC) analyses. Conceptually, ROC analyses provide an estimate in the form of a single number (called the ROC area) of how different the activity of a single neuron is between two conditions. One proceeds by collecting the firing rate of a given neuron for each of two conditions A and B. Because neural activity is characterized by large variability, firing rates will in general not be identical on different trials. Instead, one observes distributions of firing rates for each condition. The ROC area provides an assumption-free measure of how different these distributions are. A value of 0.5 means that they are completely overlapping, whereas a value of 1 means that the distributions are disjoint. The importance of this analysis lies in the fact that the ROC area can be interpreted as the performance of an ideal observer in making statements about the world given only the firing rate of a single neuron. A value of 0.5 indicates that even an ideal observer can only guess as to whether stimulus A or B was present (50% is chance performance for a choice between two possibilities A and B), whereas a value of 1 indicates that an ideal observer could make a correct choice on all trials.
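The ROC area described above can be illustrated with a minimal sketch. It assumes that each condition is summarized by a list of spike counts, one per trial; the function names `roc_curve` and `roc_area` and the example counts are hypothetical and are not taken from Barlow et al. (1971). The criterion is swept over all possible spike counts, and the area under the resulting curve is computed with the trapezoidal rule.

```python
from typing import List, Sequence, Tuple


def roc_curve(counts_a: Sequence[int], counts_b: Sequence[int]) -> List[Tuple[float, float]]:
    """Return (false-alarm, hit) probability pairs, one per criterion.

    counts_a: spike counts on trials of condition A (e.g. no stimulus).
    counts_b: spike counts on trials of condition B (e.g. weak flash).
    """
    max_count = max(max(counts_a), max(counts_b))
    points = []
    # Sweep the criterion from just above the largest count (nothing
    # reaches it, giving the origin) down to zero (everything reaches it).
    for criterion in range(max_count + 1, -1, -1):
        false_alarm = sum(c >= criterion for c in counts_a) / len(counts_a)
        hit = sum(c >= criterion for c in counts_b) / len(counts_b)
        points.append((false_alarm, hit))
    return points


def roc_area(counts_a: Sequence[int], counts_b: Sequence[int]) -> float:
    """Area under the ROC curve, computed with the trapezoidal rule."""
    points = roc_curve(counts_a, counts_b)
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical spike counts, for illustration only.
background = [0, 1, 2]
weak_flash = [1, 2, 3]
print(roc_area(background, weak_flash))
```

For these made-up counts the area is 7/9 (about 0.78). Identical distributions give 0.5 and disjoint distributions give 1, matching the interpretation in the text.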
Consider the experiment of Barlow and colleagues (1971). They assembled firing rate distributions for single ganglion cells under illuminations of (1) zero quanta and (2) five quanta of light, as shown in Fig. 1.1(b). Using ROC analysis, they were able to quantify how well a single ganglion cell communicated information about this small difference in illumination. The ROC analysis provides an estimate of how likely it is that illumination was actually present on a given trial, thus using the response of a single neuron to provide a probability estimate about some state of the external world. By varying the amount of illumination (always compared to the no-illumination condition), one can assemble a sensitivity function for each ganglion cell under study. Different ganglion cells have different sensitivity profiles but, for a given illumination, a particular ganglion cell will have a maximal probability of detecting the stimulus. This led Barlow to formulate the lower envelope principle:
sensory thresholds are set by the class of sensory neuron that has the lowest threshold for a particular stimulus, and they are little influenced by the presence or absence of responses in an enormous number of other neurons that are less sensitive to that stimulus.
The first part of this statement is quite intuitive: it simply states that discrimination performance is limited by how well we can perceive differences using our senses. The second part is more surprising and somewhat controversial, since it says that the performance of the organism is limited not by the average performance of its relevant components but by the performance of the best components. According to this principle, badly performing neurons do not drag down the performance of the entire animal; it can always rely on the best neurons to support behaviour.
[Figure 1.1. Panel (a): impulses/s against time (s), stimulus 5 quanta. Panel (b): PNDs, probability against impulse count N. Panel (c): ROC curve, horizontal axis P(c|R).]
Fig. 1.1 ROC analysis of neurons in the cat retina. (a) Action potential (impulse) count histogram (peristimulus time histogram, PSTH) of a single neuron’s response to a brief flash of light (5 quanta, stimulus duration 10 milliseconds). (b) Firing rate distributions (pulse number distributions, PNDs) for conditions of no stimulus present (0 quanta, shaded distribution), and for a weak flash of light (5 quanta, non-shaded distribution). Each curve represents the frequency (on the ordinate) of having observed a given number of action potentials (on the abscissa) in each of the two cases. (c) Receiver-operating-characteristic (ROC) curve generated from the firing-rate distributions shown in (b). The ROC curve plots the probability of a correct detection (on the ordinate) against the probability of a false alarm (on the abscissa). Each point on the curve represents a different criterion impulse count (c), and thus relates the probability that the random background activity exceeded a criterion spike count, P(c|R), to the probability that the activity elicited by the weak flash exceeded the same criterion. Each numbered point on the curve is generated by varying the criterion (c) from zero to the maximum number of spikes observed (17 in the present case). For example, neither distribution exceeds a criterion of 17, so this would generate a point on the ROC curve at the origin (0, 0). (Modified from Barlow et al. (1971).)
The work of Barlow in the retina was extended by David Tolhurst and co-workers (1983) in the primary visual cortex (V1) of the anaesthetized cat and monkey. Tolhurst and colleagues measured the contrast sensitivity of V1 neurons for sine-wave gratings of optimal spatial frequency, orientation, and drift rate for the neuron under study. They generated ROC curves and measured neural performance thresholds, which they then compared to measures of psychophysical performance of human and monkey observers. There was general agreement between the measures of behavioural and neural performance, although neural performance was somewhat worse than behavioural performance both in terms of threshold and slope. This prompted Tolhurst and colleagues to suggest that, to produce behaviour, the combination of a small number of selective units might be required such that, through a process of probability summation, these units together could perform better than any individual one. The idea is that any individual neural response is noisy, so that it may or may not provide a reliable signal of stimulus presence on any given trial. However, the more neurons are considered together, the more likely it becomes that at least one of them has detected the stimulus, resulting in correct behaviour on that particular trial. More recent results, however, suggest that the discrepancies between neural and behavioural performance may in fact have been due to the receptive field structure of V1 neurons (Hawken and Parker 1990). Using a more detailed model of receptive field structure, Hawken and Parker found quite good agreement between neural and behavioural performance, suggesting that the lower envelope principle holds in area V1. An important caveat in interpreting neural performance measures is that they depend closely on the length of time over which the firing rates of the neuron are computed. The longer this period is, the more accurate the rate estimates become, as long as the process is stationary (i.e. does not change over time). Comparing neural performance between different conditions (e.g. different values of illumination or contrast) is unproblematic, but the absolute thresholds are somewhat arbitrary because of this dependence on integration time.
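The probability-summation idea discussed above can be sketched in its simplest form. The sketch assumes that the units detect the stimulus independently of one another and ignores false alarms; this is an illustrative simplification, not necessarily the model used by Tolhurst and colleagues. The function name `probability_summation` and the example probabilities are hypothetical.

```python
from math import prod
from typing import Sequence


def probability_summation(p_detect: Sequence[float]) -> float:
    """Probability that at least one unit detects the stimulus.

    p_detect: per-unit detection probabilities on a single trial.
    Assumes the units are statistically independent (an assumption
    made here for illustration only).
    """
    # The stimulus is missed only if every unit misses it.
    p_all_miss = prod(1.0 - p for p in p_detect)
    return 1.0 - p_all_miss


# Four hypothetical units, each detecting the stimulus on half the trials.
print(probability_summation([0.5, 0.5, 0.5, 0.5]))  # 0.9375
```

Under these assumptions, pooling four units that are each at 50% detection already yields detection on about 94% of trials, which illustrates how a small group of noisy units could outperform any single one.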
Soon after the discovery of orientation selectivity by Hubel and Wiesel (1959) in the anaesthetized preparation, they and others began to study neural activity in awake and behaving monkeys. The techniques for performing these recordings were pioneered by Edward Evarts (1966), who originally developed them to study motor cortex. A typical set-up is shown in Fig. 1.2, and consists of a device for fixing the head, a recording chamber for insertion of the electrodes, a scleral search coil for monitoring eye movements by means of electromagnetic induction, a screen or monitor for displaying visual stimuli, and a juice system for delivering a reward.
Fig. 1.2 Single-unit recording in the behaving monkey. Monkeys (most often macaques) are seated in primate chairs and often also have buttons or levers at their disposal for making manual responses (not shown). Stainless steel head-holders and recording chambers are implanted under general anaesthesia and sterile conditions, with monkeys receiving postoperative antibiotics and analgesics. Eye position is monitored using a scleral search coil with metal contacts embedded in the head-holder. During recording sessions, the head is held in a steady position by a metal arm (shown above the monkey). Electrodes (usually made of platinum–iridium or tungsten) are inserted into the brain using manually or mechanically driven devices. Electrical signals are appropriately amplified and filtered (devices not shown). A reward is typically delivered in the form of apple juice or another juice, depending on the particular monkey’s preferences.
Many investigators began to study vision-related neural activity in various areas of the monkey brain and to relate it to different behaviours of interest. A detailed account of this work goes beyond the scope of this chapter, but further details on the historical developments can be found elsewhere (Gross 1998; Hubel 1995; Schiller 1986). We continue here by providing an overview of the cortical areas of the monkey that process visual information and discussing some of the relevant literature for each area.
Visual processing areas in the macaque brain
The macaque monkey brain contains more than 30 distinct visual areas (Fig. 1.3). These areas can be divided into a dorsal and a ventral processing stream (Ungerleider and Mishkin 1982), each of which includes primary visual cortex (V1) and adjacent occipital visual areas (V2, V3). Dorsal stream areas located mostly in parietal cortex are thought to be important for the visual guidance of movements towards objects and the coordinate transformations required to perform these movements (Andersen et al. 1997; Colby and Goldberg 1999; Milner and Goodale 1993; see also Chapter 5, this volume). Areas of the ventral stream located in the temporal cortex are thought to be involved in object recognition and long-term memory storage (Gross 1992; Logothetis and Sheinberg 1996; Tanaka 1996; Miyashita 1993). The prefrontal cortex, which receives projections from both processing streams and contains many visually modulated neurons, is thought to play a major role in short-term memory and guiding behaviour (Fuster 1997; Goldman-Rakic 1996; Miller 2000).
[Figure 1.3: map of macaque cortical areas; scale bar 1 cm.]
Fig. 1.3 Map of the cortical areas of the macaque monkey. Visual areas are shown shaded. Occipital visual areas include V1 (primary visual cortex), V2, V3, V3A, VP (ventral posterior), V4, V4t (transitional), and VOT (ventral occipitotemporal). Ventral stream areas include posterior (PIT), central (CIT), and anterior inferior temporal (AIT) cortex (both dorsal and ventral), posterior and anterior STP (superior temporal polysensory), TF, and TH. Dorsal stream areas include MT (middle temporal), dorsal and lateral MST (medial superior temporal), PO (parieto-occipital), 7a, DP (dorsal prelunate), MDP (medial dorsal prelunate), and several intraparietal areas (lateral LIP, ventral VIP, and medial MIP). Finally, the frontal eye fields (FEF) and area 46 of the frontal lobe are considered visual. (Modified from Felleman and Van Essen (1991).)
Ventral stream
The pioneering work carried out by Charles Gross and collaborators in the inferior temporal (IT) cortex represents some of the first single-neuron recordings in the ventral stream. Gross and colleagues discovered that IT neurons tended to have large bilateral receptive fields, typically responded best to objects presented at the centre of gaze, and responded well to complex objects such as faces or hands but not as vigorously to simple spots of light or oriented bars (Gross et al. 1972; Perrett et al. 1982; Desimone et al. 1984). Although many IT neurons were broadly tuned and responded to many different stimuli, some fired in a highly specific fashion to behaviourally relevant objects such as fruit or body parts. The discovery of face-selective neurons suggests that common principles may be at work in neural systems as widely different as the retina of the frog and the temporal lobe of the monkey, namely, that neurons are tuned for features of the world that are important to the animal. It is also of interest for understanding neuropsychological disorders such as prosopagnosia (see Chapter 7, this volume). Indeed, as Desimone and colleagues point out, the strong selectivity for faces may reflect the behavioural requirements of monkeys to make accurate judgements about differences in the facial expressions of other monkeys. This work raises important questions, two of which will be discussed here.
1. How is selectivity for complex stimuli such as faces implemented; can it be described as a combination of responses to more simple components of a complex stimulus?
2. Is face selectivity a result of the extensive experience primates have with other faces, or are face cells privileged and mechanisms for face selectivity fundamentally different from the selectivity for other objects?
Keiji Tanaka and co-workers (1991) conducted careful experiments to investigate complex feature selectivity. They employed a stimulus reduction technique, in which they initially isolated a neuron that was highly selective for a complex object and then simplified the object’s features to find the minimal combination of features that still caused robust firing in that neuron. In a step-by-step procedure they were thus able to determine the minimal response features for many IT neurons. They concluded that, in general, moderately complex features such as filled circles with protrusions or combinations of hatched bars and ovals could elicit robust activity. Even neurons that appeared highly selective for real-world objects could be activated by much simpler features. These findings can be interpreted as evidence for a distributed code for complex objects, such that a given object would be represented by a unique firing pattern of a large ensemble of broadly tuned neurons, rather than very few highly selective neurons.
The question whether the highly specific face cells were fundamentally different from neurons selective for other objects was addressed by Nikos Logothetis and co-workers (1995). They reasoned that, if face cells were a result of extensive experience
with faces, it should be possible to train monkeys to recognize arbitrary objects and then find neurons highly selective for these objects in IT cortex. Logothetis and colleagues employed three-dimensional stimuli that resembled amoebae and paperclips. After extensive training, monkeys were able to recognize these objects from different views. Recording from IT cortex in these trained monkeys, they found neurons with firing characteristics like those shown in Fig. 1.4.
Single IT neurons responded in a highly specific fashion to views of the trained paperclip objects, but showed little or no response to distractor paperclips even though these appeared very similar to the trained examples. In addition, the vast majority of selective IT neurons, like the ones in Fig. 1.4, were tuned to particular views of objects (view-tuned cells), and only very few responded in a view-invariant fashion. These studies have demonstrated that, as a result of experience, IT neurons can become tuned to arbitrary objects such as paperclips. Further evidence that experience can modify neural tuning properties in IT came from Yasushi Miyashita and co-workers. They employed a pair-association task, in which monkeys were trained to associate pairs of complex objects. They found that responses of IT neurons became correlated such that, after training, they tended to show similar responses to associated objects as compared to non-associated ones (Sakai and Miyashita 1991). Together, these and other studies suggest that IT cortex maintains representations of complex objects that are behaviourally important and that even the adult cortex shows plasticity, possibly accounting for improvement of performance after cortical damage (see Chapter 11, this volume). Further clues as to how these representations are formed and which processes may play a role in their formation come from examining activity in intermediate ventral stream areas such as area V4.
First described by Semir Zeki (1973) as the colour-processing area, V4 represents an intermediate processing stage between primary visual and inferior temporal cortex and is an area where neural activity is in fact modulated by both colour and form (Desimone and Schein 1987; Gallant et al. 1993; Pasupathy and Connor 1999). Landmark work concerning a possible function subserved by area V4 was performed by Jeffrey Moran and Robert Desimone (1985). They employed a paradigm in which two objects were simultaneously presented within the receptive field of a V4 neuron. Only one of these objects was relevant for the task; the other one was a distractor that could be ignored. They found that the relevant, attended object captured the response of the neuron and that the irrelevant distractor had little or no influence on the neural response. The importance of this work lies in the fact that identical visual stimulation can result in very different neural activity patterns depending on where the monkey directs his attention. Since then, effects of attention have been further studied in V4 (Connor et al. 1997; McAdams and Maunsell 1999) and also uncovered in other brain regions (Motter 1993). Different hypotheses have been proposed to account for attentional modulations. The strong effect that attention can have on neural responses highlights the fact that visual processing is not simply a passive ‘bottom–up’ process, but that sensory input is interpreted and modified in accordance with the internal state of the
