2 Biological Background
In later chapters, computational simulations are presented that describe how the human visual system develops and functions. In order to make such simulations a useful tool for understanding natural systems, they are based on detailed anatomical, neurophysiological, and psychological evidence. In this chapter, the organization of the visual system in humans and higher animals is reviewed, and biological evidence is discussed for structures and processes that are important for later chapters, such as lateral connections, externally and internally driven development, and temporal coding. Computational principles for modeling these phenomena are reviewed in the next chapter. Biological evidence for each specific phenomenon modeled will be reviewed in each chapter separately, and the general biological foundations of the model are evaluated in Chapter 16.
2.1 Visual System Organization
The adult visual system has been studied experimentally in a number of mammalian species, including human, monkey, cat, ferret, and tree shrew. For a variety of reasons, many of the important results have been measured in only one or a subset of these species, but they are generally expected to apply to the others as well. This book focuses on the human visual system, but also relies on data from these animals where human data are not available.
Figure 2.1 shows a diagram of the main feedforward pathways in the human visual system (see e.g. Daw 1995; Kandel et al. 2000; Wandell 1995 for reviews). Other mammalian species have a similar organization. During visual perception, light entering the eye is detected by the retina, an array of photoreceptors and related cells on the inside of the rear surface of the eye. The cells in the retina encode the light levels at a given location as patterns of electrical activity in neurons called ganglion cells. This activity is called visually evoked activity. Retinal ganglion cells are densest in a central region called the fovea, corresponding to the center of gaze; they are much less dense in the periphery. Output from the ganglion cells travels through neural connections to the lateral geniculate nucleus of the thalamus, or LGN, at the base
[Figure 2.1 diagram. Labels: Visual field, Left, Right, Left eye, Right eye, Optic chiasm, Left LGN, Right LGN, Primary visual cortex (V1)]
Fig. 2.1. Human visual pathways (top view). Visual information travels in separate pathways for each half of the visual field. For example, light entering the eye from the right hemifield reaches the left half of the retina, on the rear surface of each eye. The right hemifield inputs from each eye join at the optic chiasm, and travel to the LGN of the left thalamus, then to primary visual cortex, or area V1, of the left hemisphere. Signals from each eye are kept segregated into different neural layers in the LGN, and are combined in V1. There are also smaller pathways from the optic chiasm and LGN to other subcortical structures, such as the superior colliculus and pulvinar (not shown).
of each side of the brain. From the LGN, the signals continue to the primary visual cortex, or V1 (also called striate cortex and area 17) at the rear of the brain. V1 is the first cortical site of visual processing; the previous areas are termed subcortical. The output from V1 goes on to many different higher cortical areas, including areas that underlie object and face processing (see e.g. Merigan and Maunsell 1993; Van Essen, Anderson, and Felleman 1992 for reviews). Much smaller pathways also go from the optic nerve and LGN to subcortical structures such as the superior colliculus and pulvinar. In humans these subcortical pathways are involved primarily in eye movements and attention (LaBerge 1995; LaBerge and Buchsbaum 1990; Wallace, McHaffie, and Stein 1997). The LISSOM model focuses on V1 and the structures to which it connects, as reviewed below.
2.1.1 Early Visual Processing
At the photoreceptor level, the representation of the visual field is much like an image, but significant processing of this information occurs in the subsequent subcortical and early cortical stages (see e.g. Daw 1995; Kandel et al. 2000 for reviews).
First, retinal ganglion cells perform a type of edge detection on the input, responding most strongly to borders between bright and dark areas. Figure 2.2a,b illustrates the two main types of such neurons, ON-center and OFF-center. An ON-center retinal ganglion cell responds most strongly to a spot of light surrounded by
[Figure 2.2 panels: (a) ON cell in retina or LGN; (b) OFF cell in retina or LGN; (c) 2-lobe V1 simple cell; (d) 3-lobe V1 simple cell; (e) Spatiotemporal RF of a V1 cell, shown at Time 0, Time 1, Time 2, and Time 3]
Fig. 2.2. Receptive field types in retina, LGN and V1. Each diagram shows a receptive field on the retina for one neuron. Areas of the retina where light spots excite this neuron are plotted in white (ON areas), areas where dark spots excite it are plotted in black (OFF areas), and areas with little effect are plotted in medium gray. The size of the RFs varies, but they all have the same basic shape and they are all spatially localized, i.e. their ON and OFF areas cover a small specific portion of the retina. (a) ON cells are found in the retina and LGN, and prefer light areas surrounded by dark. (b) OFF cells have the opposite preferences, responding most strongly to a dark area surrounded by light. RFs for both ON and OFF cells are isotropic, i.e. have no preferred orientation. Starting in V1, most cells in primates have orientation-selective RFs instead. The V1 RFs can be classified into a few basic spatial types, of which the two most common are shown above: (c) a two-lobe arrangement, favoring a 45° edge with dark in the upper left and light in the lower right, and (d) a three-lobe pattern, favoring a 135° white line against a dark background. Both types of RF are often represented with Gabor functions (Daugman 1980; Jones and Palmer 1987). RFs of all orientations are found in V1, but those representing the cardinal axes (horizontal and vertical) are more common. Many neurons are also sensitive to the direction of movement of these patterns, i.e. their RFs are spatiotemporal. For such a neuron, successive snapshots of the spatial RF at different times are shown in (e); together they form a spatiotemporal RF selective for a vertical light bar moving to the right. A model for the ON and OFF cells will be introduced in Chapter 4 and for the simple and spatiotemporal V1 cells in Chapter 5.
dark, located in a region of the retina called its receptive field, or RF. An OFF-center ganglion cell instead prefers a dark area surrounded by light. The size of the preferred spot determines the spatial frequency preference of the neuron; neurons preferring large spots have a low preferred spatial frequency, and vice versa.
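The center-surround preference described above can be illustrated with a difference of Gaussians, a common idealization of such receptive fields. The following is only a minimal sketch under stated assumptions: the function names, the grid size, and the widths `sigma_center` and `sigma_surround` are hypothetical choices made for illustration, not values taken from this chapter.

```python
import math


def gaussian2d(x, y, sigma):
    """Unit-volume 2D Gaussian evaluated at retinal offset (x, y)."""
    return math.exp(-(x * x + y * y) / (2.0 * sigma * sigma)) / (2.0 * math.pi * sigma * sigma)


def on_center_rf(x, y, sigma_center=1.0, sigma_surround=3.0):
    """Assumed ON-center weight: narrow excitatory center minus broad inhibitory surround."""
    return gaussian2d(x, y, sigma_center) - gaussian2d(x, y, sigma_surround)


def off_center_rf(x, y, sigma_center=1.0, sigma_surround=3.0):
    """Assumed OFF-center weight: the ON-center profile with its sign flipped."""
    return -on_center_rf(x, y, sigma_center, sigma_surround)


def response(rf, image, half_width=12):
    """Sum of RF weight times image intensity over a square patch centered on the RF."""
    total = 0.0
    for y in range(-half_width, half_width + 1):
        for x in range(-half_width, half_width + 1):
            total += rf(x, y) * image(x, y)
    return total


def bright_spot(x, y, radius=1.5):
    """Light spot on a dark background."""
    return 1.0 if x * x + y * y <= radius * radius else 0.0


def dark_spot(x, y, radius=1.5):
    """Dark spot on a light background."""
    return 1.0 - bright_spot(x, y, radius)


def uniform_light(x, y):
    """Evenly lit field with no borders."""
    return 1.0


if __name__ == "__main__":
    for name, image in [("bright spot", bright_spot),
                        ("dark spot", dark_spot),
                        ("uniform light", uniform_light)]:
        print(name, response(on_center_rf, image), response(off_center_rf, image))
```

In this sketch the ON unit is driven by the bright spot, the OFF unit by the dark spot, and neither responds much to uniform light, because center and surround nearly cancel. Scaling both widths up would make the unit prefer larger spots, i.e. lower spatial frequencies.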
Neurons in the LGN have properties similar to retinal ganglion cells, and are also arranged retinotopically, so that nearby LGN cells respond to nearby portions of the retina. The ON-center cells in the retina connect to the ON cells in the LGN,
and the OFF cells in the retina connect to the OFF cells in the LGN. Because of this independence, the ON and OFF cells are often described as separate processing channels: the ON channel and the OFF channel.
2.1.2 Primary Visual Cortex
Like LGN neurons, nearby neurons in V1 also respond to nearby portions of the retina and are selective for spatial frequency. Unlike LGN neurons, most V1 neurons are binocular, responding to some degree to stimuli from either eye, although they usually prefer one eye or the other. They are also selective for the orientation of the stimulus and its direction of movement. In addition, some V1 cells prefer particular color combinations (such as red/green or blue/yellow borders), and disparity (relative positions on the two retinas). V1 neurons respond most strongly to stimuli that match their feature preferences, although they respond to approximate matches as well (Hubel and Wiesel 1962, 1968; see Ringach 2004 for a review). Figure 2.2c–e shows examples of typical RFs of V1 neurons for static and moving stimuli. These neurons are simple cells, i.e. neurons whose ON and OFF regions are located at specific areas of the retinal field. Other neurons (complex cells) respond to the same configuration of light and dark over a range of positions. LISSOM models the simple cells only, which are thought to be the first in V1 to show orientation selectivity.
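Since simple-cell RFs are often represented with Gabor functions (see the caption of Figure 2.2), orientation selectivity can be illustrated with a small numerical sketch. Everything specific here is an assumption for illustration: the parameter values, the convention that an orientation of 0 means horizontal, and the helper names `gabor_rf`, `grating`, and `grating_response`.

```python
import math


def gabor_rf(x, y, theta, wavelength=6.0, sigma=3.0, phase=0.0):
    """Gabor weight at (x, y): a sinusoid under a circular Gaussian envelope.

    theta is the preferred orientation in radians (0 = horizontal).
    phase=0 gives an even, three-lobe profile; phase=pi/2 gives an odd,
    two-lobe (edge-like) profile.
    """
    # Coordinate measured perpendicular to the preferred orientation.
    u = -x * math.sin(theta) + y * math.cos(theta)
    envelope = math.exp(-(x * x + y * y) / (2.0 * sigma * sigma))
    return envelope * math.cos(2.0 * math.pi * u / wavelength + phase)


def grating(x, y, theta, wavelength=6.0):
    """Sinusoidal grating whose bars run along orientation theta."""
    u = -x * math.sin(theta) + y * math.cos(theta)
    return math.cos(2.0 * math.pi * u / wavelength)


def grating_response(rf_theta, grating_theta, wavelength=6.0, half_width=12):
    """Linear response of the Gabor RF to a grating, summed over a square patch."""
    total = 0.0
    for y in range(-half_width, half_width + 1):
        for x in range(-half_width, half_width + 1):
            total += gabor_rf(x, y, rf_theta, wavelength) * grating(x, y, grating_theta, wavelength)
    return total


if __name__ == "__main__":
    for degrees in (0, 45, 90):
        print(degrees, grating_response(0.0, math.radians(degrees)))
```

The response is largest when the grating matches the unit's preferred orientation and falls off as the stimulus rotates away, so approximate matches still produce a weaker response, as described above. This purely linear sketch omits binocularity, direction selectivity, and the nonlinearities of real neurons.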
V1, like the other parts of the cortex, is composed of a two-dimensional, slightly folded sheet of neurons and other cells. If flattened, human V1 would cover an area of nearly four square inches. It contains at least 150 million neurons, each making hundreds or thousands of specific connections with other neurons in the cortex and in subcortical areas like the LGN (Wandell 1995). The neurons are arranged in six layers with different anatomical characteristics (using Brodmann’s scheme for numbering laminations in human V1, as described by Henry 1989; Figure 2.6). Input from the thalamus goes through afferent connections to V1, typically terminating in layer 4 (Casagrande and Norton 1989; Henry 1989). Neurons in the other layers form local connections within V1 or connect to higher visual processing areas. For instance, many neurons in layers 2 and 3 have long-range lateral connections to the surrounding neurons in V1 (Gilbert et al. 1990; Gilbert and Wiesel 1983; Hirsch and Gilbert 1991). There are also extensive feedback connections from higher areas (Van Essen et al. 1992). Lateral connections play a central role in the LISSOM model, and will be discussed in detail in Section 2.2.
At a given location on the cortical sheet, the neurons in a vertical section through the cortex respond most strongly to the same eye of origin, stimulus orientation, spatial frequency, and direction of movement. It is customary to refer to such a section as a column (Gilbert and Wiesel 1989). The LISSOM model will treat each column as a single unit, thus representing the cortex as a purely two-dimensional surface. This model is a useful approximation because it greatly simplifies the analysis while retaining the basic functional features of the cortex.
Nearby columns generally have similar, but not identical, preferences; slightly more distant columns have more dissimilar preferences. Preferences repeat at regular intervals (approximately 1–2 mm) in every direction, which ensures that each
Fig. 2.3. Measuring cortical maps. Optical imaging techniques allow neuronal preferences to be measured for large numbers of neurons at once (Blasdel and Salama 1986). In such experiments, part of the skull of a laboratory animal is removed by surgery, exposing the surface of the visual cortex. Visual patterns are then presented to the eyes, and a video camera records either light absorbed by the cortex or light given off by voltage-sensitive fluorescent chemicals that have been applied to it. Depending on the neural activity, there will be small differences in the emitted or reflected light, and these differences can be amplified by repeated presentations and averaging. The results are an indirect measure of the average two-dimensional pattern of neural activity resulting from a particular stimulus. Measurements can then be compared between different stimulus conditions, e.g. different orientations, determining which stimulus is most effective at activating each small patch of neurons. Figure 2.4 and later figures in this chapter will show maps of orientation preference computed using these techniques. Adapted from Weliky et al. (1995).
type of preference is represented across the retina. This arrangement of preferences forms a smoothly varying map for each dimension. For example, stimulus orientation is represented across the cortex in an orientation map of the retinal input (Blasdel 1992a; Blasdel and Salama 1986; Grinvald, Lieke, Frostig, and Hildesheim 1994; Ts’o, Frostig, Lieke, and Grinvald 1990). Figure 2.3 shows how such maps can be measured experimentally in animals, and Figure 2.4 displays an example orientation map from monkey cortex. In an orientation map, each location on the retina is mapped to a region on the map, with each possible orientation at that retinal location represented by different but nearby orientation-selective cells. Other mammalian species have largely similar orientation maps, although they differ in details (Müller, Stetter, Hübener, Sengpiel, Bonhoeffer, Gödecke, Chapman, Löwel, and Obermayer 2000; Rao, Toth, and Sur 1997).
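One way to turn responses measured under several stimulus orientations into a preference and a selectivity value for each cortical location is vector averaging over doubled angles. This is a sketch of that generic technique, assumed here for illustration; the source only states that conditions are compared to find the most effective stimulus, and does not confirm that the maps in the figures were computed this way. The function names and the example responses are hypothetical.

```python
import math


def most_effective_orientation(responses):
    """Simplest estimate: the tested orientation with the largest response."""
    return max(responses, key=responses.get)


def orientation_preference(responses):
    """Vector-average estimate for one cortical location.

    responses maps stimulus orientation in degrees (0 to 180) to a
    non-negative mean response. Returns (preferred orientation in degrees,
    selectivity between 0 and 1).
    """
    sum_x = sum_y = total = 0.0
    for theta_deg, r in responses.items():
        # Angles are doubled because orientation repeats every 180 degrees.
        angle = math.radians(2.0 * theta_deg)
        sum_x += r * math.cos(angle)
        sum_y += r * math.sin(angle)
        total += r
    preferred = (math.degrees(math.atan2(sum_y, sum_x)) / 2.0) % 180.0
    selectivity = math.hypot(sum_x, sum_y) / total if total > 0 else 0.0
    return preferred, selectivity


if __name__ == "__main__":
    tuned = {0: 0.2, 45: 1.0, 90: 0.2, 135: 0.0}    # hypothetical responses
    flat = {0: 0.5, 45: 0.5, 90: 0.5, 135: 0.5}     # hypothetical responses
    print(most_effective_orientation(tuned))
    print(orientation_preference(tuned))
    print(orientation_preference(flat))
```

A location that responds equally to all orientations yields a selectivity near zero, and its computed preference is then essentially arbitrary.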
Other stimulus features are represented in a similar fashion as maps, including those for direction of motion and ocular dominance (left or right eye preference; Blasdel 1992a; Crowley and Katz 2000; Löwel 1994; Obermayer and Blasdel 1993; Shatz and Stryker 1978; Shmuel and Grinvald 1996; Weliky, Bosking, and Fitzpatrick 1996). These maps are overlaid so that a hierarchical representation of the input features emerges (Figure 2.5). The primary organization in the hierarchy is retinotopy. Neurons that respond to the same location are divided into those that respond primarily to the left eye and those that respond primarily to the right eye. Each
[Figure 2.4 panels: (a) Orientation preference; (b) Orientation selectivity]
Fig. 2.4. Orientation map in the macaque. (a) Orientation preference and (b) orientation selectivity maps in a 7.5 mm × 5.5 mm area of adult macaque monkey V1, measured by optical imaging techniques. Each neuron in (a) is colored according to the orientation it prefers, using the color key on top. Nearby neurons in the map generally prefer similar orientations, forming groups of the same color called iso-orientation patches. Other qualitative features are also found. Linear zones are straight lines along which the orientations change continuously, like a rainbow; a linear zone is marked with a long white rectangle. Pinwheels are points around which orientations change continuously. They often occur in matched pairs: such a pair is circled in white. At saddle points a long patch of one orientation is nearly bisected by another; one saddle point is marked with a bowtie. Fractures are sharp transitions from one orientation to a very different one; a fracture between red and blue (without purple in between) is marked with a white square. Orientation selectivity measures how closely the input must match the neuron’s preferred orientation for it to respond. As shown in (b), neurons at pinwheel centers and fractures tend to be less selective (dark areas) in the optical imaging response, whereas iso-orientation patches, linear zones and saddle points tend to be more selective (light areas). Reprinted with permission from Blasdel (1992b), copyright 1992 by the Society for Neuroscience; annotations added and brightness increased.
group is further divided into areas that respond to particular orientations. In turn, each orientation-selective patch is often further subdivided into two patches, each preferring opposite directions of motion (Shmuel and Grinvald 1996; Weliky et al. 1996). Other stimulus features (such as spatial frequency and color) are represented as well, but are not as well organized at the large scale (Issa, Trepel, and Stryker 2001; Landisman and Ts’o 2002b). Simulations with LISSOM will show how the hierarchical map-like organization arises automatically from input-driven self-organization, and how it constitutes an efficient way to represent visual information.
2.1.3 Face and Object Processing
Beyond V1 in primates are dozens of extrastriate visual areas that can be arranged into a rough hierarchy (Van Essen et al. 1992). The relative locations of the areas in this hierarchy are largely consistent across individuals of the same species. Nonprimate species have fewer higher areas, and in at least one mammal (the least shrew,
[Figure 2.5 panels: (a) Orientation preference; (b) Ocular dominance]
Fig. 2.5. Hierarchical organization of feature preferences in the macaque. The images illustrate orientation and ocular dominance patches in a 4 mm × 3 mm area of the cortical surface in the macaque monkey, measured through optical imaging. (a) The cells are colored according to their orientation preference as in Figure 2.4a. (b) The same cells are colored in gray scale from white to black according to how strongly they prefer input from the left vs. the right eye. Each neuron is sensitive to a combination of feature values, in this case a line of a particular orientation in the left or the right eye at a particular location on the visual field. These maps are shown superimposed in Figure 5.3, revealing more fine-grained interactions between the maps. Plot (a) reprinted with permission from Blasdel (1992b) and plot (b) from Blasdel (1992a), copyright 1992 by the Society for Neuroscience.
a tiny rodent-like creature) V1 is the only visual area (Catania, Lyon, Mock, and Kaas 1999). Although the higher levels have not been studied as thoroughly as V1, the basic circuitry within each region is thought to be largely similar to V1. Even so, the functional properties differ greatly, in part because their connections with other regions are different. For instance, neurons in higher areas tend to have larger retinal receptive fields, respond to stimuli at a greater range of positions, and process more complex visual features (Ghose and Ts’o 1997; Haxby, Horwitz, Ungerleider, Maisog, Pietrini, and Grady 1994; Kandel et al. 2000; Rolls 2000; Wang, Tanaka, and Tanifuji 1996). In particular, extrastriate cortical regions that respond preferentially to faces have been found both in adult monkeys (using single-neuron studies and optical imaging; Gross, Rocha-Miranda, and Bender 1972; Hasselmo, Rolls, and Baylis 1989; Rolls 1992; Rolls, Baylis, Hasselmo, and Nalwa 1989; Wang et al. 1996) and adult humans (using functional magnetic resonance imaging, or fMRI; Halgren, Dale, Sereno, Tootell, Marinkovic, and Rosen 1999; Kanwisher, McDermott, and Chun 1997; Puce, Allison, Gore, and McCarthy 1995). Any such cell or region that responds more strongly to faces than to other similar stimuli is called face selective.
The face-selective areas receive visual input via V1. They are loosely segregated into different regions that process faces in different ways. For instance, some areas perform face detection, i.e. respond unspecifically to many facelike stimuli (de Gelder and Rouw 2000, 2001). Others selectively respond to facial expressions, gaze
directions, or prefer specific faces (i.e. perform face recognition; Perrett 1992; Rolls 1992; Sergent 1989; Treves 1997). Whether these regions are exclusively devoted to face processing, or also process other common objects, remains controversial (Hanson, Matsuka, and Haxby 2004; Haxby, Gobbini, Furey, Ishai, Schouten, and Pietrini 2001; Kanwisher 2000; Tarr and Gauthier 2000). LISSOM will model areas involved in face detection (and not face recognition or other types of face processing), although these areas do not have to process faces exclusively.
2.1.4 Input-Driven Self-Organization
The first hints of how these complicated yet orderly structures come about in the cortex were discovered in the 1960s. At that time, Hubel, Wiesel, and their colleagues conducted a number of experiments showing that altering the visual environment drastically changes the organization of the visual cortex (Hubel and Wiesel 1962, 1974; Hubel et al. 1977). For example, if a kitten’s vision is impaired by suturing the eyes shut, the visual cortex becomes disorganized, lacking orientation selectivity and ocular dominance patches. Such an effect is most dramatic during the critical period, typically within a few weeks after birth: If the eyes are kept shut until after the critical period, the animal actually becomes blind. If the animal (e.g. a ferret) is reared in the dark instead of having its eyes sutured shut, the visual system becomes similarly impaired, although to a lesser extent (White et al. 2001), suggesting that abnormal visual stimulation through the closed eyelids is more harmful than receiving none at all. These results show how important normal visual stimuli are during the critical period to ensure that the visual system develops normally.
Development has been shown to depend on input in several more specific experiments as well. For example, kittens can be raised in an environment with only vertical or horizontal features, and as a result, they are unable to respond well to other orientations (Blakemore and Cooper 1970; Blakemore and van Sluyters 1975; Hirsch and Spinelli 1970). Similar results have been reported for ocular dominance in ferrets: If one eye is sutured shut during the critical period, the animal loses the ability to respond to inputs from that eye as an adult (Issa, Trachtenberg, Chapman, Zahs, and Stryker 1999). Further, the auditory cortex has been shown to become sensitive to visual inputs when the projections from the retina are surgically connected to it (Sharma, Angelucci, and Sur 2000; Sur, Garraghty, and Roe 1988).
These experimental results convincingly demonstrate that the connections in the cortex are shaped by environmental input. Part II of the book focuses on understanding the mechanisms underlying this process, showing that input-driven self-organization is able to construct the observed structures even from an initially uniform, unordered starting point, based on suitable input. However, how much of the organization is indeed due to environmentally driven self-organization and how much is genetically determined is open to considerable debate (as will be reviewed in Section 2.3). A solution to this question is proposed in Part III, showing how genetically specified self-organization followed by environmentally driven self-organization can account for many of the observed phenomena in visual development.
2.2 Lateral Connections
As was discussed in Section 1.1, the modern understanding of the visual cortex as a continuously adapting dynamic system has caused us to reconsider the role of lateral connections in cortical development and function. Lateral interactions seem to play a much larger role than previously believed, a role that we are only now beginning to understand. Because complex recurrent systems are difficult to study experimentally, computational models are crucial in developing a detailed theory about lateral connections in the cortex.
LISSOM is the first computational theory specifically designed for this purpose. It allows self-organization and analysis of lateral connections to take place in a functioning visual cortex model. The biological foundations of the LISSOM approach are discussed below, followed by a review of current ideas about the role of lateral connections in the cortex (for more details, see e.g. Sirosh, Miikkulainen, and Choe 1996b).
2.2.1 Organization
Long-range lateral connections form a dense, highly patterned network within the cortex. Each connection extends over several millimeters and gives rise to clusters of axon endings at regular intervals (Figure 2.6; Fisken, Garey, and Powell 1975; Gilbert and Wiesel 1979; Schwark and Jones 1989). In the primary visual cortex these connections can be 6–8 mm long, i.e. cover a substantial percentage of the V1 area. They are reciprocal, i.e. if area A connects to B, then B connects back to A. Long-range connections are found in layers 2, 3, 5, and 6; they are longest in layers 2 and 3. The lateral connection patterns in the different layers are aligned, and the dendritic arbor of pyramidal cells in layer 3 matches the axonal clusters (Burkhalter and Bernardo 1989; Gilbert and Wiesel 1989; Katz and Callaway 1992; Livingstone and Hubel 1984b; Luhmann, Martínez Millán, and Singer 1986; Lund, Yoshioka, and Levitt 1993; Rockland 1985; Rockland, Lund, and Humphrey 1982; see Douglas and Martin 2004 for a review).
About 80% of the long-range connections synapse on excitatory pyramidal cells, while the remaining 20% synapse on inhibitory interneurons (Gilbert et al. 1990; McGuire et al. 1991). Imaging studies and other measurements indicate a substantial amount of long-range inhibition in the cortex, more than predicted by the above 80–20 distribution; moreover, at high contrasts the net effect is strongly inhibitory (Section 16.1.4; Grinvald et al. 1994; Hata, Tsumoto, Sato, Hagihara, and Tamura 1993; Hirsch and Gilbert 1991; Weliky et al. 1995). One fundamental assumption of the LISSOM model is that the lateral excitatory and inhibitory connections serve different roles in the visual cortex; both kinds of connections are therefore included in the LISSOM models in this book.
Long-range lateral connections are clustered in patches whose distribution corresponds closely to the organization of receptive fields in the sensory map, especially orientation. The connections of a given neuron target neurons in other areas that have similar orientation preferences, aligned along the preferred orientation of the neuron
Fig. 2.6. Long-range lateral connections in the macaque. Lateral connections, also sometimes called horizontal or intrinsic connections, run parallel to the cortical surface. In the visual cortex they extend over several millimeters and sprout branches at intervals. The branches form a local cluster of connections to other cells in the region, as shown for this layer 3 pyramidal cell in the macaque visual cortex (injected with horseradish peroxidase; the dendrites are shown with thick lines and axon collaterals with thin lines, and the horizontal scale is approximately 2.3 mm). Such clusters occur only in regions whose functional properties are similar to those of the parent cell. Reprinted with permission from Gilbert et al. (1990; adapted from McGuire et al. 1991), copyright 1990 by Cold Spring Harbor Laboratory Press.
(Figure 2.7; Bosking et al. 1997; Fitzpatrick, Schofield, and Strote 1994; Gilbert et al. 1990; Gilbert and Wiesel 1989; Malach, Amir, Harel, and Grinvald 1993; Schmidt, Kim, Singer, Bonhoeffer, and Löwel 1997; Sincich and Blasdel 2001; Weliky et al. 1995). In the immediate vicinity of each neuron, the connection patterns are relatively unspecific, but over larger distances they closely follow the orientation preferences. To a lesser degree, the patterns are also shaped by other perceptual features such as ocular dominance and spatial frequency (Bauman and Bonds 1991; De Valois and Tootell 1983; Löwel 1994; Löwel and Singer 1992; Vidyasagar and Mueller 1994).
For computational efficiency, most prior models of self-organization represented the lateral connections as a simple isotropic function. Later chapters will demonstrate that specific connection patterns are important for several developmental and functional phenomena, including self-organization, efficient representations, certain visual illusions, and perceptual grouping. For this reason, LISSOM will specifically simulate the development of patchy lateral connectivity.
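To make the contrast between an isotropic connection function and orientation-specific connectivity concrete, the following toy sketch compares the two. It is a hand-written, static illustration under loudly stated assumptions, not the LISSOM mechanism, in which lateral connections develop through self-organization. The Gaussian forms, the parameters `sigma`, `tuning`, and `local_radius`, and the convention that orientation 0 lies along the x axis of the cortical sheet are all hypothetical.

```python
import math


def isotropic_weight(dx, dy, sigma=4.0):
    """Isotropic lateral weight: depends only on the distance between two units."""
    return math.exp(-(dx * dx + dy * dy) / (2.0 * sigma * sigma))


def orientation_difference(a_deg, b_deg):
    """Smallest difference between two orientations, in degrees (0 to 90)."""
    d = abs(a_deg - b_deg) % 180.0
    return min(d, 180.0 - d)


def patchy_weight(pref_src, pref_dst, dx, dy,
                  local_radius=1.0, sigma=4.0, tuning=20.0):
    """Toy orientation-specific weight from a source unit to a target unit.

    Nearby targets are treated as unspecific. Distant targets are favored
    when they share the source's orientation preference and lie along the
    axis of that preferred orientation.
    """
    falloff = isotropic_weight(dx, dy, sigma)
    if math.hypot(dx, dy) <= local_radius:
        return falloff

    def tuned(diff_deg):
        # Gaussian tuning over an orientation difference in degrees.
        return math.exp(-(diff_deg * diff_deg) / (2.0 * tuning * tuning))

    similar = tuned(orientation_difference(pref_src, pref_dst))
    axis_deg = math.degrees(math.atan2(dy, dx)) % 180.0
    aligned = tuned(orientation_difference(axis_deg, pref_src))
    return falloff * similar * aligned


if __name__ == "__main__":
    # Source unit prefers 0 degrees; compare targets five units away.
    print(patchy_weight(0.0, 0.0, 5, 0))    # similar preference, on the axis
    print(patchy_weight(0.0, 90.0, 5, 0))   # dissimilar preference
    print(patchy_weight(0.0, 0.0, 0, 5))    # similar preference, off the axis
    print(isotropic_weight(5, 0), isotropic_weight(0, 5))
```

The isotropic function assigns the same weight to all targets at a given distance, whereas the toy patchy function concentrates weight on like-oriented targets along the preferred axis, which is the qualitative pattern described in Section 2.2.1.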
