Computational Maps in the Visual Cortex (Miikkulainen 2005)

1 Introduction

the approach to verifying them computationally is outlined, providing a roadmap for the rest of the book.

1.1 Input-Driven Self-Organization

Current computing systems lag far behind humans and animals at many important information-processing tasks. One potential reason is that brains have far greater complexity, e.g. 10^11 neurons and 10^14 synapses compared with 10^8 transistors (Burger and Goodman 1997; Kandel, Schwartz, and Jessell 2000; Shepherd 2003). Designing specific blueprints for systems with 10^14 components is beyond human engineering for the foreseeable future. How does nature manage to do it? One clue is that the genome has fewer than 10^5 genes in total, which means that any encoding scheme for the connections must be extremely compact (Lander et al. 2001; Venter et al. 2001). The first main hypothesis to be tested in this book is that instead of being specified directly genetically, the structure in the visual cortex is constructed by input-driven self-organization. Let us review the motivation for this idea in more detail.
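The scale argument above can be made concrete with a short calculation. The sketch below uses only the order-of-magnitude figures quoted in the text; the idea of an "explicit wiring list" (naming one target neuron per synapse) is an illustrative assumption, not an encoding proposed in the book.

```python
import math

NEURONS = 10**11    # order of magnitude quoted in the text
SYNAPSES = 10**14   # order of magnitude quoted in the text
GENES = 10**5       # upper bound quoted in the text

# Assumption for illustration: an explicit blueprint names one target
# neuron per synapse, which takes log2(NEURONS) bits per entry.
bits_per_synapse = math.log2(NEURONS)
explicit_bits = SYNAPSES * bits_per_synapse

# How many synapses each gene would have to account for on average.
synapses_per_gene = SYNAPSES // GENES

print(f"bits per synapse target: {bits_per_synapse:.1f}")
print(f"explicit wiring list:    {explicit_bits:.2e} bits")
print(f"synapses per gene:       {synapses_per_gene:.0e}")
```

Under these assumptions each gene would have to account for roughly a billion synapses, which is the sense in which the encoding must be extremely compact.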

The structure of the mammalian early visual areas is now well understood. Nerve fibers from the retina project to an intermediate region called the lateral geniculate nucleus (LGN), from which the fibers project to the primary visual cortex (V1). The Nobel prize winning studies of Hubel and Wiesel (1959, 1965, 1974) showed that neurons in the primary visual cortex are responsive to particular features in the input, such as a line of a particular orientation at a particular location in the visual field. Together, the locations on the retina to which a neuron responds are called the receptive field of the neuron. Neurons in a vertical column in the cortex have similar receptive fields and feature preferences. Vertical groups of neurons with the same orientation preference are called orientation columns, and vertical groups with the same eye preference are called ocular dominance columns; such groups may also be selective for direction of movement, spatial frequency, and color. This organization is shown schematically in Figure 1.1. The feature preferences gradually vary across the surface of the cortex in characteristic spatial patterns called cortical feature maps.

Many researchers have argued that such maps develop through self-organization of input connections from the thalamus and are shaped by visual experience (Shatz 1992). A number of classic experiments by Hubel, Wiesel and other researchers showed that altering the visual environment can drastically change the organization of input connections, ocular dominance columns, and orientation columns (Hubel and Wiesel 1962, 1974; Hubel, Wiesel, and LeVay 1977). The animal is most susceptible during a critical period of early life, typically a few weeks. For example, if a kitten is raised with both eyes sutured shut, its cortex will be abnormally organized, without ocular dominance and orientation columns. If the eyes are opened only after a critical period of a few weeks, the animal will be blind for life, even though the eyes and the LGN are perfectly normal. Similarly, if kittens are raised in environments containing only vertical or horizontal contours, their ability to see other orientations suffers significantly. In the cortex, most cells develop preferences for these particular orientations, and do not respond well to the other orientations (Blakemore and Cooper 1970; Blakemore and van Sluyters 1975; Hirsch and Spinelli 1970; Sengpiel, Stawinski, and Bonhoeffer 1999). Such experiments indicate that visual inputs are crucial for normal cortical organization, and suggest that the cortex tunes itself to the distribution of visual inputs.

Fig. 1.1. Columnar organization of the primary visual cortex. This classic diagram illustrates an example patch of V1, responsive to one retinotopic location in the visual field. This patch includes an ocular dominance column for each eye, and a set of orientation columns within each ocular dominance column. Orientation preference changes along the length of the area shown, and ocular dominance along its width. Vertically, the receptive field properties are the same. Structures such as blobs, which analyze color, are scattered throughout the columns. Reprinted with permission from Kandel et al. (1991), copyright 1991 by McGraw-Hill.

How do such environmentally tuned feature preferences develop, and how do they become organized across the cortex? Since the 1970s, computational models have been used to demonstrate that both the preferences and their organization can result from a statistical learning algorithm that performs a nonlinear approximation of the distribution of visual inputs. The experiments in this book follow this tradition. An important, novel part of our theory is that lateral connections between columns self-organize to establish the competition and cooperation necessary for this process.

The earlier theories of the visual cortex did not include a significant role for the lateral connections, which was in line with the original experimental results. Altering the visual environment of the young animal changes the organization of its afferents; lateral connections were assumed to be necessary only to provide a stable environment for the afferent adaptation, and they were assumed to be isotropic, as they are in the retina. In the adult, the visual cortex was thought to be a collection of filters for visual input, and the properties of the filters (such as orientation preference) were thought to be defined by the patterns of afferent synapses. Possible lateral interactions between cells across the cortex were generally not taken into account, partly for simplicity, and partly because there did not exist sufficient neurobiological data to form well-defined theories about these interactions.

Over the last decade, however, a number of exciting results about lateral intracortical connectivity and dynamic processes in the visual cortex have emerged: (1) Lateral connections primarily connect areas with similar properties, such as neurons with the same orientation preference (Gilbert, Hirsch, and Wiesel 1990; Gilbert and Wiesel 1989; Löwel and Singer 1992; Weliky, Kandler, Fitzpatrick, and Katz 1995). (2) The lateral connections are initially uniform, but they become patchy during early development as a result of neural activity (Callaway and Katz 1990, 1991; Löwel and Singer 1992; Ruthazer and Stryker 1996). (3) Lateral connections develop at approximately the same time as orientation columns and ocular dominance columns form (Burkhalter, Bernardo, and Charles 1993; Katz and Callaway 1992). (4) By integrating information over large portions of the cortex, these connections appear to assist in the grouping of simple features such as edges into perceptual objects (Singer, Gray, Engel, König, Artola, and Bröcher 1990; von der Malsburg and Singer 1988). (5) The visual cortex is not static after maturation, but can adapt rapidly (in minutes) to retinal lesions and similar changes in the visual input. Several researchers have hypothesized that lateral connections play an important role in this adaptability (Gilbert and Wiesel 1992; Kapadia, Gilbert, and Westheimer 1994; Pettet and Gilbert 1992).

The new understanding of cortical development and function thus differs drastically from the old. It now appears that the adult visual cortex is a continuously adapting recurrent structure in a dynamic equilibrium, capable of rapid changes in response to altered visual environments. The lateral connections develop cooperatively and simultaneously with the thalamocortical afferents, and visual experience dynamically changes the lateral interactions throughout life.

In this book, a unified, dynamic computational model of such mechanisms in the visual cortex is developed. A single self-organizing process determines how both afferent and lateral connections develop in early life. This same process also continuously adapts the adult cortical structure during visual processing and may play an important role in perception. The model therefore provides strong computational support for the idea that cortical structure develops based on input-driven self-organization.

[Figure 1.2 panels. Wave 1: 0.0 s, 1.0 s, 2.0 s, 3.0 s, 4.0 s. Wave 2: 0.0 s, 0.5 s, 1.0 s, 1.5 s, 2.0 s.]

Fig. 1.2. Spontaneous activity in the retina. Each of the frames shows calcium concentration imaging of approximately 1 mm^2 of newborn ferret retina; the plots are a measure of how active the retinal cells are. Light gray indicates areas of increased activity. This activity is spontaneous (internally generated), because the photoreceptors have not yet developed at this time. From left to right, the frames on the top row form a 4-second sequence showing the start and expansion of a wave of activity. The bottom row shows a similar wave 30 seconds later. Later chapters will show that this type of correlated activity can explain how orientation selectivity develops before eye opening. Reprinted with permission from Feller et al. (1996), copyright 1996 by the American Association for the Advancement of Science; gray scale reversed.

1.2 Constructing Visual Function

The experiments with LISSOM will show that the self-organizing algorithm is powerful enough to construct structure from visual inputs starting from an initially uniform, unorganized state. However, there are two problems with this result: (1) Self-organization takes time, and the animal would not be able to act on visual input until the process is almost complete. (2) The self-organized structure depends critically on the specific input patterns available: if the visual environment is variable, the organism may not develop predictably, and what the learning algorithm discovers may not be the information most relevant to the organism.

In contrast, visual development in nature is highly stable, and the visual cortex of most animals is partially organized already at birth (or eye-opening). Such robustness could be achieved with a specific, fixed genetic blueprint, but (as was discussed above) there is not enough information available in the genome to represent it.

Recent experimental findings in neuroscience suggest that nature may have found a clever way to utilize self-organization to achieve the same result. Developing sensory systems are now known to be spontaneously active even before birth, i.e. before they could be learning from the environment (see O’Donovan 1999; Wong 1999 for reviews; Figure 1.2). This spontaneous, internal activity may actually guide the process of cortical development, acting as genetically specified training patterns for a learning algorithm (Constantine-Paton, Cline, and Debski 1990; Hirsch 1985; Jouvet 1998; Katz and Shatz 1996; Marks, Shaffery, Oksenberg, Speciale, and Roffwarg 1995; Roffwarg, Muzio, and Dement 1966; Shatz 1990, 1996; Sur and Leamey 2001). For a biological species, being able to control the training patterns can guarantee that each organism has a rudimentary level of performance from the start. Such training would also ensure that initial development does not depend on the details of the external environment. Thus, internally generated patterns can preserve the benefits of a blueprint, within a learning system capable of much higher complexity and performance.

The second main hypothesis tested in this book is that the input-driven self-organization is based on internally generated patterns as well as external visual inputs. Internal patterns drive the initial development, and the external environment completes the process. The result is a compact specification of a complex high-performance product.

This idea will be implemented in LISSOM, and illustrated on two visual capabilities where both genetic and environmental influences play a strong role: orientation processing and face detection. At birth, newborns can already discriminate between two orientations (Slater and Johnson 1998; Slater, Morison, and Somers 1988), and animals have neurons and brain regions selective for particular orientations even before their eyes open (Chapman and Stryker 1993; Crair, Gillespie, and Stryker 1998; Gödecke, Kim, Bonhoeffer, and Singer 1997). Yet, as reviewed above, orientation processing circuitry in these same areas can also be strongly affected by visual experience (Blakemore and van Sluyters 1975; Sengpiel et al. 1999). Internally generated patterns make it easier to build an effective orientation map from later environmental input, and they are crucial for explaining the experimental data. Similarly, newborns already prefer facelike patterns soon after birth, but face-processing ability takes months or years of experience to develop fully (Goren, Sarty, and Wu 1975; Johnson and Morton 1991; see de Haan 2001 for a review). Pattern generators can be used to specify such species-specific structure: If the visual system model is trained with simple three-dot patterns before birth, the newborn system prefers facelike schematics the same way human infants do, and gradually learns to recognize real faces through similar developmental phases.

These results suggest that self-organization driven by both internal and external inputs can be used to build complex, plastic, robust structures that would be too complex to determine directly genetically, and too fragile to learn from external inputs. Pattern generation is ubiquitous in nature, and could also be utilized in engineering of complex artificial systems in general.

1.3 Perceptual Grouping

In addition to understanding how the observed structures in the visual cortex emerge, it is important to understand what role they play in visual processing. Because LISSOM is a functional computational model, it can be tested in simulated neurobiological and psychophysical experiments. It is therefore ideal for testing hypotheses about the functional phenomena that arise from the self-organized structures.

Fig. 1.3. Perceptual grouping tasks. Perceptual grouping is the process of identifying constituents in the visual scene that together form a coherent object. Perceptual grouping can take place at many different levels, from the very low level (a), to the very high level (c). (a) Grouping by proximity. The two black disks that are close to each other appear to form a unit. Thus, two groups are perceived: one on the left and another on the right. (b) Grouping by good continuation. In the random background of oriented edges (or contour elements), it is easy to notice the long, continuous sequence of edges that runs horizontally from the top-left of the circular area toward the right and slightly down. The task of detecting such contours is known as contour integration. (c) Grouping requiring world knowledge. In this seemingly unintelligible image lurks a Dalmatian dog sniffing on the pavement (a photograph by R. C. James; the dog is in the top right of the image, facing left). Without world knowledge, e.g. experience with dogs, leaves, etc., it would be impossible to group together the dots that form the Dalmatian.

Perhaps the most significant such function is perceptual grouping, or the process of identifying the constituents in the visual scene that together form a coherent object (Grossberg, Mingolla, and Ross 1997; Watt and Phillips 2000; Zucker 1995). The complexity of such tasks varies widely, and they can take place at various levels of the visual processing hierarchy (Figure 1.3). Different grouping principles are utilized at the different levels, including those based on spatial, temporal, and chromatic relationships (Geisler and Super 2000). At the level of orientation maps, perceptual grouping is manifested in contour integration, and a large body of neurobiological and psychophysical data is available to constrain, validate, and test the models. In this book, the LISSOM model will be used to test the hypothesis that contour integration is an automatic function of the orientation map in the visual cortex, based on synchronized neuronal activity mediated by self-organized lateral connections.

A typical visual input for the contour integration task is shown in Figure 1.3b. The input consists of a series of short oriented edge segments (contour elements) aligned along a continuous path, embedded in a background of randomly oriented contour elements. The task is to identify the longest continuous contour in this scene. Contour integration is an appropriate problem for computational analysis because the relationships between constituents of the image are neither too simple to be interesting (as in Figure 1.3a where the distance between the centers of the disks is the only grouping criterion), nor too complex to be represented (as in Figure 1.3c where complex world knowledge is required).
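To make the structure of such an input concrete, the following is a minimal sketch of a toy stimulus generator. It is not the stimulus used in the experiments cited here: the grid layout, the straight horizontal path, and the name `make_contour_stimulus` are all simplifying assumptions introduced for illustration (real displays typically use curved paths and jittered positions).

```python
import math
import random

def make_contour_stimulus(grid=8, path_len=5, seed=0):
    """Return (elements, path_cells) for a toy contour-integration display.

    elements maps (row, col) -> orientation in radians, between 0 and pi.
    path_cells lists the cells of the embedded contour, left to right.
    """
    rng = random.Random(seed)
    # Background: one randomly oriented contour element per grid cell.
    elements = {(r, c): rng.uniform(0.0, math.pi)
                for r in range(grid) for c in range(grid)}
    # Embedded contour: a horizontal run of neighboring cells whose
    # elements are all oriented along the path (0 radians = horizontal).
    row = rng.randrange(grid)
    start = rng.randrange(grid - path_len + 1)
    path_cells = [(row, start + i) for i in range(path_len)]
    for cell in path_cells:
        elements[cell] = 0.0
    return elements, path_cells

elements, path_cells = make_contour_stimulus()
print(len(elements), "elements; contour cells:", path_cells)
```

The only cue that distinguishes the contour from the background in this sketch is the alignment of neighboring elements, which is the relationship a contour integration model has to detect.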

Most importantly, contour integration is believed to occur relatively early in the visual system. The response properties and connection patterns found in the primary visual cortex have exactly the right properties for explaining contour integration performance in terms of neural mechanisms (Field, Hayes, and Hess 1993; Geisler, Perry, Super, and Gallogly 2001; Li 1998; McIlhagga and Mullen 1996; Pettet, McKee, and Grzywacz 1998; Stettler, Das, Bennett, and Gilbert 2002; Yen and Finkel 1997, 1998). The lateral connections run along collinear or cocircular paths, and these areas are often activated together (Bosking, Zhang, Schofield, and Fitzpatrick 1997; Dalva and Katz 1994; Gilbert 1992; Katz and Callaway 1992; Löwel and Singer 1992; McGuire, Gilbert, Rivlin, and Wiesel 1991; Weliky et al. 1995). As discussed above, there is strong evidence that these structures are self-organized, driven by neural input (Blakemore and Cooper 1970; Blakemore and van Sluyters 1975; Hirsch and Spinelli 1970; Hubel and Wiesel 1962, 1974; Hubel et al. 1977; Ruthazer and Stryker 1996; White, Coppola, and Fitzpatrick 2001). Such specific patterns of connectivity are well suited for forming a consistent, coherent activation in response to a continuous contour.

One major question is how coherent percepts are represented in the cortex. The task consists of two parts: binding, i.e. grouping together separate constituent representations in the visual scene into a coherent object, and segmentation, i.e. segregating such coherently bound representations into different objects. With static activity, it is hard to represent binding and segmentation in a constantly changing sensory environment (von der Malsburg 1981, 1986a). Several researchers have proposed that temporal coding through synchronization, spike timing, phase differences, or other temporal information could solve the problem (Eckhorn, Reitboeck, Arndt, and Dicke 1990; Horn and Opher 1998; Kammen, Holmes, and Koch 1989; Reitboeck, Stoecker, and Hahn 1993; Terman and Wang 1995; von der Malsburg 1986b; Wang 1995). Indeed, experiments with cats have shown that presentation of coherent objects gives rise to synchronized firing of neurons in the visual cortex, and presenting separate objects causes no synchronization (Eckhorn, Bauer, Jordan, Kruse, Munk, and Reitboeck 1988; Gray, König, Engel, and Singer 1989; Gray and Singer 1987; Singer 1993). Such coherent firing of neurons may be a representation for grouping.

In this book, the mechanisms of self-organized lateral connections and synchronization between groups of spiking neurons are brought together into an integrated developmental and functional model of the visual cortex. The results support the hypothesis that much of contour integration is performed in V1, based on these mechanisms. The work also suggests that similar mechanisms could be in use at higher levels, providing insights into perceptual grouping in general.

1.4 Approach

The above three hypotheses will be tested in a computational framework called LISSOM, or laterally interconnected synergetically self-organizing map. LISSOM is a computational map model of the visual cortex developed in our laboratory over the past 10 years, building on about 30 years of map modeling research in the literature. LISSOM’s core is a two-dimensional array of computational units corresponding to columns in V1, which receive inputs from the retina through the ON/OFF channels of the LGN and from other columns in V1 through lateral connections (Figure 1.4). The units learn through Hebbian adaptation, and compete with other units in a self-organizing map structure (Hebb 1949; Kohonen 2001; von der Malsburg 1973). The hypotheses are tested by analyzing the behavior of this model through simulated neurobiological and psychophysical experiments.

[Figure 1.4 labels: V1; LGN (ON, OFF); Retina.]

Fig. 1.4. Basic LISSOM model of the primary visual cortex. The core of the model consists of a two-dimensional array of computational units representing columns in V1. These units receive input from the retinal receptors through the ON/OFF channels of the LGN, and from other columns in V1 through lateral connections. The solid circles and lines delineate the receptive fields of two sample units in the LGN and one in V1, and the dashed circle in V1 outlines the lateral connections of the V1 unit. The LGN and V1 activation in response to a sample input on the retina is displayed in gray-scale coding from white to black (low to high). The V1 responses are patchy because each neuron is selective for a particular combination of image features (Figure 1.1), and only certain combinations exist in the image. This basic LISSOM model will be used in Part II to understand input-driven self-organization, cortical plasticity, and functional effects of adapting lateral connections. In Part III, the model is further extended with subcortical and higher level areas to study prenatal and postnatal development, and in Part IV, with binding and segmentation circuitry in V1 to model perceptual grouping.
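The combination of Hebbian adaptation and competition mentioned above can be sketched in a few lines. The code below is not LISSOM: it uses a one-dimensional row of units, one-hot inputs, and a fixed winner-plus-neighbors rule as a stand-in for competition through lateral connections. The function names, the learning rate, and the neighbor activity of 0.5 are all assumptions made for illustration.

```python
import random

def hebbian_update(weights, inputs, activity, rate=0.1):
    """Normalized Hebbian step: grow each weight in proportion to
    presynaptic input times postsynaptic activity, then rescale so
    that the weights still sum to 1."""
    grown = [w + rate * activity * x for w, x in zip(weights, inputs)]
    total = sum(grown)
    return [g / total for g in grown]

def train(n_units=5, n_inputs=5, steps=200, seed=0):
    rng = random.Random(seed)
    # Start from random, normalized afferent weights (unorganized state).
    units = []
    for _ in range(n_units):
        w = [rng.random() for _ in range(n_inputs)]
        s = sum(w)
        units.append([v / s for v in w])
    for _ in range(steps):
        # Input: a single active receptor at a random position.
        pos = rng.randrange(n_inputs)
        x = [1.0 if i == pos else 0.0 for i in range(n_inputs)]
        # Response of each unit is the weighted sum of its inputs.
        resp = [sum(w * xi for w, xi in zip(u, x)) for u in units]
        # Competition (stand-in): the best-responding unit and its
        # immediate neighbors stay active; all others are silenced.
        best = max(range(n_units), key=resp.__getitem__)
        for j in range(n_units):
            act = 1.0 if j == best else 0.5 if abs(j - best) == 1 else 0.0
            units[j] = hebbian_update(units[j], x, act)
    return units

units = train()
for u in units:
    print(" ".join(f"{w:.2f}" for w in u))
```

Normalizing after every update keeps the total weight of each unit fixed, so strengthening one connection necessarily weakens the others; silenced units (activity 0) are left unchanged by the update.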

The input-driven self-organization hypothesis is tested in four ways: (1) In a number of specific experiments where each individual feature of visual inputs to the cortex, such as topographic order, eye dominance, orientation, and direction of movement, is learned and represented in the cortex; (2) in a combined simulation where a large cortical model self-organizes to represent all these features simultaneously; (3) in an adult-plasticity experiment where the cortex repairs itself after retinal or cortical damage; and (4) in a functional experiment where visual aftereffects are shown to arise from these same mechanisms in the normal adult system.

The pattern generation hypothesis will be evaluated by building and testing HLISSOM, a hierarchical model that includes both subcortical and higher visual areas. The goal is to understand how internal and external inputs affect the organization and function of the visual cortex. Because the orientation processing circuitry has been mapped out in detail in animals, it will be used as a verifiable test case for the pattern generation approach. The same techniques will then be applied to face processing, where they will be used as a basis for a unified theory for the phenomenon. The goal is to demonstrate how internal activity can account for the newborn structure in each system, and how postnatal experience can complete this developmental process. In each case, the model is first validated by comparing it with existing experimental results, and then used to derive predictions for future experiments.

The contour integration hypothesis will be studied in the PGLISSOM model, where LISSOM is extended to perform perceptual grouping through spiking neurons and long-range excitatory lateral connections. Grouping is measured as the degree of synchrony among neural populations, and such synchrony is established through the lateral connections. This model shows how the statistical structure in the visual environment determines the structure of the visual cortex, which in turn determines its grouping performance. The model therefore provides a computational account of the possible neural mechanisms of contour integration.
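One simple way to quantify "degree of synchrony" between two populations is the correlation of their binned activity over time. The sketch below uses that measure as an assumed stand-in; the measure actually used with PGLISSOM is defined later in the book and may differ. The spike counts are made-up illustrative data, and the function assumes that neither trace is constant.

```python
import math

def synchrony(a, b):
    """Pearson correlation of two equal-length, non-constant
    activity traces (e.g. spike counts per time bin)."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)

# Hypothetical binned spike counts for three populations.
pop_a = [0, 3, 0, 4, 0, 3, 0, 4]
pop_b = [0, 4, 0, 3, 0, 4, 0, 3]   # fires in the same bins as pop_a
pop_c = [3, 0, 4, 0, 3, 0, 4, 0]   # fires in the opposite bins

print(f"a vs b: {synchrony(pop_a, pop_b):+.2f}")
print(f"a vs c: {synchrony(pop_a, pop_c):+.2f}")
```

Populations that fire in the same time bins score close to +1 and would count as grouped under this measure, while populations that fire in alternating bins score close to -1 and would count as segmented.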

In addition to providing computational support for the above three hypotheses, the LISSOM framework constitutes a general computational theory of representation and learning in the visual cortex. The learning mechanisms extract correlations in the input that allow representing visual information efficiently in a sparse, redundancy-reduced code. Such representations are separable and generalizable, and serve as an effective foundation for later stages of visual processing, such as pattern recognition. These computational principles are abstractions of what the cortex is doing, but they are also general principles that could be used in constructing artificial systems.

The LISSOM approach is intended to serve as a starting point for future explorations in computational understanding of the visual cortex. The models described in this book are freely available on the Internet under the Topographica project (http://topographica.org). In this project, a general simulator for computational modeling of cortical maps is being developed, intended to support further research in this general area. We believe that the current confluence of experimental data on cortical maps and such newly available computational tools will lead to major progress in understanding how the brain processes visual information.

1.5 Guide for the Reader

The book is divided into five parts. First, the biological background is reviewed for the core constituents of LISSOM, i.e. for self-organization, lateral connections, genetic vs. environmentally driven development, and temporal coding. The computational foundations of LISSOM, such as the neuron models, synchronization, learning, and self-organizing maps, are also discussed. However, the specific biological and psychophysical evidence and prior modeling work for each individual experiment is reviewed in the individual chapters throughout the book.

Part II focuses on mechanisms of input-driven self-organization. The basic architecture of the LISSOM computational map model of V1 is presented, and demonstrated to develop a map organization and patchy lateral connections based on regularities in the visual input. The same self-organization processes are shown to account for plasticity of the adult cortex, and give rise to psychophysical phenomena such as the tilt aftereffect.

Part III demonstrates how genetic and environmental influences can be combined in input-driven self-organization. The LISSOM model of V1 is first expanded outward into a multi-level model containing subcortical areas and higher visual maps, capable of processing both natural images and internally generated input. This model demonstrates a synergy of nature and nurture in developing orientation preferences, and allows gaining insight into high-level phenomena such as infant face processing.

Perceptual grouping is studied in Part IV. To gain insight into this process, the LISSOM model is extended inward to include spiking units and separate excitatory and inhibitory components in cortical columns. The resulting temporal coding and self-organization processes are demonstrated in detail, and shown to work together. The model is shown to account for low-level perceptual grouping phenomena such as contour integration performance under varying conditions, integration of illusory contours, and differences in grouping performance across the different areas of the cortex.

In Part V, laterally connected self-organizing maps are shown to result in efficient visual representations well suited for higher level processing and for practical applications. Techniques are developed for scaling the approach to very large maps, including possibly the entire visual cortex. The assumptions and predictions of LISSOM are reviewed and evaluated in terms of biological research results and opportunities. Connections are made to related and complementary work in cortical modeling and cognitive science, and future directions are outlined.