
17.2 Modeling New Phenomena with LISSOM


PGLISSOM can be trained with artificial input patterns where these dimensions are systematically varied. Alternatively, such training could be made more realistic by generating inputs based on known natural image statistics (as suggested by W. Geisler). Such inputs would need to be appropriately filtered based on known biological processes in the retina and LGN, because PGLISSOM is a model of V1 and assumes that such processes have already taken place. The model could then be used to predict human perceptual performance in more detail. It could also be used to identify the input statistics that are important for each stimulus dimension, by analyzing the lateral connection patterns and receptive field properties.

The next step would be to train the model with natural inputs such as moving natural images, as was done in Section 5.6.4. Currently such inputs cannot be used in PGLISSOM because the spiking network is computationally much more expensive to simulate than the firing-rate version. However, with the scaling-up techniques proposed in Section 17.2.9, such training should be possible. Natural images vary in all the stimulus dimensions described above, and should make PGLISSOM grouping performance sensitive to these dimensions.

Most interestingly, direction-selective cells that emerge from such training would allow grouping to take place based on motion, which is an important process in biological vision. Such a model would make it possible to compare the model directly with experimental observations on synchronization, which are usually based on moving inputs (Eckhorn et al. 1988; Gray and Singer 1987).

17.2.9 Scaling up to Large Networks

Developing techniques to simulate large networks accurately is a major issue in computational neuroscience research. Scaled-down versions of perceptual experiments had to be devised in this book in order to study them with the available model retina and cortex: For example, the tilt aftereffect inputs consisted of only a single line, instead of the gratings used in human experiments; contour integration was performed over three to six elements in a background of zero to six elements, instead of 20 in 200. While the principles are the same and we believe the results are valid, such a mismatch in scale makes it difficult to compare the results directly with human experiments. Also, large maps are necessary for many other important visual phenomena, such as visual attention, saccades between stimulus features, the interaction between the foveal and peripheral representations of the visual field, and self-organization based on large-scale patterns of optic flow due to head movement. In order to understand them computationally, large parts of the visual cortex have to be simulated.

The parameter scaling equations and the GLISSOM map growing method introduced in Chapter 15 are the most promising avenue to date to scale up to large networks. As was discussed in that chapter, these techniques already allow simulating the entire V1 at the column level with desktop workstations. The techniques can be combined with parallel implementations of the LISSOM algorithm in order to simulate multiple visual areas or more-detailed column models (Chang and Chang 2002).

388 17 Future Work: Computational Directions

It may be possible to reduce the memory requirements of large simulations by modeling biological networks more directly. The lateral connection weights could be initialized based on known biological connectivity or connectivity derived from image statistics. Since such connectivity is usually sparse, a larger map can be constructed. Although the initial development of the map cannot be modeled in such networks, large-scale simulations can be run to test specific functional effects of biological connectivity patterns, and to test the components of the network in various realistic psychophysical tasks.

In order to deal with inputs as large as the entire visual scene, techniques could be developed for scanning the visual space sequentially with a small LISSOM network. If a model has only afferent connections, the input space can be partitioned into discrete grids the size of the LISSOM retina, and the responses for each grid location combined to form the global output. With lateral connections, the combination becomes more complicated because the lateral interactions between the different areas must be taken into account. As a first approximation, it may be sufficient to represent only the lateral connections of the LISSOM map to its eight neighboring locations in the grid. These same connections can then be used at all grid locations. In this manner, it may be possible to self-organize and run an arbitrarily large LISSOM map that is constructed on the fly from local components.
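The grid-scanning idea above can be illustrated with a small Python sketch. All names here are hypothetical, and the lateral interactions between neighboring grid locations are deliberately omitted, so this corresponds only to the afferent-only first approximation:

```python
import numpy as np

def tile_responses(image, tile, respond):
    """Scan a large input with a small model: partition `image` into
    non-overlapping tile x tile patches, apply `respond` to each patch,
    and assemble the patch responses into one global output map.
    Lateral interactions between tiles are ignored in this sketch."""
    h, w = image.shape
    rows, cols = h // tile, w // tile
    out = np.zeros((rows * tile, cols * tile))
    for r in range(rows):
        for c in range(cols):
            patch = image[r*tile:(r+1)*tile, c*tile:(c+1)*tile]
            out[r*tile:(r+1)*tile, c*tile:(c+1)*tile] = respond(patch)
    return out

# Stand-in for a small LISSOM map: normalize each patch locally.
def toy_respond(patch):
    m = patch.max()
    return patch / m if m > 0 else patch

image = np.random.rand(64, 64)
global_map = tile_responses(image, 16, toy_respond)
print(global_map.shape)  # (64, 64)
```

Extending this to lateral connections would mean passing each call the activity of the eight neighboring tiles as additional context, reusing the same connection weights at every grid location.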

With the scale-up techniques, it should be possible to apply the LISSOM approach to many new visual phenomena at a realistic scale. Before its performance can be directly compared with that of humans, it will also be necessary to extend LISSOM to model foveated input, as will be described next.

17.2.10 Foveated Input and Eye Movements

Current LISSOM simulations are based on a uniform representation of the visual field, which is appropriate when modeling a small patch of the retina and the corresponding parts of the visual cortex. However, a number of visual phenomena depend on differences between central and peripheral visual processing (such as contour integration; Section 13.1.1). In the periphery, retinal ganglion cells are spaced much farther apart and have much larger receptive fields than in the fovea, and thus the mapping of visual space differs significantly between central and peripheral vision. As a result, object perception performance varies across the visual field (Levy, Hasson, Avidan, Hendler, and Malach 2001; Mäkelä, Näsänen, Rovamo, and Melmoth 2001; Strasburger and Rentschler 1996). For instance, faces in the periphery need to be larger and higher in contrast to be recognized, compared with those in central vision.

To understand these experimental results, a large-scale version of LISSOM can be implemented that includes both central and peripheral processing. The architecture would be mostly the same as in the current model, perhaps scaled up with the techniques discussed in Section 17.2.9. In addition, a module would be included to perform a log-polar transformation of the visual image before the activation of the photoreceptors is computed. This transformation could be adjusted over time to take into account that the fovea develops later than the rest of the retina (Abramov et al. 1982; Kiorpes and Kiper 1996). Such a transformation process would simulate the nonlinear distribution of retinal ganglion cells, without requiring changes to the LGN, V1, or FSA models.
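A minimal sketch of such a log-polar resampling module, assuming nearest-neighbor sampling and hypothetical parameter names (a real implementation would interpolate and match the sampling density to retinal ganglion cell data):

```python
import numpy as np

def log_polar(image, out_shape=(64, 64), r_min=1.0):
    """Resample `image` so that eccentricity maps logarithmically onto
    rows and polar angle onto columns: the fovea (image center) gets
    many output pixels, the periphery few (nearest-neighbor sampling)."""
    h, w = image.shape
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    r_max = min(cy, cx)
    n_r, n_a = out_shape
    out = np.zeros(out_shape)
    for i in range(n_r):
        # Radii spaced logarithmically from r_min to r_max.
        r = r_min * (r_max / r_min) ** (i / (n_r - 1))
        for j in range(n_a):
            a = 2 * np.pi * j / n_a
            y = int(round(cy + r * np.sin(a)))
            x = int(round(cx + r * np.cos(a)))
            out[i, j] = image[min(max(y, 0), h - 1), min(max(x, 0), w - 1)]
    return out

retina = np.random.rand(128, 128)
print(log_polar(retina).shape)  # (64, 64)
```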

Although modeling foveated input requires only these small changes to the LISSOM model of V1, to understand fully the effects of the fovea on self-organization and visual function, it may be necessary to include mechanisms for moving the direction of gaze.

In normal vision, the fovea is directed at several different visual targets each second, changing between targets with a quick saccade. The saccades are controlled by subcortical areas such as the superior colliculus and by high-level areas such as the frontal eye fields (see Bisley and Goldberg 2003 for a review). Including regions that generate eye movement would greatly complicate the model, but would also complete a loop between eye movement, retinal image, and subsequent eye movements. In the long run, such a model will be crucial for understanding how the visual system utilizes representations in the fovea and the periphery to make sense of the visual environment.

17.2.11 Scaling up to Cortical Hierarchy

In addition to scaling to larger maps, the visual cortex model can be scaled up vertically by including maps beyond V1 in the visual hierarchy, such as V2, V4, and MT. Such large-scale models will rely on detailed data now becoming available about the connectivity and functional properties of higher visual areas (e.g. Heeger, Boynton, Demb, Seidemann, and Newsome 1999; Kötter 2004; McCormick, Choe, Koh, Abbott, Keyser, Melek, Doddapaneni, and Mayerich 2004a; McCormick, Mayerich, Abbott, Gutierrez-Osuna, Keyser, Choe, Koh, and Busse 2004b; McGraw, Walsh, and Barrett 2004; Pinsk, Doniger, and Kastner 2004; Van Essen 2003; Wong and Koslow 2001). The ultimate goal would be to self-organize structures as complex and powerful as the primate visual system, with dozens of interacting visual areas allowing recognition of highly complex patterns. First steps toward this goal were presented in Section 10.2, where the high-level FSA was organized based on inputs from the V1 model.

Self-organization of hierarchical structures is a difficult unsupervised learning task in general (Becker 1992). The idea is to apply self-organization in multiple stages and to discover increasingly complex structures in the input. However, linear projections and nonlinear topographic mappings (such as the SOM) do not usually suffice: Each level will represent essentially the same information even if it is scaled or organized differently. In contrast, LISSOM includes several extensions that make it possible to discover high-level representations: (1) The afferent receptive fields are local, and higher levels receive information from broader areas than lower levels; (2) the activation function includes a threshold, ensuring that only the best-matching neurons respond; and (3) the lateral interactions decorrelate the representations, forming a sparse, redundancy-reduced code that makes recognition and classification at higher levels easier (as was shown in Section 14.3). The higher levels can then develop complex feature preferences like those found in higher levels of the visual cortex.
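Extension (2), the threshold in the activation function, can be sketched with the piecewise-linear sigmoid used in LISSOM-style models (the threshold values below are arbitrary illustrations):

```python
import numpy as np

def piecewise_sigmoid(s, theta_l, theta_u):
    """LISSOM-style piecewise-linear activation: zero below the lower
    threshold theta_l, linear in between, saturated at 1 above the
    upper threshold theta_u. The lower threshold ensures that only
    well-matching neurons fire, keeping the population code sparse."""
    return np.clip((s - theta_l) / (theta_u - theta_l), 0.0, 1.0)

s = np.array([0.1, 0.4, 0.7, 1.2])  # net input to four neurons
print(piecewise_sigmoid(s, theta_l=0.3, theta_u=1.0))
# [0.         0.14285714 0.57142857 1.        ]
```

The weakly driven neuron is silenced entirely, which is what makes the resulting code sparse rather than merely rescaled.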

With a hierarchical model, it should be possible to show how high-level perceptual properties can develop. One important such property is translation-invariant and viewpoint-invariant responses. These invariances are crucial for large-scale object and face recognition, because large objects are usually not encountered at precisely the same retinal position and orientation each time. Lateral connections are believed to play a crucial role in this process (Edelman 1996; Marshall and Alley 1996; Wiskott and von der Malsburg 1996); however, existing models do not yet develop ordered maps for orientation or other low-level visual features, and do not explain how the lateral connectivity can develop along with the map (Bartlett and Sejnowski 1998; Földiák 1991a; Olshausen, Anderson, and Van Essen 1995, 1996; Stringer and Rolls 2002; Wallis and Rolls 1997). Instead, the models are based on hierarchically arranged sheets of neurons that respond to faces or specific objects over a wide range of positions or three-dimensional viewpoints.

Despite this difference in focus, the overall architectures of most transformation invariance models are similar to a hierarchical LISSOM model, and their mechanisms could be implemented in LISSOM. For instance, the VisNet family of models (Rolls and Milward 2000; Stringer and Rolls 2002; Wallis 1994; Wallis and Rolls 1997) achieves transformation invariance using the trace learning rule (Section 17.1.4; Földiák 1991a). Because a moving object will assume a number of different spatial positions and configurations over time, the trace learning rule ensures that responses to each of these views will become associated. At each subsequent hierarchical level, neurons will process larger areas of the visual field, leading to translation and viewpoint invariance at the highest levels (comparable to monkey inferotemporal cortex).
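A minimal sketch of the trace rule (parameter names hypothetical; the actual VisNet formulation differs in detail): the postsynaptic activity trace decays over time, so weights toward all views seen during one sweep of a moving object are strengthened together:

```python
import numpy as np

def trace_update(w, x, y_trace_prev, y, eta=0.8, alpha=0.1):
    """Trace learning rule sketch: maintain an exponentially decaying
    trace of postsynaptic activity so that temporally adjacent views
    of a moving object strengthen the same weight vector, associating
    the views with each other."""
    y_trace = (1 - eta) * y + eta * y_trace_prev  # decaying activity trace
    w_new = w + alpha * y_trace * x               # Hebbian step on the trace
    return w_new, y_trace

w = np.zeros(3)
trace = 0.0
# Three successive retinal views of the same moving object.
views = [np.array([1., 0., 0.]), np.array([0., 1., 0.]), np.array([0., 0., 1.])]
for x in views:
    y = 1.0  # the same high-level unit responds throughout the sweep
    w, trace = trace_update(w, x, trace, y)
print(np.round(w, 3))  # all three views now excite the unit
```

After the sweep, every component of `w` is positive: the unit has associated all three positions, which is the seed of a translation-invariant response.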

A hierarchical LISSOM model would provide a first unified account of how topographic maps can develop in a hierarchy of visual areas, how their function depends on self-organized lateral connections, and how high-level properties such as transformation invariance emerge in this process. In the following four subsections, opportunities for understanding several high-level perceptual phenomena with hierarchical LISSOM are reviewed.

17.2.12 Line-End-Induced Illusory Contours and Occluded Objects

PGLISSOM was used in Section 13.3.4 to show how edge-induced illusory contours could arise based on the same mechanisms as normal contour integration. An important future experiment with hierarchical LISSOM is to include end-stopped cells in the model, and demonstrate how line-end-induced illusory contours, and possibly occluded objects, could be detected in V2.

As was discussed in Section 17.1.4, the behavior of cortical columns in the current LISSOM models is based on simple cells, which are the first in V1 to show orientation selectivity. Such cells in the cortex usually respond more strongly when the input stimulus, such as an oriented line element, gets longer. In contrast, end-stopped cells, also found in V1, respond best to an input with a particular length (Gilbert and Wiesel 1979), and they have been proposed to be responsible for both line-end-induced illusory contour completion and occluded object recognition (Finkel and Edelman 1989; Kellman, Yin, and Shapley 1998; Rensink and Enns 1998; Sajda and Finkel 1992; Weitzel, Kopecz, Spengler, Eckhorn, and Reitboeck 1997).

End-stopped cells have been found in layer 4 in cats. They are thought to arise when a layer-6 cell, which typically has a wider receptive field, inhibits (through an inhibitory interneuron) a layer-4 neuron with a smaller receptive field: Inputs that are exactly as long as the smaller RF excite such a layer-4 neuron the most (Bolz and Gilbert 1986; Gilbert 1994). The end-stopped cells further project to V2, and may form a basis for orientation columns for illusory contours that have been observed in V2 (Sheth et al. 1996).

In order to understand line-end-induced illusory contours and occluded object detection with PGLISSOM, it will first have to be extended with a set of neurons and intracolumnar circuitry in layers 4 and 6 that give rise to end-stopped receptive fields: These neurons in layer 6 have wider receptive fields and inhibit the neurons in layer 4, which have narrower receptive fields. During self-organization, these layer-4 neurons will develop into end-stopped cells and self-organize with the rest of the orientation map. A V2 map, receiving input from the end-stopped cells, will then self-organize based on the oriented end-stop activity, and form an orientation map of illusory contours in a process similar to how the V1 forms an orientation map of ordinary contours.
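The length tuning produced by this layer-6/layer-4 circuit can be caricatured with a toy response function (all parameters hypothetical): excitation saturates once a bar fills the narrow layer-4 receptive field, while inhibition from the wider layer-6 receptive field grows for longer bars, so the net response peaks at a particular length:

```python
def end_stopped_response(stimulus_len, narrow=4.0, wide=10.0, k=1.5):
    """Toy end-stopped cell: excitatory drive grows with stimulus length
    up to the narrow (layer-4) RF size, while a wider (layer-6) unit
    inhibits it in proportion to how far the bar extends past the
    narrow RF. The net response peaks when the stimulus just fills
    the narrow RF and falls off for longer bars."""
    excite = min(stimulus_len, narrow) / narrow           # saturates at narrow RF
    inhibit = k * max(stimulus_len - narrow, 0.0) / wide  # grows past narrow RF
    return max(excite - inhibit, 0.0)

for length in [2, 4, 8, 14]:
    print(length, round(end_stopped_response(length), 2))
```

The response is maximal for a bar of length 4 (the narrow RF size) and drops to zero for long bars, matching the qualitative length tuning described above.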

The model can be tested in illusory contour detection and in occluded object detection tasks. If successful, it would demonstrate how such behavior can arise from input-driven self-organization in V1 and V2. Further, since the V2 in such an extended model will behave as the V1 in the current LISSOM, tilt aftereffects should occur between illusory contours in much the same way as they do in ordinary contours in LISSOM (Chapter 7). Such aftereffects have indeed been found in psychophysical experiments (Berkley, Debruyn, and Orban 1993; Paradiso, Shimojo, and Nakayama 1989; van der Zwan and Wenderoth 1994, 1995); the model predicts that the same underlying mechanism is responsible for both of them.

In this manner, a hierarchical LISSOM model can be used to understand the self-organization and function of V2 and higher levels of visual processing. An important further extension of the hierarchy is to include feedback from higher levels, as will be discussed next.

17.2.13 Feedback from Higher Levels

Current hierarchical LISSOM models, such as the HLISSOM network of Chapter 8 and those proposed so far in this chapter, are feedforward only: Activation propagates from the eye to LGN, to V1, and to higher levels, but not in the reverse direction. Including feedback connections is an important direction of future work that will allow us to model several new phenomena.

In the cortex, a large proportion of connections propagate in the reverse direction, connecting from higher levels to V1 and the LGN (see Gandhi, Heeger, and Boynton 1999; Lamme, Super, and Spekreijse 1998; Van Essen et al. 1992; White 1989 for reviews). The role of these feedback connections is not yet clear, but they may be involved in top-down pattern completion, attention, visual imagery, and large-scale object grouping. In many cases, they may achieve these effects by enhancing the existing lateral interactions (Freeman, Driver, Sagi, and Zhaoping 2003).

During self-organization, the feedback connections may also encourage different areas to develop synergetically, mediating competition and cooperation between multiple areas (Rolls 1990). Thus, over a large spatial scale, feedback connections may act like lateral connections within each area. The arrangement of maps into a hierarchy may even be primarily a means for making such large-scale connections more efficiently than could be achieved in a single large, laterally connected map (Kaas 2000).

As a first approach, feedback connections can be included in LISSOM just like afferent and lateral connections, as additional terms in each neuron’s activation function. The principle is particularly clear if the activation function is written by indexing over receptive fields (as is done in Appendix A and implemented in Topographica):

η_ij(t) = σ( Σ_ρ γ_ρ Σ_{(k,l) ∈ RF_ρ} X_kl(t − 1) w_kl,ij ),        (17.1)

where the index ρ indicates afferent, lateral, and feedback receptive fields (RF_ρ), X_kl(t − 1) is the activation of neuron (k, l) in that receptive field, and w_kl,ij is the weight from that neuron to neuron (i, j). The sign of the scaling factor γ_ρ is positive for afferent and lateral excitatory connections, and negative for lateral inhibitory connections. Although individual feedback connections are usually excitatory in the cortex (White 1989), they may have inhibitory effects on strongly activated neurons (as lateral connections do; Weliky et al. 1995). Both approaches can be tested in LISSOM by changing the sign of γ_ρ for the feedback connections. In future work, this approach can be further generalized by adding different delays for afferent, lateral, and feedback connections, thus replacing the "1" in the equation with a specific d_ρ for each connection type. For instance, lateral connections have a higher latency, on average, than feedback connections (Nowak and Bullier 1997), and would thus have longer delays.
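An illustrative sketch of this generalized activation for a single neuron, with one (γ, X, w) triple per receptive-field type (the numbers and the clipping sigmoid are arbitrary illustrations, not model parameters):

```python
import numpy as np

def sigmoid(s, theta_l=0.0, theta_u=1.0):
    """Piecewise-linear squashing function sigma."""
    return np.clip((s - theta_l) / (theta_u - theta_l), 0.0, 1.0)

def activation(rfs):
    """One neuron's activation as in Eq. (17.1): sum over receptive-field
    types rho (afferent, lateral excitatory, lateral inhibitory,
    feedback), each contributing gamma_rho times the weighted sum of
    presynaptic activity X, passed through sigma. `rfs` maps each
    type to a (gamma, X, w) triple."""
    s = sum(gamma * np.dot(X, w) for gamma, X, w in rfs.values())
    return sigmoid(s)

rfs = {
    "afferent":    (1.0,  np.array([0.6, 0.8]), np.array([0.5, 0.5])),
    "lat_excite":  (0.5,  np.array([0.4, 0.4]), np.array([0.5, 0.5])),
    "lat_inhibit": (-0.7, np.array([0.9, 0.1]), np.array([0.5, 0.5])),
    # Feedback is excitatory here; flip gamma's sign to test the
    # inhibitory-feedback hypothesis mentioned in the text.
    "feedback":    (0.3,  np.array([1.0]),      np.array([0.4])),
}
print(round(float(activation(rfs)), 3))  # 0.67
```

Adding per-type delays d_ρ would simply mean indexing each X at t − d_ρ instead of t − 1.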

Using a hierarchical version of LISSOM with feedback, it should be possible to explain visual phenomena like pattern completion, where an object-selective neuron in a higher level can bias those neurons in a lower level that generally cause it to fire, thus completing missing or weak low-level features. Such feedback could also be used to account for a wider range of illusory contours, and for multi-modal integration, as will be discussed in the next two sections.

Feedback most likely plays a large role in perceptual systems, a role that has only recently begun to be understood (Carpenter 2001; Dayan, Hinton, Neal, and Zemel 1995; Knoblauch and Palm 2003; Kosslyn and Sussman 1995; Murray, Schrater, and Kersten 2004; Pollen 1999; Schyns, Goldstone, and Thibaut 1998). Although much of the insight comes from computational experiments, it has been difficult to build large-scale computational models with self-organized hierarchy and feedback. The extensions to LISSOM proposed above, as well as their practical implementation

Fig. 17.1. High-level influence on illusory contour perception. Low-level mechanisms such as contour completion can account for some illusory contour perception phenomena. However, surrounding context can affect how salient the illusory contours are, suggesting that higher levels influence the low-level mechanisms. The patterns in (a) and (b) have the same oval outline in the middle, but only in (b) is the oval seen as a floating illusory object. Similarly, (c) and (d) have the same square outline in the middle, but the illusory square is prominent only in (d). In (a) and (c), the boundary elements are perceived as individual objects at the high level, suppressing the illusory effect. Adapted from Hoffman (1998).

provided by the Topographica simulator (Section 17.4), should make such models feasible, allowing perceptual systems to be understood at a new level of detail.

17.2.14 High-Level Influence on Perceptual Grouping

Although synchronization in V1 can explain many perceptual grouping phenomena, feedback from higher levels of visual processing also has an effect in many cases. An interesting question is whether synchronized activities exist in high-level visual and cognitive areas of the brain, and how they might influence low-level perception and behavior.

In fact, correlated spiking has been found in the frontal cortex of awake monkeys. Such spiking forms synfire chains, where a population of neurons firing synchronously activates another population in a successive, feed-forward manner (Abeles 1991; Abeles, Bergman, Gat, Meilijson, Seidemann, Tishby, and Vaadia 1995; Abeles, Bergman, Margalit, and Vaadia 1993; Vaadia, Haalman, Abeles, Bergman, Prut, Slovin, and Aertsen 1995). Further, these chains correlate with the behavioral states of the animal: In a delayed response task, different chains were observed leading to different responses (Prut, Vaadia, Bergman, Haalman, Slovin, and Abeles 1998). Through backprojections, synfire chains could also affect synchrony in low-level areas. For example, Sillito, Jones, Gerstein, and West (1994) showed that to achieve synchrony, the LGN needs feedback from V1. Such observations suggest that synchronized firing indeed exists in higher cortical areas, and it can influence processing in lower levels.

Further evidence for this idea comes from certain complex illusory contour phenomena. For example, even though the central regions in Figure 17.1a and b are identical, the region is salient only in b (the same is true of c and d). This phenomenon suggests that not all illusory contours are based purely on bottom-up activation of V1 or V2 neurons; feedback from higher levels influences them as well. In Figure 17.1a and c, the parallel contours cause the boundary elements to be recognized as objects at a higher level, and feedback from their representations suppresses the illusory contour effect.

At this point it is not clear whether the high-level representations are local, such as columns on a map, or distributed, i.e. patterns of activation spread across an area of cortex. If they are local, they can influence lower levels simply by projecting a broad, diffuse set of connections back to the lower level; the neurons receiving these connections will then synchronize with the high-level representation. On the other hand, if the high-level representations are distributed, each neuron may project back just a small, focused set of connections; when the high-level representation synchronizes, the back projections will synchronize the corresponding low-level representation (von der Malsburg 1999).

Computational experiments with PGLISSOM can help distinguish between these alternatives. The model will first have to be extended with high-level object representations, such as localist and distributed representations for parallel lines. With each alternative, the contextual cues can be varied and the synchrony emerging at the lower level observed. Such a model can lead to a computational account of how high-level objects are represented in the cortex, and of how grouping is affected by feedback from higher levels.

Similar computational experiments with PGLISSOM can also be used to understand other phenomena of high-level influence on perceptual grouping. For example, which afterimage is observed after viewing patterns like those in Figure 17.1 (i.e. the boundary elements or the central illusory object) depends on the high-level context (Shimojo, Kamitani, and Nishida 2001); this phenomenon could be modeled with a PGLISSOM model that combines high-level feedback with short-term adaptation (such as that responsible for the tilt aftereffect; Chapter 7). Similarly, integration of inputs from different sensory modalities involves feedback from higher areas, and can be modeled with PGLISSOM as discussed in the next section.

17.2.15 Multi-Modal Integration

The different sensory modalities are known to interact in the brain: For example, auditory processing and visual perception influence each other (Churchland, Ramachandran, and Sejnowski 1994; McGurk and MacDonald 1976; Repp and Penel 2002; Stein, Meredith, Huneycutt, and McDade 1989), and so do touch and vision (Bach-y-Rita 1972, 2004; Zhou and Fuster 2000). Once LISSOM is extended to include a hierarchy of sensory areas and feedback from higher levels, it can be used to test hypotheses about how such interactions take place.

One well-documented multi-sensory phenomenon is the coupled development of auditory and visual areas in the barn owl (Haessly, Sirosh, and Miikkulainen 1995; Knudsen and Knudsen 1985; Rosen, Rumelhart, and Knudsen 1995). The auditory spatial map in the inferior colliculus depends partially on the visual input the animal receives during development. A hierarchical LISSOM network can be extended to model this phenomenon by including auditory and visual channels converging on a higher level map. The low-level maps learn to represent the auditory and visual space, and the high level learns associations between the two modalities. With backprojections, the high-level map then correlates the low-level maps as well, resulting in coupled development of the two modalities.

Another important issue is how the sensory information in different modalities is integrated during performance. One possibility is that a higher level area performs the integration (Section 17.2.14; de Sa 1994; de Sa and Ballard 1997). Another possibility is synchronization: When two representations in different modalities are synchronized, they are perceived as part of a single experience. Although coherent oscillations have been found in sensory areas other than vision (such as the olfactory bulb and the auditory system; Eeckman and Freeman 1990; Friedrich and Laurent 2001; Joliot et al. 1994), it is not yet known whether the different modalities are bound together through synchronization. Computational models such as PGLISSOM can be instrumental in testing this hypothesis.

Integration studies require simulating multi-modal brain areas, i.e. regions that receive strong input from multiple sensory modalities. One such candidate is the posterior part of the intraparietal area (PIP), where integration of tactile and visual information has been observed in fMRI studies (Saito, Okada, Morita, Yonekura, and Sadato 2003). A PGLISSOM model of this system would include sensory areas, a higher level area representing the PIP, and feedback mechanisms similar to those discussed in Section 17.2.14. The model would be first validated by matching its spike activity with the fMRI data. Different inputs would then be systematically presented and the resulting synchronization in the PIP and the low-level maps observed. In this way, it would be possible to predict what kinds of representations are activated and synchronized during multi-modal integration tasks. These predictions could then be verified in biological experiments, using e.g. the techniques proposed in Section 16.4.7.

In addition to high-level cortical areas like the PIP, subcortical areas such as the thalamus can contribute to multi-modal integration (Crabtree, Collingridge, and Isaac 1998; Crabtree and Isaac 2002; Hadjikhani and Roland 1998; see Calvert 2001; Sherman and Guillery 2001 for reviews). For example, Choe (2002, 2003a,b, 2004) recently showed how the recurrent activation between the thalamus and different cortical areas can establish analogical mappings between cortical representations (see Jani and Levine 2000; Kanerva 1998 for general neural mechanisms of analogy). Such an approach can be extended to establish mappings between sensory modalities, such as those between orthographic and phonetic representations of words and sharp visual edges and high-pitch tones. Corticocortical connections between different sensory areas initiate such mappings, and the thalamus selects the most appropriate ones among the resulting activity. If the PGLISSOM architecture were to be extended to include the thalamocortical loop, synchronization between different sensory modalities could be used to represent the analogy. Such a model would lead to concrete predictions about how information in different modalities is represented and associated.

A number of other psychophysical phenomena involving multi-modal integration have been described as well (Meredith and Stein 1986; Stein and Meredith 1993). Computational models such as PGLISSOM can serve an instrumental role in formulating hypotheses about their neurobiological foundations. Such models constitute a significant step toward understanding the multi-modal, multi-level, and recurrent nature of the perceptual system.

17.3 New Research Directions

The LISSOM framework was developed to understand the visual system in computational terms. However, experience with it can serve to motivate research in other areas, and even suggest entirely new research directions. A number of such ideas are reviewed in this section, including theoretical analysis of visual computations, training realistic natural and artificial vision systems, and constructing innate capabilities and complex systems through interactions between evolution and learning.

17.3.1 Theoretical Analysis of Visual Computations

Computational experiments with LISSOM demonstrate how the visual cortex can develop and function based on a number of biologically motivated computational principles. The next step in this direction is to analyze the model theoretically, determining what its computational goals are and which of its mechanisms are necessary to achieve them. The two main directions are defining an objective function for LISSOM self-organization, and characterizing the capabilities of temporal coding.

The ultimate goal in theoretical neuroscience is to understand why the cortical structures exist, i.e. what their purpose and role in information processing is (Arbib, Érdi, and Szentágothai 1997; Barlow 1994; Churchland and Sejnowski 1992; Dayan and Abbott 2001; Field 1994; Hecht-Nielsen 2002; Marr 1982; Rao, Olshausen, and Lewicki 2002). With LISSOM, a crucial question is: What is the goal of the self-organizing process? In Chapter 14, the self-organized LISSOM network was shown to form sparse representations by reducing redundancy in the input, and these representations were found to be effective in further stages of information processing. This observation suggests that the goal of self-organization is to form structures that allow representing the visual input in an optimal manner, given the biological constraints. What exactly is the objective function, and what are the constraints?

One possibility is that the process optimizes the ability to reconstruct the input, i.e. maximizes the information retained in visual cortex representations. Alternatively, these representations could be optimized for the needs of further stages in visual processing, such as pattern recognition. The process could be constrained by physical resources in the cortex, such as wiring length, i.e. the extent and total strength of the lateral connections, the total amount of activation in each cortical representation, or the smoothness and continuity of cortical activation patterns (Bell and Sejnowski 1997; Chklovskii, Schikorski, and Stevens 2002; Hochreiter and Schmidhuber 1999; Koulakov and Chklovskii 2001; Olshausen and Field 1997).
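One such hypothesis can be made concrete with a toy objective (all names, the linear decoder, and the 1D map are illustrative assumptions, not the book's formulation): reconstruction error of the input from the cortical code, plus a wiring-length penalty that charges each lateral weight by the map distance it spans:

```python
import numpy as np

def objective(X, code, decoder, lateral, positions, lam=0.1):
    """Toy self-organization objective: mean squared error of
    reconstructing the inputs X from the cortical `code` through a
    linear `decoder`, plus a wiring-length penalty charging each
    lateral weight by the map distance it spans. Lower is better;
    `lam` trades reconstruction fidelity against wiring cost."""
    error = np.mean((X - code @ decoder) ** 2)
    dist = np.abs(positions[:, None] - positions[None, :])
    wiring = np.sum(np.abs(lateral) * dist)
    return error + lam * wiring

rng = np.random.default_rng(0)
X = rng.standard_normal((20, 6))        # input patterns
code = rng.standard_normal((20, 4))     # cortical responses to them
decoder = rng.standard_normal((4, 6))   # linear readout of the code
positions = np.arange(4.0)              # 1D map positions of the units
local = np.eye(4)                       # short-range lateral connectivity
remote = np.ones((4, 4)) - np.eye(4)    # long-range lateral connectivity
# Identical reconstruction error, but short wiring is cheaper:
print(objective(X, code, decoder, local, positions)
      < objective(X, code, decoder, remote, positions))  # True
```

Under such an objective, optimization should favor the patchy, mostly local lateral connectivity that self-organization produces in the computational model.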

Such hypotheses can be verified by constructing a mathematical model with the proposed objective function and constraints, and showing that optimizing it results in processes and structures similar to those in the computational model (Wiskott