
366 16 Discussion: Biological Assumptions and Predictions

model also leads to several predictions verifiable in traditional plasticity experiments. These predictions are briefly reviewed below; the details are discussed in Chapter 6.

A retinal scotoma changes the dynamic equilibrium of the cortex, and the cortical neurons shift their receptive fields and adjust their lateral connections as a result. The model predicts that if the scotoma is large enough that the neurons responding to its center no longer receive afferent input, their receptive fields should remain unchanged; otherwise, the receptive fields should shift outward and the blind spot should disappear. Second, according to the model, the unstimulated neurons expand their receptive fields because they receive less inhibition from the surrounding neurons. If inhibition were blocked before and after inducing the scotoma (e.g. with bicuculline), such an expansion should not occur. Third, the orientation map reorganizes to represent the boundaries of the scotoma, and these changes are integrated into a smoothly varying map structure.

Similar reorganization occurs after a cortical lesion. Because the neurons in the lesioned area no longer inhibit the neurons right outside it, these surrounding neurons begin to respond to inputs to the lesioned area, and the loss of function is smaller than would be expected. The inhibitory connections between these surrounding neurons gradually increase, and their responses to such inputs become weaker, temporarily increasing the loss of function. Over a longer time, however, the receptive fields of these neurons shift inward, regaining some of the lost representations. The model predicts that neurons with orientation preferences perpendicular to the lesion boundary (and therefore with most of their lateral connections coming from the lesioned area) shift the most, whereas those with parallel preferences are less affected. This prediction could be checked with standard recording and imaging techniques.

The model further predicts that the extent of surviving lateral excitation determines how much of the function is regained. If the neurons on the opposite sides of the damaged area are linked with excitatory connections, the RFs can reorganize as neighbors and take over the lost function completely. Therefore, in early development, when the lateral excitatory radius is large and the connections are less patchy (Callaway and Katz 1990; Dalva and Katz 1994), it should be possible to compensate for larger lesions. The prediction can be checked by inducing lesions at various stages of development of the animal and comparing the extent of reorganization. The model also suggests two mechanisms for speeding up recovery from such lesions. First, blocking inhibition in the perilesion area after the lesion should eliminate the regressive phase, and allow the receptive fields to shift inward faster. Second, it should be possible to hasten the shift by stimulating the area extensively with inputs originally represented by the damaged area.

If successful, such experiments would provide convincing evidence that the cortex is a continuously adapting system in a dynamic equilibrium with internal and external input, as suggested by the LISSOM model.

16.4.5 Internal Pattern Generation

The main principle of the HLISSOM model is that the shape and distribution of internally generated activity patterns determine how the maps and connections in the visual cortex develop in early life. Further, the model predicts that differences in these patterns serve a purpose in information processing. If the internally generated patterns differ across species, they should have different cognitive and sensory-processing abilities as a result.

Internally generated activity has not yet been characterized in sufficient detail to constrain modeling efforts. However, such constraints could possibly be derived indirectly, based on computational grounds. For example, the results in Section 9.2 show that different types of internally generated patterns will result in different types of V1 receptive fields. Uniformly random noise tends to lead to four-lobed RFs, whereas disks alone result in two-lobed RFs. Although orientation maps have been measured in young animals, very little data are available about what receptive fields these neurons might have. In future experiments, both of these structures could be measured at the same time, so that it will be clear what types of RFs lead to orientation selectivity at different stages of map development. Such data should allow narrowing the range of possible models of orientation map development, e.g. by rejecting either those based on uniformly random noise or those based on retinal waves.

In humans, one of the most intriguing predictions is that the internally generated activity should have a specific spatial structure. In Section 10.2.6, a number of training patterns that matched the size of a face and several of its components were shown to result in a preference for faces, as seen in newborn humans. On the other hand, many other patterns did not, including single dots and pairs of dots. The model therefore predicts that in humans, at least some of the internal patterns belong in the first category. As was mentioned in Section 16.2.2, recent advances in imaging equipment may make such measurements possible in the near future (Rector et al. 1997).

Perhaps the most convincing test of the pattern generation idea would be to modify the internally generated activity in animals, measuring whether their brains develop differently in a systematic way. As was discussed in Section 2.3.3, some tests of this type have already been performed, measuring effects on the retina and the LGN. For instance, pharmacological agents can be used to make the retinal wave patterns larger, faster, and more frequent (Stellwagen and Shatz 2002). When the agent is applied to one eye, the LGN layer corresponding to that eye becomes larger. The effect is not simply due to increased metabolic activity or other nonspecific mechanisms, because no changes are seen when the agent is applied to both eyes at once. That is, the pattern of activity determines how the LGN develops, not simply the total amount of activity.

The effects of these manipulations have not yet been measured at the V1 level. HLISSOM predicts that faster waves would make neurons more direction selective, perhaps even resulting in a full direction map at eye opening (unlike in normal dark-reared animals; White and Fitzpatrick 2003). If the waves were made smaller than the RFs, the V1 neurons should become less selective for orientation, because they would not have seen large oriented edges during development. Once PGO wave patterns have been measured, similar modifications could be performed on them, leading to similar results. Each change in the pattern can be simulated in HLISSOM, generating a specific, testable prediction about how the change will affect V1.

In the long term, detailed analysis of the pattern-generating mechanisms in the brainstem could provide clues to how the pattern generators are specified in the genome. These studies would also suggest ways for representing the generator effectively in a computational evolutionary algorithm, which is currently an open question. Studies of genetics and phylogeny, perhaps paired with computational simulations, could then possibly determine whether pattern generation was crucial in producing complex organisms that could adapt to the environment postnatally. Such studies could be instrumental in understanding the evolutionary origins of complex abilities and adaptation in higher animals.

16.4.6 Face Processing

In Chapter 10, the pattern generation hypothesis was extended to explain how face preferences in human infants could arise, and how they could change during early life. The model leads to several concrete predictions that can be tested in psychological experiments with human infants.

For instance, HLISSOM suggests that the precise shape of the face outline is not crucial for newborn face preferences. This prediction can be tested in human infants by presenting facelike patterns with a variety of outline shapes; the model predicts that the response will be similar regardless of the shape of the outline. Similarly, HLISSOM predicts that newborns will prefer facelike patterns to ones where the eyes have been shifted over to one side. This prediction contrasts with the alternative “top-heavy” explanation for face preferences, and can be tested in experiments similar to those for the effect of the outline.

A second set of predictions concerns the preferences that develop postnatally, with exposure to real faces. Pascalis et al. (1995) showed that covering the hair outline was sufficient to suppress an infant’s preference for his or her mother. HLISSOM predicts that covering the facial features will have a similar effect, i.e. that newborns learn faces holistically. This prediction can be tested with human newborns, using methods like those of Bartrip et al. (2001) and Pascalis et al. (1995). HLISSOM also predicts that older infants will continue to prefer realistic face stimuli, such as photographs or video of faces, even though they stop preferring schematic faces in the periphery by 2 months of age.

Nearly all of the proposed experiments can be run using the same techniques already used in previous experiments with human infants. They will help determine how specific the newborn face preferences are, and clarify what types of learning are involved in postnatal development of these preferences.

16.4.7 Synchronization

The PGLISSOM model shows that neural synchrony can serve as a foundation for perceptual representations in the visual cortex. It may be possible to verify this idea in future biological experiments, and also demonstrate how such representations are constructed and maintained.

First, the contour integration experiments in Chapter 13 suggest that how strongly and reliably the contour is perceived depends on how strongly synchronized the neural representations of the contour elements are. If synchronization were to be disrupted artificially, the perception of the contour should disappear. Such an experiment might be possible in the future using transcranial magnetic stimulation (TMS; Barker, Jalinous, and Freeston 1985; Hallett 2000; Lancaster, Narayana, Wenzel, Luckemeyer, Roby, and Fox 2004; Walsh and Cowey 2000). While the usual effect of TMS is to excite or inhibit an area of neurons temporarily, it might be possible to affect only their synchronization in three ways: (1) TMS could be applied to a related visual area, which could indirectly disrupt synchronization in the area of study; (2) with very low intensity, TMS could have a subthreshold excitatory effect, causing the target group of neurons to fire more rapidly and out of synchrony with other neurons; and (3) high-frequency repetitive TMS (1–60 Hz rTMS; George, Wassermann, Williams, Steppel, Pascual-Leone, Basser, Hallett, and Post 1996) at low intensity could be applied at the same or different phases at two separate sites, driving the firing of the neurons and inducing artificial synchrony or desynchrony. Applying such techniques or some combination of them while the subject is performing a perceptual grouping task should degrade his or her performance in the contour integration task without affecting the perception of the elements themselves.

Such methods could perhaps be even more illuminating and reliable if paired with simultaneous electro-encephalogram (EEG) recordings that measure the changes in brain activity resulting from the TMS (Ilmoniemi, Virtanen, Ruohonen, Karhu, Aronen, Näätänen, and Katila 1997). If TMS is found to result in a particular perceptual change and at the same time a characteristic change in the EEG, it might even be possible to develop a way to measure synchrony directly from the EEG. Such techniques would be highly useful in verifying whether synchronization indeed is the underlying representation for perceptual grouping.

Second, the results in Section 12.2.1 show that adapting the PSP decay rate is a possible way to modulate synchronization behavior in neurons. The effects of decay are similar to those of adjusting the delay of signal propagation between neurons, which has been proposed before as a possible synchronization mechanism (Eurich et al. 2000; Gerstner 1998a; Horn and Opher 1998; Nischwitz and Glünder 1995; Tversky and Miikkulainen 2002). At this point there is no conclusive biological evidence to support either process. However, although axonal morphology can change over time and affect the delay, it would be difficult to make such changes fast and accurately enough (Eurich et al. 1999; Stevens et al. 1998). On the other hand, PSP decay may be easier to adjust, for example by controlling the properties of ion channels in the dendrites. It is therefore a good candidate for modulating synchrony.

Experimental techniques exist for measuring the various sources of temporal lag in the neuron (Nowak and Bullier 1997). Using similar techniques, it may be possible to verify whether the dendritic membrane potential decays at different rates at different locations in neurons, and also whether the rates depend on the synapse type (such as glutamatergic vs. GABAergic). If two neurons far apart are found to synchronize despite a long delay, the model predicts that their decay rates could be compensating for the delay. Further, using voltage or current clamping, it might be possible to control the decay of the PSP, and measure the effect on synchronization. In this way, it might be possible to verify that PSP decay rate could have a modulating effect.
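The intuition behind decay-rate compensation can be sketched numerically (the time constants below are illustrative, not measured values): with a slow PSP decay, two spikes separated by a long delay still summate, whereas with a fast decay the first PSP has largely vanished before the second spike arrives.

```python
import numpy as np

def psp(t, t_spike, tau):
    """Exponentially decaying postsynaptic potential evoked at t_spike."""
    dt = t - t_spike
    return np.where(dt >= 0, np.exp(-np.maximum(dt, 0.0) / tau), 0.0)

t = np.arange(0.0, 50.0, 0.1)  # time in ms
# Two presynaptic spikes arriving 10 ms apart, e.g. due to a long axonal delay.
fast = psp(t, 0.0, tau=3.0) + psp(t, 10.0, tau=3.0)    # fast decay
slow = psp(t, 0.0, tau=15.0) + psp(t, 10.0, tau=15.0)  # slow decay

# With slow decay the first PSP is still present when the second spike
# arrives, so the potentials summate; with fast decay they barely overlap.
print(fast.max(), slow.max())
```

The peak summed potential is substantially higher in the slow-decay case, which is the sense in which a slower decay can compensate for a longer delay.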

Third, PGLISSOM simulations showed how a longer absolute refractory period could help synchronize neural activity even under noisy conditions (Section 12.4.3). Such an effect has not been demonstrated biologically, but it is nevertheless interesting to speculate how it could be utilized in biological systems. Given that homeostatic mechanisms have been found for regulating various other aspects of neuronal behavior (Turrigiano 1999), perhaps their refractory periods adapt as well. Synchronization would then be possible over different ranges of noise and variability. This hypothesis could be tested by changing the noise environment of a synchronized group of neurons and observing whether the refractory periods change to compensate.

Even if the refractory periods did not adapt dynamically, they could have been adapted through evolution. Assuming that synchronized activity will be found in different parts of the brain and in different species, the hypothesis could be tested by measuring the refractory periods of neurons located in different noise environments. Neurons operating under significant noise should have longer refractory periods than those where the signal is relatively free of noise. Further, it might be possible to strengthen synchronization artificially in such neuron populations. By artificially lowering the membrane potential immediately following a spike, the refractory period could be increased, which should result in more highly synchronized neuron activities in noisy conditions.
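The role of the refractory period can be illustrated with a toy spiking simulation (all parameter values are hypothetical): a population of neurons receives a strong common drive every 20 ms plus occasional suprathreshold noise events. A refractory period nearly as long as the drive period blocks most noise-driven spikes between volleys, so a larger fraction of all spikes fall on the common beat.

```python
import numpy as np

def simulate(refractory, period=20.0, t_end=2000.0, n=100,
             noise_rate=0.02, seed=0):
    """Fraction of spikes that fall on the common drive ('synchronized')."""
    rng = np.random.default_rng(seed)
    last_spike = np.full(n, -np.inf)
    sync = stray = 0
    for t in np.arange(0.0, t_end, 1.0):        # 1 ms time steps
        ready = (t - last_spike) >= refractory  # out of refractory period
        if t % period == 0:
            # Strong common input: every non-refractory neuron spikes.
            sync += int(ready.sum())
            last_spike[ready] = t
        else:
            # Noise alone occasionally drives a spike in ready neurons.
            noisy = ready & (rng.random(n) < noise_rate)
            stray += int(noisy.sum())
            last_spike[noisy] = t
    return sync / (sync + stray)

# A refractory period almost as long as the drive period filters out most
# noise-driven spikes, keeping the population on the beat.
print(simulate(refractory=2.0), simulate(refractory=18.0))
```

In this sketch the long refractory period trades a small number of missed volleys for a large reduction in stray spikes, which is the qualitative effect observed in the PGLISSOM simulations of Section 12.4.3.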

Fourth, fast adaptation of lateral excitatory connections in PGLISSOM promotes synchrony among the connected regions. Such adaptation has been proposed to be crucial for dynamic feature binding in general (Crick 1984; Sporns et al. 1991; von der Malsburg 1981, 2003; Wang 1996), and similar short-term plasticity has recently been observed in rat prefrontal cortex (Hempel, Hartman, Wang, Turrigiano, and Nelson 2000). However, it is not clear whether such plasticity depends on coinciding presynaptic and postsynaptic activity, and whether it indeed facilitates synchrony. To answer these questions, it will be necessary to measure plasticity and synchronization in the same experiment, perhaps by combining the techniques used by Hempel et al. (2000), Eckhorn et al. (1988), and Gray et al. (1989). Since short-term plasticity is believed to depend on Ca2+ channels (Fisher, Fisher, and Carew 1997; Hempel et al. 2000; Regehr, Delaney, and Tank 1994), it might also be possible to test the hypothesis directly by disabling short-term plasticity with Ca2+ antagonists such as nifedipine, or with Ca2+ channel blockers such as diltiazem (Jensen and Mody 2001). If synchronization becomes less robust, this would suggest that fast adaptation indeed plays a role in establishing synchrony.

Such investigations of cellular mechanisms can lead to a deeper understanding of perceptual representations, including how neural synchrony occurs and how fine tuning of temporal behavior is possible.

16.4.8 Perceptual Grouping

The PGLISSOM model shows how input-driven self-organization can result in different perceptual grouping performance across the visual field (Section 13.4). This result leads to interesting predictions about the statistics of visual input and the anatomical foundations of perceptual grouping, as well as the role of internal pattern generation in the initial construction of the system. The assumption that grouping and self-organization are based on the lateral connections of neurons in different layers of the cortex may also be tested experimentally.

First, the PGLISSOM model can explain the observed performance differences assuming that the visual input statistics differ in specific ways across the visual field. This assumption should be verified in direct image measurements. Using an eye-tracking method similar to that of Reinagel and Zador (1999), statistics about edge frequency, curvature, and occlusion can be collected in different regions of the visual field. To validate the model, edges and occlusions should be more frequent and curvature higher in the fovea vs. periphery and in the lower vs. upper hemifield. Further, the same method could be used to measure other input properties, such as color, contrast, and spatial frequency. The results should help characterize the structure of visual scenes and understand how visual attention biases the sensory statistics, forming a foundation for further computational modeling and psychophysical experiments on grouping performance.
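As a rough sketch of such a measurement, edge frequency in an image region can be estimated as the density of high-gradient pixels (the gradient operator, threshold, and random test image below are illustrative stand-ins, not the method of Reinagel and Zador):

```python
import numpy as np

def edge_density(patch, threshold=0.2):
    """Fraction of pixels whose gradient magnitude exceeds a threshold,
    a crude stand-in for edge frequency in an image region."""
    gy, gx = np.gradient(patch.astype(float))
    mag = np.hypot(gx, gy)
    return float((mag > threshold * mag.max()).mean()) if mag.max() > 0 else 0.0

# Hypothetical fixation image; in a real study the patches would come from
# eye-tracked natural scenes, split by visual-field region.
rng = np.random.default_rng(1)
image = rng.random((200, 200))
upper, lower = image[:100, :], image[100:, :]
print(edge_density(upper), edge_density(lower))
```

Comparing such statistics between upper and lower hemifield patches, and between foveal and peripheral patches, would directly test the input assumption of the model.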

Second, PGLISSOM predicts that if different areas of the visual cortex receive different inputs during development, they will develop different patterns of lateral connections. These patterns in turn cause perceptual performance to differ in corresponding areas of the visual field. Such predictions can be tested experimentally by manipulating the training inputs: Lateral connectivity can be measured in upper vs. lower hemifield and fovea vs. periphery to see whether there are any differences between these areas as the model suggests (Section 13.4.2); or, the cause of the differences can be experimentally tested by rearing animals in controlled visual environments where the input distributions in these areas are systematically altered.

For example, animals can be fitted with eyeglasses that flip the input to the upper and lower hemifield. After the critical period, the connectivity patterns and contour integration in the lower and upper hemifield can be measured and compared with normally reared animals. If connectivity is genetically determined, there should be no noticeable difference from the control animals. In contrast, PGLISSOM predicts that more cocircular connectivity and better contour integration would develop in the upper hemifield instead of the lower hemifield. Such an experiment would verify both that contour detection depends on the connectivity and that the connectivity can be explained as an effect of input-driven self-organization. Finding such evidence would be an important step toward understanding how functional differences arise in the visual cortex.

Third, PGLISSOM can be brought together with internal pattern generation to suggest how prenatal and postnatal self-organization each contribute to constructing an effective contour integration circuitry. As was discussed in Section 13.2.2, essential for contour integration is that the lateral connections have a cocircular profile, matching the distribution of edges in natural images. Interestingly, PGLISSOM does not have to be trained with natural images to obtain such connectivity: It also develops with oriented Gaussian inputs. Such inputs might indeed be available during the prenatal development of the visual cortex: After the retinal waves are filtered by

372 16 Discussion: Biological Assumptions and Predictions

the LGN, they would appear as collections of smooth, oriented, and locally straight Gaussians to the developing V1 (Section 16.2.1). Although such Gaussians would be relatively short, they should be enough to develop a rudimentary contour integration ability prenatally. The circuitry would then be further refined postnatally with visual inputs, resulting in adult performance.

While it would be difficult to measure contour integration performance in newborns, it might be possible to measure synchrony in animals using multi-cellular recording techniques (Eckhorn et al. 1988; Gray et al. 1989). The prediction could then be verified by presenting contours to animals of different ages and comparing the extent of synchronized groups that form. Such an experiment would provide further evidence for prenatal self-organization, and lead to a precise understanding of how the ability to integrate contours develops.

Fourth, the model suggests that due to the opposite requirements for grouping and self-organization, functional differences might exist in the layered architecture of the visual cortex (Section 16.3.3). This prediction is difficult to verify with current experimental techniques, but it might become feasible in the future. For example, if deep or shallow layers could be selectively disabled, or the intracolumnar connections between the layers disrupted, their functions would become decoupled. Progress of development and the ability to form synchronized groups could then be measured separately in the deep layers and compared with that in the shallow layers. The PGLISSOM model predicts that the shallow layers would not properly self-organize and the deeper layers would not synchronize. Such an experiment would lead to significant insights into how cortical structures implement function.

16.4.9 Sparse Coding

One of the important predictions of the LISSOM model is that the inhibitory self-organized lateral connections decorrelate, reducing redundant activity and producing a sparse coding of the visual input (Sections 14.2 and 17.3.1). Such representations are efficient in energy and space, allowing more inputs to be represented with fixed resources, and also form an effective foundation for later stages of visual information processing.

In LISSOM, sparse representations emerge as a necessary component of the self-organizing process: The responses are focused into local neighborhoods so that distinct representations for the different parts of the input space can be formed. Indirect evidence for such focusing already exists from direct recordings of cortical activity. Pei, Vidyasagar, Volgushev, and Creutzfeldt (1994) found that the orientation selectivity determined from excitatory PSPs (and not action potentials) became increasingly sharper with time. Consistent with the LISSOM model, they independently attributed this phenomenon to intracortical excitation and inhibition.

Sparse coding implies that the information about a given visual image is represented by a small number of neurons, and conversely that each neuron conveys a significant amount of information about the input. The LISSOM network indeed focuses the initial activation in a few iterations to the best-responding neurons, and it is possible to identify what type of stimulus would have caused it to be active. Such a process is consistent with the response properties of neurons. The neurons divide up the task of representing the input space, and respond only to particular sharply defined regions of it. For a typical neuron in the primary visual cortex, a response of 10 action potentials following one brief stimulus presentation was found to be sufficient to classify the stimulus into a relatively small region in stimulus space, with a high degree of confidence (Geisler and Albrecht 1995).

With optical imaging techniques, it should also be possible to verify the sparse coding prediction directly. The time resolution of current imaging techniques (e.g. Senseman 1996) should be sufficient to observe how the sparseness of cortical activity changes in response to visual input, presumably due to the recurrent lateral interactions in the cortex. Sparseness can be quantitatively estimated by calculating the kurtosis of the activity (Barlow 1972; Field 1994), as was done with the LISSOM experiments of Chapter 14. For any given input, the kurtosis should increase with time as the activity becomes more focused and stable.
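The predicted trend, kurtosis rising as recurrent settling focuses the activity, can be illustrated with a toy simulation (the interaction kernels and gains below are illustrative, and much simpler than the LISSOM equations):

```python
import numpy as np

def kurtosis(a):
    """Kurtosis of an activity pattern; higher values mean sparser activity."""
    z = (a - a.mean()) / a.std()
    return float((z ** 4).mean())

# Broad initial response bump over a row of 100 model neurons.
x = np.arange(100)
activity = np.exp(-((x - 50.0) ** 2) / (2 * 15.0 ** 2))
k_before = kurtosis(activity)

# Settling: short-range excitation, longer-range inhibition, then rectify.
for _ in range(10):
    excite = np.convolve(activity, np.ones(5) / 5, mode="same")
    inhibit = np.convolve(activity, np.ones(41) / 41, mode="same")
    activity = np.maximum(0.0, activity + 0.5 * excite - 1.5 * inhibit)

k_after = kurtosis(activity)
print(k_before, k_after)  # kurtosis should rise as the response focuses
```

The same statistic, applied to optical imaging frames at successive times after stimulus onset, would test whether cortical responses sharpen in the way the model predicts.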

Any manipulation that affects the lateral interactions should affect the sparse code. Suppressing lateral inhibition, for example, would be expected to result in less sparse responses. It should be possible to apply an inhibitory blocker, such as bicuculline, and measure the kurtosis of cortical activity through optical imaging. If lateral inhibition plays a significant role in producing the sparse code, the less inhibited responses should be less sparse. Another experiment could focus on selectively preventing the long-range lateral connections from self-organizing during development. In such a case, the kurtosis of cortical activity should be much lower than in the normal case. As a consequence, neurons should also be far less selective for orientation and spatial frequency. The experiment would show that the activity coding in the cortex closely depends on the organization of lateral connections. The last step would be to verify that the sparse representations indeed constitute an advantage in information processing. Such experiments are difficult to design and conduct conclusively; however, if it is possible to prevent proper self-organization of lateral connections during development as outlined above, such an animal should have an impaired ability for visual pattern recognition due to a less efficient coding of visual inputs.

While sparse coding is currently a computational hypothesis consistent with biological data, the above experiments in the near future could establish it as a fundamental information processing principle in the visual cortex.

16.5 Conclusion

The LISSOM framework is based on a number of assumptions about the connectivity, adaptation, and temporal coding in the cortex. Although there is not yet sufficient evidence to verify these assumptions completely, they are plausible, and lead to testable predictions. The next chapter will discuss how LISSOM can be extended to explain new phenomena, and how it can serve as a starting point for new research directions.

17

Future Work: Computational Directions

The research discussed in this book can serve as a starting point for a wide variety of future computational investigations. Several such studies continue the research described in the individual chapters, and have already been discussed in those chapters. Others are new directions, and will be described in this chapter. First, the LISSOM architecture can be extended in several ways to match biological systems in more detail. Second, LISSOM models can be built to understand several new visual phenomena, both in V1 and in other visual areas. Third, LISSOM can be used as a starting point for new research directions, including modeling other cortical areas and building artificial vision and other information processing systems. The Topographica simulator, a publicly available computational tool for modeling cortical maps, is designed to make these future extensions practical to simulate and understand, as described at the end of this chapter.

17.1 Extensions to the LISSOM Mechanisms

The LISSOM equations and architecture are designed to account for a large fraction of the experimental results on the development and function of the primary visual cortex, while being as clear and parsimonious as possible. To more closely match specific experimental results, more complex mechanisms can be included in the model, as described in this section. These extensions should not change the fundamental principles and results of the model, only make it biologically less abstract.

17.1.1 Threshold Adaptation

One of the most important components of any detection or recognition system is the activation threshold. If set properly, the threshold allows the system to respond to appropriate inputs while ignoring low-level noise and other nonspecific activation. For instance, the threshold can allow an orientation-selective neuron to respond to stimuli near its preferred orientation, and not to other orientations. In terms of signal detection theory, the threshold is a balance between the false alarm rate and the miss rate; in general this tradeoff is unavoidable (Egan 1975). Thus, it is important to set the threshold appropriately.

In LISSOM, the activation thresholds (θl and θu in Equation 4.4) are set by hand through trial and error. This procedure can be seen to correspond to evolutionary adaptation, which results in animals that are born with appropriate threshold values. However, biological systems make use of threshold adaptation as well. Homeostatic regulation mechanisms in general have recently been discovered in a variety of biological systems (see Turrigiano 1999 for a review). Some of these regulatory mechanisms are similar to the weight normalization already used in LISSOM (Equation 4.8). Others more directly adjust how excitable a neuron is, so that its responses will cover a useful range (i.e. neither always on nor always off). In particular, Azouz and Gray (2000) found the neuron’s firing threshold to be proportional to the rate of depolarization in the PSP: When the PSP rises rapidly, neurons automatically raise their thresholds.

An automatic method of setting thresholds would also be computationally desirable for several reasons: (1) It is time consuming for the modeler to find an appropriate threshold value; (2) the threshold setting process is subjective, which often prevents rigorous comparison between different experimental conditions (e.g. to determine what features of the model are required for a certain behavior); (3) different threshold settings are needed even for different types of input patterns, depending on how strongly each activates the network (e.g. high-contrast schematic patterns vs. low-contrast real images); and (4) the optimal threshold value changes over the course of self-organization (because weights become concentrated into configurations that match typical inputs, increasing the likelihood of a response).

A first step in this direction is the mechanism used in the PGLISSOM simulations of Chapter 11, where the spiking threshold θb was set automatically based on input activity. A fixed percentage, usually 50 to 65%, of the maximum input activity over the map was set as the threshold at the beginning of each settling iteration (Appendix D). Another approach is to increase the threshold if the neuron has been highly active in the past, and decrease it if not (Burger and Lang 2001; Gorchetchnikov 2000; Horn and Usher 1989). With such mechanisms, the map responds to all inputs with roughly the same level of total activation. However, this level still needs to be adapted by hand during self-organization to make sure that the responses become gradually sparser. How to do this automatically, perhaps utilizing regulation mechanisms similar to those observed in biology, is currently an open question.
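The two threshold-setting rules just described can be sketched as follows (the function names are hypothetical, and the target and rate values in the second rule are illustrative):

```python
import numpy as np

def percentage_threshold(inputs, fraction=0.6):
    """PGLISSOM-style rule: the threshold is a fixed fraction of the maximum
    input activity over the map at the start of each settling iteration."""
    return fraction * float(np.max(inputs))

def adapt_threshold(theta, recent_activity, target=0.1, rate=0.01):
    """History-based rule: raise the threshold when the unit has been more
    active than a target level, lower it otherwise."""
    return theta + rate * (recent_activity - target)

inputs = np.array([0.1, 0.5, 0.9, 0.3])
theta = percentage_threshold(inputs)
print(theta, inputs > theta)  # only the strongest unit exceeds the threshold
```

Both rules keep the total response at a roughly constant level across inputs; neither, as noted above, yet solves the problem of automatically sparsifying the responses over the course of self-organization.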

Adding automatic mechanisms for setting the thresholds would make the model significantly more complex mathematically. Even so, it would make the model biologically more realistic, and also much simpler to use in practice, particularly with real images.

17.1.2 Push–Pull Afferent Connections

Neurons in adult V1 retain their selectivity over wide variations in contrast; how strongly they respond depends primarily on how well the input matches their preferred features such as orientation, ocularity, and direction (Sclar and Freeman 1982).