Computational Maps in the Visual Cortex (Miikkulainen, 2005)

2.4 Temporal Coding



Fig. 2.10. Synchronization of one and two input objects in the cat. Moving bars of light were presented at two locations in the cat visual field where the receptive fields had no overlap, and the level of synchronization in the corresponding areas of the visual cortex was measured. (a) A single light bar moving across two receptive fields results in strong synchronization between the two neuronal populations. (b) Two separate bars moving in opposite directions result in no synchronization. (c) Two separate bars moving in the same direction result in weak synchronization. These results suggest that synchronization may indeed represent how likely the inputs are to belong to one and the same object. Adapted from Gray et al. 1989.

1993). These results suggest that synchronized firing of distant populations of neurons may represent the percept of a single coherent object, and desynchronized firing that of separate objects.

Another piece of evidence for synchronization-based grouping was obtained by manipulating the temporal properties of the visual input. Usher and Donnelly (1998) presented displays in which the object to be detected and the background were either flashed in synchrony (object and background blinking at the same time) or flashed asynchronously (object and background blinking at different phases) over a period of time. The time scale of the flashing was shorter than the integration time of the visual system, so the flashing could not be consciously perceived. Given such input, the subjects were asked to identify in which of four areas of the background the object appeared. The percentage of correct responses turned out to be consistently higher when the object and background were flashed asynchronously, and it increased as the phase difference between the flashing of object and background was increased. The explanation was that the timing of the inputs changed the temporal properties of neuronal firing, which in turn changed detection performance: Flashing the object and background at different times would cause a slight phase shift between the neurons representing them, and this shift helped distinguish the object from the background. Similar results have been reported by Fahle (1993), Lee and Blake (1999, 2001), Leonards and Singer (1998), Leonards, Singer, and Fahle (1996), Meyerson and Palmer (2004), Palmer (1999), and Wehrhahn and Westheimer (1993).

These results suggest that synchronization may indeed signal coherence in neural representations. The next issue is how this synchronization is represented, i.e. what exactly is synchronized in the representation.

36 2 Biological Background

2.4.3 Modes of Synchronization

There are two ways in which synchrony can occur: (1) individual neurons can fire at the same time, and (2) population activity, i.e. the number of neurons in the population firing per unit time, can oscillate in synchrony. Population oscillations are more general and include synchronized firing as a special case. They are also biologically the more likely candidate, for several reasons.

Due to the stochastic nature of neuronal firing, it seems unlikely that individual neurons could synchronize their actual firing events. However, they could fire within a short time window so that the spikes are approximately aligned, and the whole group could exhibit synchrony (Lisman 1998; Menon 1990). Theoretical analysis also suggests that the oscillations found in the cortex result from a collective behavior of neurons. Such population oscillations are more robust and tolerant of random fluctuations (Menon 1990; Wilson and Cowan 1972).

In direct multi-electrode measurements, Eckhorn et al. (1988) discovered that synchrony in individual neurons is hard to find even when the number of units firing and the local field potential show coherent oscillations, suggesting that population oscillation is the major mode of operation for binding of percepts. There is also indirect experimental evidence to support this hypothesis. When two almost simultaneous clicks are presented to a subject, they are initially heard as a single click, but as the interval between the clicks increases, the subject starts hearing two clicks instead of one. Interestingly, this transition from one click to two clicks occurs exactly at the frequency of population oscillations (Joliot, Ribary, and Llinás 1994). Apparently, the neuronal firing events within a single oscillation cycle are bound together even though the exact timing does not match, whereas the firings that occur in different cycles are perceived as separate.

For these reasons, most of the synchronizing models, including the model in this book, adopt the definition of synchrony in terms of population oscillations rather than that of individual neurons. Synchronization will be used in LISSOM to explain binding and segmentation phenomena, especially contour integration performance in humans. The detailed psychophysical evidence for these phenomena will be reviewed in Chapter 13.

2.5 Conclusion

Although the structure of the visual cortex has been well understood for several decades, the dynamic processes that develop this structure, maintain it, and represent visual information are still not well known. Lateral connections are believed to play a large role in all these processes, which may involve a synergy of nature and nurture through self-organization based on internal pattern generators, and may achieve binding and segmentation through synchronization of activity.

Whereas such hypotheses are difficult to verify directly on biological systems, they can be implemented in computational models. Computational tests can lead to concrete predictions, and through further experiments, to a thorough understanding


of the mechanisms underlying visual perception. Such an understanding is the main goal of this book. The computational principles and approaches on which it is based will be outlined in the next chapter.

3

Computational Foundations

As seen in the previous chapter, the visual system is a highly complex dynamical system, and it is difficult to integrate the scattered experimental results into a specific, coherent understanding of how the system is constructed and how it functions. A computational model provides a crucial tool for such integration: It constitutes a concrete implementation of the theory. Because all of its components must be implemented for the model to work, unstated assumptions must be made explicit. The model then shows what types of structure and behavior follow from those assumptions. The model can be tested just like animals or humans can, either to validate the theory or to provide predictions for future experimental tests.

This book introduces a comprehensive computational model of the visual cortex, built on findings from the past 30 years of research in computational neuroscience. These computational foundations are reviewed in this chapter, including the general models of neural computation, temporal coding, adaptation, and self-organizing maps.

3.1 Computational Units

A crucial issue in any computational model is what the appropriate level of abstraction is. Although in theory we could dissect and model each neuron at the smallest level of detail allowed by current technology (i.e. at the molecular level), in practice only a few measurements of the system parameters would be available at that level, and the resulting model would be largely underconstrained. Superfluous detail can also make it difficult to understand the model and to generate predictions based on it. Fortunately, such detailed simulations are often unnecessary for understanding high-level behavior, and more efficient abstractions can be used.

In this section, the models of computation in neurons and neuronal groups are reviewed at various levels of abstraction, evaluating which of their properties will be useful for understanding how the visual cortex develops and functions. Detailed models of the neuron are reviewed first, followed by gradually higher level abstractions (shown in Figure 3.1). As was discussed in Section 2.1.2, a cortical column is


an appropriate computational unit for visual cortex models; the conclusion from this section is that integrate-and-fire and firing-rate models of the cortical column most efficiently capture the properties needed to understand their collective behavior.

3.1.1 Compartmental Models

Because neurons are generally believed to communicate through action potentials, or spikes, the most detailed models of the neuron focus on how spikes are generated and transmitted. The electrical currents that lead to spike generation are controlled by various ion channels in the fatty (lipid) membrane that encloses the cell body. The voltage across the membrane (i.e. the membrane potential) changes as ions flow in and out of the neuron through these channels, and it is this voltage that determines whether the neuron generates a spike. Such dynamic change in state over time gives the neuron rich temporal dynamics that can be used to encode information.

Understanding the behavior of a neuron begins by modeling a small patch of the neuron membrane. Computational models of such patches are usually based on the Hodgkin–Huxley model of excitable membranes (Hodgkin and Huxley 1952). It consists of coupled differential equations for the membrane potential V and the fraction ci of ion channels open for each channel type i (Gerstner and Kistler 2002; Rinzel and Ermentrout 1998):

C dV/dt = −Σi Ii(ci, V) + I(t),

dci/dt = −[ci − ci,∞(V)] / τi(V),

(3.1)

 

where C is the membrane capacitance, Ii(ci, V) is the current through ion channel i, and I(t) is the externally applied input current. For a fixed membrane potential V, ci approaches the steady-state level ci,∞(V) with the time constant τi(V).
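As a concrete illustration, Equation 3.1 can be integrated numerically. The sketch below uses forward Euler with a single generic voltage-gated channel; the function name, the sigmoidal steady-state gating curve, and all parameter values (conductance, reversal potential, time constant) are illustrative assumptions for this sketch, not measured physiological constants:

```python
import math

def simulate_membrane(T=100.0, dt=0.01, C=1.0, g=0.5, E=-80.0,
                      I_ext=2.0, tau=5.0):
    """Euler integration of a single-compartment membrane equation
    (Equation 3.1) with one generic channel; illustrative parameters."""
    V = -65.0   # membrane potential (mV), starting at rest
    c = 0.0     # fraction of open channels
    trace = []
    for _ in range(int(T / dt)):
        # steady-state channel opening: a sigmoid of the voltage (assumed form)
        c_inf = 1.0 / (1.0 + math.exp(-(V + 40.0) / 5.0))
        # channel current I_i(c_i, V), modeled here as g * c * (V - E)
        I_chan = g * c * (V - E)
        # Equation 3.1: C dV/dt = -sum_i I_i + I(t); dc/dt = -(c - c_inf)/tau
        V += dt * (-I_chan + I_ext) / C
        c += dt * (-(c - c_inf) / tau)
        trace.append(V)
    return trace
```

With a constant depolarizing input, the potential rises from rest and settles at a level where the channel current balances the input, illustrating how the coupled equations produce rich but stable dynamics.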

The Hodgkin–Huxley equation only describes an isolated patch of membrane, i.e. a single compartment. To model an entire neuron, it is first divided into major morphological sections corresponding to the axons, dendrites, and the cell body (Figure 3.1b). Each section is treated as an electrical conductor, usually represented as a cylinder or a cable (Rall 1962, 1977; Rall and Agmon-Snir 1998). Equations for the electrical behavior of a long cable can be solved analytically in simple cases, but for a realistic neuron model they need to be solved numerically. To do so, the cylinders are decomposed into smaller discrete compartments, each described using a membrane equation similar to Equation 3.1. The model for an entire neuron thus consists of a set of compartments, each with specific membrane properties and membrane voltage, all connected using electrical circuit theory. These models are simulated using software like NEURON (Hines and Carnevale 1997) or GENESIS (Bower and Beeman 1998) that are specifically designed to determine the appropriate compartments and solve the equations governing their electrical behavior (see e.g. Bower and Beeman 1998; Dayan and Abbott 2001; Lytton 2002 for reviews).


 

 

 

 

 

 


Fig. 3.1. Computational abstractions of neurons and networks. Biological neurons can be modeled at different levels of abstraction depending on the scale of the phenomena studied. (a) A microscopic image of pyramidal cells in a 1.4 mm × 0.7 mm area of layer III in macaque temporo-occipital (TEO) area, injected individually with Lucifer Yellow (reprinted with permission from Elston and Rosa 1998, copyright 1998 by Oxford University Press; circle added). Although this technique shows only a fraction of the neurons in a single horizontal cross-section, it demonstrates the complex structure of individual neurons and their connectivity. (b) A detailed compartmental model of the top left neuron (circled). Each compartment represents a small segment of the dendrite, and connections are established on the small dendritic spines, shown as line segments. (c) A coupled oscillator model of the neuron, consisting of an excitatory and an inhibitory unit with recurrent coupling, and weighted connections with other neurons in the network. (d) A model where a single variable describes the activation of the neuron, corresponding to either the membrane potential (in the integrate-and-fire model), or the average number of spikes per unit time (in the firing-rate model). (e) A high-level model of a neuronal network. With the more abstract neurons, it is possible to simulate a number of neurons and connections, allowing us to study phenomena at the level of networks and maps.


Compartmental models allow neurons to be represented in arbitrarily fine detail. They can be used effectively when experimental data are available to provide the parameters for the model, such as the size and shape of compartments and the distributions of ion channels (Doya, Selverston, and Rowat 1995). In such cases, they can be used to generate a very close fit to experimental data. For example, there are detailed models of cortical pyramidal neurons (Mainen and Sejnowski 1998) and cerebellar Purkinje cells (De Schutter and Bower 1994a,b).

However, large-scale cortical structures such as orientation maps are composed of millions of neurons, each making thousands of connections (Wandell 1995). Detailed data are available for only a very small sample of these neurons, and billions of parameters would have to be chosen arbitrarily for a compartmental model of such a map. The large number of components would make it difficult to understand its behavior, e.g. to determine which components are responsible for particular computations. Also, currently it is possible to simulate only a few neurons in such detail, due to limitations on computer memory and processing time.

The structures and phenomena studied in this book, i.e. cortical maps and perceptual behavior, depend crucially on having large numbers of neurons; on the other hand, they are not assumed to be sensitive to the detailed membrane processes of individual neurons. Models of such phenomena must (and can) therefore use higher level abstractions of computational units. There are three major classes of such abstract models: coupled oscillators, integrate-and-fire neurons, and firing-rate neurons. These abstractions allow simulating large numbers of neurons and their connections, so that theories about large-scale phenomena in the cortex can be tested. Each model will be described below in turn.

3.1.2 Coupled Oscillators

Coupled oscillator models focus on the temporal dynamics of pairs of neurons or neuron groups. The dynamics of each oscillator are determined by two variables x and y, representing the states of two coupled units, one of which is inhibitory, the other excitatory (Figure 3.1c; Horn and Opher 1998; Sabatini, Solari, and Secchi 2004; Terman and Wang 1995; von der Malsburg 1987; von der Malsburg and Buhmann 1992; Wang 1995, 1996; Wilson and Cowan 1972; some, like Chakravarthy and Ghosh 1996, use a single complex variable instead). The units are connected into a recursive loop where the excitatory unit activates the inhibitory unit, which in turn inhibits the excitatory unit. The activities of the units can be described with coupled differential equations that can be written in several different forms. In an example due to Terman and Wang (1995) and Wang (1999),

dx/dt = f(x) − y + z,

dy/dt = ε [g(x) − y],

(3.2)

where z is the input, ε is the coupling strength between the two units, and the functions f(x) and g(x) are chosen so that robust oscillation results. For example, with the


cubical hyperbola f(x) = 3x − x³ + 2, the height a and the slope b of the sigmoid g(x) = a[1 + tanh(x/b)] can be tuned to obtain a robust limit cycle. When x rises in this system (initially due to external input z), f(x), g(x) and y increase. Once f(x) starts to decrease, inhibition from y effectively turns x off. As a result, y also turns off, and the cycle repeats.
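This limit cycle is easy to reproduce numerically. The sketch below integrates Equation 3.2 with forward Euler, using the cubic f(x) and sigmoid g(x) given above; the particular values of z, ε, a, and b are illustrative choices that yield robust oscillation here, not values taken from the original papers:

```python
import math

def relaxation_oscillator(T=200.0, dt=0.01, z=1.0, eps=0.05, a=6.0, b=0.1):
    """Euler integration of the Terman-Wang-style relaxation oscillator
    (Equation 3.2); parameter values chosen for illustration."""
    x, y = 0.1, 0.0
    xs = []
    for _ in range(int(T / dt)):
        f = 3.0 * x - x ** 3 + 2.0          # cubic nullcline f(x)
        g = a * (1.0 + math.tanh(x / b))    # sigmoid g(x)
        x += dt * (f - y + z)               # dx/dt = f(x) - y + z
        y += dt * eps * (g - y)             # dy/dt = eps [g(x) - y]
        xs.append(x)
    return xs
```

The excitatory variable x jumps between the two stable branches of the cubic as the slow inhibitory variable y charges and discharges, producing the relaxation oscillation described in the text.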

It is possible to interpret such an oscillator as a single neuron where the excitatory unit represents the membrane potential, and the inhibitory unit the change in potential resulting from ionic channel activation and deactivation (Wang 1999). However, more commonly, each of the units in the oscillator is interpreted as a pooled activity level of a population of neurons of the same cell type (pyramidal for the excitatory and stellate for the inhibitory unit), residing in the same cortical column (Menon 1991; Wang 1996). The oscillators can also be connected into a network, and based on the sign of the connection, their phases can become synchronized or desynchronized. Such coupled oscillator networks have been used in segmentation and binding tasks. For example, images such as aerial photographs or brain scans can be segmented into homogeneous regions (Liu and Wang 1999; von der Malsburg and Buhmann 1992), and speech can be segmented from background noise (Wang and Brown 1999). In each of these applications, desynchronization across oscillators represents segmentation and synchronization represents binding, establishing a temporal code.

One important advantage of coupled oscillator models is that they include only two variables, which makes them easier to analyze than compartmental models (FitzHugh 1961; Nagumo, Arimoto, and Yoshizawa 1962). Unit activities can be represented in two-dimensional phase portraits, and behaviors such as limit-cycle oscillations identified. Even large-scale phenomena may sometimes be described theoretically (see Rinzel and Ermentrout 1998; Wang 1999 for reviews).

In summary, the coupled oscillator offers a description of the neuron at a higher level than the compartmental model does, allowing it to be analyzed more easily and used in applications. However, a further, more efficient abstraction is still possible without losing the ability to perform temporal coding. Such a model is based on a single variable describing the membrane potential, as will be described in the next section.

3.1.3 Integrate-and-Fire Neurons

In the integrate-and-fire approach, a single variable corresponding to the membrane potential of a neuron is used to describe its state (Figure 3.1d). Such neurons accumulate membrane potential from incoming signals, generate a spike when it exceeds a threshold, and reset the potential after each spike. A typical formulation of the general idea is

 

C dV/dt = I(t) − V/R,

(3.3)

where V is the membrane potential, C its capacitance, R its resistance, and I(t) is the input current (Lapicque 1907; see Gabbiani and Koch 1998 for a review). The effect of the incoming activity I(t) is to build up the membrane potential over time. The leak term −V/R retards the rise of the potential and, without further input,


eventually returns it to the baseline level. Consequently, this model is also called the leaky integrate-and-fire neuron (Campbell, Wang, and Jayaprakash 1999; Nischwitz and Glünder 1995). When the membrane potential rises to the threshold level, the neuron spikes, and the potential is reset to the baseline. Such dynamics capture the aggregate behavior of the compartmental model well, and can be implemented efficiently.
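A minimal sketch of such a leaky integrate-and-fire unit, Euler-integrated per Equation 3.3, might look as follows; the threshold, reset, and resting values are illustrative assumptions, and the leak is written here relative to an explicit resting level rather than zero:

```python
def lif_neuron(I, dt=1.0, C=1.0, R=10.0, threshold=-50.0,
               v_rest=-65.0, v_reset=-65.0):
    """Leaky integrate-and-fire neuron (Equation 3.3).
    I is a sequence of input currents; returns the spike times.
    Parameter values are illustrative, not fitted to data."""
    V = v_rest
    spikes = []
    for t, I_t in enumerate(I):
        # C dV/dt = I(t) - (V - v_rest)/R : leak pulls V back toward rest
        V += dt * (I_t - (V - v_rest) / R) / C
        if V >= threshold:          # threshold crossing: emit a spike
            spikes.append(t * dt)
            V = v_reset             # reset potential to baseline
    return spikes
```

With a sufficiently strong constant input the unit fires regularly; with a weak input the leak wins and the potential never reaches threshold, so no spikes are produced.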

Several variations of the basic integrate-and-fire model have been proposed, and there are also formulations that unify many of them in a single framework (Gerstner 1998b; Hoppensteadt and Izhikevich 1997; Izhikevich 2003). A particularly efficient variation is the dynamic threshold model (Eckhorn et al. 1990; Reitboeck et al. 1993). The threshold is increased acutely after the neuron fires, and then decays over time, simulating the refractory period of the neuron. Both the leaky synapse and the dynamic threshold are formulated using the same leaky-integration mechanism, implemented through convolution (∗):

x(t) = X(t) ∗ K(t),

(3.4)

where x(t) is the membrane potential or the threshold at time t, and X(t) is the impulse input representing a received or generated spike. The convolution kernel K(t) is defined as

K(t) = e−λt if t ≥ 0, and K(t) = 0 otherwise,

(3.5)

 

where λ is the decay rate. A spike generates a single exponentially decaying potential over time, and multiple spikes generate a superposition of multiple decaying potentials.

The convolution can be calculated using the digital filter equation (Eckhorn et al. 1990) as

x(t) = X(t) + x(t − 1) e−λ,

(3.6)

where t increases in discrete time steps. Any input from X(t) causes a jump in x(t), which then decays over time by the factor e−λ. With this simple recursive equation, complicated neuron dynamics can be calculated efficiently. The temporal structure of the events is abstracted into a single variable, without storage or repeated calculations, which is ideal for large-scale simulations.
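The equivalence of the recursive filter (Equation 3.6) and the explicit convolution (Equations 3.4 and 3.5) is easy to verify in a few lines. The sketch below assumes unit discrete time steps; the function names are ours, for illustration:

```python
import math

def leaky_trace(spikes, lam=0.5):
    """Equation 3.6: recursive digital-filter form of the leaky integrator."""
    x, out = 0.0, []
    for X_t in spikes:
        x = X_t + x * math.exp(-lam)   # x(t) = X(t) + x(t-1) e^{-lambda}
        out.append(x)
    return out

def convolved_trace(spikes, lam=0.5):
    """Equations 3.4-3.5: explicit convolution with kernel K(t) = e^{-lambda t}."""
    return [sum(spikes[s] * math.exp(-lam * (t - s)) for s in range(t + 1))
            for t in range(len(spikes))]
```

Both functions produce identical traces: each spike contributes an exponentially decaying potential, and overlapping spikes superpose, but the recursive form needs only a single state variable per unit.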

The integrate-and-fire model is efficient and theoretically well understood. Closed-form analytical solutions exist for simple cases, and even large networks can be analyzed theoretically (Gabbiani and Koch 1998; Gerstner and Kistler 2002; Meunier and Segev 2002). It has been used in several applications, including image segmentation of both static and moving objects, auditory analysis, motor control and reaching, range-image segmentation, sequence memory, and temporal pattern recognition (Campbell et al. 1999; Eckhorn et al. 1990; Glover, Hamilton, and Smith 2002; Hugh, Laubach, Nicolelis, and Henriquez 2002; Kuhlmann, Burkitt, Paolini, and Clark 2002; Rehn and Lansner 2004; Reitboeck et al. 1993; Sohn, Zhang, and Kaang 1999). It will also be used in Part IV of this book to understand how perceptual grouping occurs in the primary visual cortex.


3.1.4 Firing-Rate Neurons

The unit models reviewed so far can be used to understand the behavior of single neurons and the temporal coding that could take place in binding and segmentation. However, much of high-level behavior in the visual cortex (and elsewhere) does not require such detailed representations: The temporal behavior of the neurons is often not as important as their overall activity. The individual firing events can be abstracted into a general level of activation, or firing rate, and the activities of small groups of neurons can be aggregated into single computational units. For example, the force applied to a muscle and the firing rate of the muscle spindle are strongly correlated (Adrian 1926). Similarly, the firing rate of visual cortex neurons codes orientation and position of visual inputs (Hubel and Wiesel 1962, 1968). Focusing on firing rates alone leads to a much simpler and computationally tractable model.

The firing-rate model is again loosely based on the membrane potential of the neuron. This potential s is calculated as the sum of the activities ηk of all neurons k that send their output to the neuron, multiplied by the connection weights wk:

s = Σk ηk wk.

(3.7)

 

Most models further abstract the membrane potential into a firing rate η, using a logistic (sigmoid) activation function σ:

η = σ(s) = 1/(1 + e−s).

(3.8)

In this way, the activation (or firing rate) of the neuron is limited between 0 (i.e. minimum firing rate) and 1 (maximum rate), roughly modeling the activation function of real neurons. A piecewise linear approximation of σ can also be used in many cases, including the models in this book; it is faster to compute and results in qualitatively similar behavior.
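Equations 3.7 and 3.8 amount to only a couple of lines of code. A minimal sketch (the function name is ours, for illustration):

```python
import math

def firing_rate_unit(activities, weights):
    """Firing-rate unit: weighted sum of input activities (Equation 3.7)
    passed through a logistic sigmoid (Equation 3.8), bounding the
    output rate between 0 and 1."""
    s = sum(eta * w for eta, w in zip(activities, weights))  # Eq. 3.7
    return 1.0 / (1.0 + math.exp(-s))                        # Eq. 3.8
```

Zero net input yields a rate of 0.5, and strong positive or negative input drives the rate toward the maximum of 1 or the minimum of 0, respectively.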

Even though they are formulated at the single-neuron level, Equations 3.7 and 3.8 constitute a reasonable model for the response of small groups of neurons as well, such as cortical columns. In this interpretation, the amount of input stimulation (s) to the group is measured, and the total activation (or response) of the group is a logistic function of the input, limiting it between a minimum and a maximum value. Cortical column activation turns out to be a powerful abstraction for understanding the two-dimensional structure of the visual cortex, and will be used extensively in this book.

Firing-rate units can be used to simulate very large networks, and thereby even high-level behavior. Most neural network models in cognitive science and engineering, especially in natural language processing, reasoning, memory, speech recognition, and visual pattern recognition, are based on firing-rate units. They will be used in Parts II and III in this book to understand phenomena such as large-scale organization of the visual cortex, plasticity, visual illusions, and face detection.