(Leopold and Logothetis, 1996; Logothetis, Leopold, and Sheinberg, 1996; Logothetis and Schall, 1989), and forms the basis for our model. Dayan’s model used competition between high-level neurons in the visual pathway that feed back to lower layers, thus recurrently affecting the responses in higher levels. His model comprised four layers, each representing an area further up the visual pathway. He termed these layers the retinal input layer, V1, early extrastriate cortex, and late extrastriate cortex. The cells in the higher levels of the hierarchy represent increasing levels of complexity, with the final layer containing a mere two cells acting as “categorical” cells (recognizing horizontal versus vertical elements). In order for the images presented to rival, Dayan also introduced a fatigue factor in V1 where excited cells would gradually reduce their activity over time.
Several features observed in experimental binocular rivalry are reproduced by Dayan’s model, notably the lack of rivalry between low-strength inputs (Blake, 1977; Liu, Tyler, and Schor, 1992) and the dependence of suppression periods on the relative contrast of the stimuli (see also chapter 17 in this volume). Dayan noted that his model did not accurately describe the experimentally observed stability of monocular neurons during rivalry (Leopold and Logothetis, 1996). His model is also deterministic, ignoring the stochastic nature of suppression and dominance periods (which follow a gamma distribution). In addition, the model contains no cells that respond when the images are presented binocularly and also fire when suppressed during rivalry, nor any cells that do not fire during binocular presentation but do fire during rivalry.
AIMS
We wished to investigate the contrast dependence of a much more complex rivalry model. The number of input states and the number of categorical cells in the Dayan model were so restricted that statistical statements about the general behavior of the model under, say, contrast variation conditions would be hard to draw.
Furthermore, it was proposed to set up a model to meet the principles as set out by Levelt (1966):
•The percentage of total dominance of an image will increase with its strength.
•The average duration of dominance for a stimulus is independent of its strength.
D. P. Crewther and colleagues |
•The speed with which dominance switches between stimuli increases as the strength of one stimulus increases.
•The speed with which dominance switches between stimuli increases as the strength of both stimuli increases.
Increasing the contrast of one stimulus, rather than increasing the time of dominance of that eye, reduces the time of suppression of the eye receiving enhanced contrast (Blake, 1977; Fox and Rasche, 1969; Levelt, 1965).
OUR MODEL
A software package (NeuroSolutions V.3.022, The Neural Network Simulation Package) was chosen to build the neural net. The model was loosely based on Dayan’s, but included a much more extensive input layer and a far more sophisticated recognition layer.
Two “eyes” for input were constructed with a visual pathway comprising four layers eventually combining to form a binocular network (figure 18.1).
The first layer modeled the LGN via an matrix of inputs whose excitation varied over an 8-bit range (256 gray levels). The “receptive fields” of these units had no orientation tuning, somewhat like the pixels of a video camera sensor (i.e., without surround). Layer 2 modeled the striate cortex and was represented by a cell matrix, the last dimension representing possible orientations of 0°, 45°, 90°, and 135°. The third layer modeled early extrastriate cortex with an oriented matrix of cells (again limited to four orientations), combining inputs from the striate cortical units. The outputs from the separate pathways were merged as they entered the fourth layer, using a Kohonen “winner-take-all” neural network classifier that assigned its inputs just one of a group of classifications (which we will refer to as categorical cells, much like the idea of “grandmother cells”).
A Neural Network Model of Top-Down Rivalry |
Figure 18.1 Breadboard layout of the neural network. The layers and their connections are shown as two independent processing networks that undergo modification at levels labeled LGN (lateral geniculate nucleus), Striate Cortex, Early Extrastriate Cortex, and Late Extrastriate Cortex. It is between these last two axons that binocular interaction first occurs. Graphs above and below the eye inputs represent the simple array at the LGN level, a array at the Striate Cortex, and an image at the level of Early Extrastriate. The final level is a winner-take-all Kohonen classifier that selects which of ten categorical cells is perceived.
We chose to implement ten such categorical cells. It should be clear that the receptive field properties of these categorical cells are not usually available from such a Kohonen network, and this feature provided a challenging limitation to rectify. Kohonen maps are artificial neural network components that assign each of the inputs to one of a group of classifications based on the features the network itself extracts from the input data. In addition, a Kohonen map classifies inputs topographically, with the algorithm forcing nearby classifier cells to respond to images that have similar features. In this sense, the Kohonen classifiers demonstrate features similar to neurons in primate inferotemporal cortex that possess highly complex receptive fields organized in columns with much greater similarity in receptive field within columns than between columns (Tanaka, 1993, 2003; Wang, Fujita, and Murayama, 2000).
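The winner-take-all and topographic properties of a Kohonen map described above can be illustrated with a minimal sketch. It assumes a one-dimensional row of cells, Euclidean distance as the matching rule, and a fixed learning rate and neighborhood radius; the function names and parameters are hypothetical and do not reproduce the NeuroSolutions component used for the model.

```python
import math

def winner(weights, x):
    """Return the index of the cell whose weight vector is closest to x."""
    dists = [math.dist(w, x) for w in weights]
    return dists.index(min(dists))

def train_step(weights, x, lr=0.5, radius=1):
    """Move the winning cell and its neighbours toward x (in place).

    Updating neighbours together with the winner is what pushes nearby
    cells to respond to similar inputs (the topographic property).
    Assumption: cells lie on a 1-D line and lr/radius stay constant.
    """
    win = winner(weights, x)
    for i, w in enumerate(weights):
        if abs(i - win) <= radius:
            for k in range(len(w)):
                w[k] += lr * (x[k] - w[k])
    return win

# Hypothetical example: three cells with two-dimensional weight vectors.
cells = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
selected = winner(cells, [0.9, 1.2])  # index of the best-matching cell
```

In practice the learning rate and radius are usually reduced over training; they are held constant here only to keep the sketch short.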
TRAINING
Because we were not concerned with the process of creating orientation specificity for the cells of level 2 (V1), a developmentally plausible training algorithm was not required. Thus, backpropagation, the most efficient training algorithm, was used. The input data consisted of binary matrices, used to train and to test the network (see top section of figure 18.2). During training, the network used a data set of 24 images (of pixels), with eyes viewing a pair of matching inputs at contrasts 1, 0.9, 0.7, and 0.5 (i.e., dioptically). These input images were transformed by synaptic weights as they passed up to the striate layer. The output of this transformation was then compared with a previously prepared orientation-selective representation of the input image.
Figure 18.2 An array of 24 input stimuli organized into columns according to the categorical cell with which they associated after training. The set of images at the bottom of the figure represents “collations” used as a means of representing each categorical cell.
The process was repeated for all images and constituted one training run or epoch. Since NeuroSolutions provided graphical output probes to monitor the progress of the network, training was ceased when no visually recognizable improvement in the 8-bit gray scale representation could be seen (i.e., the output of the orientation-selective striate and extrastriate cells matched the input pattern). During the training phase for orientation specificity, the Kohonen map was not active.
After training for orientation specificity, the Kohonen network used these paired matching inputs to classify each representation into one of ten different classes (the categorical cells). The Kohonen network is often used in recognition software and is characterized by a “winner-take-all” response, with a single solution from the network being provided by the cell that gains maximum activation. Since NeuroSolutions did not offer recurrent connections as a built-in option, a separate module was coded and implemented as a dynamic link library (DLL). This served two functions. During training, the DLL calculated the average of the left- and right-eye firing strengths for each cell. This representation was added to any existing representation of the same categorical cell output. This created a “collation,” a superimposed image comprising all representations for each of the ten categorical cell outputs (see bottom section of figure 18.2). The training set needed to be executed only once to create such a collation.
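The collation step can be sketched as follows. This is an illustration only, assuming that each eye's representation is available as a flat list of firing strengths and that collations are kept in a dictionary keyed by categorical cell; the function name and data layout are hypothetical and are not taken from the DLL itself.

```python
def add_to_collation(collations, cell, left, right):
    """Average the two eyes cell by cell and superimpose the result on
    the running collation for the given categorical cell.

    Assumption: `left` and `right` are equal-length lists of firing
    strengths; `collations` maps categorical cell -> accumulated image.
    """
    avg = [(l + r) / 2.0 for l, r in zip(left, right)]
    if cell not in collations:
        collations[cell] = avg
    else:
        collations[cell] = [c + a for c, a in zip(collations[cell], avg)]
    return collations

# Hypothetical example: two stimuli that both associate with cell 3.
collations = {}
add_to_collation(collations, 3, [1.0, 0.0], [1.0, 0.0])
add_to_collation(collations, 3, [0.0, 1.0], [0.0, 0.5])
```

Because the collation is a running sum, a single pass over the training set is enough to build it, which matches the statement that the training set needed to be executed only once.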
HYPOTHESIS FOR PREDICTING RIVALRY
Given the 24 input states (stimuli) and the ten categorical cells in the model, it is clear that more than one input state will associate with the same categorical cell (see figure 18.2). Indeed, for the input set of stimuli used, six input stimuli associated with categorical cell 1 but no input stimuli formed an association with categorical cell 2. Rivalry in this model is seen as a competition at the level of the categorical cells between inputs trying to claim association. Thus, our expectation was that input stimuli which associate with the same categorical cell should not rival, while stimuli that associate with different categorical cells during training will rival.
Testing
When the training was completed and testing began, pairs of images were presented to the eyes (with 24 × 24 = 576 possibilities). Each pair was shown to the “eyes” for 100 presentations, corresponding to 100 units of time. The Kohonen map was forced to make a classification based on the mismatching data. The second function of the DLL was to determine which of the images was “seen” by the Kohonen map and to modify the strengths of the neural connections being fed back to the late extrastriate layer based on this perception. The DLL compared each cell of the left eye and the right eye against the collated image of the current Kohonen selection. Calculating the absolute value of the difference between these two numbers for each cell and accumulating a total difference value gave a measure of similarity between the current Kohonen selection and the left- and right-eye representations. The eye with greater similarity to the Kohonen selection (that is, the smaller total difference) was chosen as the winner or dominant eye.
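The similarity measure just described amounts to a sum of absolute differences. A minimal sketch follows, assuming flat lists for each representation; the function names are hypothetical, and the rule that the left eye wins an exact tie is an assumption made only so the example is complete.

```python
def total_difference(eye, collation):
    """Sum of absolute cell-by-cell differences (smaller = more similar)."""
    return sum(abs(e - c) for e, c in zip(eye, collation))

def dominant_eye(left, right, collation):
    """Pick the eye whose representation is closer to the collation of
    the current Kohonen selection.

    Assumption: ties go to the left eye; the chapter does not say how
    ties were resolved.
    """
    diff_left = total_difference(left, collation)
    diff_right = total_difference(right, collation)
    eye = "left" if diff_left <= diff_right else "right"
    return eye, diff_left, diff_right

# Hypothetical example with four cells.
collation = [1.0, 0.0, 1.0, 0.0]
result = dominant_eye([1.0, 0.0, 0.5, 0.0], [0.0, 1.0, 1.0, 0.0], collation)
```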
Two variables, termed left modifier and right modifier, altered the strength of the inputs at the early extrastriate layer. Initially, both these modifiers were set to 1. The dominant-eye modifier decreased the strength of the winning representation at the early extrastriate layer by 0.1, and the suppressed-eye modifier increased the representation by 0.1 for each iteration. The suppressed eye came into dominance when the total difference value indicated that the suppressed eye was now more similar to the current Kohonen output than the previously dominant eye. When this occurred, both the left and the right modifier were multiplied by a value derived from the ratio between the total difference of the left eye and that of the right eye. Thus the strength for the newly dominant eye was raised by this value and that for the suppressed eye was decreased by the same amount. Though artificial, these factors were intended to represent the processes loosely termed “habituation” and “selective attention/ suppression.”
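One possible reading of the modifier rule is sketched below. The fatigue step of 0.1 per iteration comes from the text. The exact value "derived from the ratio" is not specified there, so the sketch assumes it is the larger total difference divided by the smaller, applied as a multiplication for the newly dominant eye and a division for the newly suppressed eye. The function name and arguments are hypothetical.

```python
def update_modifiers(mods, dominant, switched, diff_left, diff_right, step=0.1):
    """Return updated left/right strength modifiers for one iteration.

    mods      -- dict with keys "left" and "right" (both start at 1.0)
    dominant  -- eye that is dominant after this iteration's comparison
    switched  -- True if dominance changed on this iteration
    Assumption: on a switch the factor is max(diff) / min(diff); this is
    an illustrative choice, not the chapter's documented formula.
    """
    suppressed = "right" if dominant == "left" else "left"
    new = dict(mods)
    if switched:
        lo, hi = sorted((diff_left, diff_right))
        ratio = hi / lo if lo > 0 else 1.0  # guard against division by zero
        new[dominant] *= ratio    # newly dominant eye is strengthened
        new[suppressed] /= ratio  # newly suppressed eye is weakened
    else:
        new[dominant] -= step     # "habituation" of the dominant eye
        new[suppressed] += step   # recovery of the suppressed eye
    return new

# Hypothetical example: one ordinary iteration, then a switch to the right eye.
m1 = update_modifiers({"left": 1.0, "right": 1.0}, "left", False, 2.0, 4.0)
m2 = update_modifiers({"left": 1.0, "right": 1.0}, "right", True, 4.0, 2.0)
```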
RESULTS
The output files generated by the network were saved in spreadsheet form for later analysis. Overall averages for mean left-eye and right-eye dominance length, mean left and right dominance percentage, switching frequency, total percentage of stimuli that rivaled, and consistency between expected and actual behaviors were investigated.
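The quantities listed above can all be derived from the sequence of winning eyes over the 100 presentations. The sketch below assumes the sequence is recorded as a string or list of "L" and "R" labels, one per presentation; the function name and the output layout are hypothetical.

```python
def rivalry_stats(winners):
    """Compute switching count, mean dominance length, and percentage
    dominance for each eye from a per-presentation winner sequence."""
    runs = []  # consecutive runs as [eye, length]
    for w in winners:
        if runs and runs[-1][0] == w:
            runs[-1][1] += 1
        else:
            runs.append([w, 1])
    stats = {"switches": max(len(runs) - 1, 0)}
    for eye in ("L", "R"):
        lengths = [n for e, n in runs if e == eye]
        total = sum(lengths)
        stats[eye] = {
            "mean_dominance": total / len(lengths) if lengths else 0.0,
            "percent_dominance": 100.0 * total / len(winners) if winners else 0.0,
        }
    return stats

# Hypothetical ten-presentation sequence.
stats = rivalry_stats("LLLRRLLLLR")
```

A pair of stimuli can be counted as rivaling when the number of switches exceeds some threshold; the threshold used in the chapter is not stated, so none is assumed here.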
Summary of Cases
Given the 24 input states, the whole set of cases of potential rivalry or its lack could not be reasonably presented. However, several interesting cases have been included as an appendix in which the predictions of rivalry or lack of rivalry are described in detail. These cases are presented diagrammatically in figure 18.3.
It is clear that the model incorporates a nonrealistic regular fluctuation, as might be expected from the gradual fatigue built into the model. The figures show the strength modifiers for each eye’s input, the similarity measure between each input and the current Kohonen “winning” classifier, and the Kohonen selection (that which is closest to percept). Thus rivalry is regular, nonexistent, slow, rapid, or irregular, depending on interactions between the network weightings associated with stimulus input.
Rivalry Prediction Performance
Given the relatively large set of input stimuli (24), the total number of possible pair combinations (24 × 24 = 576 cases) was sufficiently large to test statistical relations. All 576 cases were tested. The summary statistics for rivalry were as follows:
Nonrivalry (cases where the two input stimuli were associated during training with the same categorical cell) was correctly predicted in 97.2% of 85 cases.
Rivalry occurred in 81.4% of the 491 cases in which it was predicted (cases where the two input stimuli were associated during training with different categorical cells).
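The two groups together account for every tested pair (85 + 491 = 576). The prediction rule and the way such rates can be tallied are sketched below. The assignment of stimuli to categorical cells and the observed outcomes in the example are invented for illustration and are not the chapter's data; the function names are hypothetical.

```python
def predicts_rivalry(cell_of, a, b):
    """Rivalry is predicted when the two stimuli associated with
    different categorical cells during training."""
    return cell_of[a] != cell_of[b]

def prediction_rates(cell_of, observed):
    """Fraction of correct predictions, split by predicted outcome.

    observed -- dict mapping (left_stimulus, right_stimulus) -> True if
                rivalry actually occurred for that pair.
    """
    counts = {True: [0, 0], False: [0, 0]}  # predicted -> [correct, total]
    for (a, b), rivaled in observed.items():
        predicted = predicts_rivalry(cell_of, a, b)
        counts[predicted][1] += 1
        counts[predicted][0] += int(rivaled == predicted)

    def rate(key):
        correct, total = counts[key]
        return correct / total if total else None

    return {"rivalry": rate(True), "nonrivalry": rate(False)}

# Hypothetical toy case: three stimuli, two of them sharing a cell.
cell_of = {0: 1, 1: 1, 2: 3}
observed = {(a, b): cell_of[a] != cell_of[b] for a in cell_of for b in cell_of}
observed[(0, 2)] = False  # one predicted-rivalry pair that failed to rival
rates = prediction_rates(cell_of, observed)
```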
Rivalry Parameters as a Function of Contrast
As indicated earlier, the network was trained under two conditions of varying contrast. The first variation involved altering the contrast presented to the right eye (0.5, 0.7, 0.9, 1.0), while the contrast of the left eye was fixed (1.0). This condition was termed “the interocular contrast difference condition.”
The second variation involved presenting stimuli with the same contrast to the two eyes but altering the contrast across blocks of trials (0.5, 0.7, 0.9, 1.0)—this was termed “equal-eye contrast variation.”
Under conditions of interocular contrast variation, a distinct behavior was observed in mean dominance periods as a function of contrast. Where unequal contrast was presented to the two eyes, rather than the mean dominance of the stronger eye stimulus becoming greater, the mean dominance period of the weaker eye was reduced (see figure 18.4).
Figure 18.3 Six cases of heterogeneous input to the two eyes and the variety of outcomes that ensue. The input stimuli are shown at the top of each figure, and three graphs are presented for each rivalry case. The top graph shows the behavior of the left and right modifiers over 100 iterations or presentations. The middle graph shows the behavior of the similarity measures for left and right stimuli. The bottom graph shows the behavior of the Kohonen classifier (solid line) as well as the “winner” (triangular markers) across the 100 presentations. (a) Stim 1 vs. stim 14—rivalry expected (regular rivalry occurred). (b) Stim 1 vs. stim 18—rivalry expected (rival alternation between nonexpected categorical cells occurred). (c) Stim 5 vs. stim 0—rivalry expected (rapid, irregular rivalry observed). (d) Stim 1 vs. stim 6—rivalry expected (very irregular switching observed). (e) Stim 0 vs. stim 20—rivalry expected (after initial switch, no rivalry observed). (f) Stim 19 vs. stim 11—rivalry expected (very rapid alternation observed).
[Figure 18.4 plot. Vertical axis: mean dominance, 26 to 40. Series: L_Dominance, R_Dominance. Horizontal axis: 1.0L 1.0R, 1.0L 0.9R, 1.0L 0.7R, 1.0L 0.5R.]
Figure 18.4 Mean dominance period for the left- and right-eye stimuli under varied interocular contrast. The contrast for the left eye was maintained at 1.0, while that for the right eye was varied from 1.0 to 0.5. All cases where rivalry was expected have been included. While left-eye dominance remained approximately constant, a distinct trend to shorter mean dominance is observed as the contrast of the weaker stimulus is reduced.
Analysis of variance (ANOVA) was used to assess the results of contrast variation, using a factorial design. In addition, a one-way () design was used to investigate the dynamics of switching as a function of contrast. While for the interocular contrast difference condition a systematic drop in the mean percentage of left-eye dominance with increasing right-eye contrast was observed, the main effect of interocular contrast on total percent dominance was not significant (). The interaction between interocular contrast and left- and right-eye percent total dominance was also not significant ().
The frequency of dominance switching between eyes showed an interesting trend, with an increase in mean switching frequency as the right-eye contrast was increased from 0.5 to 1.0. However, significance was again lacking ().
When both eyes were presented with the same contrast, the frequency of switching increased with contrast in a significant fashion: (figure 18.5).
[Figure 18.5 plot. Vertical axis: mean switching frequency, 16 to 22. Horizontal axis: 0.5L 0.5R, 0.7L 0.7R, 0.9L 0.9R, 1.0L 1.0R.]
Figure 18.5 Mean switching frequency (number of switches per 100 presentations) shown as a function of stimulus contrast for equal-contrast stimuli presented to the two eyes across the cases where rivalry was expected. A significant increase in switching frequency was observed.
DISCUSSION
The model of binocular rivalry, as created, clearly provides a theoretical arena in which to investigate the properties of perception and suppression. The model was large enough in terms of input complexity and output sophistication to investigate properties of rivalry across a range of stimuli presented to the two eyes, such that the relations between low- and high-order processing of the stimuli could be related to the dynamics of their rivalry.
The first aim, that the model would exhibit rivalry (i.e., dominant and suppressive phases when presented with dissimilar stimuli) was clearly supported by the results. Indeed, under normal, high-contrast conditions, the prediction of rivalry was confirmed at a rate of over 80%, and the prediction of nonrivalry was confirmed on a case-by-case basis at a rate of 97%. It is also apparent, from the series of cases demonstrated in figure 18.3, that the dynamics of rivalry was not fixed, with varying rates of alternation and irregular switching of dominant and suppressive phases. This was a requirement for biological plausibility (Levelt, 1966). However, it is also obvious that the distribution of alternations for any particular pair of our rival stimuli is far more regular than the gamma distribution experimentally observed (Fox and Herrmann, 1967; Levelt, 1966).
This regularity is largely due to the deterministic manner in which strength modifiers increase and decrease in value. It is plausible that if there were a noise contribution to the strength modifier, then a more natural distribution would result.
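The effect of noise on regularity can be illustrated with a toy alternation process. This is not the chapter's network: it assumes two modifiers that start at 1.0, a fatigue step of 0.1, a switch whenever the suppressed modifier exceeds the dominant one by a fixed gap, and a reset of both modifiers after each switch. The Gaussian noise term, the gap, the reset, and all names are assumptions introduced for illustration.

```python
import random

def dominance_durations(steps=1000, step=0.1, noise_sd=0.0, gap=0.9, seed=0):
    """Return the list of dominance durations from a toy fatigue process.

    With noise_sd=0 the modifiers change deterministically and every
    dominance period has the same length; with noise_sd>0 the lengths
    vary from period to period.
    """
    rng = random.Random(seed)  # seeded so that runs are repeatable
    mods = [1.0, 1.0]
    dominant, run, durations = 0, 0, []
    for _ in range(steps):
        suppressed = 1 - dominant
        mods[dominant] -= step + rng.gauss(0.0, noise_sd)
        mods[suppressed] += step + rng.gauss(0.0, noise_sd)
        run += 1
        if mods[suppressed] - mods[dominant] >= gap:
            durations.append(run)
            dominant, run = suppressed, 0
            mods = [1.0, 1.0]  # assumption: modifiers reset after a switch
    return durations

regular = dominance_durations(noise_sd=0.0)
noisy = dominance_durations(noise_sd=0.05)
```

The sketch shows only that noise in the modifier breaks the strict periodicity; whether the resulting durations would approximate a gamma distribution is not something this toy process establishes.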
It is important to acknowledge the degree to which the network reflects the theoretical assumptions hardwired into its design. We expected to achieve rivalry in the gross sense because of two input features. First, the use of a winner-take-all network to provide an answer to the question of
