
204 Clifton Schor

FIGURE 16 Panum’s limiting case. Two objects on a visual line of one eye are fused with a single point imaged on the fovea of the other eye. The resulting stereo-percept has a disparity gradient of 2, which exceeds the disparity gradient limit described in Figure 11. (From BINOCULAR VISION AND STEREOPSIS by Ian P. Howard and Brian J. Rogers, Copyright © 1995 by Oxford University Press, Inc. Used by permission of Oxford University Press, Inc.)

see depth variations in a gravel bed than it is to see the form of the leafy foliage on a tree. If the matching process assumes that surfaces and boundary contours in nature are generally smooth, then matches will be biased to result in similar disparities between adjacent features. This rule is enforced by a disparity-gradient limit of stereopsis and fusion. Figure 11 is a stereogram that illustrates how two points lying at different depths are both fused if they are widely separated, and only one can be fused at a time if they are crowded together. This effect is summarized by the rate of change of disparity as a function of the two targets’ separation (disparity gradient). When the difference in disparity between the two targets is less than their separation (disparity gradient less than one), both targets can be fused. When the change in disparity is equal to or greater than their separation (disparity gradient equal to or greater than one), the targets can no longer be fused (Burt & Julesz, 1980). Thus the bias helps to obtain the correct match in smooth surfaces but interferes with obtaining matches in irregular surfaces. Edge continuity might be considered as a corollary of the smoothness constraint. Matching solutions that result in continuous edges or surface boundaries are not likely to result from chance and are strong indicators that the correct binocular match has been made. Furthermore, subsequent matches along the same boundary contour will be biased to converge on the same solution.
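The disparity-gradient rule described above can be sketched in a few lines. This is a minimal illustration, not a model from the chapter: the function names are hypothetical, both quantities are assumed to be in the same angular units (for example arc min), and the limit of 1 is taken from the text.

```python
def disparity_gradient(disparity_a, disparity_b, separation):
    """Difference in disparity divided by the targets' angular separation.

    All three arguments are assumed to share the same angular units.
    """
    if separation <= 0:
        raise ValueError("separation must be positive")
    return abs(disparity_a - disparity_b) / separation


def both_fusible(disparity_a, disparity_b, separation, limit=1.0):
    """True when the gradient is below the limit, so both targets can fuse."""
    return disparity_gradient(disparity_a, disparity_b, separation) < limit


# Widely separated targets: gradient 10/30, both can be fused.
print(both_fusible(0.0, 10.0, 30.0))
# Crowded targets: gradient 10/5 = 2, only one can be fused at a time.
print(both_fusible(0.0, 10.0, 5.0))
```

A gradient of exactly 1 is treated as not fusible, matching the "equal to or greater than one" wording above.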

Finally, the number of potential matches can be reduced dramatically by restricting matches to retinal meridians that lie in epipolar planes. The retinal locus of epipolar images lies in a plane that contains the target in space and the two nodal points of the eyes. Assuming that the nodal point lies very near the center of curvature of the retina, this plane intersects the two retinae and forms great circles whose radius of curvature equals the radius of the eye-globe. When presented with multiple images, matching would be greatly simplified if searches were made along these epipolar lines. Because the eyes undergo cyclovergence when we converge or

5 Binocular Vision


elevate our direction of gaze, the epipolar planes will intersect different coplanar retinal meridians depending on gaze. Thus utilization of the epipolar constraint requires that eye position and torsion information be used to determine which retinal meridians lie in the earth-referenced epipolar plane. If matching is unconstrained, such as is the case with long oblique lines, it is completely ambiguous (aperture problem). Under these conditions matching is restricted to a small vertical disparity range (10 arc min) about epipolar lines (Van Ee & Schor, 1999). The matching solution in this case equals the vector average of all possible matches. The vertical extent of the operating range for binocular matching can be extended when matches are constrained by image primitives such as end or crossing points. For example, Stevenson and Schor (1997) have shown that a wide range of binocular matches can be made between random-dot targets containing combinations of large horizontal and vertical disparities (> 1°). The epipolar constraint would, however, be ideal for artificial vision systems in which the orientation of two cameras could be used to determine which meridians in the two screen planes were coplanar and epipolar.

C. Computational Algorithms

In addition to the constraints listed above, several computational algorithms have been developed to solve the correspondence problem. Some of these algorithms exhibit global cooperativity in that the disparity processed in one region of the visual field influences the disparity solution in another region. Matches for different tokens are not made independently of one another. These algorithms are illustrated with a Keplerian grid, which represents an array of disparity detectors that sense depth over a range of distances and eccentricities in the visual plane from the point of convergence. Biological analogs to the nodes in this array are the binocularly innervated cortical cells in the primary visual cortex that exhibit sensitivity to disparity of similar features imaged in their receptive fields (Hubel & Wiesel, 1970; Poggio, Gonzalez, & Krause, 1988). Notice that the many possible matches of the points in Figure 17 fall along various nodes in the Keplerian grid. Cooperative models enforce the smoothness and disparity gradient constraints by facilitating activity of nodes stimulated simultaneously in the fronto-parallel plane (dashed lines) and inhibiting activity of nodes stimulated simultaneously in the orthogonal depth planes (e.g., midsagittal) (solid lines). The consequence is that different disparity detectors inhibit one another and like disparity detectors facilitate one another (Dev, 1975; Nelson, 1975). This general principle has been elaborated upon in other models that extend the range of facilitation to regions falling outside the intersection of the visual axes and areas of inhibition to areas of the Keplerian grid that lie between the two visual axes (Marr & Poggio, 1976). For a review of other cooperative models see Blake and Wilson (1991).

Several serial processing models have also been proposed to optimize solutions to the matching problem. These serial models utilize the spatial filters that operate at the early stages of visual processing. Center-surround receptive fields and simple


FIGURE 17 The Keplerian grid shown above represents some of the possible matches of points imaged on various retinal loci, described by horizontal position along the left retina and horizontal position along the right retina. Midsagittal targets are imaged along the long axis at 45° and frontoparallel objects are imaged along the dashed short axis at 135°. Several of the cooperative stereo-algorithms that have been proposed include just one set of inhibitory connections between detectors of different disparities (along the long 45° axis) at the same retinal position.

and complex cells in the visual cortex have optimal sensitivity to limited ranges of luminance periodicity that can be described in terms of spatial frequency. The tuning or sensitivity profiles of these cells have been modeled with various mathematical functions (difference of Gaussians, Gabor patches, Cauchy functions, etc.), all of which have band-pass characteristics. They are sensitive to a limited range of spatial frequencies referred to as a channel, and there is some overlap in the sensitivity range of adjacent channels. These channels are also sensitive or tuned to limited ranges of different orientations. Thus they encode both the size and orientation of contours in space. There is both psychophysical and neurophysiological evidence that disparities or binocular matches are formed within spatial channels. These filters serve to decompose a complex image into discrete ranges of its spatial frequency components. In the Pollard, Mayhew, and Frisby (PMF) model (Frisby & Mayhew, 1980) three channels tuned to different spatial scales filter different spatial scale image components. Horizontal disparities are calculated between the contours having similar orientation and matched contrast polarities within each spatial scale. Matches are biased to obtain edge continuity. In addition, matches are biased toward contours of the same orientation and disparity that agree across all three spatial scales. Most of these models employ both mutual facilitation and inhibition; however, several stereo-phenomena suggest a lesser role for inhibition. For example, we are able to see depth in transparent planes, such as views of a stream bed through a textured water surface, or views of a distant scene through a spotted window or intervening plant foliage. In addition, depth of transparent surfaces can be averaged, seen as filled in, or seen as two separate surfaces as their separation increases (Stevenson, Cormack, & Schor, 1989). Inhibition between dissimilar disparity detectors would make these percepts impossible.

Another serial model reduces the number of potential solutions with a coarse-to-fine strategy (Marr & Poggio, 1976). Problems arise in matching the high- and low-frequency components in a complex image when the disparity of the target exceeds one-half period of the highest frequency component. A veridical match can be made for any spatial frequency for any disparity that is less than half the period (180° phase shift between binocular image components). However, there are unlimited matches that could be made once disparity exceeds the half period of a high-frequency component. False matches in the high-frequency range sensed by a small-scale channel could be reduced by first matching low-frequency components that have a larger range within their 180° phase limit for unambiguous matches in a large-scale channel. The large-scale solution constrains the small-scale (high-frequency component) solution. Indeed, Wilson et al. (1991) have shown that a low-frequency background will bias the match of a high-frequency pattern in the foreground to a solution that is greater than its half-period unambiguous match. A phase limit of 90° rather than 180° for the upper disparity limit for binocular matches has been found empirically using band-pass targets (Schor et al., 1984b). The 90° phase limit results from the two-octave bandwidth of the stimulus as well as the spatial channels that process spatial information. Phase is referenced to the midfrequency of the channel rather than its upper frequency range, and the 180° phase limit still applies to the upper range of these channels.
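The half-period (180° phase) limit can be made concrete with a short calculation. The sketch below is illustrative only: the function names and the example channel frequencies are assumptions, and it simply lists which channels could match a given disparity without ambiguity, which is why a coarse channel would be consulted first.

```python
def half_period_limit_arcmin(cycles_per_degree):
    """Largest unambiguous disparity (arc min) for one spatial frequency.

    One period spans 60 / cpd arc min; the 180-degree phase limit is half of it.
    """
    if cycles_per_degree <= 0:
        raise ValueError("spatial frequency must be positive")
    return (60.0 / cycles_per_degree) / 2.0


def unambiguous_channels(disparity_arcmin, channel_cpds):
    """Channels (given by peak frequency, cpd) whose half period exceeds the disparity."""
    return [cpd for cpd in channel_cpds
            if abs(disparity_arcmin) < half_period_limit_arcmin(cpd)]


# Hypothetical channel set: 0.5, 2, and 8 cycles per degree.
print(half_period_limit_arcmin(2.0))                 # 15.0 arc min
print(unambiguous_channels(20.0, [0.5, 2.0, 8.0]))   # only the coarse channel remains
```

For a 20 arc min disparity only the 0.5 cpd channel stays within its half period, so its solution would have to constrain the finer channels.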

The matching task could be facilitated by vergence responses to the large-scale component of the image. This would reduce overall disparity and bring the small-scale components within the unambiguous phase range of a small-scale channel (Marr & Poggio, 1979). Through iterations, a common disparity solution will eventually be found for all spatial frequency components. A similar result could be obtained in a parallel process with a single broadly tuned spatial channel that could be constrained to find a common match for all spatial frequency components of the image, as long as they had a wide enough range of spatial frequencies. When the frequency range is too narrow, then the nearest neighbor match will prevail, regardless of the real disparity of the target. This is seen in the wallpaper illusion in which a repetitive pattern of vertical lines can be fused at any horizontal vergence angle. The depth of the grating is always the solution that lies nearest to the point of convergence, irrespective of the real distance of the wallpaper pattern.

The serial models or any spatial model of binocular matching that rely on the 180° phase limit for unambiguous disparity are not supported by empirical observations of fusion limits and stereopsis (Schor et al., 1984a,b). The theories predict


that stereo threshold and binocular fusion ranges should become progressively lower as spatial frequency is increased. This prediction is borne out as spatial frequency increases up to 2.5 cpd; however, both stereo acuity and horizontal fusion ranges are constant at frequencies above 2.5 cpd. As a result, disparities are processed in the high spatial scales that greatly exceed the 180° phase limit. This is a clear violation of theories of disparity sensitivity based upon phase sensitivity within spatial channels. In addition to phase sensitivity, there may be other limits of disparity resolution, such as a disparity position limit. Given the presence of both a phase and a position limit, the threshold would be set by whichever of these two limits was least sensitive. For large disparities, the constant position limit would be smaller than the phase limit, and the converse would be true for high spatial frequencies.

D. Interocular Correlation

Computational algorithms and matching constraints described above rely upon a preattentive mechanism that is capable of making comparisons of the two retinal images for the purpose of quantifying the strength of the many possible binocular matches. Ideally, all potential matches could be quantified with a cross-correlation or interocular correlation (IOC) function. This is represented mathematically as the convolution integral of the two retinal images.

$$\mathrm{IOC}(d) = \int f(x)\, h(x + d)\, dx,$$

where f(x) and h(x) represent the intensity profiles (or some derivative of them) along the horizontal meridian of the right and left eye’s retinae. The IOC can be thought of as the degree to which the two retinal images match one another.
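A discrete version of this cross-correlation can be sketched as a sum of products between one eye's profile and a shifted copy of the other's. This is only an illustration: the function name, the integer-pixel disparities, the sign convention for d, and the toy profiles are all assumptions rather than anything specified in the chapter.

```python
def interocular_correlation(f, h, max_disparity):
    """Return {d: sum over x of f[x] * h[x + d]} for integer disparities d.

    f and h are sampled intensity (or contrast) profiles of the two eyes.
    Samples that fall outside h after shifting are treated as zero.
    """
    result = {}
    for d in range(-max_disparity, max_disparity + 1):
        total = 0.0
        for x, value in enumerate(f):
            j = x + d
            if 0 <= j < len(h):
                total += value * h[j]
        result[d] = total
    return result


# Toy contrast profile and a copy displaced by two samples.
f = [0, 0, 1, -1, 1, 0, 0, 0]
h = [0, 0, 0, 0, 1, -1, 1, 0]

ioc = interocular_correlation(f, h, max_disparity=3)
best = max(ioc, key=ioc.get)
print(best, ioc[best])  # the peak sits at the displacement between the profiles
```

The peak of the function marks the disparity at which the two profiles match best; the smaller values at other shifts correspond to spurious partial matches.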

This nonlinear operation represents the strength of various matches with products between the two eyes’ images as a function of retinal disparities along epipolar lines. A random-dot stereogram (RDS) is an ideal target to test the binocular matching process in human vision because it is devoid of clear monocular forms; its form is revealed only after stereoscopic depth is perceived (Julesz, 1964). The IOC of an RDS equals the proportion of dots in one eye’s image that match dots in the other eye’s image with the same contrast at the same relative location. The middle RDS shown in Figure 18 has 50% density and is composed of only black and white dots. Those dots that do not match correspond to images of opposite contrast. If all dots match, the correlation is 1. If no dots match, that is, they all have paired opposite contrasts, the correlation is −1. If half the dots match, the correlation is zero.


FIGURE 18 Random-dot stereograms of varying degrees of correlation. (a) The two images are correlated 100%. They fuse to create a flat plane. (b) The interocular correlation is 50%. A flat plane is still perceived but with some dots out of the plane. (c) The correlation is zero and a flat plane is not perceived. Subjects had to detect transitions between different states of correlation. (Reprinted from Vision Research, 31, Cormack, L. K., Stevenson, S. B., & Schor, C. M. Interocular correlation, luminance contrast and cyclopean processing, pp. 2195–2207, Copyright © 1991, with kind permission from Elsevier Science Ltd., The Boulevard, Langford Lane, Kidlington, OX5 1GB, United Kingdom, and the authors.)


The image correspondence of the RDS can be quantified by a cross-correlation analysis of the luminance or contrast profile of the random-dot pattern. Figure 18 illustrates three patterns of random-element stereograms whose correlation varies from 1 at the top, where all dots match, to zero at the bottom, where left and right image contrasts are randomly related. Note the variation in flatness of the fused images. Figure 19 illustrates the cross-correlation function of the previous autostereogram images, where the peak of each function represents the stimulus correlation at the disparity of the match, and the average amplitude of the surrounding noise represents a zero correlation composed of 50% matched dots. The noise fluctuations result from spurious matches that vary at different disparities. The negative side lobes about the peak result from the use of edge contrast rather than luminance in computing the cross-correlation function. With only two dot contrasts, the IOC equals the percentage of matched dots in the stimulus minus 50%, divided by 50%.

$$\mathrm{IOC} = 2P_d - 1,$$

where $P_d$ is the proportion of matching dots. The IOC is analogous to contrast in the luminance domain. At threshold, the IOC is analogous to a Weber fraction. The IOC represents stimulus disparity (signal) in the presence of a mean background correlation of zero with 50% matches (noise). At threshold, the cross-correlation provides a means of quantifying the visibility of the disparity much like contrast threshold quantifies the visibility of a luminance contour in the presence of a background luminance.
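The relation between the proportion of matching dots and the IOC can be checked directly on a two-contrast dot pattern. The sketch below is a minimal illustration; the function name and the encoding of dots as +1 (white) and −1 (black) are assumptions made for the example.

```python
def ioc_from_dots(left, right):
    """IOC of a two-contrast dot pattern: twice the proportion of matches, minus one.

    left and right hold the contrast polarity of corresponding dots,
    for example +1 for white and -1 for black.
    """
    if len(left) != len(right) or not left:
        raise ValueError("images must be non-empty and equally long")
    matches = sum(1 for a, b in zip(left, right) if a == b)
    proportion = matches / len(left)
    return 2.0 * proportion - 1.0


left = [1, -1, 1, -1]
print(ioc_from_dots(left, [1, -1, 1, -1]))    # all dots match
print(ioc_from_dots(left, [-1, 1, -1, 1]))    # all dots have opposite contrast
print(ioc_from_dots(left, [1, -1, -1, 1]))    # half of the dots match
```

Under this relation, 55% matching dots corresponds to an IOC of 0.1 and 50% matching dots to an IOC of zero.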

FIGURE 19 A family of cross-correlation functions for a random-dot stereogram with 80% interocular correlation but decreasing contrast (indicated by numbers on the right). The curves are vertically separated on the y axis. The signal (peak in the correlation function) and the extrinsic noise (lesser peaks due to spurious matches in the display) both vary with the square of contrast. The functions are displaced vertically. (Reprinted from Vision Research, 31, Cormack, L. K., Stevenson, S. B., & Schor, C. M. Interocular correlation, luminance contrast and cyclopean processing, pp. 2195–2207, Copyright © 1991, with kind permission from Elsevier Science Ltd., The Boulevard, Langford Lane, Kidlington, OX5 1GB, United Kingdom, and the authors.)


E. Off-Horopter Interocular Correlation Sensitivity

The RDS has been used to measure the sensitivity of the binocular system to correlation at different standing disparities with respect to the horopter. This is analogous to extrahoropteral studies of stereopsis by Blakemore (1970), Badcock and Schor (1985), and Siderov and Harwerth (1993), which measured stereopsis as a function of distance in front of or behind the fixation plane, only here we are measuring correlation detection as opposed to differential disparity detection. The task with the RDS is for the subject to identify which of two intervals presents a partially correlated surface as opposed to one having zero correlation, where 50% of the dots match. Surprisingly, we are extremely good at this task, and under optimal conditions, some subjects can detect as little as a 5% increment in correlation, whereas others can only detect 10%. Thus for a 10% correlation threshold, the latter subject is able to discriminate between 50 and 55% matching dots.

Figure 20 illustrates the off-horopter correlation thresholds for three subjects measured as a function of the disparity subtended by the correlated dots. Sensitivity falls off abruptly away from the horopter until 100% correlation is needed to discriminate the surface from a zero-correlated field at a one-degree disparity on either side of the horopter. Beyond that distance, all correlations appear the same as zero. The

FIGURE 20 Baseline correlation thresholds for three subjects as a function of disparity pedestal amplitude. The log of interocular correlation at threshold is plotted against horizontal disparity of the test surface relative to fixation. Negative values of disparity indicate near (crossed) disparity. A value of 0.0 on the vertical axis represents a correlation of 1.0, the maximum possible. The intersection of each subject’s curve with the line at 0.0 indicates the upper disparity limits for correlation detection in our task. Correlation thresholds were measured for each subject out to ±35 arc min disparity, with each point representing the mean of five runs. Upper disparity limits were measured in a separate experiment. The general trend can be characterized as being symmetric about 0 disparity, with an exponential relationship between correlation threshold and disparity. Each subject showed idiosyncratic departures from this exponential trend, such as the bumps at 0 disparity for subject SBS, at 5 arc min for CMS and at 10 for subject LKC. (Reprinted from Vision Research, 32, Stevenson, S. B., Cormack, L. K., Schor, C. M., & Tyler, C. W. Disparity tuning mechanisms of human stereopsis, pp. 1685–1694, Copyright © 1992 with kind permission from Elsevier Science Ltd., The Boulevard, Langford Lane, Kidlington, OX5 1GB, United Kingdom, and the authors.)


range can be extended to two degrees by increasing the number of visible dots, either by increasing field size or by reducing dot size. There is improvement as the number of dots is increased up to 10,000 dots (Cormack, Stevenson, & Schor, 1994) and then performance remains static, demonstrating the limited efficiency of the visual system. The function shown in Figure 20 illustrates that the horopter is to binocular vision as the fovea is to spatial resolution. It is the locus along the depth axis where correlation sensitivity is highest.

F. Extrinsic and Intrinsic Noise and Interocular Correlation

The matching problem is basically one of detecting a signal in the presence of noise. The peak of the cross-correlation function could represent the signal that is to be detected in the presence of various sources of noise. One extrinsic noise source results from the spurious matches in the stimulus at nonoptimal disparities. These are seen as the ripples in the flanks surrounding the peak of the IOC distribution. There are also intrinsic sources of noise that could result from the variable responses of physiological disparity detectors. The influence of these two noise sources can be revealed by measuring the correlation threshold of an RDS as a function of contrast. The IOC threshold is most sensitive at high contrasts and remains constant as contrast is reduced to 10 times the contrast threshold (approximately 16% contrast) for detecting the RDS. Figure 21 illustrates that at lower contrasts, the threshold

FIGURE 21 Correlation thresholds as a function of luminance contrast. Threshold for the detection of interocular correlation as a function of luminance contrast, expressed in threshold multiples, for the three subjects. Both axes are logarithmic. The data asymptote to a log-log slope of 0 at high contrasts and −2 at low contrasts. A line of slope −2, indicating a trading relation between interocular correlation and the square of contrast, is plotted for reference. The left- and right-hand vertical lines at the top of the figure represent typical SD for low and high contrast judgments respectively. (Reprinted from Vision Research, 31, Cormack, L. K., Stevenson, S. B., & Schor, C. M. Interocular correlation, luminance contrast and cyclopean processing, pp. 2195–2207, Copyright © 1991, with kind permission from Elsevier Science Ltd., The Boulevard, Langford Lane, Kidlington, OX5 1GB, United Kingdom, and the authors.)


FIGURE 22 Effect of reduced contrast on signal-to-noise ratio for a stimulus noise source. As the contrast of an image pair with some fixed interocular correlation is reduced, both the signal amplitude and the noise level are decreased by the square of contrast. Shown in this figure is a family of cross-correlation functions, the members of which differ in the contrast of the input image pair. The particular contrast of each input image pair is displayed to the right of the corresponding function. The interocular correlation of the input image pair was 80% in all cases, and the functions are displaced vertically for clarity. (Reprinted from Vision Research, 31, Cormack, L. K., Stevenson, S. B., & Schor, C. M. Interocular correlation, luminance contrast and cyclopean processing, pp. 2195–2207, Copyright © 1991, with kind permission from Elsevier Science Ltd., The Boulevard, Langford Lane, Kidlington, OX5 1GB, United Kingdom, and the authors.)

increased proportionally with the square of the reduction (i.e., slope of −2 on a log-log scale). For example, if contrast is lowered by a factor of two, the correlation threshold for perception of a plane increases by a factor of four. Figures 22 and 23 illustrate the variation of signal-to-noise ratio that accounts for the flat and square-law (slope −2) regions of the contrast function. In Figure 22, assume the noise results from spurious matches in the stimulus. Accordingly, as the contrast of an image pair with some fixed interocular correlation is reduced, both the signal amplitude and the noise level are decreased by the square of contrast. The square relationship reflects the product between the left and right images during the cross-correlation. The covariation of signal and noise with contrast results in a constant signal-to-noise ratio. Figure 23 assumes the noise results from an intrinsic source that is independent of the stimulus and that this intrinsic noise is greater than the extrinsic noise of the stimulus when the image contrast has been reduced below 10 times detection threshold. As contrast is reduced below a value of 10 times detection threshold, the signal is still reduced with the square of contrast; however, the intrinsic noise remains constant. Accordingly, signal-to-noise ratio decreases abruptly with the square of contrast, causing a rapid rise in IOC threshold. These results illustrate the presence of intrinsic and extrinsic noise sources as well as a nonlinearity, which can be described as a binocular cross-correlation or product of the two monocular images.
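The two-noise account can be illustrated with a toy calculation. This is a sketch under stated assumptions, not the authors' model: the signal per unit correlation and the extrinsic noise are both taken to scale with the square of contrast, the intrinsic noise is a constant, the two noise terms are simply added, and every parameter value below is arbitrary.

```python
import math


def correlation_threshold(contrast, extrinsic_gain=1.0,
                          intrinsic_noise=1e-4, criterion=0.05):
    """Toy IOC threshold: criterion * total noise / signal per unit correlation.

    Values above 1 would mean the correlation cannot be detected at all.
    """
    signal_per_unit_ioc = contrast ** 2               # product of the two images
    extrinsic_noise = extrinsic_gain * contrast ** 2  # spurious matches in the stimulus
    total_noise = extrinsic_noise + intrinsic_noise   # additive combination (assumed)
    return criterion * total_noise / signal_per_unit_ioc


def loglog_slope(c_low, c_high):
    """Slope of log threshold against log contrast between two contrasts."""
    rise = (math.log10(correlation_threshold(c_high))
            - math.log10(correlation_threshold(c_low)))
    return rise / (math.log10(c_high) - math.log10(c_low))


print(loglog_slope(0.5, 1.0))       # close to 0: extrinsic noise dominates
print(loglog_slope(0.0005, 0.001))  # close to -2: intrinsic noise dominates
```

At high contrast the signal and the dominant noise shrink together, so the threshold is flat; once the constant intrinsic term dominates, halving contrast roughly quadruples the threshold.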
