214 Clifton Schor

FIGURE 23 Effect of reduced contrast on signal-to-noise ratio for an intrinsic noise source. As the interocular correlation of an image pair with some fixed contrast is reduced, the signal amplitude is decreased proportionally while the noise level remains constant. Shown in this figure is a family of cross-correlation functions, the members of which differ in the interocular correlation of the input image pair. The particular interocular correlation of each input image pair is displayed to the right of the corresponding function. The contrast of the input image pair was 20% in all cases, and the functions are displaced vertically for clarity. (Reprinted from Vision Research, 31, Cormack, L. K., Stevenson, S. B., & Schor, C. M. Interocular correlation, luminance contrast and cyclopean processing, pp. 2195–2207, Copyright © 1991, with kind permission from Elsevier Science Ltd., The Boulevard, Langford Lane, Kidlington, OX5 1GB, United Kingdom, and the authors.)

G. Estimating Disparity Magnitude

Once features have been matched in the two eyes, the resulting disparities must be quantified and scaled with viewing distance in order to obtain a veridical quantitative sense of depth. Lehky, Pouget, and Sejnowski (1990) have summarized three classes of general mechanisms that have been proposed to quantify disparity. The first is an array of narrowly tuned units with nonoverlapping sensitivity, whose sensitivities are distributed over the range of disparities that stimulate stereopsis. The spatial layout of binocular receptive fields for these multiple local channels forms the nodes described in a Keplerian grid (Marr & Poggio, 1976; Mayhew & Frisby, 1980). The value of a disparity determines which channel is stimulated. A second process uses rate encoding to specify disparity with a single channel. Firing rate increases as disparity increases, as suggested in models by Julesz (1971) and Marr and Poggio (1979). The third and most physiologically plausible means of disparity quantification utilizes a distribution of multiple channels with partially overlapping sensitivity to disparity. Because there is overlapping sensitivity, the activity of a single channel is ambiguous; however, disparity amplitude can be computed from the activity of all channels by such methods as averaging (Stevenson, Cormack, Schor, & Tyler, 1992) or by spectrum representation (Lehky & Sejnowski, 1990).
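The third scheme can be illustrated with a minimal sketch, assuming Gaussian tuning curves and a response-weighted average as the read-out. The channel spacing, tuning width, and function names below are hypothetical choices made for illustration; they are not parameters taken from the cited studies.

```python
import numpy as np

def channel_responses(disparity, centers, sigma):
    """Response of each Gaussian-tuned channel to one disparity (arc min)."""
    return np.exp(-0.5 * ((disparity - centers) / sigma) ** 2)

def decode_by_averaging(responses, centers):
    """Response-weighted mean of the channels' preferred disparities."""
    return float(np.sum(responses * centers) / np.sum(responses))

# Hypothetical population: 25 channels spaced 5 arc min apart,
# each with a 10 arc min tuning width, so neighbours overlap heavily.
centers = np.linspace(-60.0, 60.0, 25)
sigma = 10.0

stimulus = 7.0  # arc min; falls between two preferred disparities
responses = channel_responses(stimulus, centers, sigma)
estimate = decode_by_averaging(responses, centers)

# A single channel is ambiguous: the unit preferring 5 arc min
# responds identically to disparities of 3 and 7 arc min.
idx = int(np.argmin(np.abs(centers - 5.0)))
same_response = np.isclose(
    channel_responses(3.0, centers, sigma)[idx],
    channel_responses(7.0, centers, sigma)[idx],
)
print(round(estimate, 3), bool(same_response))
```

The read-out recovers a disparity that lies between channel centers, which no single channel could signal on its own.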

5 Binocular Vision


In all cases, if the disparity analysis is localized to a small region of space, there is little problem with confounding the analysis with multiple disparity stimuli. Such confounding occurs only in dense random-depth scenes, such as close-up views of tree foliage. In these circumstances, averaging mechanisms would have difficulty in resolving the separation of closely spaced depth planes, whereas spectral representations would have unique distributions for multiple disparities that could be analyzed with templates (Lehky et al., 1990). Our ability to solve the correspondence problem suggests, however, that there is some interaction between adjacent stimuli, as described above in cooperative-global models. This is thought to involve inhibitory interactions between disparity detectors that have been described as inhibitory side-lobes of disparity-tuned channels (Cormack, Stevenson, & Schor, 1993; Lehky et al., 1990; Poggio et al., 1988; Stevenson et al., 1992). This produces a band-pass sensitivity to periodic spatial variations of disparity, such as is seen in a depth-corrugated surface like a curtain (Tyler, 1975).

Our ability to perceive depth transparency is often cited as a phenomenon that is inconsistent with many models of stereopsis. However, if single values of depth are analyzed along discrete directions of space, some patches of space will be coded as near and some as far. These regions could be interpolated at a higher level to perceive that the near and far surfaces are actually continuous. The alternative is that the visual system processes the multiple matches in every visual direction, resulting in true transparency.

Finally, there is the question of the metric used to quantify disparity. Disparity has been described either as a positional offset of the two retinal images from corresponding points (measured in angular units) or as a phase offset, where phase describes the disparity as a proportion of the luminance spatial period to which the disparity is optimally tuned (Schor et al., 1984a). Thus a 0.25-degree disparity would be a phase disparity of 180° for a unit tuned to 2 cycles/deg. Positional disparity could be encoded by the relative misalignment of two receptive fields that have the same distribution of excitatory and inhibitory zones in the two eyes (Barlow, Blakemore, & Pettigrew, 1967). Phase disparity could be encoded by receptive fields that are arranged in quadrature. These cells are not offset in the two eyes. They have relative phase offsets between excitatory and inhibitory zones within the monocular receptive fields, such that one cell could have a peak sensitivity centered in cosine phase and the other a displaced peak in sine phase (Figure 24) (DeAngelis, Ohzawa, & Freeman, 1991; Freeman & Ohzawa, 1990).
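The relation between the two metrics is simple arithmetic: a unit tuned to 2 cycles/deg has a spatial period of 0.5 degree, so a positional disparity of half that period corresponds to a 180° phase disparity. A small sketch of the conversion follows; the function names are illustrative only.

```python
def phase_disparity_deg(position_disparity_deg, spatial_freq_cpd):
    """Phase disparity (degrees of phase) for a positional disparity
    (degrees of visual angle) at a given spatial frequency (cycles/deg)."""
    return 360.0 * position_disparity_deg * spatial_freq_cpd

def position_disparity_deg(phase_deg, spatial_freq_cpd):
    """Inverse conversion: positional disparity in degrees of visual angle."""
    return phase_deg / (360.0 * spatial_freq_cpd)

print(phase_disparity_deg(0.25, 2.0))     # 180.0
print(position_disparity_deg(90.0, 2.0))  # 0.125
```

The same positional disparity is therefore a different phase disparity for units tuned to different spatial frequencies, which is what makes the two coding schemes empirically distinguishable.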

These models and physiological measures suggest several questions about stereopsis. These include (a) what is the minimum number of disparity channels that could account for stereo acuity? (b) what are the crowding limits on stereoresolution for adjacent stimuli? (c) are multiple depths averaged or biased toward one another? (d) is there evidence supporting the existence of inhibitory interactions in stereo-processing mechanisms? and (e) is disparity coded by phase or position?


FIGURE 24 Position and phase encoding of disparity. Two possible schemes of disparity encoding by binocular simple cells. (a) The traditional model. Solid curves show Gabor functions that represent receptive field (RF) profiles of a simple cell for the left and right eyes. The abscissa (x) for each curve gives retinal position, and the ordinate represents sensitivity to a luminance increment, such that downward deflections of the curve correspond to dark-excitatory (or OFF) subregions and upward deflections correspond to bright-excitatory (or ON) subregions. Dashed curves show the Gaussian envelopes of the idealized RFs. In this conventional scheme, the two RF profiles are identical in shape, but their centers (the centers of the Gaussian envelopes) are located at noncorresponding points (i.e., they are binocularly disparate). (b) Disparity-tuning prediction based on the conventional scheme shown in (a). The abscissa is binocular disparity, and the ordinate represents response strength. The cell is tuned to a particular (crossed) disparity by virtue of the spatial offset of the RFs from the point of retinal correspondence. (c) An alternative scheme for disparity encoding, in which the left and right RF profiles may differ in shape (or phase), but are centered at corresponding retinal locations (i.e., the centers of the Gaussian envelopes are at zero disparity). (d) Disparity tuning predicted by the phase-encoding scheme shown in (c). The optimal disparity for the cell is determined by the difference in phase between the two RF profiles, and by the size (or spatial frequency) of the fields. Note that the two schemes shown here produce differently shaped disparity-tuning curves [(b) and (d)], but the optimal disparities are the same. (From DeAngelis, G. C., Ohzawa, I., & Freeman, R. D. [1995]. Neuronal mechanisms underlying stereopsis: How do simple cells in the visual cortex encode binocular disparity? Perception, 24, 3–31. Pion, London. Reprinted with permission of the publisher and authors.)

H. Disparity Pools or Channels

As will be discussed below, stereopsis has both a transient component that senses depth of very large (up to 12 degrees), briefly presented (<200 ms) disparities (Ogle, 1952; Westheimer & Tanzman, 1956) and a sustained component that senses small disparities (<1 degree) (Ogle, 1952) near the plane of fixation. Behavioral studies suggest that different channel structures underlie these two forms of stereopsis. Many individuals are unable to perceive transient depth for any disparity magnitude presented in one depth direction, either far or near, from the fixation plane, while they are able to perceive it in response to a wide range of disparities presented in the opposite depth direction (Richards & Regan, 1973). This condition is referred to as stereo-anomalous (Richards, 1971). Stereo-anomalous subjects have normal stereo-acuity when measured with static (sustained) disparities in both the crossed and uncrossed directions. The stereo-anomalous observations have been used as evidence for three classes of disparity pools (crossed, uncrossed, and zero) that sense transient disparities. Depth aliasing by the transient-stereo system also supports the three-channel model (Edwards & Schor, 1999). Jones (1977) also observed a transient disparity-vergence deficit in stereo-anomalous subjects that was consistent with the three-pool model.

Simulations with the three-pool model indicate that the model has insufficient resolution to account for the static (sustained) stereo-acuity of 5 to 10 arc sec (Lehky et al., 1990). Correlation detection studies employing depth adaptation (Stevenson et al., 1992) and subthreshold summation (Cormack et al., 1993) revealed multiple sustained-disparity tuned mechanisms with peak sensitivities along approximately a 2° disparity continuum that had opponent center-surround organization. The width of these disparity-tuned functions varied from 5 arc min at the horopter to 20 arc min at a distance of 20 arc min from the fixation plane. Models of the sustained-stereo system that are based upon these data, and measures of stereo-depth sensitivity obtained in front of and behind the fixation plane (Badcock & Schor, 1985), indicate that a minimum of approximately 20 disparity-tuned channels is necessary to account for the sensitivity of the sustained-stereo system (Lehky et al., 1990; Stevenson et al., 1992).

V. STEREOSCOPIC DEPTH PERCEPTION

A. Depth Ordering and Scaling

A variety of cues are used to interpret a 3-D space from the 2-D retinal images. Static monocular cues rely upon some familiarity with the absolute size and shape of targets in order to make quantitative estimates of their relative distance and surface curvature. Monocular cues such as overlap do not require any familiarity with objects, but they give only qualitative information about depth ordering and do not provide depth-magnitude information. Stereopsis and dynamic motion parallax cues do yield a quantitative sense of relative depth and 3-D shape, and they do not depend upon familiarity with size and shape of objects. Depth from stereo and motion parallax can be calculated from geometrical relationships (triangulation) between two separate views of the same scene, taken simultaneously in stereopsis or sequentially in motion parallax.

Three independent variables involved in the calculation of stereo-depth are retinal image disparity, viewing distance, and the separation in space of the two viewpoints (i.e., the baseline). In stereopsis, the relationship between the linear depth interval between two objects and the retinal image disparity that they subtend is approximated by the following expression:

$\Delta d = \eta \cdot d^{2} / 2a$,

where $\eta$ is retinal image disparity in radians, $d$ is viewing distance, $2a$ is the interpupillary distance, and $\Delta d$ is the linear depth interval. $2a$, $d$, and $\Delta d$ are all expressed in the same units (e.g., meters). The formula implies that in order to perceive depths in units of absolute distance (e.g., meters), the visual system utilizes information about the interpupillary distance and the viewing distance. Viewing distance could be sensed from the angle of convergence (Foley, 1980) or from other retinal cues, such as oblique or vertical disparities. These disparities occur naturally with targets in tertiary directions from the point of fixation (Garding, Porrill, Mayhew, & Frisby, 1995; Mayhew & Longuet-Higgins, 1982; Gillam & Lawergren, 1983; Liu et al., 1994a; Rogers & Bradshaw, 1993; Westheimer & Pettet, 1992).
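The approximation can be recovered from small-angle geometry. The following is a sketch of one standard derivation, assuming symmetric fixation in the midsagittal plane; here $\eta$ denotes disparity in radians, $d$ the viewing distance, $2a$ the interpupillary distance, and $\Delta d$ the depth interval, and the vergence-angle symbols $\theta_{1}$ and $\theta_{2}$ are introduced only for this illustration.

```latex
% Vergence angle subtended at the eyes by a point at distance d (small angles):
\theta_{1} \approx \frac{2a}{d}, \qquad
\theta_{2} \approx \frac{2a}{d + \Delta d}

% Disparity is the difference between the two vergence angles:
\eta = \theta_{1} - \theta_{2}
     \approx \frac{2a}{d} - \frac{2a}{d + \Delta d}
     = \frac{2a \, \Delta d}{d \, (d + \Delta d)}

% For depth intervals that are small relative to viewing distance:
\eta \approx \frac{2a \, \Delta d}{d^{2}}
\quad \Longrightarrow \quad
\Delta d \approx \frac{\eta \, d^{2}}{2a}
```

The last step holds only when $\Delta d \ll d$; for large depth intervals the term $d(d + \Delta d)$ should be retained.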

The equation illustrates that for a fixed retinal image disparity, the corresponding linear depth interval increases with the square of viewing distance and that viewing distance is used to scale the horizontal disparity into a linear depth interval. When objects are viewed through base-out prisms that stimulate additional convergence, perceived depth should be reduced by underestimates of viewing distance. Furthermore, the pattern of zero retinal image disparities described by the curvature of the longitudinal horopter varies with viewing distance. It can be concave at near distances and convex at far distances in the same observer (Figure 3) (Ogle, 1964). Thus, without distance information, the pattern of retinal image disparities across the visual field is insufficient to sense either depth ordering (surface curvature) or depth magnitude (Garding et al., 1995). Similarly, the same pattern of horizontal disparity can correspond to different slants about a vertical axis presented at various horizontal gaze eccentricities (Ogle, 1964). Convergence distance and direction of gaze are important sources of information used to interpret slant from disparity fields associated with slanting surfaces (Banks & Backus, 1998). Clearly, stereo-depth perception is much more than a disparity map of the visual field.
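The square-law scaling can be checked numerically with a short sketch of the approximation (depth interval = disparity in radians × viewing distance squared ÷ interpupillary distance). The 0.065 m interpupillary distance and the 10 arc sec disparity are assumed example values, not figures from the text.

```python
import math

ARCSEC_PER_RADIAN = 3600.0 * 180.0 / math.pi

def depth_interval_m(disparity_arcsec, viewing_distance_m, interpupillary_m=0.065):
    """Approximate linear depth interval (m) for a given retinal disparity.

    Small-angle approximation, valid when the depth interval is much
    smaller than the viewing distance. The default interpupillary
    distance is an assumed typical adult value.
    """
    eta = disparity_arcsec / ARCSEC_PER_RADIAN  # disparity in radians
    return eta * viewing_distance_m ** 2 / interpupillary_m

near = depth_interval_m(10.0, 0.5)  # same disparity viewed at 0.5 m
far = depth_interval_m(10.0, 1.0)   # and at 1.0 m

print(f"{near * 1000:.3f} mm at 0.5 m, {far * 1000:.3f} mm at 1.0 m")
print(f"ratio = {far / near:.1f}")
```

Doubling the viewing distance quadruples the depth interval signalled by the same disparity, which is why an underestimate of viewing distance should reduce perceived depth.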

There are lower and upper limits of retinal image disparity that can be coded by the nervous system and used to interpret relative depth. The region of useful disparities is illustrated in Figure 25 (Tyler, 1983), which plots perceived depth as a function of binocular disparity. The lower left-hand corner of the function represents the lower disparity limit or stereo-acuity. The lower right-hand corner represents the upper disparity limit for stereopsis, beyond which no stereo depth is perceived. The upper disparity limit for stereopsis (approximately 1000 arc min) is much greater than for singleness or Panum’s fusional area, indicated by the vertical line at 6 arc min (Ogle, 1952). When evaluated with narrow-band sustained stimuli, however, the upper disparity limit is only slightly greater than the fusion limit (Figure 10). The largest upper disparity limit for stereopsis occurs at low spatial frequencies. It corresponds to the large upper stereo limit shown at the right side of the horizontal axis in Figure 25. Below Panum’s limit, targets are seen singly and in depth, whereas above Panum’s limit they are seen as double and in depth for a limited range. Between the upper and lower limits there is a region where there is a veridical match between perceived depth and actual depth. The maximum perceived depth occurs just beyond the fusion limit. Then perceived depth actually diminishes as disparity is increased to the upper disparity limit. The rising limb of this function describes quantitative stereopsis, in which perceived depth increases monotonically with retinal image disparity. The falling limb describes stereopsis with nonveridical depth equal to that perceived with smaller disparities.

FIGURE 25 Sensory limits imposed by disparity. Schematic of some stereoscopic limits of perceived depth and fusion as a function of binocular disparity. [From Tyler, C. W. (1983). Sensory processing of binocular disparity. In C. M. Schor & K. Ciuffreda (Eds.), Vergence eye movements: Clinical and basic aspects. Boston: Butterworth.]

B. Hyperacuity, Superresolution, and Gap Resolution

There are three classes of visual-direction acuity described by Westheimer (1979, 1987). They are referred to as Hyperacuity, Super-Resolution Acuity, and Gap Resolution Acuity (Figure 26). Hyperacuity tasks involve detection of a small relative displacement of targets in space or time. The term hyperacuity refers to extremely low thresholds that are less than the width of a single photoreceptor in the retina. A classic example is Vernier acuity, in which misalignment is detected between adjacent targets that are separated along a meridian that is orthogonal to the axis of their misalignment. Superresolution involves size discrimination between sequentially viewed targets. Width discrimination is a superresolution task. Gap-resolution represents our ability to resolve space between two separate targets that produces a dip in the combined target luminance profile. It is a judgment based upon something like a Rayleigh criterion. Measures of visual acuity with a Snellen E or Landolt C are examples of gap-resolution.

FIGURE 26 Hyperacuity, superresolution, and gap resolution tasks. Schematic diagram of targets and corresponding spread functions used to measure (a) hyperacuity, (b) superresolution, and (c) gap resolution. The left and right panels show alternatives given in a forced-choice procedure. A comparison of (a) single unbroken target and broken target measures offset hyperacuity; (b) single target and two parallel targets measures width or thickness superresolution acuity; (c) two parallel, separated targets and four targets of equal overall separation (approximating a filled area) measures gap resolution acuity. For luminance domain acuity measures, the targets could be thin lines; in these stereoscopic acuity measures, the targets were random-dot planes, with all offsets and separations occurring along the depth, or z-axis. The spread functions under each target represent hypothetical distributions produced in the nervous system on viewing the targets: presumably, the information on which the subject bases his or her choice in the task. The axes at lower left refer to these spread functions: for stereoacuity estimates, x is retinal disparity and y could be some measure of interocular correlation or matching probability. (Reprinted from Vision Research, 29, Stevenson, S. B., Cormack, L. K., & Schor, C. M. Hyperacuity, superresolution and gap resolution in human stereopsis, pp. 1597–1605, Copyright © 1989, with kind permission from Elsevier Science Ltd., The Boulevard, Langford Lane, Kidlington, OX5 1GB, United Kingdom, and the authors.)

FIGURE 27 Schematic illustration of the suprathreshold appearance of stereo tasks of hyperacuity, superresolution, and gap resolution. The panel depicts a perspective view of the targets in Figure 26. (a) Hyperacuity stimulus appeared as a fronto-parallel plane of dynamic random dots split across the middle, with the bottom half closer in depth than the top. (b) Superresolution stimulus appeared as a thick random-dot slab, yielding “pykno-stereopsis.” (c) Gap resolution stimulus appeared as two distinct, overlapped random-dot surfaces with empty space between them, yielding diastereopsis. (Reprinted from Vision Research, 29, Stevenson, S. B., Cormack, L. K., & Schor, C. M. Hyperacuity, superresolution and gap resolution in human stereopsis, pp. 1597–1605, Copyright © 1989, with kind permission from Elsevier Science Ltd., The Boulevard, Langford Lane, Kidlington, OX5 1GB, United Kingdom, and the authors.)

There are forms of stereo-acuity that are analogous to the three forms of monocular acuity (Stevenson et al., 1989). As shown in Figure 27, stereo-hyperacuity tasks involve discrimination of depth between adjacent targets. Stereo-superresolution involves discriminating between the depth-axis thickness of surfaces (pykno-stereopsis) (Tyler, 1983). Stereo-gap-resolution tasks require discrimination between a single thick surface and two overlaying separate surfaces. Stereo-gap perception of

two overlapping depth surfaces is referred to as dia-stereopsis (Tyler, 1983). The thresholds for these three stereo tasks are similar to their monocular counterparts. Thresholds for stereo-hyperacuity range from 3 to 6 arc sec. Threshold for stereo-superresolution ranges from 15 to 30 arc sec, and threshold for stereo-gap resolution is approximately 200 arc sec. These distinct thresholds demonstrate that stereopsis can subserve different types of acuity tasks, and performance on these tasks follows performance on analogous visual direction acuity tasks. The thresholds are modeled from the spread functions depicted in Figure 28. The spread functions represent the combined noise produced by optical filtering, oculomotor vergence noise, and neural filtering. Acuity limits for stereo and visual direction tasks could be attributed to a 3-D ellipsoid of positional uncertainty formed by the spread functions on each spatial dimension.

FIGURE 28 Neural activity associated with three types of stereo-acuity. Hypothetical distributions of neural activity corresponding to zero-disparity and the threshold stimulus for each task are on the right. (Reprinted from Vision Research, 29, Stevenson, S. B., Cormack, L. K., & Schor, C. M. Hyperacuity, superresolution and gap resolution in human stereopsis, pp. 1597–1605, Copyright © 1989, with kind permission from Elsevier Science Ltd., The Boulevard, Langford Lane, Kidlington, OX5 1GB, United Kingdom, and the authors.)
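The Rayleigh-like logic of gap resolution can be sketched by summing two spread functions and asking whether the combined profile shows a central dip. The Gaussian shape of the spread function and the unit width used here are assumptions made for illustration; the actual spread functions in the cited work are hypothetical distributions, not measured Gaussians.

```python
import numpy as np

def combined_profile(x, separation, sigma):
    """Sum of two equal Gaussian spread functions centered at +/- separation/2."""
    left = np.exp(-0.5 * ((x + separation / 2.0) / sigma) ** 2)
    right = np.exp(-0.5 * ((x - separation / 2.0) / sigma) ** 2)
    return left + right

def has_central_dip(separation, sigma, n=2001):
    """True if the summed profile is lower at the midpoint than at its peaks."""
    x = np.linspace(-separation, separation, n)  # odd n puts x = 0 at index n // 2
    y = combined_profile(x, separation, sigma)
    return bool(y[n // 2] < y.max() * (1.0 - 1e-9))

sigma = 1.0  # spread-function width in arbitrary disparity units (assumed)
print(has_central_dip(1.0 * sigma, sigma))  # False: profiles merge into one peak
print(has_central_dip(3.0 * sigma, sigma))  # True: a dip separates the two peaks
```

For equal Gaussians the dip appears only once the separation exceeds twice the spread width, so under this assumption a gap threshold is expected to be several times larger than an offset threshold that depends only on locating a single peak.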

C. Stereo-Acuity

Under optimal conditions we are able to resolve depth differences as small as 3 to 6 sec of arc. This performance level can be achieved with foveal fixation of targets located at the plane of fixation (on the horopter). Stereo-threshold is measured with two spatial offsets in each eye, such as is illustrated in Figure 29. It is remarkable that the spatial offset in each eye at the stereo-threshold is smaller than the minimum spatial offset that can be detected by one eye alone (Vernier threshold) (Berry, 1948; McKee & Levi, 1987; Schor & Badcock, 1985; Westheimer & McKee, 1979). This observation poses problems for ideal detector models of hyperacuity that attempt to explain limitations of Vernier acuity with retinal factors (Banks, Sekuler, & Anderson, 1991).

FIGURE 29 Stereo acuity and width discrimination. Diagram of displays used to compare a stereo-increment task with a monocular width-increment task. In the stereo task subjects fixated a line and detected a change in depth of a second line placed above it, about each of several depth pedestals. In the monocular task subjects fixated a line and detected a change in the distance between two other lines about each of several initial separations. (Reprinted from Vision Research, 30, McKee, S. P., Levi, D. M., & Bowne, S. F. The imprecision of stereopsis, pp. 1763–1779, Copyright © 1990, with kind permission from Elsevier Science Ltd., The Boulevard, Langford Lane, Kidlington, OX5 1GB, United Kingdom, and the authors.)
