File: Digital_Video_and_HD_Second_Edition_Algorithms_and_Interfaces.pdf


Luma is comparable to lightness; it is often carelessly and incorrectly called luminance by video engineers. See Appendix A, YUV and luminance considered harmful, on page 567.

response of vision at a certain state of adaptation. A few greyscale imaging systems have pixel values proportional to L*.

Value refers to measures of lightness roughly equivalent to CIE L* (but typically scaled 0 to 10, not 100). In image science, value is rarely – if ever – used in any sense consistent with accurate colour. (Several different value scales are graphed in Figure 20.2 on page 208.)

Regrettably, many practitioners of digital image processing and computer graphics have a cavalier attitude toward the terms described here. In the HSB, HSI, HSL, and HSV systems, B allegedly stands for brightness, I for intensity, L for lightness, and V for value; however, none of these systems is associated with brightness, intensity, luminance, or value according to any definition that is recognized in colour science!

Colour images are sensed and reproduced based upon tristimulus values (tristimuli), whose amplitudes are proportional to intensity but whose spectral compositions are carefully chosen according to the principles of colour science. Relative luminance can be regarded as a distinguished tristimulus value useful on its own; apart from that, tristimulus values come in sets of three.

The image sensor of a digital camera produces values, proportional to radiance, that approximate red, green, and blue (RGB) tristimulus values. I call these values linear-light. However, in most imaging systems, RGB tristimulus values are coded using a nonlinear transfer function – gamma correction – that mimics the perceptual response of human vision. Most image coding systems use R’G’B’ values that are not proportional to intensity; the primes in the notation denote imposition of a perceptually motivated nonlinearity.

Luma (Y’) is formed as a suitably weighted sum of R’G’B’; it is the basis of luma/colour difference coding in video, MPEG, JPEG, and similar image coding systems. In video, the nonlinearity implicit in gamma correction that forms R’G’B’ components is subsequently incorporated into the luma and chroma (Y’CBCR) components.
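The weighted sum that forms luma can be sketched as follows. This is a minimal illustration, not a definitive implementation: the passage above does not say which weights apply, so the BT.709 luma coefficients are assumed here, the components are assumed to be gamma-corrected and normalised to the range 0 to 1, and the function name `luma_709` is hypothetical.

```python
# Assumed BT.709 luma coefficients; other standards (for example BT.601) use other weights.
KR, KG, KB = 0.2126, 0.7152, 0.0722

def luma_709(r_prime: float, g_prime: float, b_prime: float) -> float:
    """Return luma Y' as a weighted sum of gamma-corrected R'G'B' (each 0..1)."""
    return KR * r_prime + KG * g_prime + KB * b_prime

white = luma_709(1.0, 1.0, 1.0)  # the weights sum to 1, so white maps to 1
green = luma_709(0.0, 1.0, 0.0)  # pure green carries most of the luma
```

Because the inputs are already nonlinear (primed) values, the result is luma, not relative luminance.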

Contrast

The term contrast, as used in imaging, is heavily overloaded. “Contrast” can relate to physical properties of light in a scene, physical properties of light in an image, properties of an imaging system’s mapping between the two, physical properties of a display system (independent of any image), or various attributes of perception.

28	DIGITAL VIDEO AND HD ALGORITHMS AND INTERFACES

The histogram computed and displayed by a D-SLR camera typically has luma as the independent axis, not luminance or log luminance.

The term key used by photographers is unrelated to key used in video to refer to matting.

Simultaneous contrast ratio is sometimes shortened to simultaneous contrast, which unfortunately has a second (unrelated) meaning. See Surround effect, on page 116. Owing to this ambiguity, I avoid the term simultaneous contrast ratio and use the term intra-image contrast ratio instead. Contrast ratio without qualification should be taken as intra-image.

Photographers speak of scenes having “high contrast” or “low contrast.” A scene’s contrast can be characterized by the coefficient of variation of its luminance – that is, by the standard deviation of its luminance divided by its mean luminance. Scene contrast is closely related to the width of the scene’s luminance histogram with respect to its “centre of gravity.” A dark scene and a light scene may have identical scene contrast. Scene contrast is unaffected by noise.
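The coefficient of variation described above can be sketched in a few lines. The sample values and the function name `scene_contrast` are hypothetical, and the population standard deviation is an assumption (the text does not say whether population or sample deviation is meant).

```python
from statistics import mean, pstdev

def scene_contrast(luminances):
    """Coefficient of variation: standard deviation divided by mean luminance."""
    return pstdev(luminances) / mean(luminances)

dark_scene = [1.0, 2.0, 3.0, 4.0]           # hypothetical luminance samples
light_scene = [10 * v for v in dark_scene]  # the same scene, ten times brighter

contrast_dark = scene_contrast(dark_scene)
contrast_light = scene_contrast(light_scene)
```

Scaling every luminance by the same factor scales both the deviation and the mean, so the two scenes have the same scene contrast, as the paragraph states.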

Photographers also speak of “high key” and “low key” scenes. Loosely speaking, these are predominantly light or dark (respectively). From a cumulative histogram of log luminance, key can be quantified as the fraction of the interval along the x-axis between the 10th and 90th percentile where the 50th percentile falls.
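One possible reading of the key measure above is sketched below: take percentiles of log luminance, then locate the 50th percentile within the interval from the 10th to the 90th. The function name `scene_key`, the sample data, and the choice of the inclusive interpolation method are assumptions made for illustration.

```python
import math
from statistics import quantiles

def scene_key(luminances):
    """Position of the median within the 10th-to-90th percentile span of log luminance."""
    logs = [math.log10(v) for v in luminances]
    deciles = quantiles(logs, n=10, method="inclusive")  # nine cut points: 10%, 20%, ..., 90%
    p10, p50, p90 = deciles[0], deciles[4], deciles[8]
    return (p50 - p10) / (p90 - p10)

balanced = scene_key([1, 10, 100, 1000, 10000])  # evenly spread in log luminance
high_key = scene_key([1, 50, 80, 90, 100])       # predominantly light samples
```

Under this reading, a value near 0.5 indicates a balanced scene, values toward 1 indicate a high-key scene, and values toward 0 a low-key scene.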

Contrast ratio

A basic property of a display is its contrast ratio, the ratio between specified light and dark luminances – typically the luminance associated with reference (or peak) white and the luminance associated with reference black. Inter-image (or on-off, or sequential) contrast ratio is measured between fully-white and fully-black images presented individually. Intra-image contrast ratio is measured from white and black regions of a single test image such as that specified by ANSI IT7.228-1997 (now withdrawn, for projectors) or ITU-R BT.815. Visual performance of a display system is best characterized by intra-image contrast ratio.

BT.815 specifies a test image, having 16:9 aspect ratio, useful for measuring contrast ratio; see Figure 3.1. Contrast ratio is the luminance of the white square divided by the average luminance of the black squares.
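The contrast ratio calculation described above amounts to one division. In this sketch the photometer readings are hypothetical values in cd/m², and the function name is illustrative.

```python
def intra_image_contrast_ratio(white_luminance, black_luminances):
    """White-square luminance divided by the average of the black-square luminances."""
    average_black = sum(black_luminances) / len(black_luminances)
    return white_luminance / average_black

# Hypothetical readings: one white square and four black squares.
ratio = intra_image_contrast_ratio(100.0, [0.24, 0.26, 0.25, 0.25])
```

With these made-up readings the average black level is 0.25 cd/m², giving a ratio of about 400:1.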

In practical imaging systems, many factors conspire to increase the luminance of black, thereby lessening contrast ratio and impairing picture quality. On an electronic display or in a projected image, intra-image contrast ratio is typically less than 800:1 owing to flare in the display system and/or spill light (stray light) in the ambient environment. Typical contrast ratios are shown in Table 3.1. Contrast ratio is a major determinant of subjective image quality, so much so that an image reproduced with a high simultaneous contrast ratio may be judged sharper than another image that has higher measured spatial frequency content.

CHAPTER 3

LINEAR-LIGHT AND PERCEPTUAL UNIFORMITY

Figure 3.1 The ITU-R BT.815 pattern is used to characterize intra-image contrast ratio. The grey background is at 8-bit interface code 126. A small square of reference white lies at the centre of the image; it occupies 2/15 of the image height. Four identical small squares of reference black are disposed around the centre, offset by 1/3 of the image height, at 12 o’clock, 3 o’clock, 6 o’clock, and 9 o’clock.

Application	Max. (Ref.) luminance [cd·m⁻²]	Typ. PDR luminance [cd·m⁻²]	Typical inter-image contrast ratio	Typical BT.815 intra-image contrast ratio	Minimum L*
Cinema (projection)	480 (48)	24	1000:1	100:1	9
HD, studio mastering	120 (100)	90	10 000:1	1000:1	0.9
HD, living room (typ.)	200)	200	1000:1	400:1	2.3
Office (sRGB, typ.)	320)	320	100:1	100:1	16

Table 3.1 Typical contrast ratios are summarized. PDR refers to perfect diffuse reflector.

Because contrast ratio involves measurement of just two quantities – reference white and reference black – it is independent of transfer function. Contrast ratio (like its first cousin, dynamic range) says nothing about bit depth: A bilevel image contains both reference white and reference black, and suffices to produce the optical stimulus necessary to measure contrast ratio.

Perceptual uniformity

I introduced perceptual uniformity on page 8. Coding of a perceptual quantity is perceptually uniform if a small perturbation to the coded value is approximately equally perceptible across the range of that value. Vision cannot distinguish two luminance levels if the ratio between them is less than about 1.01 – in other words, the visual threshold for luminance difference is about 1%. This contrast sensitivity threshold is established by experiments using a test pattern such as the one sketched in Figure 3.2 in the margin; details will be presented in Contrast sensitivity, on page 249. Ideally, pixel values are placed at these just-noticeable difference (JND) increments along the scale from reference black to reference white.

[Figure 3.2: test pattern with adjacent patches of luminance L and L+∆L.]

Figure 3.2 A contrast sensitivity test pattern reveals that a difference in luminance will be observed in certain conditions when ∆L exceeds about 1% of L. This threshold is called a just-noticeable difference (JND).

[Figure 3.3: linear-light code scale from 0 to 255. Steps between adjacent codes: ∆ = 0.5% at codes 200 and 201, ∆ = 1% at codes 100 and 101, ∆ = 5% at codes 20 and 21. The ratio between code 255 and code 100 is 2.55:1.]

Figure 3.3 The “code 100” problem with linear-light coding is that for code levels below 100 the steps between code values have ratios larger than the visual threshold: With just 256 steps, some steps are liable to be visible as banding.

The “code 100” problem and nonlinear image coding

Consider 8-bit pixel values proportional to luminance, where code zero represents black, and the maximum code value of 255 represents white, as in Figure 3.3. Code 100 lies at the point on the scale where adjacent luminance values differ by 1%: Owing to the approximate 1% contrast threshold of vision, the boundary between a region of code-100 samples and a region of code-101 samples is liable to be visible.

As pixel value decreases below 100, the difference in luminance between adjacent codes becomes increasingly perceptible: At code 20, adjacent luminance values differ by 5%. In a large area of smoothly varying shades of grey, these luminance differences are likely to be visible or even objectionable. Visible jumps in luminance produce contouring or banding artifacts.

[Figure 3.4: linear-light code scale from 0 to 4095. Steps between adjacent codes: ∆ = 0.025% at codes 4000 and 4001, ∆ = 1% at codes 100 and 101. The ratio between code 4095 and code 100 is 40.95:1.]

Figure 3.4 The “code 100” problem is mitigated by using more than 8 bits to represent luminance. Here, 12 bits are used, placing the top end of the scale at 4095. However, the majority of these 4096 codes cannot be distinguished visually.

Linear-light codes above 100 suffer no banding artifacts. However, as code value increases toward white, the codes have decreasing perceptual utility: At code 200, adjacent codes differ in luminance by just 0.5%, below the threshold of visibility. Codes 200 and 201 are visually indistinguishable; code 201 could be discarded without its absence being noticed.

High-quality image display requires a contrast ratio of at least 30 to 1 between the luminance of white and the luminance of black. In 8-bit linear-light coding, the ratio between the brightest luminance (code 255) and the darkest luminance that can be reproduced without banding (code 100) is only 2.55:1. Linear-light coding in 8 bits is therefore unsuitable for high-quality images.
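The arithmetic behind the “code 100” problem can be checked with a short sketch. It assumes 8-bit codes strictly proportional to luminance and a 1% visual threshold, as in the discussion above; the function and variable names are illustrative.

```python
THRESHOLD_PERCENT = 1.0  # approximate visual threshold quoted in the text

def step_percent(code: int) -> float:
    """Relative luminance step, in percent, from `code` to `code + 1` in linear-light coding."""
    return 100.0 / code

# Lowest code whose step to the next code is at or below the threshold.
lowest_clean_code = next(c for c in range(1, 256) if step_percent(c) <= THRESHOLD_PERCENT)

# Ratio between white (code 255) and the darkest code free of banding.
usable_ratio = 255 / lowest_clean_code
```

The step is 5% at code 20, 1% at code 100, and 0.5% at code 200, so the usable range above code 100 spans only 2.55:1, far short of the 30:1 cited above.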

One way to mitigate the “code 100” problem is to place the top end of the scale at a code value much higher than 100, as sketched in Figure 3.4. If luminance is represented in 12 bits, with white at code 4095, the luminance ratio between code 100 and white reaches 40.95:1. However, the vast majority of those 4096 code values cannot be distinguished visually; for example, codes 4000 through 4039 are visually indistinguishable. Rather than coding luminance linearly with a large number of bits, we can use many fewer code values, and assign them on a perceptual scale that is nonlinear with respect to light power. Such nonlinear-light – or perceptually uniform – coding is pervasive in digital imaging; it works so beautifully that many practitioners don’t even know it’s happening.
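The 12-bit figures quoted above can be reproduced with the following sketch. It assumes linear-light codes and a threshold ratio of 1.01; integer arithmetic is used so that the comparison at exactly 1% is not affected by floating-point rounding. The helper names are hypothetical.

```python
def ratio_to_white(white_code: int, lowest_clean_code: int = 100) -> float:
    """Luminance ratio between white and the darkest code free of banding."""
    return white_code / lowest_clean_code

def last_indistinguishable(code: int) -> int:
    """Highest code whose luminance is less than 1.01 times that of `code`."""
    c = code
    # (c + 1) / code < 1.01, written with integers to avoid rounding error.
    while (c + 1) * 100 < code * 101:
        c += 1
    return c

twelve_bit_ratio = ratio_to_white(4095)
upper = last_indistinguishable(4000)
```

Under these assumptions, white at code 4095 gives 40.95:1, and every code from 4000 up to 4039 lies within 1% of code 4000.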

In video (including motion-JPEG, MPEG, and H.264/AVC), and in digital photography (including JPEG/JFIF/Exif), R’G’B’ components are coded in a perceptually uniform manner. A video display has a nonlinear transfer function from code value to luminance. That function is comparable to the function graphed in Figure 1.7 on page 8, and approximates the inverse of the lightness sensitivity of human vision. Perceptual uniformity is effected by applying a nonlinear transfer function – gamma correction – to each tristimulus estimate sensed from the scene. Gamma correction parameters are chosen – by default, manually, or automatically – such that the intended image appearance is obtained on the reference display in the reference viewing environment. Gamma correction thereby incorporates both considerations of perceptual uniformity and considerations of picture rendering.

An example of 8-bit quantization as commonly used in computing is shown in the right-hand sketch of Figure 4.1, on page 37.

$\frac{\log 100}{\log 1.01} \approx 462; \quad 1.01^{462} \approx 100$

Conversely, display R’G’B’ values are proportional to displayed luminance raised to approximately the 0.42-power. See Gamma, on page 315.

$\frac{(255/255)^{2.4}}{(254/255)^{2.4}} \approx 1.01$

A truecolour image in computing is usually represented in R’G’B’ components of 8 bits each, as I will explain further on page 36. Each component ranges from 0 (black) through 255 (white). Greyscale and truecolour data in computing is usually coded so as to exhibit approximate perceptual uniformity: The steps are not proportional to intensity, but are instead uniformly spaced perceptually. The number of steps required depends upon properties of the display system, of the viewing environment, and of perception.

Reference display and viewing parameters for studio video have historically been either poorly standardized or not standardized at all. In Reference display and viewing conditions, on page 427, I summarize a set of realistic parameters.

Video standards such as BT.709 and SMPTE ST 274 have historically established standard opto-electronic conversion functions (OECFs), as if video had scene-referred image state and as if cameras had no adjustments! In fact, video is effectively output (display) referred, and what matters most is not the OECF but the electro-optical conversion function, EOCF! Nonlinear coding is the central topic of Chapter 27, Gamma, on page 315; that chapter discusses OECF and EOCF.

If the threshold of vision behaved strictly according to the 1% relationship across the whole tone scale, then luminance could be coded logarithmically. For a contrast ratio of 100:1, about 462 code values would be required, corresponding to about 9 bits.
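The count of logarithmic code values follows from dividing the log of the contrast ratio by the log of the step ratio. The sketch below assumes a uniform step ratio of 1.01 across the whole scale; the function name is hypothetical.

```python
import math

def log_code_count(contrast_ratio: float, step_ratio: float = 1.01) -> float:
    """Number of steps of `step_ratio` needed to span `contrast_ratio`."""
    return math.log(contrast_ratio) / math.log(step_ratio)

codes = log_code_count(100.0)        # a little over 460 steps
bits = math.ceil(math.log2(codes))   # smallest whole number of bits that holds them
```

The step count lands between 2⁸ = 256 and 2⁹ = 512, which is why about 9 bits are needed.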

For reasons to be explained in Luminance and lightness, on page 255, instead of modelling the lightness sensitivity of vision as a logarithmic function, in most digital imaging systems we model it as a power function. The luminance of the red, green, or blue primary light produced by a display is made proportional to code value raised to approximately the 2.4-power. The equation in the margin computes the ratio of linear-light value between the highest 8-bit code value (255) and the next-lowest value (254): The 1.01-requirement of human vision is met. As code values decrease, the ratio gets larger; however, as luminance decreases the luminance ratio required to maintain the “JND” of vision gets larger. The 2.4-power function turns out to be a very good match to the perceptual requirement.

A related claim is that 8-bit imaging has an optical density range of about 2.4, where 2.4 is the base-10 log of 1/255. This claim similarly rests upon linear-light coding.

$\left(\frac{1}{255}\right)^{2.4} \approx 0.0000016$

See Bit depth requirements, on page 326.
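The ratio between adjacent codes under a 2.4-power display function can be explored with the following sketch. It assumes an idealised display whose luminance is exactly the 2.4-power of normalised 8-bit code value, with no black offset or flare; the function names are illustrative.

```python
GAMMA = 2.4  # approximate display exponent quoted in the text

def display_luminance(code: int, max_code: int = 255) -> float:
    """Relative luminance of an idealised power-function display."""
    return (code / max_code) ** GAMMA

def step_ratio(code: int) -> float:
    """Luminance ratio between `code` and the next-lower code."""
    return display_luminance(code) / display_luminance(code - 1)

top_ratio = step_ratio(255)  # close to the 1.01 threshold
low_ratio = step_ratio(26)   # larger ratio near the dark end of the scale
```

The ratio at the top of the scale sits just under 1.01, and it grows as code value falls, which matches the behaviour described above.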

Eight-bit imaging systems are often claimed to have “dynamic range” of 255:1 or 256:1. Such claims arise from the assumption that image data codes are linearly related to light. However, most 8-bit image data is coded perceptually, like sRGB, assuming a 2.2- or 2.4- power function at the display: For an ideal display, the dynamic range associated with code 1 would be close to a million to one. In practice, physical parameters of the display and its environment limit the dynamic range.
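The difference between the linear-light claim and perceptual coding can be made concrete with a short calculation. This sketch assumes an ideal display with a pure power function and no flare or ambient light; `ideal_dynamic_range` is a hypothetical helper.

```python
def ideal_dynamic_range(gamma: float, max_code: int = 255) -> float:
    """Luminance at `max_code` divided by luminance at code 1 for a pure power function."""
    return float(max_code) ** gamma

linear_claim = ideal_dynamic_range(1.0)  # the 255:1 figure, valid only for linear-light codes
range_22 = ideal_dynamic_range(2.2)      # 2.2-power display
range_24 = ideal_dynamic_range(2.4)      # 2.4-power display
```

Under these idealised assumptions the power-function results are hundreds of thousands to one, several orders of magnitude beyond 255:1.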

The cathode ray tube (CRT) was, for many decades, the dominant display device for television receivers and for desktop computers. Amazingly, a CRT exhibits an EOCF that is very nearly the inverse of vision’s lightness sensitivity! The nonlinear lightness response of vision and the power function intrinsic to a CRT combine to cause the display code value (historically, voltage) to exhibit perceptual uniformity, as demonstrated in Figures 3.5 and 3.6 (opposite). The CRT’s characteristics were the basis for perceptual uniformity for the first half century of electronic imaging. Most modern display devices – such as LCDs, PDPs, and DLPs – do not have a native, physical power function like that of a CRT; however, signal processing circuits impose whatever transfer function is necessary to make the device behave as if it had a 2.2- or 2.4-power function from signal value to displayed tristimulus value.

In video, this perceptually uniform relationship is exploited by gamma correction circuitry incorporated into every video camera. The R’G’B’ values that result from gamma correction – the values that are processed, recorded, and distributed in video – are roughly proportional to the 0.42-power of the intended display luminance: R’G’B’ values are nearly perceptually uniform. Perceptual uniformity allows as few as 8 bits to be used for each R’G’B’ component. Without perceptual uniformity, each component would need 11 bits or more. Image coding standards for digital still cameras – for example, JPEG/Exif – adopt a similar approach.
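A simplified model of this coding is sketched below. It assumes a pure power law with exponent 1/2.4 (about 0.42) for encoding and 2.4 for decoding; real standards such as BT.709 and sRGB add a linear segment near black, which is omitted here. The function names are hypothetical.

```python
ENCODE_EXPONENT = 1 / 2.4  # about 0.42
DECODE_EXPONENT = 2.4

def encode(luminance: float) -> int:
    """Map relative luminance (0..1) to an 8-bit code with a pure power law."""
    return round(255 * luminance ** ENCODE_EXPONENT)

def decode(code: int) -> float:
    """Map an 8-bit code back to relative luminance."""
    return (code / 255) ** DECODE_EXPONENT

mid_grey_code = encode(0.18)         # 18% luminance lands near the middle of the code range
round_trip = decode(mid_grey_code)   # close to 0.18 after quantization
```

In this model a luminance of 18% of white is assigned a code near the midpoint of the 8-bit scale, so roughly half of the codes serve the darkest fifth of the luminance range.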


[Figure 3.5: greyscale ramp; horizontal axis: pixel value, 8-bit scale.]

Figure 3.5 A greyscale ramp on a CRT display is generated by writing successive integer values 0 through 255 into the columns of a framebuffer. When processed by a digital-to-analog converter (DAC), and presented to a CRT display, a perceptually uniform sweep of lightness results. A naive experimenter might conclude – mistakenly! – that code values are proportional to intensity.

[Figure 3.6: greyscale ramp with three horizontal scales: pixel value (8-bit scale, 0 to 250), relative luminance Y (0 to 1), and CIE lightness L* (0 to 100).]

Figure 3.6 A greyscale ramp, augmented with CIE relative luminance (Y, proportional to intensity, on the middle scale), and CIE lightness (L*, on the bottom scale). The point midway across the screen has lightness value midway between black and white. There is a near-linear relationship between code value and lightness. However, luminance at the midway point is only about 18% of white! Luminance produced by a CRT is approximately proportional to the 2.4-power of code value. Lightness is approximately the 0.42-power of luminance. Amazingly, these relationships are near inverses. Their near-perfect cancellation has led many workers in video, computer graphics, and digital image processing to misinterpret the term intensity, and to underestimate – or remain ignorant of – the importance of nonlinear transfer functions.
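The near cancellation described in the caption can be checked numerically. The sketch assumes an idealised CRT with an exact 2.4-power response and uses the standard CIE 1976 definition of L*; the function names are illustrative.

```python
def cie_lightness(y: float) -> float:
    """CIE 1976 lightness L* from relative luminance Y (0..1, white = 1)."""
    if y > (6 / 29) ** 3:
        return 116 * y ** (1 / 3) - 16
    return (29 / 3) ** 3 * y  # linear segment near black, slope about 903.3

def crt_luminance(code: int) -> float:
    """Relative luminance of an idealised CRT with a 2.4-power response."""
    return (code / 255) ** 2.4

mid_code = 128
mid_luminance = crt_luminance(mid_code)       # a little under one fifth of white
mid_lightness = cie_lightness(mid_luminance)  # close to 50 on the 0..100 scale
```

The midpoint code yields a luminance near the 18% quoted in the caption, and a lightness near 50, halfway between black and white.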
