
The sharpness control in consumer receivers effects horizontal “enhancement” on the luma signal.

In the rare case of an even-order median filter, the output is the average of the central two samples after sorting.

Enhancement in this case, also known as aperture correction, is accomplished by some degree of highpass filtering, in the horizontal direction, the vertical direction, or both. Compensation for loss of detail (MTF) should be done in the linear-light domain; however, it is sometimes done in the gamma-corrected domain. Historically, vertical aperture correction in interlaced tube cameras (vidicons and plumbicons) was done in the interlaced domain.
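Horizontal aperture correction can be sketched as an unsharp-mask-style highpass boost on one scan line of luma samples. The `[-1, 2, -1]` second-difference kernel and the gain value below are illustrative assumptions, not taken from any particular broadcast standard:

```python
# Horizontal aperture correction sketched as a highpass boost.
# The second-difference kernel and the gain are illustrative
# assumptions, not a specific standard.

def aperture_correct(line, gain=0.5):
    """Boost horizontal detail in one scan line of luma samples."""
    out = list(line)
    for i in range(1, len(line) - 1):
        # Second difference: responds only to local detail, not to
        # flat areas or linear ramps.
        highpass = 2 * line[i] - line[i - 1] - line[i + 1]
        out[i] = line[i] + gain * highpass
    return out

# A soft edge acquires undershoot and overshoot on either side,
# which is perceived as increased sharpness:
soft_edge = [0.0, 0.0, 0.25, 0.5, 0.75, 1.0, 1.0]
print(aperture_correct(soft_edge))
```

Note that the overshoot drives samples outside the original signal range; a practical implementation would have to clip or soft-limit the result.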

More generally, enhancement is liable to involve nonlinear processes that are based on some assumptions about the properties of the image data. Unless signal flow is extremely well controlled, there is a huge danger in using such operations: Upon receiving image data that has not been subject to the expected process, “enhancement” is liable to degrade the image, rather than improve it. For this reason, I am generally very strongly opposed to “enhancement.”

Median filtering

A median filter is a nonlinear filter in which each output sample is computed as the median value of the input samples under the window – that is, the result is the middle value after the input values have been sorted. Ordinarily, an odd number of taps is used. Median filtering often involves a horizontal window with 3 taps; occasionally, 5 or even 7 taps are used. Sometimes spatial median filters are used (for example, 3×3).

Any isolated extreme value, such as a large-valued sample due to impulse noise, will never appear in the output sequence of a median filter: Median filtering can be useful to reduce noise. However, a legitimate extreme value will not be included! I urge you to use great caution in imposing median filtering: If your filter is presented with image data whose statistics are not what you expect, you are very likely to degrade the image instead of improving it.
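A minimal 3-tap horizontal median filter can be sketched with Python's `statistics` module; edge samples are simply passed through unchanged here:

```python
# A 3-tap horizontal median filter: a minimal sketch. Edge samples
# are passed through unchanged.

from statistics import median

def median3(samples):
    out = list(samples)
    for i in range(1, len(samples) - 1):
        out[i] = median(samples[i - 1:i + 2])
    return out

# The isolated impulse (99) never appears in the output, while the
# legitimate step edge at the end survives intact:
noisy = [10, 10, 99, 10, 10, 80, 80]
print(median3(noisy))   # [10, 10, 10, 10, 10, 80, 80]
```

A legitimate isolated extreme value – a single-pixel specular highlight, say – would be removed just as readily as the impulse, which is exactly the hazard described above.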

Coring

Coring is a technique widely presumed to reduce noise. The assumption (often incorrect) is made that any high-frequency signal components having low magnitude are noise. The input signal is separated into low- and high-frequency components using complementary filters. The low-frequency component is passed to the output. The magnitude of the high-frequency component is estimated and subjected to a thresholding operation: If the magnitude is below threshold, the high-frequency component is discarded; otherwise, it is passed to the output through summation with the low-frequency component. Coring can be implemented by the block diagram shown in Figure 31.4.

CHAPTER 31

VIDEO SIGNAL PROCESSING

385

Figure 31.4 A coring circuit includes complementary filters that separate low- and high-frequency components. The high-frequency components are processed by the nonlinear transfer function in the sketch. [Block diagram: the luma or R’G’B’ input feeds a lowpass filter and a highpass filter; the highpass path passes through the nonlinear transfer function, and the two paths are summed to form the luma or R’G’B’ output.]
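The coring operation can be sketched with a 3-tap moving-average lowpass and its exact complementary highpass; the threshold value below is an illustrative assumption:

```python
# Coring sketched with a 3-tap moving-average lowpass filter and its
# complementary highpass. The threshold is an illustrative assumption.

def core(samples, threshold=0.05):
    out = []
    for i in range(len(samples)):
        window = samples[max(0, i - 1):i + 2]
        lowpass = sum(window) / len(window)   # lowpass component
        highpass = samples[i] - lowpass       # complementary highpass
        if abs(highpass) < threshold:
            highpass = 0.0                    # presumed noise: discard
        out.append(lowpass + highpass)        # recombine at the output
    return out

# Low-magnitude texture (the +/-0.01 ripple) is flattened; the large
# transition to 1.0 passes through:
print(core([0.5, 0.52, 0.5, 0.52, 0.5, 1.0, 1.0]))
```

Note that the flattened ripple in this example might equally well have been legitimate texture – skin, fabric, a distant carpet pattern – which is the failure mode discussed next.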

Like median filtering, coring depends upon the statistical properties of the image data. If the image is a flat-shaded cartoon having large areas of uniform colour with rapid transitions between them, then coring will eliminate noise below a certain magnitude. However, if the input is not a cartoon, you run the risk that coring will cause it to look like one! In a close-up of a face, skin texture produces a low-magnitude, high-frequency component that is not noise. If coring eliminates this component, the face will take on the texture of plastic.

Coring is liable to introduce spatial artifacts into an image. Consider an image containing a Persian carpet that recedes into the distance. The carpet’s pattern will produce a fairly low spatial frequency in the foreground (at the bottom of the image); as the pattern recedes into the background, the spatial frequency of the pattern becomes higher and its magnitude becomes lower. If this image is subject to coring, beyond a certain distance, coring will cause the pattern to vanish. The viewer will perceive a sudden transition from the pattern of the carpet to no pattern at all. The viewer may conclude that beyond a certain distance there is a different carpet, or no carpet at all.

386

DIGITAL VIDEO AND HD ALGORITHMS AND INTERFACES

Porter, Thomas, and Tom Duff (1984), “Compositing digital images,” in Computer Graphics, 18 (3): 253–259 (July, Proc. siggraph). The terms composite and compositing are overused in video!

SMPTE RP 157, Key Signals.

Eq 31.1

R = α FG + (1 − α) BG

Chroma transition improvement (CTI)

Colour-under VCRs exhibit very poor colour difference bandwidth (evidenced as poor chroma resolution in the horizontal direction). A localized change in luma may be faithfully reproduced, but the accompanying change in colour difference components will be spread horizontally. If you assume that coloured areas tend to be uniformly coloured, one way of improving image quality is to detect localized changes in luma, and use that information to effect repositioning of colour difference information. Techniques to accomplish this are collectively known as chroma transition improvement (CTI).

If you use CTI, you run the risk of introducing excessive emphasis on edges. Also, CTI operates only on the horizontal dimension: Excessive CTI is liable to become visible owing to perceptible (or even objectionable) differences between the horizontal and vertical characteristics of the image. CTI works well on cartoons, and on certain other types of images. However, it should be used cautiously.
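CTI can be sketched as edge steepening: where luma shows a sharp transition, each smeared chroma sample is snapped to whichever neighbouring chroma value it is closer to. The edge-detection threshold below is an illustrative assumption, and real CTI circuits locate the luma transition more carefully than this:

```python
# Chroma transition improvement sketched as edge steepening guided by
# luma. The edge-detection threshold is an illustrative assumption.

def cti(luma, chroma, edge_threshold=0.3):
    out = list(chroma)
    for i in range(1, len(chroma) - 1):
        # A large luma difference marks a genuine transition here.
        if abs(luma[i + 1] - luma[i - 1]) > edge_threshold:
            left, right = chroma[i - 1], chroma[i + 1]
            # Snap the smeared chroma sample to the nearer side.
            if abs(chroma[i] - left) <= abs(chroma[i] - right):
                out[i] = left
            else:
                out[i] = right
    return out

luma   = [0.2, 0.2, 0.2, 0.9, 0.9, 0.9]   # crisp luma edge
chroma = [0.1, 0.1, 0.2, 0.6, 0.7, 0.7]   # chroma smeared across it
print(cti(luma, chroma))                   # edge made crisp again
```

Where no luma edge accompanies a chroma transition – a boundary between two equally bright colours, for instance – this scheme does nothing, which illustrates why the assumption underlying CTI matters.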

Mixing and keying

Mixing video signals together to create a transition, or a layered effect – for example, to mix or wipe – is called compositing. In America, a piece of equipment (with a control surface) that performs such effects is a production switcher. In Europe, the equipment – or the person that operates it! – is called a vision mixer.

Accomplishing mix, wipe, or key effects in hardware requires synchronous video signals – that is, signals whose timing matches perfectly in the vertical and horizontal domains.

Keying (or compositing, or blending) refers to superimposing a foreground (FG, or fill video) image over a background (BG) image. Keying is normally controlled by a key (or matte) signal, represented like luma, that indicates the opacity of the accompanying foreground image data, coded between black (0, fully transparent) and white (1, fully opaque). In computer graphics, the key signal (data) is called alpha (α), and the operation is called compositing.

The keying (or compositing) operation is performed as in Equation 31.1. Foreground image data that has been premultiplied by the key is called shaped in video, or associated, integral, or premultiplied in computer graphics. Foreground image data that has not been premultiplied by the key is called unshaped in video, or unassociated or nonpremultiplied in computer graphics.

The most difficult part of keying is extracting (“pulling”) the matte. For a review from a computer graphics perspective, see Smith, Alvy Ray, and James F. Blinn (1996), “Blue screen matting,” in Computer Graphics (Proc. siggraph), 259–268.

Figure 31.5 This matte image example shows the typical matte polarity. See Chuang, Yung-Yu et al. (2002), “Video matting of complex scenes,” in ACM Transactions on Graphics 21 (3) (Proc. siggraph), 243–248 (July).

The key signal is sometimes called a linear key: The modifier linear refers not to linear light, but to a key signal that represents opacity with more than just the two levels fully transparent and fully opaque. In linear keying, the compositing operation of Equation 31.1 is applied directly, without any transfer function imposed on the key signal.
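Equation 31.1 applied per sample is a one-liner; the sketch below also shows the shaped (premultiplied) variant, where the α·FG multiply has already been folded into the foreground data:

```python
# Equation 31.1 per sample: R = alpha*FG + (1 - alpha)*BG, with a
# linear key alpha between 0 (fully transparent) and 1 (fully opaque).

def key(fg, bg, alpha):
    """Composite unshaped (nonpremultiplied) foreground over background."""
    return [a * f + (1.0 - a) * b for f, b, a in zip(fg, bg, alpha)]

def key_shaped(fg_shaped, bg, alpha):
    """For shaped (premultiplied) foreground, the multiply is already done."""
    return [f + (1.0 - a) * b for f, b, a in zip(fg_shaped, bg, alpha)]

fg    = [0.9, 0.9, 0.9]
bg    = [0.1, 0.1, 0.1]
alpha = [0.0, 0.5, 1.0]          # transparent, half-mix, opaque
print(key(fg, bg, alpha))        # approximately [0.1, 0.5, 0.9]
```

The two forms produce the same result; shaped foregrounds are preferred in layered pipelines because repeated compositing then needs only one multiply per layer.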

Historically, keying was accomplished in the gamma domain. However, proper simulation of the physics of blending requires keying in the linear-light domain; such an approach is now widely practiced in software systems (sometimes by setting an option denoted something like “blend in gamma=1.0 space”); an increasing number of hardware-based video switchers are now capable of linear-light blending.
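The difference between the two domains can be shown with a 50/50 mix of gamma-coded black and white. The pure power function with exponent 2.2 below is a simplifying assumption; real transfer functions (such as BT.1886) differ in detail:

```python
# Gamma-domain versus linear-light blending of a 50/50 mix. The pure
# 2.2 power function is a simplifying assumption.

GAMMA = 2.2

def blend_gamma_domain(a, b, alpha=0.5):
    # Historical practice: Eq 31.1 applied to gamma-coded values.
    return alpha * a + (1 - alpha) * b

def blend_linear_light(a, b, alpha=0.5):
    # Decode to linear light, blend, then re-encode.
    lin = alpha * a ** GAMMA + (1 - alpha) * b ** GAMMA
    return lin ** (1 / GAMMA)

print(blend_gamma_domain(1.0, 0.0))   # 0.5
print(blend_linear_light(1.0, 0.0))   # about 0.73: lighter, matching physics
```

The gamma-domain mix comes out too dark; on a 50%-coverage edge between white and black this darkening is visible as a dark fringe.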

The multiplication of foreground and background data in keying is equivalent to modulation: This can produce signal components above half the sampling rate, thereby producing alias components. Aliasing can be avoided by upsampling the foreground, background, and key signals; performing the keying operation at twice the video sampling rate; then suitably filtering and downsampling the result. Most keyers operate directly at the video sampling rate without upsampling or downsampling, and consequently exhibit some aliasing.
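The upsample-key-downsample structure can be sketched as follows. Linear-interpolation upsampling and a 2-tap averaging filter before decimation are crude illustrative choices; a real keyer would use better interpolation and filtering, so this shows only the structure:

```python
# Keying at twice the sampling rate to suppress modulation aliasing.
# Linear-interpolation upsampling and a 2-tap average are crude
# illustrative choices; only the structure is the point.

def upsample2(x):
    """Insert linearly interpolated samples between existing ones."""
    out = []
    for i in range(len(x) - 1):
        out += [x[i], (x[i] + x[i + 1]) / 2]
    out.append(x[-1])
    return out

def key_2x(fg, bg, alpha):
    fg2, bg2, a2 = upsample2(fg), upsample2(bg), upsample2(alpha)
    # The keying multiply is performed at twice the video rate.
    mixed = [a * f + (1 - a) * b for f, b, a in zip(fg2, bg2, a2)]
    # Lowpass (2-tap average), then decimate back to the video rate.
    last = len(mixed) - 1
    return [(mixed[2 * i] + mixed[min(2 * i + 1, last)]) / 2
            for i in range(len(fg))]
```

On constant signals the round trip is transparent; the benefit appears only near transitions, where the multiply of foreground by key generates the out-of-band components that the lowpass stage then suppresses.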

Figure 31.5 is a matte image representative of work by Yung-Yu Chuang and his colleagues at the University of Washington.

In order for a compositing operation to mimic the mixing of light in an actual scene, keying should be performed on foreground and background in the linear-light domain. However, keying in video has historically been performed in the gamma-corrected domain.

