
Although MPEG-2 itself permits 23.976 Hz or 24 Hz native frame rate, DVD doesn't. Film material on 480i DVDs must be coded as 59.94 Hz interlaced. Flags identify film frames to the decoder, which then either inserts 2-3 pulldown for 480i29.97 output or – if configured to do so – outputs 480p23.976. See Frame rate and 2-3 pulldown in MPEG, on page 518.

The only way to detect 2-3 pulldown is to compare data in successive fields: If two successive first fields contain luma and chroma that are identical, within a certain noise tolerance, and the repeat pattern follows the characteristic 2-3 sequence, then the material can be assumed to have originated from film. Once the original film frames have been identified, conversion to the ultimate display rate can be accomplished with minimal introduction of motion artifacts. Identifying film frames using this method is feasible for dedicated hardware, but it is difficult for today's desktop computers.
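The field-comparison detection described above can be sketched roughly as follows. This is my own illustration, not code from any product: the field representation (flat lists of samples), the noise tolerance, and the function names are all assumptions.

```python
# Sketch of 2-3 pulldown detection by field comparison (illustrative;
# representation and threshold are hypothetical, not from the book).

def fields_match(a, b, tol=2.0):
    """True if two fields agree within a mean-absolute-difference tolerance."""
    diff = sum(abs(x - y) for x, y in zip(a, b)) / len(a)
    return diff <= tol

def repeat_flags(fields, tol=2.0):
    """Flag each field that repeats the previous field of the same parity
    (two positions earlier in the interleaved field sequence)."""
    return [i >= 2 and fields_match(fields[i], fields[i - 2], tol)
            for i in range(len(fields))]

def has_23_cadence(flags):
    """In a 2-3 sequence, exactly one field in every five is a repeat,
    at a fixed phase; test each of the five possible phases."""
    n = len(flags)
    for phase in range(5):
        if all(flags[i] == (i % 5 == phase) for i in range(2, n)):
            return True
    return False
```

Real detectors must also handle cadence breaks at edit points, where the 2-3 phase jumps; this sketch assumes an unbroken cadence.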

Native 24 Hz coding

Traditionally, 2-3 pulldown is imposed in the studio, at the point of transfer from film to video. The repeated fields are redundant, and consume media capacity to no good effect except compatibility with native 60-field-per-second equipment: When movies were recorded on media such as VHS tape, fully 20% of the media capacity was wasted. In the studio, it is difficult to recover original film frames from a 2-3 sequence. Information about the repeated fields is not directly available; decoding equipment typically attempts to reconstruct the sequence by comparing pixel values in successive fields.
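The 20% figure follows from simple arithmetic, which can be checked in a few lines:

```python
# 2-3 pulldown turns 4 film frames (8 unique fields) into 10 video
# fields per cadence period, so 2 of every 10 recorded fields are
# redundant repeats.
film_fields_per_cycle = 4 * 2        # 4 film frames, 2 fields each
video_fields_per_cycle = 2 + 3 + 2 + 3
wasted = (video_fields_per_cycle - film_fields_per_cycle) / video_fields_per_cycle
```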

MPEG-2 can directly code progressive material at 24 frames per second; it is inefficient to encode a sequence with 2-3 pulldown imposed. Some MPEG-2 encoders are equipped to detect, and remove, 2-3 pulldown, so that 24 progressive frames per second are encoded. However, this process – called inverse telecine – is complex and trouble-prone.

Ordinarily, repeated fields in 2-3 pulldown are omitted from an MPEG-2 bitstream. However, the bitstream can include flags that indicate to the decoder that the omitted fields should be repeated at the display; with these flags, a decoder can reconstruct the 2-3 sequence for display at 59.94 Hz. Alternatively, when configured for progressive output, the decoder can reconstruct 23.976 Hz frames and present them directly to suitable display equipment, or frame-triple to 72 Hz (actually, 72/1.001 or about 71.928 Hz) at the output. All this complexity is avoided in Blu-ray, which can directly code 23.976 Hz progressive frames.
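MPEG-2 signals the omitted fields with its `top_field_first` and `repeat_first_field` picture flags. The following sketch shows how a decoder could expand four coded film frames into the ten fields of a 2-3 sequence; the data representation is my own illustration, not decoder code.

```python
# Illustrative expansion of MPEG-2's top_field_first (tff) and
# repeat_first_field (rff) flags into a displayed field sequence.

def expand_fields(frames):
    """frames: list of (name, tff, rff) tuples; returns the list of
    (frame_name, field_parity) pairs presented at the display."""
    out = []
    for name, tff, rff in frames:
        first, second = ('t', 'b') if tff else ('b', 't')
        out.append((name, first))
        out.append((name, second))
        if rff:                      # repeat the first field: 3 fields total
            out.append((name, first))
    return out

# Four film frames flagged to produce a 2-3 cadence:
coded = [('A', 1, 0), ('B', 1, 1), ('C', 0, 0), ('D', 0, 1)]
fields = expand_fields(coded)
# 2 + 3 + 2 + 3 = 10 fields per 4 frames: 23.976 frame/s becomes 59.94 field/s
```

Note how the flag pattern keeps field parity alternating correctly: the repeated third field of frame B is a top field, so frame C must begin with a bottom field.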

CHAPTER 34   2-3 PULLDOWN   411

Conversion to other rates

In 2-3 pulldown from 24 Hz to 60 Hz, information from successive film frames is replicated in the fixed sequence {2, 3, 2, 3, 2, 3, …}. The frequency of the repeat pattern – the beat frequency – is fairly high: The {2, 3} pattern repeats 12 times per second, so the beat frequency is 12 Hz. The ratio of frame rates in this case is 5:2 – the small integers in this fraction dictate the high beat frequency.
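The beat frequency follows directly from the reduced ratio of the two rates: the cadence spans as many film frames as the denominator of that ratio. A small sketch of the arithmetic (my illustration):

```python
from fractions import Fraction

def beat_frequency(source_hz, display_hz):
    """Cadence repeat rate: the pattern spans `denominator` source
    frames of the reduced display:source ratio, so the beat frequency
    is the source rate divided by that denominator."""
    ratio = Fraction(display_hz, source_hz)
    return source_hz / ratio.denominator

# 24 -> 60 Hz: ratio 5:2, the {2, 3} pattern repeats 12 times per second
```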

When PCs are to display 29.97 Hz video that originated on film – from sources such as DVD or digital satellite – the situation is more complicated, and motion impairments are more likely to be introduced.

In conversion to a rate that is related to 24 Hz by a ratio of larger integers, the frequency of the repeat pattern falls. For example, converting to 75 Hz involves the fraction 25:8; this creates the sequence {3, 3, 3, 3, 3, 3, 3, 4}, which repeats three times per second. Converting to 76 Hz involves the fraction 19:6; this creates the sequence {3, 3, 3, 3, 3, 4}, which repeats four times per second. The susceptibility of the human visual system to motion artifacts peaks between 4 and 6 beats per second; the temporal artifacts introduced upon conversion to 75 Hz or 76 Hz are likely to be quite visible. Motion estimation and motion-compensated interpolation could potentially be used to reduce the severity of these conversion artifacts, but these techniques are highly complex, and will remain out of reach for desktop computing for several years.
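A repeat cadence for an arbitrary display rate can be derived by distributing the display frames as evenly as possible over one cadence period of film frames. This construction is my own sketch, not from the book:

```python
from fractions import Fraction

def cadence(source_hz, display_hz):
    """Repeat counts per source frame over one cadence period, spread
    as evenly as possible (Bresenham-style integer distribution)."""
    r = Fraction(display_hz, source_hz)
    num, den = r.numerator, r.denominator
    return [(k + 1) * num // den - k * num // den for k in range(den)]
```

For 24 to 60 Hz this yields {2, 3}; for 24 to 75 Hz, seven 3s and a 4; for 24 to 76 Hz, five 3s and a 4 – matching the sequences discussed above.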

When 24 Hz material is to be displayed in a computing environment with wide viewing angle and good ambient conditions, the best approach to minimize motion artifacts is to choose a display rate that is an integer multiple of 24 Hz. Failing that, displaying at 60 Hz reproduces the situation of conventional video display, and we know well that the motion portrayal there is quite acceptable.

412   DIGITAL VIDEO AND HD ALGORITHMS AND INTERFACES

CHAPTER 35   DEINTERLACING

Figure 35.1 Test scene

Figure 35.2 Interlaced capture (first field, second field) samples the position of a football about 60 times per second; frames are produced at half that rate. (A soccer ball takes 50 positions per second.)

In Interlaced format, on page 88, I explained that when a scene element in motion relative to the camera is captured by an interlaced camera, the scene element appears at different positions in the two fields. Reconstruction of progressive frames is necessary for certain image-processing operations, such as upconversion, downconversion, or standards conversion. Also, computer imagery is typically represented in progressive format: Integrating video imagery into computer displays also requires deinterlacing. This chapter outlines deinterlacing techniques.

I will introduce deinterlacing in the spatial domain. Then I will describe the vertical-temporal domain, and outline practical deinterlacing algorithms.

I will discuss the problem of deinterlacing, and the algorithms, in reference to the test scene sketched in Figure 35.1. The test scene comprises a black background, partially occluded by a white disk that is in motion with respect to the camera.

Spatial domain

Video captures 50 or 60 unique fields per second. If a scene contains an object in motion with respect to the camera, each field will carry half the spatial information of the object, but the information in the second field will be displaced according to the object's motion. The situation is illustrated in Figure 35.2, which shows the first and second fields, respectively. The example is typical of capture by a CCD camera set for a short exposure time; the example neglects capture blur due to nonzero exposure time at the camera. (For details of temporal characteristics of image acquisition and display, see my SMPTE paper: Poynton, Charles (1996), "Motion portrayal, eye tracking, and emerging display technology," in Proc. 30th SMPTE Advanced Motion Imaging Conference, 192–202.)

Figure 35.3 The weave technique stitches two fields into a frame. When applied to moving objects, it produces the "field tearing," "mouse's teeth," or "zipper" artifact.

Figure 35.4 Line replication

Figure 35.5 Interfield averaging

You can think of an interlaced video signal as having its lines in permuted order, compared to a progressive signal. An obvious way to accomplish deinterlacing is to write into two fields of video storage – the first field, then the second – in video order, then read out the assembled frame progressively (in spatial order). This method is sometimes given the sophisticated name field replication, or weave. This method is quite suitable for a stationary scene, or a scene containing only slow-moving elements. However, image data of the second field is delayed by half the frame time (typically 1/60 s or 1/50 s) with respect to image data of the first field. If the scene contains an element in fairly rapid motion (such as the disk in our test scene), and image data is interpreted as belonging to the same time instant, then field tearing will be introduced: The scene element will be reproduced with jagged edges, either when viewed as a still frame or when viewed in motion. The effect is sketched in Figure 35.3.
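The weave operation is simply an interleave of the two fields' lines. A minimal sketch, representing each field as a list of rows with the first field holding the even rows (an assumption of this illustration):

```python
# Weave deinterlacing: interleave the lines of two fields into a frame.
# Fields and frames are lists of rows; the first field holds even rows.

def weave(first_field, second_field):
    frame = [None] * (len(first_field) + len(second_field))
    frame[0::2] = first_field    # rows 0, 2, 4, ...
    frame[1::2] = second_field   # rows 1, 3, 5, ...
    return frame
```

For a moving object, rows from the two fields belong to instants half a frame time apart, which is exactly what produces the field-tearing artifact of Figure 35.3.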

Field tearing can be avoided by intrafield processing, using only information from a single field of video. The simplest intrafield technique is to replicate each line upon progressive readout. A disadvantage is that this method will reproduce a stationary element with at most half of its potential vertical resolution. Also, line replication introduces a blockiness into the picture, and an apparent downward shift of one image row. The effect is sketched in Figure 35.4.
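Line replication can be sketched in a few lines; as above, a field is a list of rows:

```python
# Intrafield deinterlacing by line replication: each field line is
# simply doubled. Blocky, and at most half the vertical resolution.

def line_replicate(field):
    frame = []
    for row in field:
        frame.append(row)
        frame.append(row)    # duplicate the line
    return frame
```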

The blockiness of the line replication approach can be avoided by synthesizing lines at the spatial positions of the opposite field, but temporally coincident with the current field. This can be accomplished by averaging vertically adjacent samples in one field, to create a synthetic intermediate line, as depicted in Figure 35.5. (In the computer industry, this is called "bob.") The averaging can be done prior to writing into the video memory, or upon reading, depending on which is more efficient for the memory system. Averaging alleviates the disadvantage of blockiness, but does not compensate the loss of vertical resolution. Nonetheless, the method performs well for VHS-grade images, which lack resolution in any case. Rather
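The line-averaging ("bob") technique can be sketched as follows; the handling of the last line, where no line below exists, is my own choice for this illustration:

```python
# Intrafield deinterlacing by line averaging ("bob"): each missing line
# is synthesized as the mean of the field lines above and below it.

def bob(field):
    frame = []
    for i, row in enumerate(field):
        frame.append(row)
        if i + 1 < len(field):
            below = field[i + 1]
            frame.append([(a + b) / 2 for a, b in zip(row, below)])
        else:
            frame.append(row)    # bottom edge: nothing below, so repeat
    return frame
```

Like line replication, this uses only one field, so it avoids field tearing at the cost of vertical resolution; unlike replication, the synthesized lines vary smoothly.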

