- •Contents
- •Figures
- •Tables
- •Preface
- •Acknowledgments
- •1. Raster images
- •Aspect ratio
- •Geometry
- •Image capture
- •Digitization
- •Perceptual uniformity
- •Colour
- •Luma and colour difference components
- •Digital image representation
- •Square sampling
- •Comparison of aspect ratios
- •Aspect ratio
- •Frame rates
- •Image state
- •EOCF standards
- •Entertainment programming
- •Acquisition
- •Consumer origination
- •Consumer electronics (CE) display
- •Contrast
- •Contrast ratio
- •Perceptual uniformity
- •The “code 100” problem and nonlinear image coding
- •Linear and nonlinear
- •4. Quantization
- •Linearity
- •Decibels
- •Noise, signal, sensitivity
- •Quantization error
- •Full-swing
- •Studio-swing (footroom and headroom)
- •Interface offset
- •Processing coding
- •Two’s complement wrap-around
- •Perceptual attributes
- •History of display signal processing
- •Digital driving levels
- •Relationship between signal and lightness
- •Algorithm
- •Black level setting
- •Effect of contrast and brightness on contrast and brightness
- •An alternate interpretation
- •Brightness and contrast controls in LCDs
- •Brightness and contrast controls in PDPs
- •Brightness and contrast controls in desktop graphics
- •Symbolic image description
- •Raster images
- •Conversion among types
- •Image files
- •“Resolution” in computer graphics
- •7. Image structure
- •Image reconstruction
- •Sampling aperture
- •Spot profile
- •Box distribution
- •Gaussian distribution
- •8. Raster scanning
- •Flicker, refresh rate, and frame rate
- •Introduction to scanning
- •Scanning parameters
- •Interlaced format
- •Interlace and progressive
- •Scanning notation
- •Motion portrayal
- •Segmented-frame (24PsF)
- •Video system taxonomy
- •Conversion among systems
- •9. Resolution
- •Magnitude frequency response and bandwidth
- •Visual acuity
- •Viewing distance and angle
- •Kell effect
- •Resolution
- •Resolution in video
- •Viewing distance
- •Interlace revisited
- •10. Constant luminance
- •The principle of constant luminance
- •Compensating for the CRT
- •Departure from constant luminance
- •Luma
- •“Leakage” of luminance into chroma
- •11. Picture rendering
- •Surround effect
- •Tone scale alteration
- •Incorporation of rendering
- •Rendering in desktop computing
- •Luma
- •Sloppy use of the term luminance
- •Colour difference coding (chroma)
- •Chroma subsampling
- •Chroma subsampling notation
- •Chroma subsampling filters
- •Chroma in composite NTSC and PAL
- •Scanning standards
- •Widescreen (16:9) SD
- •Square and nonsquare sampling
- •Resampling
- •NTSC and PAL encoding
- •NTSC and PAL decoding
- •S-video interface
- •Frequency interleaving
- •Composite analog SD
- •15. Introduction to HD
- •HD scanning
- •Colour coding for BT.709 HD
- •Data compression
- •Image compression
- •Lossy compression
- •JPEG
- •Motion-JPEG
- •JPEG 2000
- •Mezzanine compression
- •MPEG
- •Picture coding types (I, P, B)
- •Reordering
- •MPEG-1
- •MPEG-2
- •Other MPEGs
- •MPEG IMX
- •MPEG-4
- •AVC-Intra
- •WM9, WM10, VC-1 codecs
- •Compression for CE acquisition
- •AVCHD
- •Compression for IP transport to consumers
- •VP8 (“WebM”) codec
- •Dirac (basic)
- •17. Streams and files
- •Historical overview
- •Physical layer
- •Stream interfaces
- •IEEE 1394 (FireWire, i.LINK)
- •HTTP live streaming (HLS)
- •18. Metadata
- •Metadata Example 1: CD-DA
- •Metadata Example 2: .yuv files
- •Metadata Example 3: RFF
- •Metadata Example 4: JPEG/JFIF
- •Metadata Example 5: Sequence display extension
- •Conclusions
- •19. Stereoscopic (“3-D”) video
- •Acquisition
- •S3D display
- •Anaglyph
- •Temporal multiplexing
- •Polarization
- •Wavelength multiplexing (Infitec/Dolby)
- •Autostereoscopic displays
- •Parallax barrier display
- •Lenticular display
- •Recording and compression
- •Consumer interface and display
- •Ghosting
- •Vergence and accommodation
- •20. Filtering and sampling
- •Sampling theorem
- •Sampling at exactly 0.5fS
- •Magnitude frequency response
- •Magnitude frequency response of a boxcar
- •The sinc weighting function
- •Frequency response of point sampling
- •Fourier transform pairs
- •Analog filters
- •Digital filters
- •Impulse response
- •Finite impulse response (FIR) filters
- •Physical realizability of a filter
- •Phase response (group delay)
- •Infinite impulse response (IIR) filters
- •Lowpass filter
- •Digital filter design
- •Reconstruction
- •Reconstruction close to 0.5fS
- •“(sin x)/x” correction
- •Further reading
- •2:1 downsampling
- •Oversampling
- •Interpolation
- •Lagrange interpolation
- •Lagrange interpolation as filtering
- •Polyphase interpolators
- •Polyphase taps and phases
- •Implementing polyphase interpolators
- •Decimation
- •Lowpass filtering in decimation
- •Spatial frequency domain
- •Comb filtering
- •Spatial filtering
- •Image presampling filters
- •Image reconstruction filters
- •Spatial (2-D) oversampling
- •Retina
- •Adaptation
- •Contrast sensitivity
- •Contrast sensitivity function (CSF)
- •24. Luminance and lightness
- •Radiance, intensity
- •Luminance
- •Relative luminance
- •Luminance from red, green, and blue
- •Lightness (CIE L*)
- •Fundamentals of vision
- •Definitions
- •Spectral power distribution (SPD) and tristimulus
- •Spectral constraints
- •CIE XYZ tristimulus
- •CIE [x, y] chromaticity
- •Blackbody radiation
- •Colour temperature
- •White
- •Chromatic adaptation
- •Perceptually uniform colour spaces
- •CIE L*a*b* (CIELAB)
- •CIE L*u*v* and CIE L*a*b* summary
- •Colour specification and colour image coding
- •Further reading
- •Additive reproduction (RGB)
- •Characterization of RGB primaries
- •BT.709 primaries
- •Legacy SD primaries
- •sRGB system
- •SMPTE Free Scale (FS) primaries
- •AMPAS ACES primaries
- •SMPTE/DCI P3 primaries
- •CMFs and SPDs
- •Normalization and scaling
- •Luminance coefficients
- •Transformations between RGB and CIE XYZ
- •Noise due to matrixing
- •Transforms among RGB systems
- •Camera white reference
- •Display white reference
- •Gamut
- •Wide-gamut reproduction
- •Free Scale Gamut, Free Scale Log (FS-Gamut, FS-Log)
- •Further reading
- •27. Gamma
- •Gamma in CRT physics
- •The amazing coincidence!
- •Gamma in video
- •Opto-electronic conversion functions (OECFs)
- •BT.709 OECF
- •SMPTE 240M OECF
- •sRGB transfer function
- •Transfer functions in SD
- •Bit depth requirements
- •Gamma in modern display devices
- •Estimating gamma
- •Gamma in video, CGI, and Macintosh
- •Gamma in computer graphics
- •Gamma in pseudocolour
- •Limitations of 8-bit linear coding
- •Linear and nonlinear coding in CGI
- •Colour acuity
- •RGB and R’G’B’ colour cubes
- •Conventional luma/colour difference coding
- •Luminance and luma notation
- •Nonlinear red, green, blue (R’G’B’)
- •BT.601 luma
- •BT.709 luma
- •Chroma subsampling, revisited
- •Luma/colour difference summary
- •SD and HD luma chaos
- •Luma/colour difference component sets
- •B’-Y’, R’-Y’ components for SD
- •PBPR components for SD
- •CBCR components for SD
- •Y’CBCR from studio RGB
- •Y’CBCR from computer RGB
- •“Full-swing” Y’CBCR
- •Y’UV, Y’IQ confusion
- •B’-Y’, R’-Y’ components for BT.709 HD
- •PBPR components for BT.709 HD
- •CBCR components for BT.709 HD
- •CBCR components for xvYCC
- •Y’CBCR from studio RGB
- •Y’CBCR from computer RGB
- •Conversions between HD and SD
- •Colour coding standards
- •31. Video signal processing
- •Edge treatment
- •Transition samples
- •Picture lines
- •Choice of SAL and SPW parameters
- •Video levels
- •Setup (pedestal)
- •BT.601 to computing
- •Enhancement
- •Median filtering
- •Coring
- •Chroma transition improvement (CTI)
- •Mixing and keying
- •Field rate
- •Line rate
- •Sound subcarrier
- •Addition of composite colour
- •NTSC colour subcarrier
- •576i PAL colour subcarrier
- •4fSC sampling
- •Common sampling rate
- •Numerology of HD scanning
- •Audio rates
- •33. Timecode
- •Introduction
- •Dropframe timecode
- •Editing
- •Linear timecode (LTC)
- •Vertical interval timecode (VITC)
- •Timecode structure
- •Further reading
- •34. 2-3 pulldown
- •2-3-3-2 pulldown
- •Conversion of film to different frame rates
- •Native 24 Hz coding
- •Conversion to other rates
- •Spatial domain
- •Vertical-temporal domain
- •Motion adaptivity
- •Further reading
- •36. Colourbars
- •SD colourbars
- •SD colourbar notation
- •Pluge element
- •Composite decoder adjustment using colourbars
- •-I, +Q, and Pluge elements in SD colourbars
- •HD colourbars
- •References
- •38. SDI and HD-SDI interfaces
- •Component digital SD interface (BT.601)
- •Serial digital interface (SDI)
- •Component digital HD-SDI
- •SDI and HD-SDI sync, TRS, and ancillary data
- •Analog sync and digital/analog timing relationships
- •Ancillary data
- •SDI coding
- •HD-SDI coding
- •Interfaces for compressed video
- •SDTI
- •Switching and mixing
- •Timing in digital facilities
- •Summary of digital interfaces
- •39. 480i component video
- •Frame rate
- •Interlace
- •Line sync
- •Field/frame sync
- •R’G’B’ EOCF and primaries
- •Luma (Y’)
- •Picture center, aspect ratio, and blanking
- •Halfline blanking
- •Component digital 4:2:2 interface
- •Component analog R’G’B’ interface
- •Component analog Y’PBPR interface, EBU N10
- •Component analog Y’PBPR interface, industry standard
- •40. 576i component video
- •Frame rate
- •Interlace
- •Line sync
- •Analog field/frame sync
- •R’G’B’ EOCF and primaries
- •Luma (Y’)
- •Picture center, aspect ratio, and blanking
- •Component digital 4:2:2 interface
- •Component analog 576i interface
- •Scanning
- •Analog sync
- •Picture center, aspect ratio, and blanking
- •R’G’B’ EOCF and primaries
- •Luma (Y’)
- •Component digital 4:2:2 interface
- •Scanning
- •Analog sync
- •Picture center, aspect ratio, and blanking
- •R’G’B’ EOCF and primaries
- •Luma (Y’)
- •Component digital 4:2:2 interface
- •43. HD videotape
- •HDCAM (D-11)
- •DVCPRO HD (D-12)
- •HDCAM SR (D-16)
- •JPEG blocks and MCUs
- •JPEG block diagram
- •Level shifting
- •Discrete cosine transform (DCT)
- •JPEG encoding example
- •JPEG decoding
- •Compression ratio control
- •JPEG/JFIF
- •Motion-JPEG (M-JPEG)
- •Further reading
- •46. DV compression
- •DV chroma subsampling
- •DV frame/field modes
- •Picture-in-shuttle in DV
- •DV overflow scheme
- •DV quantization
- •DV digital interface (DIF)
- •Consumer DV recording
- •Professional DV variants
- •47. MPEG-2 video compression
- •MPEG-2 profiles and levels
- •Picture structure
- •Frame rate and 2-3 pulldown in MPEG
- •Luma and chroma sampling structures
- •Macroblocks
- •Picture coding types – I, P, B
- •Prediction
- •Motion vectors (MVs)
- •Coding of a block
- •Frame and field DCT types
- •Zigzag and VLE
- •Refresh
- •Motion estimation
- •Rate control and buffer management
- •Bitstream syntax
- •Transport
- •Further reading
- •48. H.264 video compression
- •Algorithmic features, profiles, and levels
- •Baseline and extended profiles
- •High profiles
- •Hierarchy
- •Multiple reference pictures
- •Slices
- •Spatial intra prediction
- •Flexible motion compensation
- •Quarter-pel motion-compensated interpolation
- •Weighting and offsetting of MC prediction
- •16-bit integer transform
- •Quantizer
- •Variable-length coding
- •Context adaptivity
- •CABAC
- •Deblocking filter
- •Buffer control
- •Scalable video coding (SVC)
- •Multiview video coding (MVC)
- •AVC-Intra
- •Further reading
- •49. VP8 compression
- •Algorithmic features
- •Further reading
- •Elementary stream (ES)
- •Packetized elementary stream (PES)
- •MPEG-2 program stream
- •MPEG-2 transport stream
- •System clock
- •Further reading
- •Japan
- •United States
- •ATSC modulation
- •Europe
- •Further reading
- •Appendices
- •Cement vs. concrete
- •True CIE luminance
- •The misinterpretation of luminance
- •The enshrining of luma
- •Colour difference scale factors
- •Conclusion: A plea
- •Radiometry
- •Photometry
- •Light level examples
- •Image science
- •Units
- •Further reading
- •Glossary
- •Index
- •About the author
Inverse quantization is sometimes denoted IQ, not to be confused with the I and Q colour difference components.
When a decoder reconstructs a P-picture, it is displayed; additionally, the picture is written into a reference frame so as to be available for subsequent predictions.
Each B-picture contains elements that are bipredicted from one or both reference frames. The encoder computes, compresses, and transmits residuals. The decoder reconstructs a B-picture, displays it, then discards it: No B-picture is used for prediction.
Each reference picture is associated with a full frame of storage. When a decoder reconstructs a reference field (an I-field or a P-field), half the lines of the reference framestore are written; the other half retains the contents of the previous reference field. After the first field of a field pair has been reconstructed, it is available as a predictor for the second field. (The first field of the previous reference frame is no longer available.)
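The reference-store behaviour described above can be sketched in a few lines (a hypothetical, simplified model: frame pictures only, at most two stored references; the field-pair details described in the text are omitted):

```python
# Sketch of MPEG-2 decoder reference-picture bookkeeping (illustrative
# simplified model, frame pictures only): I- and P-pictures displace the
# oldest reference; B-pictures are displayed and discarded.

class ReferenceStore:
    """Holds at most two reference frames: past and future."""
    def __init__(self):
        self.refs = []          # oldest first; len() <= 2

    def decoded(self, pic_type, frame):
        if pic_type in ("I", "P"):
            if len(self.refs) == 2:
                self.refs.pop(0)     # oldest reference is no longer needed
            self.refs.append(frame)  # becomes a predictor for later pictures
        # B-pictures are displayed, then discarded: never stored

store = ReferenceStore()
for name in ["I0", "P3", "B1", "B2", "P6", "B4", "B5"]:  # decode order
    store.decoded(name[0], name)

print(store.refs)   # the two most recent reference pictures: ['P3', 'P6']
```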
Prediction
In Figure 16.1, on page 152, I sketched a naïve interpicture coding scheme. For any scene element that moves more than a few pixels from one video frame to the next, the naïve scheme is liable to produce large interpicture difference values. Motion can be coded more effectively by having the encoder form motion-compensated predictions. The encoder also produces motion vectors; these are used to displace a region of a reference picture to improve the prediction of the current picture relative to an undisplaced prediction. The residuals are then compressed using the DCT, quantized, and VLE-encoded.
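MPEG specifies only the bitstream; an encoder is free to choose any motion-estimation strategy. Purely as an illustration, here is a minimal full-search block-matching sketch using the sum of absolute differences (SAD). The block size and search range are kept tiny for clarity; MPEG-2 actually predicts 16×16 luma blocks:

```python
# Minimal full-search motion estimation sketch (not part of any MPEG
# standard; encoders are free to choose any search strategy).
# Finds the displacement of a block in a reference picture that
# minimizes the sum of absolute differences (SAD).

def sad(ref, cur, bx, by, dx, dy, n):
    """SAD between the n-by-n current block at (bx, by) and the
    reference block displaced by (dx, dy)."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + y + dy][bx + x + dx])
    return total

def full_search(ref, cur, bx, by, n, search):
    """Try every integer displacement within +/-search; return the
    best (dx, dy) motion vector and its SAD."""
    h, w = len(ref), len(ref[0])
    best = (0, 0, sad(ref, cur, bx, by, 0, 0, n))
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            if 0 <= by + dy and by + dy + n <= h and 0 <= bx + dx and bx + dx + n <= w:
                s = sad(ref, cur, bx, by, dx, dy, n)
                if s < best[2]:
                    best = (dx, dy, s)
    return best

# A bright 2x2 patch moves right by 2 samples from reference to current.
ref = [[0] * 8 for _ in range(8)]
cur = [[0] * 8 for _ in range(8)]
ref[2][2] = ref[2][3] = ref[3][2] = ref[3][3] = 200
cur[2][4] = cur[2][5] = cur[3][4] = cur[3][5] = 200

dx, dy, s = full_search(ref, cur, bx=4, by=2, n=2, search=2)
print(dx, dy, s)   # motion vector (-2, 0) with zero residual
```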
At a decoder, predictions are formed from the reference picture(s), based upon the transmitted motion vectors and prediction modes. Residuals are recovered from the bitstream by VLE decoding, inverse quantization, and inverse DCT. Finally, the decoded residual is added to the prediction to form the reconstructed picture. If the decoder is reconstructing an I-picture or a P-picture, the reconstructed picture is written to the appropriate portion (or the entirety) of a reference frame.
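The final step of that decoder path, adding the recovered residual to the motion-compensated prediction and clamping to the legal sample range, can be sketched as follows (a simplified, hypothetical helper operating on one small block):

```python
# Sketch of the decoder-side reconstruction step (simplified, one block,
# integer samples; a real decoder does this per macroblock after VLD,
# inverse quantization, and inverse DCT).

def reconstruct(prediction, residual, lo=0, hi=255):
    """Add the decoded residual to the motion-compensated prediction,
    clamping to the legal sample range."""
    return [
        [max(lo, min(hi, p + r)) for p, r in zip(prow, rrow)]
        for prow, rrow in zip(prediction, residual)
    ]

prediction = [[100, 100], [100, 100]]   # from the reference picture(s)
residual   = [[-5, 200], [0, -120]]     # from VLD -> IQ -> IDCT
print(reconstruct(prediction, residual))  # [[95, 255], [100, 0]]
```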
The obvious way for an encoder to form forward interpicture differences is to subtract the current source picture from the reference picture. (The reference picture would have been subject to motion-compensated interpolation, according to the encoder’s motion estimate.) Starting from an intra-coded picture, the decoder would then accumulate interpicture differences. However, MPEG involves lossy compression: both the I-picture starting point of a GoP and each set of decoded interpicture differences are subject to reconstruction errors. With the naïve scheme of computing interpicture differences, reconstruction errors would accumulate at the decoder. To alleviate this potential source of decoder error, the encoder incorporates a decoder. The interpicture difference is formed by subtracting the current source picture from the previous reference picture as a decoder will reconstruct it. Reconstruction errors are thereby brought “inside the loop,” and are prevented from accumulating.

CHAPTER 47 | MPEG-2 VIDEO COMPRESSION | 521

In a closed GoP, no B-picture is permitted to use forward prediction to the I-picture that starts the next GoP. See the caption to Figure 16.5, on page 155.

A prediction region in a reference frame is rarely aligned to a 16-luma-sample macroblock grid; it is not properly referred to as a macroblock. Some authors fail to make the distinction between macroblocks and prediction regions; other authors use the term prediction macroblocks for prediction regions.
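The benefit of incorporating a decoder in the encoder can be demonstrated numerically. The sketch below is illustrative only; its `quantize` is a crude stand-in for DCT-domain quantization. It tracks a single sample across several pictures: because each residual is formed against the reference as reconstructed, the final error stays within one quantizer step instead of accumulating:

```python
# Sketch of "the encoder incorporates a decoder": residuals are formed
# against the *reconstructed* reference (as the decoder will see it),
# so quantization error does not accumulate across pictures.
# Quantization here is a crude stand-in (round to a step of 10).

def quantize(v, step=10):
    return round(v / step) * step    # lossy: the only source of error

source = [100, 103, 98, 107, 102]    # one sample tracked across 5 pictures

# Closed loop: predict from the reconstructed previous picture.
recon = quantize(source[0])          # intra starting point
for s in source[1:]:
    residual = quantize(s - recon)   # residual vs. the decoder's reference
    recon = recon + residual         # exactly what the decoder computes

print(recon, abs(recon - source[-1]))  # error bounded by the quantizer step
```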
The prediction model used by MPEG-2 is blockwise translation of 16×16 blocks of luma samples (along with the associated chroma samples): a macroblock of the current picture is predicted from a like-sized region of a reconstructed reference picture. The choice of 16×16 region size was a compromise between the desire for a large region (to effectively exploit spatial coherence, and to amortize motion vector overhead across a fairly large number of samples) and a small region (to efficiently code small scene elements in motion).
Macroblocks in a P-picture are typically forward-predicted. However, an encoder can decide that a particular macroblock is best intracoded (that is, not predicted at all). Macroblocks in a B-picture are typically predicted as averages of motion-compensated past and future reference pictures – that is, they are ordinarily bidirectionally predicted. However, an encoder can decide that a particular macroblock in a B-picture is best intracoded, or unidirectionally predicted using either forward or backward prediction. Table 47.7 indicates the four macroblock types. The macroblock types allowed in any picture are restricted by the declared picture type, as indicated in Table 47.8.
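MPEG-2 leaves the mode decision to the encoder. A toy decision for a B-picture macroblock might simply compare the residual cost of the four macroblock types; everything here (the SAD metric, the flat intra predictor) is an illustrative assumption, not part of the standard:

```python
# Hypothetical macroblock mode decision for a B-picture (MPEG-2 leaves
# the decision strategy to the encoder): pick whichever of the four
# macroblock types yields the cheapest residual, measured here by SAD
# against a flat intra predictor or the candidate inter predictions.

def sad(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def choose_mode(current, fwd_pred, bwd_pred):
    mean = sum(current) // len(current)
    bi_pred = [(f + b + 1) // 2 for f, b in zip(fwd_pred, bwd_pred)]
    costs = {
        "intra":    sad(current, [mean] * len(current)),
        "forward":  sad(current, fwd_pred),
        "backward": sad(current, bwd_pred),
        "bi":       sad(current, bi_pred),
    }
    return min(costs, key=costs.get)

current  = [10, 20, 30, 40]
fwd_pred = [12, 19, 31, 38]   # good match from the past reference
bwd_pred = [80, 80, 80, 80]   # poor match from the future reference
print(choose_mode(current, fwd_pred, bwd_pred))  # "forward"
```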
Each nonintra macroblock in an interlaced sequence can be predicted either by frame prediction (typically chosen by the encoder when there is little motion between the fields), or by field prediction (typically chosen by the encoder when there is significant interfield motion). This is comparable to field/frame coding in DV, which I described on page 507. Predictors for a field picture must be field predictors. However, predictors for a frame picture may be chosen on a macroblock-by-macroblock basis to be either field predictors or frame predictors. MPEG-2 defines several additional prediction modes, which can be selected on a macroblock-by-macroblock basis. MPEG-2’s prediction modes are summarized in Table 47.9.

522 | DIGITAL VIDEO AND HD ALGORITHMS AND INTERFACES

Table 47.7 MPEG macroblock types:

| Type | Prediction | Typ. quantizer matrix |
|------|------------|-----------------------|
| Intra | None – the macroblock is self-contained | Perceptual |
| Inter (nonintra), forward predictive-coded | Predicts from the past reference picture | Flat |
| Inter (nonintra), backward predictive-coded | Predicts from the future reference picture | Flat |
| Inter (nonintra), bipredictive-coded | Averages predictions from past and future reference pictures | Flat |

Table 47.8 MPEG picture coding types:

| Picture type | Binary code | Reference picture? | Permitted macroblock types |
|--------------|-------------|--------------------|----------------------------|
| I-picture | 001 | Yes | Intra |
| P-picture | 010 | Yes | Intra; forward predictive-coded |
| B-picture | 011 | No | Intra; forward predictive-coded; backward predictive-coded; bipredictive-coded |

Table 47.9 MPEG-2 prediction modes:

| Mode | For | Description | Max. MVs (fwd., back.) |
|------|-----|-------------|------------------------|
| Frame prediction | (P, B)-pictures | Predictions are made for the frame, using data from one or two previously reconstructed frames. | 1, 1 |
| Field prediction | (P, B)-pictures, (P, B)-fields | Predictions are made independently for each field, using data from one or two previously reconstructed fields. | 1, 1 |
| 16×8 motion compensation (16×8 MC) | (P, B)-fields | The upper 16×8 and lower 16×8 regions of the macroblock are predicted separately. (This is completely unrelated to top and bottom fields.) | 2, 2 |
| Dual prime | P-fields with no intervening B-pictures | Two motion vectors are derived from the transmitted vector and a small differential motion vector (DMV, -1, 0, or +1); these are used to form predictions from two reference fields (one top, one bottom), which are averaged to form the predictor. | 1, 1 |
| Dual prime | P-pictures with no intervening B-pictures | As in dual prime for P-fields (above), but repeated for 2 fields; 4 predictions are made and averaged. | 1, 1 |
Motion vectors (MVs)
A motion vector identifies a region of 16×16 luma samples in a reference picture that is to be used for prediction. A motion vector may refer to a prediction region that is quite distant (spatially) from the region being coded – that is, the motion vector range can be quite large. Even in field pictures, motion vectors are specified in units of frame luma samples. A motion vector can specify integer pixel coordinates, in which case forming the 16×16 prediction is accomplished by merely copying samples. However, in MPEG, a motion vector can be specified to half-sample precision: if the fractional bit of a motion vector component is set, then the prediction is formed by averaging sample values at the neighboring integer coordinates – that is, by linear interpolation. Transmitted motion vector values are halved for use with subsampled chroma. All defined profiles require that no motion vector refer to any sample outside the bounds of the reference frame.
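Half-sample prediction can be sketched as follows. This is an illustrative model, not a conforming implementation: vector components are taken in half-sample units, and the fractional bits select bilinear averaging of the neighboring integer-position samples, which agrees with the averaging described above for the positions shown:

```python
# Half-sample motion compensation sketch: when a motion-vector
# component has its fractional bit set, the prediction is formed by
# averaging the neighboring integer-position samples (bilinear).
# Vector components are in half-sample units: x2 = 3 means 1.5 samples.

def predict_sample(ref, x2, y2):
    """Fetch one prediction sample at half-sample position (x2/2, y2/2)."""
    x, fx = x2 >> 1, x2 & 1        # integer position and fractional bit
    y, fy = y2 >> 1, y2 & 1
    a = ref[y][x]
    b = ref[y][x + 1] if fx else a
    c = ref[y + 1][x] if fy else a
    d = ref[y + 1][x + 1] if (fx and fy) else (b if fx else c)
    return (a + b + c + d + 2) // 4   # average with rounding

ref = [[0, 100],
       [50, 150]]
print(predict_sample(ref, 0, 0))  # integer position: 0
print(predict_sample(ref, 1, 0))  # horizontal half-sample: (0+100)/2 = 50
print(predict_sample(ref, 1, 1))  # both halves: average of all four = 75
```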
Each macroblock’s header contains a count of motion vectors. Motion vectors are themselves predicted! An initial MV is established at the start of a slice (see page 534); the motion vector for each successive nonintra macroblock is differentially coded with respect to the previous macroblock in raster-scan order.
Motion vectors are variable-length encoded, so that short vectors – the most likely ones in large areas of translational motion or no motion – are coded compactly. Zero-valued motion vectors are quite likely, so provision is made for compact coding of them.
Intra macroblocks are not predicted, so motion vectors are not necessary for them. However, in certain