File:

Richardson I.E.H.264 and MPEG-4 video compression.2003.pdf


•	DESIGN AND PERFORMANCE
256

parameters for MB1 affects the coding performance of MB2 since, for example, the coding modes of MB2 (e.g. MV, intra prediction mode, etc.) may be differentially encoded from the coding modes of MB1.

Achieving near-optimum rate–distortion performance can be a very complex problem indeed, many times more complex than the video coding process itself. In a practical CODEC, the choice of optimisation strategy depends on the available processing power and acceptable coding latency. So-called ‘two-pass’ encoding is widely used in offline encoding, in which each frame is processed once to generate sequence statistics which then influence the coding strategy in the second coding pass (often together with a rate control algorithm to achieve a target bit rate or file size).

Many alternative rate–distortion optimisation strategies have been proposed (such as those based on Lagrangian optimisation) and a useful review can be found in [6]. Rate–distortion optimisation should not be considered in isolation from computational performance. In fact, video CODEC optimisation is (at least) a three-variable problem since rate, distortion and computational complexity are all inter-related. For example, rate–distortion optimised mode decisions are achieved at the expense of increased complexity, ‘fast’ motion estimation algorithms often achieve low complexity at the expense of motion estimation (and hence coding) performance, and so on. Coding performance and computational performance can be traded against each other. For example, a real-time coding application for a hand-held device may be designed with minimal processing load at the expense of poor rate–distortion performance, whilst an application for offline encoding of broadcast video data may be designed to give good rate–distortion performance, since processing time is not an important issue but encoded quality is critical.

7.5 RATE CONTROL

The MPEG-4 Visual and H.264 standards require each video frame or object to be processed in units of a macroblock. If the control parameters of a video encoder are kept constant (e.g. motion estimation search area, quantisation step size, etc.), then the number of coded bits produced for each macroblock will change depending on the content of the video frame, causing the bit rate of the encoder output (measured in bits per coded frame or bits per second of video) to vary. Typically, an encoder with constant parameters will produce more bits when there is high motion and/or detail in the input sequence and fewer bits when there is low motion and/or detail. Figure 7.35 shows an example of the variation in output bitrate produced by coding the Office sequence (25 frames per second) using an MPEG-4 Simple Profile encoder, with a fixed quantiser step size of 12. The first frame is coded as an I-VOP (and produces a large number of bits because there is no temporal prediction) and successive frames are coded as P-VOPs. The number of bits per coded P-VOP varies between 1300 and 9000 (equivalent to a bitrate of 32–225 kbits per second).
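The conversion from bits per coded frame to bits per second quoted above is simple arithmetic. The following minimal sketch reproduces it; the function name `bits_per_frame_to_kbps` is illustrative and does not come from the text.

```python
def bits_per_frame_to_kbps(bits_per_frame: float, frame_rate: float) -> float:
    """Convert a per-frame bit count to a bitrate in kbits per second."""
    return bits_per_frame * frame_rate / 1000.0


FRAME_RATE = 25  # frames per second, as stated for the Office sequence

# Smallest and largest P-VOP sizes quoted in the text (bits per frame).
low_kbps = bits_per_frame_to_kbps(1300, FRAME_RATE)
high_kbps = bits_per_frame_to_kbps(9000, FRAME_RATE)
print(low_kbps, high_kbps)
```

The two results correspond to the lower and upper ends of the bitrate range given in the paragraph (the lower figure is rounded in the text).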

This variation in bitrate can be a problem for many practical delivery and storage mechanisms. For example, a constant bitrate channel (such as a circuit-switched channel) cannot transport a variable-bitrate data stream. A packet-switched network can support varying throughput rates but the mean throughput at any point in time is limited by factors such as link rates and congestion. In these cases it is necessary to adapt or control the bitrate produced by a video encoder to match the available bitrate of the transmission mechanism. CD-ROM and DVD media have a fixed storage capacity and it is necessary to control the rate of an encoded video sequence (for example, a movie stored on DVD-Video) to fit the capacity of the medium.

Figure 7.35 Bit rate variation (MPEG-4 Simple Profile). [Plot: Office sequence, 25 fps, MP4 Simple Profile, QP = 12; x-axis: Frames, 0–200; y-axis: Bits per frame, 0–10000.]

Figure 7.36 Encoder output and decoder input buffers. [Diagram labels: video frames, encoder, channel, decoder, video frames; signal labels: constant frame rate, variable bitrate, constant bitrate, constant bitrate, variable bitrate, constant frame rate.]

The variable data rate produced by an encoder can be ‘smoothed’ by buffering the encoded data prior to transmission. Figure 7.36 shows a typical arrangement, in which the variable bitrate output of the encoder is passed to a ‘First In/First Out’ (FIFO) buffer. This buffer is emptied at a constant bitrate that is matched to the channel capacity. Another FIFO is placed at the input to the decoder and is filled at the channel bitrate and emptied by the decoder at a variable bitrate (since the decoder extracts P bits to decode each frame and P varies).

Example

The ‘Office’ sequence is coded using MPEG-4 Simple Profile with a fixed QP = 12 to produce the variable bitrate plotted in Figure 7.35. The encoder output is buffered prior to transmission over a 100 kbps constant bitrate channel. The video frame rate is 25 fps and so the channel transmits 4 kbits (and hence removes 4 kbits from the buffer) in every frame period. Figure 7.37 plots the contents of the encoder buffer (y-axis) against elapsed time (x-axis). The first I-VOP generates over 50 kbits and subsequent P-VOPs in the early part of the sequence produce relatively few bits and so the buffer contents drop for the first 2 seconds as the channel bitrate exceeds the encoded bitrate. At around 3 seconds the encoded bitrate starts to exceed the channel bitrate and the buffer fills up.

Figure 7.37 Buffer example (encoder; channel bitrate 100 kbps). [Plot: Encoder buffer contents (channel bitrate 100kbps); x-axis: Seconds, 0–8; y-axis: Buffer contents (bits), ×10^4, 0–10.]
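The encoder-side buffer behaviour described above can be sketched in a few lines. This is a minimal illustration, not the simulation used for the figure: the frame sizes in `frame_bits` are hypothetical stand-ins (one large I-VOP followed by small and then large P-VOPs), and the function name is an assumption.

```python
def encoder_buffer_trace(frame_bits, channel_bps, frame_rate):
    """Return encoder FIFO occupancy (bits) after each frame period.

    In every frame period the encoder adds one coded frame and the
    channel removes a fixed number of bits. Occupancy cannot go negative.
    """
    drain_per_frame = channel_bps / frame_rate
    occupancy = 0.0
    trace = []
    for bits in frame_bits:
        occupancy = max(0.0, occupancy + bits - drain_per_frame)
        trace.append(occupancy)
    return trace


# HYPOTHETICAL frame sizes, not the measured Office sequence data.
frame_bits = [54000] + [2000] * 50 + [8000] * 50
trace = encoder_buffer_trace(frame_bits, channel_bps=100_000, frame_rate=25)
print(trace[0], trace[50], trace[-1])
```

With these made-up sizes the buffer first drains while the P-VOPs are smaller than the 4 kbits removed per frame period, then fills once they become larger, which mirrors the shape described for the encoder buffer.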

Figure 7.38 shows the state of the decoder buffer, filled at a rate of 100 kbps (4 kbits per frame) and emptied as the decoder extracts each frame. It takes half a second before the first complete coded frame (54 kbits) is received. From this point onwards, the decoder is able to extract and decode frames at the correct rate (25 frames per second) until around 4 seconds have elapsed. At this point, the decoder buffer is emptied and the decoder ‘stalls’ (i.e. it has to slow down or pause decoding until enough data are available in the buffer). Decoding picks up again after around 5.5 seconds.

If the decoder stalls in this way it is a problem for video playback because the video clip ‘freezes’ until enough data are available to continue. The problem can be partially solved by adding a deliberate delay at the decoder. For example, Figure 7.39 shows the results if the decoder waits for 1 second before it starts decoding. Delaying decoding of the first frame allows the buffer contents to reach a higher level before decoding starts and in this case the contents never drop to zero and so playback can proceed smoothly (see footnote 2).

Footnote 2: Varying throughput rates from the channel can also be handled using a decoder buffer. For example, a widely-used technique for video streaming over IP networks is for the decoder to buffer a few seconds of coded data before commencing decoding. If data throughput drops temporarily (for example due to network congestion) then decoding can continue as long as data remain in the buffer.
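The effect of a start-up delay on decoder stalls can be illustrated with a small simulation. This is a sketch under stated assumptions: the frame sizes are hypothetical, the function name `decoder_stalls` is illustrative, and the decoder is assumed to attempt exactly one frame per frame period once the delay has passed.

```python
def decoder_stalls(frame_bits, channel_bps, frame_rate, startup_delay_s):
    """Count frame periods in which the decoder must wait for data.

    The buffer is filled at the channel rate. After the start-up delay,
    the decoder removes the next frame if all of its bits have arrived;
    otherwise it waits. Waiting for the very first frame is not counted
    as a stall.
    """
    fill_per_period = channel_bps / frame_rate
    startup_periods = round(startup_delay_s * frame_rate)
    total_bits = sum(frame_bits)
    received = 0.0   # bits delivered by the channel so far
    consumed = 0.0   # bits already removed by the decoder
    next_frame = 0
    stalls = 0
    period = 0
    while next_frame < len(frame_bits):
        period += 1
        received = min(total_bits, received + fill_per_period)
        if period <= startup_periods:
            continue  # deliberate delay: only buffer, do not decode yet
        if received - consumed >= frame_bits[next_frame]:
            consumed += frame_bits[next_frame]
            next_frame += 1
        elif next_frame > 0:
            stalls += 1
    return stalls


# HYPOTHETICAL frame sizes, not the measured Office sequence data.
frame_bits = [54000] + [2000] * 50 + [8000] * 30
stalls_no_delay = decoder_stalls(frame_bits, 100_000, 25, startup_delay_s=0.0)
stalls_with_delay = decoder_stalls(frame_bits, 100_000, 25, startup_delay_s=1.0)
print(stalls_no_delay, stalls_with_delay)
```

For this made-up sequence the decoder stalls several times without a delay and not at all with a 1 second delay, at the cost of the extra buffer storage and latency.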

Figure 7.38 Buffer example (decoder; channel bitrate 100 kbps). [Plot: Decoder buffer contents (channel bitrate 100kbps); x-axis: Seconds, 0–9; y-axis: Buffer contents (bits), ×10^4, 0–7; annotations: ‘1st frame decoded’, ‘Decoder stalls’.]

Figure 7.39 Buffer example (decoder; channel bitrate 100 kbps). [Plot: Decoder buffer contents (channel bitrate 100kbps); x-axis: Seconds, 0–9; y-axis: Buffer contents (bits), ×10^4, 0–12; annotation: ‘1st frame decoded’.]


These examples show that a variable coded bitrate can be adapted to a constant bitrate delivery medium using encoder and decoder buffers. However, this adaptation comes at a cost of buffer storage space and delay and (as the examples demonstrate) the wider the bitrate variation, the larger the buffer size and decoding delay. Furthermore, it is not possible to cope with an arbitrary variation in bitrate using this method, unless the buffer sizes and decoding delay are set at impractically high levels. It is usually necessary to implement a feedback mechanism to control the encoder output bitrate in order to prevent the buffers from over- or under-flowing.

Rate control involves modifying the encoding parameters in order to maintain a target output bitrate. The most obvious parameter to vary is the quantiser parameter or step size (QP) since increasing QP reduces coded bitrate (at the expense of lower decoded quality) and vice versa. A common approach to rate control is to modify QP during encoding in order to (a) maintain a target bitrate (or mean bitrate) and (b) minimise distortion in the decoded sequence. Optimising the tradeoff between bitrate and quality is a challenging task and many different approaches and algorithms have been proposed and implemented. The choice of rate control algorithm depends on the nature of the video application, for example:
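The basic idea of adjusting QP in response to the bits produced can be shown with a deliberately simple feedback loop. This is a toy sketch, not an algorithm from either standard: the encoder is replaced by an assumed model in which coded bits equal a per-frame complexity value divided by QP, and the default QP limits (1 to 31) and the one-step adjustment rule are illustrative assumptions.

```python
def simple_qp_controller(complexities, target_bits_per_frame,
                         qp_init=12, qp_min=1, qp_max=31):
    """Toy feedback rate control: step QP up or down after each frame.

    ASSUMPTION: coded bits are modelled as complexity / QP. This stands
    in for a real encoder and is not taken from the text.
    """
    qp = qp_init
    history = []
    for complexity in complexities:
        bits = complexity / qp              # hypothetical encoder model
        history.append((qp, bits))
        if bits > target_bits_per_frame:    # too many bits: coarser quantiser
            qp = min(qp_max, qp + 1)
        elif bits < target_bits_per_frame:  # too few bits: finer quantiser
            qp = max(qp_min, qp - 1)
    return history


# HYPOTHETICAL complexity values: a quiet scene, then a busier one.
complexities = [48000] * 5 + [96000] * 10
history = simple_qp_controller(complexities, target_bits_per_frame=4000)
for qp, bits in history:
    print(qp, round(bits))
```

When the scene becomes more complex, QP rises frame by frame and the bits per frame fall back towards the target, at the expense of coarser quantisation. Practical algorithms also try to minimise distortion, which this sketch ignores.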

(a) Offline encoding of stored video for storage on a DVD. Processing time is not a particular constraint and so a complex algorithm can be employed. The goal is to ‘fit’ a compressed video sequence into the available storage capacity whilst maximising image quality and ensuring that the decoder buffer of a DVD player does not overflow or underflow during decoding. Two-pass encoding (in which the encoder collects statistics about the video sequence in a first pass and then carries out encoding in a second pass) is a good option in this case.

(b) Encoding of live video for broadcast. A broadcast programme has one encoder and multiple decoders; decoder processing and buffering is limited whereas encoding may be carried out in expensive, fast hardware. A delay of a few seconds is usually acceptable and so there is scope for a medium-complexity rate control algorithm, perhaps incorporating two-pass encoding of each frame.

(c) Encoding for two-way videoconferencing. Each terminal has to carry out both encoding and decoding and processing power may be limited. Delay must be kept to a minimum (ideally less than around 0.5 seconds from frame capture at the encoder to display at the decoder). In this scenario a low-complexity rate control algorithm is appropriate. Encoder and decoder buffering should be minimised (in order to keep the delay small) and so the encoder must tightly control output rate. This in turn may cause decoded video quality to vary significantly; for example, it may drop sharply when there is an increase in movement or detail in the video scene.

Recommendation H.264 does not (at present) specify or suggest a rate control algorithm (however, a proposal for H.264 rate control is described in [39]). MPEG-4 Visual describes a possible rate control algorithm in an Informative Annex [40] (i.e. use of the algorithm is not mandatory). This algorithm, known as the Scalable Rate Control (SRC) scheme, is appropriate for a single video object (a rectangular V.O. that covers the entire frame) and a range of bit rates and spatial/temporal resolutions. The SRC attempts to achieve a target bit rate over a certain number of frames (a ‘segment’ of frames, usually starting with an I-VOP) and assumes the following model for the encoder rate R:

$$R = \frac{X_1 S}{Q} + \frac{X_2 S}{Q^2} \qquad (7.10)$$


where Q is the quantiser step size, S is the mean absolute difference of the residual frame after motion compensation (a measure of frame complexity) and X1, X2 are model parameters. Rate control consists of the following steps which are carried out after motion compensation and before encoding of each frame i:

1. Calculate a target bit rate Ri, based on the number of frames in the segment, the number of bits that are available for the remainder of the segment, the maximum acceptable buffer contents and the estimated complexity of frame i. (The maximum buffer size affects the latency from encoder input to decoder output. If the previous frame was complex, it is assumed that the next frame will be complex and should therefore be allocated a suitable number of bits: the algorithm attempts to balance this requirement against the limit on the total number of bits for the segment.)

2. Compute the quantiser step size Qi (to be applied to the whole frame). Calculate S for the complete residual frame and solve equation (7.10) to find Q.

3. Encode the frame.

4. Update the model parameters X1, X2 based on the actual number of bits generated for frame i.
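Step 2 above amounts to solving a quadratic in Q. The following sketch shows one way to do this, assuming the target bit count, S and the model parameters are already known; the function names are illustrative, the parameter values are made up, and the target calculation (step 1) and model update (step 4) are not shown because the text does not give their formulas.

```python
import math


def model_rate(Q, S, X1, X2):
    """Rate predicted by the model of equation (7.10)."""
    return X1 * S / Q + X2 * S / Q ** 2


def solve_quantiser(target_bits, S, X1, X2):
    """Solve R = X1*S/Q + X2*S/Q**2 for Q, taking the larger root."""
    if target_bits <= 0 or S <= 0:
        raise ValueError("target_bits and S must be positive")
    if X2 == 0:
        # The model reduces to R = X1*S/Q.
        return X1 * S / target_bits
    # Multiplying through by Q**2 gives R*Q**2 - X1*S*Q - X2*S = 0.
    discriminant = (X1 * S) ** 2 + 4.0 * target_bits * X2 * S
    if discriminant < 0:
        raise ValueError("no real solution for these parameters")
    return (X1 * S + math.sqrt(discriminant)) / (2.0 * target_bits)


# HYPOTHETICAL values for illustration only.
S, X1, X2 = 10.0, 2000.0, 8000.0
q = solve_quantiser(4000.0, S, X1, X2)
q_linear = solve_quantiser(4000.0, S, X1, 0.0)
print(q, model_rate(q, S, X1, X2))
```

A practical encoder would presumably also round Q and restrict it to the range of step sizes permitted by the standard; that is left out here.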

The SRC algorithm aims to achieve a target bit rate across a segment of frames (rather than a sequence of arbitrary length) and does not modulate the quantiser step size within a coded frame, giving a uniform visual appearance within each frame but making it difficult to maintain a small buffer size and hence a low delay. An extension to the SRC supports modulation of the quantiser step size at the macroblock level and is suitable for low-delay applications that require ‘tight’ rate control. The macroblock-level algorithm is based on a model for the number of bits Bi required to encode macroblock i, equation (7.11):

$$B_i = A\left(K\,\frac{\sigma_i^2}{Q_i^2} + C\right) \qquad (7.11)$$

where A is the number of pixels in a macroblock, σi is the standard deviation of luminance and chrominance in the residual macroblock (i.e. a measure of variation within the macroblock), Qi is the quantisation step size and K, C are constant model parameters. The following steps are carried out for each macroblock i:

1. Measure σi.

2. Calculate Qi based on B, K, C, σi and a macroblock weight αi.

3. Encode the macroblock.

4. Update the model parameters K and C based on the actual number of coded bits produced for the macroblock.
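The macroblock model of equation (7.11) can be inverted to find the step size that meets a given bit budget for one macroblock. The sketch below shows only that inversion; how the budget is shared between macroblocks using the weights αi, and how K and C are updated, are not specified in the text and are therefore omitted. The function names, the default A = 256 and the numeric values are assumptions made for illustration.

```python
import math


def macroblock_bits(sigma, Q, K, C, A=256):
    """Bits predicted by the model B = A * (K * sigma^2 / Q^2 + C)."""
    return A * (K * sigma ** 2 / Q ** 2 + C)


def quantiser_for_budget(bits, sigma, K, C, A=256):
    """Invert the model: the Q at which the predicted bits equal `bits`."""
    overhead = A * C  # bits the model predicts regardless of Q
    if bits <= overhead:
        raise ValueError("budget does not exceed the fixed overhead A*C")
    return sigma * math.sqrt(A * K / (bits - overhead))


# HYPOTHETICAL values for illustration only.
sigma, K, C = 8.0, 0.5, 0.02
q_mb = quantiser_for_budget(200.0, sigma, K, C)
print(q_mb, macroblock_bits(sigma, q_mb, K, C))
```

Feeding the resulting step size back into the model returns the original budget, which is a quick check that the inversion is consistent with the equation.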

The weight αi controls the ‘importance’ of macroblock i to the subjective appearance of the image and a low value of αi means that the current macroblock is likely to be highly quantised. These weights may be selected to minimise changes in Qi at lower bit rates since each change involves sending a modified quantisation parameter DQUANT which means encoding an extra five bits per macroblock. It is important to minimise the number of changes to Qi during encoding of a frame at low bit rates because the extra five bits in a macroblock may become significant; at higher bit rates, this DQUANT overhead is less important and Q may change more frequently without significant penalty. This rate control method is effective
