

Table 7.2 Variable length decoding example: MPEG-4 Visual TCOEF

State   Input   Next state      VLC     Output (last, run, level)
0       0       1               0..
        1       2               1..
1       0       ..later state   00..
        1       ..later state   01..
2       0       0               10s     (0, 0, ±1)
        1       3               11..
3       0       0               110s    (0, 1, ±1)
        1       4               111..
4       0       0               1110s   (0, 2, ±1)
        1       0               1111s   (0, 0, ±2)
etc.    ...     ...             ...     ...

The decoded syntax element (or an error indication) is returned and the decoder restarts from the initial state. Table 7.2 shows the first part of the decoding sequence for the MPEG-4 Visual TCOEF (transform coefficient) context, starting with state 0. If the input bit is 0, the next state is state 1; if the input bit is 1, the next state is state 2. From state 2, if the input is 0, the decoder has ‘found’ a valid VLC, 10. In this context it is necessary to decode one more bit at the end of each VLC (the sign bit, s, indicating whether the level is positive or negative), after which the decoder outputs the relevant syntax element (last = 0, run = 0, level = ±1 in this case) and returns to state 0. Note that when a syntax element containing last = 1 is decoded, the end of the block of coefficients has been reached and it is necessary to reset or change the context.
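
As an illustration of this state machine, the following is a minimal C sketch of a bit-serial, table-driven decoder for the fragment of the TCOEF tree shown in Table 7.2. The table layout, the get_bit callback and the handling of the unspecified ‘..later state’ entries are our own assumptions; a real TCOEF table has many more states plus an escape mechanism.

```c
#include <stdio.h>

/* One edge per (state, input bit). A leaf edge means a complete VLC has
   been matched; the unsigned level is then combined with the trailing
   sign bit 's'. Only the states of Table 7.2 are modelled: state 1's
   successors ('..later state') are stubbed out by looping back to
   state 1, standing in for the rest of the code tree. */
typedef struct {
    int is_leaf;          /* 1: this bit completes a VLC             */
    int next;             /* next state when is_leaf == 0            */
    int last, run, level; /* symbol (level without sign) when a leaf */
} Edge;

typedef struct { Edge on[2]; } State;

static const State tcoef[5] = {
    /* state 0 */ {{ {0, 1, 0, 0, 0}, {0, 2, 0, 0, 0} }},
    /* state 1 */ {{ {0, 1, 0, 0, 0}, {0, 1, 0, 0, 0} }}, /* '..later state' stub */
    /* state 2 */ {{ {1, 0, 0, 0, 1}, {0, 3, 0, 0, 0} }}, /* 10s   -> (0,0,±1) */
    /* state 3 */ {{ {1, 0, 0, 1, 1}, {0, 4, 0, 0, 0} }}, /* 110s  -> (0,1,±1) */
    /* state 4 */ {{ {1, 0, 0, 2, 1}, {1, 0, 0, 0, 2} }}, /* 1110s, 1111s */
};

/* Decode one TCOEF symbol, one bit per step (cf. one bit per clock
   cycle in hardware). get_bit() is a caller-supplied bitstream reader. */
static void decode_tcoef(int (*get_bit)(void), int *last, int *run, int *level)
{
    int s = 0;
    for (;;) {
        const Edge *e = &tcoef[s].on[get_bit()];
        if (e->is_leaf) {
            int sign = get_bit();               /* trailing sign bit 's' */
            *last = e->last;
            *run  = e->run;
            *level = sign ? -e->level : e->level;
            return;                             /* restart from state 0 */
        }
        s = e->next;
    }
}

static const int bits[] = {1, 1, 1, 0, 1};      /* VLC '1110' + sign 1 */
static int pos = 0;
static int get_bit(void) { return bits[pos++]; }

int main(void)
{
    int last, run, level;
    decode_tcoef(get_bit, &last, &run, &level);
    printf("(last=%d, run=%d, level=%d)\n", last, run, level); /* (0,2,-1) */
    return 0;
}
```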

In this example, the decoder processes one input bit at each stage (e.g. one bit per clock cycle in a hardware implementation). This may be too slow for some applications, in which case a more sophisticated architecture that can examine multiple bits (or entire VLCs) in one operation may be required. Examples of architectures for variable-length coding and decoding are described in [26–29].
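
One common faster approach (in the spirit of, though not taken from, the architectures cited above) is to decode by table lookup on several bits at once. The sketch below builds a 5-bit lookahead table covering the complete VLCs of Table 7.2, decoding a whole code plus its sign bit in a single access; peek5(), the table width and the escape convention are our own illustrative choices.

```c
#include <stdio.h>

#define LOOKAHEAD 5

typedef struct { int len, last, run, level; } LutEntry;
static LutEntry lut[1 << LOOKAHEAD];

/* Fill the table: for every possible 5-bit window, record the decoded
   symbol and the true code length (including the sign bit), or len = 0
   meaning 'longer code: fall back to bit-serial decoding'. */
static void build_lut(void)
{
    for (int i = 0; i < (1 << LOOKAHEAD); i++) {
        int b[5] = { (i >> 4) & 1, (i >> 3) & 1, (i >> 2) & 1,
                     (i >> 1) & 1, i & 1 };
        lut[i].len = 0;                          /* default: escape */
        if (b[0] && !b[1])                       /* 10s   -> (0,0,±1) */
            lut[i] = (LutEntry){3, 0, 0, b[2] ? -1 : 1};
        else if (b[0] && b[1] && !b[2])          /* 110s  -> (0,1,±1) */
            lut[i] = (LutEntry){4, 0, 1, b[3] ? -1 : 1};
        else if (b[0] && b[1] && b[2] && !b[3])  /* 1110s -> (0,2,±1) */
            lut[i] = (LutEntry){5, 0, 2, b[4] ? -1 : 1};
        else if (b[0] && b[1] && b[2] && b[3])   /* 1111s -> (0,0,±2) */
            lut[i] = (LutEntry){5, 0, 0, b[4] ? -2 : 2};
    }
}

/* Toy bitstream: '110'+sign 0, then '10'+sign 1, padded with zeros so
   that peek5() never reads past the coded data. */
static const int bits[] = {1,1,0,0, 1,0,1, 0,0,0,0};
static int pos = 0;
static int peek5(void)
{
    int v = 0;
    for (int k = 0; k < 5; k++) v = (v << 1) | bits[pos + k];
    return v;
}

int main(void)
{
    build_lut();
    for (int n = 0; n < 2; n++) {
        LutEntry e = lut[peek5()];    /* whole VLC decoded in one lookup */
        if (e.len == 0) break;        /* would switch to bit-serial path */
        printf("(last=%d, run=%d, level=%d) in %d bits\n",
               e.last, e.run, e.level, e.len);
        pos += e.len;
    }
    return 0;
}
```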

7.2.6.3 Arithmetic Coding

An arithmetic encoder (see Chapter 3) encodes each syntax element through successive refinement of a fractional number. Arithmetic coding has the potential for greater compression efficiency than any variable-length coding algorithm (due to its ability to represent fractional probability distributions accurately). In practice, it is usually necessary to represent the fractional numbers produced by an arithmetic encoder using fixed-point values within a limited dynamic range. Some implementation issues for the context-based arithmetic coder adopted for H.264 Main Profile are discussed in Chapter 6 and a detailed overview of the CABAC scheme is given in [30].
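
To make the ‘successive refinement of a fractional number’ concrete, the sketch below encodes a short binary sequence by repeatedly narrowing an interval. Double precision is used purely for readability; as noted above, a practical coder such as CABAC uses fixed-point values with renormalisation. The symbol sequence and probability are arbitrary example values.

```c
#include <stdio.h>

/* Encode a short binary string by shrinking the interval [low, low+range).
   Each symbol selects the sub-interval whose width is proportional to
   its (assumed) probability; any fraction inside the final interval
   identifies the whole sequence. */
int main(void)
{
    const int symbols[] = {0, 1, 0, 0, 1};
    const double p0 = 0.7;                 /* assumed P(symbol == 0) */
    double low = 0.0, range = 1.0;

    for (int i = 0; i < 5; i++) {
        if (symbols[i] == 0) {
            range *= p0;                   /* keep lower sub-interval */
        } else {
            low   += range * p0;           /* keep upper sub-interval */
            range *= (1.0 - p0);
        }
        printf("after symbol %d: [%.6f, %.6f)\n",
               symbols[i], low, low + range);
    }
    printf("transmit e.g. %.6f\n", low + range / 2);
    return 0;
}
```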

7.3 INPUT AND OUTPUT

7.3.1 Interfacing

Figure 7.17 shows a system in which video frames are encoded, transmitted or stored and decoded.

Figure 7.17 Video CODEC interfaces (video frames → [A] encode [B] → network adaptation → network or storage → network adaptation → [C] decode [D] → video frames)

At the input to the encoder (A) and the output of the decoder (D), data are in the format of uncompressed video frames, each represented by a set of samples, typically in the YCbCr colour space using one of the sampling structures described in Chapter 2 (4:4:4, 4:2:2 or 4:2:0). There are a number of different methods of combining the three components of each frame, including interleaved (samples of Y, Cb and Cr are interleaved together in raster scan order), concatenated (the complete Y component for a frame is followed by the Cb and then Cr components) and using separate buffers or memory areas to store each of the three components. The choice of method may depend on the application. For example, using separate buffers for the Y, Cb and Cr components may be suitable for a software CODEC; a hardware CODEC with limited memory and/or a requirement for low delay may use an interleaved format.
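
As a sketch of this buffer-layout choice, the fragment below (types and names are our own) allocates a 4:2:0 frame as three separate planes and notes the size of an equivalent interleaved buffer; either layout holds the same w × h × 3/2 samples in total.

```c
#include <stdlib.h>

/* Separate-buffer (planar) layout for one 4:2:0 frame: a full-resolution
   Y plane plus quarter-size Cb and Cr planes. Keeping each component
   contiguous is often convenient for a software CODEC. */
typedef struct {
    int w, h;
    unsigned char *y;    /* w * h samples         */
    unsigned char *cb;   /* (w/2) * (h/2) samples */
    unsigned char *cr;   /* (w/2) * (h/2) samples */
} Frame420;

static Frame420 alloc_planar(int w, int h)
{
    Frame420 f = { w, h,
                   malloc((size_t)w * h),
                   malloc((size_t)(w / 2) * (h / 2)),
                   malloc((size_t)(w / 2) * (h / 2)) };
    return f;
}

/* An interleaved layout instead stores, e.g., the four Y samples of each
   2x2 block followed by its Cb and Cr samples in one buffer, which can
   suit a hardware CODEC with limited memory and a low-delay pipeline.
   The total size is the same as the three planes combined. */
static size_t interleaved_size(int w, int h)
{
    return (size_t)w * h * 3 / 2;
}

int main(void)
{
    Frame420 f = alloc_planar(176, 144);       /* QCIF */
    size_t n = interleaved_size(176, 144);     /* 38016 bytes */
    free(f.y); free(f.cb); free(f.cr);
    return n == 0;
}
```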

At the output of the encoder (B) and the input to the decoder (C) the data consist of a sequence of bits representing the video sequence in coded form. The H.264 and MPEG-4 Visual standards use fixed length codes, variable-length codes and/or arithmetic coding to represent the syntax elements of the compressed sequence. The coded bitstream consists of continuous sequences of bits, interspersed with fixed-length ‘marker’ codes. Methods of mapping this bitstream to a transport or storage mechanism (‘delivery mechanism’) include the following.

Bit-oriented: If the delivery mechanism is capable of dealing with an arbitrary number of bits, the bitstream may be transmitted directly (optionally multiplexed with associated data such as coded audio and ‘side’ information).

Byte-oriented: Many delivery mechanisms (e.g. file storage or network packets) require data to be mapped to an integral number of bytes or words. It may be necessary to pad the coded data at the end of a unit (e.g. slice, picture, VOP or sequence) to make an integral number of bytes or words (a minimal padding routine is sketched after this list).

Packet-oriented: Both MPEG-4 Visual and H.264 support the concept of placing a complete coded unit in a network packet. A video packet or NAL unit packet contains coded data that corresponds to a discrete coded unit such as a slice (a complete frame or VOP or a portion of a frame or VOP) (see Section 6.7).
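
For the byte-oriented case, the following minimal bit-writer shows one way padding to a byte boundary can work at the end of a coded unit. The zero-bit stuffing here is only an example convention; each standard defines its own alignment and stuffing patterns.

```c
#include <stdio.h>

/* Minimal bit writer. Bits accumulate MSB-first in 'acc'; whole bytes
   are flushed to 'buf'. */
typedef struct {
    unsigned char buf[64];
    int nbytes;
    unsigned acc;   /* partial byte            */
    int nbits;      /* bits currently in 'acc' */
} BitWriter;

static void put_bit(BitWriter *bw, int b)
{
    bw->acc = (bw->acc << 1) | (b & 1);
    if (++bw->nbits == 8) {
        bw->buf[bw->nbytes++] = (unsigned char)bw->acc;
        bw->acc = 0;
        bw->nbits = 0;
    }
}

/* Pad with zero bits until the stream is byte-aligned, e.g. at the end
   of a slice, picture or VOP. */
static void byte_align(BitWriter *bw)
{
    while (bw->nbits != 0)
        put_bit(bw, 0);
}

int main(void)
{
    BitWriter bw = {0};
    for (int i = 0; i < 11; i++)  /* 11 coded bits: not byte aligned */
        put_bit(&bw, 1);
    byte_align(&bw);              /* pads 5 zero bits -> 2 whole bytes */
    printf("%d bytes: %02X %02X\n", bw.nbytes, bw.buf[0], bw.buf[1]);
    return 0;
}
```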

7.3.2 Pre-processing

The compression efficiency of a video CODEC can be significantly improved by pre-processing video frames prior to encoding. Problems with the source material and/or video capture system may degrade the coding performance of a video encoder. Camera noise (introduced by the camera and/or the digitisation process) is illustrated in Figure 7.18.

Figure 7.18 Image showing camera noise (lower half)

The top half of this image is relatively noise-free, typical of the type of image captured by a high-quality digital camera. Images captured from low-quality sources are more likely to contain noise (shown in the lower half of the figure). Camera noise tends to appear in the higher spatial frequencies and to change from frame to frame. An encoder ‘sees’ this noise as a high-frequency component of the motion-compensated residual and encodes it together with the desired residual data, increasing the coded bitrate. Camera noise can therefore significantly reduce the compression efficiency of an encoder. By filtering the input video sequence prior to encoding it may be possible to reduce camera noise (and hence improve compression efficiency). The filter parameters should be chosen with care, to avoid filtering out useful features of the video sequence.
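
As a concrete (and deliberately crude) example of such a pre-filter, the function below applies a 3 × 3 box blur to a luma plane before encoding. A practical de-noising pre-filter would be edge-preserving and/or temporal, reflecting the warning above about removing useful features along with the noise; all names here are our own.

```c
#include <string.h>

/* Crude pre-filter: 3x3 box blur over a luma plane, run on each frame
   before it enters the encoder. Averaging suppresses isolated
   high-frequency noise but also blurs genuine detail, which is why
   real pre-filters are more selective. */
static void prefilter_3x3(const unsigned char *src, unsigned char *dst,
                          int w, int h)
{
    memcpy(dst, src, (size_t)w * h);           /* borders left unfiltered */
    for (int y = 1; y < h - 1; y++)
        for (int x = 1; x < w - 1; x++) {
            int sum = 0;
            for (int dy = -1; dy <= 1; dy++)
                for (int dx = -1; dx <= 1; dx++)
                    sum += src[(y + dy) * w + (x + dx)];
            dst[y * w + x] = (unsigned char)(sum / 9);
        }
}

int main(void)
{
    static unsigned char in[16 * 16], out[16 * 16];
    in[8 * 16 + 8] = 255;                      /* isolated 'noise' spike */
    prefilter_3x3(in, out, 16, 16);
    return out[8 * 16 + 8] == 28 ? 0 : 1;      /* spike heavily attenuated */
}
```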

Another phenomenon that can reduce compression efficiency is camera shake, small movements of the camera between successive frames, characteristic of a hand-held or poorly stabilised camera. These are ‘seen’ by the encoder as global motion between frames. Motion compensation may partly correct the motion but block-based motion estimation algorithms are not usually capable of correcting fully for camera shake and the result is an increase in residual energy and a drop in compression performance. Many consumer and professional camcorders incorporate image stabilisation systems that attempt to compensate automatically for small camera movements using mechanical and/or image processing methods. As well as improving the appearance of the captured video sequence, this has the effect of improving compression performance if the material is coded using motion compensation.
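
A minimal sketch of the image-processing side of stabilisation: estimate a single global displacement between consecutive frames by exhaustive SAD search over a small window, then shift the frame to cancel it. The search strategy and parameters are illustrative assumptions, not a description of any particular camcorder's system.

```c
#include <limits.h>

/* Estimate a single global displacement (dx, dy) between two luma
   frames by exhaustive SAD search over a +/- 'range' window; only the
   region valid for every candidate shift is compared. A stabiliser
   would then shift the frame by (-dx, -dy) to cancel the shake. */
static void global_motion(const unsigned char *cur, const unsigned char *ref,
                          int w, int h, int range, int *best_dx, int *best_dy)
{
    long best = LONG_MAX;
    for (int dy = -range; dy <= range; dy++)
        for (int dx = -range; dx <= range; dx++) {
            long sad = 0;
            for (int y = range; y < h - range; y++)
                for (int x = range; x < w - range; x++) {
                    int d = cur[y * w + x] - ref[(y + dy) * w + (x + dx)];
                    sad += d < 0 ? -d : d;
                }
            if (sad < best) { best = sad; *best_dx = dx; *best_dy = dy; }
        }
}

int main(void)
{
    static unsigned char a[32 * 32], b[32 * 32];
    for (int i = 0; i < 32 * 32; i++) a[i] = (unsigned char)(i * 37);
    for (int y = 0; y < 32; y++)               /* b = a shifted right by 1 */
        for (int x = 1; x < 32; x++) b[y * 32 + x] = a[y * 32 + x - 1];
    int dx = 0, dy = 0;
    global_motion(a, b, 32, 32, 2, &dx, &dy);
    return (dx == 1 && dy == 0) ? 0 : 1;       /* recovers the shift */
}
```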

7.3.3 Post-processing

Video compression algorithms that incorporate quantisation (such as the core algorithms of MPEG-4 Visual and H.264) are inherently lossy, i.e. the decoded video frames are not identical to the original. The goal of any practical CODEC is to minimise distortion and maximise compression efficiency. It is often possible to reduce the actual or apparent distortion in the decoded video sequence by processing (filtering) the decoded frames. If the filtered decoded frames are then used as references for motion compensation, the filtering process can have the added benefit of improving motion-compensated prediction and hence compression efficiency.


Figure 7.19 Distortion introduced by MPEG-4 Visual encoding (lower half)

Figure 7.20 Distortion introduced by H.264 encoding (lower half)

Block transform-based CODECs introduce characteristic types of distortion into the decoded video data. The lower half of Figure 7.19 shows typical distortion in a frame encoded and decoded using MPEG-4 Simple Profile (the upper half is the original, uncompressed frame). This example shows ‘blocking’ distortion (caused by mismatches at the boundaries of reconstructed 8 × 8 blocks) and ‘ringing’ distortion (faint patterns along the edges of objects, caused by the ‘break through’ of DCT basis patterns). Blocking is probably the most visually obvious (and therefore the most important) type of distortion introduced by video compression. Figure 7.20 (lower half) shows the result of encoding and decoding using H.264 without loop filtering. The smaller transform size in H.264 (4 × 4 rather than 8 × 8 samples) means that the blocking artefacts are correspondingly smaller, but they are still obvious.¹

¹ The compressed halves of each of these figures were encoded at different bitrates.


Figure 7.21 Post-filter implementation (encoder: input frame → motion-compensated prediction → transform, quantise, etc.; decoder: decoding, rescaling, inverse transform → motion-compensated reconstruction → filter → decoded frame; the unfiltered reconstructed frame is the reference in both loops)

Figure 7.22 Loop filter implementation (as Figure 7.21, but with the filter inside both loops: encoder and decoder each filter the reconstructed frame before it is stored as the motion-compensation reference, and the decoder outputs the filtered frame as the decoded frame)

Filters to reduce blocking (de-blocking) and/or ringing effects (de-ringing) are widely used in practical CODECs. Many filter designs have been proposed and implemented, ranging from relatively simple algorithms to iterative algorithms many times more complex than the encoding and decoding algorithms themselves [31–34]. The goal of a de-blocking or de-ringing filter is to minimise the effect of blocking or ringing distortion whilst preserving important features of the image. MPEG-4 Visual describes a de-blocking filter and a de-ringing filter: these are ‘informative’ parts of the standard and are therefore optional. Both filters are designed to be placed at the output of the decoder (Figure 7.21). With this type of post-filter, unfiltered decoded frames are used as the reference for motion-compensated reconstruction of further frames, so the filters improve visual quality at the decoder but have no effect on the encoding and decoding processes.
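
The following toy filter illustrates the basic de-blocking idea in its simplest form: smooth across each vertical 8 × 8 block edge only when the step across the edge is small (likely a quantisation artefact) rather than large (likely a real image edge). The threshold and weights are arbitrary example values; MPEG-4 Visual's informative filter adapts to the quantiser step size and local activity instead.

```c
/* Toy de-blocking pass over the vertical edges of an 8x8 block grid.
   Where the step across a block boundary is small, it is more likely
   quantisation distortion than a real image edge, so the two boundary
   pixels are pulled towards their mean. */
static void deblock_rows(unsigned char *img, int w, int h, int thresh)
{
    for (int y = 0; y < h; y++)
        for (int x = 8; x < w; x += 8) {
            unsigned char *p = &img[y * w + x]; /* first pixel right of edge */
            int step = p[0] - p[-1];
            if (step > -thresh && step < thresh) {
                int avg = (p[-1] + p[0] + 1) / 2;
                p[-1] = (unsigned char)((p[-1] + avg + 1) / 2);
                p[0]  = (unsigned char)((p[0]  + avg + 1) / 2);
            }
        }
}

int main(void)
{
    static unsigned char img[16 * 16];
    for (int i = 0; i < 16 * 16; i++)
        img[i] = (i % 16) < 8 ? 100 : 104;      /* small step at x == 8 */
    deblock_rows(img, 16, 16, 6);               /* 100,104 -> 101,103 */
    return 0;
}
```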

It may be advantageous to place the filter inside the encoding and decoding ‘loops’ (Figure 7.22). At the decoder, the filtered decoded frame is stored for further motion-compensated reconstruction. To ensure that the encoder uses an identical reference frame, the same filter is applied to reconstructed frames in the encoder, and the encoder uses the filtered frame as a reference for further motion estimation and compensation. If the quality of the filtered frame is better than that of an unfiltered decoded frame, it will provide a better match for further encoded frames, resulting in a smaller residual after motion compensation and hence improved compression efficiency. H.264 makes use of this type of loop filter (see Chapter 6 for details of the filter algorithm). One disadvantage of incorporating the filter into the loop is that it must be specified in the standard (so that any decoder can repeat the filtering process exactly), leaving limited scope for innovative filter designs.
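
The skeleton below (stub functions, our own naming) shows the structural point of Figure 7.22: both encoder and decoder apply the filter to the reconstructed frame before it is stored as the motion-compensation reference, so the two loops stay in step. A post-filter as in Figure 7.21 would instead run outside this loop, affecting only the displayed frame.

```c
/* Skeleton of the reconstruction loop of Figure 7.22. The stages are
   stubs; the point is the ordering: encoder and decoder BOTH filter the
   reconstructed frame before storing it as the reference, so their
   references stay identical. */
typedef struct { unsigned char y[176 * 144]; } Frame;

static Frame reference;                 /* previous reconstructed frame */

static void predict(Frame *f)      { (void)f; /* MC prediction from 'reference' */ }
static void add_residual(Frame *f) { (void)f; /* + inverse-transformed residual */ }
static void loop_filter(Frame *f)  { (void)f; /* de-blocking, in place */ }

static void reconstruct_frame(Frame *out)
{
    predict(out);       /* motion-compensated prediction                   */
    add_residual(out);  /* transform/quantise (encoder) or rescale (decoder) */
    loop_filter(out);   /* filter INSIDE the loop ...                      */
    reference = *out;   /* ... so the filtered frame becomes the reference */
}

int main(void)
{
    Frame f = { {0} };
    reconstruct_frame(&f);  /* same sequence runs in encoder and decoder */
    return 0;
}
```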