
ITU-T H.264, Advanced Video Coding for Generic Audiovisual Services – Coding of Moving Video, also published as ISO/IEC 14496-10 (MPEG-4 Part 10), Advanced Video Coding.

H.264 is usually pronounced H-dot-TWO-SIX-FOUR.

Compounding 1.06 twelve times yields a factor of two: $1.06^{12} \approx 2$

H.264 video compression

48

H.264 denotes a codec standardized by ITU-T (under the designation H.264) and by ISO/IEC (under the designation MPEG-4 Part 10). The Simple Studio Profile (SStP) of MPEG-4 Part 2 is used in HDCAM. That aspect of Part 2, and all of Part 10, are applicable to broadcast-quality video; other than those cases, MPEG-4 is generally not applicable to broadcast-quality video. H.264 was developed by the Joint Video Team (JVT), where it was referred to as Advanced Video Coding (AVC); its ITU-T nomenclature during development was H.26L. All of these terms were once used to denote what is now, after adoption of the standard, best called H.264.

H.264 is broadly similar to MPEG-2, but by the time it was developed the “low-hanging fruit” had already been taken. Compression improvements in H.264 are obtained by a dozen or so techniques, each yielding perhaps a 6% improvement in coding efficiency – but a dozen of those, cascaded, yield twice the efficiency of MPEG-2. (Practitioners claim efficiency as low as 1.5 and as high as 3 times that of MPEG-2.) H.264 spans a wide range of applications, from surveillance video, to video conferencing, to mobile devices, to internet video streaming, to HDTV broadcasting.
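The compounding arithmetic behind the "dozen techniques at 6% each" claim can be checked with a short sketch. The function names `compounded_gain` and `steps_to_reach` are illustrative, and the assumption that each technique's gain multiplies independently is a simplification, not something the standard guarantees.

```python
import math


def compounded_gain(per_step_gain: float, steps: int) -> float:
    """Overall efficiency factor when each step multiplies efficiency by (1 + gain)."""
    return (1.0 + per_step_gain) ** steps


def steps_to_reach(target_factor: float, per_step_gain: float) -> float:
    """Number of (possibly fractional) steps needed to reach the target factor."""
    return math.log(target_factor) / math.log(1.0 + per_step_gain)


if __name__ == "__main__":
    # Twelve cascaded improvements of 6% each.
    print(f"12 steps of 6%: factor {compounded_gain(0.06, 12):.3f}")
    # How many 6% steps are needed to double efficiency?
    print(f"steps to double: {steps_to_reach(2.0, 0.06):.2f}")
```

Under that assumption the factor comes out slightly above 2, and just under twelve steps are enough to double efficiency, which agrees with the margin note.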

H.264 is complicated. The standard (in its 2010-03 edition) comprises 669 pages of very dense description. Implementing an encoder or decoder takes many man-years. Software, firmware, and hardware implementations are commercially available. Even hardware implementations require embedded firmware: H.264 VLSI solutions typically involve one or more embedded RISC processors and quite a bit of associated firmware.


MPEG LA, L.L.C. is not affiliated with MPEG (the standards group). LA apparently stands for Licensing Administration. The organization is based in Denver, not Los Angeles.

The H.264 features that extend MPEG-2 are described in the remaining sections of this chapter.

I assume that you are familiar with Introduction to video compression, on page 147, and with JPEG, M-JPEG, DV, and MPEG-2, described in the preceding three chapters.

Like MPEG-2, H.264 specifies exactly what constitutes a conformant bitstream: A conformant (“legal”) encoder generates only conformant bitstreams; a legal decoder correctly decodes any conformant bitstream. H.264 effectively standardizes the behaviour of a decoder, but does not standardize the encoder!

The goal of compression is to reduce data rate while minimizing the visibility of artifacts. The best way – most experts say, the only way – to establish the performance of an encoder is to visually assess the result of compressing and decompressing video streams.

H.264 is covered by hundreds of patents. Implementors, manufacturers, users, and/or others may or may not be required to take out a licence to the “patent pool” administered by MPEG LA.

Not all features of H.264 are expected to be implemented in every decoder; for example, B-slices (comparable to MPEG-2 B-pictures) are prohibited in the baseline profile. Applications have various bit rates, and decoders can have various levels of resources (e.g., memory); like MPEG-2, a system of profiles and levels determines the minimum requirements.

Algorithmic features, profiles, and levels

Table 48.1 summarizes the algorithmic features of H.264 beyond MPEG-2. The features in the top section are available in all profiles; features in the sections below are profile-dependent.

The features available in the baseline and extended profiles concern robust handling of data conveyed across unreliable channels. These features (and profiles) are generally not of interest for professional video, and they are not permitted in the main and high profiles.

The features of the extended, main, and high profiles offer improved coding efficiency. CABAC improves the performance of variable-length entropy coding.

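Fidelity range extensions (FRExt) refers to several algorithmic features incorporated into the high profiles – HiP, Hi10P, Hi422P, and Hi444P – to enable higher-quality video. Hi10P allows 10-bit video; Hi422P permits 4:2:2 chroma subsampling, and Hi444P permits 4:4:4, 12-bit video, and several other features.

| Feature set | Algorithmic feature (“tool”) | Baseline (BP) | Extended (XP) | Main (MP) | High (HiP) |
|---|---|---|---|---|---|
| Features in all profiles | Multiple reference pictures | yes | yes | yes | yes |
| Features in all profiles | Flexible motion compensation | yes | yes | yes | yes |
| Features in all profiles | I-slices and P-slices | yes | yes | yes | yes |
| Features in all profiles | 1/4-pel motion-comp. interpolation | yes | yes | yes | yes |
| Features in all profiles | 16-bit exact-match integer transform | yes | yes | yes | yes |
| Features in all profiles | Unified variable-length coding (UVLC/Exp-Golomb) | yes | yes | yes | yes |
| Features in all profiles | CAVLC | yes | yes | yes | yes |
| Features in all profiles | Deblocking filter in-the-loop | yes | yes | yes | yes |
| Set 1 | Flexible macroblock ordering (FMO) | yes | yes | no | no |
| Set 1 | Arbitrary slice order (ASO) | yes | yes | no | no |
| Set 1 | Redundant slices (RS) | yes | yes | no | no |
| Set 2 | Data partitioning | no | yes | no | no |
| Set 2 | SI & SP slices | no | yes | no | no |
| Set 3 | B-slices | no | yes | yes | yes |
| Set 3 | Interlaced coding (PicAFF, MBAFF) | no | yes | yes | yes |
| Set 3 | Weighted and offset MC prediction | no | yes | yes | yes |
| Set 4 | CABAC entropy coding | no | no | yes | yes |
| FRExt | 8×8 luma intra prediction | no | no | no | yes |
| FRExt | Increased sample depth | no | no | no | yes |
| FRExt | 4:4:4 and 4:2:2 chroma subsampling | no | no | no | yes |
| FRExt | Inter-picture lossless coding | no | no | no | yes |
| FRExt | 8×8/4×4 transform adaptivity | no | no | no | yes |
| FRExt | Quantization scaling matrices | no | no | no | yes |
| FRExt | Separate CB and CR QP control | no | no | no | yes |
| FRExt | Monochrome (4:0:0) | no | no | no | yes |

Table 48.1 H.264 features are arranged in rows; the columns indicate presence of features in the commercially important profiles.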
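The profile-dependence summarized in Table 48.1 can be expressed as a small lookup, which is roughly what a decoder's capability check amounts to. The following is a minimal sketch: the names `FEATURE_SETS`, `FEATURE_PROFILES`, and `is_permitted` are hypothetical, the feature strings are abbreviated from the table, and the FRExt row is attached to `HiP` as a stand-in for the whole family of high profiles.

```python
# Feature sets from Table 48.1, each paired with the profiles that permit it.
# "HiP" stands in here for the family of high profiles (an assumption of this sketch).
FEATURE_SETS = {
    "all profiles": (("BP", "XP", "MP", "HiP"), (
        "multiple reference pictures", "flexible motion compensation",
        "I- and P-slices", "quarter-pel interpolation",
        "integer transform", "UVLC/Exp-Golomb", "CAVLC", "deblocking filter")),
    "set 1": (("BP", "XP"), ("FMO", "ASO", "redundant slices")),
    "set 2": (("XP",), ("data partitioning", "SI and SP slices")),
    "set 3": (("XP", "MP", "HiP"), (
        "B-slices", "interlaced coding", "weighted prediction")),
    "set 4": (("MP", "HiP"), ("CABAC",)),
    "FRExt": (("HiP",), (
        "8x8 luma intra prediction", "increased sample depth",
        "4:4:4 and 4:2:2 chroma", "lossless coding",
        "8x8/4x4 transform adaptivity", "quantization scaling matrices",
        "separate CB and CR QP control", "monochrome")),
}

# Flatten to feature -> tuple of profiles that permit it.
FEATURE_PROFILES = {
    feature: profiles
    for profiles, features in FEATURE_SETS.values()
    for feature in features
}


def is_permitted(feature: str, profile: str) -> bool:
    """True if the named feature may appear in a bitstream of the given profile."""
    return profile in FEATURE_PROFILES[feature]


if __name__ == "__main__":
    for feature in ("B-slices", "FMO", "CABAC"):
        allowed = [p for p in ("BP", "XP", "MP", "HiP") if is_permitted(feature, p)]
        print(f"{feature}: {', '.join(allowed)}")
```

The lookup makes the asymmetry visible: the robustness tools of set 1 and set 2 never reach the main or high profiles, while B-slices are absent only from baseline.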

Four of H.264’s profiles are commercially important: baseline, extended, main, and high. The main and high profiles are relevant to professional video. H.264 has fifteen levels, accommodating images ranging from 176×144 (coded at rates as low as 64 kb/s) to 4 K×2 K (coded at rates as high as 240 Mb/s). Profile and level combinations important to professional video are summarized in Table 48.2.


| Level | Typ. image format | Typ. frame rate [Hz] | Max. bit rate [b/s] |
|---|---|---|---|
| L1 | 176×144 | 15 | 64 k |
| L1b | 176×144 | 15 | 128 k |
| L1.1 | 352×288 or 176×144 | 7.5 or 30 | 192 k |
| L1.2 | 352×288 | 15 | 384 k |
| L1.3 | 352×288 | 30 | 768 k |
| L2 | 352×288 | 30 | 2 M |
| L2.1 | 352×480 or 352×576 | 30 or 25 | 4 M |
| L2.2 | SD | 15 | 4 M |
| L3.0 | SD | 30 or 50 | 10 M |
| L3.1 | 1280×720 | 30 | 14 M |
| L3.2 | 1280×720 | 60 | 20 M |
| L4.0 | 1920×1080 | 30 | 20 M |
| L4.1 | 1920×1080 | 30 | 50 M |
| L4.2 | 1920×1080 | 60 | 50 M |
| L5 | 2048×1024 | 72 or 30 | 135 M |
| L5.1 | 4096×2048 | 30 | 240 M |


Table 48.2 H.264 levels are summarized.
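As an illustration of how the levels in Table 48.2 constrain a stream, the sketch below picks the lowest level whose maximum bit rate accommodates a given stream. It uses only the bit-rate column as listed in the table; a real conformance check would also have to consider picture size, frame rate, and buffer limits, which this sketch ignores. The names `LEVEL_MAX_BIT_RATE` and `lowest_level_for_bit_rate` are hypothetical.

```python
from typing import Optional

# (level, maximum bit rate in b/s), in ascending order, as listed in Table 48.2.
LEVEL_MAX_BIT_RATE = [
    ("1", 64_000), ("1b", 128_000), ("1.1", 192_000), ("1.2", 384_000),
    ("1.3", 768_000), ("2", 2_000_000), ("2.1", 4_000_000), ("2.2", 4_000_000),
    ("3.0", 10_000_000), ("3.1", 14_000_000), ("3.2", 20_000_000),
    ("4.0", 20_000_000), ("4.1", 50_000_000), ("4.2", 50_000_000),
    ("5", 135_000_000), ("5.1", 240_000_000),
]


def lowest_level_for_bit_rate(bits_per_second: int) -> Optional[str]:
    """Return the first level whose bit-rate ceiling is at least the given rate.

    Only the bit rate is considered (an assumption of this sketch);
    returns None if the rate exceeds every level in the table.
    """
    for level, max_rate in LEVEL_MAX_BIT_RATE:
        if bits_per_second <= max_rate:
            return level
    return None


if __name__ == "__main__":
    for rate in (64_000, 8_000_000, 25_000_000, 300_000_000):
        print(f"{rate} b/s -> level {lowest_level_for_bit_rate(rate)}")
```

Because several adjacent levels share a bit-rate ceiling (2.1 and 2.2, 3.2 and 4.0, 4.1 and 4.2), bit rate alone cannot distinguish them; those levels differ in the picture sizes and frame rates they accommodate.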

Baseline and extended profiles

You might imagine a baseline profile to be decodable by every decoder. That is not the case in H.264. The baseline profile is intended to address low bit-rate applications that suffer from poor quality transmission. The flexible macroblock ordering (FMO), arbitrary slice order (ASO), and redundant slices (RS) features all contribute to robustness. Other features – in particular, B-slices – are excluded from the baseline profile, so as to achieve low computational complexity. The baseline profile is rarely used (if used at all) in professional video.

You might imagine an extended profile to have features beyond those of the main profile. That is not the case in H.264. The extended profile extends the robustness features of the baseline profile by including two additional features, data partitioning and SI and SP slices. Two additional features improve coding efficiency: B-slices, and interlaced coding (PicAFF, MBAFF). The extended profile is rarely used in professional video.


DIGITAL VIDEO AND HD ALGORITHMS AND INTERFACES
