
APPLICATIONS AND DIRECTIONS
Table 8.3 MPEG-4 Visual CODECs (information not guaranteed to be correct)

| Company | HW/SW | Profiles | Performance | Comments |
|---|---|---|---|---|
| Amphion (www.amphion.com) | HW | SP | L0-L3 | SoC modules, plus HW accelerators |
| Dicas (www.dicas.de) | SW | SP, ASP, Core | Up to 2048 × 2048/60 fps? | Implements binary shape functionalities |
| DivX (www.divx.com) | SW | SP, ASP | All levels | Now compatible with MPEG-4 File Format |
| Emblaze (www.emblaze.com) | HW | SP | QCIF/up to 15 fps encode, 30 fps decode | Based on ARM920T core, suitable for mobile applications |
| EnQuad (www.enquad.com) | SW? | Core? | 30 fps? | No product details available |
| Envivio (www.envivio.com) | HW/SW | SP and ASP | L0-L5 | SW and HW versions |
| Equator (www.equator.com) | SW | SP | ? | Decoder (running on Equator’s BSP-15 processor) |
| Hantro (www.hantro.com) | HW/SW | SP | L0-L3 | SW and HW versions |
| iVast (www.ivast.com) | SW | SP/ASP | L0-L3 | |
| Prodys (www.prodys.com) | SW | SP/ASP/Core | L0-L4 (ASP) | Implemented on Texas Instruments TMS320C64x processor. Does not implement binary shape coding. |
| Sciworx (www.sci-worx.com) | HW/SW | SP | QCIF/15 fps (encoder) | Embedded processor solution (partitioned between hardware and software) |
| Toshiba (www.toshiba.com/taec/) | HW | SP | QCIF/15 fps encode + decode | Single chip including audio and multiplex |
| IndigoVision (www.indigovision.com) | HW | SP | L1-L3 | SoC modules |
| 3ivx (www.3ivx.com) | SW | SP/ASP | ? | Embedded version of decoder available |
| UBVideo (www.ubvideo.com) | SW | SP | Up to L3 | PC and DSP implementations |

for H.264 will be broadcast-quality streamed or stored video, replacing existing technology in ‘higher-end’ applications such as broadcast television or video storage.
8.5 COMMERCIAL ISSUES
For a developer of a video communication product, there are a number of important commercial issues that must be taken into account in addition to the technical features and performance issues discussed in Chapters 5, 6 and 7.
Table 8.4 H.264 CODECs (information not guaranteed to be correct)

| Company | HW/SW | Supports | Performance | Comments |
|---|---|---|---|---|
| VideoLocus (www.videolocus.com) | SW/HW | Main profile | 30 fps/4CIF, up to level 3 | HW/SW encoder; SW decoder; sub-8 × 8 motion compensation not yet supported |
| UBVideo (www.ubvideo.com) | SW | Main profile | 30 fps/4CIF | DSP implementation (Texas Instruments TMS320C64x) |
| Vanguard Software Solutions (www.vsofts.com) | SW | ? | ? | Downloadable Windows CODEC |
| Sand Video (www.sandvideo.com) | HW | Main profile | Supports high definition (1920 × 1080) | Decoder |
| HHI (www.hhi.de) | SW | Main profile | ? | Not real-time yet? |
| Envivio (www.envivio.com) | SW/HW | Main profile | D1 resolution encoding and decoding in real time (HW) | Due to be released later in 2003 |
| Equator (www.equator.com) | SW | ? | ? | Implementation for BSP-15 processor; no details available |
| DemoGraFX (www.demografx.com/) | SW/HW | ? | ? | Advance information: encoder and decoder will include optional proprietary ‘extensions’ to H.264 |
| Polycom (www.polycom.com) | HW? | ? | ? | Details not yet available |
| STMicroelectronics (us.st.com) | HW | ? | ? | Advance information: encoder and decoder running on Nomadik media processor platform |
| MainConcept (www.mainconcept.com) | SW | ? | ? | Advance information, few details available |
| Impact Labs Inc. (www.impactlabs.com) | SW | ? | ? | Advance information, few details available |

8.5.1 Open Standards?
MPEG-4 and H.264 are ‘open’ international standards, i.e. any individual or organisation can purchase the standards documents from the ISO/IEC or ITU-T. This means that the standards have the potential to stimulate the development of innovative, competitive solutions conforming to open specifications. The documents specify exactly what is required for conformance, making it possible for anyone to develop a conforming encoder or decoder. At the same time, there is scope for a developer to optimise the encoder and/or the decoder, for example to provide enhanced visual quality or to take best advantage of a particular implementation platform.

There are, however, some factors that work against this apparent openness. Companies or organisations within the Experts Groups have the potential to influence the standardisation process and (in the case of MPEG) have privileged access to documents (such as draft versions of new standards) that may give them a significant market lead. The standards are not easily approachable by non-experts, which makes for a steep learning curve for newcomers to the field. Finally, there are tens of thousands of patents related to image and video coding and it is not considered possible to implement one of the more recent standards without potentially infringing patent rights. In the case of MPEG-4 Visual (and probably the Main and Extended Profiles of H.264), this means that any commercial implementation of the standard is subject to license fee payments (see Section 8.5.2).
8.5.2 Licensing MPEG-4 Visual and H.264
Any implementation of MPEG-4 Visual will fall into the scope of a number of ‘essential’ patents. Licensing the rights to the main patents is coordinated by MPEG LA [1], a body that represents the interests of the major patent holders and is not part of MPEG or the ISO/IEC. Commercial implementation or usage of MPEG-4 Visual is subject to royalty payments (through MPEG LA) to 20 organisations that hold these patents. Royalty payments are charged depending on the nature of use and according to the number of encoders, decoders, subscribers and/or playbacks of coded video. Critics of the licensing scheme claim that the cost may inhibit the take-up of MPEG-4 by industry but supporters claim that it helpfully clarifies the complex intellectual property situation and ensures that there are no ‘hidden costs’ to implementers.
H.264/MPEG-4 Part 10 is also subject to a number of essential patents. However, in order to make the new standard as accessible as possible, the JVT has attempted to make the Baseline Profile (see Chapter 6) ‘royalty-free’. During the standardisation process, holders of key patents were encouraged to notify JVT of their patent claims and to state whether they would permit a royalty-free license to the patent(s). These patent statements have been taken into account during the development of the Profiles with the aim of keeping the Baseline Profile free of royalty payments. As this process is voluntary and relies on the correct identification of all relevant patents prior to standardisation, it is not yet clear whether the goal of a royalty-free Profile will be realised, but initial indications are positive¹. Implementation or use of the Main and Extended Profiles (see Chapter 6) is likely to be subject to royalty payments to patent holders.
8.5.3 Capturing the Market
Defining a technology in an International Standard does not guarantee that it will be a commercial success in the marketplace. The original target application of MPEG-1, the video CD, was not a success, although the standard is still widely used for storage of coded video in PC-based applications and on web sites. MPEG-2 is clearly a worldwide success in its applications to digital television broadcasting and DVD-Video storage. The first version of MPEG-4 Visual was published in 1999 but it is still not clear whether it will become a market leading technology for video coding. The slow process of agreeing licensing terms (not
¹ In March 2003, 31 companies involved in the H.264 development process and/or holding essential patents confirmed their support for a royalty-free Baseline Profile.