- •Contents
- •Figures
- •Tables
- •Preface
- •Acknowledgments
- •1. Raster images
- •Aspect ratio
- •Geometry
- •Image capture
- •Digitization
- •Perceptual uniformity
- •Colour
- •Luma and colour difference components
- •Digital image representation
- •Square sampling
- •Comparison of aspect ratios
- •Aspect ratio
- •Frame rates
- •Image state
- •EOCF standards
- •Entertainment programming
- •Acquisition
- •Consumer origination
- •Consumer electronics (CE) display
- •Contrast
- •Contrast ratio
- •Perceptual uniformity
- •The “code 100” problem and nonlinear image coding
- •Linear and nonlinear
- •4. Quantization
- •Linearity
- •Decibels
- •Noise, signal, sensitivity
- •Quantization error
- •Full-swing
- •Studio-swing (footroom and headroom)
- •Interface offset
- •Processing coding
- •Two’s complement wrap-around
- •Perceptual attributes
- •History of display signal processing
- •Digital driving levels
- •Relationship between signal and lightness
- •Algorithm
- •Black level setting
- •Effect of contrast and brightness on contrast and brightness
- •An alternate interpretation
- •Brightness and contrast controls in LCDs
- •Brightness and contrast controls in PDPs
- •Brightness and contrast controls in desktop graphics
- •Symbolic image description
- •Raster images
- •Conversion among types
- •Image files
- •“Resolution” in computer graphics
- •7. Image structure
- •Image reconstruction
- •Sampling aperture
- •Spot profile
- •Box distribution
- •Gaussian distribution
- •8. Raster scanning
- •Flicker, refresh rate, and frame rate
- •Introduction to scanning
- •Scanning parameters
- •Interlaced format
- •Interlace and progressive
- •Scanning notation
- •Motion portrayal
- •Segmented-frame (24PsF)
- •Video system taxonomy
- •Conversion among systems
- •9. Resolution
- •Magnitude frequency response and bandwidth
- •Visual acuity
- •Viewing distance and angle
- •Kell effect
- •Resolution
- •Resolution in video
- •Viewing distance
- •Interlace revisited
- •10. Constant luminance
- •The principle of constant luminance
- •Compensating for the CRT
- •Departure from constant luminance
- •Luma
- •“Leakage” of luminance into chroma
- •11. Picture rendering
- •Surround effect
- •Tone scale alteration
- •Incorporation of rendering
- •Rendering in desktop computing
- •Luma
- •Sloppy use of the term luminance
- •Colour difference coding (chroma)
- •Chroma subsampling
- •Chroma subsampling notation
- •Chroma subsampling filters
- •Chroma in composite NTSC and PAL
- •Scanning standards
- •Widescreen (16:9) SD
- •Square and nonsquare sampling
- •Resampling
- •NTSC and PAL encoding
- •NTSC and PAL decoding
- •S-video interface
- •Frequency interleaving
- •Composite analog SD
- •15. Introduction to HD
- •HD scanning
- •Colour coding for BT.709 HD
- •Data compression
- •Image compression
- •Lossy compression
- •JPEG
- •Motion-JPEG
- •JPEG 2000
- •Mezzanine compression
- •MPEG
- •Picture coding types (I, P, B)
- •Reordering
- •MPEG-1
- •MPEG-2
- •Other MPEGs
- •MPEG IMX
- •MPEG-4
- •AVC-Intra
- •WM9, WM10, VC-1 codecs
- •Compression for CE acquisition
- •AVCHD
- •Compression for IP transport to consumers
- •VP8 (“WebM”) codec
- •Dirac (basic)
- •17. Streams and files
- •Historical overview
- •Physical layer
- •Stream interfaces
- •IEEE 1394 (FireWire, i.LINK)
- •HTTP live streaming (HLS)
- •18. Metadata
- •Metadata Example 1: CD-DA
- •Metadata Example 2: .yuv files
- •Metadata Example 3: RFF
- •Metadata Example 4: JPEG/JFIF
- •Metadata Example 5: Sequence display extension
- •Conclusions
- •19. Stereoscopic (“3-D”) video
- •Acquisition
- •S3D display
- •Anaglyph
- •Temporal multiplexing
- •Polarization
- •Wavelength multiplexing (Infitec/Dolby)
- •Autostereoscopic displays
- •Parallax barrier display
- •Lenticular display
- •Recording and compression
- •Consumer interface and display
- •Ghosting
- •Vergence and accommodation
- •20. Filtering and sampling
- •Sampling theorem
- •Sampling at exactly 0.5fS
- •Magnitude frequency response
- •Magnitude frequency response of a boxcar
- •The sinc weighting function
- •Frequency response of point sampling
- •Fourier transform pairs
- •Analog filters
- •Digital filters
- •Impulse response
- •Finite impulse response (FIR) filters
- •Physical realizability of a filter
- •Phase response (group delay)
- •Infinite impulse response (IIR) filters
- •Lowpass filter
- •Digital filter design
- •Reconstruction
- •Reconstruction close to 0.5fS
- •“(sin x)/x” correction
- •Further reading
- •2:1 downsampling
- •Oversampling
- •Interpolation
- •Lagrange interpolation
- •Lagrange interpolation as filtering
- •Polyphase interpolators
- •Polyphase taps and phases
- •Implementing polyphase interpolators
- •Decimation
- •Lowpass filtering in decimation
- •Spatial frequency domain
- •Comb filtering
- •Spatial filtering
- •Image presampling filters
- •Image reconstruction filters
- •Spatial (2-D) oversampling
- •Retina
- •Adaptation
- •Contrast sensitivity
- •Contrast sensitivity function (CSF)
- •24. Luminance and lightness
- •Radiance, intensity
- •Luminance
- •Relative luminance
- •Luminance from red, green, and blue
- •Lightness (CIE L*)
- •Fundamentals of vision
- •Definitions
- •Spectral power distribution (SPD) and tristimulus
- •Spectral constraints
- •CIE XYZ tristimulus
- •CIE [x, y] chromaticity
- •Blackbody radiation
- •Colour temperature
- •White
- •Chromatic adaptation
- •Perceptually uniform colour spaces
- •CIE L*a*b* (CIELAB)
- •CIE L*u*v* and CIE L*a*b* summary
- •Colour specification and colour image coding
- •Further reading
- •Additive reproduction (RGB)
- •Characterization of RGB primaries
- •BT.709 primaries
- •Legacy SD primaries
- •sRGB system
- •SMPTE Free Scale (FS) primaries
- •AMPAS ACES primaries
- •SMPTE/DCI P3 primaries
- •CMFs and SPDs
- •Normalization and scaling
- •Luminance coefficients
- •Transformations between RGB and CIE XYZ
- •Noise due to matrixing
- •Transforms among RGB systems
- •Camera white reference
- •Display white reference
- •Gamut
- •Wide-gamut reproduction
- •Free Scale Gamut, Free Scale Log (FS-Gamut, FS-Log)
- •Further reading
- •27. Gamma
- •Gamma in CRT physics
- •The amazing coincidence!
- •Gamma in video
- •Opto-electronic conversion functions (OECFs)
- •BT.709 OECF
- •SMPTE 240M OECF
- •sRGB transfer function
- •Transfer functions in SD
- •Bit depth requirements
- •Gamma in modern display devices
- •Estimating gamma
- •Gamma in video, CGI, and Macintosh
- •Gamma in computer graphics
- •Gamma in pseudocolour
- •Limitations of 8-bit linear coding
- •Linear and nonlinear coding in CGI
- •Colour acuity
- •RGB and R’G’B’ colour cubes
- •Conventional luma/colour difference coding
- •Luminance and luma notation
- •Nonlinear red, green, blue (R’G’B’)
- •BT.601 luma
- •BT.709 luma
- •Chroma subsampling, revisited
- •Luma/colour difference summary
- •SD and HD luma chaos
- •Luma/colour difference component sets
- •B’-Y’, R’-Y’ components for SD
- •PBPR components for SD
- •CBCR components for SD
- •Y’CBCR from studio RGB
- •Y’CBCR from computer RGB
- •“Full-swing” Y’CBCR
- •Y’UV, Y’IQ confusion
- •B’-Y’, R’-Y’ components for BT.709 HD
- •PBPR components for BT.709 HD
- •CBCR components for BT.709 HD
- •CBCR components for xvYCC
- •Y’CBCR from studio RGB
- •Y’CBCR from computer RGB
- •Conversions between HD and SD
- •Colour coding standards
- •31. Video signal processing
- •Edge treatment
- •Transition samples
- •Picture lines
- •Choice of SAL and SPW parameters
- •Video levels
- •Setup (pedestal)
- •BT.601 to computing
- •Enhancement
- •Median filtering
- •Coring
- •Chroma transition improvement (CTI)
- •Mixing and keying
- •Field rate
- •Line rate
- •Sound subcarrier
- •Addition of composite colour
- •NTSC colour subcarrier
- •576i PAL colour subcarrier
- •4fSC sampling
- •Common sampling rate
- •Numerology of HD scanning
- •Audio rates
- •33. Timecode
- •Introduction
- •Dropframe timecode
- •Editing
- •Linear timecode (LTC)
- •Vertical interval timecode (VITC)
- •Timecode structure
- •Further reading
- •34. 2-3 pulldown
- •2-3-3-2 pulldown
- •Conversion of film to different frame rates
- •Native 24 Hz coding
- •Conversion to other rates
- •Spatial domain
- •Vertical-temporal domain
- •Motion adaptivity
- •Further reading
- •36. Colourbars
- •SD colourbars
- •SD colourbar notation
- •Pluge element
- •Composite decoder adjustment using colourbars
- •-I, +Q, and Pluge elements in SD colourbars
- •HD colourbars
- •References
- •38. SDI and HD-SDI interfaces
- •Component digital SD interface (BT.601)
- •Serial digital interface (SDI)
- •Component digital HD-SDI
- •SDI and HD-SDI sync, TRS, and ancillary data
- •Analog sync and digital/analog timing relationships
- •Ancillary data
- •SDI coding
- •HD-SDI coding
- •Interfaces for compressed video
- •SDTI
- •Switching and mixing
- •Timing in digital facilities
- •Summary of digital interfaces
- •39. 480i component video
- •Frame rate
- •Interlace
- •Line sync
- •Field/frame sync
- •R’G’B’ EOCF and primaries
- •Luma (Y’)
- •Picture center, aspect ratio, and blanking
- •Halfline blanking
- •Component digital 4:2:2 interface
- •Component analog R’G’B’ interface
- •Component analog Y’PBPR interface, EBU N10
- •Component analog Y’PBPR interface, industry standard
- •40. 576i component video
- •Frame rate
- •Interlace
- •Line sync
- •Analog field/frame sync
- •R’G’B’ EOCF and primaries
- •Luma (Y’)
- •Picture center, aspect ratio, and blanking
- •Component digital 4:2:2 interface
- •Component analog 576i interface
- •Scanning
- •Analog sync
- •Picture center, aspect ratio, and blanking
- •R’G’B’ EOCF and primaries
- •Luma (Y’)
- •Component digital 4:2:2 interface
- •Scanning
- •Analog sync
- •Picture center, aspect ratio, and blanking
- •R’G’B’ EOCF and primaries
- •Luma (Y’)
- •Component digital 4:2:2 interface
- •43. HD videotape
- •HDCAM (D-11)
- •DVCPRO HD (D-12)
- •HDCAM SR (D-16)
- •JPEG blocks and MCUs
- •JPEG block diagram
- •Level shifting
- •Discrete cosine transform (DCT)
- •JPEG encoding example
- •JPEG decoding
- •Compression ratio control
- •JPEG/JFIF
- •Motion-JPEG (M-JPEG)
- •Further reading
- •46. DV compression
- •DV chroma subsampling
- •DV frame/field modes
- •Picture-in-shuttle in DV
- •DV overflow scheme
- •DV quantization
- •DV digital interface (DIF)
- •Consumer DV recording
- •Professional DV variants
- •47. MPEG-2 video compression
- •MPEG-2 profiles and levels
- •Picture structure
- •Frame rate and 2-3 pulldown in MPEG
- •Luma and chroma sampling structures
- •Macroblocks
- •Picture coding types – I, P, B
- •Prediction
- •Motion vectors (MVs)
- •Coding of a block
- •Frame and field DCT types
- •Zigzag and VLE
- •Refresh
- •Motion estimation
- •Rate control and buffer management
- •Bitstream syntax
- •Transport
- •Further reading
- •48. H.264 video compression
- •Algorithmic features, profiles, and levels
- •Baseline and extended profiles
- •High profiles
- •Hierarchy
- •Multiple reference pictures
- •Slices
- •Spatial intra prediction
- •Flexible motion compensation
- •Quarter-pel motion-compensated interpolation
- •Weighting and offsetting of MC prediction
- •16-bit integer transform
- •Quantizer
- •Variable-length coding
- •Context adaptivity
- •CABAC
- •Deblocking filter
- •Buffer control
- •Scalable video coding (SVC)
- •Multiview video coding (MVC)
- •AVC-Intra
- •Further reading
- •49. VP8 compression
- •Algorithmic features
- •Further reading
- •Elementary stream (ES)
- •Packetized elementary stream (PES)
- •MPEG-2 program stream
- •MPEG-2 transport stream
- •System clock
- •Further reading
- •Japan
- •United States
- •ATSC modulation
- •Europe
- •Further reading
- •Appendices
- •Cement vs. concrete
- •True CIE luminance
- •The misinterpretation of luminance
- •The enshrining of luma
- •Colour difference scale factors
- •Conclusion: A plea
- •Radiometry
- •Photometry
- •Light level examples
- •Image science
- •Units
- •Further reading
- •Glossary
- •Index
- •About the author
MOD is reported to stand for MPEG on disk.
Elementary stream (ES)
A coder – audio or video – produces a stream of bytes known as an elementary stream. The previous chapter outlines the information that is encoded into a video elementary stream. (Audio encoding is outside the scope of this book.)
MPEG also provides for private streams, which carry data other than audio and video.
Packetized elementary stream (PES)
An elementary stream is packetized into a packetized elementary stream (PES): a sequence of variable-length packets, each beginning with a start code and a header that identifies the stream and can carry timestamps. (It is the transport stream packet, described later in this chapter, that is fixed at 188 bytes and begins with MPEG’s sync byte, valued 47h. Some channel-coding systems extend each transport packet to 204 bytes, expecting the channel coder to overwrite the final 16 bytes with error-correction parity; in DVB channel coding, the sync byte of every eighth packet is bit-inverted to B8h.)
MPEG-2 program stream
An MPEG-2 program stream (PS) is a relatively simple mechanism to multiplex the video and audio of a single program for storage or transmission on relatively error-free media such as computer disks or digital optical media. PS packets are variable-length; packets of 1 KByte or 2 KBytes are typical, though a packet can be as long as 64 KBytes. MPEG-2 program streams are used in applications such as these:
•DVD media uses a strict subset of MPEG-2 program stream encoding; the associated file extension is vob.
•The MOD consumer video format is essentially an MPEG-2 MP@ML SD program stream according to DVD conventions. On a computer, such files typically have extensions mpg or mpeg.
MPEG-2 transport stream
An MPEG-2 transport stream (TS) is a part of the MPEG-2 suite of standards that specifies a relatively complex mechanism of multiplexing video and audio for one or more programs into a data stream, typically having short packets, suitable for transmission through error-prone media where relatively powerful forward error-correction (FEC) is required. A transport stream is suitable for applications where a player connects to
a transmission in progress (like television), as opposed to reading a file from its beginning. For terrestrial over-the-air (OTA) or cable television, TS packets are expected to be suitably protected; however, specification of the FEC and channel coding lies outside the MPEG standards and ordinarily lies within the realm of digital television standards (for example, ATSC standards in North America, and DVB standards in Europe).
DIGITAL VIDEO AND HD ALGORITHMS AND INTERFACES
ATM: Asynchronous transfer mode, a protocol for high-performance networking.
ATSC Standard A/65, Program and System Information Protocol.
TOD is reported to stand for transport stream on disk.
I write 29.97 Hz; expressed exactly, it’s 30/1.001.
A transport stream packet (TSP) comprises 188 bytes: a 4-byte header (whose first byte has the value 47h), including a 13-bit packet identifier (PID), and 184 bytes of payload. Packet size was designed with ATM in mind: one TS packet fits into four ATM cells (48 bytes of payload each). Owing to a lack of external interfaces for program streams, a single-program transport stream (SPTS) may be used to carry one program. For some applications, a multiple-program transport stream (MPTS) is used.
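The fixed-size packet structure described above is straightforward to parse. As an illustrative sketch (the field layout follows the MPEG-2 systems standard, ISO/IEC 13818-1; the function name and dictionary keys are my own), a minimal TS header parser might look like this:

```python
def parse_ts_header(packet: bytes) -> dict:
    """Parse the 4-byte header of a 188-byte MPEG-2 transport stream packet."""
    if len(packet) != 188 or packet[0] != 0x47:
        raise ValueError("not a 188-byte TS packet starting with sync byte 47h")
    # 13-bit PID: low 5 bits of byte 1, all 8 bits of byte 2
    pid = ((packet[1] & 0x1F) << 8) | packet[2]
    # payload_unit_start_indicator: a PES packet or PSI table begins in this packet
    payload_unit_start = bool(packet[1] & 0x40)
    # 4-bit continuity counter, incremented per packet of the same PID
    continuity_counter = packet[3] & 0x0F
    return {"pid": pid,
            "payload_unit_start": payload_unit_start,
            "continuity_counter": continuity_counter}

# A synthetic packet carrying PID 0 (the program association table):
pkt = bytes([0x47, 0x40, 0x00, 0x10]) + bytes(184)
print(parse_ts_header(pkt))
# {'pid': 0, 'payload_unit_start': True, 'continuity_counter': 0}
```

A demultiplexer would apply this to every 188-byte slice of the stream, routing payload bytes by PID.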
Transport stream packets with PID 0 contain the program association table (PAT), repeated a few times per second. The PAT lists one or more PIDs of subsequent packets containing program map tables (PMTs). A PMT lists PIDs of video and audio elementary streams associated with a single program.
An ATSC DTV transport stream contains a set of packets implementing the program and system information protocol (PSIP). PSIP identifies channels and programs, and conveys time-of-day and station callsign information. PSIP enables a receiver to provide an electronic program guide (EPG).
On a computer, 188-byte transport stream packets typically have a 4-byte timestamp prepended (resulting in 192-byte packets); a file comprising a sequence of such packets typically has the extension m2t, m2ts, or just ts. MPEG-2 transport streams are used in applications such as these:
•The TOD consumer video format (essentially an MPEG-2 MP@HL HD transport stream)
•The BDAV container of Blu-ray
•H.264 compressed video
•AVCHD compressed video (in computing, the file extension mts is usual)
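Because a plain .ts file uses a 188-byte stride while an m2ts file carries 4 extra timestamp bytes ahead of each packet, software typically sniffs the stride by checking where the 47h sync byte recurs. A hedged sketch (the function name and probe count are arbitrary choices of mine, not from any standard):

```python
def sniff_packet_size(data: bytes, probes: int = 8):
    """Guess the TS packet stride: 188 (plain .ts) or 192 (.m2ts, with a
    4-byte timestamp before each 188-byte packet). We require the 47h sync
    byte to recur at the candidate stride for several consecutive packets."""
    for stride, offset in ((188, 0), (192, 4)):
        if len(data) < offset + stride:
            continue
        if all(data[offset + i * stride] == 0x47
               for i in range(probes) if offset + i * stride < len(data)):
            return stride
    return None  # neither stride matched

plain = (bytes([0x47]) + bytes(187)) * 8          # eight bare TS packets
m2ts  = (bytes(4) + bytes([0x47]) + bytes(187)) * 8  # eight 192-byte packets
print(sniff_packet_size(plain), sniff_packet_size(m2ts))  # 188 192
```

With only a handful of probes a coincidental match is possible, so real demultiplexers usually verify more packets before committing to a stride.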
CHAPTER 50 | MPEG-2 STORAGE AND TRANSPORT

System clock
Synchronization in MPEG is achieved through a system clock reference (SCR). The lowest common multiple of 25 Hz and 29.97 Hz is 30 kHz; in MPEG-2, 90 kHz was chosen as the basis for the program clock reference (PCR). A program clock value is represented in 33 bits, sufficient to provide unique PCR values over 24 hours.
MPEG system timing is based upon a 27 MHz reference clock, expressed by augmenting the PCR by a nine-bit field taking a value from 0 through 299 (27 MHz divided by 90 kHz is 300). Table 50.1 enumerates the number of PCR counts per frame at various frame rates.

Table 50.1 MPEG-2 PCR counts per frame

Frame rate [Hz]    PCR counts per frame
30                 3000
29.97              3003
25                 3600
24                 3750
Each program stream has a single reference clock. Different programs in an MPTS can have different program clocks, so provision is made for a transport stream to carry multiple independent PCRs.
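The relationships above (a 90 kHz PCR base, a nine-bit extension counting 0 through 299 at 27 MHz, and the per-frame counts of Table 50.1) can be verified in a few lines. The function names here are mine, not MPEG terminology:

```python
from fractions import Fraction

PCR_BASE_HZ = 90_000           # rate of the 33-bit PCR base field
SYSTEM_CLOCK_HZ = 27_000_000   # PCR base rate times 300

def split_pcr(ticks_27mhz: int):
    """Split a 27 MHz system clock count into (base, extension) fields,
    where the extension runs from 0 through 299."""
    return ticks_27mhz // 300, ticks_27mhz % 300

def pcr_counts_per_frame(frame_rate: Fraction) -> Fraction:
    """90 kHz PCR counts elapsed per video frame (reproduces Table 50.1)."""
    return Fraction(PCR_BASE_HZ) / frame_rate

for rate in (Fraction(30), Fraction(30000, 1001), Fraction(25), Fraction(24)):
    print(f"{float(rate):.5g} Hz -> {pcr_counts_per_frame(rate)} counts")
# 30 Hz -> 3000, 29.97 Hz -> 3003, 25 Hz -> 3600, 24 Hz -> 3750
```

Note that only 30/1.001 Hz (29.97 Hz), written exactly as a Fraction, yields the integer 3003; rounding the rate to a float first would not.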
Further reading
Chen, Xuemin (2002), Digital Video Transport System (Springer).
Whitaker, Jerry C. (2003), “DTV Service Multiplex and Transport Systems,” Chapter 13.2 in Standard Handbook of Video and Television Engineering, Fourth Edition (McGraw-Hill).
Whitaker, Jerry C. (2003), “DTV Program and System Information Protocol,” Chapter 13.4 in Standard Handbook of Video and Television Engineering, Fourth Edition (McGraw-Hill).
51. Digital television broadcasting
This chapter briefly summarizes digital television broadcasting. Most digital broadcast systems that have been standardized are based upon MPEG-2 compression, described in MPEG-2 video compression on page 513. Some cable and satellite systems use H.264.
HDTV transmission systems were conceived to deliver images of about twice the vertical and twice the horizontal resolution of SDTV – that is, about 2 megapixels – in a 6 MHz analog channel. MPEG-2 can compress 2 megapixel images at 30 frames per second to about 20 Mb/s. Modern digital modulation schemes suitable for terrestrial RF transmission have a payload of about 3.5 bits per hertz of channel bandwidth. Combining these numbers, you can see that one HDTV digital signal can be transmitted in the spectrum formerly occupied by one analog NTSC 6 MHz channel.
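The arithmetic in that paragraph can be checked directly. The raw bit rate below assumes 8-bit 4:2:2 Y'CbCr coding (16 bits per pixel) – my assumption for illustration – while the 20 Mb/s and 3.5 b/s/Hz figures are the ones quoted above:

```python
# Back-of-envelope check of the HDTV bit-budget figures.
pixels = 2_000_000        # ~2 megapixel HD image
frames_per_s = 30
bits_per_pixel_raw = 16   # 8-bit 4:2:2 Y'CbCr (assumption for illustration)

raw_rate = pixels * frames_per_s * bits_per_pixel_raw  # bits per second
compressed_rate = 20e6                                 # ~20 Mb/s after MPEG-2
channel_capacity = 6e6 * 3.5                           # 6 MHz at ~3.5 b/s/Hz

print(f"raw: {raw_rate / 1e6:.0f} Mb/s")                        # 960 Mb/s
print(f"compression ratio: {raw_rate / compressed_rate:.0f}:1")  # 48:1
print(f"channel: {channel_capacity / 1e6:.1f} Mb/s")             # 21.0 Mb/s
```

Since the ~21 Mb/s channel payload exceeds the ~20 Mb/s compressed rate, the HD signal indeed fits in one 6 MHz channel, with a little capacity left for audio and data.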
The basic RF parameters of the 525-line, 60-field-per-second interlaced transmission scheme have remained essentially unchanged since the introduction of black-and-white television in 1941! The modulation scheme requires that potential channels at many locations remain unused, owing to potential interference into other channels; these unused channels were called taboo channels. Digital television transmission takes advantage of half a century of technological improvements in modulation systems. The modulation system chosen allows very low power, which has two major consequences: it minimizes interference from digital transmitters into NTSC or PAL, and it allows the formerly taboo channels to be used for digital television transmission. Digital television service is thus overlaid on top of
