- •Contents
- •Figures
- •Tables
- •Preface
- •Acknowledgments
- •1. Raster images
- •Aspect ratio
- •Geometry
- •Image capture
- •Digitization
- •Perceptual uniformity
- •Colour
- •Luma and colour difference components
- •Digital image representation
- •Square sampling
- •Comparison of aspect ratios
- •Aspect ratio
- •Frame rates
- •Image state
- •EOCF standards
- •Entertainment programming
- •Acquisition
- •Consumer origination
- •Consumer electronics (CE) display
- •Contrast
- •Contrast ratio
- •Perceptual uniformity
- •The “code 100” problem and nonlinear image coding
- •Linear and nonlinear
- •4. Quantization
- •Linearity
- •Decibels
- •Noise, signal, sensitivity
- •Quantization error
- •Full-swing
- •Studio-swing (footroom and headroom)
- •Interface offset
- •Processing coding
- •Two’s complement wrap-around
- •Perceptual attributes
- •History of display signal processing
- •Digital driving levels
- •Relationship between signal and lightness
- •Algorithm
- •Black level setting
- •Effect of contrast and brightness on contrast and brightness
- •An alternate interpretation
- •Brightness and contrast controls in LCDs
- •Brightness and contrast controls in PDPs
- •Brightness and contrast controls in desktop graphics
- •Symbolic image description
- •Raster images
- •Conversion among types
- •Image files
- •“Resolution” in computer graphics
- •7. Image structure
- •Image reconstruction
- •Sampling aperture
- •Spot profile
- •Box distribution
- •Gaussian distribution
- •8. Raster scanning
- •Flicker, refresh rate, and frame rate
- •Introduction to scanning
- •Scanning parameters
- •Interlaced format
- •Interlace and progressive
- •Scanning notation
- •Motion portrayal
- •Segmented-frame (24PsF)
- •Video system taxonomy
- •Conversion among systems
- •9. Resolution
- •Magnitude frequency response and bandwidth
- •Visual acuity
- •Viewing distance and angle
- •Kell effect
- •Resolution
- •Resolution in video
- •Viewing distance
- •Interlace revisited
- •10. Constant luminance
- •The principle of constant luminance
- •Compensating for the CRT
- •Departure from constant luminance
- •Luma
- •“Leakage” of luminance into chroma
- •11. Picture rendering
- •Surround effect
- •Tone scale alteration
- •Incorporation of rendering
- •Rendering in desktop computing
- •Luma
- •Sloppy use of the term luminance
- •Colour difference coding (chroma)
- •Chroma subsampling
- •Chroma subsampling notation
- •Chroma subsampling filters
- •Chroma in composite NTSC and PAL
- •Scanning standards
- •Widescreen (16:9) SD
- •Square and nonsquare sampling
- •Resampling
- •NTSC and PAL encoding
- •NTSC and PAL decoding
- •S-video interface
- •Frequency interleaving
- •Composite analog SD
- •15. Introduction to HD
- •HD scanning
- •Colour coding for BT.709 HD
- •Data compression
- •Image compression
- •Lossy compression
- •JPEG
- •Motion-JPEG
- •JPEG 2000
- •Mezzanine compression
- •MPEG
- •Picture coding types (I, P, B)
- •Reordering
- •MPEG-1
- •MPEG-2
- •Other MPEGs
- •MPEG IMX
- •MPEG-4
- •AVC-Intra
- •WM9, WM10, VC-1 codecs
- •Compression for CE acquisition
- •AVCHD
- •Compression for IP transport to consumers
- •VP8 (“WebM”) codec
- •Dirac (basic)
- •17. Streams and files
- •Historical overview
- •Physical layer
- •Stream interfaces
- •IEEE 1394 (FireWire, i.LINK)
- •HTTP live streaming (HLS)
- •18. Metadata
- •Metadata Example 1: CD-DA
- •Metadata Example 2: .yuv files
- •Metadata Example 3: RFF
- •Metadata Example 4: JPEG/JFIF
- •Metadata Example 5: Sequence display extension
- •Conclusions
- •19. Stereoscopic (“3-D”) video
- •Acquisition
- •S3D display
- •Anaglyph
- •Temporal multiplexing
- •Polarization
- •Wavelength multiplexing (Infitec/Dolby)
- •Autostereoscopic displays
- •Parallax barrier display
- •Lenticular display
- •Recording and compression
- •Consumer interface and display
- •Ghosting
- •Vergence and accommodation
- •20. Filtering and sampling
- •Sampling theorem
- •Sampling at exactly 0.5fS
- •Magnitude frequency response
- •Magnitude frequency response of a boxcar
- •The sinc weighting function
- •Frequency response of point sampling
- •Fourier transform pairs
- •Analog filters
- •Digital filters
- •Impulse response
- •Finite impulse response (FIR) filters
- •Physical realizability of a filter
- •Phase response (group delay)
- •Infinite impulse response (IIR) filters
- •Lowpass filter
- •Digital filter design
- •Reconstruction
- •Reconstruction close to 0.5fS
- •“(sin x)/x” correction
- •Further reading
- •2:1 downsampling
- •Oversampling
- •Interpolation
- •Lagrange interpolation
- •Lagrange interpolation as filtering
- •Polyphase interpolators
- •Polyphase taps and phases
- •Implementing polyphase interpolators
- •Decimation
- •Lowpass filtering in decimation
- •Spatial frequency domain
- •Comb filtering
- •Spatial filtering
- •Image presampling filters
- •Image reconstruction filters
- •Spatial (2-D) oversampling
- •Retina
- •Adaptation
- •Contrast sensitivity
- •Contrast sensitivity function (CSF)
- •24. Luminance and lightness
- •Radiance, intensity
- •Luminance
- •Relative luminance
- •Luminance from red, green, and blue
- •Lightness (CIE L*)
- •Fundamentals of vision
- •Definitions
- •Spectral power distribution (SPD) and tristimulus
- •Spectral constraints
- •CIE XYZ tristimulus
- •CIE [x, y] chromaticity
- •Blackbody radiation
- •Colour temperature
- •White
- •Chromatic adaptation
- •Perceptually uniform colour spaces
- •CIE L*a*b* (CIELAB)
- •CIE L*u*v* and CIE L*a*b* summary
- •Colour specification and colour image coding
- •Further reading
- •Additive reproduction (RGB)
- •Characterization of RGB primaries
- •BT.709 primaries
- •Legacy SD primaries
- •sRGB system
- •SMPTE Free Scale (FS) primaries
- •AMPAS ACES primaries
- •SMPTE/DCI P3 primaries
- •CMFs and SPDs
- •Normalization and scaling
- •Luminance coefficients
- •Transformations between RGB and CIE XYZ
- •Noise due to matrixing
- •Transforms among RGB systems
- •Camera white reference
- •Display white reference
- •Gamut
- •Wide-gamut reproduction
- •Free Scale Gamut, Free Scale Log (FS-Gamut, FS-Log)
- •Further reading
- •27. Gamma
- •Gamma in CRT physics
- •The amazing coincidence!
- •Gamma in video
- •Opto-electronic conversion functions (OECFs)
- •BT.709 OECF
- •SMPTE 240M OECF
- •sRGB transfer function
- •Transfer functions in SD
- •Bit depth requirements
- •Gamma in modern display devices
- •Estimating gamma
- •Gamma in video, CGI, and Macintosh
- •Gamma in computer graphics
- •Gamma in pseudocolour
- •Limitations of 8-bit linear coding
- •Linear and nonlinear coding in CGI
- •Colour acuity
- •RGB and R’G’B’ colour cubes
- •Conventional luma/colour difference coding
- •Luminance and luma notation
- •Nonlinear red, green, blue (R’G’B’)
- •BT.601 luma
- •BT.709 luma
- •Chroma subsampling, revisited
- •Luma/colour difference summary
- •SD and HD luma chaos
- •Luma/colour difference component sets
- •B’-Y’, R’-Y’ components for SD
- •PBPR components for SD
- •CBCR components for SD
- •Y’CBCR from studio RGB
- •Y’CBCR from computer RGB
- •“Full-swing” Y’CBCR
- •Y’UV, Y’IQ confusion
- •B’-Y’, R’-Y’ components for BT.709 HD
- •PBPR components for BT.709 HD
- •CBCR components for BT.709 HD
- •CBCR components for xvYCC
- •Y’CBCR from studio RGB
- •Y’CBCR from computer RGB
- •Conversions between HD and SD
- •Colour coding standards
- •31. Video signal processing
- •Edge treatment
- •Transition samples
- •Picture lines
- •Choice of SAL and SPW parameters
- •Video levels
- •Setup (pedestal)
- •BT.601 to computing
- •Enhancement
- •Median filtering
- •Coring
- •Chroma transition improvement (CTI)
- •Mixing and keying
- •Field rate
- •Line rate
- •Sound subcarrier
- •Addition of composite colour
- •NTSC colour subcarrier
- •576i PAL colour subcarrier
- •4fSC sampling
- •Common sampling rate
- •Numerology of HD scanning
- •Audio rates
- •33. Timecode
- •Introduction
- •Dropframe timecode
- •Editing
- •Linear timecode (LTC)
- •Vertical interval timecode (VITC)
- •Timecode structure
- •Further reading
- •34. 2-3 pulldown
- •2-3-3-2 pulldown
- •Conversion of film to different frame rates
- •Native 24 Hz coding
- •Conversion to other rates
- •Spatial domain
- •Vertical-temporal domain
- •Motion adaptivity
- •Further reading
- •36. Colourbars
- •SD colourbars
- •SD colourbar notation
- •Pluge element
- •Composite decoder adjustment using colourbars
- •-I, +Q, and Pluge elements in SD colourbars
- •HD colourbars
- •References
- •38. SDI and HD-SDI interfaces
- •Component digital SD interface (BT.601)
- •Serial digital interface (SDI)
- •Component digital HD-SDI
- •SDI and HD-SDI sync, TRS, and ancillary data
- •Analog sync and digital/analog timing relationships
- •Ancillary data
- •SDI coding
- •HD-SDI coding
- •Interfaces for compressed video
- •SDTI
- •Switching and mixing
- •Timing in digital facilities
- •Summary of digital interfaces
- •39. 480i component video
- •Frame rate
- •Interlace
- •Line sync
- •Field/frame sync
- •R’G’B’ EOCF and primaries
- •Luma (Y’)
- •Picture center, aspect ratio, and blanking
- •Halfline blanking
- •Component digital 4:2:2 interface
- •Component analog R’G’B’ interface
- •Component analog Y’PBPR interface, EBU N10
- •Component analog Y’PBPR interface, industry standard
- •40. 576i component video
- •Frame rate
- •Interlace
- •Line sync
- •Analog field/frame sync
- •R’G’B’ EOCF and primaries
- •Luma (Y’)
- •Picture center, aspect ratio, and blanking
- •Component digital 4:2:2 interface
- •Component analog 576i interface
- •Scanning
- •Analog sync
- •Picture center, aspect ratio, and blanking
- •R’G’B’ EOCF and primaries
- •Luma (Y’)
- •Component digital 4:2:2 interface
- •Scanning
- •Analog sync
- •Picture center, aspect ratio, and blanking
- •R’G’B’ EOCF and primaries
- •Luma (Y’)
- •Component digital 4:2:2 interface
- •43. HD videotape
- •HDCAM (D-11)
- •DVCPRO HD (D-12)
- •HDCAM SR (D-16)
- •JPEG blocks and MCUs
- •JPEG block diagram
- •Level shifting
- •Discrete cosine transform (DCT)
- •JPEG encoding example
- •JPEG decoding
- •Compression ratio control
- •JPEG/JFIF
- •Motion-JPEG (M-JPEG)
- •Further reading
- •46. DV compression
- •DV chroma subsampling
- •DV frame/field modes
- •Picture-in-shuttle in DV
- •DV overflow scheme
- •DV quantization
- •DV digital interface (DIF)
- •Consumer DV recording
- •Professional DV variants
- •47. MPEG-2 video compression
- •MPEG-2 profiles and levels
- •Picture structure
- •Frame rate and 2-3 pulldown in MPEG
- •Luma and chroma sampling structures
- •Macroblocks
- •Picture coding types – I, P, B
- •Prediction
- •Motion vectors (MVs)
- •Coding of a block
- •Frame and field DCT types
- •Zigzag and VLE
- •Refresh
- •Motion estimation
- •Rate control and buffer management
- •Bitstream syntax
- •Transport
- •Further reading
- •48. H.264 video compression
- •Algorithmic features, profiles, and levels
- •Baseline and extended profiles
- •High profiles
- •Hierarchy
- •Multiple reference pictures
- •Slices
- •Spatial intra prediction
- •Flexible motion compensation
- •Quarter-pel motion-compensated interpolation
- •Weighting and offsetting of MC prediction
- •16-bit integer transform
- •Quantizer
- •Variable-length coding
- •Context adaptivity
- •CABAC
- •Deblocking filter
- •Buffer control
- •Scalable video coding (SVC)
- •Multiview video coding (MVC)
- •AVC-Intra
- •Further reading
- •49. VP8 compression
- •Algorithmic features
- •Further reading
- •Elementary stream (ES)
- •Packetized elementary stream (PES)
- •MPEG-2 program stream
- •MPEG-2 transport stream
- •System clock
- •Further reading
- •Japan
- •United States
- •ATSC modulation
- •Europe
- •Further reading
- •Appendices
- •Cement vs. concrete
- •True CIE luminance
- •The misinterpretation of luminance
- •The enshrining of luma
- •Colour difference scale factors
- •Conclusion: A plea
- •Radiometry
- •Photometry
- •Light level examples
- •Image science
- •Units
- •Further reading
- •Glossary
- •Index
- •About the author
In video, codeword (or codepoint) refers to a combination of three integer values such as [R’, G’, B’] or [Y’, CB, CR].
RGB and R’G’B’ colour cubes
Red, green, and blue tristimulus (linear light) primary components, as detailed in Colour science for video, on page 287, can be considered to be the coordinates of a three-dimensional colour space. Coordinate values between zero and unity define the unit cube of this space, as sketched at the top of Figure 28.1 opposite. Linear-light coding is used in CGI, where physical light is simulated. However, as I explained in the previous chapter, Gamma in video, 8-bit linear-light coding exhibits poor perceptual performance: 12 or 14 bits per component are necessary to achieve excellent quality. The best perceptual use is made of a limited number of bits by using nonlinear coding that mimics the nonlinear lightness response of human vision. As introduced on page 27, and detailed in Chapter 27 Gamma, on page 315, in video, JPEG, MPEG, computing, digital still photography, and in many other domains a nonlinear transfer function is applied to RGB tristimulus signals to give nonlinearly coded (gamma-corrected) components, denoted with prime symbols: R’G’B’. Excellent image quality is obtained with 10-bit nonlinear coding with a transfer function similar to that of BT.709 or sRGB.
In PC graphics, 8-bit nonlinear coding is common: Each of R’, G’, and B’ ranges from 0 through 255, inclusive, following the quantizer transfer function sketched in Figure 4.1, on page 37. The resulting R’G’B’ cube is sketched at the bottom of Figure 28.1 opposite. A total of 2²⁴ colours – that is, 16,777,216 colours – are representable. Not all of them can be distinguished visually; not all are perceptually useful; but they are all colours. Studio video uses headroom and footroom, as explained in Studio-swing (footroom and headroom), on page 42: 8-bit R’G’B’ has 219 codes between black and white, for a total of 220³ or 10,648,000 codewords.
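The two codeword totals quoted above can be checked with a few lines of arithmetic. This is only an illustrative sketch; the variable names are invented for the example.

```python
# Codes per component for 8-bit R'G'B' in the two conventions described above.
full_swing_codes = 256    # codes 0 through 255, inclusive
studio_swing_codes = 220  # 219 steps from black to white gives 220 codes

# Each codeword is a triple, so the totals are the cubes of those counts.
full_swing_total = full_swing_codes ** 3
studio_swing_total = studio_swing_codes ** 3

print(full_swing_total)    # 16777216
print(studio_swing_total)  # 10648000
```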
The drawback of conveying R’G’B’ components of an image is that each component requires relatively high spatial resolution: Transmission or storage of a colour image using R’G’B’ components requires a capacity three times that of a greyscale image. Human vision has considerably less spatial acuity for colour information than for lightness. Owing to the poor colour acuity of vision, a colour image can be coded into a wideband monochrome component representing lightness, and two narrowband components carrying colour information, each having substantially less spatial resolution than lightness. In analog video, each colour channel has bandwidth typically one-third that of the monochrome channel. In digital video, each colour channel has half the data rate (or data capacity) of the monochrome channel, or less. There is strong evidence that the human visual system forms an achromatic channel and two chromatic colour-difference channels at the retina.
CHAPTER 28 | LUMA AND COLOUR DIFFERENCES | 337
Here the term colour difference refers to a signal formed as the difference of two gamma-corrected colour components. In other contexts, the term can refer to a numerical measure of the perceptual distance between two colours.
I introduced interface offsets on page 44.
Green dominates luminance: Between 60% and 70% of luminance comprises green information. Signal-to-noise ratio is maximized if the colour signals on the other two components are chosen to be blue and red. The simplest way to “remove” lightness from blue and red is to subtract it, to form a pair of colour difference (or loosely, chroma) components.
The monochrome component in colour video could have been based upon the luminance of colour science (a weighted sum of R, G, and B). Instead, as I explained in Constant luminance, on page 107, luma is formed as a weighted sum of R’, G’, and B’, using coefficients similar or identical to those that would be used to compute luminance. Expressed in abstract terms, luma ranges 0 to 1. Colour difference components B’-Y’ and R’-Y’ are bipolar; each ranges nearly ±1.
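The formation of luma and the two colour differences can be sketched as follows. The coefficients 0.299, 0.587, and 0.114 are the BT.601 (SD) luma coefficients. They are an assumption brought in for illustration, since the paragraph above does not quote them, and the function names are hypothetical.

```python
# Assumed BT.601 (SD) luma coefficients; HD (BT.709) uses different values.
KR, KG, KB = 0.299, 0.587, 0.114

def luma(r, g, b):
    """Weighted sum of gamma-corrected R', G', B', each in the range 0..1."""
    return KR * r + KG * g + KB * b

def colour_differences(r, g, b):
    """Return the pair (B'-Y', R'-Y') for gamma-corrected components."""
    y = luma(r, g, b)
    return b - y, r - y

# The extremes of each difference occur at a primary and its complement.
print(luma(1, 1, 1))                # white: luma of 1, within rounding
print(colour_differences(0, 0, 1))  # blue:   B'-Y' near +0.886
print(colour_differences(1, 1, 0))  # yellow: B'-Y' near -0.886
print(colour_differences(1, 0, 0))  # red:    R'-Y' near +0.701
```

With these coefficients the differences are bipolar, with B’-Y’ reaching about ±0.886 and R’-Y’ about ±0.701.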
In component analog video, B’-Y’ and R’-Y’ are scaled to form PB and PR components. In abstract terms, these range ±0.5. Figure 28.2 shows the unit R’G’B’ cube transformed into luma and colour difference coordinates [Y’, PB, PR]. (Various interface standards are in use; see page 359.) In component digital video, B’-Y’ and R’-Y’ are scaled to form CB and CR components. In 8-bit Y’CBCR prior to the application of the interface offset, the luma axis of Figure 28.2 would be scaled by 219, and the chroma axes by 112.
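The scaling just described can be sketched in code. This is a minimal illustration that again assumes BT.601 luma coefficients. The function names are hypothetical, and the interface offset is deliberately left out, so the chroma results are signed values in the range ±112.

```python
# Assumed BT.601 (SD) luma coefficients, not quoted in the text above.
KR, KG, KB = 0.299, 0.587, 0.114

def rgb_to_ypbpr(r, g, b):
    """Map gamma-corrected R'G'B' (0..1) to Y' (0..1) and PB, PR (+/-0.5)."""
    y = KR * r + KG * g + KB * b
    pb = 0.5 * (b - y) / (1 - KB)  # B'-Y' peaks at 1-KB, scaled to 0.5
    pr = 0.5 * (r - y) / (1 - KR)  # R'-Y' peaks at 1-KR, scaled to 0.5
    return y, pb, pr

def ypbpr_to_8bit(y, pb, pr):
    """Scale to 8-bit excursions, before any interface offset is added.

    Luma spans 0..219; an abstract chroma range of +/-0.5 becomes +/-112,
    which is a multiplication by 224.
    """
    return round(219 * y), round(224 * pb), round(224 * pr)

print(ypbpr_to_8bit(*rgb_to_ypbpr(1, 1, 1)))  # white: (219, 0, 0)
print(ypbpr_to_8bit(*rgb_to_ypbpr(0, 0, 1)))  # blue:  (25, 112, -18)
```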
Once colour difference signals have been formed, they can be subsampled to reduce bandwidth or data capacity, without the observer’s noticing, as I will explain in Chroma subsampling, revisited, on page 347.
338 |
DIGITAL VIDEO AND HD ALGORITHMS AND INTERFACES |
Figure 28.2 A Y’PBPR cube is formed when R’, G’, and B’ are subject to a particular 3×3 matrix transform. The valid R’G’B’ unit cube occupies about one-quarter of the volume of the Y’PBPR unit cube. (The volume of the Y’PBPR unit cube, the outer boundary of this sketch, is the same as the volume of the R’G’B’ cube in Figure 28.1 on page 336; however, the useful codes occupy only the central parallelepiped here.) Luma and colour difference coding incurs a penalty in signal-to-noise ratio, but this disadvantage is compensated by the opportunity to subsample.
[Sketch: Y’ axis from 0 to 1; PB axis and PR axis each from -0.5 to +0.5. Labelled points: reference black, reference white, and the R, G, B, Cy, Mg, and Yl corners of the R’G’B’ cube.]
Izraelevitz, David, and Joshua L. Koslov (1982), “Code utilization for component-coded digital video,” in Tomorrow’s Television (Proc. 16th Annual SMPTE Television Conference): 22–30.
$$\frac{\tfrac{1}{4}\cdot 220\cdot 225^{2}}{220^{3}} = \frac{2\,784\,375}{10\,648\,000} \approx 0.261$$
It is evident from Figure 28.2 that when R’G’B’ signals are transformed into the Y’PBPR space of analog video, the unit R’G’B’ cube occupies only part of the volume of the unit Y’PBPR cube: Only 1⁄4 of the Y’PBPR volume corresponds to R’G’B’ values all between 0 and 1. Consequently, Y’PBPR exhibits a loss of signal-to-noise ratio compared to R’G’B’. However, this disadvantage is offset by the opportunity to subsample.
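One way to check the volume fraction is to note that a linear transform scales volume by the absolute value of its determinant, and that the Y’PBPR unit cube (Y’ from 0 to 1, PB and PR each ±0.5) has unit volume. The sketch below assumes BT.601 luma coefficients, which the text does not quote. The matrix `M` and the helper `det3` are written out for illustration only.

```python
# Assumed BT.601 (SD) luma coefficients.
KR, KG, KB = 0.299, 0.587, 0.114

# Rows give Y', PB, and PR as weighted sums of R', G', B'.
M = [
    [KR, KG, KB],
    [0.5 * -KR / (1 - KB), 0.5 * -KG / (1 - KB), 0.5],
    [0.5, 0.5 * -KG / (1 - KR), 0.5 * -KB / (1 - KR)],
]

def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

# The unit R'G'B' cube maps to a parallelepiped of volume |det M|.
volume_fraction = abs(det3(M))
print(round(volume_fraction, 3))  # 0.236
```

Under these assumptions the fraction comes out slightly below one-quarter, which is consistent with treating 1⁄4 as a round figure.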
In a legal signal, no component exceeds its reference excursion. Signal combinations that are R’G’B’-legal are termed valid. Signals within the Y’PBPR unit cube are Y’PBPR-legal. However, about 3⁄4 of these combinations correspond to R’G’B’ combinations outside the R’G’B’ unit cube: Although legal, these Y’PBPR combinations are invalid – that is, they are R’G’B’-illegal.
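The distinction between legal and valid can be expressed as two small tests. This is a sketch only: the function names and the tolerance `EPS` are hypothetical, and BT.601 luma coefficients are assumed for the inverse transform.

```python
# Assumed BT.601 (SD) luma coefficients.
KR, KG, KB = 0.299, 0.587, 0.114
EPS = 1e-9  # tolerance for floating-point rounding at the cube faces

def ypbpr_to_rgb(y, pb, pr):
    """Invert the Y'PBPR transform to recover gamma-corrected R', G', B'."""
    r = y + 2 * (1 - KR) * pr
    b = y + 2 * (1 - KB) * pb
    g = (y - KR * r - KB * b) / KG
    return r, g, b

def is_legal(y, pb, pr):
    """Legal: no Y'PBPR component exceeds its reference excursion."""
    return 0 <= y <= 1 and -0.5 <= pb <= 0.5 and -0.5 <= pr <= 0.5

def is_valid(y, pb, pr):
    """Valid: the corresponding R'G'B' lies inside the unit cube."""
    return all(-EPS <= c <= 1 + EPS for c in ypbpr_to_rgb(y, pb, pr))

print(is_legal(0.5, 0.0, 0.0), is_valid(0.5, 0.0, 0.0))  # True True  (mid grey)
print(is_legal(0.5, 0.5, 0.5), is_valid(0.5, 0.5, 0.5))  # True False (R' > 1)
```

The second triple lies inside the Y’PBPR unit cube, so it is legal, but it decodes to a red component of about 1.2, so it is invalid.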
In digital video, we refer to codewords instead of combinations. There are about 2.75 million valid codewords in 8-bit Y’CBCR, compared to 10.6 million in 8-bit studio R’G’B’. If R’G’B’ is transcoded to 8-bit Y’CBCR, then transcoded back to R’G’B’, the resulting R’G’B’ cannot have any more than 2.75 million colours.
Figure 28.3 Y’, B’–Y’, R’–Y’ orthographic views. The unit R’G’B’ cube transformed into Y’PBPR coordinates reveals, at the upper left, the hexagonal form familiar to video engineers from vectorscope displays. The side view [Y’, R’-Y’] to the right of the hexagon, and [Y’, B’-Y’] below, are related to the lightning displays used in component video display equipment. PB and PR axis values are indicated; these components are scaled from B’-Y’ and R’-Y’ as described in Y’PBPR, on page 123. It is apparent from this diagram that the R’G’B’ prism occupies a small fraction – it turns out to be 1⁄4 – of the volume of the Y’PBPR cube. This diagram is derived from SD luma coefficients; sadly, HD differs.
[Three views: B’-Y’ axis from -1 to +1 with ±0.886 marked; R’-Y’ axis from -1 to +1 with ±0.701 marked; PB and PR scales each from -0.5 to +0.5; Y’ axis from 0 to +1. Labelled points: Bk, Wt, R, G, B, Cy, Mg, Yl.]
CBCR components are comparable to PBPR components, but have codeword values ranging ±112 on the 8-bit scale instead of abstract values ranging ±0.5.
In Figure 28.2, the Y’PBPR cube is portrayed off-axis. Figure 28.3 shows three orthographic views of the R’G’B’ prism in Y’PBPR-space. The luma axis, denoted Y’, ranges 0 to 1. The chroma axes are annotated with both [B’-Y’, R’-Y’] scaling (where the components range ±0.886 and ±0.701, respectively), and PBPR scaling (where the components both range ±0.5). The extent of the volume of Y’PBPR space that lies outside the R’G’B’ prism is apparent. The emergent xvYCC system, to be described, uses Y’CBCR codewords outside the unit R’G’B’ prism – that is, formerly “invalid” codewords – to convey wide-gamut colour.
