- •Contents
- •Figures
- •Tables
- •Preface
- •Acknowledgments
- •1. Raster images
- •Aspect ratio
- •Geometry
- •Image capture
- •Digitization
- •Perceptual uniformity
- •Colour
- •Luma and colour difference components
- •Digital image representation
- •Square sampling
- •Comparison of aspect ratios
- •Aspect ratio
- •Frame rates
- •Image state
- •EOCF standards
- •Entertainment programming
- •Acquisition
- •Consumer origination
- •Consumer electronics (CE) display
- •Contrast
- •Contrast ratio
- •Perceptual uniformity
- •The “code 100” problem and nonlinear image coding
- •Linear and nonlinear
- •4. Quantization
- •Linearity
- •Decibels
- •Noise, signal, sensitivity
- •Quantization error
- •Full-swing
- •Studio-swing (footroom and headroom)
- •Interface offset
- •Processing coding
- •Two’s complement wrap-around
- •Perceptual attributes
- •History of display signal processing
- •Digital driving levels
- •Relationship between signal and lightness
- •Algorithm
- •Black level setting
- •Effect of contrast and brightness on contrast and brightness
- •An alternate interpretation
- •Brightness and contrast controls in LCDs
- •Brightness and contrast controls in PDPs
- •Brightness and contrast controls in desktop graphics
- •Symbolic image description
- •Raster images
- •Conversion among types
- •Image files
- •“Resolution” in computer graphics
- •7. Image structure
- •Image reconstruction
- •Sampling aperture
- •Spot profile
- •Box distribution
- •Gaussian distribution
- •8. Raster scanning
- •Flicker, refresh rate, and frame rate
- •Introduction to scanning
- •Scanning parameters
- •Interlaced format
- •Interlace and progressive
- •Scanning notation
- •Motion portrayal
- •Segmented-frame (24PsF)
- •Video system taxonomy
- •Conversion among systems
- •9. Resolution
- •Magnitude frequency response and bandwidth
- •Visual acuity
- •Viewing distance and angle
- •Kell effect
- •Resolution
- •Resolution in video
- •Viewing distance
- •Interlace revisited
- •10. Constant luminance
- •The principle of constant luminance
- •Compensating for the CRT
- •Departure from constant luminance
- •Luma
- •“Leakage” of luminance into chroma
- •11. Picture rendering
- •Surround effect
- •Tone scale alteration
- •Incorporation of rendering
- •Rendering in desktop computing
- •Luma
- •Sloppy use of the term luminance
- •Colour difference coding (chroma)
- •Chroma subsampling
- •Chroma subsampling notation
- •Chroma subsampling filters
- •Chroma in composite NTSC and PAL
- •Scanning standards
- •Widescreen (16:9) SD
- •Square and nonsquare sampling
- •Resampling
- •NTSC and PAL encoding
- •NTSC and PAL decoding
- •S-video interface
- •Frequency interleaving
- •Composite analog SD
- •15. Introduction to HD
- •HD scanning
- •Colour coding for BT.709 HD
- •Data compression
- •Image compression
- •Lossy compression
- •JPEG
- •Motion-JPEG
- •JPEG 2000
- •Mezzanine compression
- •MPEG
- •Picture coding types (I, P, B)
- •Reordering
- •MPEG-1
- •MPEG-2
- •Other MPEGs
- •MPEG IMX
- •MPEG-4
- •AVC-Intra
- •WM9, WM10, VC-1 codecs
- •Compression for CE acquisition
- •AVCHD
- •Compression for IP transport to consumers
- •VP8 (“WebM”) codec
- •Dirac (basic)
- •17. Streams and files
- •Historical overview
- •Physical layer
- •Stream interfaces
- •IEEE 1394 (FireWire, i.LINK)
- •HTTP live streaming (HLS)
- •18. Metadata
- •Metadata Example 1: CD-DA
- •Metadata Example 2: .yuv files
- •Metadata Example 3: RFF
- •Metadata Example 4: JPEG/JFIF
- •Metadata Example 5: Sequence display extension
- •Conclusions
- •19. Stereoscopic (“3-D”) video
- •Acquisition
- •S3D display
- •Anaglyph
- •Temporal multiplexing
- •Polarization
- •Wavelength multiplexing (Infitec/Dolby)
- •Autostereoscopic displays
- •Parallax barrier display
- •Lenticular display
- •Recording and compression
- •Consumer interface and display
- •Ghosting
- •Vergence and accommodation
- •20. Filtering and sampling
- •Sampling theorem
- •Sampling at exactly 0.5fS
- •Magnitude frequency response
- •Magnitude frequency response of a boxcar
- •The sinc weighting function
- •Frequency response of point sampling
- •Fourier transform pairs
- •Analog filters
- •Digital filters
- •Impulse response
- •Finite impulse response (FIR) filters
- •Physical realizability of a filter
- •Phase response (group delay)
- •Infinite impulse response (IIR) filters
- •Lowpass filter
- •Digital filter design
- •Reconstruction
- •Reconstruction close to 0.5fS
- •“(sin x)/x” correction
- •Further reading
- •2:1 downsampling
- •Oversampling
- •Interpolation
- •Lagrange interpolation
- •Lagrange interpolation as filtering
- •Polyphase interpolators
- •Polyphase taps and phases
- •Implementing polyphase interpolators
- •Decimation
- •Lowpass filtering in decimation
- •Spatial frequency domain
- •Comb filtering
- •Spatial filtering
- •Image presampling filters
- •Image reconstruction filters
- •Spatial (2-D) oversampling
- •Retina
- •Adaptation
- •Contrast sensitivity
- •Contrast sensitivity function (CSF)
- •24. Luminance and lightness
- •Radiance, intensity
- •Luminance
- •Relative luminance
- •Luminance from red, green, and blue
- •Lightness (CIE L*)
- •Fundamentals of vision
- •Definitions
- •Spectral power distribution (SPD) and tristimulus
- •Spectral constraints
- •CIE XYZ tristimulus
- •CIE [x, y] chromaticity
- •Blackbody radiation
- •Colour temperature
- •White
- •Chromatic adaptation
- •Perceptually uniform colour spaces
- •CIE L*a*b* (CIELAB)
- •CIE L*u*v* and CIE L*a*b* summary
- •Colour specification and colour image coding
- •Further reading
- •Additive reproduction (RGB)
- •Characterization of RGB primaries
- •BT.709 primaries
- •Legacy SD primaries
- •sRGB system
- •SMPTE Free Scale (FS) primaries
- •AMPAS ACES primaries
- •SMPTE/DCI P3 primaries
- •CMFs and SPDs
- •Normalization and scaling
- •Luminance coefficients
- •Transformations between RGB and CIE XYZ
- •Noise due to matrixing
- •Transforms among RGB systems
- •Camera white reference
- •Display white reference
- •Gamut
- •Wide-gamut reproduction
- •Free Scale Gamut, Free Scale Log (FS-Gamut, FS-Log)
- •Further reading
- •27. Gamma
- •Gamma in CRT physics
- •The amazing coincidence!
- •Gamma in video
- •Opto-electronic conversion functions (OECFs)
- •BT.709 OECF
- •SMPTE 240M OECF
- •sRGB transfer function
- •Transfer functions in SD
- •Bit depth requirements
- •Gamma in modern display devices
- •Estimating gamma
- •Gamma in video, CGI, and Macintosh
- •Gamma in computer graphics
- •Gamma in pseudocolour
- •Limitations of 8-bit linear coding
- •Linear and nonlinear coding in CGI
- •Colour acuity
- •RGB and R’G’B’ colour cubes
- •Conventional luma/colour difference coding
- •Luminance and luma notation
- •Nonlinear red, green, blue (R’G’B’)
- •BT.601 luma
- •BT.709 luma
- •Chroma subsampling, revisited
- •Luma/colour difference summary
- •SD and HD luma chaos
- •Luma/colour difference component sets
- •B’-Y’, R’-Y’ components for SD
- •PBPR components for SD
- •CBCR components for SD
- •Y’CBCR from studio RGB
- •Y’CBCR from computer RGB
- •“Full-swing” Y’CBCR
- •Y’UV, Y’IQ confusion
- •B’-Y’, R’-Y’ components for BT.709 HD
- •PBPR components for BT.709 HD
- •CBCR components for BT.709 HD
- •CBCR components for xvYCC
- •Y’CBCR from studio RGB
- •Y’CBCR from computer RGB
- •Conversions between HD and SD
- •Colour coding standards
- •31. Video signal processing
- •Edge treatment
- •Transition samples
- •Picture lines
- •Choice of SAL and SPW parameters
- •Video levels
- •Setup (pedestal)
- •BT.601 to computing
- •Enhancement
- •Median filtering
- •Coring
- •Chroma transition improvement (CTI)
- •Mixing and keying
- •Field rate
- •Line rate
- •Sound subcarrier
- •Addition of composite colour
- •NTSC colour subcarrier
- •576i PAL colour subcarrier
- •4fSC sampling
- •Common sampling rate
- •Numerology of HD scanning
- •Audio rates
- •33. Timecode
- •Introduction
- •Dropframe timecode
- •Editing
- •Linear timecode (LTC)
- •Vertical interval timecode (VITC)
- •Timecode structure
- •Further reading
- •34. 2-3 pulldown
- •2-3-3-2 pulldown
- •Conversion of film to different frame rates
- •Native 24 Hz coding
- •Conversion to other rates
- •Spatial domain
- •Vertical-temporal domain
- •Motion adaptivity
- •Further reading
- •36. Colourbars
- •SD colourbars
- •SD colourbar notation
- •Pluge element
- •Composite decoder adjustment using colourbars
- •-I, +Q, and Pluge elements in SD colourbars
- •HD colourbars
- •References
- •38. SDI and HD-SDI interfaces
- •Component digital SD interface (BT.601)
- •Serial digital interface (SDI)
- •Component digital HD-SDI
- •SDI and HD-SDI sync, TRS, and ancillary data
- •Analog sync and digital/analog timing relationships
- •Ancillary data
- •SDI coding
- •HD-SDI coding
- •Interfaces for compressed video
- •SDTI
- •Switching and mixing
- •Timing in digital facilities
- •Summary of digital interfaces
- •39. 480i component video
- •Frame rate
- •Interlace
- •Line sync
- •Field/frame sync
- •R’G’B’ EOCF and primaries
- •Luma (Y’)
- •Picture center, aspect ratio, and blanking
- •Halfline blanking
- •Component digital 4:2:2 interface
- •Component analog R’G’B’ interface
- •Component analog Y’PBPR interface, EBU N10
- •Component analog Y’PBPR interface, industry standard
- •40. 576i component video
- •Frame rate
- •Interlace
- •Line sync
- •Analog field/frame sync
- •R’G’B’ EOCF and primaries
- •Luma (Y’)
- •Picture center, aspect ratio, and blanking
- •Component digital 4:2:2 interface
- •Component analog 576i interface
- •Scanning
- •Analog sync
- •Picture center, aspect ratio, and blanking
- •R’G’B’ EOCF and primaries
- •Luma (Y’)
- •Component digital 4:2:2 interface
- •Scanning
- •Analog sync
- •Picture center, aspect ratio, and blanking
- •R’G’B’ EOCF and primaries
- •Luma (Y’)
- •Component digital 4:2:2 interface
- •43. HD videotape
- •HDCAM (D-11)
- •DVCPRO HD (D-12)
- •HDCAM SR (D-16)
- •JPEG blocks and MCUs
- •JPEG block diagram
- •Level shifting
- •Discrete cosine transform (DCT)
- •JPEG encoding example
- •JPEG decoding
- •Compression ratio control
- •JPEG/JFIF
- •Motion-JPEG (M-JPEG)
- •Further reading
- •46. DV compression
- •DV chroma subsampling
- •DV frame/field modes
- •Picture-in-shuttle in DV
- •DV overflow scheme
- •DV quantization
- •DV digital interface (DIF)
- •Consumer DV recording
- •Professional DV variants
- •47. MPEG-2 video compression
- •MPEG-2 profiles and levels
- •Picture structure
- •Frame rate and 2-3 pulldown in MPEG
- •Luma and chroma sampling structures
- •Macroblocks
- •Picture coding types – I, P, B
- •Prediction
- •Motion vectors (MVs)
- •Coding of a block
- •Frame and field DCT types
- •Zigzag and VLE
- •Refresh
- •Motion estimation
- •Rate control and buffer management
- •Bitstream syntax
- •Transport
- •Further reading
- •48. H.264 video compression
- •Algorithmic features, profiles, and levels
- •Baseline and extended profiles
- •High profiles
- •Hierarchy
- •Multiple reference pictures
- •Slices
- •Spatial intra prediction
- •Flexible motion compensation
- •Quarter-pel motion-compensated interpolation
- •Weighting and offsetting of MC prediction
- •16-bit integer transform
- •Quantizer
- •Variable-length coding
- •Context adaptivity
- •CABAC
- •Deblocking filter
- •Buffer control
- •Scalable video coding (SVC)
- •Multiview video coding (MVC)
- •AVC-Intra
- •Further reading
- •49. VP8 compression
- •Algorithmic features
- •Further reading
- •Elementary stream (ES)
- •Packetized elementary stream (PES)
- •MPEG-2 program stream
- •MPEG-2 transport stream
- •System clock
- •Further reading
- •Japan
- •United States
- •ATSC modulation
- •Europe
- •Further reading
- •Appendices
- •Cement vs. concrete
- •True CIE luminance
- •The misinterpretation of luminance
- •The enshrining of luma
- •Colour difference scale factors
- •Conclusion: A plea
- •Radiometry
- •Photometry
- •Light level examples
- •Image science
- •Units
- •Further reading
- •Glossary
- •Index
- •About the author
Don’t confuse point spread function (PSF) with progressive segmented-frame (PsF), to be described on page 94.
7. Image structure
A naïve approach to digital imaging treats an image as a matrix of independent pixels, disregarding the spatial distribution of light power across each pixel. You might think that optimum image quality is obtained when there is no overlap between the distributions of neighboring pixels; many computer engineers hold this view. However, continuous-tone images are best reproduced with a certain degree of overlap between pixels; sharpness is reduced slightly, but pixel structure is made less visible and image quality is improved.
The distribution of intensity across a displayed pixel is referred to as its point spread function (PSF). A one-dimensional slice through the center of a PSF is colloquially called a spot profile. A display’s PSF influences the nature of the images it reproduces. The effects of a PSF can be analyzed using filter theory, discussed for one dimension in the chapter Filtering and sampling, on page 191, and for two dimensions in Image digitization and reconstruction, on page 237.
Historically, the PSFs of greyscale (“black-and-white”) CRTs were roughly Gaussian in shape: Intensity distribution peaked at the center of the pixel, fell off over a small distance, and overlapped neighboring pixels to some extent. The scanning spot of colour CRTs had this shape, too; but the PSF was influenced by the shadow mask or aperture grille. The introduction of direct-view colour CRTs shifted the requirement for spatial filtering to the viewer: The assumption was introduced that the viewers were sufficiently distant from the screens that the viewers’ visual systems would perform the spatial integration necessary to obscure the triad structure.
Figure 7.1 “Box” reconstruction of a bitmapped graphic image is shown.
Figure 7.2 Gaussian reconstruction is shown for the same bitmapped image as Figure 7.1. I will detail the one-dimensional Gaussian function on page 200.
Modern direct-view fixed-pixel displays (FPDs) such as LCD and PDP displays have more or less uniform light emission over most of the area corresponding to each colour component (subpixel); their modulated light has a spatial structure comparable to that of a direct-view colour CRT, and similarly depends upon the viewers being located at a sufficient distance that their visual systems perform the spatial integration necessary to obscure the triad structure.
A pixel whose intensity distribution uniformly covers a small square area of the screen has a point spread function referred to as a “box.”
Image reconstruction
Figure 7.1 reproduces a portion of an idealized bitmapped (bilevel) graphic image, part of a computer’s desktop display. Each sample is either black or white. The element with horizontal “stripes” is part of a window’s titlebar; the checkerboard background is intended to integrate to grey. Figure 7.1 shows reconstruction of the image with a “box” distribution. Each pixel is uniformly shaded across its extent; there is no overlap between pixels. This figure exemplifies a raster-locked image as displayed on an LCD. By raster-locked, I refer to image data having the underlying image elements aligned with the pixel array.
A CRT’s electron gun produces an electron beam that illuminates a spot on the phosphor screen. The beam is deflected to form a raster pattern of scan lines that traces the entire screen, as I will describe in the following chapter. The beam is not perfectly focused when it is emitted from the CRT’s electron gun, and is dispersed further in transit to the phosphor screen. The intensity produced for each pixel at the face of the screen has a “bell-shaped” distribution resembling a two-dimensional Gaussian function. With a typical amount of spot overlap, the checkerboard area of this example will display as a nearly uniform grey as depicted in Figure 7.2. You might think that the blur caused by overlap between pixels would diminish image quality. However, for continuous-tone (“contone”) images, some degree of overlap is not only desirable but necessary, as you will see from the following examples.
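The contrast between the two reconstructions can be sketched numerically in one dimension. The following is only an illustration, not a model of any particular display: the function names are hypothetical, and the Gaussian width (`sigma = 0.8` pixel) is an assumed value chosen to represent a moderate amount of spot overlap. It reconstructs one row of the checkerboard with each spot profile and reports the peak-to-peak variation of the result.

```python
import math

def reconstruct(samples, x, profile):
    """Reconstructed intensity at continuous position x (in pixel units).

    Each sample contributes its value weighted by the spot profile
    centered on that sample (pixel centers lie at integer positions).
    """
    return sum(s * profile(x - i) for i, s in enumerate(samples))

def box(d):
    # Uniform over one pixel width; no overlap with neighboring pixels.
    return 1.0 if -0.5 <= d < 0.5 else 0.0

def gaussian(d, sigma=0.8):
    # Unit-area Gaussian spot profile; sigma (in pixels) is an assumption.
    return math.exp(-0.5 * (d / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def ripple(samples, profile, lo, hi, steps=200):
    """Peak-to-peak variation of the reconstruction over [lo, hi]."""
    vals = [reconstruct(samples, lo + (hi - lo) * k / steps, profile)
            for k in range(steps + 1)]
    return max(vals) - min(vals)

row = [0, 1] * 16                      # one row of the checkerboard
# Measure well away from the ends of the row to avoid edge effects.
box_ripple = ripple(row, box, 10.0, 20.0)
gauss_ripple = ripple(row, gaussian, 10.0, 20.0)
print(box_ripple, gauss_ripple)
```

Under these assumptions the box reconstruction swings over the full black-to-white range, while the Gaussian reconstruction stays within a few percent of mid-grey. A narrower assumed spot would leave more of the checkerboard visible.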
DIGITAL VIDEO AND HD ALGORITHMS AND INTERFACES |
Figure 7.3 Diagonal line reconstruction. At the left is a near-vertical line slightly more than 1 pixel wide, rendered as an array 20 pixels high that has been reconstructed using a box distribution. At the right, the line is reconstructed using a Gaussian distribution. Between the images I have placed a set of markers to indicate the vertical centers of the image rows.
Figure 7.4 Contone image reconstruction. At the left is a continuous-tone image of 16 × 20 pixels that has been reconstructed using a box distribution. The pictured individual cannot be recognized. At the right is exactly the same image data, but reconstructed by a Gaussian function. The reconstructed image is very blurry but recognizable. Which reconstruction function do you think is best for continuous-tone imaging?
Visual acuity is detailed in Contrast sensitivity function (CSF), on page 251.
Figure 7.3 shows a 16 × 20-pixel image of a dark line slightly more than one pixel wide, 7.2° off the vertical. At the left, the image data is reconstructed using a box distribution; a jagged and “ropey” nature is evident. At the right, the image data is reconstructed using a Gaussian. It is blurry, but less jagged.
Figure 7.4 shows two ways to reconstruct the same 16 × 20 pixels (320 bytes) of continuous-tone greyscale image data. The left-hand image is reconstructed using a box function, and the right-hand image with a Gaussian. The example was constructed so that each image is 4 cm (1.6 inches) wide. At a typical reading distance of 40 cm (16 inches), a pixel subtends about 0.4°, where visual acuity is near its maximum. At this distance, when reconstructed with a box function, the pixel structure of each image is highly visible; visibility of the pixel structure overwhelms the perception of the image itself. The right image is reconstructed using a Gaussian distribution. It is blurry, but easily recognizable as an American cultural icon. This example shows that sharpness is not always good, and blurriness is not always bad!
Figure 7.5 One frame of an animated sequence, reconstructed with a “box” filter.
Figure 7.6 A Moiré pattern is a form of aliasing in two dimensions that results when a sampling pattern (here the perforated square) has a sampling density that is too low for the image content (here the dozen bars, 14° off-vertical). This figure is adapted from Fig. 3.12 of Wandell’s Foundations of Vision (cited on page 195).
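The pixel subtense quoted for Figure 7.4 can be checked with a few lines of arithmetic. This is a minimal sketch: the function name is hypothetical, and it assumes that the 16-pixel dimension of the 16 × 20 image lies across the 4 cm width and that pixels are square.

```python
import math

def pixel_subtense_deg(image_width_cm, pixels_across, viewing_distance_cm):
    """Angle subtended at the eye by one pixel, in degrees."""
    pitch = image_width_cm / pixels_across          # pixel pitch in cm
    return math.degrees(2 * math.atan(pitch / (2 * viewing_distance_cm)))

# Figure 7.4 as described: 4 cm wide, assumed 16 pixels across, viewed at 40 cm.
angle = pixel_subtense_deg(4.0, 16, 40.0)
print(round(angle, 2))
```

The result is a little under 0.36°, consistent with the rounded figure of 0.4° in the text.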
Figure 7.5 in the margin shows a 16 × 20-pixel image comprising 20 copies of the top row of Figure 7.3 (left). Consider a sequence of 20 animated frames, where each frame is formed from successive image rows of Figure 7.3. The animation would depict a narrow vertical line drifting rightward across the screen at a rate of 1 pixel every 8 frames. If image rows of Figure 7.3 (left) were used, the width of the moving line would appear to jitter frame-to-frame, and the minimum lightness would vary. With Gaussian reconstruction, as in Figure 7.3 (right), motion portrayal is much smoother.
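The drift rate follows from the 7.2° tilt of the line in Figure 7.3. The sketch below assumes square pixels and one image row per frame; the function name is hypothetical.

```python
import math

def drift_per_row(tilt_deg):
    """Horizontal shift, in pixels, from one image row to the next."""
    return math.tan(math.radians(tilt_deg))

shift = drift_per_row(7.2)            # pixels per row, hence per frame
frames_per_pixel = 1.0 / shift        # frames needed to drift one pixel
total_drift = shift * (20 - 1)        # drift between first and last frame
print(frames_per_pixel, total_drift)
```

The line moves one pixel in slightly under 8 frames, so over the 20-frame sequence it drifts roughly two and a half pixels. The fractional positions in between are what the box reconstruction cannot portray smoothly.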
Sampling aperture
In a practical image sensor, each element acquires information from a finite region of the image plane; the value of each pixel is a function of the distribution of intensity over that region. The distribution of sensitivity across a pixel of an image capture device is referred to as its sampling aperture, sort of a PSF in reverse – you could call it a point “collection” function. The sampling aperture influences the nature of the image signal originated by a sensor. Sampling apertures used in continuous-tone imaging systems usually peak at the center of each pixel, fall off over a small distance, and overlap neighboring pixels to some extent.
In 1928, Harry Nyquist published a landmark paper stating that a sampled analog signal cannot be reconstructed accurately unless all of its frequency components are contained strictly within half the sampling frequency. This condition subsequently became known as the Nyquist criterion; half the sampling rate became known as the Nyquist rate. Nyquist developed his theorem for one-dimensional signals, but it has been extended to two dimensions. In a digital system, it takes at least two elements – two pixels or two scanning lines – to represent a cycle. A cycle is equivalent to a line pair of film, or two “TV lines” (TVL).
In Figure 7.6, the black square punctured by a regular array of holes represents a grid of small sampling apertures. Behind the sampling grid is a set of a dozen black bars, tilted 14° off the vertical, representing image information. In the region where the image is sampled, you can see three wide dark bars tilted at 45°. Those bars represent spatial aliases that arise because the number of bars per inch (or mm) in the image is greater than half the number of apertures per inch (or mm) in the sampling lattice. Aliasing can be prevented – or at least minimized – by imposing a spatial filter in front of the sampling process, as I will describe for one-dimensional signals in Filtering and sampling, on page 191, and for two dimensions in Image presampling filters, on page 242.
Figure 7.7 Bitmapped graphic image, rotated.
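The aliasing of Figure 7.6 has a simple one-dimensional analog, sketched below. This is an illustration under stated assumptions rather than a model of the figure: the function names are hypothetical, frequencies are expressed in cycles per sample pitch, and the two-dimensional geometry that produces the 45° tilt is not modelled.

```python
def alias_frequency(f, fs=1.0):
    """Apparent frequency after sampling a sinusoid of frequency f at rate fs.

    Sampling cannot distinguish f from f plus any multiple of fs, so the
    reconstructed component appears at the distance from f to the nearest
    multiple of fs.
    """
    return abs(f - fs * round(f / fs))

def within_criterion(f, fs=1.0):
    # The criterion requires components strictly below half the sampling rate.
    return f < fs / 2.0

fine_bars = 0.8      # assumed: 0.8 cycle per aperture pitch, too fine to sample
coarse_bars = 0.3    # assumed: 0.3 cycle per aperture pitch, safely sampled

print(within_criterion(fine_bars), alias_frequency(fine_bars))
print(within_criterion(coarse_bars), alias_frequency(coarse_bars))
```

With these assumed values, the fine pattern violates the criterion and reappears at about 0.2 cycle per sample pitch, a much wider pattern than the original, in the same way that the dozen narrow bars of the figure appear as three wide ones. The coarse pattern is left unchanged.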
Nyquist explained that an arbitrary signal can be reconstructed accurately only if more than two samples are taken of the highest-frequency component of the signal. Applied to an image, there must be at least twice as many samples per unit distance as there are image elements. The checkerboard pattern in Figure 7.1 (on page 76) doesn’t meet this criterion in either the vertical or horizontal dimensions. Furthermore, the titlebar element doesn’t meet the criterion vertically. Such elements can be represented in a bilevel image only when they are in precise registration – “locked” – to the imaging system’s sampling grid. However, images captured from reality almost never have their elements precisely aligned with the grid!
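The dependence on registration can be illustrated with a one-dimensional sketch. It assumes that each sample is the average of the scene over a one-pixel-wide box aperture, which is only one possible sampling aperture, and that the scene is an alternating black-white pattern with one element per pixel pitch. The function names are hypothetical.

```python
def sample_alternating_row(offset, count=8):
    """Sample a 0/1 alternating pattern (one element per pixel pitch).

    Each sample averages the scene over a one-pixel box aperture displaced
    by `offset` pixels (0 <= offset < 1) from the pattern. The aperture then
    covers (1 - offset) of one element and `offset` of the next.
    """
    samples = []
    for i in range(count):
        here = i % 2              # value of the element under the aperture
        nxt = (i + 1) % 2         # value of the neighboring element
        samples.append((1 - offset) * here + offset * nxt)
    return samples

def contrast(samples):
    return max(samples) - min(samples)

locked = sample_alternating_row(0.0)     # elements aligned with the grid
shifted = sample_alternating_row(0.5)    # displaced by half a pixel
print(contrast(locked), contrast(shifted))
```

When the pattern is locked to the grid, the samples alternate between 0 and 1 at full contrast. Displaced by half a pixel, every sample is mid-grey and the pattern vanishes. Intermediate offsets give intermediate contrast, so the captured result depends on an alignment that real scenes do not provide.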
Point sampling refers to capture with an infinitesimal sampling aperture. This is undesirable in continuous-tone imaging. Figure 7.7 shows what would happen if a physical scene like that in Figure 7.1 were rotated 14°, captured with a point-sampled camera, and displayed with a box distribution. The alternating on-off elements are rendered with aliasing in both the checkerboard portion and the titlebar. (Aliasing would be evident even if this image were to be reconstructed with a Gaussian.) This example emphasizes that in digital imaging, we must represent arbitrary scenes, not just scenes whose elements have an intimate relationship with the sampling grid.
A suitable presampling filter would prevent (or at least minimize) the Moiré artifact of Figure 7.6, and prevent or minimize the aliasing of Figure 7.7. When image content such as the example titlebar and the desktop pattern of Figure 7.2 is presented to a presampling filter, blurring will occur. Considering only bitmapped images such as Figure 7.1, you might think