- •Contents
- •Figures
- •Tables
- •Preface
- •Acknowledgments
- •1. Raster images
- •Aspect ratio
- •Geometry
- •Image capture
- •Digitization
- •Perceptual uniformity
- •Colour
- •Luma and colour difference components
- •Digital image representation
- •Square sampling
- •Comparison of aspect ratios
- •Aspect ratio
- •Frame rates
- •Image state
- •EOCF standards
- •Entertainment programming
- •Acquisition
- •Consumer origination
- •Consumer electronics (CE) display
- •Contrast
- •Contrast ratio
- •Perceptual uniformity
- •The “code 100” problem and nonlinear image coding
- •Linear and nonlinear
- •4. Quantization
- •Linearity
- •Decibels
- •Noise, signal, sensitivity
- •Quantization error
- •Full-swing
- •Studio-swing (footroom and headroom)
- •Interface offset
- •Processing coding
- •Two’s complement wrap-around
- •Perceptual attributes
- •History of display signal processing
- •Digital driving levels
- •Relationship between signal and lightness
- •Algorithm
- •Black level setting
- •Effect of contrast and brightness on contrast and brightness
- •An alternate interpretation
- •Brightness and contrast controls in LCDs
- •Brightness and contrast controls in PDPs
- •Brightness and contrast controls in desktop graphics
- •Symbolic image description
- •Raster images
- •Conversion among types
- •Image files
- •“Resolution” in computer graphics
- •7. Image structure
- •Image reconstruction
- •Sampling aperture
- •Spot profile
- •Box distribution
- •Gaussian distribution
- •8. Raster scanning
- •Flicker, refresh rate, and frame rate
- •Introduction to scanning
- •Scanning parameters
- •Interlaced format
- •Interlace and progressive
- •Scanning notation
- •Motion portrayal
- •Segmented-frame (24PsF)
- •Video system taxonomy
- •Conversion among systems
- •9. Resolution
- •Magnitude frequency response and bandwidth
- •Visual acuity
- •Viewing distance and angle
- •Kell effect
- •Resolution
- •Resolution in video
- •Viewing distance
- •Interlace revisited
- •10. Constant luminance
- •The principle of constant luminance
- •Compensating for the CRT
- •Departure from constant luminance
- •Luma
- •“Leakage” of luminance into chroma
- •11. Picture rendering
- •Surround effect
- •Tone scale alteration
- •Incorporation of rendering
- •Rendering in desktop computing
- •Luma
- •Sloppy use of the term luminance
- •Colour difference coding (chroma)
- •Chroma subsampling
- •Chroma subsampling notation
- •Chroma subsampling filters
- •Chroma in composite NTSC and PAL
- •Scanning standards
- •Widescreen (16:9) SD
- •Square and nonsquare sampling
- •Resampling
- •NTSC and PAL encoding
- •NTSC and PAL decoding
- •S-video interface
- •Frequency interleaving
- •Composite analog SD
- •15. Introduction to HD
- •HD scanning
- •Colour coding for BT.709 HD
- •Data compression
- •Image compression
- •Lossy compression
- •JPEG
- •Motion-JPEG
- •JPEG 2000
- •Mezzanine compression
- •MPEG
- •Picture coding types (I, P, B)
- •Reordering
- •MPEG-1
- •MPEG-2
- •Other MPEGs
- •MPEG IMX
- •MPEG-4
- •AVC-Intra
- •WM9, WM10, VC-1 codecs
- •Compression for CE acquisition
- •AVCHD
- •Compression for IP transport to consumers
- •VP8 (“WebM”) codec
- •Dirac (basic)
- •17. Streams and files
- •Historical overview
- •Physical layer
- •Stream interfaces
- •IEEE 1394 (FireWire, i.LINK)
- •HTTP live streaming (HLS)
- •18. Metadata
- •Metadata Example 1: CD-DA
- •Metadata Example 2: .yuv files
- •Metadata Example 3: RFF
- •Metadata Example 4: JPEG/JFIF
- •Metadata Example 5: Sequence display extension
- •Conclusions
- •19. Stereoscopic (“3-D”) video
- •Acquisition
- •S3D display
- •Anaglyph
- •Temporal multiplexing
- •Polarization
- •Wavelength multiplexing (Infitec/Dolby)
- •Autostereoscopic displays
- •Parallax barrier display
- •Lenticular display
- •Recording and compression
- •Consumer interface and display
- •Ghosting
- •Vergence and accommodation
- •20. Filtering and sampling
- •Sampling theorem
- •Sampling at exactly 0.5fS
- •Magnitude frequency response
- •Magnitude frequency response of a boxcar
- •The sinc weighting function
- •Frequency response of point sampling
- •Fourier transform pairs
- •Analog filters
- •Digital filters
- •Impulse response
- •Finite impulse response (FIR) filters
- •Physical realizability of a filter
- •Phase response (group delay)
- •Infinite impulse response (IIR) filters
- •Lowpass filter
- •Digital filter design
- •Reconstruction
- •Reconstruction close to 0.5fS
- •“(sin x)/x” correction
- •Further reading
- •2:1 downsampling
- •Oversampling
- •Interpolation
- •Lagrange interpolation
- •Lagrange interpolation as filtering
- •Polyphase interpolators
- •Polyphase taps and phases
- •Implementing polyphase interpolators
- •Decimation
- •Lowpass filtering in decimation
- •Spatial frequency domain
- •Comb filtering
- •Spatial filtering
- •Image presampling filters
- •Image reconstruction filters
- •Spatial (2-D) oversampling
- •Retina
- •Adaptation
- •Contrast sensitivity
- •Contrast sensitivity function (CSF)
- •24. Luminance and lightness
- •Radiance, intensity
- •Luminance
- •Relative luminance
- •Luminance from red, green, and blue
- •Lightness (CIE L*)
- •Fundamentals of vision
- •Definitions
- •Spectral power distribution (SPD) and tristimulus
- •Spectral constraints
- •CIE XYZ tristimulus
- •CIE [x, y] chromaticity
- •Blackbody radiation
- •Colour temperature
- •White
- •Chromatic adaptation
- •Perceptually uniform colour spaces
- •CIE L*a*b* (CIELAB)
- •CIE L*u*v* and CIE L*a*b* summary
- •Colour specification and colour image coding
- •Further reading
- •Additive reproduction (RGB)
- •Characterization of RGB primaries
- •BT.709 primaries
- •Legacy SD primaries
- •sRGB system
- •SMPTE Free Scale (FS) primaries
- •AMPAS ACES primaries
- •SMPTE/DCI P3 primaries
- •CMFs and SPDs
- •Normalization and scaling
- •Luminance coefficients
- •Transformations between RGB and CIE XYZ
- •Noise due to matrixing
- •Transforms among RGB systems
- •Camera white reference
- •Display white reference
- •Gamut
- •Wide-gamut reproduction
- •Free Scale Gamut, Free Scale Log (FS-Gamut, FS-Log)
- •Further reading
- •27. Gamma
- •Gamma in CRT physics
- •The amazing coincidence!
- •Gamma in video
- •Opto-electronic conversion functions (OECFs)
- •BT.709 OECF
- •SMPTE 240M OECF
- •sRGB transfer function
- •Transfer functions in SD
- •Bit depth requirements
- •Gamma in modern display devices
- •Estimating gamma
- •Gamma in video, CGI, and Macintosh
- •Gamma in computer graphics
- •Gamma in pseudocolour
- •Limitations of 8-bit linear coding
- •Linear and nonlinear coding in CGI
- •Colour acuity
- •RGB and R’G’B’ colour cubes
- •Conventional luma/colour difference coding
- •Luminance and luma notation
- •Nonlinear red, green, blue (R’G’B’)
- •BT.601 luma
- •BT.709 luma
- •Chroma subsampling, revisited
- •Luma/colour difference summary
- •SD and HD luma chaos
- •Luma/colour difference component sets
- •B’-Y’, R’-Y’ components for SD
- •PBPR components for SD
- •CBCR components for SD
- •Y’CBCR from studio RGB
- •Y’CBCR from computer RGB
- •“Full-swing” Y’CBCR
- •Y’UV, Y’IQ confusion
- •B’-Y’, R’-Y’ components for BT.709 HD
- •PBPR components for BT.709 HD
- •CBCR components for BT.709 HD
- •CBCR components for xvYCC
- •Y’CBCR from studio RGB
- •Y’CBCR from computer RGB
- •Conversions between HD and SD
- •Colour coding standards
- •31. Video signal processing
- •Edge treatment
- •Transition samples
- •Picture lines
- •Choice of SAL and SPW parameters
- •Video levels
- •Setup (pedestal)
- •BT.601 to computing
- •Enhancement
- •Median filtering
- •Coring
- •Chroma transition improvement (CTI)
- •Mixing and keying
- •Field rate
- •Line rate
- •Sound subcarrier
- •Addition of composite colour
- •NTSC colour subcarrier
- •576i PAL colour subcarrier
- •4fSC sampling
- •Common sampling rate
- •Numerology of HD scanning
- •Audio rates
- •33. Timecode
- •Introduction
- •Dropframe timecode
- •Editing
- •Linear timecode (LTC)
- •Vertical interval timecode (VITC)
- •Timecode structure
- •Further reading
- •34. 2-3 pulldown
- •2-3-3-2 pulldown
- •Conversion of film to different frame rates
- •Native 24 Hz coding
- •Conversion to other rates
- •Spatial domain
- •Vertical-temporal domain
- •Motion adaptivity
- •Further reading
- •36. Colourbars
- •SD colourbars
- •SD colourbar notation
- •Pluge element
- •Composite decoder adjustment using colourbars
- •-I, +Q, and Pluge elements in SD colourbars
- •HD colourbars
- •References
- •38. SDI and HD-SDI interfaces
- •Component digital SD interface (BT.601)
- •Serial digital interface (SDI)
- •Component digital HD-SDI
- •SDI and HD-SDI sync, TRS, and ancillary data
- •Analog sync and digital/analog timing relationships
- •Ancillary data
- •SDI coding
- •HD-SDI coding
- •Interfaces for compressed video
- •SDTI
- •Switching and mixing
- •Timing in digital facilities
- •Summary of digital interfaces
- •39. 480i component video
- •Frame rate
- •Interlace
- •Line sync
- •Field/frame sync
- •R’G’B’ EOCF and primaries
- •Luma (Y’)
- •Picture center, aspect ratio, and blanking
- •Halfline blanking
- •Component digital 4:2:2 interface
- •Component analog R’G’B’ interface
- •Component analog Y’PBPR interface, EBU N10
- •Component analog Y’PBPR interface, industry standard
- •40. 576i component video
- •Frame rate
- •Interlace
- •Line sync
- •Analog field/frame sync
- •R’G’B’ EOCF and primaries
- •Luma (Y’)
- •Picture center, aspect ratio, and blanking
- •Component digital 4:2:2 interface
- •Component analog 576i interface
- •Scanning
- •Analog sync
- •Picture center, aspect ratio, and blanking
- •R’G’B’ EOCF and primaries
- •Luma (Y’)
- •Component digital 4:2:2 interface
- •Scanning
- •Analog sync
- •Picture center, aspect ratio, and blanking
- •R’G’B’ EOCF and primaries
- •Luma (Y’)
- •Component digital 4:2:2 interface
- •43. HD videotape
- •HDCAM (D-11)
- •DVCPRO HD (D-12)
- •HDCAM SR (D-16)
- •JPEG blocks and MCUs
- •JPEG block diagram
- •Level shifting
- •Discrete cosine transform (DCT)
- •JPEG encoding example
- •JPEG decoding
- •Compression ratio control
- •JPEG/JFIF
- •Motion-JPEG (M-JPEG)
- •Further reading
- •46. DV compression
- •DV chroma subsampling
- •DV frame/field modes
- •Picture-in-shuttle in DV
- •DV overflow scheme
- •DV quantization
- •DV digital interface (DIF)
- •Consumer DV recording
- •Professional DV variants
- •47. MPEG-2 video compression
- •MPEG-2 profiles and levels
- •Picture structure
- •Frame rate and 2-3 pulldown in MPEG
- •Luma and chroma sampling structures
- •Macroblocks
- •Picture coding types – I, P, B
- •Prediction
- •Motion vectors (MVs)
- •Coding of a block
- •Frame and field DCT types
- •Zigzag and VLE
- •Refresh
- •Motion estimation
- •Rate control and buffer management
- •Bitstream syntax
- •Transport
- •Further reading
- •48. H.264 video compression
- •Algorithmic features, profiles, and levels
- •Baseline and extended profiles
- •High profiles
- •Hierarchy
- •Multiple reference pictures
- •Slices
- •Spatial intra prediction
- •Flexible motion compensation
- •Quarter-pel motion-compensated interpolation
- •Weighting and offsetting of MC prediction
- •16-bit integer transform
- •Quantizer
- •Variable-length coding
- •Context adaptivity
- •CABAC
- •Deblocking filter
- •Buffer control
- •Scalable video coding (SVC)
- •Multiview video coding (MVC)
- •AVC-Intra
- •Further reading
- •49. VP8 compression
- •Algorithmic features
- •Further reading
- •Elementary stream (ES)
- •Packetized elementary stream (PES)
- •MPEG-2 program stream
- •MPEG-2 transport stream
- •System clock
- •Further reading
- •Japan
- •United States
- •ATSC modulation
- •Europe
- •Further reading
- •Appendices
- •Cement vs. concrete
- •True CIE luminance
- •The misinterpretation of luminance
- •The enshrining of luma
- •Colour difference scale factors
- •Conclusion: A plea
- •Radiometry
- •Photometry
- •Light level examples
- •Image science
- •Units
- •Further reading
- •Glossary
- •Index
- •About the author
17 Streams and files
A file is an ordered sequence of bytes explicitly having a start and an end, characterized by storage. A stream is characterized by realtime data transfer of unbounded duration on a unidirectional channel – that is, with no upstream channel for flow control, acknowledgement, or retransmission request. Table 17.1 provides a general summary of the characteristics of files and streams.
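The distinction can be sketched in a few lines of Python (a toy illustration, not drawn from any video API): a file object supports seeking to arbitrary offsets, while a stream consumer can only take bytes in arrival order, without rewinding.

```python
import io

data = bytes(range(256)) * 4   # 1024 bytes of sample "essence"

# A file permits random access: any offset may be read directly.
f = io.BytesIO(data)           # stand-in for an on-disk file
f.seek(512)
assert f.read(4) == data[512:516]

# A stream is sequential and of indeterminate length: a receiver
# consumes bytes in arrival order and cannot seek backwards.
def stream_chunks(src, size=188):
    while True:
        chunk = src.read(size)
        if not chunk:
            return
        yield chunk

total = sum(len(c) for c in stream_chunks(io.BytesIO(data)))
print(total)  # 1024
```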
| A file … | A stream … |
|---|---|
| … has predefined beginning and end | … has indeterminate beginning and end |
| … usually involves storage media | … usually involves an external data interconnect |
| … permits “random access” to data | … involves sequential data access, typically starting midstream |
| … has structure imposed at a high level; data is arranged arbitrarily | … has structure imposed at a low level; data is arranged to minimize buffering |
| … has no need for embedded delimiters | … contains embedded delimiters by which essence elements can be identified “on the fly” |
| … transfer usually occurs across a general-purpose network | … transfer usually occurs across a data interconnect |
| … transfer is usually free-running | … transfer is usually synchronized to a timing reference |
| … transfer typically has variable data (bit) rate (VBR) | … transfer typically has constant data (bit) rate (CBR) |
| … transfer data integrity is guaranteed, but data transfer rate isn’t (best effort) | … transfer data rate is guaranteed, but data integrity isn’t (errors may intrude) |
| … transfer typically involves upstream communication; transfers are generally acknowledged | … transfer typically has no upstream communication; transfers are generally not acknowledged |

Table 17.1 Files and streams are compared.
A stream is “live,” and suitable for realtime interface. A file is not intrinsically “live.” A file may be operated on or exchanged slower than realtime, in realtime, or faster than realtime. A portion of a stream can be recorded as a file, and a file can be streamed across an interface; however, generally, streams are structured for realtime use across interfaces and files are structured for nonrealtime use on storage media. Generally, video interfaces (and videotape recorders) are characterized as streams; video storage is characterized as files.
Historical overview
Video signals were historically conceived as streams. VTRs recorded continuous streams (traditionally omitting the vertical blanking interval, and in DVTRs, omitting the horizontal blanking intervals as well).
A stream interface conveys elements (analogous to historical analog video sync) that permit synchronization on the fly: A receiver can connect to a stream at any time, and begin operation within a fraction of
a second. Stream formats are designed to have a property of locality whereby essence elements to be presented simultaneously – typically, video and the associated audio – are located nearby in the stream, so as to bound the required buffer storage capacity, and to bound latency to access the essence required for presentation.
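One concrete instance of such embedded delimiters, from later in the book, is the MPEG-2 transport stream: every 188-byte packet begins with the sync byte 0x47, so a receiver joining mid-stream merely hunts for an offset at which that value recurs at packet intervals. A minimal sketch (toy data, not a real demultiplexer):

```python
TS_PACKET = 188   # MPEG-2 transport stream packet length, in bytes
SYNC_BYTE = 0x47  # value that begins every transport stream packet

def find_sync(buf: bytes, confirm: int = 4) -> int:
    """Offset of the first plausible packet boundary in buf: a position
    where SYNC_BYTE recurs at TS_PACKET intervals, confirmed across
    several consecutive packets; -1 if no lock is achieved."""
    limit = len(buf) - confirm * TS_PACKET
    for off in range(min(TS_PACKET, max(limit + 1, 0))):
        if all(buf[off + k * TS_PACKET] == SYNC_BYTE for k in range(confirm)):
            return off
    return -1

# A receiver that joins mid-stream sees an arbitrary partial packet first:
stream = b'\x00' * 100 + b''.join(
    bytes([SYNC_BYTE]) + b'\xff' * (TS_PACKET - 1) for _ in range(8))
print(find_sync(stream))  # 100
```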
Historically, uncompressed digital video was streamed in the studio in realtime across SDI or HD-SDI interfaces. SDI and HD-SDI timing was designed such that analog video could be obtained simply by stripping off the stream synchronization and ancillary elements, performing digital-to-analog conversion, and inserting analog sync. Almost no buffer storage was required.
As digital recording and playback of compressed digital video became possible, the SDTI specialization of SDI was designed to “wrap” compressed video, then audio, for conveyance across SDI. Various compression schemes such as MPEG IMX and DV100 were accommodated. DV video can be conveyed in streaming mode across IEEE 1394 interfaces.
As computing and networking technology advanced, it became feasible to store compressed video, then uncompressed video, in files. It became common to exchange these files across TCP/IP Ethernet networks – first at 100 Mb/s, then 1 Gb/s, and soon, 10 Gb/s. (The Apple iTunes model differs: Files are transferred for later playback.)
As commodity IT networking technology improved, the schemes that were used to package compressed video for transport across specialized interfaces such as SDTI were adapted to general-purpose networking. Compressed video in formats such as MPEG IMX and DV100 was stored in files as raw bytestreams. The MXF file format emerged as a mechanism to store video and audio essence in a more structured manner. Standards emerged to store compressed video in MXF files. An MXF file need not contain essence: It can refer to essence stored in separate files. Some formats for A/V storage in MXF files are “stream-friendly” in the sense that video and audio essence is stored in the MXF file in proximity to each other, suitable for playout with a minimum of restructuring. Other formats have higher-level structure more suitable to postproduction (for example, storing video and audio in separate files referred to by the main MXF file).
Today, file-based workflows are widely used in production and postproduction; however, stream-based techniques continue to dominate distribution of professionally produced content. Services such as YouTube, Hulu, and Netflix distribute video to consumers across what I call the big wooly internet – however, service and quality levels of these systems are lower than those associated with television broadcasting: The pictures aren’t at HD quality level. They stutter, and the audio loses sync. Internet-protocol television (IPTV) refers to adaptations of commodity TCP/IP-based networking to achieve the service and quality levels of broadcasting.
In professional video distribution, the file-based production/postproduction world and the stream-based distribution world meet at the playout server. The playout server includes a disk store with an associated file system. On the production side, the server is accessed asynchronously using IT networking. On the distribution side, a stream access mechanism reads files according to a timeline driven by house sync and timecode, and throttles playback accordingly. Realtime decompression of compressed video may be required. Dedicated stream interfaces then launch the content into the distribution network.
Physical layer

Serial digital video SDI and HD-SDI interfaces are based upon 10-bit words that are serialized, “scrambled,” then conveyed unidirectionally as a bitstream onto a single wire. The scrambling technique permits payload data transfer rate to equal the bit rate on the wire; however, signalling sync requires certain data values to be prohibited from appearing in video.

Commodity computer interfaces are based upon 8-bit bytes. Historically, data was serialized onto a single conductor (e.g., Ethernet); however, a number of computer-oriented interfaces (e.g., PCIe and Thunderbolt) serialize data onto multiple “lanes.” In some physical interfaces, data is typically mapped – for example, using the 8b/10b scheme – so that all 8-bit byte values can be conveyed across the interface, while allowing the receiver to recover the clock and the data framing for arbitrary data. The bit rate in the channel of such encodings is somewhat higher than the payload rate – in the case of 8b/10b mapping, 1.25 times higher. Other interfaces (e.g., DVI and HDMI) use a dedicated clock wire to establish timing; arbitrary data can then be serialized and transferred without data value restrictions. Some interfaces reverse the direction of data transfer across each conductor (or pair); others have dedicated wires (or pairs) in each direction.
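The overhead of such a block channel code is easy to quantify; the 10/8 figure for 8b/10b is standard, and the helper below is just arithmetic:

```python
def line_rate(payload_bps: float, code_bits: int = 10, data_bits: int = 8) -> float:
    """Line (channel) bit rate for a block code that maps data_bits of
    payload into code_bits on the wire -- 8b/10b by default."""
    return payload_bps * code_bits / data_bits

# A 1 Gb/s payload over 8b/10b needs 1.25 Gb/s on the wire;
# SDI scrambling, by contrast, adds no rate overhead at all.
print(line_rate(1e9) / 1e9)  # 1.25
```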
Stream interfaces

SDI, HD-SDI

SDI was designed for uncompressed 4:2:2 SD; it has a data rate of 270 Mb/s. (SDI was adapted to SDTI for compressed video; however, SDTI is now largely obsolete.) HD-SDI was designed for uncompressed 4:2:2 HD; it has a data rate of about 1.5 Gb/s. Recently, a 3 Gb/s adaptation of HD-SDI has been standardized and commercialized. Details are found in SDI and HD-SDI interfaces, on page 429.
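The quoted rates follow directly from the interface sampling parameters (a quick arithmetic check, using the familiar 13.5 MHz SD and 74.25 MHz HD luma sampling frequencies):

```python
# SDI: 4:2:2 SD multiplexes luma sampled at 13.5 MHz with two
# colour-difference components each sampled at 6.75 MHz, giving
# 27 Mwords/s; each word on the interface is 10 bits.
sd_words_per_s = 13.5e6 + 2 * 6.75e6      # 27 Mwords/s
sdi_rate = sd_words_per_s * 10            # 270 Mb/s
print(sdi_rate / 1e6)    # 270.0

# HD-SDI: 4:2:2 HD at a 74.25 MHz luma sample rate, again with two
# half-rate colour-difference components and 10-bit words.
hd_words_per_s = 74.25e6 + 2 * 37.125e6   # 148.5 Mwords/s
hd_sdi_rate = hd_words_per_s * 10         # 1.485 Gb/s ("about 1.5 Gb/s")
print(hd_sdi_rate / 1e9)  # 1.485
```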
DVI, HDMI, and DisplayPort

DVI, HDMI, and DisplayPort are digital interfaces designed for connection of computer graphics subsystems to displays, across cables at lengths up to 3 m. Apart from a very low-rate reverse channel – display data channel, DDC – that communicates display characteristics upstream to the graphics subsystem, DVI and HDMI are unidirectional.
DIGITAL VIDEO AND HD ALGORITHMS AND INTERFACES