
- •Preface
- •Introduction
- •1.1 Spatial coordinate systems
- •1.2 Sound fields and their physical characteristics
- •1.2.1 Free-field and sound waves generated by simple sound sources
- •1.2.2 Reflections from boundaries
- •1.2.3 Directivity of sound source radiation
- •1.2.4 Statistical analysis of acoustics in an enclosed space
- •1.2.5 Principle of sound receivers
- •1.3 Auditory system and perception
- •1.3.1 Auditory system and its functions
- •1.3.2 Hearing threshold and loudness
- •1.3.3 Masking
- •1.3.4 Critical band and auditory filter
- •1.4 Artificial head models and binaural signals
- •1.4.1 Artificial head models
- •1.4.2 Binaural signals and head-related transfer functions
- •1.5 Outline of spatial hearing
- •1.6 Localization cues for a single sound source
- •1.6.1 Interaural time difference
- •1.6.2 Interaural level difference
- •1.6.3 Cone of confusion and head movement
- •1.6.4 Spectral cues
- •1.6.5 Discussion on directional localization cues
- •1.6.6 Auditory distance perception
- •1.7 Summing localization and spatial hearing with multiple sources
- •1.7.1 Summing localization with two sound sources
- •1.7.2 The precedence effect
- •1.7.3 Spatial auditory perceptions with partially correlated and uncorrelated source signals
- •1.7.4 Auditory scene analysis and spatial hearing
- •1.7.5 Cocktail party effect
- •1.8 Room reflections and auditory spatial impression
- •1.8.1 Auditory spatial impression
- •1.8.2 Sound field-related measures and auditory spatial impression
- •1.8.3 Binaural-related measures and auditory spatial impression
- •1.9.1 Basic principle of spatial sound
- •1.9.2 Classification of spatial sound
- •1.9.3 Developments and applications of spatial sound
- •1.10 Summary
- •2.1 Basic principle of a two-channel stereophonic sound
- •2.1.1 Interchannel level difference and summing localization equation
- •2.1.2 Effect of frequency
- •2.1.3 Effect of interchannel phase difference
- •2.1.4 Virtual source created by interchannel time difference
- •2.1.5 Limitation of two-channel stereophonic sound
- •2.2.1 XY microphone pair
- •2.2.2 MS transformation and the MS microphone pair
- •2.2.3 Spaced microphone technique
- •2.2.4 Near-coincident microphone technique
- •2.2.5 Spot microphone and pan-pot technique
- •2.2.6 Discussion on microphone and signal simulation techniques for two-channel stereophonic sound
- •2.3 Upmixing and downmixing between two-channel stereophonic and mono signals
- •2.4 Two-channel stereophonic reproduction
- •2.4.1 Standard loudspeaker configuration of two-channel stereophonic sound
- •2.4.2 Influence of front-back deviation of the head
- •2.5 Summary
- •3.1 Physical and psychoacoustic principles of multichannel surround sound
- •3.2 Summing localization in multichannel horizontal surround sound
- •3.2.1 Summing localization equations for multiple horizontal loudspeakers
- •3.2.2 Analysis of the velocity and energy localization vectors of the superposed sound field
- •3.2.3 Discussion on horizontal summing localization equations
- •3.3 Multiple loudspeakers with partly correlated and low-correlated signals
- •3.4 Summary
- •4.1 Discrete quadraphone
- •4.1.1 Outline of the quadraphone
- •4.1.2 Discrete quadraphone with pair-wise amplitude panning
- •4.1.3 Discrete quadraphone with the first-order sound field signal mixing
- •4.1.4 Some discussions on discrete quadraphones
- •4.2 Other horizontal surround sounds with regular loudspeaker configurations
- •4.2.1 Six-channel reproduction with pair-wise amplitude panning
- •4.2.2 The first-order sound field signal mixing and reproduction with M ≥ 3 loudspeakers
- •4.3 Transformation of horizontal sound field signals and Ambisonics
- •4.3.1 Transformation of the first-order horizontal sound field signals
- •4.3.2 The first-order horizontal Ambisonics
- •4.3.3 The higher-order horizontal Ambisonics
- •4.3.4 Discussion and implementation of the horizontal Ambisonics
- •4.4 Summary
- •5.1 Outline of surround sounds with accompanying picture and general uses
- •5.2 5.1-Channel surround sound and its signal mixing analysis
- •5.2.1 Outline of 5.1-channel surround sound
- •5.2.2 Pair-wise amplitude panning for 5.1-channel surround sound
- •5.2.3 Global Ambisonic-like signal mixing for 5.1-channel sound
- •5.2.4 Optimization of three frontal loudspeaker signals and local Ambisonic-like signal mixing
- •5.2.5 Time panning for 5.1-channel surround sound
- •5.3 Other multichannel horizontal surround sounds
- •5.4 Low-frequency effect channel
- •5.5 Summary
- •6.1 Summing localization in multichannel spatial surround sound
- •6.1.1 Summing localization equations for spatial multiple loudspeaker configurations
- •6.1.2 Velocity and energy localization vector analysis for multichannel spatial surround sound
- •6.1.3 Discussion on spatial summing localization equations
- •6.1.4 Relationship with the horizontal summing localization equations
- •6.2 Signal mixing methods for a pair of vertical loudspeakers in the median and sagittal plane
- •6.3 Vector base amplitude panning
- •6.4 Spatial Ambisonic signal mixing and reproduction
- •6.4.1 Principle of spatial Ambisonics
- •6.4.2 Some examples of the first-order spatial Ambisonics
- •6.4.4 Recreating a top virtual source with a horizontal loudspeaker arrangement and Ambisonic signal mixing
- •6.5 Advanced multichannel spatial surround sounds and problems
- •6.5.1 Some advanced multichannel spatial surround sound techniques and systems
- •6.5.2 Object-based spatial sound
- •6.5.3 Some problems related to multichannel spatial surround sound
- •6.6 Summary
- •7.1 Basic considerations on the microphone and signal simulation techniques for multichannel sounds
- •7.2 Microphone techniques for 5.1-channel sound recording
- •7.2.1 Outline of microphone techniques for 5.1-channel sound recording
- •7.2.2 Main microphone techniques for 5.1-channel sound recording
- •7.2.3 Microphone techniques for the recording of three frontal channels
- •7.2.4 Microphone techniques for ambience recording and combination with frontal localization information recording
- •7.2.5 Stereophonic plus center channel recording
- •7.3 Microphone techniques for other multichannel sounds
- •7.3.1 Microphone techniques for other discrete multichannel sounds
- •7.3.2 Microphone techniques for Ambisonic recording
- •7.4 Simulation of localization signals for multichannel sounds
- •7.4.1 Methods of the simulation of directional localization signals
- •7.4.2 Simulation of virtual source distance and extension
- •7.4.3 Simulation of a moving virtual source
- •7.5 Simulation of reflections for stereophonic and multichannel sounds
- •7.5.1 Delay algorithms and discrete reflection simulation
- •7.5.2 IIR filter algorithm of late reverberation
- •7.5.3 FIR, hybrid FIR, and recursive filter algorithms of late reverberation
- •7.5.4 Algorithms of audio signal decorrelation
- •7.5.5 Simulation of room reflections based on physical measurement and calculation
- •7.6 Directional audio coding and multichannel sound signal synthesis
- •7.7 Summary
- •8.1 Matrix surround sound
- •8.1.1 Matrix quadraphone
- •8.1.2 Dolby Surround system
- •8.1.3 Dolby Pro-Logic decoding technique
- •8.1.4 Some developments on matrix surround sound and logic decoding techniques
- •8.2 Downmixing of multichannel sound signals
- •8.3 Upmixing of multichannel sound signals
- •8.3.1 Some considerations in upmixing
- •8.3.2 Simple upmixing methods for front-channel signals
- •8.3.3 Simple methods for Ambient component separation
- •8.3.4 Model and statistical characteristics of two-channel stereophonic signals
- •8.3.5 A scale-signal-based algorithm for upmixing
- •8.3.6 Upmixing algorithm based on principal component analysis
- •8.3.7 Algorithm based on the least mean square error for upmixing
- •8.3.8 Adaptive normalized algorithm based on the least mean square for upmixing
- •8.3.9 Some advanced upmixing algorithms
- •8.4 Summary
- •9.1 Each order approximation of ideal reproduction and Ambisonics
- •9.1.1 Each order approximation of ideal horizontal reproduction
- •9.1.2 Each order approximation of ideal three-dimensional reproduction
- •9.2 General formulation of multichannel sound field reconstruction
- •9.2.1 General formulation of multichannel sound field reconstruction in the spatial domain
- •9.2.2 Formulation of spatial-spectral domain analysis of circular secondary source array
- •9.2.3 Formulation of spatial-spectral domain analysis for a secondary source array on spherical surface
- •9.3 Spatial-spectral domain analysis and driving signals of Ambisonics
- •9.3.1 Reconstructed sound field of horizontal Ambisonics
- •9.3.2 Reconstructed sound field of spatial Ambisonics
- •9.3.3 Mixed-order Ambisonics
- •9.3.4 Near-field compensated higher-order Ambisonics
- •9.3.5 Ambisonic encoding of complex source information
- •9.3.6 Some special applications of spatial-spectral domain analysis of Ambisonics
- •9.4 Some problems related to Ambisonics
- •9.4.1 Secondary source array and stability of Ambisonics
- •9.4.2 Spatial transformation of Ambisonic sound field
- •9.5 Error analysis of Ambisonic-reconstructed sound field
- •9.5.1 Integral error of Ambisonic-reconstructed wavefront
- •9.5.2 Discrete secondary source array and spatial-spectral aliasing error in Ambisonics
- •9.6 Multichannel reconstructed sound field analysis in the spatial domain
- •9.6.1 Basic method for analysis in the spatial domain
- •9.6.2 Minimizing error in reconstructed sound field and summing localization equation
- •9.6.3 Multiple receiver position matching method and its relation to the mode-matching method
- •9.7 Listening room reflection compensation in multichannel sound reproduction
- •9.8 Microphone array for multichannel sound field signal recording
- •9.8.1 Circular microphone array for horizontal Ambisonic recording
- •9.8.2 Spherical microphone array for spatial Ambisonic recording
- •9.8.3 Discussion on microphone array recording
- •9.9 Summary
- •10.1 Basic principle and implementation of wave field synthesis
- •10.1.1 Kirchhoff–Helmholtz boundary integral and WFS
- •10.1.2 Simplification of the types of secondary sources
- •10.1.3 WFS in a horizontal plane with a linear array of secondary sources
- •10.1.4 Finite secondary source array and effect of spatial truncation
- •10.1.5 Discrete secondary source array and spatial aliasing
- •10.1.6 Some issues and related problems on WFS implementation
- •10.2 General theory of WFS
- •10.2.1 Green’s function of Helmholtz equation
- •10.2.2 General theory of three-dimensional WFS
- •10.2.3 General theory of two-dimensional WFS
- •10.2.4 Focused source in WFS
- •10.3 Analysis of WFS in the spatial-spectral domain
- •10.3.1 General formulation and analysis of WFS in the spatial-spectral domain
- •10.3.2 Analysis of the spatial aliasing in WFS
- •10.3.3 Spatial-spectral division method of WFS
- •10.4 Further discussion on sound field reconstruction
- •10.4.1 Comparison among various methods of sound field reconstruction
- •10.4.2 Further analysis of the relationship between acoustical holography and sound field reconstruction
- •10.4.3 Further analysis of the relationship between acoustical holography and Ambisonics
- •10.4.4 Comparison between WFS and Ambisonics
- •10.5 Equalization of WFS under nonideal conditions
- •10.6 Summary
- •11.1 Basic principles of binaural reproduction and virtual auditory display
- •11.1.1 Binaural recording and reproduction
- •11.1.2 Virtual auditory display
- •11.2 Acquisition of HRTFs
- •11.2.1 HRTF measurement
- •11.2.2 HRTF calculation
- •11.2.3 HRTF customization
- •11.3 Basic physical features of HRTFs
- •11.3.1 Time-domain features of far-field HRIRs
- •11.3.2 Frequency domain features of far-field HRTFs
- •11.3.3 Features of near-field HRTFs
- •11.4 HRTF-based filters for binaural synthesis
- •11.5 Spatial interpolation and decomposition of HRTFs
- •11.5.1 Directional interpolation of HRTFs
- •11.5.2 Spatial basis function decomposition and spatial sampling theorem of HRTFs
- •11.5.3 HRTF spatial interpolation and signal mixing for multichannel sound
- •11.5.4 Spectral shape basis function decomposition of HRTFs
- •11.6 Simplification of signal processing for binaural synthesis
- •11.6.1 Virtual loudspeaker-based algorithms
- •11.6.2 Basis function decomposition-based algorithms
- •11.7.1 Principle of headphone equalization
- •11.7.2 Some problems with binaural reproduction and VAD
- •11.8 Binaural reproduction through loudspeakers
- •11.8.1 Basic principle of binaural reproduction through loudspeakers
- •11.8.2 Virtual source distribution in two-front loudspeaker reproduction
- •11.8.3 Head movement and stability of virtual sources in Transaural reproduction
- •11.8.4 Timbre coloration and equalization in transaural reproduction
- •11.9 Virtual reproduction of stereophonic and multichannel surround sound
- •11.9.1 Binaural reproduction of stereophonic and multichannel sound through headphones
- •11.9.2 Stereophonic expansion and enhancement
- •11.9.3 Virtual reproduction of multichannel sound through loudspeakers
- •11.10.1 Binaural room modeling
- •11.10.2 Dynamic virtual auditory environments system
- •11.11 Summary
- •12.1 Physical analysis of binaural pressures in summing virtual source and auditory events
- •12.1.1 Evaluation of binaural pressures and localization cues
- •12.1.2 Method for summing localization analysis
- •12.1.3 Binaural pressure analysis of stereophonic and multichannel sound with amplitude panning
- •12.1.4 Analysis of summing localization with interchannel time difference
- •12.1.5 Analysis of summing localization at the off-central listening position
- •12.1.6 Analysis of interchannel correlation and spatial auditory sensations
- •12.2 Binaural auditory models and analysis of spatial sound reproduction
- •12.2.1 Analysis of lateral localization by using auditory models
- •12.2.2 Analysis of front-back and vertical localization by using a binaural auditory model
- •12.2.3 Binaural loudness models and analysis of the timbre of spatial sound reproduction
- •12.3 Binaural measurement system for assessing spatial sound reproduction
- •12.4 Summary
- •13.1 Analog audio storage and transmission
- •13.1.1 45°/45° Disk recording system
- •13.1.2 Analog magnetic tape audio recorder
- •13.1.3 Analog stereo broadcasting
- •13.2 Basic concepts of digital audio storage and transmission
- •13.3 Quantization noise and shaping
- •13.3.1 Signal-to-quantization noise ratio
- •13.3.2 Quantization noise shaping and 1-Bit DSD coding
- •13.4 Basic principle of digital audio compression and coding
- •13.4.1 Outline of digital audio compression and coding
- •13.4.2 Adaptive differential pulse-code modulation
- •13.4.3 Perceptual audio coding in the time-frequency domain
- •13.4.4 Vector quantization
- •13.4.5 Spatial audio coding
- •13.4.6 Spectral band replication
- •13.4.7 Entropy coding
- •13.4.8 Object-based audio coding
- •13.5 MPEG series of audio coding techniques and standards
- •13.5.1 MPEG-1 audio coding technique
- •13.5.2 MPEG-2 BC audio coding
- •13.5.3 MPEG-2 advanced audio coding
- •13.5.4 MPEG-4 audio coding
- •13.5.5 MPEG parametric coding of multichannel sound and unified speech and audio coding
- •13.5.6 MPEG-H 3D audio
- •13.6 Dolby series of coding techniques
- •13.6.1 Dolby digital coding technique
- •13.6.2 Some advanced Dolby coding techniques
- •13.7 DTS series of coding technique
- •13.8 MLP lossless coding technique
- •13.9 ATRAC technique
- •13.10 Audio video coding standard
- •13.11 Optical disks for audio storage
- •13.11.1 Structure, principle, and classification of optical disks
- •13.11.2 CD family and its audio formats
- •13.11.3 DVD family and its audio formats
- •13.11.4 SACD and its audio formats
- •13.11.5 BD and its audio formats
- •13.12 Digital radio and television broadcasting
- •13.12.1 Outline of digital radio and television broadcasting
- •13.12.2 Eureka-147 digital audio broadcasting
- •13.12.3 Digital radio mondiale
- •13.12.4 In-band on-channel digital audio broadcasting
- •13.12.5 Audio for digital television
- •13.13 Audio storage and transmission by personal computer
- •13.14 Summary
- •14.1 Outline of acoustic conditions and requirements for spatial sound intended for domestic reproduction
- •14.2 Acoustic consideration and design of listening rooms
- •14.3 Arrangement and characteristics of loudspeakers
- •14.3.1 Arrangement of the main loudspeakers in listening rooms
- •14.3.2 Characteristics of the main loudspeakers
- •14.3.3 Bass management and arrangement of subwoofers
- •14.4 Signal and listening level alignment
- •14.5 Standards and guidance for conditions of spatial sound reproduction
- •14.6 Headphones and binaural monitors of spatial sound reproduction
- •14.7 Acoustic conditions for cinema sound reproduction and monitoring
- •14.8 Summary
- •15.1 Outline of psychoacoustic and subjective assessment experiments
- •15.2 Contents and attributes for spatial sound assessment
- •15.3 Auditory comparison and discrimination experiment
- •15.3.1 Paradigms of auditory comparison and discrimination experiment
- •15.3.2 Examples of auditory comparison and discrimination experiment
- •15.4 Subjective assessment of small impairments in spatial sound systems
- •15.5 Subjective assessment of a spatial sound system with intermediate quality
- •15.6 Virtual source localization experiment
- •15.6.1 Basic methods for virtual source localization experiments
- •15.6.2 Preliminary analysis of the results of virtual source localization experiments
- •15.6.3 Some results of virtual source localization experiments
- •15.7 Summary
- •16.1.1 Application to commercial cinema and related problems
- •16.1.2 Applications to domestic reproduction and related problems
- •16.1.3 Applications to automobile audio
- •16.2.1 Applications to virtual reality
- •16.2.2 Applications to communication and information systems
- •16.2.3 Applications to multimedia
- •16.2.4 Applications to mobile and handheld devices
- •16.3 Applications to the scientific experiments of spatial hearing and psychoacoustics
- •16.4 Applications to sound field auralization
- •16.4.1 Auralization in room acoustics
- •16.4.2 Other applications of auralization technique
- •16.5 Applications to clinical medicine
- •16.6 Summary
- •References
- •Index

350 Spatial Sound
and two examples of formulation in a spatial-spectral domain are presented. In Section 9.3, the reconstructed sound field of Ambisonics in the spatial-spectral domain is analyzed in detail; the decoding equation and driving signals for arbitrary-order Ambisonics with various secondary source arrays or loudspeaker configurations are derived; the theorem of spatial sampling and reconstruction of the sound field is discussed; near-field compensated higher-order Ambisonics is addressed; and applications of spatial-spectral analysis to some Ambisonic-like techniques are outlined. In Section 9.4, the secondary source arrays and the stability of the Ambisonic sound field are analyzed, and some spatial transformations of the Ambisonic sound field are discussed. In Section 9.5, errors in the Ambisonic-reconstructed sound field are evaluated, and the problems of spatial aliasing caused by discrete secondary source arrays on a horizontal circle or a spherical surface are analyzed. In Section 9.6, the basic method of spatial-domain analysis of a multichannel sound field is introduced, and the method of matching multiple receiver positions and its relation to the mode-matching method are outlined. In Section 9.7, the problem of active compensation for reflections in a listening room in multichannel sound reproduction is addressed. In Section 9.8, microphone array techniques for sound field recording, especially Ambisonic recording, are discussed on the basis of the spatial sampling and reconstruction theorem of a sound field.
9.1 EACH ORDER APPROXIMATION OF IDEAL REPRODUCTION AND AMBISONICS
9.1.1 Each order approximation of ideal horizontal reproduction
In this section, the recorded and reproduced signals of Ambisonics are derived from each order approximation of an ideal reproduction (Xie and Xie, 1996). The case of horizontal reproduction is discussed first. Similar to the discussion in Section 3.1, the discussion here starts with an ideal horizontal reproduction in which an infinite number of loudspeakers are arranged uniformly and continuously on a circle around a listener or receiver region. If the radius r0 of the circle is large enough, the incident wave in a region near the center of the circle can be approximated as a superposition of plane waves from the loudspeakers. According to Equation (1.2.12), for an original or target plane wave with unit amplitude incident from a horizontal azimuth θS, the azimuthal distribution function of the complex-valued amplitude of the incident wave takes the form of a Dirac delta function, $P_A(\theta_{in}) = \delta(\theta_{in} - \theta_S)$, where θin is the azimuthal coordinate of the sound field. For an ideal reproduction, the azimuthal distribution function A(θ′, θS) of the normalized amplitude of the loudspeaker signals should match that of the target sound field, where θ′ is the azimuth of the continuous loudspeaker arrangement. Letting θ′ = θin, the normalized signal amplitude for the loudspeaker at an arbitrary azimuth θ′ is given as
$$A(\theta', \theta_S) = P_A(\theta') = \delta(\theta' - \theta_S). \qquad (9.1.1)$$
As in the preceding chapters, a unit transfer coefficient from the loudspeaker signal to the pressure amplitude of the free-field plane wave at the origin is assumed in Equation (9.1.1), i.e., $E_A = P_A$. Therefore, in an ideal reproduction, only the loudspeaker at azimuth θ′ = θS is active, and the other loudspeakers are inactive.
A(θ′, θS) is a periodic function of azimuth θS or θ′ with a period of 2π (360°), so it can be expanded into a complex- or real-valued Fourier series within the azimuthal region of (−π, π]:

Analysis of multichannel sound field recording and reconstruction 351
$$A(\theta', \theta_S) = \frac{1}{2\pi} \sum_{q=-\infty}^{+\infty} \exp(jq\theta')\exp(-jq\theta_S) = \frac{1}{2\pi}\left[1 + 2\sum_{q=1}^{\infty}\left(\cos q\theta' \cos q\theta_S + \sin q\theta' \sin q\theta_S\right)\right]. \qquad (9.1.2)$$
Therefore, the normalized amplitudes of the loudspeaker signals in an ideal reproduction can be decomposed into a linear combination of azimuthal harmonics {exp(jqθS)} or {cos(qθS), sin(qθS)} of infinite order. The complex- and real-valued Fourier expansions are mathematically equivalent. The case of the real-valued Fourier expansion is discussed in the following.
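As a numerical check on this decomposition, the complex- and real-valued forms of the truncated expansion in Equation (9.1.2) can be compared directly; a minimal sketch (the function names and the order Q = 3 are illustrative, not from the text):

```python
import numpy as np

# Truncated azimuthal Fourier expansion of the ideal panning function
# A(theta', theta_S) of Equation (9.1.2), complex- and real-valued forms.
def panning_complex(theta_p, theta_s, Q):
    q = np.arange(-Q, Q + 1)
    return np.real(np.sum(np.exp(1j * q * (theta_p - theta_s)))) / (2 * np.pi)

def panning_real(theta_p, theta_s, Q):
    q = np.arange(1, Q + 1)
    s = 1 + 2 * np.sum(np.cos(q * theta_p) * np.cos(q * theta_s)
                       + np.sin(q * theta_p) * np.sin(q * theta_s))
    return s / (2 * np.pi)

theta_s = np.deg2rad(30.0)  # target source azimuth
for theta_p in np.deg2rad([0.0, 30.0, 90.0, 180.0]):
    # Both forms agree to machine precision at every loudspeaker azimuth.
    assert abs(panning_complex(theta_p, theta_s, 3)
               - panning_real(theta_p, theta_s, 3)) < 1e-12
```

Both forms peak at θ′ = θS, consistent with the ideal case in which only the loudspeaker at the target azimuth is active.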
Equation (9.1.2) can be analyzed from the viewpoints of multichannel sound recording and reproduction to gain insight into its physical significance. From the viewpoint of multichannel sound reproduction, Equation (9.1.2) represents the azimuthal distribution of the normalized amplitude of the loudspeaker signals as a function of the azimuth θ′. It also represents the azimuthal distribution function of the normalized amplitude of the free-field plane waves incident to the center of the circle. As a zero-order approximation, the expansion in Equation (9.1.2) is truncated to the term of q = 0, and the other terms are omitted. Then, the normalized signal amplitude of the loudspeaker at an arbitrary azimuth θ′ is given as
$$A(\theta', \theta_S) = \frac{1}{2\pi}. \qquad (9.1.3)$$
Equation (9.1.3) is equivalent to feeding the mono signal captured by an omnidirectional microphone to all loudspeakers in reproduction. The sound pressure at the center of the circle is a superposition of plane waves with equal amplitude and phase from the continuous azimuthal directions of all loudspeakers. Therefore, zero-order reproduction cannot recreate the spatial information of a target plane wave, and it may create a perceived virtual source in the top direction, similar to the case in Section 6.4.3.
As the first-order approximation, the expansion in Equation (9.1.2) is truncated up to the term of q = 1, and higher terms are omitted. Then, the normalized signal amplitude of loudspeakers at arbitrary azimuth θ′ is expressed as
$$A(\theta', \theta_S) = \frac{1}{2\pi}\left(1 + 2\cos\theta'\cos\theta_S + 2\sin\theta'\sin\theta_S\right). \qquad (9.1.4)$$
Except for the difference in overall gain, Equation (9.1.4) is directly proportional to the conventional solution of the normalized amplitude of loudspeaker signals for first-order horizontal Ambisonics in Equations (4.3.15) and (4.3.25). Variations in the azimuthal distribution function or horizontal polar pattern of the normalized amplitude with the difference θS − θ′ = θS − θi between the target and loudspeaker azimuths are illustrated in Section 4.1.3 and Figures 4.4, 4.5, and 4.17. For an arbitrary loudspeaker, the normalized signal magnitude is maximal when the loudspeaker azimuth coincides with the target source azimuth at θ′ = θS. As the loudspeaker azimuth deviates from the target source azimuth, the normalized signal magnitude decreases and gradually vanishes. As the loudspeaker azimuth deviates further from the target source azimuth, weak crosstalk with reversed phase occurs in the loudspeakers close to the direction opposite the target source. As proven in Equation (4.3.27), at the central listening position and low frequencies, the perceived virtual source direction in first-order Ambisonic reproduction matches that of the target source, both for a fixed head and for a head oriented toward the virtual source.
As the second-order approximation, the expansion in Equation (9.1.2) is truncated up to the term of q = 2, and the higher terms are omitted. Then, the normalized signal amplitude of loudspeakers at the arbitrary azimuth θ′ is expressed as
$$A(\theta', \theta_S) = \frac{1}{2\pi}\left(1 + 2\cos\theta'\cos\theta_S + 2\sin\theta'\sin\theta_S + 2\cos 2\theta'\cos 2\theta_S + 2\sin 2\theta'\sin 2\theta_S\right). \qquad (9.1.5)$$
Except for the difference in overall gain, Equation (9.1.5) is directly proportional to the conventional solution of the normalized amplitude of loudspeaker signals for second-order horizontal Ambisonics in Equations (4.3.53) and (4.3.62). The variation in the azimuthal distribution function or horizontal polar pattern of the normalized amplitude with the difference θS − θ′ = θS − θi between the target and loudspeaker azimuths is illustrated in Figure 4.17. Compared with first-order reproduction, the relative signal magnitude of the loudspeaker nearest the target source azimuth increases, and the relative signal magnitudes (crosstalk) of the other loudspeakers decrease. Therefore, the incident power in reproduction is more focused on the target azimuth θ′ = θS, and the reproduction of directional information improves.
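The sharpening from first to second order can be illustrated numerically. In this sketch (helper name illustrative), the signal amplitude at the target azimuth grows with the order while the opposite-direction crosstalk stays weak, and the first-order crosstalk carries reversed phase, as noted above:

```python
import numpy as np

# Normalized loudspeaker amplitude truncated at order Q, using the compact
# form A = [1 + 2*sum_q cos q(theta' - theta_S)] / (2*pi).
def amplitude(theta_p, theta_s, Q):
    q = np.arange(1, Q + 1)
    return (1 + 2 * np.sum(np.cos(q * (theta_p - theta_s)))) / (2 * np.pi)

theta_s = 0.0
at_target = [amplitude(theta_s, theta_s, Q) for Q in (1, 2)]  # theta' = theta_S
opposite = [amplitude(np.pi, theta_s, Q) for Q in (1, 2)]     # opposite azimuth
# at_target grows with Q; first-order opposite-side crosstalk is negative
# (reversed phase), and crosstalk stays small relative to the peak.
```

The ratio of the peak to the opposite-side magnitude grows from 3 at first order to 5 at second order, illustrating the focusing of incident power on the target azimuth.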
When the expansion in Equation (9.1.2) is truncated up to the term of q = 3 or higher, the conventional solution of loudspeaker signals for third- or higher-order horizontal Ambisonics is obtained. As illustrated in Section 4.3.3 and Figure 4.17, as the order increases, the relative signal magnitude of the loudspeaker consistent with the target source direction increases, and the relative signal magnitudes (crosstalk) of the other loudspeakers decrease. The incident power in reproduction then becomes gradually focused on the target azimuth θ′ = θS. In other words, as the order increases, the approximated reproduction approaches the ideal reproduction, and the reproduction of spatial information gradually improves. When the order in Equation (9.1.2) tends to infinity, the approximated reproduction reaches the limit of ideal reproduction. Generally, when the expansion in Equation (9.1.2) is truncated at an arbitrary order Q, the normalized amplitude of the loudspeaker signals is expressed as
$$A(\theta', \theta_S) = \frac{1}{2\pi}\left[1 + 2\sum_{q=1}^{Q}\left(\cos q\theta'\cos q\theta_S + \sin q\theta'\sin q\theta_S\right)\right] = \frac{1}{2\pi}\left[1 + 2\sum_{q=1}^{Q}\cos q(\theta' - \theta_S)\right]. \qquad (9.1.6)$$
From the viewpoint of multichannel sound recording, the arbitrary Q ≥ 1 order Ambisonic signals A(θ′, θS) in Equation (9.1.6) are linear combinations of (2Q + 1) independent signals or azimuthal harmonics. These independent signals can theoretically be recorded using (2Q + 1) coincident directional microphones. As stated in Sections 4.3.2 and 4.3.3, 1, cos θS, and sin θS are three normalized signals recorded with an omnidirectional microphone and two bidirectional microphones with their main axes pointing to the front and left directions, respectively; cos qθS and sin qθS (q ≥ 2) are normalized signals recorded with higher-order directional microphones. As the polar patterns of the first three orders of the normalized signal amplitude A(θ′, θS) in Figure 4.17 suggest, horizontal Ambisonic signal recording can be regarded as a horizontal beamforming method. Beamforming enhances the recorded output at the target azimuth θin = θ′ = θS and restrains the outputs at other azimuths, where θ′ is a parameter of the beam direction. As the order Q increases, the beam becomes sharper, thereby improving the azimuthal resolution of recording. When the order Q tends to infinity, Equation (9.1.1) or (9.1.2) represents recording with ideal beamforming. Equation (9.1.6) indicates that the horizontal beam direction can be steered to an arbitrary azimuth by changing θ′ without altering the beam shape. This feature is common to horizontal Ambisonics.
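The steering property of Equation (9.1.6), i.e., moving the beam by changing θ′ without altering its shape, can be verified numerically; a sketch assuming a dense azimuthal grid (names illustrative):

```python
import numpy as np

# Beam pattern of Equation (9.1.6) evaluated on a dense azimuthal grid.
# The pattern depends only on the difference (theta' - theta_S), so
# steering the beam is a pure rotation of the pattern.
def pattern(delta, Q):
    q = np.arange(1, Q + 1)
    return (1 + 2 * np.sum(np.cos(np.outer(delta, q)), axis=1)) / (2 * np.pi)

grid = np.linspace(-np.pi, np.pi, 720, endpoint=False)  # 0.5-degree steps
Q = 3
beam_0 = pattern(grid - 0.0, Q)                 # beam steered to 0 degrees
beam_60 = pattern(grid - np.deg2rad(60.0), Q)   # beam steered to 60 degrees
shift = np.argmax(beam_60) - np.argmax(beam_0)  # 60 degrees = 120 grid steps
assert np.allclose(beam_60, np.roll(beam_0, shift))  # same shape, rotated
```

Because the pattern is 2π-periodic, the rotated pattern matches a circular shift of the original exactly, confirming shape-preserving beam steering.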
The reconstruction of a plane wave with unit amplitude incident from a horizontal direction is discussed above. For a plane wave with arbitrary amplitude PA(f), Equation (9.1.1) is the azimuthal distribution function of the incident plane wave amplitude after normalization by a factor of PA(f). Therefore, the actual loudspeaker signals are obtained by multiplying the normalized amplitude A(θ′, θS) by a signal waveform EA(f) = PA(f) in the frequency domain. According to Equation (1.2.12), an arbitrary sound field in a source-free region can be decomposed into a linear superposition of plane waves from various directions. In this case, the azimuthal distribution function $P_A(\theta_{in}, f)$ of the incident plane wave amplitudes with respect to the origin is no longer a Dirac delta function. If a set of the aforementioned coincident directional microphones is used to capture the sound field signals, the resultant microphone outputs are the superposition of the contributions of plane waves from all directions. For example, for ideal microphones in which the transfer functions from the incident plane wave amplitude to the microphone outputs are unity, the normalized output amplitudes of the omnidirectional microphone and the two bidirectional microphones are given as
$$W_\Sigma = \int_{-\pi}^{\pi} P_A(\theta_{in}, f)\, d\theta_{in}, \quad X_\Sigma = \int_{-\pi}^{\pi} P_A(\theta_{in}, f)\cos\theta_{in}\, d\theta_{in}, \quad Y_\Sigma = \int_{-\pi}^{\pi} P_A(\theta_{in}, f)\sin\theta_{in}\, d\theta_{in}, \qquad (9.1.7)$$
where the subscript "Σ" denotes the outputs caused by the superposition of plane waves. When the outputs from the microphones are decoded using Equation (9.1.6), the unnormalized signal amplitude of the loudspeaker at θ′ is expressed as
$$E(\theta', f) = \frac{1}{2\pi}\left[\int_{-\pi}^{\pi} P_A(\theta_{in}, f)\, d\theta_{in} + 2\sum_{q=1}^{Q}\left(\cos q\theta' \int_{-\pi}^{\pi} P_A(\theta_{in}, f)\cos q\theta_{in}\, d\theta_{in} + \sin q\theta' \int_{-\pi}^{\pi} P_A(\theta_{in}, f)\sin q\theta_{in}\, d\theta_{in}\right)\right]$$
$$= P_{A1,0}(f) + \sum_{q=1}^{Q}\left[P_{A1,q}(f)\cos q\theta' + P_{A2,q}(f)\sin q\theta'\right], \qquad (9.1.8)$$
where θin is substituted by θ′ in the second equality of Equation (9.1.8). Equation (9.1.8) represents a Q ≥ 1 order truncation of the azimuthal Fourier expansion of the azimuthal distribution function $P_A(\theta_{in}, f)$ of the unnormalized amplitude of the incident plane wave. The coefficients of the azimuthal Fourier expansion are given as
$$P_{A1,0}(f) = \frac{1}{2\pi} \int_{-\pi}^{\pi} P_A(\theta_{in}, f)\, d\theta_{in},$$
$$P_{A1,q}(f) = \frac{1}{\pi} \int_{-\pi}^{\pi} P_A(\theta_{in}, f) \cos q\theta_{in}\, d\theta_{in}, \qquad q = 1, 2 \ldots Q, \qquad (9.1.9)$$
$$P_{A2,q}(f) = \frac{1}{\pi} \int_{-\pi}^{\pi} P_A(\theta_{in}, f) \sin q\theta_{in}\, d\theta_{in}.$$
As Equation (9.1.8) shows, for an arbitrary target sound field, the Q-order Ambisonic independent or encoding signals can theoretically be recorded by (2Q + 1) coincident microphones with directional characteristics of different orders. The azimuthal harmonic components of the target sound field up to order Q can be recovered from these microphone outputs after decoding. If the target or incident sound field is spatially bandlimited, i.e., all the q > Q order azimuthal harmonics in the azimuthal Fourier expansion of the azimuthal distribution function $P_A(\theta_{in}, f)$ of the incident plane wave amplitude vanish, the target sound field can be recovered exactly from the Q-order Ambisonic signals at the decoding outputs. The corresponding equations in the time domain can be obtained by applying an inverse Fourier transform, similar to that in Equation (1.2.13), to the above equations in the frequency domain.
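The exact-recovery claim for spatially bandlimited fields can be checked numerically: compute the coefficients of Equation (9.1.9) by numerical integration and resynthesize via the second line of Equation (9.1.8). A sketch with a hypothetical bandlimited distribution (the coefficients 0.5, 0.8, and −0.3 are chosen arbitrarily for illustration):

```python
import numpy as np

# A hypothetical azimuthal amplitude distribution bandlimited to order Q = 2.
Q = 2
theta = np.linspace(-np.pi, np.pi, 4096, endpoint=False)
dtheta = theta[1] - theta[0]
P = 0.5 + 0.8 * np.cos(theta) - 0.3 * np.sin(2 * theta)

# Azimuthal Fourier coefficients of Equation (9.1.9) by numerical
# integration, followed by Q-order resynthesis as in Equation (9.1.8).
rec = np.full_like(theta, np.sum(P) * dtheta / (2 * np.pi))  # P_A1,0 term
for q in range(1, Q + 1):
    a_q = np.sum(P * np.cos(q * theta)) * dtheta / np.pi     # P_A1,q
    b_q = np.sum(P * np.sin(q * theta)) * dtheta / np.pi     # P_A2,q
    rec += a_q * np.cos(q * theta) + b_q * np.sin(q * theta)

# The Q-order resynthesis recovers the bandlimited field essentially exactly.
assert np.max(np.abs(rec - P)) < 1e-10
```

On a uniform grid with far more samples than 2Q + 1, the rectangle rule is spectrally accurate for trigonometric polynomials, so the recovery error is at machine-precision level.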
The Q-order circular sinc function is defined as
csin(θ′, θ_in, Q) = 1 + 2 Σ_{q=1}^{Q} (cos qθ′ cos qθ_in + sin qθ′ sin qθ_in)

= sin[(2Q + 1)(θ′ − θ_in)/2] / sin[(θ′ − θ_in)/2].   (9.1.10)
The magnitude of the circular sinc function is maximal at θ′ = θ_in and spreads to the two sides around the center θ′ = θ_in. Except for a constant gain, the polar patterns of the Q = 1, 2, and 3 order circular sinc functions are identical to those in Figure 4.17. The first equality on the right side of Equation (9.1.8) can then be written as
E(θ′, f) = (1/2π) ∫₀^{2π} P_A(θ_in, f) csin(θ′, θ_in, Q) dθ_in.   (9.1.11)
Equation (9.1.11) indicates that the Q-order reproduced signals are obtained by multiplying the azimuthal distribution P_A(θ_in, f) of the incident plane wave amplitude by a Q-order circular sinc function (a Q-order azimuthal sampling function) and then superposing (integrating) over all azimuths. When Q tends to infinity, the circular sinc function approaches the Dirac delta function:
lim_{Q→∞} (1/2π) csin(θ′, θ_in, Q) = δ(θ′ − θ_in).   (9.1.12)

Analysis of multichannel sound field recording and reconstruction 355
In this case, Equation (9.1.11) approaches the limit of ideal sampling:
E(θ′, f) = P_A(θ′, f) = ∫₀^{2π} P_A(θ_in, f) δ(θ′ − θ_in) dθ_in.   (9.1.13)
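The closed form in Equation (9.1.10) and the limiting behavior behind Equation (9.1.12) can be checked numerically; the Python sketch below (illustrative only) compares the truncated series with the Dirichlet-kernel form and shows the main-lobe peak value 2Q + 1 growing with Q:

```python
import numpy as np

Q = 3
# theta' - theta_in, sampled away from 0 to avoid the removable singularity
dth = np.linspace(0.05, 2*np.pi - 0.05, 400)
q = np.arange(1, Q + 1)

# Truncated-series form of Equation (9.1.10)
series = 1.0 + 2.0*np.cos(np.outer(dth, q)).sum(axis=1)
# Closed (Dirichlet-kernel) form of Equation (9.1.10)
closed = np.sin((2*Q + 1)*dth/2) / np.sin(dth/2)

print(np.allclose(series, closed))   # the two forms agree: True

# At theta' = theta_in the kernel value is 2Q + 1, so the main lobe grows
# (and narrows) as Q increases, approaching a delta-like concentration
for order in (1, 4, 16):
    peak = 1.0 + 2.0*order           # series value with every cosine equal to 1
    print(order, peak)               # 3, 9, 33
```

The narrowing main lobe is exactly the mechanism by which higher-order reproduction concentrates the decoded energy around the incidence azimuth.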
In the preceding discussion, a continuous and uniform configuration with an infinite number of loudspeakers around the listener is assumed in reproduction. However, a finite number of loudspeakers is used in practical reproduction. For simplicity, the case of a uniform configuration with M loudspeakers arranged on a circle is discussed here. Let θ_i, i = 0, 1…(M − 1), denote the azimuth of the ith loudspeaker. For a target plane wave with unit amplitude incident from azimuth θ_S, the M loudspeaker signals are equivalent to M azimuthal samples of A(θ′, θ_S) in Equation (9.1.6) for the continuous loudspeaker configuration. The maximal allowable azimuthal interval between loudspeakers, or equivalently the minimal number of loudspeakers required for Q-order reproduction, cannot be evaluated with Equation (9.1.6); this parameter is derived from the analysis of the reproduced sound field in Section 9.3. However, the overall signal gain for reproduction with M loudspeakers can be derived by analyzing the sound pressure at the origin (the center of the circle). For a continuous and uniform configuration with an infinite number of loudspeakers, the normalized amplitudes of the loudspeaker signals for arbitrary Q-order Ambisonic reproduction are expressed in Equation (9.1.6); the reproduced sound pressure at the origin in the frequency domain is given as
P = ∫₀^{2π} A(θ′, θ_S) dθ′.   (9.1.14)
For both finite Q-order and infinite-order (ideal) reproduction, the result of the integral is
P = P_A = 1.   (9.1.15)
In this case, the amplitude of the reproduced sound pressure at the origin is exactly equal to that of the target sound field.
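This unit overall gain can be verified numerically. The Python sketch below is illustrative only; it assumes that A(θ′, θ_S) takes the (1/2π)·csin form implied by Equations (9.1.8) and (9.1.11), and evaluates Equation (9.1.14) for several orders Q:

```python
import numpy as np

def A(theta, theta_S, Q):
    """Assumed Q-order continuous decoding function: csin(theta, theta_S, Q)/(2*pi)."""
    q = np.arange(1, Q + 1)
    return (1.0 + 2.0*np.cos(np.outer(theta - theta_S, q)).sum(axis=1)) / (2*np.pi)

n = 3600
theta = np.linspace(0.0, 2*np.pi, n, endpoint=False)
for Q in (1, 2, 3, 5):
    # Rectangle-rule evaluation of the integral in Equation (9.1.14)
    P = A(theta, 0.7, Q).sum() * (2*np.pi/n)
    print(Q, round(P, 10))   # P = 1.0 for every Q, as in Equation (9.1.15)
```

The cosine terms integrate to zero over a full circle, so only the zeroth-order term contributes; this is why the overall gain is independent of Q.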
For the Q-order Ambisonic reproduction with M loudspeakers, the normalized amplitudes of the loudspeaker signals are the samples of A(θ′, θ_S) in Equation (9.1.6) at the M loudspeaker directions θ′ = θ_i. Accordingly, the integral over the continuous azimuth θ′ is replaced by a summation over the discrete azimuths θ_i. For a uniform configuration with M loudspeakers, the interval between adjacent loudspeakers is Δθ′ = 2π/M, and Equation (9.1.14) becomes
P = (2π/M) Σ_{i=0}^{M−1} A(θ_i, θ_S)

= Σ_{i=0}^{M−1} (1/M) [1 + 2 Σ_{q=1}^{Q} (cos qθ_i cos qθ_S + sin qθ_i sin qθ_S)].   (9.1.16)

The normalized amplitude of the actual signal of the ith loudspeaker is Ai(θS). For a target plane wave with a unit amplitude, the reproduced sound pressure at the origin should satisfy the following equation:
P = Σ_{i=0}^{M−1} A_i(θ_S) = 1.   (9.1.17)
Comparing Equations (9.1.16) and (9.1.17) yields
A_i(θ_S) = (1/M) [1 + 2 Σ_{q=1}^{Q} (cos qθ_i cos qθ_S + sin qθ_i sin qθ_S)]

= (1/M) [1 + 2 Σ_{q=1}^{Q} cos q(θ_S − θ_i)]   (9.1.18)

= (1/M) sin[(2Q + 1)(θ_S − θ_i)/2] / sin[(θ_S − θ_i)/2].
Equation (9.1.18) gives the amplitudes of the reproduced signals for the Q-order Ambisonics with the constant-amplitude normalization of Equations (4.3.63) and (4.3.80a). Therefore, for reproduction with a finite number of uniformly configured horizontal loudspeakers, the Q-order approximation of ideal reproduction leads to the Ambisonic decoding equation, the amplitudes of the reproduced signals, and the normalization factor for the overall amplitude. Note that the normalized signal amplitude of the ith loudspeaker in the discrete configuration cannot be obtained directly by letting θ′ = θ_i in A(θ′, θ_S) for the continuous configuration in Equation (9.1.6); an overall gain and normalization factor must be supplemented.
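As a quick numerical check (an illustrative Python sketch, not from the text), the series and closed forms of Equation (9.1.18) agree, and the loudspeaker gains sum to unity as required by Equation (9.1.17):

```python
import numpy as np

M, Q, theta_S = 8, 3, 0.9          # M >= 2Q + 1 uniformly placed loudspeakers
theta_i = 2*np.pi*np.arange(M)/M   # loudspeaker azimuths on the circle
q = np.arange(1, Q + 1)

# Series form of Equation (9.1.18)
gains = (1.0 + 2.0*np.cos(np.outer(theta_S - theta_i, q)).sum(axis=1)) / M
# Closed (Dirichlet) form of Equation (9.1.18); theta_S avoids every theta_i here
closed = np.sin((2*Q + 1)*(theta_S - theta_i)/2) / (M*np.sin((theta_S - theta_i)/2))

print(np.allclose(gains, closed))        # True
print(abs(gains.sum() - 1.0) < 1e-12)    # Equation (9.1.17) holds: True
```

The unit sum of the gains is exactly the supplemented overall normalization discussed above: sampling the continuous A(θ′, θ_S) alone would not deliver it without the 1/M factor.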
In conclusion, the following statements have been mathematically proven:
1. An ideal horizontal reproduction requires an infinite number of loudspeakers arranged continuously and uniformly on a circle with a far-field radius.
2. The loudspeaker signals for an ideal reproduction can be expanded into an azimuthal Fourier series. The Q-order approximation of the azimuthal Fourier expansion is equivalent to the Q-order horizontal Ambisonic reproduced signals, which are a linear combination of (2Q + 1) independent or encoded signals. The decoding equation and reproduced signals of the conventional Ambisonic solution are a natural consequence of each order of approximation of the ideal reproduced signals.
3. As the order Q of Ambisonics increases, the approximated reproduction gradually approaches ideal reproduction, but higher-order reproduction requires more independent signals and becomes more complicated.
4. The independent signals of the Q-order Ambisonics, which comprise the azimuthal harmonic components of the target sound field up to order Q, can theoretically be recorded by (2Q + 1) coincident microphones with appropriate directivities. Decoding can be regarded as beamforming processing.
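The beamforming view in statement 4 can be sketched as follows (an illustrative Python fragment; the encoded-signal convention W = 1, X_q = cos qθ_S, Y_q = sin qθ_S for a unit plane wave is an assumption adopted here for simplicity). Steering the decoded output over candidate azimuths yields a Q-order beam pattern that peaks at the source direction:

```python
import numpy as np

# Encoded signals of a single unit plane wave arriving from theta_S
Q, theta_S = 3, 2.1
q = np.arange(1, Q + 1)
W, X, Y = 1.0, np.cos(q*theta_S), np.sin(q*theta_S)

# Decode toward every candidate azimuth: this is a Q-order beamformer,
# equivalent to evaluating the circular sinc kernel centered at theta_S
theta = np.linspace(0.0, 2*np.pi, 3600, endpoint=False)
E = W + 2.0*(np.cos(np.outer(theta, q)) @ X + np.sin(np.outer(theta, q)) @ Y)

print(abs(theta[np.argmax(E)] - theta_S) < 0.01)   # beam peaks at theta_S: True
```

The resulting pattern is just csin(θ, θ_S, Q), so the beamwidth shrinks as the order Q, and hence the number of coincident microphones, increases.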