

Analysis of multichannel sound field recording and reconstruction 359
normalized so that the integral of their squared norm over direction is unity. This normalization is mathematically convenient, but it differs from the normalization of signals W, X, Y, and Z in Equation (6.4.1), where the maximal magnitudes of W, X, Y, and Z are normalized to unity to avoid signal overload in practice. Independent signals or spherical harmonic functions can be normalized in various ways. Independent signals with different normalizations are mathematically equivalent except for gain factors, so the normalization in use should be noted to avoid confusion (Charpentier, 2017).
Similar to the case of horizontal recording, Equation (9.1.25) can be regarded as a recording method on the basis of directional beamforming. According to the summation formula of spherical harmonic functions given in Equation (A.17) in Appendix A, Equation (9.1.25) can be written as
$$
A(\Omega', \Omega_S) = \frac{1}{4\pi} \sum_{l=0}^{L-1} (2l + 1)\, P_l[\cos(\Delta\Omega')], \tag{9.1.26}
$$
where Pl[cos(ΔΩ′)] is the Legendre polynomial of order l, and ΔΩ′ is the angle between Ω′ and ΩS. As the order L − 1 increases, the directional pattern of beamforming becomes sharper; accordingly, Equation (9.1.26) approaches the loudspeaker signals for ideal reproduction. However, the number of independent or encoded signals also increases with the order, and the system becomes more complicated. When the order (L − 1) tends to infinity, Equation (9.1.25) or (9.1.26) reaches the case of recording with ideal beamforming given in Equation (9.1.20). Equations (9.1.25) and (9.1.26) indicate that the directional beam can be steered to an arbitrary direction without altering the beam shape by changing Ω′. This feature is common to spatial Ambisonics.
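The sharpening and steering behavior of the truncated Legendre sum in Equation (9.1.26) can be illustrated numerically. The following is a minimal sketch, not a definitive implementation: the function names, the (azimuth, elevation) convention in radians, and the chosen test angles are assumptions made for illustration only.

```python
import math


def legendre_values(order, x):
    """Return [P_0(x), ..., P_order(x)] using the Bonnet recurrence."""
    values = [1.0, x]
    for l in range(1, order):
        # (l + 1) P_{l+1}(x) = (2l + 1) x P_l(x) - l P_{l-1}(x)
        values.append(((2 * l + 1) * x * values[l] - l * values[l - 1]) / (l + 1))
    return values[: order + 1]


def cos_angle_between(az1, el1, az2, el2):
    """Cosine of the angle between two directions given as azimuth/elevation."""
    return (math.sin(el1) * math.sin(el2)
            + math.cos(el1) * math.cos(el2) * math.cos(az1 - az2))


def beam_response(look, source, order):
    """Truncated sum of Equation (9.1.26) with L - 1 = order.

    `look` plays the role of the steering direction and `source` the
    role of the source direction; both are (azimuth, elevation) tuples.
    """
    x = cos_angle_between(*look, *source)
    p = legendre_values(order, x)
    return sum((2 * l + 1) * p[l] for l in range(order + 1)) / (4.0 * math.pi)


def normalized_response(offset_deg, order):
    """Response at a horizontal offset, relative to the on-axis response."""
    on_axis = beam_response((0.0, 0.0), (0.0, 0.0), order)
    off_axis = beam_response((0.0, 0.0), (math.radians(offset_deg), 0.0), order)
    return off_axis / on_axis


if __name__ == "__main__":
    for order in (1, 3, 5):
        print(order, normalized_response(30.0, order))
```

Because the response depends only on the angle between the two directions, rotating the steering direction and the source direction together leaves the value unchanged, which is the steering property stated above. The relative response at a fixed offset falls as the order grows, which corresponds to the sharper beam.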
Similar to the case in a horizontal sound field, an arbitrary spatial sound field in a source-free region can be decomposed as a linear superposition of plane waves from various directions. In this case, the directional distribution function PA(Ωin, f) of incident plane wave amplitudes with respect to the origin is no longer a Dirac delta function. Spatial Ambisonic recording can also be regarded as a directional sampling of PA(Ωin, f). The outputs of directional microphones of various orders are the spherical harmonic components of PA(Ωin, f). When the order (L − 1) tends to infinity, the loudspeaker signals in reproduction approach the limit of a Dirac delta function.
The discrete configuration with a finite number of loudspeakers is used in practical reproduction. The loudspeaker configurations of spatial Ambisonics are more complex than those of horizontal Ambisonics. In many practical cases, the decoding equations and reproduced signals of higher-order spatial Ambisonics cannot be derived directly from Equation (9.1.25). Instead, they should be derived through sound field analysis in Sections 9.2 and 9.3.
9.2 GENERAL FORMULATION OF MULTICHANNEL SOUND FIELD RECONSTRUCTION
9.2.1 General formulation of multichannel sound field reconstruction in the spatial domain
As stated in Sections 1.9.1 and 3.1, ideal spatial sound reproduction can be achieved through the physical reconstruction of the target sound field within a region. Sound field reconstruction is an important aspect of spatial sound. It is closely related not only to conventional multichannel sound techniques but also to some methods in other fields, such as active noise

360 Spatial Sound
control and acoustical holography (Fazi and Nelson, 2010, 2013). Therefore, analysis of the reconstructed sound field is important both for evaluating the accuracy, or extent of approximation, of practical multichannel sound reproduction and for developing new reproduction techniques. Moreover, the mathematical expressions and analysis methods of sound field reconstruction vary across the literature, although they are actually equivalent (Ahrens and Spors, 2008b; Poletti, 2005b). To unify them, a general formulation of multichannel sound field reconstruction is presented in this section and in Sections 9.2.2 and 9.2.3. The formulation of sound field reconstruction in the spatial domain, or more strictly in the frequency and spatial domain, is discussed first because a sound field can be represented by sound pressure as a function of receiver position and frequency.
Some terms are introduced first. In the analysis of sound field reconstruction, including the analyses in this chapter and those on acoustical holography and wave field synthesis in Chapter 10, the terms “secondary sources” and “driving signals” are used to represent the ideal sound sources of reproduction and the signals of these ideal sound sources, respectively. Accordingly, the term “secondary source array” is used to represent the configuration of secondary sources. In practice, the ideal sound sources are approximately realized by loudspeakers. Therefore, the term “loudspeaker” usually refers to a practical secondary source, the corresponding configuration is called the “loudspeaker configuration,” and the corresponding driving signals are termed “loudspeaker signals.” In the following discussion, these terms are used interchangeably.
Suppose that secondary sources are arranged continuously and uniformly in space, that the frequency domain driving signal for the secondary source at position r′ is E(r′, f), and that the frequency domain transfer function from the secondary source to an arbitrary receiver position r is G(r, r′, f). The sound pressure at the receiver position is a superposition of the pressures caused by all secondary sources:
$$
P(\mathbf{r}, f) = \int G(\mathbf{r}, \mathbf{r}', f)\, E(\mathbf{r}', f)\, d\mathbf{r}'. \tag{9.2.1}
$$
The integral in Equation (9.2.1) is calculated over the whole region of the secondary source distribution.
Under the free-field condition, the transfer function G(r, r′, f) in Equation (9.2.1) depends on the physical characteristics or radiation pattern of the secondary sources. If the secondary sources are point sources with unit strength in the free field, G(r, r′, f) is the frequency domain sound pressure at the receiver position r caused by the source at r′ and is termed the free-field Green’s function in a three-dimensional space (and frequency domain). Letting Qp(f) = 1 and replacing rS with r′ in Equation (1.2.3) yields
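$$
G_{\text{free}}^{3D}(\mathbf{r}, \mathbf{r}', f) = G_{\text{free}}^{3D}(|\mathbf{r} - \mathbf{r}'|, f) = \frac{1}{4\pi |\mathbf{r} - \mathbf{r}'|} \exp[-j\mathbf{k} \cdot (\mathbf{r} - \mathbf{r}')] = \frac{1}{4\pi |\mathbf{r} - \mathbf{r}'|} \exp(-jk|\mathbf{r} - \mathbf{r}'|), \tag{9.2.2}
$$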
where the superscript “3D” denotes the case of a point source in a three-dimensional space, and the subscript “free” denotes the free field. In the following discussion, the superscript and subscript are retained or omitted depending on the context. Equation (9.2.2) indicates that $G_{\text{free}}^{3D}(\mathbf{r}, \mathbf{r}', f)$ depends only on the relative distance |r − r′| between the sound source and the receiver position. $G_{\text{free}}^{3D}(\mathbf{r}, \mathbf{r}', f)$ is also invariant under an exchange of r and r′. This invariance is a consequence of the acoustic principle of reciprocity.
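The two properties just stated, dependence on the relative distance only and invariance under an exchange of r and r′, can be checked with a few lines of code. This is a minimal sketch under stated assumptions: the function name `green_free_3d`, the speed of sound of 343 m/s, and the exp(−jk|r − r′|) sign convention for an outgoing wave are illustrative choices, not taken from the text.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed value for illustration


def green_free_3d(r, r_src, f, c=SPEED_OF_SOUND):
    """Free-field point-source transfer function in the form of Equation (9.2.2).

    r and r_src are (x, y, z) tuples in meters, f is the frequency in Hz.
    The outgoing-wave sign convention exp(-j k d) is an assumption.
    """
    k = 2.0 * math.pi * f / c          # wave number
    d = math.dist(r, r_src)            # relative distance |r - r'|
    return cmath.exp(-1j * k * d) / (4.0 * math.pi * d)


if __name__ == "__main__":
    receiver = (0.2, -0.1, 0.0)
    source = (1.5, 2.0, 0.3)
    shift = (0.7, 0.7, 0.7)

    g = green_free_3d(receiver, source, 1000.0)
    g_swapped = green_free_3d(source, receiver, 1000.0)
    g_shifted = green_free_3d(
        tuple(a + s for a, s in zip(receiver, shift)),
        tuple(a + s for a, s in zip(source, shift)),
        1000.0,
    )
    print(abs(g - g_swapped), abs(g - g_shifted))
```

Swapping the two positions or translating both by the same vector leaves the value unchanged up to rounding, because only the distance between the points enters the expression.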
In the local region of a far field where the distance between the receiver position and the source is large enough, the spherical wave caused by a point source can be approximated as a plane wave. In this case, the secondary source can be modeled by a plane wave source in the free field. If the strength of a point source is chosen to be Qp = 4π r′ or if PA(f) = 1 in Equation (1.2.3), the transfer function from a free-field plane wave source to a region close to the origin is given as
$$
G_{\text{free}}^{pl}(\mathbf{r}, \mathbf{r}', f) = \exp[-j\mathbf{k} \cdot (\mathbf{r} - \mathbf{r}')], \tag{9.2.3}
$$
where the superscript “pl” denotes the case of a plane wave source. By choosing an appropriate initial phase of the point source in Equation (1.2.3) to cancel the factor exp(jk · r′) in Equation (9.2.3), or by directly letting PA(f) = 1 in Equation (1.2.6), Equation (9.2.3) becomes
$$
G_{\text{free}}^{pl}(\mathbf{r}, \mathbf{r}', f) = \exp(-j\mathbf{k} \cdot \mathbf{r}), \tag{9.2.4}
$$
where the wave vector k in Equation (9.2.4) depends on the direction of a secondary source. Similarly, if secondary sources are straight-line sources with unit strength and an infinite length arranged perpendicular to the horizontal plane (two-dimensional space) in the free field, G(r, r′, f) is the frequency domain sound pressure at the horizontal receiver position r caused by straight-line sources intersecting the horizontal plane at r′. For convenience, r′ is called the “horizontal position of the straight-line source.” G(r, r′, f) is termed the free-field Green’s function in a two-dimensional space or horizontal plane (and frequency domain).
Letting Qli(f) = 1 in Equation (1.2.7) yields
$$
G_{\text{free}}^{2D}(\mathbf{r}, \mathbf{r}', f) = -\frac{j}{4} H_0(k|\mathbf{r} - \mathbf{r}'|), \tag{9.2.5}
$$
where the superscript “2D” denotes the case of a straight-line source in a two-dimensional space (horizontal plane).
Similar to the case of a point source in a three-dimensional space, in the local region of a far field where the distance between the horizontal receiver position and the straight-line source is large enough, the cylindrical wave caused by a straight-line source can be approximated by a plane wave according to the discussion in Equations (1.2.7) to (1.2.10). When an appropriate initial phase and the frequency-dependent strength of the straight-line source are chosen, the horizontal transfer function from a straight-line source to a region close to the origin is also expressed as the plane wave approximation of Equation (9.2.4).
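For the point-source case, the far-field limit that leads from Equation (9.2.2) to the plane wave form of Equation (9.2.4) can be checked numerically. The sketch below is illustrative only: it assumes a source strength of 4πr′ with an initial phase chosen to cancel the propagation phase to the origin, a wave vector pointing from the source toward the origin, and a speed of sound of 343 m/s. All function names are hypothetical.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed value for illustration


def point_source_field(r, direction, r0, f, c=SPEED_OF_SOUND):
    """Pressure near the origin from a distant point source at r0 * direction.

    The source strength 4*pi*r0 and an initial phase exp(+j k r0) are folded
    into the expression, so the result is (r0 / d) * exp(-j k (d - r0)).
    `direction` is assumed to be a unit vector.
    """
    k = 2.0 * math.pi * f / c
    r_src = tuple(r0 * u for u in direction)
    d = math.dist(r, r_src)
    return (r0 / d) * cmath.exp(-1j * k * (d - r0))


def plane_wave_field(r, direction, f, c=SPEED_OF_SOUND):
    """exp(-j k . r) with the wave vector k = -k * direction.

    The wave travels from the source direction toward the origin, so
    -k . r equals +k * (direction . r).
    """
    k = 2.0 * math.pi * f / c
    projection = sum(u * x for u, x in zip(direction, r))
    return cmath.exp(1j * k * projection)


def approximation_error(r0, r=(0.1, 0.05, 0.0), direction=(1.0, 0.0, 0.0), f=1000.0):
    """Magnitude of the difference between the two fields at receiver r."""
    return abs(point_source_field(r, direction, r0, f) - plane_wave_field(r, direction, f))


if __name__ == "__main__":
    for r0 in (2.0, 20.0, 1000.0):
        print(r0, approximation_error(r0))
```

The error shrinks as the source distance grows relative to the size of the receiver region, which is the condition under which the plane wave model of the secondary source is appropriate.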
In addition to the three ideal and relatively simple types of secondary sources, i.e., the point source, plane wave source, and straight-line source, practical secondary sources (such as loudspeakers) exhibit more complicated radiation patterns (such as frequency-dependent directivity). In these cases, reconstructed sound pressures can still be calculated using Equation (9.2.1), but G(r, r′, f) from the secondary source to the receiver position is more complicated and may not even have an analytical expression.
Equation (9.2.1) can be extended to the case of reproduction in reflective environments, such as in a listening room with reflections. In this case, the free-field transfer function in Equation (9.2.1) should be replaced with the transfer function in reflective environments, e.g., the frequency domain counterparts of room impulse responses in Section 1.2.2.

Practical sound field reconstruction or sound reproduction is implemented with a discrete array of a finite number of secondary sources. If M secondary sources are arranged at the positions ri (i = 0, 1, …, M − 1), the integral over the continuous region of secondary source distribution in Equation (9.2.1) is replaced with a summation over all the discrete secondary sources:
$$
P(\mathbf{r}, f) = \sum_{i=0}^{M-1} G(\mathbf{r}, \mathbf{r}_i, f)\, E_i(\mathbf{r}_i, f). \tag{9.2.6}
$$
Equation (9.2.1) or (9.2.6) is the general formulation of multichannel sound field reconstruction in the spatial domain. The driving signals Ei(ri, f) of the discrete secondary source array cannot be obtained directly by letting r′ = ri in E(r′, f) for the continuous secondary source distribution. For example, for a uniform array on a horizontal circle, the overall gains of the driving signals for the continuous secondary source distribution and for the discrete array differ.
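The discrete superposition of Equation (9.2.6), and the gain difference between continuous and discrete distributions, can be sketched for the simplest case of free-field point sources on a horizontal circle. The example is an illustration under stated assumptions: unit driving signals, a receiver at the center of the circle, the outgoing-wave convention exp(−jk|r − r′|), a speed of sound of 343 m/s, and hypothetical function names.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed value for illustration


def green_free_3d(r, r_src, f, c=SPEED_OF_SOUND):
    """Free-field point-source transfer function (assumed sign convention)."""
    k = 2.0 * math.pi * f / c
    d = math.dist(r, r_src)
    return cmath.exp(-1j * k * d) / (4.0 * math.pi * d)


def reconstructed_pressure(r, source_positions, driving_signals, f):
    """Equation (9.2.6): sum of G(r, r_i, f) * E_i over all secondary sources."""
    return sum(green_free_3d(r, r_i, f) * e_i
               for r_i, e_i in zip(source_positions, driving_signals))


def circular_array(m, r0):
    """Positions of m secondary sources spaced uniformly on a horizontal circle."""
    return [(r0 * math.cos(2.0 * math.pi * i / m),
             r0 * math.sin(2.0 * math.pi * i / m),
             0.0) for i in range(m)]


M, R0, F = 16, 2.0, 500.0
positions = circular_array(M, R0)
center = (0.0, 0.0, 0.0)
g0 = green_free_3d(center, positions[0], F)

# Discrete array with unit driving signals E_i = 1.
p_discrete = reconstructed_pressure(center, positions, [1.0] * M, F)

# Continuous circle with E = 1: at the center G is constant along the circle,
# so the line integral of G * r0 * d(theta') equals 2 * pi * r0 * G.
p_continuous = 2.0 * math.pi * R0 * g0

# Overall gain separating the two cases for the same nominal driving signal.
gain_ratio = p_continuous / p_discrete

if __name__ == "__main__":
    print(abs(p_discrete), abs(p_continuous), gain_ratio.real)
```

In this setting the two results differ by the factor 2πr0/M, the arc length represented by each discrete source, so driving signals derived for the continuous distribution need an overall gain adjustment before they are applied to the discrete array.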
Given the characteristics, array, and driving signals of the secondary sources, Equation (9.2.1) or (9.2.6) can be used to analyze the reconstructed sound field. Conversely, given the target sound field, Equations (9.2.1) and (9.2.6) can be used to search for the type, array, and driving signals of secondary sources and thus to design the reproduction system. These two problems are the main subject of the reconstructed sound field analysis in the following sections.
9.2.2 Formulation of spatial-spectral domain analysis of circular secondary source array
Equations (9.2.1) and (9.2.6) are formulated in the spatial domain, or more strictly in the frequency and spatial domains, where the reconstructed sound pressure is a function of frequency f and receiver position r. The reconstructed sound field can also be analyzed in the spatial-spectral domain, or more strictly in the frequency and spatial-spectral domain. For some regular secondary source arrays and appropriate coordinate systems, transforming Equations (9.2.1) and (9.2.6) to the spatial-spectral domain is convenient for analysis and may lead to significant results. This approach is analogous to transforming time domain signals to the frequency domain for analysis in signal processing. However, the appropriate spatial spectrum representation of the reconstructed sound field depends on the secondary source array and the chosen coordinate system.
A common array consists of secondary sources arranged uniformly on a horizontal circle, which is often used for horizontal Ambisonic reproduction. The spatial-spectral domain analysis of a circular secondary source array is discussed in this section (Bamford and Vanderkooy, 1995; Daniel, 2000; Ward and Abhayapala, 2001; Poletti, 1996, 2000). In this case, a polar coordinate system is convenient for analysis.
The secondary sources are arranged uniformly and continuously on a horizontal circle with radius r′ = r0. The position of a secondary source is denoted by the polar coordinates (r0, θ′). An arbitrary receiver position inside the circle is denoted by (r, θ). The line element on the circle is r0dθ′. The line integral in Equation (9.2.1) is calculated over the circle r′ = r0. If r0 is merged into the driving signals E(r′, f) as an overall gain, Equation (9.2.1) can be expressed as a one-dimensional convolution over the azimuth θ:
|
|
P r, , r0 , f G r, , r0 , f E , f d . |
(9.2.7) |

Analysis of multichannel sound field recording and reconstruction 363
The arguments indicate the dependence of each function on physical variables.
In Equation (9.2.7), the reconstructed sound pressure is a periodic function of $\theta$ with a period of $2\pi$ and can therefore be expanded into a real- or complex-valued Fourier series of $\theta$. Real- and complex-valued Fourier expansions are mathematically equivalent. The real-valued expansion is consistent with the preceding discussion of horizontal Ambisonics. The complex-valued expansion is convenient for mathematical expression and analysis. Here, both forms of Fourier expansions are given as

$$
\begin{aligned}
P'(r, \theta, r_0, f) &= \sum_{q=0}^{\infty} \left[ P_q'^{(1)}(r, r_0, f) \cos q\theta + P_q'^{(2)}(r, r_0, f) \sin q\theta \right] \\
&= \sum_{q=-\infty}^{\infty} P_q'(r, r_0, f) \exp(jq\theta).
\end{aligned}
\tag{9.2.8}
$$
The real-valued Fourier coefficients of expansion are calculated with the following:

$$
\begin{aligned}
P_0'^{(1)}(r, r_0, f) &= \frac{1}{2\pi} \int_{-\pi}^{\pi} P'(r, \theta, r_0, f)\, d\theta, \\
P_q'^{(1)}(r, r_0, f) &= \frac{1}{\pi} \int_{-\pi}^{\pi} P'(r, \theta, r_0, f) \cos q\theta \, d\theta, \\
P_q'^{(2)}(r, r_0, f) &= \frac{1}{\pi} \int_{-\pi}^{\pi} P'(r, \theta, r_0, f) \sin q\theta \, d\theta, \qquad q = 1, 2, 3, \ldots,
\end{aligned}
\tag{9.2.9}
$$

where $P_0'^{(2)}(r, r_0, f) = 0$, which is preserved in Equation (9.2.8) for convenience in writing.
The complex-valued Fourier coefficients of expansion are calculated as follows:

$$
P_q'(r, r_0, f) = \frac{1}{2\pi} \int_{-\pi}^{\pi} P'(r, \theta, r_0, f) \exp(-jq\theta)\, d\theta, \qquad q = 0, \pm 1, \pm 2, \ldots \tag{9.2.10}
$$
The relationship between real- and complex-valued Fourier coefficients is expressed in the following equations:

$$
\begin{aligned}
P_0'(r, r_0, f) &= P_0'^{(1)}(r, r_0, f), \\
P_q'(r, r_0, f) &= \frac{1}{2} \left[ P_q'^{(1)}(r, r_0, f) - j P_q'^{(2)}(r, r_0, f) \right], \qquad q > 0, \\
P_q'(r, r_0, f) &= \frac{1}{2} \left[ P_{|q|}'^{(1)}(r, r_0, f) + j P_{|q|}'^{(2)}(r, r_0, f) \right], \qquad q < 0.
\end{aligned}
\tag{9.2.11}
$$
Equation (9.2.8) shows that the reconstructed sound field can be represented by the azimuthal Fourier coefficients of sound pressure given in Equation (9.2.9) or (9.2.10). These coefficients constitute the azimuthal spectrum of sound pressure, which is a special and appropriate form of the spatial spectrum representation of the horizontal sound field created by the circular array of secondary sources.
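The azimuthal spectrum can also be estimated numerically. The following is a minimal sketch, not a definitive implementation: it assumes a hypothetical test pressure `sample_pressure` with known harmonics, replaces the integrals in Equations (9.2.9) and (9.2.10) with Riemann sums over `N` uniform azimuths, and checks the relations of Equation (9.2.11). All function and variable names are illustrative assumptions.

```python
import cmath
import math

N = 64  # number of uniform azimuth samples (arbitrary choice for this sketch)
THETAS = [2.0 * math.pi * i / N for i in range(N)]
D_THETA = 2.0 * math.pi / N


def sample_pressure(theta):
    # Hypothetical complex-valued pressure with known azimuthal harmonics.
    return 1.0 + (0.5 + 0.2j) * math.cos(2 * theta) - 0.25j * math.sin(3 * theta)


def real_coefficients(func, q):
    # Riemann-sum version of Eq. (9.2.9). Integrating over [0, 2*pi) is
    # equivalent to [-pi, pi) because the integrand is periodic.
    norm = 1.0 / (2.0 * math.pi) if q == 0 else 1.0 / math.pi
    cos_part = norm * D_THETA * sum(func(t) * math.cos(q * t) for t in THETAS)
    sin_part = norm * D_THETA * sum(func(t) * math.sin(q * t) for t in THETAS)
    return cos_part, sin_part


def complex_coefficient(func, q):
    # Riemann-sum version of Eq. (9.2.10).
    return D_THETA / (2.0 * math.pi) * sum(
        func(t) * cmath.exp(-1j * q * t) for t in THETAS
    )


def relation_error(func, q_max):
    # Largest deviation from Eq. (9.2.11) over orders 0..q_max.
    p0_cos, _ = real_coefficients(func, 0)
    worst = abs(complex_coefficient(func, 0) - p0_cos)
    for q in range(1, q_max + 1):
        p_cos, p_sin = real_coefficients(func, q)
        worst = max(worst, abs(complex_coefficient(func, q) - 0.5 * (p_cos - 1j * p_sin)))
        worst = max(worst, abs(complex_coefficient(func, -q) - 0.5 * (p_cos + 1j * p_sin)))
    return worst


max_error = relation_error(sample_pressure, 4)
cos_2, sin_2 = real_coefficients(sample_pressure, 2)
print(max_error, cos_2, sin_2)
```

For a band-limited test function such as this one, the uniform Riemann sum reproduces the coefficients up to rounding error, so `cos_2` should recover the amplitude assigned to the $\cos 2\theta$ term.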

364 Spatial Sound
Similarly, transfer functions from a secondary source to receiver positions can be expanded into real- or complex-valued azimuthal Fourier series as

$$
\begin{aligned}
G(r, \theta - \theta', r_0, f) &= \sum_{q=0}^{\infty} \left[ G_q^{(1)}(r, r_0, f) \cos q(\theta - \theta') + G_q^{(2)}(r, r_0, f) \sin q(\theta - \theta') \right] \\
&= \sum_{q=-\infty}^{\infty} G_q(r, r_0, f) \exp[jq(\theta - \theta')].
\end{aligned}
\tag{9.2.12}
$$
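The real-valued Fourier coefficients of expansion are calculated as

$$
\begin{aligned}
G_0^{(1)}(r, r_0, f) &= \frac{1}{2\pi} \int_{-\pi}^{\pi} G(r, \theta - \theta', r_0, f)\, d(\theta - \theta'), \\
G_q^{(1)}(r, r_0, f) &= \frac{1}{\pi} \int_{-\pi}^{\pi} G(r, \theta - \theta', r_0, f) \cos q(\theta - \theta')\, d(\theta - \theta'), \\
G_q^{(2)}(r, r_0, f) &= \frac{1}{\pi} \int_{-\pi}^{\pi} G(r, \theta - \theta', r_0, f) \sin q(\theta - \theta')\, d(\theta - \theta'), \qquad q = 1, 2, 3, \ldots
\end{aligned}
\tag{9.2.13}
$$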
The complex-valued Fourier coefficients of expansion are related to those of the real-valued expansion by an equation similar to Equation (9.2.11). For a secondary source whose main axis points to the origin and that is symmetric about the main axis, $G_q^{(2)}(r, r_0, f) = 0$. The real- or complex-valued Fourier coefficients of expansion in Equation (9.2.13) represent the spatial or azimuthal spectrum of the transfer function from secondary sources to receiver positions.
Given the type, array, and orientations of the main axis of the secondary sources, the azimuthal spectrum of the transfer functions from secondary sources to receiver positions is calculated using Equation (9.2.13). For example, for a plane wave source, the transfer function is expressed as Equation (9.2.4). The azimuthal spectrum is calculated by substituting Equation (9.2.4) into Equation (9.2.13) or directly by expanding a plane wave into Bessel–Fourier series,

$$
\begin{aligned}
G_{\mathrm{free}}^{\mathrm{pl}}(r, \theta - \theta', f) &= \exp[jkr \cos(\theta - \theta')] \\
&= J_0(kr) + 2 \sum_{q=1}^{\infty} j^q J_q(kr) \left( \cos q\theta \cos q\theta' + \sin q\theta \sin q\theta' \right) \\
&= J_0(kr) + 2 \sum_{q=1}^{\infty} j^q J_q(kr) \cos q(\theta - \theta') \\
&= \sum_{q=-\infty}^{\infty} j^q J_q(kr) \exp[jq(\theta - \theta')],
\end{aligned}
\tag{9.2.14}
$$

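where $J_q(kr)$, $q = 0, 1, 2, \ldots$, are the Bessel functions of order $q$, with $J_{-q}(kr) = (-1)^q J_q(kr)$. Comparing Equation (9.2.12) with Equation (9.2.14) yields

$$
\begin{aligned}
G_0^{(1)}(r, f) &= J_0(kr), \qquad G_q^{(1)}(r, f) = 2 j^q J_q(kr), \qquad G_q^{(2)}(r, f) = 0, \qquad q = 1, 2, 3, \ldots, \\
G_q(r, f) &= j^q J_q(kr), \qquad q = 0, \pm 1, \pm 2, \ldots
\end{aligned}
\tag{9.2.15}
$$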

For simplicity, the subscript “free” for a free field and superscript “pl” for a plane wave source are omitted in Equation (9.2.15).
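The Bessel–Fourier expansion of Equation (9.2.14) can be checked numerically. The sketch below is only illustrative: `bessel_j` is a hypothetical helper that sums the power series of $J_q$ and is adequate only for the small arguments used here, `phi` stands for $\theta - \theta'$, and the values of `kr`, `phi`, and the truncation order `q_max` are arbitrary assumptions.

```python
import cmath
import math


def bessel_j(q, x, terms=40):
    # Bessel function of the first kind of integer order q, from its power series.
    # Suitable for small arguments only; not a general-purpose routine.
    if q < 0:
        return (-1) ** (-q) * bessel_j(-q, x, terms)
    return sum(
        (-1) ** m / (math.factorial(m) * math.factorial(m + q)) * (x / 2.0) ** (2 * m + q)
        for m in range(terms)
    )


def plane_wave_complex_series(kr, phi, q_max):
    # Complex-valued form of Eq. (9.2.14), truncated to |q| <= q_max.
    return sum(
        (1j ** q) * bessel_j(q, kr) * cmath.exp(1j * q * phi)
        for q in range(-q_max, q_max + 1)
    )


def plane_wave_real_series(kr, phi, q_max):
    # Real-valued form: J_0(kr) + 2 * sum_{q>=1} j^q J_q(kr) cos(q * phi).
    return bessel_j(0, kr) + 2.0 * sum(
        (1j ** q) * bessel_j(q, kr) * math.cos(q * phi) for q in range(1, q_max + 1)
    )


kr, phi, q_max = 2.0, 0.7, 15  # illustrative values
exact = cmath.exp(1j * kr * math.cos(phi))
complex_error = abs(plane_wave_complex_series(kr, phi, q_max) - exact)
real_error = abs(plane_wave_real_series(kr, phi, q_max) - exact)
print(complex_error, real_error)
```

Because $J_q(kr)$ decays rapidly once $q$ exceeds $kr$, a modest truncation order is sufficient at small $kr$; larger $kr$ requires more terms.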
Similarly, for a straight-line secondary source perpendicular to the horizontal plane, the transfer function from a source to a receiver position is expressed in Equation (9.2.5), i.e., by the free-field Green's function in the horizontal plane. The distance between a source and a receiver position is

$$
|\mathbf{r} - \mathbf{r}'| = \sqrt{r^2 + r_0^2 - 2 r r_0 \cos(\theta - \theta')}. \tag{9.2.16}
$$

Equation (9.2.5) can be written as

$$
G_{\mathrm{free}}^{\mathrm{2D}}(\mathbf{r}, \mathbf{r}', f) = G_{\mathrm{free}}^{\mathrm{2D}}(r, \theta - \theta', r_0, f) = -\frac{j}{4} H_0\!\left( k \sqrt{r^2 + r_0^2 - 2 r r_0 \cos(\theta - \theta')} \right). \tag{9.2.17}
$$

The Hankel function of the second kind can be expanded as

$$
\begin{aligned}
H_0(k|\mathbf{r} - \mathbf{r}'|) &= J_0(kr) H_0(kr_0) + 2 \sum_{q=1}^{\infty} J_q(kr) H_q(kr_0) \cos q(\theta - \theta') \\
&= \sum_{q=-\infty}^{\infty} J_q(kr) H_q(kr_0) \exp[jq(\theta - \theta')], \qquad r < r_0.
\end{aligned}
\tag{9.2.18}
$$
Substituting Equation (9.2.18) into Equation (9.2.17) and comparing with Equation (9.2.12) yields

$$
\begin{aligned}
G_0^{(1)}(r, r_0, f) &= -\frac{j}{4} J_0(kr) H_0(kr_0), \\
G_q^{(1)}(r, r_0, f) &= -\frac{j}{2} J_q(kr) H_q(kr_0), \qquad G_q^{(2)}(r, r_0, f) = 0, \qquad q = 1, 2, 3, \ldots,
\end{aligned}
\tag{9.2.19}
$$

and

$$
G_q(r, r_0, f) = -\frac{j}{4} J_q(kr) H_q(kr_0), \qquad q = 0, \pm 1, \pm 2, \ldots \tag{9.2.20}
$$
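The comparison that leads to Equations (9.2.19) and (9.2.20) can be written out explicitly. The following is a sketch that takes the prefactor of the two-dimensional free-field Green's function in Equation (9.2.17) to be $-j/4$ and inserts the two forms of the expansion in Equation (9.2.18):

```latex
\begin{aligned}
G_{\mathrm{free}}^{\mathrm{2D}}(r, \theta-\theta', r_0, f)
&= -\frac{j}{4} J_0(kr) H_0(kr_0)
   - \frac{j}{2} \sum_{q=1}^{\infty} J_q(kr) H_q(kr_0) \cos q(\theta-\theta') \\
&= \sum_{q=-\infty}^{\infty} \left[ -\frac{j}{4} J_q(kr) H_q(kr_0) \right]
   \exp[jq(\theta-\theta')], \qquad r < r_0 .
\end{aligned}
```

Matching the first line term by term with the real-valued series in Equation (9.2.12) gives the cosine coefficients, and the absence of $\sin q(\theta - \theta')$ terms corresponds to $G_q^{(2)} = 0$. The second line gives the complex-valued coefficients directly.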
The equation for driving signals can be obtained from the spatial-spectral or azimuthal-spectral domain representation of the transfer function from a secondary source to a receiver position. Substituting Equation (9.2.8) and the Fourier expansion in Equation (9.2.12) into Equation (9.2.7) and using the following equations of trigonometric functions,

$$
\begin{aligned}
\cos q(\theta - \theta') &= \cos q\theta \cos q\theta' + \sin q\theta \sin q\theta', \\
\sin q(\theta - \theta') &= \sin q\theta \cos q\theta' - \cos q\theta \sin q\theta',
\end{aligned}
\tag{9.2.21}
$$

the following equation is obtained:

$$
\begin{aligned}
&\sum_{q=0}^{\infty} P_q'^{(1)}(r, r_0, f) \cos q\theta + \sum_{q=1}^{\infty} P_q'^{(2)}(r, r_0, f) \sin q\theta \\
&\quad = \sum_{q=0}^{\infty} \left\{ \int_{-\pi}^{\pi} \left[ G_q^{(1)}(r, r_0, f) \cos q\theta' - G_q^{(2)}(r, r_0, f) \sin q\theta' \right] E(\theta', f)\, d\theta' \right\} \cos q\theta \\
&\qquad + \sum_{q=1}^{\infty} \left\{ \int_{-\pi}^{\pi} \left[ G_q^{(1)}(r, r_0, f) \sin q\theta' + G_q^{(2)}(r, r_0, f) \cos q\theta' \right] E(\theta', f)\, d\theta' \right\} \sin q\theta.
\end{aligned}
\tag{9.2.22}
$$

The left side of Equation (9.2.22) characterizes the azimuthal variation in the reconstructed sound pressure, and each term in $\cos q\theta$ or $\sin q\theta$ represents a mode of azimuthal variation.
Given the target sound field, the reconstructed sound pressure in Equation (9.2.8) should match the target sound pressure $P(r, \theta, f)$. Accordingly, the azimuthal spectrum representation of the reconstructed sound pressure on the left side of Equation (9.2.22) should be replaced with that of the target sound pressure, i.e., with $P_q^{(1)}(r, f)$ and $P_q^{(2)}(r, f)$. The coefficients of each azimuthal mode on the two sides of Equation (9.2.22) should be equal because the azimuthal modes are mutually independent. This equality leads to a set of equations for the driving signals $E(\theta', f)$:
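
$$
\begin{aligned}
\int_{-\pi}^{\pi} G_0^{(1)}(r, r_0, f)\, E(\theta', f)\, d\theta' &= P_0^{(1)}(r, f), \\
\int_{-\pi}^{\pi} \left[ G_q^{(1)}(r, r_0, f) \cos q\theta' - G_q^{(2)}(r, r_0, f) \sin q\theta' \right] E(\theta', f)\, d\theta' &= P_q^{(1)}(r, f), \\
\int_{-\pi}^{\pi} \left[ G_q^{(1)}(r, r_0, f) \sin q\theta' + G_q^{(2)}(r, r_0, f) \cos q\theta' \right] E(\theta', f)\, d\theta' &= P_q^{(2)}(r, f).
\end{aligned}
\tag{9.2.23}
$$
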
A finite number of secondary sources is used in practical reproduction. When M secondary sources are arranged on a horizontal circle with radius $r_0$, the azimuth of the $i$th secondary source is $\theta_i$ ($i = 0, 1, \ldots, M - 1$). The integral over the azimuth in Equation (9.2.23) is then replaced with a summation over the discrete secondary source azimuths:
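
$$
\begin{aligned}
\sum_{i=0}^{M-1} G_0^{(1)}(r, r_0, f)\, E_i(\theta_i, f) &= P_0^{(1)}(r, f), \\
\sum_{i=0}^{M-1} \left[ G_q^{(1)}(r, r_0, f) \cos q\theta_i - G_q^{(2)}(r, r_0, f) \sin q\theta_i \right] E_i(\theta_i, f) &= P_q^{(1)}(r, f), \\
\sum_{i=0}^{M-1} \left[ G_q^{(1)}(r, r_0, f) \sin q\theta_i + G_q^{(2)}(r, r_0, f) \cos q\theta_i \right] E_i(\theta_i, f) &= P_q^{(2)}(r, f).
\end{aligned}
\tag{9.2.24}
$$
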
The physical significance of Equations (9.2.23) and (9.2.24) is that matching the horizontal reconstructed sound field to the target sound field requires matching their corresponding azimuthal Fourier (harmonic) components, and vice versa. The method of solving for the driving signals via Equation (9.2.23) or (9.2.24) is therefore a mode-matching method. Equation (9.2.23) or (9.2.24) is valid when the two sides of the equality do not vanish. In addition, $E_i(\theta_i, f)$ for discrete secondary sources cannot be obtained directly by letting $\theta' = \theta_i$ in $E(\theta', f)$ for the continuous secondary sources in Equation (9.2.7); an overall gain or normalization factor must be included.
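Mode matching with a discrete array can be illustrated for a simple special case. The sketch below assumes plane-wave secondary sources, for which Equation (9.2.15) gives $G_q^{(2)} = 0$, and a unit-amplitude target plane wave incident from azimuth `target_azimuth`, so that the common factor $j^q J_q(kr)$ (assumed non-zero) cancels on both sides of Equation (9.2.24). The closed-form gains in `candidate_gains` are an assumption chosen for illustration, valid for a uniform array with $M \geq 2Q + 1$ sources when modes up to order $Q$ are matched; all names and values are hypothetical.

```python
import math


def uniform_azimuths(num_sources):
    # Azimuths theta_i of M secondary sources spaced uniformly on the circle.
    return [2.0 * math.pi * i / num_sources for i in range(num_sources)]


def candidate_gains(source_azimuths, target_azimuth, q_max):
    # Assumed driving signals (frequency independent in this special case):
    # E_i = (1/M) * [1 + 2 * sum_{q=1..Q} cos q(theta_S - theta_i)].
    count = len(source_azimuths)
    return [
        (1.0 + 2.0 * sum(math.cos(q * (target_azimuth - a)) for q in range(1, q_max + 1)))
        / count
        for a in source_azimuths
    ]


def mode_matching_residual(gains, source_azimuths, target_azimuth, q_max):
    # Largest mismatch among the equations of Eq. (9.2.24) after cancelling
    # the common factor j^q J_q(kr) on both sides.
    worst = abs(sum(gains) - 1.0)  # q = 0 equation
    for q in range(1, q_max + 1):
        cos_sum = sum(g * math.cos(q * a) for g, a in zip(gains, source_azimuths))
        sin_sum = sum(g * math.sin(q * a) for g, a in zip(gains, source_azimuths))
        worst = max(
            worst,
            abs(cos_sum - math.cos(q * target_azimuth)),
            abs(sin_sum - math.sin(q * target_azimuth)),
        )
    return worst


azimuths = uniform_azimuths(8)           # M = 8 sources
gains = candidate_gains(azimuths, 0.3, 3)  # target at 0.3 rad, modes up to Q = 3
residual = mode_matching_residual(gains, azimuths, 0.3, 3)
print(residual)
```

A residual at rounding-error level indicates that these gains satisfy all $2Q + 1$ mode-matching equations for the chosen array.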
Through azimuthal spectrum representation, Equation (9.2.7) can be converted to a form different from Equation (9.2.23). The driving signal $E(\theta', f)$ is a periodic function of azimuth $\theta'$ with a period of $2\pi$, so it can be expanded as a real- or complex-valued Fourier series:

$$
\begin{aligned}
E(\theta', f) &= \sum_{q=0}^{\infty} \left[ E_q^{(1)}(f) \cos q\theta' + E_q^{(2)}(f) \sin q\theta' \right] \\
&= \sum_{q=-\infty}^{\infty} E_q(f) \exp(jq\theta').
\end{aligned}
\tag{9.2.25}
$$
The real- or complex-valued azimuthal Fourier coefficients can be calculated similarly to Equation (9.2.9) or (9.2.10). In the complex-valued Fourier expansion, when Equations (9.2.8), (9.2.12), and (9.2.25) are substituted into Equation (9.2.7), a convolution between two functions in the spatial domain becomes a multiplication between two corresponding functions in the azimuthal-spectral domain:

$$
P_q'(r, r_0, f) = 2\pi\, G_q(r, r_0, f)\, E_q(f), \qquad q = 0, \pm 1, \pm 2, \ldots \tag{9.2.26}
$$

This equation is the formulation of multichannel sound field reconstruction in the azimuthal-spectral domain.
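The correspondence between azimuthal convolution and multiplication of azimuthal spectra can be verified numerically. The following sketch assumes hypothetical band-limited functions `transfer` and `driving` at a single frequency, approximates the integral of Equation (9.2.7) with a Riemann sum over `N` uniform azimuths, and compares the two sides of Equation (9.2.26). The names and coefficient values are illustrative assumptions.

```python
import cmath
import math

N = 64  # number of uniform azimuth samples (arbitrary choice for this sketch)
THETAS = [2.0 * math.pi * i / N for i in range(N)]
D_THETA = 2.0 * math.pi / N


def transfer(phi):
    # Hypothetical transfer function G as a function of phi = theta - theta'.
    return 1.0 + 0.6 * math.cos(phi) + 0.3j * math.sin(2 * phi)


def driving(theta_src):
    # Hypothetical driving signal E(theta') at one frequency.
    return 0.5 + 0.4 * math.cos(theta_src) - 0.2 * math.sin(2 * theta_src)


def reconstructed(theta):
    # Riemann-sum version of the azimuthal convolution in Eq. (9.2.7).
    return D_THETA * sum(transfer(theta - t) * driving(t) for t in THETAS)


def fourier_coefficient(func, q):
    # Complex azimuthal Fourier coefficient, in the form of Eq. (9.2.10).
    return sum(func(t) * cmath.exp(-1j * q * t) for t in THETAS) / N


# Compare P'_q with 2*pi*G_q*E_q for orders -3..3.
max_error = max(
    abs(
        fourier_coefficient(reconstructed, q)
        - 2.0 * math.pi * fourier_coefficient(transfer, q) * fourier_coefficient(driving, q)
    )
    for q in range(-3, 4)
)
print(max_error)
```

The factor $2\pi$ arises because the Fourier coefficients carry the normalization $1/(2\pi)$ while the convolution integral in Equation (9.2.7) does not.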
Equation (9.2.26) can also be expressed in terms of real-valued azimuthal Fourier coefficients, but the result is relatively complicated. For a secondary source whose main axis points to the origin and that is symmetric about the main axis, $G_q^{(2)}(r, r_0, f) = 0$. In this case, Equation (9.2.26) is expressed by real-valued azimuthal Fourier coefficients as

$$
\begin{aligned}
P_0'^{(1)}(r, r_0, f) &= 2\pi\, G_0^{(1)}(r, r_0, f)\, E_0^{(1)}(f), \\
P_q'^{(\sigma)}(r, r_0, f) &= \pi\, G_q^{(1)}(r, r_0, f)\, E_q^{(\sigma)}(f), \qquad q = 1, 2, 3, \ldots, \quad \sigma = 1, 2.
\end{aligned}
\tag{9.2.27}
$$
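The step from Equation (9.2.26) to Equation (9.2.27) can be sketched as follows. The derivation assumes that the relations of Equation (9.2.11) hold in the same form for the coefficients of $G$ and $E$, so that $G_q = G_{-q} = G_q^{(1)}/2$ for $q \geq 1$ when $G_q^{(2)} = 0$; the arguments $(r, r_0, f)$ are omitted for brevity:

```latex
\begin{aligned}
P_q'^{(1)} &= P_q' + P_{-q}'
            = 2\pi \left( G_q E_q + G_{-q} E_{-q} \right)
            = \pi G_q^{(1)} \left( E_q + E_{-q} \right)
            = \pi G_q^{(1)} E_q^{(1)}, \\
P_q'^{(2)} &= j \left( P_q' - P_{-q}' \right)
            = j\, 2\pi \left( G_q E_q - G_{-q} E_{-q} \right)
            = \pi G_q^{(1)}\, j \left( E_q - E_{-q} \right)
            = \pi G_q^{(1)} E_q^{(2)}, \qquad q = 1, 2, 3, \ldots
\end{aligned}
```

For $q = 0$, the real- and complex-valued coefficients coincide, so the factor $2\pi$ of Equation (9.2.26) is retained.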
According to Equation (9.2.26), given the driving signals and the transfer functions from secondary sources to receiver positions in the azimuthal-spectral domain, the reconstructed sound pressure in the azimuthal-spectral domain can be evaluated. Conversely, given the azimuthal spectrum representation $P_q^{(\sigma)}(r, f)$ or $P_q(r, f)$ of the target sound field, the driving signals of secondary sources can be found by substituting the azimuthal spectrum representation of reconstructed sound pressures in Equation (9.2.26) with $P_q^{(\sigma)}(r, f)$ or $P_q(r, f)$:

$$
E_q(f) = \frac{P_q(r, f)}{2\pi\, G_q(r, r_0, f)}, \qquad q = 0, \pm 1, \pm 2, \ldots; \tag{9.2.28}
$$