

Analysis of multichannel sound field recording and reconstruction 415
array in a horizontal circle. This analysis can also be extended to arbitrary three-dimensional sound field reconstruction techniques with a uniform secondary source array on a spherical surface. Therefore, the analysis in this section is general.
9.6 MULTICHANNEL RECONSTRUCTED SOUND FIELD ANALYSIS IN THE SPATIAL DOMAIN
9.6.1 Basic method for analysis in the spatial domain
The general formulation for multichannel sound field reconstruction in the spatial domain, including the formulations for a continuous secondary source array in Equation (9.2.1) and for a discrete and finite secondary source array in Equation (9.2.6), is presented in Section 9.2.1. The methods for analyzing a multichannel reconstructed sound field in two special spatial-spectral domains are discussed in Sections 9.2.2 and 9.2.3, and the Ambisonic sound field is analyzed in Section 9.3. The analysis in the spatial-spectral domain is convenient and appropriate for some regular secondary source arrays, such as uniform or nearly uniform arrays on a circle or a spherical surface.
Equation (9.2.1) or (9.2.6) can also be directly analyzed and solved in the spatial domain. For continuous and uniform secondary source arrays on a horizontal circle or a spherical surface, analyses in the spatial domain and in the spatial-spectral domain are basically equivalent. However, for arbitrary discrete arrays with a finite number of secondary sources, analysis in the spatial domain often leads to useful results. Given the secondary source array and the driving signals, the reconstructed sound pressures can be directly evaluated from Equations (9.2.1) and (9.2.6) in the spatial domain. Conversely, under certain conditions, given the secondary source array and the target sound field, the driving signals can be derived directly in the spatial domain.
In the case of a discrete array with a finite number of secondary sources, M secondary sources are arranged at positions ri (i = 0, 1, …, M − 1). If the receiver position or controlled point in the sound field is specified by rcontro, the sound pressure at the controlled point is calculated with Equation (9.2.6):
$$P'(\mathbf{r}_{\text{contro}}, f) = \sum_{i=0}^{M-1} G(\mathbf{r}_{\text{contro}}, \mathbf{r}_i, f)\, E_i(\mathbf{r}_i, f). \tag{9.6.1}$$
According to Equation (9.6.1) and under certain conditions, the driving signals Ei(ri, f) can be derived by minimizing the error of the reconstructed sound pressures at a set of receiver positions or controlled points within a specified spatial region. The resultant driving signals depend on the chosen error criterion and controlled points. This is the basic consideration and method of multichannel reconstructed sound field analysis in the spatial domain.
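The forward evaluation in Equation (9.6.1) can be sketched numerically. The following is a minimal illustration rather than a prescribed implementation: it assumes that G is the free-field point-source transfer function exp(−jkR)/(4πR), and the array layout, frequency, speed of sound, and function names are hypothetical choices made only for this example.

```python
import numpy as np

def green_free_field(r_rec, r_src, f, c=343.0):
    """Assumed form of G: free-field point source, exp(-jkR) / (4*pi*R)."""
    k = 2.0 * np.pi * f / c
    dist = np.linalg.norm(np.asarray(r_rec) - np.asarray(r_src))
    return np.exp(-1j * k * dist) / (4.0 * np.pi * dist)

def reconstructed_pressure(r_contro, src_pos, E, f):
    """Equation (9.6.1): sum over the M secondary sources of G * E_i."""
    return sum(green_free_field(r_contro, r_i, f) * E_i
               for r_i, E_i in zip(src_pos, E))

# Hypothetical layout: M = 4 secondary sources on a circle of radius 2 m,
# all driven with unit signals, evaluated at 500 Hz.
M, r0, f = 4, 2.0, 500.0
az = 2.0 * np.pi * np.arange(M) / M
src_pos = np.stack([r0 * np.cos(az), r0 * np.sin(az)], axis=1)
E = np.ones(M, dtype=complex)

# Pressure at the origin (one possible controlled point).
p_center = reconstructed_pressure(np.zeros(2), src_pos, E, f)
```

Because every source is at the same distance from the origin in this layout, the pressure at the center is simply M times the contribution of a single source, which provides a quick sanity check of the summation.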
9.6.2 Minimizing error in reconstructed sound field and summing localization equation
M secondary sources are arranged in a horizontal circle with radius r0, the azimuth of the ith secondary source is θi, and the normalized amplitude of its driving signal is Ai. For a unit signal waveform EA(f) = 1 in the frequency domain, the driving signals of secondary sources are Ei = Ai. Two controlled points rcontro,L and rcontro,R are located at a distance of r = a (the head radius) and azimuths of ±90°, respectively.

416 Spatial Sound
This simplified head model, in which the effect of head shadow is neglected, is used for binaural pressure analysis in Sections 2.1.1 and 3.2.1. Secondary sources are located at a far-field distance with r0 ≫ a; according to Equation (9.6.1), the pressures at the two controlled points are the superposition of the pressures caused by incident plane waves from the secondary sources:
$$
\begin{aligned}
P'_L &= P'(\mathbf{r}_{\text{contro},L}, f) = \sum_{i=0}^{M-1} A_i \exp\left[ jka \cos(90^\circ - \theta_i) \right], \\
P'_R &= P'(\mathbf{r}_{\text{contro},R}, f) = \sum_{i=0}^{M-1} A_i \exp\left[ jka \cos(90^\circ + \theta_i) \right].
\end{aligned} \tag{9.6.2}
$$
Given the target pressures PL and PR at the two controlled points (two ears), two linear equations for the normalized amplitudes Ai of the driving signals can be obtained by setting the left sides of Equation (9.6.2) equal to the target pressures. However, in the case of more than two secondary sources (M > 2), the equations are underdetermined because there are more unknowns Ai than equations. Therefore, a unique solution for the driving signal amplitudes cannot be obtained. Conversely, given the driving signal amplitudes, the pressures at the two controlled points can be calculated from Equation (9.6.2), and the direction of the summing virtual source can be evaluated.
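If a target (virtual) source is located at the far-field distance rI ≫ a and at the azimuth of θI, then according to Equation (1.2.6), the pressures of the plane wave created by the target source at the two controlled points are given as
$$
\begin{aligned}
P_L &= P(\mathbf{r}_{\text{contro},L}, \theta_I, f) = P_A \exp\left[ jka \cos(90^\circ - \theta_I) \right], \\
P_R &= P(\mathbf{r}_{\text{contro},R}, \theta_I, f) = P_A \exp\left[ jka \cos(90^\circ + \theta_I) \right].
\end{aligned} \tag{9.6.3}
$$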
The overall square error of complex-valued pressures at two controlled points is evaluated using
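$$\text{Err}_4 = \left| P'(\mathbf{r}_{\text{contro},L}, f) - P(\mathbf{r}_{\text{contro},L}, \theta_I, f) \right|^2 + \left| P'(\mathbf{r}_{\text{contro},R}, f) - P(\mathbf{r}_{\text{contro},R}, \theta_I, f) \right|^2. \tag{9.6.4}$$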
Equations (9.6.2) and (9.6.3) are substituted into Equation (9.6.4) to evaluate the virtual source direction, and the amplitude PA and the incident azimuth θI of the target plane wave are chosen to minimize the error in Equation (9.6.4) or, equivalently, to satisfy
$$\frac{\partial\, \text{Err}_4}{\partial P_A} = 0, \qquad \frac{\partial\, \text{Err}_4}{\partial \theta_I} = 0. \tag{9.6.5}$$
The optimal matched target or virtual source direction is found by using the following equation:
$$\sin\theta_I = \frac{1}{ka} \arctan\left[ \frac{\displaystyle\sum_{i=0}^{M-1} A_i \sin(ka \sin\theta_i)}{\displaystyle\sum_{i=0}^{M-1} A_i \cos(ka \sin\theta_i)} \right]. \tag{9.6.6}$$

Equation (9.6.6) is identical to the summing localization equation of multiple horizontal secondary sources (loudspeakers) for a fixed head, Equation (3.2.6). At low frequencies with ka ≪ 1, Equation (9.6.6) simplifies to Equation (3.2.7).
The optimal matched target or virtual plane wave amplitude and the corresponding minimal square error Err4,min can be evaluated from Equation (9.6.5). The general results are complicated and are omitted here. At low frequencies with ka ≪ 1, the best-matched plane wave amplitude is given as
$$P_A = \sum_{i=0}^{M-1} A_i. \tag{9.6.7}$$
That is, PA is the sum of the normalized amplitudes of the driving signals of the secondary sources. If the normalized amplitudes of the driving signals satisfy
$$\sum_{i=0}^{M-1} A_i = 1, \tag{9.6.8}$$
the optimal matched target sound field is a plane wave with a unit amplitude incident from azimuth θI:
$$P_A = 1. \tag{9.6.9}$$
Now suppose that the controlled points are continuously and uniformly distributed on a circle centered at the origin with radius r = a, where a can be, but is not limited to, the head radius. Under the far-field condition, the superposed pressure at the controlled point (a, θ) caused by M secondary sources is given as
$$P'(\mathbf{r}_{\text{contro}}, f) = P'(a, \theta, f) = \sum_{i=0}^{M-1} A_i \exp\left[ jka \cos(\theta - \theta_i) \right]. \tag{9.6.10}$$
If a target (virtual) source is located at a far-field distance and at an azimuth of θI, the pressure of the plane wave created by the target source at the controlled point (a, θ) is given as
$$P(\mathbf{r}_{\text{contro}}, \theta_I, f) = P(a, \theta, \theta_I, f) = P_A \exp\left[ jka \cos(\theta - \theta_I) \right]. \tag{9.6.11}$$
The integral square error of complex-valued pressure (reconstructed wavefront) over the circle of controlled points is evaluated by
$$\text{Err}_5 = \int_{-\pi}^{\pi} \left| P'(a, \theta, f) - P(a, \theta, \theta_I, f) \right|^2 d\theta. \tag{9.6.12}$$

Equations (9.6.10) and (9.6.11) are substituted into Equation (9.6.12) to evaluate the virtual source direction, and PA and θI of the target plane wave are chosen to minimize the error in Equation (9.6.12) or, equivalently, to satisfy
$$\frac{\partial\, \text{Err}_5}{\partial P_A} = 0, \qquad \frac{\partial\, \text{Err}_5}{\partial \theta_I} = 0. \tag{9.6.13}$$
The calculation in Equation (9.6.13) is complicated. However, at low frequencies with ka ≪ 1, the optimal matched target or virtual source direction is found using the following:
$$\tan\theta_I = \frac{\displaystyle\sum_{i=0}^{M-1} A_i \sin\theta_i}{\displaystyle\sum_{i=0}^{M-1} A_i \cos\theta_i}. \tag{9.6.14}$$
The optimally matched amplitude of the target plane wave is determined with the following:
$$P_A = \sum_{i=0}^{M-1} A_i. \tag{9.6.15}$$
Equation (9.6.14) is identical to the summing localization equation for the head oriented to the virtual source, Equation (3.2.9).
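The low-frequency result in Equation (9.6.14) can be checked numerically by approximating the integral in Equation (9.6.12) with a uniform sum over the circle of controlled points and searching for the azimuth that minimizes it. This is only a brute-force sketch under assumed values: the amplitudes, loudspeaker azimuths, ka, grid resolution, and function name are hypothetical, and PA is fixed to Σ Ai as in Equation (9.6.15) rather than optimized jointly.

```python
import numpy as np

def err5(A, theta_i, P_A, theta_I, ka, n_points=720):
    """Equation (9.6.12), approximated by a uniform sum over the circle."""
    theta = np.linspace(-np.pi, np.pi, n_points, endpoint=False)
    # Equation (9.6.10): superposed pressure from the M secondary sources.
    recon = np.sum(A[:, None] * np.exp(1j * ka * np.cos(theta[None, :] - theta_i[:, None])),
                   axis=0)
    # Equation (9.6.11): target plane wave from azimuth theta_I.
    target = P_A * np.exp(1j * ka * np.cos(theta - theta_I))
    return np.sum(np.abs(recon - target) ** 2) * (2.0 * np.pi / n_points)

# Hypothetical stereo pair at +/-30 degrees at a low frequency (ka << 1).
A = np.array([0.7, 0.3])
theta_i = np.radians([30.0, -30.0])
ka = 0.05

# Grid search over candidate azimuths with P_A = sum(A_i), Equation (9.6.15).
candidates = np.radians(np.arange(-90.0, 90.0, 0.05))
errors = [err5(A, theta_i, A.sum(), t, ka) for t in candidates]
theta_numeric = candidates[int(np.argmin(errors))]

# Closed-form direction from Equation (9.6.14).
theta_tangent = np.arctan2(np.sum(A * np.sin(theta_i)), np.sum(A * np.cos(theta_i)))
```

With these values the grid-search minimum and the closed-form azimuth should agree to within the grid resolution, which is consistent with Equation (9.6.14) being the low-frequency minimizer of Err5.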
Here, the summing localization equations of multichannel sound reproduction are derived from the criterion of minimizing the reconstructed pressure errors at the controlled points. Unlike the derivation in Section 3.2, the psychoacoustic cue (ITD) of low-frequency localization is not considered here. Choosing different controlled points and error criteria and then minimizing the pressure error leads to different summing localization equations. Minimizing the square error of complex-valued pressures at two ears results in the localization equation for a fixed head; minimizing the integral square error of complex-valued pressures over a circle leads to the localization equation for the head oriented to the virtual source. For horizontal Ambisonics with conventional driving signal mixing, Equations (4.3.60) and (4.3.61) indicate that the two optimal matched conditions can be satisfied at the same time at low frequencies. However, for some other signal panning or mixing methods, such as pair-wise amplitude panning (Section 4.1.2), the two optimal matched conditions cannot be satisfied at the same time. In these cases, the perceived virtual source directions for a fixed head and for the head oriented to the virtual source are different, especially for a pair of stereophonic loudspeakers (secondary sources) with a large span angle and for a pair of side loudspeakers. Therefore, analyzing the reconstructed sound field helps provide insights into the physical nature of summing localization equations.
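The dependence on span angle can be made concrete with a short comparison of the two low-frequency predictions for a loudspeaker pair. The sketch assumes the ka ≪ 1 limit of Equation (9.6.6), sin θI = Σ Ai sin θi / Σ Ai, for the fixed head and Equation (9.6.14) for the head oriented to the virtual source; the amplitudes, half-span angles, and function names are hypothetical.

```python
import numpy as np

def fixed_head_lf(A, theta_i):
    """ka << 1 limit of Equation (9.6.6): sine-type law for a fixed head."""
    return np.arcsin(np.sum(A * np.sin(theta_i)) / np.sum(A))

def head_oriented_lf(A, theta_i):
    """Equation (9.6.14): tangent-type law for the head oriented to the source."""
    return np.arctan2(np.sum(A * np.sin(theta_i)), np.sum(A * np.cos(theta_i)))

A = np.array([0.7, 0.3])   # pair-wise amplitude panning gains (example values)
results = []
for half_span in (30.0, 60.0):
    theta_i = np.radians([half_span, -half_span])
    fixed = np.degrees(fixed_head_lf(A, theta_i))
    oriented = np.degrees(head_oriented_lf(A, theta_i))
    results.append((half_span, fixed, oriented))
    print(f"half-span {half_span:.0f} deg: fixed head {fixed:.1f} deg, "
          f"head oriented {oriented:.1f} deg")
```

In this example the two predictions differ by only a degree or two for the ±30° pair but by well over ten degrees for the ±60° pair, in line with the remark about large span angles.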
This method can be extended to the analysis of the reconstructed sound field of multichannel spatial surround sound, but this extension is omitted here for brevity.
9.6.3 Multiple receiver position matching method and its relation to the mode-matching method
Under certain conditions, given the discrete and finite secondary source array and the controlled points, the driving signals can be solved from Equation (9.6.1). Without loss of generality, the sound pressures at O controlled points specified by the position vectors rcontro,o, o = 0, 1, …, (O − 1), are given as
$$P'(\mathbf{r}_{\text{contro},o}, f) = \sum_{i=0}^{M-1} G(\mathbf{r}_{\text{contro},o}, \mathbf{r}_i, f)\, E_i(\mathbf{r}_i, f), \qquad o = 0, 1, \ldots, (O-1). \tag{9.6.16}$$
Equation (9.6.16) can be written in matrix form:
$$\mathbf{P}' = [G]\,\mathbf{E}, \tag{9.6.17}$$
where P′ = [P′(rcontro,0, f), P′(rcontro,1, f), …, P′(rcontro,O−1, f)]T is an O × 1 column vector or matrix composed of the sound pressures at O controlled points; E is an M × 1 column vector or matrix composed of the driving signals of M secondary sources; and [G] is an O × M matrix composed of the complex-valued transfer functions from M secondary sources to O controlled points, whose entries are Goi = G(rcontro,o, ri, f), o = 0, 1, …, (O − 1), i = 0, 1, …, (M − 1).
Equation (9.6.17) is the general formulation for controlling the sound pressures at multiple receiver positions by multiple secondary sources. This formulation is suitable for various sound field reconstruction systems and secondary source arrays. From the viewpoint of signal processing, this is a multiple-input and multiple-output (MIMO) system problem. If the driving signals in Equation (9.6.17) are chosen so that the reconstructed sound pressures at the O controlled points match the target sound pressures, then the O × 1 vector on the left side of Equation (9.6.17) becomes P′ = P. In this case, Equation (9.6.17) is a matrix equation or a set of linear equations with respect to vector E or the M driving signals Ei(ri, f). Solving for the vector E of driving signals is a multichannel inverse filtering problem (Nelson et al., 1996).
When the number of controlled points is equal to the number of secondary sources and matrix [G] has full rank, i.e., rank [G] = O = M, a unique solution of Equation (9.6.17) for the driving signals is given as
$$\mathbf{E} = [G]^{-1}\,\mathbf{P}. \tag{9.6.18}$$
In this case, the errors in the reconstructed sound pressures at the O controlled points vanish.
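A small numerical sketch of Equation (9.6.18) is given below for the square case O = M = 4. It assumes free-field point-source transfer functions exp(−jkR)/(4πR) for the entries of [G] and a unit plane wave as the target field; the irregular source azimuths, controlled point coordinates, frequency, and helper names are hypothetical and were chosen to avoid symmetric layouts, which can make [G] rank deficient.

```python
import numpy as np

C_SOUND = 343.0  # assumed speed of sound in m/s

def transfer_matrix(ctrl_pos, src_pos, f):
    """O x M matrix [G]; entries assume free-field point sources."""
    k = 2.0 * np.pi * f / C_SOUND
    dist = np.linalg.norm(ctrl_pos[:, None, :] - src_pos[None, :, :], axis=-1)
    return np.exp(-1j * k * dist) / (4.0 * np.pi * dist)

def plane_wave(ctrl_pos, theta_I, f):
    """Target pressures of a unit plane wave incident from azimuth theta_I."""
    k = 2.0 * np.pi * f / C_SOUND
    direction = np.array([np.cos(theta_I), np.sin(theta_I)])
    return np.exp(1j * k * (ctrl_pos @ direction))

f = 500.0
src_az = np.radians([20.0, 100.0, 190.0, 300.0])          # hypothetical azimuths
src_pos = 2.0 * np.stack([np.cos(src_az), np.sin(src_az)], axis=1)
ctrl_pos = np.array([[0.10, 0.02], [-0.03, 0.09],
                     [-0.08, -0.05], [0.04, -0.07]])       # hypothetical points (m)

G = transfer_matrix(ctrl_pos, src_pos, f)
P = plane_wave(ctrl_pos, np.radians(45.0), f)

# Equation (9.6.18): solving the linear system is preferable to forming the inverse.
E = np.linalg.solve(G, P)
residual = np.linalg.norm(G @ E - P)
```

The residual at the controlled points is at the level of rounding error, although nothing is implied about the accuracy of the reconstructed field away from those points.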
Generally, the rank of matrix [G] is rank [G] = K ≤ min(O, M). When the number of controlled points is smaller than that of secondary sources, i.e., K ≤ O < M, Equation (9.6.17) is underdetermined, and infinitely many solutions for the driving signals exist. The pseudoinverse solution that minimizes the overall power of the driving signals is given as
$$\mathbf{E} = [G]^{+} \left\{ [G][G]^{+} \right\}^{-1} \mathbf{P}, \tag{9.6.19}$$
where superscript “+” denotes the transpose and conjugation of the matrix. Regularization can be applied to the solution to avoid the ill condition or instability in the pseudoinverse of matrix {[G][G]+} at some frequencies:
$$\mathbf{E} = [G]^{+} \left\{ [G][G]^{+} + \varepsilon [I] \right\}^{-1} \mathbf{P}, \tag{9.6.20}$$

where [I] is an O × O identity matrix, and ε is a regularization parameter that balances the stability and accuracy of the solution.
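The underdetermined case of Equations (9.6.19) and (9.6.20) can be illustrated with two controlled points (the ear positions of the simplified head model) and five secondary sources. As before, this is a sketch under stated assumptions: free-field point-source transfer functions, a unit plane wave target, and hypothetical values for the loudspeaker azimuths, array radius, frequency, and ε.

```python
import numpy as np

C_SOUND = 343.0  # assumed speed of sound in m/s

def transfer_matrix(ctrl_pos, src_pos, f):
    """O x M matrix [G]; entries assume free-field point sources."""
    k = 2.0 * np.pi * f / C_SOUND
    dist = np.linalg.norm(ctrl_pos[:, None, :] - src_pos[None, :, :], axis=-1)
    return np.exp(-1j * k * dist) / (4.0 * np.pi * dist)

def min_norm_solution(G, P, eps=0.0):
    """Equations (9.6.19) (eps = 0) and (9.6.20): E = G^+ {G G^+ + eps I}^-1 P."""
    Gh = G.conj().T                     # conjugate transpose, the "+" superscript
    O = G.shape[0]
    return Gh @ np.linalg.solve(G @ Gh + eps * np.eye(O), P)

f, a = 500.0, 0.0875
k = 2.0 * np.pi * f / C_SOUND
src_az = np.radians([0.0, 30.0, 110.0, 250.0, 330.0])     # hypothetical layout
src_pos = 2.0 * np.stack([np.cos(src_az), np.sin(src_az)], axis=1)
ctrl_pos = np.array([[0.0, a], [0.0, -a]])                 # left and right ear

G = transfer_matrix(ctrl_pos, src_pos, f)                  # 2 x 5, so O < M
theta_I = np.radians(60.0)
P = np.exp(1j * k * (ctrl_pos @ np.array([np.cos(theta_I), np.sin(theta_I)])))

E_exact = min_norm_solution(G, P)            # matches the targets exactly
E_reg = min_norm_solution(G, P, eps=1e-3)    # trades accuracy for smaller drive power
```

In this example the unregularized result coincides with the Moore-Penrose pseudoinverse solution, and increasing ε lowers the norm of the driving signal vector at the cost of a nonzero error at the two controlled points.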
When the number of controlled points is larger than that of secondary sources, i.e., O > M ≥ K, Equation (9.6.17) is overdetermined and thus without the exact solution. However, an approximate or pseudoinverse solution of driving signals can be found by minimizing the square norm of the error (cost function) between the complex-valued amplitude vectors of the reconstructed and target sound pressures:
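$$\min \text{Err}_6 = \min \left\{ \left[ \mathbf{P}' - \mathbf{P} \right]^{+} \left[ \mathbf{P}' - \mathbf{P} \right] \right\} = \min \sum_{o=0}^{O-1} \left| P'(\mathbf{r}_{\text{contro},o}, f) - P(\mathbf{r}_{\text{contro},o}, f) \right|^2. \tag{9.6.21}$$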
The result is given as
$$\mathbf{E} = \left\{ [G]^{+}[G] \right\}^{-1} [G]^{+}\, \mathbf{P}. \tag{9.6.22}$$
Regularization can be applied to the solution to avoid the ill condition or instability in the pseudoinverse of the matrix {[G]+[G]} at some frequencies:
$$\mathbf{E} = \left\{ [G]^{+}[G] + \varepsilon [I] \right\}^{-1} [G]^{+}\, \mathbf{P}, \tag{9.6.23}$$
where [I] is an M × M identity matrix.
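The overdetermined case can be sketched in the same way. The example below is an illustration under stated assumptions rather than the method of the cited references: it uses Python with NumPy, a hypothetical random complex `G` and `P`, and an arbitrary `eps`. It evaluates Equations (9.6.22) and (9.6.23) and reports the error of Equation (9.6.21) for both solutions.

```python
import numpy as np

rng = np.random.default_rng(1)
O, M = 8, 3  # more controlled points than secondary sources (overdetermined)

# Hypothetical transfer matrix [G] (O x M) and target pressures P (O x 1).
G = rng.standard_normal((O, M)) + 1j * rng.standard_normal((O, M))
P = rng.standard_normal(O) + 1j * rng.standard_normal(O)
GH = G.conj().T  # [G]^+, the conjugate transpose

# Equation (9.6.22): E = {[G]^+[G]}^{-1} [G]^+ P
E_ls = np.linalg.solve(GH @ G, GH @ P)

# Equation (9.6.23): E = {[G]^+[G] + eps [I]}^{-1} [G]^+ P
eps = 1e-2  # illustrative value only
E_reg = np.linalg.solve(GH @ G + eps * np.eye(M), GH @ P)

# Squared error of Equation (9.6.21) for both solutions.
err_ls = np.linalg.norm(G @ E_ls - P) ** 2
err_reg = np.linalg.norm(G @ E_reg - P) ** 2

print("squared error, least squares:", err_ls)
print("squared error, regularized  :", err_reg)
print("power of E_ls / E_reg       :",
      np.linalg.norm(E_ls) ** 2, np.linalg.norm(E_reg) ** 2)
```

The unregularized solution gives the smallest error, whereas the regularized solution accepts a slightly larger error in exchange for driving signals with lower power.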
The method discussed above is the least-squares error method for controlling the sound pressures at multiple receiver positions (Kirkeby and Nelson, 1993). However, the driving signals obtained by this method may be non-causal and thus unrealizable. Kirkeby et al. (1996) further proposed a method to obtain causal driving signals in the time domain.
Equation (9.6.17) can also be solved by the method of singular value decomposition (SVD). If the rank of the O × M transfer matrix [G] in Equation (9.6.17) is K = rank [G] ≤ min (O, M), {[G][G]+} and {[G]+[G]} are O × O and M × M Hermitian matrices, respectively; they share K real and positive eigenvalues δ0² ≥ δ1² ≥ … ≥ δK−1² > 0, and the other eigenvalues are zero:
$$[G][G]^{+}\,\mathbf{u}_\kappa = \delta_\kappa^{2}\,\mathbf{u}_\kappa, \qquad [G]^{+}[G]\,\mathbf{v}_\kappa = \delta_\kappa^{2}\,\mathbf{v}_\kappa, \qquad \kappa = 0, 1 \ldots (K-1), \tag{9.6.24}$$
where the eigenvectors uκ and vκ are the O × 1 left singular vector and the M × 1 right singular vector of matrix [G], respectively, and they satisfy the following orthogonality and normalization:
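$$\mathbf{u}_{\kappa'}^{+}\mathbf{u}_\kappa = \mathbf{v}_{\kappa'}^{+}\mathbf{v}_\kappa = \begin{cases} 1 & \kappa = \kappa' \\ 0 & \kappa \neq \kappa' \end{cases}. \tag{9.6.25}$$
The SVD of [G] is given as
$$[G] = [U][\Delta][V]^{+}, \tag{9.6.26}$$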
where [Δ] is an O × M singular value matrix, whose K non-zero main-diagonal entries are the singular values of [G] in descending order, i.e., δ0 ≥ δ1 ≥ … ≥ δK−1 > 0:
$$[\Delta] = \begin{bmatrix}
\delta_0 & 0 & \cdots & 0 & \cdots & 0 \\
0 & \delta_1 & \cdots & 0 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots & & \vdots \\
0 & 0 & \cdots & \delta_{K-1} & \cdots & 0 \\
\vdots & \vdots & & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & 0 & \cdots & 0
\end{bmatrix}. \tag{9.6.27}$$
[U] and [V] are O × O and M × M unitary matrices, respectively, with [U]−1 = [U]+ and [V]−1 = [V]+. The first K columns of matrix [U] are constructed from the K orthonormal eigenvectors uκ, whereas the first K columns of matrix [V] are constructed from the K orthonormal eigenvectors vκ. Substituting Equation (9.6.26) into Equation (9.6.17) yields
$$\mathbf{P}' = [U][\Delta][V]^{+}\,\mathbf{E}. \tag{9.6.28}$$
If P′ = P at each given frequency, driving signals are solved from Equation (9.6.28):
$$\mathbf{E} = [V][1/\Delta][U]^{+}\,\mathbf{P}, \tag{9.6.29}$$
where [1/Δ] is an M × O diagonal matrix with K non-zero main-diagonal entries:
$$[1/\Delta] = \begin{bmatrix}
\delta_0^{-1} & 0 & \cdots & 0 & \cdots & 0 \\
0 & \delta_1^{-1} & \cdots & 0 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots & & \vdots \\
0 & 0 & \cdots & \delta_{K-1}^{-1} & \cdots & 0 \\
\vdots & \vdots & & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & 0 & \cdots & 0
\end{bmatrix}. \tag{9.6.30}$$
When δκ is small, the corresponding matrix entry 1/δκ in Equation (9.6.30) is large, leading to the instability of driving signals in Equation (9.6.29). In this case, only the singular values δκ in Equation (9.6.27) that exceed a certain threshold are retained, and the other small δκ are discarded. The solution of driving signals given in Equations (9.6.29) and (9.6.30) then becomes stable.
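The thresholding of small singular values can be sketched as follows. This is an illustrative example under assumptions that are not in the source: it uses Python with NumPy, the function name `truncated_svd_solve` and the relative threshold are hypothetical, and the ill-conditioned matrix is synthesized from prescribed singular values instead of measured transfer functions.

```python
import numpy as np

def truncated_svd_solve(G, P, rel_threshold=1e-3):
    """Solve G E = P via Equation (9.6.29), discarding small singular values.

    Singular values not exceeding rel_threshold times the largest one are
    dropped, i.e., their reciprocals in [1/Delta] are set to zero.
    """
    U, s, Vh = np.linalg.svd(G, full_matrices=False)  # G = U diag(s) Vh
    keep = s > rel_threshold * s[0]
    inv_s = np.zeros_like(s)
    inv_s[keep] = 1.0 / s[keep]
    E = Vh.conj().T @ (inv_s * (U.conj().T @ P))
    return E, int(keep.sum())

rng = np.random.default_rng(3)
n = 4
# Random unitary factors obtained from QR decompositions.
Qu, _ = np.linalg.qr(rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)))
Qv, _ = np.linalg.qr(rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)))
sv = np.array([1.0, 0.5, 0.1, 1e-9])  # one nearly vanishing singular value
G = Qu @ np.diag(sv) @ Qv.conj().T
P = rng.standard_normal(n) + 1j * rng.standard_normal(n)

E_full, n_full = truncated_svd_solve(G, P, rel_threshold=0.0)  # keeps every mode
E_trunc, n_kept = truncated_svd_solve(G, P)                    # drops the weak mode

print("modes kept:", n_full, "->", n_kept)
print("norm of driving signals:", np.linalg.norm(E_full), "->", np.linalg.norm(E_trunc))
```

Keeping the weakest mode typically inflates the driving signals by many orders of magnitude, whereas the truncated solution stays bounded by the norm of P divided by the smallest retained singular value.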
The aforementioned method is essentially a multiple receiver position matching method, in which the sound field is sampled spatially and the sound pressures at the O receiver positions are controlled to match the target sound pressures as closely as possible. Suppose that the target sound field is spatially bandlimited and that the receiver region is sampled by a grid of controlled points in which the distance between adjacent points does not exceed half of the minimal wavelength. Then, according to the Shannon–Nyquist spatial sampling theorem, a match of the reconstructed sound pressures at all the controlled points means an accurate reconstruction of the target sound field in the concerned region (neglecting the Gibbs effect on the boundary). Otherwise, spatial aliasing errors occur in the reconstructed sound
field. Similar to the case of Ambisonics in Section 9.4.1, the controlled points and secondary source array should be chosen appropriately so that the transfer matrix [G] is well-conditioned within the concerned frequency range and stable driving signals are obtained by solving Equation (9.6.17).
To raise the upper frequency limit below which spatial aliasing is avoided, Kolundžija et al. (2011) suggested using only those secondary sources that contribute most to the reconstructed sound field to control the sound pressures at receiver positions. Active secondary sources in the array are selected according to the positions of the target source and the reconstructed region on the basis of appropriate geometrical acoustic criteria. In addition, equalization can be introduced into the filters for the driving signals so that the overall sound power at the controlled points is constant.
As an example of the multiple receiver position matching method, horizontal far-field Ambisonics is considered. The target sound field is a plane wave with unit amplitude incident from θS. O controlled points are located uniformly on a circle with radius r, and the azimuth of the oth controlled point is θo, o = 0, 1 … (O − 1). Target sound pressures at the controlled points are calculated with Equations (9.3.1) and (9.3.2). M secondary sources are arranged on a circle with radius r0; the azimuth of the ith secondary source is θi, and the corresponding normalized amplitude of the driving signal is Ai, i = 0, 1 … (M − 1). For secondary sources at a far-field distance, which can be approximated as plane wave sources, the reconstructed sound pressures at the O controlled points are calculated using Equation (9.2.14). Matching the reconstructed sound pressures with the target sound pressures at the O controlled points yields a set of O equations:
$$J_0(kr) + 2\sum_{q=1}^{\infty} j^{q} J_q(kr)\left(\cos q\theta_S \cos q\theta_o + \sin q\theta_S \sin q\theta_o\right) = \sum_{i=0}^{M-1} A_i \left[J_0(kr) + 2\sum_{q=1}^{\infty} j^{q} J_q(kr)\left(\cos q\theta_i \cos q\theta_o + \sin q\theta_i \sin q\theta_o\right)\right], \qquad o = 0, 1 \ldots (O-1). \tag{9.6.31}$$
According to the discussion in Equations (9.3.14) and (9.3.15), the summation of azimuthal harmonics in Equation (9.6.31) can be truncated at order Q = integer (kr), which is equivalent to sampling the sound field along a circle with radius r at an interval of half a wavelength. Moreover, the driving signals satisfy Equation (9.3.5) if the number of controlled points or azimuthal sampling points satisfies O = M ≥ (2Q + 1). This result can be proven by multiplying both sides of Equation (9.6.31) by cos qθo or sin qθo, summing over θo, and using the discrete orthogonality of trigonometric functions given in Equations (4.3.16) to (4.3.18). The above example indicates that matching the sound pressures at discrete azimuthal sampling points yields results identical to those obtained by matching the sound pressure on the whole continuous circle, provided that the condition of the Shannon–Nyquist spatial sampling theorem is satisfied.
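The circular example can be checked numerically. The following sketch is not from the source and rests on several assumptions: it uses Python with NumPy, evaluates Jq(kr) with a simple numerical form of Bessel's integral (the helper `bessel_j` is hypothetical), picks kr = 2 and θS = 25° arbitrarily, and truncates the series in Equation (9.6.31) at Q = integer (kr). It then solves the O = M = 2Q + 1 matching equations and compares the result with the harmonic-matching conditions ΣAi cos qθi = cos qθS and ΣAi sin qθi = sin qθS, which is the form Equation (9.3.5) is assumed to take here.

```python
import numpy as np

def bessel_j(q, x, n=256):
    """Integer-order Bessel function J_q(x) from Bessel's integral,
    evaluated as the mean of the periodic integrand on a uniform grid."""
    tau = 2.0 * np.pi * np.arange(n) / n
    return np.mean(np.cos(q * tau - x * np.sin(tau)))

kr = 2.0                      # illustrative value of k*r
Q = int(kr)                   # truncation order Q = integer(kr)
M = O = 2 * Q + 1             # number of secondary sources and controlled points
theta_S = np.deg2rad(25.0)    # illustrative target plane wave direction
theta_i = 2.0 * np.pi * np.arange(M) / M   # secondary source azimuths
theta_o = 2.0 * np.pi * np.arange(O) / O   # controlled point azimuths

def pressure(theta_src, theta_rcv):
    """Truncated series J0 + 2 sum_q j^q J_q(kr) cos q(theta_rcv - theta_src)."""
    d = np.asarray(theta_rcv)[:, None] - np.atleast_1d(theta_src)[None, :]
    p = np.full(d.shape, bessel_j(0, kr), dtype=complex)
    for q in range(1, Q + 1):
        p += 2.0 * (1j ** q) * bessel_j(q, kr) * np.cos(q * d)
    return p

P = pressure(theta_S, theta_o)[:, 0]   # left side of Equation (9.6.31)
G = pressure(theta_i, theta_o)         # O x M matrix on the right side
A = np.linalg.solve(G, P)              # matched driving amplitudes A_i

# Closed form implied by the harmonic-matching conditions (assumed form).
A_ambi = (1.0 + 2.0 * sum(np.cos(q * (theta_i - theta_S))
                          for q in range(1, Q + 1))) / M

print("A from pressure matching:", np.round(A.real, 6))
print("A from harmonic matching:", np.round(A_ambi, 6))
```

Under these assumptions the two sets of amplitudes coincide, which is consistent with the statement that pressure matching at 2Q + 1 uniform sampling points is equivalent to matching the azimuthal harmonics up to order Q.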
Multiple receiver position matching methods are closely related to the mode-matching method in Section 9.2.2 (Nelson and Kahana, 2001). With P′ denoting arbitrary reconstructed sound pressures rather than the target sound pressures P, and using [U]+ = [U]−1, Equation (9.6.28) becomes
$$[U]^{+}\,\mathbf{P}' = [\Delta][V]^{+}\,\mathbf{E}, \tag{9.6.32}$$
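or
$$\mathbf{P}'_U = [\Delta]\,\mathbf{E}_V, \tag{9.6.33}$$
where
$$\mathbf{P}'_U = [U]^{+}\,\mathbf{P}', \qquad \mathbf{E}_V = [V]^{+}\,\mathbf{E}. \tag{9.6.34}$$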
Therefore, the O × O unitary matrix [U]+ transforms the sound pressure vector P′ of the controlled points to a new vector P′U, and the M × M unitary matrix [V]+ transforms the driving signal vector E to a new vector EV. Vectors P′U and EV are equivalent representations of P′ and E, which represent the spatial modes of the sound field (pressure) and the driving signals, respectively. By using Equations (9.6.24) to (9.6.27), Equation (9.6.32) can be written as
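$$\mathbf{u}_\kappa^{+}\,\mathbf{P}' = \delta_\kappa\,\mathbf{v}_\kappa^{+}\,\mathbf{E}, \qquad \kappa = 0, 1 \ldots (K-1). \tag{9.6.35}$$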
Equation (9.6.35) indicates that a particular spatial mode component $\mathbf{u}_\kappa^{+}\mathbf{P}'$ of the sound field is created by the corresponding spatial mode component $\mathbf{v}_\kappa^{+}\mathbf{E}$ of the driving signals in the SVD representation. Therefore, this method controls the K independent modes of the reconstructed sound field.
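The mode-by-mode relationship can be verified with a few lines of code. The sketch below is an illustration, assuming Python with NumPy and a hypothetical random complex transfer matrix; `numpy.linalg.svd` returns the factors of G = U diag(s) Vh, so `Vh` plays the role of [V]+.

```python
import numpy as np

rng = np.random.default_rng(2)
O, M = 5, 4

# Hypothetical transfer matrix [G] (O x M) and arbitrary driving signals E.
G = rng.standard_normal((O, M)) + 1j * rng.standard_normal((O, M))
E = rng.standard_normal(M) + 1j * rng.standard_normal(M)
P_rec = G @ E                      # reconstructed pressures P'

U, s, Vh = np.linalg.svd(G)        # G = U diag(s) Vh, with Vh = [V]^+
K = np.linalg.matrix_rank(G)       # number of non-zero singular values

P_U = U.conj().T @ P_rec           # modal pressures, [U]^+ P'
E_V = Vh @ E                       # modal driving signals, [V]^+ E

# Equation (9.6.35): each pressure mode is driven by exactly one signal mode.
for kappa in range(K):
    print(kappa, P_U[kappa], s[kappa] * E_V[kappa])
```

Each printed pair agrees, and the remaining O − K components of `P_U` vanish, so the driving signals cannot excite those pressure modes.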
Spatial Ambisonics is analyzed as an example to obtain insights into the relationship between the multiple receiver position matching and mode-matching methods. Ambisonics, or more strictly, spatial harmonic decomposition and reconstruction, can be regarded as a method of controlling the independent modes of a reconstructed sound field. M secondary sources are arranged on a spherical surface with radius r0 at directions Ωi, i = 0, 1 … (M − 1). O controlled points are located on a spherical surface with radius r < r0 at directions Ωo, o = 0, 1 … (O − 1). Similar to Equation (9.3.22), an L² × M matrix [Y3D(Ωi)] associated with the secondary source array is introduced, and its entries are the real-valued spherical harmonic functions $Y_{lm}^{(\sigma)}(\Omega_i)$ of the secondary source directions. Each row of the matrix corresponds to a given (l, m, σ) with l = 0, 1 … (L − 1), m = 0, 1 … l, σ = 1, 2, and each column corresponds to a particular secondary source direction. Similarly, an L² × O matrix [Y3D(Ωo)] associated with the locations of the controlled points is introduced, and its entries are the real-valued spherical harmonic functions $Y_{lm}^{(\sigma)}(\Omega_o)$ of the controlled point directions. Each row of the matrix corresponds to a given (l, m, σ) with l = 0, 1 … (L − 1), m = 0, 1 … l, σ = 1, 2, and each column corresponds to a particular controlled point direction.
If both secondary sources and controlled points are uniformly or nearly uniformly distributed on spherical surfaces and the number of secondary sources and the number of controlled points satisfy the requirement of Shannon–Nyquist spatial sampling theorem, and if the directional sampling of spherical harmonic functions satisfies the discrete orthogonality given in Equation (A.25) in Appendix A, the matrices [Y3D(Ωi)] and [Y3D(Ωo)] satisfy
$$\frac{4\pi}{M}\,[Y_{3D}(\Omega_i)][Y_{3D}(\Omega_i)]^{T} = [I], \qquad \frac{4\pi}{O}\,[Y_{3D}(\Omega_o)][Y_{3D}(\Omega_o)]^{T} = [I], \tag{9.6.36}$$
where [I] is an L² × L² identity matrix. In Equation (9.6.17), the entries of the transfer matrix [G] can be decomposed by real-valued spherical harmonic functions. For secondary
point sources, the following equation is obtained from Equations (9.2.37) and (9.2.41) or directly from Equation (9.3.35):
$$G_{oi} = G(\mathbf{r}_{\mathrm{contro},o}, \mathbf{r}_i, f) = \sum_{l=0}^{\infty}\sum_{m=0}^{l}\sum_{\sigma=1}^{2} g_l\, Y_{lm}^{(\sigma)}(\Omega_o)\, Y_{lm}^{(\sigma)}(\Omega_i), \qquad g_l = -jk\,h_l(kr_0)\,j_l(kr). \tag{9.6.37}$$
Truncating Equation (9.6.37) up to the order (L − 1), matrix [G] can be written as
$$[G] = [Y_{3D}(\Omega_o)]^{T}\,[g]\,[Y_{3D}(\Omega_i)], \tag{9.6.38}$$
where [g] is an L² × L² diagonal matrix, whose diagonal entries associated with the l-order spherical harmonic functions are denoted by gl. Substituting Equation (9.6.38) into Equation (9.6.17) yields
$$\mathbf{P}' = [Y_{3D}(\Omega_o)]^{T}\,[g]\,[Y_{3D}(\Omega_i)]\,\mathbf{E}. \tag{9.6.39}$$
Left-multiplying both sides of Equation (9.6.39) by (4π/O)[Y3D(Ωo)] and using Equation (9.6.36) yields
$$\frac{4\pi}{O}\,[Y_{3D}(\Omega_o)]\,\mathbf{P}' = [g]\,[Y_{3D}(\Omega_i)]\,\mathbf{E}. \tag{9.6.40}$$
The left side of Equation (9.6.40) represents an L² × 1 column vector P′lm of the spherical harmonic coefficients (spectrum) up to order (L − 1) of the sound pressures at the O discrete controlled points. This result can be derived by using the discrete orthogonality of the spherical harmonic functions given in Equation (9.3.68). The components of vector P′lm are the $P_{lm}^{\prime(\sigma)}(r, r_0, f)$ given in Equation (9.2.31). Similarly, the right side of Equation (9.6.40) contains an L² × 1 column vector Elm of the spherical harmonic coefficients (spectrum) up to order (L − 1) of the driving signals of the M secondary sources. The components of vector Elm are the $E_{lm}^{(\sigma)}(f)$ given in Equation (9.2.44). Then, Equation (9.6.40) becomes
$$\mathbf{P}'_{lm} = \frac{M}{4\pi}\,[g]\,\mathbf{E}_{lm}. \tag{9.6.41}$$
That is,
$$P_{lm}^{\prime(\sigma)}(r, r_0, f) = \frac{M}{4\pi}\, g_l\, E_{lm}^{(\sigma)}(f). \tag{9.6.42}$$
Equation (9.6.42) indicates that a particular spatial mode component of the sound field is also created by the corresponding spatial mode component of the driving signals in the spherical harmonic representation. Equation (9.6.42) is equivalent to Equation (9.2.45) except for a scale factor caused by the discrete secondary source array.
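The chain from Equation (9.6.36) to Equation (9.6.41) can be checked on a small first-order case (L = 2, L² = 4). The sketch below is an illustration built on assumptions that are not in the source: it uses Python with NumPy, places O = 6 controlled points at the vertices of a regular octahedron and M = 8 secondary sources at the vertices of a cube, uses real-valued spherical harmonics normalized to unit integral over the sphere, and fills [g] with placeholder values instead of −jk hl(kr0) jl(kr). The driving-signal spectrum is taken as Elm = (4π/M)[Y3D(Ωi)]E, which is the definition implied by comparing Equations (9.6.40) and (9.6.41).

```python
import numpy as np

def real_sh_order1(dirs):
    """Real spherical harmonics up to order 1 at unit vectors dirs (N x 3).
    Rows: (l,m,sigma) = (0,0,1), (1,0,1), (1,1,1), (1,1,2); result is 4 x N."""
    x, y, z = dirs.T
    c0 = 1.0 / np.sqrt(4.0 * np.pi)
    c1 = np.sqrt(3.0 / (4.0 * np.pi))
    return np.vstack([c0 * np.ones_like(x), c1 * z, c1 * x, c1 * y])

# Controlled points: octahedron vertices (O = 6).
octa = np.vstack([np.eye(3), -np.eye(3)])
# Secondary sources: cube vertices (M = 8), normalized to unit length.
cube = np.array([[sx, sy, sz] for sx in (1, -1)
                 for sy in (1, -1) for sz in (1, -1)]) / np.sqrt(3.0)
O, M = len(octa), len(cube)

Y_o = real_sh_order1(octa)   # L^2 x O matrix [Y3D(Omega_o)]
Y_i = real_sh_order1(cube)   # L^2 x M matrix [Y3D(Omega_i)]

# Discrete orthogonality, Equation (9.6.36).
I_o = 4.0 * np.pi / O * Y_o @ Y_o.T
I_i = 4.0 * np.pi / M * Y_i @ Y_i.T

# Placeholder radial terms g_0 and g_1 (NOT computed from Bessel/Hankel functions).
g0, g1 = 0.8 - 0.3j, 0.2 + 0.5j
g = np.diag([g0, g1, g1, g1])

G = Y_o.T @ g @ Y_i          # Equation (9.6.38)
rng = np.random.default_rng(4)
E = rng.standard_normal(M) + 1j * rng.standard_normal(M)
P_rec = G @ E                # Equation (9.6.39)

P_lm = 4.0 * np.pi / O * Y_o @ P_rec   # left side of Equation (9.6.40)
E_lm = 4.0 * np.pi / M * Y_i @ E       # assumed driving-signal spectrum
print(np.round(P_lm, 6))
print(np.round(M / (4.0 * np.pi) * g @ E_lm, 6))   # Equation (9.6.41)
```

With these two regular layouts the discrete orthogonality holds exactly at first order, so each pressure coefficient depends only on the driving-signal coefficient of the same (l, m, σ).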
Two matrices, namely, an L² × O matrix [TU] and an L² × M matrix [TV], are introduced, and they satisfy [TU] [TU]+ = [I] and [TV] [TV]+ = [I], where [I] is an L² × L² identity matrix. Inserting these two matrices into Equation (9.6.39) yields
$$\mathbf{P}' = [Y_{3D}(\Omega_o)]^{T}\,[T_U][T_U]^{+}\,[g]\,[T_V][T_V]^{+}\,[Y_{3D}(\Omega_i)]\,\mathbf{E}. \tag{9.6.43}$$