Abstract
Sound field recording using spherical harmonics (SH) has been widely used. However, too many microphones are needed when recording sound fields over large areas, due to the capture of the higher order of spherical harmonic coefficients. The theory of GO in deep learning inspired us. With training the data much less than all GO’s legal positions data, the Alpha Go has defeated top GO players. According to the information learned from a specific dataset, the higher spherical harmonics coefficients may be estimated with few captured sound pressures. In this paper, a learning-based approach for estimation of the SH coefficients has been investigated. In the proposed approach, SH coefficients are estimated with a feed-forward neural network (FNN) based on measurements of a spherical array. We generate a uniformly distributed dataset, try to evaluate the method on an average situation. Moreover, with the real sound field data in the SOFiA dataset, we try to evaluate the performance of our method when the correlations of data are weak. Experimental results show that the proposed approach achieves higher estimation accuracy of SH coefficients than a previously reported method. In simulations, 9 microphones’ performance using the proposed approach can approximate an array with 16 microphones. The experiments confirmed the feasibility of estimating the SH coefficients with the data-driven method. Thus in a specific application, it can be used to reduce the required number of microphones.
Similar content being viewed by others
References
Abhayapala T, Gupta A (2010) Spherical harmonic analysis of wavefields using multiple circular sensor arrays. IEEE Transactions on Audio Speech & Language Processing 18(6):1655–1666
Abhayapala T, Ward DB (2002) Theory and design of high order sound field microphones using spherical microphone array. In: IEEE International conference on acoustics, speech, and signal processing, pp II–1949–II–1952
Alon DL, Rafaely B (2017) Beamforming with optimal aliasing cancellation in spherical microphone arrays. IEEE/ACM Transactions on Audio Speech & Language Processing 24(1):196–210
Bishop CM (2006) Pattern recognition and machine learning. Springer
Chang J, Marschall M (2018) Periphony-lattice mixed-order ambisonic scheme for spherical microphone arrays. IEEE/ACM Transactions on Audio, Speech and Language Processing (TASLP) 26(5):924–936
Chen H, Abhayapala T, Zhang W (2015) 3d sound field analysis using circular higher-order microphone array. In: 2015 23rd European signal processing conference (EUSIPCO). IEEE, pp 1153–1157
Chen H, Abhayapala T, Zhang W (2015) Theory and design of compact hybrid microphone arrays on two-dimensional planes for three-dimensional soundfield analysis. J Acoust Soc Am 138(5):3081
Chollet F et al (2015) Keras. https://github.com/fchollet/keras
Epain N, Jin CT, Epain N, Jin CT, Epain N, Jin CT (2016) Spherical harmonic signal covariance and sound field diffuseness. IEEE/ACM Transactions on Audio, Speech and Language Processing (TASLP) 24(10):1796–1807
Fahim A, Samarasinghe PN, Abhayapala T (2017) Sound field separation in a mixed acoustic environment using a sparse array of higher order spherical microphones. In: Hands-free speech communications and microphone arrays
Fliege J Integration nodes for the sphere. http://www.personal.soton.ac.uk/jf1w07/nodes/nodes.html
Gerzon MA (1973) Periphony: with-height sound reproduction. J Audio Eng Soc 21(1):2–10
Gerzon MA (1985) Ambisonics in multichannel broadcasting and video. J Audio Eng Soc 33(11):859–871
Gupta A, Abhayapala T (2010) Double sided cone array for spherical harmonic analysis of wavefields. In: IEEE International conference on acoustics speech and signal processing, pp 77–80
Hohnerlein C, Ahrens J (2017) Spherical microphone array processing in python with the sound field analysis-py toolbox. Proc of DAGA, Kiel Germany
Iizuka S, Simo-Serra E, Ishikawa H (2017) Globally and locally consistent image completion. ACM Trans Graph (ToG) 36(4):107
Ioffe S, Szegedy C (2015) Batch normalization: accelerating deep network training by reducing internal covariate shift. arxiv, pp 448–456
Jin CT, Epain N, Parthy A (2013) Design, optimization and evaluation of a dual-radius spherical microphone array. IEEE/ACM Transactions on Audio, Speech, and Language Processing 22(1):193–204
Kennedy RA, Sadeghi, Abhayapala T, Jones HM (2007) Intrinsic limits of dimensionality and richness in random multipath fields. IEEE Transactions on Signal Processing 55(6):2542–2556
Kingma D, Ba J (2014) Adam: a method for stochastic optimization. arXiv:1412.6980
Koyama S, Furuya K, Wakayama K, Shimauchi S, Saruwatari H (2016) Analytical approach to transforming filter design for sound field recording and reproduction using circular arrays with a spherical baffle. J Acoust Soc Am 139(3):1024
Kumar L, Hegde RM (2016) Near-field acoustic source localization and beamforming in spherical harmonics domain. IEEE Transactions on Signal Processing 64(13):3351–3361
Miller E, Rafaely B (2019) The role of direct sound spherical harmonics representation in externalization using binaural reproduction. Appl Acoust 148:40–45
Okamoto T (2019) Horizontal 3d sound field recording and 2.5 d synthesis with omni-directional circular arrays. In: ICASSP 2019-2019 IEEE International conference on acoustics, speech and signal processing (ICASSP). IEEE, pp 960–964
Park M, Rafaely B (2005) Sound-field analysis by plane-wave decomposition using spherical microphone array. J Acoust Soc Am 118(5):3094–3103
Poletti MA (2005) Three-dimensional surround sound systems based on spherical harmonics. J Audio Eng Soc 53(11):1004–1025
Pomberger H, Pausch F (2014) Design and evaluation of a spherical segment array with double cone. Acta Acustica United with Acustica 100(5):921–927
Rafaely B (2005) Analysis and design of spherical microphone arrays. IEEE Transactions on Speech and Audio Processing 13(1):135–143
Samarasinghe PN, Abhayapala T (2017) Blind estimation of directional properties of room reverberation using a spherical microphone array. In: IEEE International conference on acoustics, speech and signal processing
Samarasinghe PN, Abhayapala T, Chen H (2017) Estimating the direct-to-reverberant energy ratio using a spherical harmonics-based spatial correlation model. IEEE/ACM Transactions on Audio, Speech, and Language Processing 25(2):310–319
Silver D, Huang A, Maddison CJ, Guez A, Sifre L, Van Den Driessche G, Schrittwieser J, Antonoglou I, Panneershelvam V, Lanctot M et al (2016) Mastering the game of go with deep neural networks and tree search. Nature 529(7587):484–489
Sun Y, Chen J, Yuen C, Rahardja S (2017) Indoor sound source localization with probabilistic neural network. IEEE Trans Ind Electron 65(8):6403–6413
Tromp J Number of legal go positions. https://tromp.github.io/go/legal.html
Ueno N, Koyama S, Saruwatari H (2018) Sound field recording using distributed microphones based on harmonic analysis of infinite order. IEEE Signal Processing Letters 25(1):135–139
Wakayama K, Trevino J, Takada H, Sakamoto S, Suzuki Y (2017) Extended sound field recording using position information of directional sound sources. In: 2017 IEEE Workshop on applications of signal processing to audio and acoustics (WASPAA). IEEE, pp 185–189
Ward DB, Abhayapala T (2001) Reproduction of a plane-wave sound field using an array of loudspeakers. IEEE Transactions on Speech and Audio Processing 9(6):697–707. 10.1109/89.943347
Williams EG (1999) Fourier acoustics: sound radiation and nearfield acoustical holography. Academic Press
Zhang W, Samarasinghe P, Chen H, Abhayapala T (2017) Surround by sound: a review of spatial audio recording and reproduction. Appl Sci 7 (5):532
Zuo H, Samarasinghe PN, Abhayapala T (2018) Exterior-interior 3d sound field separation using a planar array of differential microphones. In: 2018 16th international workshop on acoustic signal enhancement (IWAENC). IEEE, pp 216–220
Acknowledgments
This research is partially supported by the National Key R&D Program of China (No. 2017YFB1002803), National Nature Science Foundation of China (No. U1736206, No. 61761044), Hubei Province Technological Innovation Major Project (No. 2017AAA123).
Author information
Authors and Affiliations
Corresponding author
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
About this article
Cite this article
Zhang, L., Wang, X., Hu, R. et al. Estimation of spherical harmonic coefficients in sound field recording using feed-forward neural networks. Multimed Tools Appl 80, 6187–6202 (2021). https://doi.org/10.1007/s11042-020-09979-z
Received:
Revised:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s11042-020-09979-z