Estimation of spherical harmonic coefficients in sound field recording using feed-forward neural networks

Multimedia Tools and Applications

Abstract

Sound field recording based on spherical harmonics (SH) is widely used. However, recording a sound field over a large area requires many microphones, because higher-order SH coefficients must be captured. This work is inspired by the application of deep learning to the game of Go: trained on far fewer positions than the total number of legal Go positions, AlphaGo defeated top Go players. Similarly, using information learned from a specific dataset, higher-order SH coefficients may be estimated from only a few captured sound pressures. In this paper, we investigate a learning-based approach to estimating SH coefficients. In the proposed approach, a feed-forward neural network (FNN) estimates the SH coefficients from the measurements of a spherical array. We generate a uniformly distributed dataset to evaluate the method in an average-case situation. Moreover, using real sound field data from the SOFiA dataset, we evaluate the performance of the method when the correlations in the data are weak. Experimental results show that the proposed approach estimates SH coefficients more accurately than a previously reported method. In simulations, 9 microphones using the proposed approach approximate the performance of a 16-microphone array. The experiments confirm the feasibility of estimating SH coefficients with a data-driven method, which can therefore be used to reduce the required number of microphones in a specific application.
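
The mapping described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' network: the hidden layer sizes, the ReLU activation, the stacking of real and imaginary parts, and all function names are assumptions made for illustration, and the weights are random and untrained. It also assumes that the 16-microphone array corresponds to a third-order decomposition, using the standard count of (N + 1)^2 SH coefficients up to order N; the abstract itself does not state the order.

```python
import numpy as np


def num_sh_coefficients(order: int) -> int:
    """Number of SH coefficients up to and including the given order: (N + 1)^2."""
    return (order + 1) ** 2


def init_fnn(layer_sizes, seed=0):
    """Create random (untrained) weights and zero biases for each dense layer."""
    rng = np.random.default_rng(seed)
    params = []
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        weights = rng.normal(0.0, np.sqrt(2.0 / n_in), size=(n_in, n_out))
        params.append((weights, np.zeros(n_out)))
    return params


def fnn_forward(params, x):
    """Forward pass: ReLU on hidden layers, linear output layer."""
    for weights, bias in params[:-1]:
        x = np.maximum(x @ weights + bias, 0.0)
    weights, bias = params[-1]
    return x @ weights + bias


def estimate_sh(params, pressures):
    """Map complex microphone pressures to complex SH coefficient estimates."""
    # Stack real and imaginary parts so the network works on real numbers.
    features = np.concatenate([pressures.real, pressures.imag], axis=-1)
    out = fnn_forward(params, features)
    half = out.shape[-1] // 2
    return out[..., :half] + 1j * out[..., half:]


n_mics = 9                          # microphones actually used
n_coeffs = num_sh_coefficients(3)   # assumed target order 3
# Hypothetical layer sizes; the factor 2 accounts for real and imaginary parts.
params = init_fnn([2 * n_mics, 64, 64, 2 * n_coeffs])

# Placeholder input: random complex pressures for a batch of 4 frames.
rng = np.random.default_rng(1)
pressures = rng.normal(size=(4, n_mics)) + 1j * rng.normal(size=(4, n_mics))
sh_estimate = estimate_sh(params, pressures)
print(sh_estimate.shape)
```

In practice the weights would be fitted on pairs of measured pressures and reference SH coefficients; with the random weights used here the output only demonstrates the input and output dimensions.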

Acknowledgments

This research is partially supported by the National Key R&D Program of China (No. 2017YFB1002803), the National Natural Science Foundation of China (No. U1736206, No. 61761044), and the Hubei Province Technological Innovation Major Project (No. 2017AAA123).

Author information

Corresponding author

Correspondence to Xiaochen Wang.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Zhang, L., Wang, X., Hu, R. et al. Estimation of spherical harmonic coefficients in sound field recording using feed-forward neural networks. Multimed Tools Appl 80, 6187–6202 (2021). https://doi.org/10.1007/s11042-020-09979-z

  • DOI: https://doi.org/10.1007/s11042-020-09979-z
