A classification and quantification approach to generate features in soundscape ecology using neural networks

  • Original Article
  • Neural Computing and Applications

Abstract

In soundscape ecology, acoustic features are well established and provide important baselines for ecological analyses. In many cases, however, the analysis is difficult because of high class overlap in time-frequency characteristics and the presence of noise. Deep neural networks have become the state of the art for feature learning in many multi-class applications, but they often over-fit or perform unevenly across classes, which can hamper their deployment in realistic scenarios. Quantification, the task of estimating how many observations belong to each class, is attracting attention and has been shown to be effective in other applications. This paper investigates combining quantification with a classification loss to train a convolutional neural network to classify species of birds and anurans. Results indicate that quantification has advantages over both acoustic features alone and regular classification networks, in particular in terms of generalization and class recall, making it a suitable choice for segregation tasks in soundscape ecology. Moreover, we show that a more compact network can outperform a deeper one in fine-grained scenarios involving bird and anuran species.
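The idea of training with a classification term plus a quantification term can be illustrated with a minimal NumPy sketch. It is not the loss used in the paper: the quantification term here (mean absolute error between the batch-averaged predicted probabilities and the true class prevalence), the function names, and the `weight` parameter are all assumptions chosen for illustration.

```python
import numpy as np


def softmax(logits):
    """Row-wise softmax with the usual max-shift for numerical stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


def classification_loss(probs, labels_onehot, eps=1e-12):
    """Mean categorical cross-entropy over the batch."""
    return float(-np.mean(np.sum(labels_onehot * np.log(probs + eps), axis=1)))


def quantification_loss(probs, labels_onehot):
    """Assumed quantification term: mean absolute error between the predicted
    class prevalence (average probability per class) and the true prevalence
    (fraction of samples per class) in the batch."""
    predicted_prevalence = probs.mean(axis=0)
    true_prevalence = labels_onehot.mean(axis=0)
    return float(np.mean(np.abs(predicted_prevalence - true_prevalence)))


def combined_loss(logits, labels_onehot, weight=0.5):
    """Cross-entropy plus a weighted quantification term (weight is hypothetical)."""
    probs = softmax(logits)
    return (classification_loss(probs, labels_onehot)
            + weight * quantification_loss(probs, labels_onehot))


if __name__ == "__main__":
    # Toy batch: 4 samples, 3 classes, confident and correct predictions.
    logits = np.array([[10.0, 0.0, 0.0],
                       [0.0, 10.0, 0.0],
                       [0.0, 0.0, 10.0],
                       [10.0, 0.0, 0.0]])
    labels = np.eye(3)[[0, 1, 2, 0]]
    print(combined_loss(logits, labels))
```

In this sketch the quantification term only looks at batch-level class proportions, so it penalizes a network that systematically over- or under-predicts a class even when individual cross-entropy values look acceptable.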

Notes

  1. An order of the class Amphibia that comprises frogs and toads.

  2. https://github.com/fabiofelix/CNN-CQ.

  3. Spatial Ecology and Conservation Lab (LEEC). Website: https://github.com/LEEClab.

  4. The complete area of the ecological corridor is located between northeastern São Paulo state and southern Minas Gerais state, Brazil.

  5. https://librosa.github.io/librosa/.

  6. https://www.kaggle.com/huseinzol05/sound-augmentation-librosa.

  7. http://ljvillanueva.github.io/soundecology/.

  8. https://cran.r-project.org/web/packages/tuneR.

  9. https://scikit-learn.org/stable/.

  10. https://keras.io/.

  11. https://www.tensorflow.org/.

  12. https://keras.io/getting_started/faq/.

Acknowledgements

This study was financed in part by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - Brasil (CAPES), Finance Code 001; FAPESP (Grant #2019/07316-0); and the Conselho Nacional de Desenvolvimento Científico e Tecnológico - Brasil (CNPq), Grants #307411/2016-8 and #304266/2020-5. The authors would like to thank Professor Mílton C. Ribeiro from the São Paulo State University, Rio Claro, Brazil, for his data and useful feedback.

Author information

Corresponding author

Correspondence to Fábio Felix Dias.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Dias, F.F., Ponti, M.A. & Minghim, R. A classification and quantification approach to generate features in soundscape ecology using neural networks. Neural Comput & Applic 34, 1923–1937 (2022). https://doi.org/10.1007/s00521-021-06501-w

  • DOI: https://doi.org/10.1007/s00521-021-06501-w
