Automatic Sound-Imitation Word Recognition from Environmental Sounds Focusing on Ambiguity Problem in Determining Phonemes

  • Conference paper
PRICAI 2004: Trends in Artificial Intelligence (PRICAI 2004)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3157)

Abstract

Sound-imitation words (SIWs), or onomatopoeia, are important for human-computer interaction and for the automatic tagging of sound archives. The main problem in automatic SIW recognition is ambiguity in determining phonemes, since different listeners hear the same environmental sound as different SIWs even in the same situation. To solve this problem, we designed a set of new phonemes, called the basic phoneme-group set, to represent environmental sounds, in addition to a set of articulation-based phoneme-groups. Automatic SIW recognition based on a Hidden Markov Model (HMM) with the basic phoneme-groups is allowed to generate multiple SIWs in order to absorb the ambiguities caused by listener and situation dependency. Listening experiments with seven subjects showed that automatic SIW recognition based on the basic phoneme-groups outperformed both recognition based on the articulation-based phoneme-groups and recognition based on Japanese phonemes. The proposed system proved better suited for use in human-computer interaction.
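The idea of letting one recognition result yield several SIWs can be illustrated with a minimal sketch. It assumes, purely for illustration, that a recognized phoneme-group stands for a small set of candidate phonemes and that the candidate SIWs are obtained by choosing one phoneme per group. The group labels, their members, and the function name `expand_siws` are hypothetical and are not taken from the paper, and the HMM recognition step itself is not shown.

```python
from itertools import product

# Hypothetical phoneme-groups: each label stands for the phonemes that
# different listeners might hear. Illustrative only, not the paper's sets.
PHONEME_GROUPS = {
    "k-t": ["k", "t"],
    "a": ["a"],
    "N": ["N"],
}


def expand_siws(group_sequence, groups=PHONEME_GROUPS):
    """Return every SIW spelled by picking one phoneme from each group."""
    choices = [groups[label] for label in group_sequence]
    return ["".join(phonemes) for phonemes in product(*choices)]


# A recognizer output of three groups expands to two candidate SIWs.
candidates = expand_siws(["k-t", "a", "N"])
print(candidates)
```

Under these assumptions, a single ambiguous group in the recognized sequence doubles the number of candidate SIWs, which is one way a system could present listener-dependent alternatives instead of committing to a single transcription.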

© 2004 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Ishihara, K., Nakatani, T., Ogata, T., Okuno, H.G. (2004). Automatic Sound-Imitation Word Recognition from Environmental Sounds Focusing on Ambiguity Problem in Determining Phonemes. In: Zhang, C., Guesgen, H.W., Yeap, W.K. (eds) PRICAI 2004: Trends in Artificial Intelligence. PRICAI 2004. Lecture Notes in Computer Science, vol 3157. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-28633-2_96

  • DOI: https://doi.org/10.1007/978-3-540-28633-2_96

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-22817-2

  • Online ISBN: 978-3-540-28633-2

  • eBook Packages: Springer Book Archive
