Abstract
In applications where the human voice controls the synthesis of musical instrument sounds, phonetics convey musical information that may be related to the sound of the imitated instrument. Our initial hypothesis is that phonetics are user- and instrument-dependent, but remain constant for a given subject and instrument. We propose a user-adapted system in which mappings from voice features to synthesis parameters depend on how subjects sing musical articulations, i.e., note-to-note transitions. The system consists of two components: a voice signal segmentation module that automatically locates note-to-note transitions, and a classifier that determines the type of musical articulation for each transition from a set of phonetic features. To validate our hypothesis, we ran an experiment in which subjects imitated real instrument recordings with their voice. The recordings consisted of short saxophone and violin phrases performed with three degrees of musical articulation, labeled staccato, normal, and legato. The results of a supervised classifier (user-dependent) are compared with those of a classifier based on heuristic rules (user-independent). Finally, building on these results, we show how to control articulation in a sample-concatenation synthesizer by selecting the most appropriate samples.
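As a rough illustration of the classification stage outlined above, the sketch below shows how note-to-note transitions, once segmented and described by a phonetic feature vector, could be assigned to one of the three articulation classes, together with a simple rule-based baseline. The feature layout, the random-forest choice, and the threshold values are our own assumptions for illustration, not the classifier or rules used in the paper.

```python
# Minimal sketch (not the paper's implementation): map phonetic feature
# vectors of note-to-note transitions to articulation classes.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

ARTICULATIONS = ["staccato", "normal", "legato"]


def train_articulation_classifier(features, labels):
    """User-dependent classifier.

    features: (n_transitions, n_phonetic_features) array extracted per transition
    labels:   index into ARTICULATIONS for each transition
    """
    clf = RandomForestClassifier(n_estimators=100, random_state=0)
    # Estimate per-user accuracy with cross-validation before the final fit.
    scores = cross_val_score(clf, features, labels, cv=5)
    clf.fit(features, labels)
    return clf, scores.mean()


def heuristic_articulation(transition_duration, voiced_ratio):
    """User-independent rule baseline (thresholds are made-up examples):
    short, largely unvoiced transitions suggest staccato; long, fully
    voiced transitions suggest legato; everything else is normal."""
    if transition_duration < 0.05 and voiced_ratio < 0.3:
        return "staccato"
    if voiced_ratio > 0.9:
        return "legato"
    return "normal"
```

In a synthesis front end, the predicted class would then drive sample selection in the concatenative synthesizer, e.g., by restricting the candidate database to samples annotated with the same articulation label.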
Cite this paper
Janer, J., Maestre, E. (2008). Mapping Phonetic Features for Voice-Driven Sound Synthesis. In: Filipe, J., Obaidat, M.S. (eds.) E-business and Telecommunications. ICETE 2007. Communications in Computer and Information Science, vol. 23. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-88653-2_23