Abstract
In applications where the human voice controls the synthesis of musical instrument sounds, phonetics convey musical information that may be related to the sound of the imitated instrument. Our initial hypothesis is that phonetics are user- and instrument-dependent, but remain constant for a given subject and instrument. We propose a user-adapted system in which mappings from voice features to synthesis parameters depend on how subjects sing musical articulations, i.e., note-to-note transitions. The system consists of two components: a voice signal segmentation module that automatically locates note-to-note transitions, and a classifier that determines the type of musical articulation for each transition from a set of phonetic features. To validate our hypothesis, we ran an experiment in which subjects imitated real instrument recordings with their voice. The recordings consisted of short saxophone and violin phrases performed with three degrees of musical articulation, labeled staccato, normal, and legato. The results of a supervised classifier (user-dependent) are compared with those of a classifier based on heuristic rules (user-independent). Finally, building on these results, we show how to control articulation in a sample-concatenation synthesizer by selecting the most appropriate samples.
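As a rough illustration of the classification stage outlined above, the sketch below shows how note-to-note transitions, once segmented and described by a phonetic feature vector, could be assigned to one of the three articulation classes, together with a simple rule-based baseline. The feature layout, the random-forest choice, and the threshold values are our own assumptions for illustration, not the classifier or rules used in the paper.

```python
# Minimal sketch (not the paper's implementation): map phonetic feature
# vectors of note-to-note transitions to articulation classes.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

ARTICULATIONS = ["staccato", "normal", "legato"]


def train_articulation_classifier(features, labels):
    """User-dependent classifier.

    features: (n_transitions, n_phonetic_features) array extracted per transition
    labels:   index into ARTICULATIONS for each transition
    """
    clf = RandomForestClassifier(n_estimators=100, random_state=0)
    # Estimate per-user accuracy with cross-validation before the final fit.
    scores = cross_val_score(clf, features, labels, cv=5)
    clf.fit(features, labels)
    return clf, scores.mean()


def heuristic_articulation(transition_duration, voiced_ratio):
    """User-independent rule baseline (thresholds are made-up examples):
    short, largely unvoiced transitions suggest staccato; long, fully
    voiced transitions suggest legato; everything else is normal."""
    if transition_duration < 0.05 and voiced_ratio < 0.3:
        return "staccato"
    if voiced_ratio > 0.9:
        return "legato"
    return "normal"
```

In a synthesis front end, the predicted class would then drive sample selection in the concatenative synthesizer, e.g., by restricting the candidate database to samples annotated with the same articulation label.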
Cite this paper
Janer, J., Maestre, E. (2008). Mapping Phonetic Features for Voice-Driven Sound Synthesis. In: Filipe, J., Obaidat, M.S. (eds.) E-business and Telecommunications. ICETE 2007. Communications in Computer and Information Science, vol. 23. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-88653-2_23