New Machine Scores and Their Combinations for Automatic Mandarin Phonetic Pronunciation Quality Assessment

  • Conference paper
Knowledge-Based Intelligent Information and Engineering Systems (KES 2007)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4692)

Abstract

This paper discusses Mandarin vowel pronunciation quality assessment. Phonetic pronunciation quality is traditionally evaluated within the speech recognition framework using the phonetic posterior probability score, which may be computed either by normalizing the frame-based posterior probability or directly on the phone segment. With the first method, we achieve a human-machine scoring correlation coefficient (CC) of 0.832 for vowels; with the second, the CC reaches up to 0.847. This paper proposes a novel kind of formant feature and applies it to vowel evaluation: we transform the formant plots on the time-frequency plane into a bitmap and extract its Gabor features for pattern classification; when the classification probability is used for pronunciation assessment, we obtain a CC of 0.842. Finally, we combine the three scores with various linear and nonlinear methods; the best CC, 0.913, is obtained with a neural network.
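To make the evaluation pipeline in the abstract concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes the frame-based score is the average log posterior over a phone segment (a common formulation in the pronunciation scoring literature; the abstract does not give the exact normalization), that the CC is the Pearson correlation coefficient, and that the combination is a plain weighted sum. The paper's best result used a neural network, which is not shown here. All function names, weights, and numbers are hypothetical.

```python
import math
from statistics import mean


def phone_score_from_frames(frame_posteriors):
    """Duration-normalized score: mean of log P(phone | frame) over a segment.

    Assumed formulation; the abstract only says the frame-based posterior
    probability is normalized.
    """
    if not frame_posteriors:
        raise ValueError("segment has no frames")
    return sum(math.log(p) for p in frame_posteriors) / len(frame_posteriors)


def pearson_cc(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 items")
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def combine_linear(score_sets, weights, bias=0.0):
    """Weighted sum of several machine scores, computed per phone.

    score_sets holds one list per score type, all of the same length.
    The weights are hypothetical; in practice they would be fitted to
    human ratings on held-out data.
    """
    return [
        bias + sum(w * s for w, s in zip(weights, scores))
        for scores in zip(*score_sets)
    ]


if __name__ == "__main__":
    # Made-up numbers for illustration only; they are not data from the paper.
    human = [4.0, 2.5, 3.5, 1.0, 5.0]
    frame_based = [phone_score_from_frames(f) for f in (
        [0.9, 0.8, 0.85], [0.5, 0.4], [0.7, 0.8], [0.2, 0.1, 0.3], [0.95, 0.9],
    )]
    segment_based = [-0.2, -0.8, -0.4, -1.9, -0.1]
    gabor_prob = [0.8, 0.5, 0.6, 0.2, 0.9]
    combined = combine_linear(
        [frame_based, segment_based, gabor_prob], weights=[1.0, 1.0, 2.0]
    )
    print("CC (frame-based):", pearson_cc(human, frame_based))
    print("CC (combined):   ", pearson_cc(human, combined))
```

A linear combination is the simplest of the combination methods the abstract mentions. A nonlinear combiner such as a neural network would replace `combine_linear` while keeping the same CC evaluation.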

Editor information

Bruno Apolloni, Robert J. Howlett, Lakhmi Jain

Copyright information

© 2007 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Pan, F., Zhao, Q., Yan, Y. (2007). New Machine Scores and Their Combinations for Automatic Mandarin Phonetic Pronunciation Quality Assessment. In: Apolloni, B., Howlett, R.J., Jain, L. (eds) Knowledge-Based Intelligent Information and Engineering Systems. KES 2007. Lecture Notes in Computer Science, vol 4692. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74819-9_101

  • DOI: https://doi.org/10.1007/978-3-540-74819-9_101

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-74817-5

  • Online ISBN: 978-3-540-74819-9

  • eBook Packages: Computer Science, Computer Science (R0)
