Medium and Low Bit Rate Speech Transmission

  • Conference paper
Automatic Speech Analysis and Recognition

Part of the book series: NATO Advanced Study Institutes Series ((ASIC,volume 88))

Abstract

We describe five types of speech coding systems with transmission bit rates spanning the range from 16,000 bits/sec (b/s) down to 100 b/s: adaptive predictive coders at 16 kb/s, baseband coders at 9.6 kb/s, linear predictive coders at 1.5–2.4 kb/s, clustering vocoders at 600–800 b/s, and diphone-based phonetic vocoders at 100 b/s. For each type of coder, we describe the coder configuration, discuss the important coding issues, and present the results available to date.

Copyright information

© 1982 D. Reidel Publishing Company, Dordrecht, Holland

About this paper

Cite this paper

Viswanathan, V.R., Makhoul, J., Schwartz, R. (1982). Medium and Low Bit Rate Speech Transmission. In: Haton, JP. (eds) Automatic Speech Analysis and Recognition. NATO Advanced Study Institutes Series, vol 88. Springer, Dordrecht. https://doi.org/10.1007/978-94-009-7879-9_2

  • DOI: https://doi.org/10.1007/978-94-009-7879-9_2

  • Publisher Name: Springer, Dordrecht

  • Print ISBN: 978-94-009-7881-2

  • Online ISBN: 978-94-009-7879-9

  • eBook Packages: Springer Book Archive
