Accessing the spoken word

  • Regular contribution
  • International Journal on Digital Libraries

Abstract

Spoken-word audio collections cover many domains, including radio and television broadcasts, oral narratives, governmental proceedings, lectures, and telephone conversations. The collection, access, and preservation of such data are stimulated by political, economic, cultural, and educational needs. This paper outlines the major issues in the field, reviews the current state of technology, examines the rapidly changing policy issues relating to privacy and copyright, and presents issues relating to the collection and preservation of spoken audio content.

Author information

Corresponding author

Correspondence to Jerry Goldman.

About this article

Cite this article

Goldman, J., Renals, S., Bird, S. et al. Accessing the spoken word. Int J Digit Libr 5, 287–298 (2005). https://doi.org/10.1007/s00799-004-0101-0

  • DOI: https://doi.org/10.1007/s00799-004-0101-0