Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5459)

Abstract

Accumulating spoken documents on the web is of great value in a knowledge society. However, because spontaneous speech is highly redundant, transcribed text is hard to read in a web browser as it stands and is therefore unsuitable as a web document. This paper proposes a technique for converting spoken documents into web documents, with the aim of building a speech archiving system. The technique edits automatically transcribed texts to improve their readability in the browser. Readable text is generated by applying techniques such as paraphrasing, segmentation, and structuring to the transcribed texts. An editing experiment using lecture data showed the feasibility of the technique, and a prototype spoken document archiving system was implemented to confirm its effectiveness.
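The three editing steps named in the abstract (paraphrasing, segmentation, structuring) can be illustrated with a toy pipeline. The sketch below is not the authors' method: the abstract gives no rules or algorithms, so the filler list, the punctuation-based sentence splitting, the paragraph markup, and all function names are assumptions chosen only to show the shape of such a pipeline on English text.

```python
import html
import re

# Hypothetical filler list; the paper's actual paraphrasing rules are
# not given in the abstract.
FILLERS = ("um", "uh", "er", "you know")


def remove_fillers(text):
    """Delete filler words (a crude stand-in for paraphrasing)."""
    pattern = r"\b(?:" + "|".join(re.escape(f) for f in FILLERS) + r")\b,?\s*"
    cleaned = re.sub(pattern, "", text, flags=re.IGNORECASE)
    # Collapse any leftover runs of whitespace.
    return re.sub(r"\s+", " ", cleaned).strip()


def segment(text):
    """Split text into sentences after terminal punctuation (assumed rule)."""
    parts = re.split(r"(?<=[.!?])\s+", text)
    return [p for p in parts if p]


def to_html(sentences):
    """Structure the output by wrapping each sentence in a <p> element."""
    return "\n".join(f"<p>{html.escape(s)}</p>" for s in sentences)


def edit_transcript(text):
    """Run the three assumed steps in order: paraphrase, segment, structure."""
    return to_html(segment(remove_fillers(text)))


if __name__ == "__main__":
    raw = "Um today we uh discuss speech archives. Transcripts are you know redundant."
    print(edit_transcript(raw))
```

Keeping each step as a separate function means any one of them could be swapped for a stronger component (for example, a statistical paraphraser) without touching the others.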

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Ito, M., Ohno, T., Matsubara, S. (2009). Text Editing for Lecture Speech Archiving on the Web. In: Li, W., Mollá-Aliod, D. (eds) Computer Processing of Oriental Languages. Language Technology for the Knowledge-based Economy. ICCPOL 2009. Lecture Notes in Computer Science, vol 5459. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-00831-3_36

  • DOI: https://doi.org/10.1007/978-3-642-00831-3_36

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-00830-6

  • Online ISBN: 978-3-642-00831-3

  • eBook Packages: Computer Science, Computer Science (R0)
