Segmenting Conversations by Topic, Initiative, and Style

Ries, Klaus

doi:10.1007/3-540-45637-6_5

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2273)

Included in the following conference series:

Workshop on Information Retrieval Techniques for Speech Applications

Abstract

Topical segmentation is a basic tool for information access to audio records of meetings and other types of speech documents which may be fairly long and contain multiple topics. Standard segmentation algorithms are typically based on keywords, pitch contours or pauses. This work demonstrates that speaker initiative and style may be used as segmentation criteria as well. A probabilistic segmentation procedure is presented which allows the integration and modeling of these features in a clean framework with good results.

Keyword-based segmentation methods degrade significantly on our meeting database when speech recognizer transcripts are used instead of manual transcripts. Speaker initiative is an interesting feature since it delivers good segmentations and should be easy to obtain from the audio. Speech style variation at the beginning, middle and end of topics may also be exploited for topical segmentation and would not require the detection of rare keywords.
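The abstract does not spell out the probabilistic procedure, so the following is only a minimal sketch of the general idea, not the method of the chapter. It assumes that each utterance is reduced to the label of the speaker holding the floor (a crude, hypothetical stand-in for speaker initiative), models each segment with a smoothed multinomial over speakers, and picks boundaries by dynamic programming. The function names and the `penalty` and `alpha` parameters are illustrative assumptions.

```python
import math
from collections import Counter


def segment_cost(counts, total, n_labels, alpha=0.5):
    """Negative log-likelihood of one segment under a smoothed multinomial."""
    denom = total + alpha * n_labels
    return -sum(c * math.log((c + alpha) / denom) for c in counts.values())


def segment(labels, penalty=3.0, alpha=0.5):
    """Return the sorted start indices of every segment after the first.

    labels  : one speaker label per utterance (assumed input representation)
    penalty : fixed cost per segment, discouraging over-segmentation
    """
    n = len(labels)
    n_labels = len(set(labels))
    best = [0.0] + [math.inf] * n   # best[i]: minimal cost of labels[:i]
    back = [0] * (n + 1)            # back[i]: start of the last segment in that solution
    for end in range(1, n + 1):
        counts = Counter()
        # Grow the candidate last segment backwards from `end`.
        for start in range(end - 1, -1, -1):
            counts[labels[start]] += 1
            cost = (best[start]
                    + segment_cost(counts, end - start, n_labels, alpha)
                    + penalty)
            if cost < best[end]:
                best[end] = cost
                back[end] = start
    # Follow the back-pointers to recover the boundaries.
    boundaries = []
    pos = n
    while pos > 0:
        pos = back[pos]
        if pos > 0:
            boundaries.append(pos)
    return sorted(boundaries)


if __name__ == "__main__":
    # Toy meeting: speaker A holds the floor, then speaker B takes over.
    turns = ["A"] * 10 + ["B"] * 10
    print(segment(turns))
```

In this toy setting a boundary is proposed where the dominant speaker changes. Other features, such as keyword or style indicators, could in principle be added as further terms in the per-segment cost, though how the chapter actually combines them is not stated in the abstract.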

I would like to thank my advisor Alex Waibel for supporting and encouraging this work and my colleagues for various discursive and practical contributions, especially Hua Yu and Klaus Zechner. The reviewers provided valuable comments for the final paper presentation. I would also like to thank our sponsors at DARPA. Any opinions, findings and conclusions expressed in this material are those of the authors and do not necessarily reflect the views of DARPA or any other party.

Author information

Authors and Affiliations

Interactive Systems Labs, Carnegie Mellon University, 15213, Pittsburgh, PA, USA
Klaus Ries
Fakultät für Informatik, Universität Karlsruhe, 76128, Karlsruhe, Germany
Klaus Ries

Authors

Klaus Ries

Editor information

Editors and Affiliations

IBM T.J. Watson Research Center, P.O.Box 704, 10598, Yorktown Heights, NY, USA
Anni R. Coden & Eric W. Brown
IBM Almaden Research Center, 650 Harry Road, 95120, San Jose, CA, USA
Savitha Srinivasan

About this paper

Cite this paper

Ries, K. (2002). Segmenting Conversations by Topic, Initiative, and Style. In: Coden, A.R., Brown, E.W., Srinivasan, S. (eds) Information Retrieval Techniques for Speech Applications. IRTSA 2001. Lecture Notes in Computer Science, vol 2273. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45637-6_5

DOI: https://doi.org/10.1007/3-540-45637-6_5
Published: 22 January 2002
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-43156-5
Online ISBN: 978-3-540-45637-7
eBook Packages: Springer Book Archive
