Learning to Summarize Time Series Data

Sowdaboina, Pranay Kumar Venkata; Chakraborti, Sutanu; Sripada, Somayajulu

doi:10.1007/978-3-642-54906-9_42

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 8403)

Included in the following conference series:

International Conference on Intelligent Text Processing and Computational Linguistics

Abstract

In this paper we focus on content selection for summarizing time series data using machine learning techniques. The goal is to exploit a parallel corpus to predict the level of abstraction appropriate for a summarization task. This is an important step towards building an automated Natural Language Generation (NLG) system that generates text for unseen data. Machine learning approaches are used to induce the underlying summarization rules, which are potentially close to the ones humans use when writing textual summaries. We present an approach for selecting important points in a time series that can aid in generating captions or textual summaries. We evaluate our techniques on a parallel corpus of human-generated weather forecast texts paired with the corresponding numerical weather prediction data.
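
As a rough illustration of what selecting important points in a time series can look like, the sketch below greedily keeps the points that deviate most from a straight-line interpolation between the points already kept. This is not the learned method described in the paper; it is a simple hand-written heuristic, and the function name `select_important_points` and the parameter `k` (a stand-in for the level of abstraction, which the paper proposes to predict from a parallel corpus) are assumptions made for this example.

```python
from typing import List, Sequence


def select_important_points(values: Sequence[float], k: int) -> List[int]:
    """Return sorted indices of at most k points that best preserve the shape.

    Illustrative heuristic only (an assumption, not the paper's approach):
    start with the two endpoints, then repeatedly add the point whose value
    is furthest from the linear interpolation between its kept neighbours.
    """
    n = len(values)
    if n == 0:
        return []
    if n <= 2 or k <= 2:
        return sorted({0, n - 1})

    selected = [0, n - 1]
    while len(selected) < min(k, n):
        best_idx, best_err = None, 0.0
        # Scan every gap between consecutive kept points.
        for left, right in zip(selected, selected[1:]):
            for i in range(left + 1, right):
                t = (i - left) / (right - left)
                interp = values[left] + t * (values[right] - values[left])
                err = abs(values[i] - interp)
                if err > best_err:
                    best_idx, best_err = i, err
        if best_idx is None:
            # Every remaining point already lies on the interpolated lines.
            break
        selected.append(best_idx)
        selected.sort()
    return selected


if __name__ == "__main__":
    # Hypothetical wind speeds at successive forecast times.
    wind_speed = [10, 10, 10, 20, 20, 20, 10, 10]
    print(select_important_points(wind_speed, k=4))
```

A smaller `k` gives a coarser summary and a larger `k` a more detailed one; in a learned setting, that choice would be predicted from the data rather than fixed by hand.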

Author information

Authors and Affiliations

Department of Computer Science, Indian Institute of Technology Madras, India
Pranay Kumar Venkata Sowdaboina & Sutanu Chakraborti
Computing Science, University of Aberdeen, UK
Somayajulu Sripada

Editor information

Editors and Affiliations

Center for Computing Research, National Polytechnic Institute, Av. Juan Dios Bátiz, Col. Nueva Industrial Vallejo, 07738, Mexico D.F., Mexico
Alexander Gelbukh

About this paper

Cite this paper

Sowdaboina, P.K.V., Chakraborti, S., Sripada, S. (2014). Learning to Summarize Time Series Data. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2014. Lecture Notes in Computer Science, vol 8403. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-54906-9_42

DOI: https://doi.org/10.1007/978-3-642-54906-9_42
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-54905-2
Online ISBN: 978-3-642-54906-9
eBook Packages: Computer Science, Computer Science (R0)