Short Text Classification of Buyer-Initiated Questions in Online Auctions: A Score Assigning Method

Li, Yichen; Srinivasan, Ananth; Tripathi, Arvind

doi:10.1007/978-3-319-64930-6_13

Part of the book series: Lecture Notes in Business Information Processing (LNBIP, volume 295)

Included in the following conference series:

International Conference on Business Informatics Research

Abstract

Classification of short text (SMS, reviews, feedback, etc.) presents a unique set of challenges compared to classic text classification. Short texts are characterized by cryptic constructions, poor spelling, and improper grammar, which make traditional methods difficult to apply. Proper classification enables us to use this information for further action. We study this problem in the context of online auctions. The paper presents a score-assigning approach that outperforms traditional methods (e.g., Naïve Bayes) in terms of accuracy.

Author information

Authors and Affiliations

ISOM Department, University of Auckland Business School, 12 Grafton Road, Private Bag 92019, Auckland, 1142, New Zealand
Yichen Li, Ananth Srinivasan & Arvind Tripathi

Corresponding author

Correspondence to Ananth Srinivasan.

Editor information

Editors and Affiliations

Lund University, Lund, Sweden
Björn Johansson
Aalborg University, Aalborg, Denmark
Charles Møller
Aalborg University, Copenhagen, Denmark
Atanu Chaudhuri
Aalborg University, Copenhagen, Denmark
Frantisek Sudzina

About this paper

Cite this paper

Li, Y., Srinivasan, A., Tripathi, A. (2017). Short Text Classification of Buyer-Initiated Questions in Online Auctions: A Score Assigning Method. In: Johansson, B., Møller, C., Chaudhuri, A., Sudzina, F. (eds) Perspectives in Business Informatics Research. BIR 2017. Lecture Notes in Business Information Processing, vol 295. Springer, Cham. https://doi.org/10.1007/978-3-319-64930-6_13

DOI: https://doi.org/10.1007/978-3-319-64930-6_13
Published: 05 August 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-64929-0
Online ISBN: 978-3-319-64930-6
eBook Packages: Business and Management, Business and Management (R0)
