
Simplifying the Classification of App Reviews Using Only Lexical Features

  • Conference paper
Software Technologies (ICSOFT 2018)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1077)


Abstract

User reviews submitted to app marketplaces contain information that falls into different categories, e.g., feature evaluation, feature request, and bug report. This information helps developers improve the quality of mobile applications. However, because of the large volume of reviews received every day, manually classifying user reviews into these categories is not feasible, so automatic classification methods based on machine learning are desirable. In this study, we address the problem of automatically classifying app review sentences (as opposed to full reviews) into different categories. We compare the simplest textual machine learning classifier, which uses only lexical features (the so-called Bag-of-Words (BoW) approach), with the more complex models used in previous work, which rely on rich linguistic features. We find that the simple BoW model performs very competitively and has the advantage of not requiring any external linguistic tools to extract the features. Moreover, we experiment with deep-learning-based Convolutional Neural Network (CNN) models, which have recently achieved state-of-the-art results in many classification tasks. We find that, on average, the CNN models do not perform significantly better than the simple BoW model. Finally, a manual analysis of misclassification errors and data annotations suggests that a review sentence taken in isolation does not always contain enough information to make a correct prediction. We therefore suggest that adopting neural models that incorporate additional contextual knowledge might improve classification performance.
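To make the BoW idea concrete, the following is a minimal sketch of a sentence classifier that uses only word counts as features. It is not the authors' implementation: the abstract does not name the classifier or the preprocessing, so a multinomial Naive Bayes model stands in as an assumption, and the `tokenize` helper, the `BowNaiveBayes` class, and the toy training sentences are all hypothetical.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(sentence):
    """Lowercase a sentence and split it into word tokens (lexical features only)."""
    return re.findall(r"[a-z0-9']+", sentence.lower())


class BowNaiveBayes:
    """Multinomial Naive Bayes over bag-of-words counts with add-one smoothing.

    Assumption: a stand-in classifier chosen for brevity, not the model
    used in the paper.
    """

    def fit(self, sentences, labels):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.label_counts = Counter(labels)      # label -> number of sentences
        self.vocab = set()
        for sentence, label in zip(sentences, labels):
            tokens = tokenize(sentence)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, sentence):
        # Bag of words: word order is discarded, only counts are kept.
        # Words never seen in training are ignored.
        bow = Counter(t for t in tokenize(sentence) if t in self.vocab)
        total = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, n in self.label_counts.items():
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            score = math.log(n / total)  # log prior
            for word, count in bow.items():
                likelihood = (self.word_counts[label][word] + 1) / denom
                score += count * math.log(likelihood)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical toy data; the three labels are the categories named in the abstract.
TRAIN = [
    ("the app crashes every time i open it", "bug report"),
    ("login fails with an error after the update", "bug report"),
    ("please add a dark mode option", "feature request"),
    ("i wish you would add offline support", "feature request"),
    ("the new search works great", "feature evaluation"),
    ("i love the photo editor it works great", "feature evaluation"),
]

model = BowNaiveBayes().fit([s for s, _ in TRAIN], [y for _, y in TRAIN])
print(model.predict("it crashes after the update"))
print(model.predict("please add offline maps"))
```

The sketch needs no part-of-speech tagger, parser, or other external linguistic tool, which is the practical advantage the abstract attributes to the BoW approach.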


Notes

  1. https://guxd.github.io/srminer/appendix.html

  2. https://www.ling.upenn.edu/courses/Fall_2003/ling001/penn_treebank_pos.html

  3. http://www.nltk.org/

  4. https://stanfordnlp.github.io/CoreNLP/

  5. https://spacy.io/

  6. https://code.google.com/archive/p/word2vec/

  7. http://scikit-learn.org/stable/

  8. https://github.com/dennybritz/cnn-text-classification-tf

  9. https://www.tensorflow.org/

  10. There are no examples from the sentence type Feature Request because all sentences in our sample annotated with that type contained an aspect term.


Acknowledgments

We are grateful to Xiaodong Gu for sharing the review dataset for this study. This research was supported by the institutional research grant IUT20-55 of the Estonian Research Council and the Estonian Center of Excellence in ICT research (EXCITE).

Author information


Corresponding author

Correspondence to Faiz Ali Shah.


Copyright information

© 2019 Springer Nature Switzerland AG

About this paper


Cite this paper

Shah, F.A., Sirts, K., Pfahl, D. (2019). Simplifying the Classification of App Reviews Using Only Lexical Features. In: van Sinderen, M., Maciaszek, L. (eds) Software Technologies. ICSOFT 2018. Communications in Computer and Information Science, vol 1077. Springer, Cham. https://doi.org/10.1007/978-3-030-29157-0_8


  • DOI: https://doi.org/10.1007/978-3-030-29157-0_8


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-29156-3

  • Online ISBN: 978-3-030-29157-0

  • eBook Packages: Computer Science, Computer Science (R0)
