Skip to main content
Log in

What do developers search for on the web?

  • Published:
Empirical Software Engineering Aims and scope Submit manuscript

Abstract

Developers commonly make use of a web search engine such as Google to locate online resources to improve their productivity. A better understanding of what developers search for could help us understand their behaviors and the problems that they meet during the software development process. Unfortunately, we have a limited understanding of what developers frequently search for and of the search tasks that they often find challenging. To address this gap, we collected search queries from 60 developers, surveyed 235 software engineers from more than 21 countries across five continents. In particular, we asked our survey participants to rate the frequency and difficulty of 34 search tasks which are grouped along the following seven dimensions: general search, debugging and bug fixing, programming, third party code reuse, tools, database, and testing. We find that searching for explanations for unknown terminologies, explanations for exceptions/error messages (e.g., HTTP 404), reusable code snippets, solutions to common programming bugs, and suitable third-party libraries/services are the most frequent search tasks that developers perform, while searching for solutions to performance bugs, solutions to multi-threading bugs, public datasets to test newly developed algorithms or systems, reusable code snippets, best industrial practices, database optimization solutions, solutions to security bugs, and solutions to software configuration bugs are the most difficult search tasks that developers consider. Our study sheds light as to why practitioners often perform some of these tasks and why they find some of them to be challenging. We also discuss the implications of our findings to future research in several research areas, e.g., code search engines, domain-specific search engines, and automated generation and refinement of search queries.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Fig. 1
Fig. 2

Similar content being viewed by others

Notes

  1. http://www.google.com

  2. http://www.bing.com

  3. http://www.baidu.com

  4. Notice although Google is blocked in China, developers can use Google by using agent service such as Shadowsocks. See https://shadowsocks.com/ for more details.

  5. CSDN is one of the largest technical blog site in China, see http://www.csdn.net/ for more details.

  6. https://www.quora.com/.

  7. Zhihu is one of the largest Q&A site in China, see http://www.zhihu.com/ for more details.

  8. http://www.insigmaservice.com/.

  9. http://www.hengtiansoft.com/.

  10. Cloudera is a software company that provides Apache Hadoop-based software, support and services, and training to business customers (https://www.cloudera.com/).

  11. We identified another 6 search tasks in the open-ended interviews.

  12. https://www.wikipedia.org/

  13. https://baike.baidu.com/

  14. Wensong Zhang is the co-founder of the project Linux Virtual Server, see https://en.wikipedia.org/wiki/Linux_Virtual_Server for more details.

  15. D An expletive was masked out.

  16. https://en.wikipedia.org/wiki/List_of_languages_by_total_number_of_speakers

References

  • Krugle (2014) http://opensearch.krugle.org/projects/

  • Koders (2016) http://www.koders.com

  • Bajracharya S, Ngo T, Linstead E, Dou Y, Rigor P, Baldi P, Lopes C (2006) Sourcerer: a search engine for open source code supporting structure-based search Proceedings of the 21st ACM SIGPLAN symposium on object-oriented programming systems, languages, and applications, ACM, pp 681–682

  • Bajracharya SK, Lopes CV (2009) Mining search topics from a code search engine usage log Proceedings of the 6th international working conference on mining software repositories (MSR), IEEE

  • Bajracharya SK, Lopes CV (2012) Analyzing and mining a code search engine usage log. Empir Softw Eng 17(4-5):424–466

    Article  Google Scholar 

  • Bao L, Xing Z, Wang X, Zhou B (2015a) Tracking and analyzing cross-cutting activities in developers’ daily work Proceedings of the 30th IEEE/ACM international conference on automated software engineering (ASE), pp 277–282

  • Bao L, Ye D, Xing Z, Xia X, Wang X (2015b) Activityspace: a remembrance framework to support interapplication information needs Proceedings of the 30th IEEE/ACM international conference on automated software engineering (ASE), IEEE, pp 864–869

  • Blei DM, Ng AY, Jordan MI (2003) Latent dirichlet allocation. J Mach Learn Res 3:993–1022

    MATH  Google Scholar 

  • Brandt J, Guo PJ, Lewenstein J, Dontcheva M, Klemmer SR (2009) Two studies of opportunistic programming: interleaving web foraging, learning, and writing code Proceedings of the SIGCHI conference on human factors in computing systems, ACM, pp 1589–1598

  • Broder A (2002) A taxonomy of web search ACM SIGIR Forum, ACM, vol 36, pp 3–10

  • Cutrell E, Guan Z (2007) What are you looking for?: an eye-tracking study of information usage in web search Proceedings of the SIGCHI conference on human factors in computing systems, ACM, pp 407–416

  • Haiduc S, Bavota G, Marcus A, Oliveto R, Lucia AD, Menzies T (2013) Automatic query reformulations for text retrieval in software engineering Proceedings of the 35th international conference on software engineering (ICSE), pp 842–851

  • Jansen BJ, Spink A, Saracevic T (2000) Real life, real users, and real needs: a study and analysis of user queries on the web. Inf Process Manag 36(2):207–227

    Article  Google Scholar 

  • Ko AJ, Myers BA, Coblenz MJ, Aung HH (2006) An exploratory study of how developers seek, relate, and collect relevant information during software maintenance tasks. IEEE Trans Softw Eng (TSE) 32(12):971–987

    Article  Google Scholar 

  • Lee U, Liu Z, Cho J (2005) Automatic identification of user goals in web search Proceedings of the 14th international conference on world wide web (WWW), ACM, pp 391–400

  • Lemos OAL, Bajracharya SK, Ossher J, Morla RS, Masiero PC, Baldi P, Lopes CV (2007) Codegenie: using test-cases to search and reuse source code Proceedings of the 22nd IEEE/ACM international conference on automated software engineering (ASE), ACM, pp 525–526

  • Li H, Xing Z, Peng X, Zhao W (2013) What help do developers seek, when and how? Proceedings of the 20th working conference on reverse engineering (WCRE), IEEE, pp 142–151

  • Linstead E, Bajracharya S, Ngo T, Rigor P, Lopes C, Baldi P (2009) Sourcerer: mining and searching internet-scale software repositories. Data Min Knowl Disc 18(2):300–336

    Article  MathSciNet  Google Scholar 

  • Ponzanelli L, Bacchelli A, Lanza M (2013) Seahawk: Stack overflow in the ide Proceedings of the 2013 international conference on software engineering, IEEE Press, pp 1295–1298

  • Rahman MM, Yeasmin S, Roy CK (2014) Towards a context-aware ide-based meta search engine for recommendation about programming errors and exceptions Software evolution week-IEEE conference on software maintenance, reengineering and reverse engineering (CSMR-WCRE), 2014, IEEE, pp 194–203

  • Rose DE, Levinson D (2004) Understanding user goals in web search Proceedings of the 13th international conference on world wide web (WWW), ACM, pp 13–19

  • Sadowski C, Stolee KT, Elbaum S (2015) How developers search for code: a case study Proceedings of the 10th joint meeting on foundations of software engineering (FSE), ACM, pp 191–201

  • Scott AJ, Knott M (1974) A cluster analysis method for grouping means in the analysis of variance. Biometrics 30(3):507–512

  • Sillito J, Murphy GC, De Volder K (2006) Questions programmers ask during software evolution tasks Proceedings of the 14th ACM SIGSOFT international symposium on foundations of software engineering, ACM, pp 23–34

  • Silverstein C, Marais H, Henzinger M, Moricz M (1999) Analysis of a very large web search engine query log ACM SIGIR Forum, ACM, vol 33, pp 6–12

  • Sim SE, Clarke CL, Holt RC (1998) Archetypal source code searches: a survey of software developers and maintainers Proceedings of the 6th international workshop on program comprehension (IWPC), IEEE, pp 180–187

  • Sim SE, Umarji M, Ratanotayanon S, Lopes CV (2011) How well do search engines support code retrieval on the web? ACM Trans Softw Eng Methodol (TOSEM) 21(1):4

    Article  Google Scholar 

  • Sim SE, Philip K, Umarji M, Agarwala M, Gallardo-Valencia R, Lopes CV, Ratanotayanon S (2012) Software reuse through methodical component reuse and amethodical snippet remixing Proceedings of the ACM 2012 conference on computer supported cooperative work, ACM, pp 1361–1370

  • Spink A, Jansen BJ, Wolfram D, Saracevic T (2002) From e-sex to e-commerce: Web search changes. Computer 35(3):107–109

    Article  Google Scholar 

  • Stolee KT, Elbaum S, Dobos D (2014) Solving the search for source code. ACM Trans Softw Eng Methodol (TOSEM) 23(3):26

    Article  Google Scholar 

  • Tantithamthavorn C, McIntosh S, Hassan AE, Matsumoto K (2017) An empirical comparison of model validation techniques for defect prediction model. IEEE Trans Softw Eng (TSE) 43(1):1–18

  • Treude C, Barzilay O, Storey MA (2011) How do programmers ask and answer questions on the web?: Nier track Proceedings of the 33rd international conference on software engineering (ICSE), IEEE, pp 804–807

  • Wuensch KL (2005) What is a likert scale? and how do you pronounce’likert?’. East Carolina University

Download references

Acknowledgments

The authors thank to all the developers who participated in this study. This research is supported by NSFC Program (No.61602403) and National Key Technology R&D Program of the Ministry of Science and Technology of China under grant 2015BAH17F01.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Lingfeng Bao.

Additional information

Communicated by: Emerson Murphy-Hill

Rights and permissions

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Xia, X., Bao, L., Lo, D. et al. What do developers search for on the web?. Empir Software Eng 22, 3149–3185 (2017). https://doi.org/10.1007/s10664-017-9514-4

Download citation

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s10664-017-9514-4

Keywords

Navigation