
Which Apps Have Privacy Policies?

An Analysis of Over One Million Google Play Store Apps

  • Conference paper
Privacy Technologies and Policy (APF 2018)

Abstract

Smartphone app privacy policies are intended to describe smartphone apps’ data collection and use practices. However, not all apps have privacy policies. Without prominent privacy policies, it becomes more difficult for users, regulators, and privacy organizations to evaluate apps’ privacy practices. We answer the question “Which apps have privacy policies?” by analyzing the metadata of over one million apps from the Google Play Store. Only about half of the apps we examined link to a policy from their Play Store pages. First, we conducted an exploratory data analysis of the relationship between app metadata features and whether apps link to privacy policies. Next, we trained a logistic regression model to predict the probability that individual apps will have policy links. Finally, by comparing three crawls of the Play Store, we observed an overall increase in the percentage of apps with links between September 2017 and May 2018 (from 41.7% to 51.8%).

This study was supported in part by the NSF Frontier grant on Usable Privacy Policies (CNS-1330596 and CNS-1330141) and a DARPA Brandeis grant on Personalized Privacy Assistants (FA8750-15-2-0277). The US Government is authorized to reproduce and distribute reprints for Governmental purposes notwithstanding any copyright notation. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of the NSF, DARPA, or the US Government.


Notes

  1. Cal. Bus. & Prof. Code §22575(a).

  2. Del. Code Tit. 6 §1205C(a).

  3. 16 CFR §312.4(d).

  4. We were unable to consider the Top Developer badge in our analysis because it is no longer displayed on the Play Store [13, 26].

  5. https://www.usableprivacy.org, accessed: May 20, 2018.

  6. Our virtual server had four Intel Xeon E5-2640 CPU cores at 2.50 GHz and 8 GB of RAM.

  7. https://github.com/openvenues/pypostal, accessed: May 20, 2018.

  8. Consequently, the relative frequencies shown in Fig. 1 should be interpreted cautiously. For example, it would not be safe to assume that there are more developers from India than from the US as developers from India may possibly include the country in their address more frequently than developers from the US.

  9. https://bitbucket.org/flyingcircus/pycountry, accessed: May 20, 2018.

  10. By the time of our Third Crawl, the Play Store had changed the display of the install ranges and started showing only their smallest values, e.g., 1+ installs, 5+ installs, etc.

  11. Note that the full Warning descriptor reads “Warning - content has not yet been rated. Unrated apps may potentially contain content appropriate for mature audiences only.”

  12. Inherently, scaling has the disadvantage that the intercept cannot easily be interpreted because it is the y-intercept of the scaled variables.

  13. We can ignore rating_count^2 and rating_count^3 because they were eliminated from the model.

References

  1. Almuhimedi, H., Schaub, F., Sadeh, N., Adjerid, I.: Your location has been shared 5,398 times!: A field study on mobile app privacy nudging. In: Proceedings of the 33rd Annual ACM Conference on Human Factors in Computing Systems (2015). https://doi.org/10.1145/2702123.2702210, http://dl.acm.org/citation.cfm?id=2702210

  2. Balebako, R., Marsh, A., Lin, J., Hong, J.I., Cranor, L.F.: The privacy and security behaviors of smartphone app developers. In: Workshop on Usable Security (2014). http://repository.cmu.edu/hcii/265/

  3. Blenner, S.R., Kollmer, M., Rouse, A.J., Daneshvar, N., Williams, C., Andrews, L.B.: Privacy policies of android diabetes apps and sharing of health information. JAMA 315(10), 1051–1052 (2016). https://doi.org/10.1001/jama.2015.19426. http://jama.jamanetwork.com/article.aspx?doi=10.1001/jama.2015.19426

  4. Bouchard, B., Suzuki, K.: Find great apps and games on Google Play with the editors’ choice update, July 2017. https://www.blog.google/products/google-play/find-great-apps-and-games-google-play-editors-choice-update/. Accessed 20 May 2018

  5. Clark, B.: Millions of apps could soon be purged from Google Play Store, February 2017. https://thenextweb.com/google/2017/02/08/millions-apps-soon-purged-google-play-store/. Accessed 20 May 2018

  6. scikit-learn developers: sklearn.linear_model.LogisticRegression. http://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html. Accessed 20 May 2018

  7. scikit-learn developers: sklearn.linear_model.SGDClassifier. http://scikit-learn.org/stable/modules/generated/sklearn.linear_model.SGDClassifier.html. Accessed 20 May 2018

  8. scikit-learn developers: sklearn.preprocessing.StandardScaler. http://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.StandardScaler.html. Accessed 20 May 2018

  9. scikit-learn developers: Stochastic gradient descent: Tips on practical use. http://scikit-learn.org/stable/modules/sgd.html#tips-on-practical-use. Accessed 20 May 2018

  10. scikit-learn developers: Choosing the right estimator (2017). http://scikit-learn.org/stable/tutorial/machine_learning_map/index.html. Accessed 20 May 2018

  11. d’Heureuse, N., Huici, F., Arumaithurai, M., Ahmed, M., Papagiannaki, K., Niccolini, S.: What’s app?: A wide-scale measurement study of smart phone markets. dl.acm.org. https://dl.acm.org/citation.cfm?id=2396759

  12. Entertainment Software Rating Board (ESRB): ESRB ratings guide (2015). https://www.esrb.org/ratings/ratings_guide.aspx. Accessed 20 May 2018

  13. Fahey, K.: Recognizing android excellence on Google Play, June 2017. https://android-developers.googleblog.com/2017/06/recognizing-android-excellence-on.html. Accessed 20 May 2018

  14. Federal Trade Commission: Mobile privacy disclosures, February 2013. https://www.ftc.gov/os/2013/02/130201mobileprivacyreport.pdf. Accessed 20 May 2018

  15. Federal Trade Commission: Children’s Online Privacy Protection Rule (“COPPA”), August 2015. https://www.ftc.gov/enforcement/rules/rulemaking-regulatory-reform-proceedings/childrens-online-privacy-protection-rule. Accessed 20 May 2018

  16. FTC: Privacy online: a report to congress, June 1998. https://www.ftc.gov/reports/privacy-online-report-congress. Accessed 20 May 2018

  17. Google: Designed for families. https://developer.android.com/distribute/google-play/families.html. Accessed 20 May 2018

  18. Google: Content ratings for apps & games. https://support.google.com/googleplay/android-developer/answer/188189?hl=en (2017). Accessed 20 May 2018

  19. Google: Ratings questionnaire help (2017). https://support.google.com/googleplay/android-developer/topic/6169305?hl=en&ref_topic=6159951. Accessed 20 May 2018

  20. Google: Requesting permissions (2017). https://developer.android.com/guide/topics/permissions/requesting.html. Accessed 20 May 2018

  21. California Department of Justice: Attorney General Kamala D. Harris secures global agreement to strengthen privacy protections for users of mobile applications, February 2012. http://www.oag.ca.gov/news/press-releases/attorney-general-kamala-d-harris-secures-global-agreement-strengthen-privacy. Accessed 20 May 2018

  22. California Department of Justice: Making your privacy practices public, May 2014. https://oag.ca.gov/sites/all/files/agweb/pdfs/cybersecurity/making_your_privacy_practices_public.pdf. Accessed 20 May 2018

  23. Kelley, P.G., Cranor, L.F., Sadeh, N.: Privacy as part of the app decision-making process. In: CHI, p. 3393 (2013). https://doi.org/10.1145/2470654.2466466, http://dl.acm.org/citation.cfm?doid=2470654.2466466

  24. Lin, J., Liu, B., Sadeh, N., Hong, J.I.: Modeling users’ mobile app privacy preferences - restoring usability in a sea of permission settings. In: Proceedings of the Twelfth Symposium on Usable Privacy and Security (2014). http://dblp.org/rec/conf/soups/LinLSH14

  25. Lin, J., Sadeh, N., Amini, S., Lindqvist, J., Hong, J.I., Zhang, J.: Expectation and purpose - understanding users’ mental models of mobile app privacy through crowdsourcing. In: UbiComp, p. 501 (2012). https://doi.org/10.1145/2370216.2370290, http://dl.acm.org/citation.cfm?doid=2370216.2370290

  26. Palmer, J.: After several years of service, the Google Play Top Developer Program is being put to rest, May 2017. http://www.androidpolice.com/2017/05/05/several-years-service-google-play-top-developer-program-put-rest/. Accessed 20 May 2018

  27. Pedregosa, F., et al.: scikit-learn: machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830 (2011)

  28. Sadeh, N., et al.: The usable privacy policy project: combining crowdsourcing, machine learning and natural language processing to semi-automatically answer those privacy questions users care about. Carnegie Mellon University Technical Report CMU-ISR-13-119, pp. 1–24, December 2013. http://reports-archive.adm.cs.cmu.edu/anon/isr2013/CMU-ISR-13-119.pdf

  29. Statista: Number of apps available in leading app stores as of March 2017 (2017). https://www.statista.com/statistics/276623/number-of-apps-available-in-leading-app-stores/. Accessed 20 May 2018

  30. Sunyaev, A., Dehling, T., Taylor, P.L., Mandl, K.D.: Availability and quality of mobile health app privacy policies. J. Am. Med. Inform. Assoc. 22, e28–e33 (2014). https://doi.org/10.1136/amiajnl-2013-002605. https://academic.oup.com/jamia/article-lookup/doi/10.1136/amiajnl-2013-002605

  31. Viennot, N., Garcia, E., Nieh, J.: A measurement study of Google Play. In: The 2014 ACM International Conference, pp. 221–233. ACM Press, New York City (2014). https://doi.org/10.1145/2591971.2592003, http://dl.acm.org/citation.cfm?doid=2591971.2592003

  32. Wang, H., et al.: An explorative study of the mobile app ecosystem from app developers’ perspective. In: The 26th International Conference, pp. 163–172. ACM Press, New York City (2017). https://doi.org/10.1145/3038912.3052712, http://dl.acm.org/citation.cfm?doid=3038912.3052712

  33. Zimmeck, S., et al.: Automated analysis of privacy requirements for mobile apps. In: 24th Network & Distributed System Security Symposium (NDSS 2017). NDSS 2017, San Diego, CA. Internet Society, February 2017



Corresponding author

Correspondence to Peter Story.


A Appendix


A.1 Odds Calculations

Here we provide additional details about how the baseline app’s odds were calculated, and how to interpret the model’s quantitative variables.

Logistic regression models operate directly in terms of log(odds). For interpretability, log(odds) are easily converted to odds:

$$ e^{\text{log(odds)}} = {\text{odds}} $$
(2)

Under the definition of the baseline app in Sect. 4, the log(odds) of the baseline app having a privacy policy can be calculated by substituting the coefficients of Table 2 into the following equation:

$$ \log(\text{odds}(\text{policy} = \text{True})) = b_{0} + b_{\text{date\_published\_relative}} \cdot \text{date\_published\_relative\_scaled} + \ldots $$
(3)

where b_0 is the intercept, b_date_published_relative is a feature coefficient, and date_published_relative_scaled is a scaled feature value. Note that the full equation would include terms for all of the coefficients in Table 2. From this equation, we calculate log(odds) = −0.887 and odds = e^(−0.887) = 0.412 for the baseline app having a privacy policy.
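The conversion in Eq. 2 can be checked numerically. The following is a minimal Python sketch, not the authors' code: the function names are hypothetical, and the only value taken from the text is the baseline log(odds) of −0.887. The conversion from odds to probability (odds / (1 + odds)) is the standard identity and is included only for intuition.

```python
import math


def log_odds_to_odds(log_odds: float) -> float:
    """Apply Eq. 2: odds = e^(log(odds))."""
    return math.exp(log_odds)


def odds_to_probability(odds: float) -> float:
    """Standard identity: p = odds / (1 + odds)."""
    return odds / (1.0 + odds)


# Baseline log(odds) as reported in the text (not recomputed from Table 2).
baseline_log_odds = -0.887

baseline_odds = log_odds_to_odds(baseline_log_odds)
baseline_probability = odds_to_probability(baseline_odds)

print(f"odds        = {baseline_odds:.3f}")
print(f"probability = {baseline_probability:.3f}")
```

Under these assumptions, odds below 1 correspond to a probability below 0.5, so the baseline app is more likely than not to lack a policy link.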

Next, we give an example of changing a quantitative variable. Suppose we start with the baseline app, which has no ratings, and increase the rating_count to 1,000,000. First, we scale rating_count (see Note 13) using the coefficients from Table 1:

$$ \Delta\,\text{rating\_count\_scaled} = \frac{1{,}000{,}000 - \text{rating\_count\_baseline}}{S_{\text{rating\_count}}} \approx \frac{1{,}000{,}000}{1.598 \times 10^{5}} \approx 6.258 $$
(4)

Next, we multiply this scaled value by its corresponding coefficient from Table 2 and add the product to the log(odds) of the baseline app. This gives us log(odds) = 8.850, or equivalently odds = e^(8.850) ≈ 6,974. According to our model, an app with 1,000,000 ratings has much greater odds of having a privacy policy than an app with no ratings.
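The arithmetic of this example can be retraced with a short Python sketch. It is an illustration under stated assumptions rather than the authors' implementation: the scaling constant 1.598 × 10^5 comes from Eq. 4, the log(odds) values −0.887 and 8.850 come from the text, and the rating_count coefficient is not copied from Table 2 but back-computed from those two reported values, so `implied_coefficient` is a hypothetical name for a derived quantity.

```python
import math

# Values taken from the text and Eq. 4.
RATING_COUNT_STD = 1.598e5      # S_rating_count in Eq. 4
BASELINE_RATING_COUNT = 0       # the baseline app has no ratings
BASELINE_LOG_ODDS = -0.887
REPORTED_LOG_ODDS = 8.850       # log(odds) reported for 1,000,000 ratings


def scaled_delta(new_value: float, baseline_value: float, std: float) -> float:
    """Change in a feature, expressed in scaled units as in Eq. 4."""
    return (new_value - baseline_value) / std


delta = scaled_delta(1_000_000, BASELINE_RATING_COUNT, RATING_COUNT_STD)

# Assumption: coefficient implied by the two reported log(odds) values,
# not read from Table 2.
implied_coefficient = (REPORTED_LOG_ODDS - BASELINE_LOG_ODDS) / delta

# Add the coefficient times the scaled change to the baseline log(odds).
new_log_odds = BASELINE_LOG_ODDS + implied_coefficient * delta
new_odds = math.exp(new_log_odds)

print(f"scaled change = {delta:.3f}")
print(f"log(odds)     = {new_log_odds:.3f}")
print(f"odds          = {new_odds:.0f}")
```

Because the model is linear in log(odds), each additional scaled unit of rating_count multiplies the odds by the same factor, e raised to the coefficient.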


Copyright information

© 2018 Springer Nature Switzerland AG

About this paper


Cite this paper

Story, P., Zimmeck, S., Sadeh, N. (2018). Which Apps Have Privacy Policies? In: Medina, M., Mitrakas, A., Rannenberg, K., Schweighofer, E., Tsouroulas, N. (eds) Privacy Technologies and Policy. APF 2018. Lecture Notes in Computer Science, vol 11079. Springer, Cham. https://doi.org/10.1007/978-3-030-02547-2_1


  • DOI: https://doi.org/10.1007/978-3-030-02547-2_1


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-02546-5

  • Online ISBN: 978-3-030-02547-2

  • eBook Packages: Computer Science (R0)
