
Escaping Saddle Points of Empirical Risk Privately and Scalably via DP-Trust Region Method

  • Conference paper
Machine Learning and Knowledge Discovery in Databases (ECML PKDD 2020)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 12459)

Abstract

Many non-convex objective/loss functions in machine learning have recently been shown to be strict saddle. This means that finding a second-order stationary point (i.e., an approximate local minimum), and thus escaping saddle points, is sufficient for such functions to obtain a classifier with good generalization performance. Existing algorithms for escaping saddle points, however, all fail to consider a critical issue in their designs, namely the protection of sensitive information in the training set. Models learned by such algorithms can often implicitly memorize the details of sensitive information and thus offer opportunities for malicious parties to infer it from the learned models. In this paper, we investigate the problem of privately escaping saddle points and finding a second-order stationary point of the empirical risk of a non-convex loss function. The previous result on this problem is mainly of theoretical importance and has several issues (e.g., high sample complexity and poor scalability) which hinder its applicability, especially to big data. To address these issues, we propose a new method called Differentially Private Trust Region, and show that it outputs a second-order stationary point with high probability and with lower sample complexity than the existing method. Moreover, we also provide a stochastic version of our method (along with theoretical guarantees) to make it faster and more scalable. Experiments on benchmark datasets suggest that our methods are indeed more efficient and practical than the previous one.
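To give a concrete sense of the kind of update a differentially private trust-region method builds on, the sketch below perturbs the empirical gradient and Hessian with Gaussian noise and then solves the classical trust-region subproblem exactly via an eigendecomposition. This is only an illustration, not the paper's algorithm: the function name `noisy_trust_region_step`, the noise scales `sigma_g` and `sigma_h`, and the subproblem solver are our own choices, and achieving a formal privacy guarantee would require calibrating the noise to the sensitivity analysis developed in the paper itself.

```python
import numpy as np

def noisy_trust_region_step(grad, hess, radius, sigma_g, sigma_h, rng):
    """One illustrative trust-region step with Gaussian-perturbed first- and
    second-order information: solve min_s  g.s + 0.5 s.H.s  s.t. ||s|| <= radius."""
    d = grad.shape[0]
    g = grad + rng.normal(0.0, sigma_g, size=d)        # noisy gradient
    E = rng.normal(0.0, sigma_h, size=(d, d))
    H = hess + (E + E.T) / 2.0                         # symmetrized noisy Hessian

    lam, Q = np.linalg.eigh(H)                         # eigenvalues ascending
    gq = Q.T @ g

    def step_norm(mu):                                 # ||-(H + mu I)^{-1} g||
        return np.sqrt(np.sum((gq / (lam + mu)) ** 2))

    if lam[0] > 0 and step_norm(0.0) <= radius:
        return -Q @ (gq / lam)                         # interior (Newton) step

    # Otherwise the solution lies on the boundary: bisect for mu >= -lam_min
    # with ||s(mu)|| = radius. (The degenerate "hard case" has probability
    # zero under continuous Gaussian noise, so it is ignored in this sketch.)
    lo = max(0.0, -lam[0]) + 1e-12
    hi = lo + 1.0
    while step_norm(hi) > radius:
        hi *= 2.0
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if step_norm(mid) > radius:
            lo = mid
        else:
            hi = mid
    return -Q @ (gq / (lam + hi))
```

With the noise scales set to zero this reduces to the classical trust-region step; in the private setting the scales would be determined by the gradient/Hessian sensitivity and the privacy budget.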


Notes

  1. A point w of a function \(F(\cdot )\) is called a first-order stationary point (critical point) if it satisfies \(\Vert \nabla F(w)\Vert =0\).
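The distinction between first-order and second-order stationary points, which motivates the paper, can be checked numerically. The example below is our own illustration, not taken from the paper: for \(F(w) = w_1^2 - w_2^2\), the origin is first-order stationary, yet the Hessian there has a negative eigenvalue, so it is a saddle point rather than a local minimum.

```python
import numpy as np

# F(w) = w1^2 - w2^2: gradient vanishes at the origin, but the Hessian
# has a negative eigenvalue there, so the origin is a saddle point.
def grad_F(w):
    return np.array([2.0 * w[0], -2.0 * w[1]])

hess_F = np.diag([2.0, -2.0])    # constant Hessian of F

w0 = np.zeros(2)
assert np.linalg.norm(grad_F(w0)) == 0.0        # first-order stationary
assert np.linalg.eigvalsh(hess_F).min() < 0.0   # negative curvature: a saddle
```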

  2. Generally, \(D_\alpha (P\Vert Q)\) is the Rényi divergence between P and Q, defined as

     $$\begin{aligned} D_\alpha (P\Vert Q)= \frac{1}{\alpha -1}\log \mathbb {E}_{x\sim Q} \left( \frac{P(x)}{Q(x)}\right) ^\alpha . \end{aligned}$$
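For intuition, the definition above can be evaluated directly for discrete distributions with full support; the helper name `renyi_divergence` below is ours, not from the paper.

```python
import numpy as np

def renyi_divergence(p, q, alpha):
    """D_alpha(P || Q) = (1/(alpha-1)) * log E_{x~Q} (P(x)/Q(x))^alpha
    for discrete distributions p, q (arrays summing to 1, all entries > 0,
    alpha > 0 and alpha != 1)."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    # E_{x~Q}(P/Q)^alpha = sum_x q(x) * (p(x)/q(x))^alpha
    return np.log(np.sum(q * (p / q) ** alpha)) / (alpha - 1.0)
```

For identical distributions the divergence is 0; for p = (1/2, 1/2) and q = (1/4, 3/4) with alpha = 2 it evaluates to log(4/3).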

  3. This is a special version of \((\epsilon , \gamma )\)-SOSP [15]. Our results can be easily extended to the general definition. The same applies to the constrained case.

References

  1. Agarwal, N., Singh, K.: The price of differential privacy for online learning. In: Proceedings of the 34th International Conference on Machine Learning, ICML 2017, Sydney, NSW, Australia, 6–11 August 2017, pp. 32–40 (2017)
  2. Anandkumar, A., Ge, R.: Efficient approaches for escaping higher order saddle points in non-convex optimization. In: Conference on Learning Theory, pp. 81–102 (2016)
  3. Balcan, M.F., Dick, T., Vitercik, E.: Dispersion for data-driven algorithm design, online learning, and private optimization. In: 2018 IEEE 59th Annual Symposium on Foundations of Computer Science (FOCS), pp. 603–614. IEEE (2018)
  4. Bassily, R., Smith, A., Thakurta, A.: Private empirical risk minimization: efficient algorithms and tight error bounds. In: 2014 IEEE 55th Annual Symposium on Foundations of Computer Science (FOCS), pp. 464–473. IEEE (2014)
  5. Bhojanapalli, S., Neyshabur, B., Srebro, N.: Global optimality of local search for low rank matrix recovery. In: Advances in Neural Information Processing Systems, pp. 3873–3881 (2016)
  6. Bun, M., Steinke, T.: Concentrated differential privacy: simplifications, extensions, and lower bounds. In: Hirt, M., Smith, A. (eds.) TCC 2016. LNCS, vol. 9985, pp. 635–658. Springer, Heidelberg (2016). https://doi.org/10.1007/978-3-662-53641-4_24
  7. Chaudhuri, K., Monteleoni, C.: Privacy-preserving logistic regression. In: Advances in Neural Information Processing Systems, pp. 289–296 (2009)
  8. Chaudhuri, K., Monteleoni, C., Sarwate, A.D.: Differentially private empirical risk minimization. J. Mach. Learn. Res. 12, 1069–1109 (2011)
  9. Conn, A.R., Gould, N.I., Toint, P.L.: Trust Region Methods. SIAM (2000)
  10. Dauphin, Y.N., Pascanu, R., Gulcehre, C., Cho, K., Ganguli, S., Bengio, Y.: Identifying and attacking the saddle point problem in high-dimensional non-convex optimization. In: Advances in Neural Information Processing Systems, pp. 2933–2941 (2014)
  11. Dwork, C.: Differential privacy: a survey of results. In: Agrawal, M., Du, D., Duan, Z., Li, A. (eds.) TAMC 2008. LNCS, vol. 4978, pp. 1–19. Springer, Heidelberg (2008). https://doi.org/10.1007/978-3-540-79228-4_1
  12. Dwork, C., McSherry, F., Nissim, K., Smith, A.: Calibrating noise to sensitivity in private data analysis. In: Halevi, S., Rabin, T. (eds.) TCC 2006. LNCS, vol. 3876, pp. 265–284. Springer, Heidelberg (2006). https://doi.org/10.1007/11681878_14
  13. Dwork, C., Roth, A., et al.: The algorithmic foundations of differential privacy. Found. Trends Theor. Comput. Sci. 9(3–4), 211–407 (2014)
  14. Erlingsson, Ú., Pihur, V., Korolova, A.: RAPPOR: randomized aggregatable privacy-preserving ordinal response. In: Proceedings of the 2014 ACM SIGSAC Conference on Computer and Communications Security, pp. 1054–1067. ACM (2014)
  15. Ge, R., Huang, F., Jin, C., Yuan, Y.: Escaping from saddle points - online stochastic gradient for tensor decomposition. In: Conference on Learning Theory, pp. 797–842 (2015)
  16. Ge, R., Lee, J.D., Ma, T.: Learning one-hidden-layer neural networks with landscape design. In: International Conference on Learning Representations (2018)
  17. Goodfellow, I., Bengio, Y., Courville, A.: Deep Learning, vol. 1. MIT Press, Cambridge (2016)
  18. Gould, N.I., Lucidi, S., Roma, M., Toint, P.L.: Solving the trust-region subproblem using the Lanczos method. SIAM J. Optim. 9(2), 504–525 (1999)
  19. Grant, M., Boyd, S.: CVX: Matlab software for disciplined convex programming, version 2.1, March 2014. http://cvxr.com/cvx
  20. Huai, M., Wang, D., Miao, C., Xu, J., Zhang, A.: Pairwise learning with differential privacy guarantees. In: The Thirty-Fourth AAAI Conference on Artificial Intelligence, AAAI 2020, New York City, New York, USA, 7–12 February 2020 (2020)
  21. Jain, P., Kothari, P., Thakurta, A.: Differentially private online learning. In: Conference on Learning Theory, pp. 24.1–24.34 (2012)
  22. Kasiviswanathan, S.P., Jin, H.: Efficient private empirical risk minimization for high-dimensional learning. In: International Conference on Machine Learning, pp. 488–497 (2016)
  23. Kawaguchi, K.: Deep learning without poor local minima. In: Advances in Neural Information Processing Systems, pp. 586–594 (2016)
  24. Kohler, J.M., Lucchi, A.: Sub-sampled cubic regularization for non-convex optimization. In: Proceedings of the 34th International Conference on Machine Learning, vol. 70, pp. 1895–1904. JMLR.org (2017)
  25. Mei, S., Bai, Y., Montanari, A., et al.: The landscape of empirical risk for nonconvex losses. Ann. Stat. 46(6A), 2747–2774 (2018)
  26. Talwar, K., Thakurta, A.G., Zhang, L.: Nearly optimal private LASSO. In: Advances in Neural Information Processing Systems, pp. 3025–3033 (2015)
  27. Thakurta, A.G., Smith, A.: (Nearly) optimal algorithms for private online learning in full-information and bandit settings. In: Advances in Neural Information Processing Systems, pp. 2733–2741 (2013)
  28. Wang, D., Chen, C., Xu, J.: Differentially private empirical risk minimization with non-convex loss functions. In: International Conference on Machine Learning, pp. 6526–6535 (2019)
  29. Wang, D., Gaboardi, M., Xu, J.: Empirical risk minimization in non-interactive local differential privacy revisited (2018)
  30. Wang, D., Smith, A., Xu, J.: Noninteractive locally private learning of linear models via polynomial approximations. In: Algorithmic Learning Theory, pp. 897–902 (2019)
  31. Wang, D., Xu, J.: Differentially private empirical risk minimization with smooth non-convex loss functions: a non-stationary view (2019)
  32. Wang, D., Xu, J.: Differentially private empirical risk minimization with smooth non-convex loss functions: a non-stationary view. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 33, pp. 1182–1189 (2019)
  33. Wang, D., Xu, J.: On sparse linear regression in the local differential privacy model. In: International Conference on Machine Learning, pp. 6628–6637 (2019)
  34. Wang, D., Ye, M., Xu, J.: Differentially private empirical risk minimization revisited: faster and more general. In: Advances in Neural Information Processing Systems, pp. 2722–2731 (2017)
  35. Wang, D., Zhang, H., Gaboardi, M., Xu, J.: Estimating smooth GLM in non-interactive local differential privacy model with public unlabeled data. arXiv preprint arXiv:1910.00482 (2019)
  36. Wang, Y.X., Lei, J., Fienberg, S.E.: Learning with differential privacy: stability, learnability and the sufficiency and necessity of ERM principle. J. Mach. Learn. Res. 17(183), 1–40 (2016)
  37. Zhang, J., Zheng, K., Mou, W., Wang, L.: Efficient private ERM for smooth objectives. In: Proceedings of the 26th International Joint Conference on Artificial Intelligence, pp. 3922–3928. AAAI Press (2017)
  38. Zhou, D., Xu, P., Gu, Q.: Stochastic variance-reduced cubic regularized Newton method. In: International Conference on Machine Learning, pp. 5985–5994 (2018)


Author information

Correspondence to Di Wang.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 185 KB)


Copyright information

© 2021 Springer Nature Switzerland AG

About this paper


Cite this paper

Wang, D., Xu, J. (2021). Escaping Saddle Points of Empirical Risk Privately and Scalably via DP-Trust Region Method. In: Hutter, F., Kersting, K., Lijffijt, J., Valera, I. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2020. Lecture Notes in Computer Science, vol. 12459. Springer, Cham. https://doi.org/10.1007/978-3-030-67664-3_6


  • DOI: https://doi.org/10.1007/978-3-030-67664-3_6

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-67663-6

  • Online ISBN: 978-3-030-67664-3

  • eBook Packages: Computer Science (R0)
