Abstract
Many non-convex objective/loss functions in machine learning have recently been shown to satisfy the strict saddle property. For such functions, finding a second-order stationary point (i.e., an approximate local minimum), and thus escaping saddle points, suffices to obtain a classifier with good generalization performance. Existing algorithms for escaping saddle points, however, all fail to address a critical design issue: protecting the sensitive information in the training set. Models learned by such algorithms can implicitly memorize details of this sensitive information, giving malicious parties an opportunity to infer it from the learned models. In this paper, we study the problem of privately escaping saddle points and finding a second-order stationary point of the empirical risk of a non-convex loss function. The previous result on this problem is mainly of theoretical interest and suffers from several issues, such as high sample complexity and poor scalability, that hinder its applicability, especially to big data. To address these issues, we propose a new method called Differentially Private Trust Region and show that it outputs a second-order stationary point with high probability and with lower sample complexity than the existing approach. We also provide a stochastic version of our method, along with theoretical guarantees, to make it faster and more scalable. Experiments on benchmark datasets suggest that our methods are indeed more efficient and practical than the previous one.
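To make the mechanism concrete, the following is a minimal sketch of one differentially private trust-region step: Gaussian noise is added to the gradient and Hessian, and the noisy trust-region subproblem \(\min_{\Vert s\Vert \le r}\, g^\top s + \tfrac{1}{2} s^\top H s\) is solved to obtain the next iterate. This is an illustration of the general idea only, not the paper's exact algorithm; in particular, the function names and noise scales `sigma_g`, `sigma_h` are illustrative assumptions, not the paper's privacy calibration.

```python
import numpy as np

def solve_tr_subproblem(g, H, r):
    """Minimize g^T s + 0.5 s^T H s subject to ||s|| <= r (eigendecomposition-based)."""
    w, V = np.linalg.eigh(H)            # eigenvalues in ascending order
    gt = V.T @ g                        # gradient in the eigenbasis
    if w[0] > 0:                        # H positive definite: try the Newton step
        s = -V @ (gt / w)
        if np.linalg.norm(s) <= r:
            return s
    # Boundary solution: find lam > max(0, -lambda_min) with ||s(lam)|| = r,
    # where s(lam) = -(H + lam*I)^{-1} g (secular equation, solved by bisection)
    def step_norm(lam):
        return np.linalg.norm(gt / (w + lam))
    lo = max(0.0, -w[0]) + 1e-12
    hi = lo + 1.0
    while step_norm(hi) > r:            # bracket the root
        hi *= 2.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if step_norm(mid) > r:
            lo = mid
        else:
            hi = mid
    return -V @ (gt / (w + hi))

def dp_trust_region_step(wk, grad, hess, r, sigma_g, sigma_h, rng):
    """One noisy trust-region step; sigma_g, sigma_h would be set by the privacy budget."""
    d = wk.shape[0]
    g = grad(wk) + sigma_g * rng.standard_normal(d)
    E = sigma_h * rng.standard_normal((d, d))
    H = hess(wk) + 0.5 * (E + E.T)      # symmetrized Gaussian noise on the Hessian
    return wk + solve_tr_subproblem(g, H, r)
```

On a toy saddle such as \(F(x, y) = x^2 - y^2\), the boundary solution of the subproblem is dominated by the negative-curvature direction, so the iterate escapes the saddle even when the gradient is nearly zero; the paper's contribution is doing this while preserving differential privacy with low sample complexity.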
Notes
1. A point w of a function \(F(\cdot )\) is called a first-order stationary point (critical point) if it satisfies \(\Vert \nabla F(w)\Vert =0\).
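As a quick numerical illustration of this definition (using a simple test function that is not from the paper): for \(F(w) = (w_1^2 - 1)^2 + w_2^2\), the gradient vanishes at the two minima \((\pm 1, 0)\) but also at the saddle point \((0, 0)\), which is precisely why first-order stationarity alone is not enough.

```python
import numpy as np

# Gradient of F(w) = (w1^2 - 1)^2 + w2^2
def grad_F(w):
    return np.array([4.0 * w[0] * (w[0] ** 2 - 1.0), 2.0 * w[1]])

# All three points are first-order stationary: ||grad F(w)|| = 0,
# even though (0, 0) is a saddle, not a local minimum.
for p in [(1.0, 0.0), (-1.0, 0.0), (0.0, 0.0)]:
    print(p, np.linalg.norm(grad_F(np.array(p))))  # each prints norm 0.0
```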
2. Generally, \(D_\alpha (P\Vert Q)\) is the Rényi divergence between P and Q, defined as
$$\begin{aligned} D_\alpha (P\Vert Q)= \frac{1}{\alpha -1}\log \mathbb {E}_{x\sim Q} \Big (\frac{P(x)}{Q(x)}\Big )^\alpha . \end{aligned}$$
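For discrete distributions the expectation reduces to \(\sum_x P(x)^\alpha Q(x)^{1-\alpha }\), which gives a direct way to compute the divergence. A small sketch (the example distributions below are illustrative, not tied to any particular privacy accountant):

```python
import numpy as np

def renyi_divergence(p, q, alpha):
    """D_alpha(P || Q) = 1/(alpha-1) * log sum_x P(x)^alpha * Q(x)^(1-alpha)."""
    assert alpha > 0 and alpha != 1
    return np.log(np.sum(p ** alpha * q ** (1.0 - alpha))) / (alpha - 1.0)

p = np.array([0.5, 0.3, 0.2])
q = np.array([0.4, 0.4, 0.2])
print(renyi_divergence(p, q, 2.0))  # small positive number
print(renyi_divergence(p, p, 2.0))  # ~0.0 for identical distributions
```

Two sanity checks: the divergence is zero when \(P = Q\), and it is non-decreasing in \(\alpha\).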
3. This is a special version of the \((\epsilon , \gamma )\)-SOSP in [15]. Our results can easily be extended to the general definition. The same applies to the constrained case.
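Under the commonly used unconstrained formulation (the exact thresholds in [15] may differ), w is an \((\epsilon , \gamma )\)-SOSP when \(\Vert \nabla F(w)\Vert \le \epsilon\) and \(\lambda _{\min }(\nabla ^2 F(w)) \ge -\gamma\). A direct numerical check on the toy function \(F(w) = (w_1^2 - 1)^2 + w_2^2\) (an illustrative example, not from the paper):

```python
import numpy as np

def is_sosp(grad, hess, eps, gamma):
    """Check ||grad|| <= eps and lambda_min(hess) >= -gamma."""
    return (np.linalg.norm(grad) <= eps
            and np.linalg.eigvalsh(hess)[0] >= -gamma)

def grad_F(w):
    return np.array([4.0 * w[0] * (w[0] ** 2 - 1.0), 2.0 * w[1]])

def hess_F(w):
    return np.array([[12.0 * w[0] ** 2 - 4.0, 0.0], [0.0, 2.0]])

saddle = np.array([0.0, 0.0])    # gradient is 0, but curvature is -4 along w1
minimum = np.array([1.0, 0.0])   # gradient is 0 and the Hessian is PSD

print(is_sosp(grad_F(saddle), hess_F(saddle), 0.1, 0.1))    # False
print(is_sosp(grad_F(minimum), hess_F(minimum), 0.1, 0.1))  # True
```

Both points pass the first-order test, but only the minimum passes the second-order test, which is what escaping saddle points buys.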
References
Agarwal, N., Singh, K.: The price of differential privacy for online learning. In: Proceedings of the 34th International Conference on Machine Learning, ICML 2017, Sydney, NSW, Australia, 6–11 August 2017, pp. 32–40 (2017)
Anandkumar, A., Ge, R.: Efficient approaches for escaping higher order saddle points in non-convex optimization. In: Conference on Learning Theory, pp. 81–102 (2016)
Balcan, M.F., Dick, T., Vitercik, E.: Dispersion for data-driven algorithm design, online learning, and private optimization. In: 2018 IEEE 59th Annual Symposium on Foundations of Computer Science (FOCS), pp. 603–614. IEEE (2018)
Bassily, R., Smith, A., Thakurta, A.: Private empirical risk minimization: efficient algorithms and tight error bounds. In: 2014 IEEE 55th Annual Symposium on Foundations of Computer Science (FOCS), pp. 464–473. IEEE (2014)
Bhojanapalli, S., Neyshabur, B., Srebro, N.: Global optimality of local search for low rank matrix recovery. In: Advances in Neural Information Processing Systems, pp. 3873–3881 (2016)
Bun, M., Steinke, T.: Concentrated differential privacy: simplifications, extensions, and lower bounds. In: Hirt, M., Smith, A. (eds.) TCC 2016. LNCS, vol. 9985, pp. 635–658. Springer, Heidelberg (2016). https://doi.org/10.1007/978-3-662-53641-4_24
Chaudhuri, K., Monteleoni, C.: Privacy-preserving logistic regression. In: Advances in Neural Information Processing Systems, pp. 289–296 (2009)
Chaudhuri, K., Monteleoni, C., Sarwate, A.D.: Differentially private empirical risk minimization. J. Mach. Learn. Res. 12, 1069–1109 (2011)
Conn, A.R., Gould, N.I., Toint, P.L.: Trust-Region Methods. MPS-SIAM Series on Optimization, vol. 1. SIAM, Philadelphia (2000)
Dauphin, Y.N., Pascanu, R., Gulcehre, C., Cho, K., Ganguli, S., Bengio, Y.: Identifying and attacking the saddle point problem in high-dimensional non-convex optimization. In: Advances in Neural Information Processing Systems, pp. 2933–2941 (2014)
Dwork, C.: Differential privacy: a survey of results. In: Agrawal, M., Du, D., Duan, Z., Li, A. (eds.) TAMC 2008. LNCS, vol. 4978, pp. 1–19. Springer, Heidelberg (2008). https://doi.org/10.1007/978-3-540-79228-4_1
Dwork, C., McSherry, F., Nissim, K., Smith, A.: Calibrating noise to sensitivity in private data analysis. In: Halevi, S., Rabin, T. (eds.) TCC 2006. LNCS, vol. 3876, pp. 265–284. Springer, Heidelberg (2006). https://doi.org/10.1007/11681878_14
Dwork, C., Roth, A., et al.: The algorithmic foundations of differential privacy. Found. Trends Theor. Comput. Sci. 9(3–4), 211–407 (2014)
Erlingsson, Ú., Pihur, V., Korolova, A.: RAPPOR: randomized aggregatable privacy-preserving ordinal response. In: Proceedings of the 2014 ACM SIGSAC Conference on Computer and Communications Security, pp. 1054–1067. ACM (2014)
Ge, R., Huang, F., Jin, C., Yuan, Y.: Escaping from saddle points-online stochastic gradient for tensor decomposition. In: Conference on Learning Theory, pp. 797–842 (2015)
Ge, R., Lee, J.D., Ma, T.: Learning one-hidden-layer neural networks with landscape design. In: International Conference on Learning Representations (2018)
Goodfellow, I., Bengio, Y., Courville, A., Bengio, Y.: Deep Learning, vol. 1. MIT Press, Cambridge (2016)
Gould, N.I., Lucidi, S., Roma, M., Toint, P.L.: Solving the trust-region subproblem using the Lanczos method. SIAM J. Optim. 9(2), 504–525 (1999)
Grant, M., Boyd, S.: CVX: Matlab software for disciplined convex programming, version 2.1, March 2014. http://cvxr.com/cvx
Huai, M., Wang, D., Miao, C., Xu, J., Zhang, A.: Pairwise learning with differential privacy guarantees. In: The Thirty-Fourth AAAI Conference on Artificial Intelligence, AAAI 2020, New York City, New York, USA, 7–12 February 2020 (2020)
Jain, P., Kothari, P., Thakurta, A.: Differentially private online learning. In: Conference on Learning Theory, pp. 24.1–24.34 (2012)
Kasiviswanathan, S.P., Jin, H.: Efficient private empirical risk minimization for high-dimensional learning. In: International Conference on Machine Learning, pp. 488–497 (2016)
Kawaguchi, K.: Deep learning without poor local minima. In: Advances in Neural Information Processing Systems, pp. 586–594 (2016)
Kohler, J.M., Lucchi, A.: Sub-sampled cubic regularization for non-convex optimization. In: Proceedings of the 34th International Conference on Machine Learning-Volume 70, pp. 1895–1904. JMLR.org (2017)
Mei, S., Bai, Y., Montanari, A., et al.: The landscape of empirical risk for nonconvex losses. Ann. Stat. 46(6A), 2747–2774 (2018)
Talwar, K., Thakurta, A.G., Zhang, L.: Nearly optimal private LASSO. In: Advances in Neural Information Processing Systems, pp. 3025–3033 (2015)
Thakurta, A.G., Smith, A.: (nearly) optimal algorithms for private online learning in full-information and bandit settings. In: Advances in Neural Information Processing Systems, pp. 2733–2741 (2013)
Wang, D., Chen, C., Xu, J.: Differentially private empirical risk minimization with non-convex loss functions. In: International Conference on Machine Learning, pp. 6526–6535 (2019)
Wang, D., Gaboardi, M., Xu, J.: Empirical risk minimization in non-interactive local differential privacy revisited (2018)
Wang, D., Smith, A., Xu, J.: Noninteractive locally private learning of linear models via polynomial approximations. In: Algorithmic Learning Theory, pp. 897–902 (2019)
Wang, D., Xu, J.: Differentially private empirical risk minimization with smooth non-convex loss functions: a non-stationary view. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 33, pp. 1182–1189 (2019)
Wang, D., Xu, J.: On sparse linear regression in the local differential privacy model. In: International Conference on Machine Learning, pp. 6628–6637 (2019)
Wang, D., Ye, M., Xu, J.: Differentially private empirical risk minimization revisited: faster and more general. In: Advances in Neural Information Processing Systems, pp. 2722–2731 (2017)
Wang, D., Zhang, H., Gaboardi, M., Xu, J.: Estimating smooth GLM in non-interactive local differential privacy model with public unlabeled data. arXiv preprint arXiv:1910.00482 (2019)
Wang, Y.X., Lei, J., Fienberg, S.E.: Learning with differential privacy: stability, learnability and the sufficiency and necessity of ERM principle. J. Mach. Learn. Res. 17(183), 1–40 (2016)
Zhang, J., Zheng, K., Mou, W., Wang, L.: Efficient private ERM for smooth objectives. In: Proceedings of the 26th International Joint Conference on Artificial Intelligence, pp. 3922–3928. AAAI Press (2017)
Zhou, D., Xu, P., Gu, Q.: Stochastic variance-reduced cubic regularized Newton method. In: International Conference on Machine Learning, pp. 5985–5994 (2018)
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Wang, D., Xu, J. (2021). Escaping Saddle Points of Empirical Risk Privately and Scalably via DP-Trust Region Method. In: Hutter, F., Kersting, K., Lijffijt, J., Valera, I. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2020. Lecture Notes in Computer Science(), vol 12459. Springer, Cham. https://doi.org/10.1007/978-3-030-67664-3_6
Print ISBN: 978-3-030-67663-6
Online ISBN: 978-3-030-67664-3