Bilevel Optimization with Nonsmooth Lower Level Problems

  • Conference paper
Scale Space and Variational Methods in Computer Vision (SSVM 2015)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 9087)

Abstract

We consider a bilevel optimization approach for parameter learning in nonsmooth variational models. Existing approaches solve this problem by applying implicit differentiation to a sufficiently smooth approximation of the nondifferentiable lower level problem. We propose an alternative method based on differentiating the iterations of a nonlinear primal–dual algorithm. Our method computes exact (sub)gradients and can also be applied in the nonsmooth setting. We show preliminary results for the case of multi-label image segmentation.
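
As a rough illustration of what differentiating the iterations of an algorithm means, the following minimal sketch propagates a derivative through every step of a solver for a nonsmooth lower level problem. It is not the method of the paper: the primal–dual algorithm is replaced here by plain proximal gradient descent on a toy problem, min_x 0.5*||x - b||^2 + theta*||x||_1, and the names solve_lower_level, upper_level_loss_and_grad, b, x_true, theta and tau are all hypothetical.

```python
import numpy as np


def soft_threshold(z, t):
    """Proximal map of t*|.|: shrink every entry of z toward zero by t."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)


def solve_lower_level(b, theta, tau=0.5, iters=100):
    """Proximal gradient on 0.5*||x - b||^2 + theta*||x||_1 (toy assumption).

    Returns the final iterate x and dx, a (sub)derivative of x with respect
    to theta obtained by differentiating each iteration in turn.
    """
    x = np.zeros_like(b)
    dx = np.zeros_like(b)
    for _ in range(iters):
        z = x - tau * (x - b)      # gradient step on the smooth term
        dz = (1.0 - tau) * dx      # derivative of that step w.r.t. theta
        active = np.abs(z) > tau * theta
        x = soft_threshold(z, tau * theta)
        # Chain rule through the prox: on the active set the derivative
        # w.r.t. z is 1 and w.r.t. theta is -tau*sign(z); elsewhere both are 0.
        dx = np.where(active, dz - tau * np.sign(z), 0.0)
    return x, dx


def upper_level_loss_and_grad(b, x_true, theta):
    """Quadratic upper level loss and its derivative w.r.t. theta."""
    x, dx = solve_lower_level(b, theta)
    loss = 0.5 * np.sum((x - x_true) ** 2)
    grad = np.sum((x - x_true) * dx)
    return loss, grad


b = np.array([2.0, -0.3, 1.0])
x_true = np.array([1.0, 0.0, 0.5])
loss, grad = upper_level_loss_and_grad(b, x_true, theta=0.5)
```

For this toy problem the lower level solution is the soft-thresholding of b by theta, so the propagated derivative can be checked by hand: it is -sign(b_i) on the entries that stay nonzero and 0 on the others. A gradient-based optimizer for theta could then use grad directly, with no smoothing of the l1 term.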

Author information

Corresponding author

Correspondence to Peter Ochs.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Ochs, P., Ranftl, R., Brox, T., Pock, T. (2015). Bilevel Optimization with Nonsmooth Lower Level Problems. In: Aujol, JF., Nikolova, M., Papadakis, N. (eds) Scale Space and Variational Methods in Computer Vision. SSVM 2015. Lecture Notes in Computer Science, vol 9087. Springer, Cham. https://doi.org/10.1007/978-3-319-18461-6_52

  • DOI: https://doi.org/10.1007/978-3-319-18461-6_52

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-18460-9

  • Online ISBN: 978-3-319-18461-6

  • eBook Packages: Computer Science, Computer Science (R0)
