Derivation and analysis of parallel-in-time neural ordinary differential equations

Lorin, E.

doi:10.1007/s10472-020-09702-6

Published: 25 July 2020

Volume 88, pages 1035–1059, (2020)

Annals of Mathematics and Artificial Intelligence

Abstract

The introduction in 2015 of Residual Neural Networks (RNN) and ResNet led to outstanding improvements in the performance of learning algorithms for evolution problems containing a “large” number of layers. Continuous-depth RNN-like models called Neural Ordinary Differential Equations (NODE) were then introduced in 2019. These models have a constant memory cost and avoid the a priori specification of the number of hidden layers. In this paper, we derive and analyze a parallel (in-parameter and in-time) version of the NODE, which potentially allows for a more efficient implementation than a standard/naive parallelization of NODEs with respect to the parameters only. We expect this approach to be relevant whenever a very large number of processors is available, or when dealing with high-dimensional ODE systems. Moreover, implicit ODE solvers require the solution of nonlinear systems, for instance with Newton’s algorithm, which in turn requires solving linear systems with up to cubic complexity. Because the proposed approach reduces the overall number of time steps through an iterative increase of the accuracy order of the ODE solvers, it also reduces the number of linear systems to solve, hence benefiting from a scaling effect.
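The abstract does not spell out the parallel-in-time scheme itself. As a rough illustration of the general idea only, the sketch below implements the classical parareal iteration on a toy scalar ODE. It is not the author's algorithm: the right-hand side `f`, the choice of explicit Euler for both propagators, and all function names are assumptions made for this example. A cheap coarse solver is swept sequentially over time windows, while the expensive fine solves on each window are independent of one another and could be distributed across processors.

```python
import math

def f(t, y):
    # Toy right-hand side standing in for learned NODE dynamics (assumption).
    return -y + math.sin(t)

def euler(t0, t1, y, steps):
    # Explicit Euler from t0 to t1 with a fixed number of steps.
    h = (t1 - t0) / steps
    t = t0
    for _ in range(steps):
        y = y + h * f(t, y)
        t += h
    return y

def coarse(t0, t1, y):
    # Cheap, low-accuracy propagator: a single Euler step per window.
    return euler(t0, t1, y, 1)

def fine(t0, t1, y):
    # Expensive, more accurate propagator: 100 Euler steps per window.
    return euler(t0, t1, y, 100)

def parareal(y0, t_end, n_windows, n_iters):
    ts = [t_end * n / n_windows for n in range(n_windows + 1)]
    # Initial guess: one sequential sweep with the coarse propagator.
    u = [y0]
    for n in range(n_windows):
        u.append(coarse(ts[n], ts[n + 1], u[n]))
    for _ in range(n_iters):
        # These per-window solves do not depend on each other, so this is
        # the part that could run in parallel (done serially here).
        f_old = [fine(ts[n], ts[n + 1], u[n]) for n in range(n_windows)]
        g_old = [coarse(ts[n], ts[n + 1], u[n]) for n in range(n_windows)]
        # Sequential correction: new coarse prediction plus (fine - coarse).
        new = [y0]
        for n in range(n_windows):
            new.append(coarse(ts[n], ts[n + 1], new[n]) + f_old[n] - g_old[n])
        u = new
    return ts, u
```

Calling `parareal(1.0, 5.0, 10, 2)` returns the window boundaries and the corrected values after two iterations. With as many iterations as windows, the result coincides with the sequential fine solution up to rounding, so any speed-up depends on stopping after far fewer iterations than windows.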

Acknowledgments

The author would like to thank Prof. D. Duvenaud and Dr. R. Chen from the University of Toronto for enlightening discussions about NODEs.

Author information

Authors and Affiliations

Centre de Recherches Mathématiques, Université de Montréal, Montréal, H3T 1J4, Canada
E. Lorin
School of Mathematics and Statistics, Carleton University, Ottawa, K1S 5B6, Canada
E. Lorin

Authors

E. Lorin

Corresponding author

Correspondence to E. Lorin.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Lorin, E. Derivation and analysis of parallel-in-time neural ordinary differential equations. Ann Math Artif Intell 88, 1035–1059 (2020). https://doi.org/10.1007/s10472-020-09702-6

Issue Date: October 2020
DOI: https://doi.org/10.1007/s10472-020-09702-6
