Direct Zero-Norm Minimization for Neural Network Pruning and Training

  • Conference paper
Engineering Applications of Neural Networks (EANN 2012)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 311)

Abstract

Designing a feed-forward neural network whose topology is optimal in terms of complexity (hidden-layer nodes and the connections between them) and training performance has been a concern since the beginning of neural network research. Typically, the issue is addressed by pruning a fully interconnected network with “many” nodes in the hidden layers, eliminating “superfluous” connections and nodes. However, the problem remains unsolved, and it appears even more relevant today in the context of deep learning networks. In this paper we present a method of direct zero-norm minimization for pruning a multilayer perceptron while training it. The method employs a cooperative scheme using two swarms of particles, and its purpose is to minimize an aggregate function corresponding to the total risk functional. Our discussion highlights relevant computational and methodological issues of the approach that are not apparent or well defined in the literature.
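To make the idea of an aggregate function concrete, the following is a minimal sketch of one possible form: a data-fit term plus a penalty proportional to the zero-norm, that is, the number of non-zero connections. It is not the authors' implementation. The one-hidden-layer architecture, the tanh units, the omission of biases, the mean-squared-error term, the trade-off weight `lam`, and the split into real-valued weights and a binary connection mask are all assumptions made for illustration; the abstract does not specify them. The particle swarm search itself is not shown.

```python
import math

def zero_norm(weights, tol=0.0):
    """Count the entries whose magnitude exceeds tol (the 'zero-norm')."""
    return sum(1 for w in weights if abs(w) > tol)

def mlp_forward(x, w_hidden, w_out, mask_hidden, mask_out):
    """One-hidden-layer perceptron with tanh units and a linear output.

    A mask entry of 0 removes the corresponding connection (assumed encoding).
    Biases are omitted for brevity.
    """
    hidden = [
        math.tanh(sum(m * w * xi for m, w, xi in zip(m_row, w_row, x)))
        for m_row, w_row in zip(mask_hidden, w_hidden)
    ]
    return sum(m * w * h for m, w, h in zip(mask_out, w_out, hidden))

def total_risk(data, w_hidden, w_out, mask_hidden, mask_out, lam=0.01):
    """Hypothetical aggregate objective: MSE + lam * number of active connections."""
    mse = sum(
        (mlp_forward(x, w_hidden, w_out, mask_hidden, mask_out) - y) ** 2
        for x, y in data
    ) / len(data)
    # Effective weights after masking, flattened across both layers.
    active = [
        m * w
        for m_row, w_row in zip(mask_hidden, w_hidden)
        for m, w in zip(m_row, w_row)
    ]
    active += [m * w for m, w in zip(mask_out, w_out)]
    return mse + lam * zero_norm(active)
```

Under these assumptions, a search procedure would look for weights and a mask that jointly lower `total_risk`; a larger `lam` favors sparser networks at the cost of a worse fit.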

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Adam, S.P., Magoulas, G.D., Vrahatis, M.N. (2012). Direct Zero-Norm Minimization for Neural Network Pruning and Training. In: Jayne, C., Yue, S., Iliadis, L. (eds) Engineering Applications of Neural Networks. EANN 2012. Communications in Computer and Information Science, vol 311. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-32909-8_30

  • DOI: https://doi.org/10.1007/978-3-642-32909-8_30

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-32908-1

  • Online ISBN: 978-3-642-32909-8

  • eBook Packages: Computer Science, Computer Science (R0)