Knowledge Extraction from Transducer Neural Networks

Abstract

Neural networks have shown strong performance on tasks such as classification, but the structure of the knowledge they represent has received insufficient attention. In this paper, we analyze various knowledge extraction techniques in detail and develop new transducer extraction techniques for interpreting what recurrent neural networks learn. First, we give an overview of different ways of expressing structured knowledge with neural networks. Then, we rigorously analyze one type of recurrent network, applying a broad range of techniques. We argue that analysis techniques such as weight analysis using Hinton diagrams, hierarchical cluster analysis, and principal component analysis can provide useful views of the underlying knowledge. However, we demonstrate that these techniques are too static and too low-level for interpreting recurrent network classifications. The contribution of this paper is a particularly broad analysis of knowledge extraction techniques. Furthermore, we propose dynamic learning analysis and transducer extraction as two new dynamic interpretation techniques: dynamic learning analysis provides a better understanding of how the network learns, while transducer extraction provides a better understanding of what the network represents.
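
To make transducer extraction concrete, the sketch below shows one standard recipe from the extraction literature: drive a recurrent network with input sequences, quantize the resulting hidden-state trajectories into a small set of discrete states by clustering, and read a Mealy-style transducer off the observed (state, input) -> (next state, output) transitions. This is a minimal illustration, not the paper's exact algorithm: the simple recurrent network below uses random stand-in weights (a trained network would be substituted in practice), and the cluster count, network sizes, and majority-vote determinization are all illustrative assumptions.

import numpy as np
from collections import Counter, defaultdict
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Stand-in simple recurrent network (SRN): h_t = tanh(W_ih x_t + W_hh h_{t-1}),
# output symbol y_t = argmax(W_ho h_t). Random weights here; a trained
# network would be used in practice.
n_in, n_hid, n_out = 4, 8, 3
W_ih = rng.normal(size=(n_hid, n_in))
W_hh = rng.normal(size=(n_hid, n_hid)) * 0.5
W_ho = rng.normal(size=(n_out, n_hid))

def run_srn(seq):
    """Return the hidden-state trajectory and output symbols for one input sequence."""
    h = np.zeros(n_hid)
    states, outputs = [], []
    for x in seq:
        h = np.tanh(W_ih @ x + W_hh @ h)
        states.append(h.copy())
        outputs.append(int(np.argmax(W_ho @ h)))
    return states, outputs

# Drive the network with random one-hot input sequences and record trajectories.
one_hot = np.eye(n_in)
sequences = [rng.integers(0, n_in, size=10) for _ in range(50)]
trajectories = [(seq, *run_srn(one_hot[seq])) for seq in sequences]

# Quantize the continuous hidden-state space into discrete automaton states.
all_states = np.concatenate([np.array(states) for _, states, _ in trajectories])
kmeans = KMeans(n_clusters=5, n_init=10, random_state=0).fit(all_states)
start = int(kmeans.predict(np.zeros((1, n_hid)))[0])  # cluster of the initial state

# Tally (state, input) -> (next state, output) transitions over all trajectories.
transitions = defaultdict(Counter)
for seq, states, outputs in trajectories:
    labels = kmeans.predict(np.array(states))
    prev = start
    for sym, lab, out in zip(seq, labels, outputs):
        transitions[(prev, int(sym))][(int(lab), out)] += 1
        prev = int(lab)

# Majority vote over the tallies makes the extracted Mealy-style transducer
# deterministic; rare, inconsistent transitions are pruned implicitly.
transducer = {k: c.most_common(1)[0][0] for k, c in transitions.items()}
for (state, sym), (nxt, out) in sorted(transducer.items()):
    print(f"state {state} --{sym}/{out}--> state {nxt}")

With a network trained on a real transduction task, the printed state graph approximates the input/output mapping the network has learned; with random weights it merely demonstrates the mechanics of the extraction step.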

Cite this article

Wermter, S. Knowledge Extraction from Transducer Neural Networks. Applied Intelligence 12, 27–42 (2000). https://doi.org/10.1023/A:1008320219610
