Knowledge Extraction from Transducer Neural Networks

Abstract

Neural networks have shown strong performance on tasks such as classification, but the structure of the knowledge they represent has received insufficient attention. In this paper, we analyze various knowledge extraction techniques in detail and develop new transducer extraction techniques for interpreting what recurrent neural networks learn. First, we give an overview of different ways of expressing structured knowledge with neural networks. Then, we rigorously analyze one type of recurrent network, applying a broad range of techniques. We argue that analysis techniques such as weight analysis using Hinton diagrams, hierarchical cluster analysis, and principal component analysis can provide useful views of the underlying knowledge. However, we demonstrate that these techniques are too static and too low-level for interpreting recurrent network classifications. The contribution of this paper is a particularly broad analysis of knowledge extraction techniques. Furthermore, we propose dynamic learning analysis and transducer extraction as two new dynamic interpretation techniques: dynamic learning analysis provides a better understanding of how the network learns, while transducer extraction provides a better understanding of what the network represents.
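
To make transducer extraction concrete, the sketch below shows one standard recipe from the extraction literature: drive a recurrent network with input sequences, quantize the resulting hidden-state trajectories into a small set of discrete states by clustering, and read a Mealy-style transducer off the observed (state, input) -> (next state, output) transitions. This is a minimal illustration, not the paper's exact algorithm: the simple recurrent network below uses random stand-in weights (a trained network would be substituted in practice), and the cluster count, network sizes, and majority-vote determinization are all illustrative assumptions.

import numpy as np
from collections import Counter, defaultdict
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Stand-in simple recurrent network (SRN): h_t = tanh(W_ih x_t + W_hh h_{t-1}),
# output symbol y_t = argmax(W_ho h_t). Random weights here; a trained
# network would be used in practice.
n_in, n_hid, n_out = 4, 8, 3
W_ih = rng.normal(size=(n_hid, n_in))
W_hh = rng.normal(size=(n_hid, n_hid)) * 0.5
W_ho = rng.normal(size=(n_out, n_hid))

def run_srn(seq):
    """Return the hidden-state trajectory and output symbols for one input sequence."""
    h = np.zeros(n_hid)
    states, outputs = [], []
    for x in seq:
        h = np.tanh(W_ih @ x + W_hh @ h)
        states.append(h.copy())
        outputs.append(int(np.argmax(W_ho @ h)))
    return states, outputs

# Drive the network with random one-hot input sequences and record trajectories.
one_hot = np.eye(n_in)
sequences = [rng.integers(0, n_in, size=10) for _ in range(50)]
trajectories = [(seq, *run_srn(one_hot[seq])) for seq in sequences]

# Quantize the continuous hidden-state space into discrete automaton states.
all_states = np.concatenate([np.array(states) for _, states, _ in trajectories])
kmeans = KMeans(n_clusters=5, n_init=10, random_state=0).fit(all_states)
start = int(kmeans.predict(np.zeros((1, n_hid)))[0])  # cluster of the initial state

# Tally (state, input) -> (next state, output) transitions over all trajectories.
transitions = defaultdict(Counter)
for seq, states, outputs in trajectories:
    labels = kmeans.predict(np.array(states))
    prev = start
    for sym, lab, out in zip(seq, labels, outputs):
        transitions[(prev, int(sym))][(int(lab), out)] += 1
        prev = int(lab)

# Majority vote over the tallies makes the extracted Mealy-style transducer
# deterministic; rare, inconsistent transitions are pruned implicitly.
transducer = {k: c.most_common(1)[0][0] for k, c in transitions.items()}
for (state, sym), (nxt, out) in sorted(transducer.items()):
    print(f"state {state} --{sym}/{out}--> state {nxt}")

With a network trained on a real transduction task, the printed state graph approximates the input/output mapping the network has learned; with random weights it merely demonstrates the mechanics of the extraction step.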

Cite this article

Wermter, S. Knowledge Extraction from Transducer Neural Networks. Applied Intelligence 12, 27–42 (2000). https://doi.org/10.1023/A:1008320219610
