Abstract
A new methodology for the sub-symbolic semantic encoding of words is presented. The methodology uses the WordNet lexical database and an ad hoc modified Sammon algorithm to associate each word with a vector in a semantic n-space. All words are grouped according to the classification criteria of the WordNet lexicographers' files; these groups are called lexical sets. Each word vector consists of two parts: the first encodes the word's membership in one of these lexical sets; the second is related to the meaning of the word and distinguishes it from the other words in the same lexical set. Applying the proposed technique to all the words of WordNet would yield an interesting instrument for the sub-symbolic processing of texts. The first experimental results show the effectiveness of the proposed approach.
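As a rough illustration of the two-part vector described in the abstract, the sketch below computes the standard Sammon stress (Sammon, 1969) and concatenates a lexical-set part with a within-set coordinate part. It is not the authors' modified algorithm, which the abstract does not specify. The one-hot encoding of the lexical set, the function names `sammon_stress` and `encode_word`, and the example distances are all assumptions made for illustration.

```python
import math
from itertools import combinations


def euclidean(a, b):
    """Euclidean distance between two equal-length coordinate sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def sammon_stress(target, points):
    """Standard Sammon stress, not the paper's modified variant.

    target[i][j] is the desired (e.g. semantic) distance between items i and j;
    points[i] is the current position of item i in the n-space.
    All target distances between distinct items must be positive.
    """
    pairs = list(combinations(range(len(points)), 2))
    total = sum(target[i][j] for i, j in pairs)
    error = sum(
        (target[i][j] - euclidean(points[i], points[j])) ** 2 / target[i][j]
        for i, j in pairs
    )
    return error / total


def encode_word(set_index, n_sets, coords):
    """Concatenate a lexical-set part and a within-set meaning part.

    Assumption: the lexical-set part is one-hot; the abstract does not say
    how the authors encode set membership.
    """
    one_hot = [1.0 if k == set_index else 0.0 for k in range(n_sets)]
    return one_hot + list(coords)
```

A stress of zero means the mapped points reproduce the target distances exactly. A Sammon-style procedure would adjust `points` iteratively to reduce this value before the coordinates are passed to `encode_word`.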
© 2003 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Vassallo, G., Pilato, G., Maggio, A., Puglisi, A., Gaglio, S. (2003). Sub-symbolic Encoding of Words. In: Cappelli, A., Turini, F. (eds) AI*IA 2003: Advances in Artificial Intelligence. AI*IA 2003. Lecture Notes in Computer Science, vol 2829. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-39853-0_37
DOI: https://doi.org/10.1007/978-3-540-39853-0_37
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-20119-9
Online ISBN: 978-3-540-39853-0
eBook Packages: Springer Book Archive