Abstract
In recent years, deep learning methods have become very popular in NLP classification tasks, due to their ability to reach high performance from very simple input representations. One drawback of training deep architectures is the large amount of annotated data required for effective training. One recent promising method for enabling semi-supervised learning in deep architectures has been formalized within Semi-Supervised Generative Adversarial Networks (SS-GANs).
In this paper, an SS-GAN is shown to be effective in semantic processing tasks, operating on low-dimensional embeddings derived by the unsupervised approximation of rich Reproducing Kernel Hilbert Spaces. Preliminary analyses on a sentence classification task show that the proposed Kernel-based GAN achieves promising results when only 1% of the labeled examples are used.
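The low-dimensional embeddings mentioned in the abstract are obtained via the Nyström approximation of the kernel space: a small set of landmark examples is used to project any input into a compact vector whose inner products approximate the original kernel. A minimal sketch of this projection is shown below; the RBF kernel, the landmark count, and all variable names are illustrative assumptions, not the paper's exact configuration.

```python
import numpy as np

def nystrom_embedding(K_landmark, K_cross):
    """Nystroem projection into a low-dimensional space.

    K_landmark : (m, m) kernel matrix among the m landmark examples
    K_cross    : (n, m) kernel evaluations between the n inputs
                 and the landmarks
    Returns an (n, k) embedding matrix (k <= m) whose inner
    products approximate the full (n, n) kernel matrix.
    """
    # Eigendecomposition of the landmark kernel matrix (symmetric PSD)
    eigvals, eigvecs = np.linalg.eigh(K_landmark)
    # Drop the numerically null part of the spectrum for stability
    keep = eigvals > 1e-12
    # Mapping: x -> K(x, landmarks) @ U @ S^{-1/2}
    proj = eigvecs[:, keep] / np.sqrt(eigvals[keep])
    return K_cross @ proj

# Toy usage with an RBF kernel (hypothetical choice of kernel)
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
landmarks = X[:10]

def rbf(A, B, gamma=0.5):
    d = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d)

emb = nystrom_embedding(rbf(landmarks, landmarks), rbf(X, landmarks))
# emb @ emb.T approximates the full 100x100 kernel matrix
```

The resulting dense vectors can then be fed to the SS-GAN discriminator in place of explicit kernel computations, which is what makes the adversarial training tractable in the kernel space.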
Notes
1. The input layer and the Nyström layer are not modified during the learning process, and they are not regularized.
2. For the remaining kernel parameters, the same settings as in [4] are used.
3. The word embeddings used for the CNN are the same as those used for the kernel computation.
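Note 1 above (a frozen, unregularized Nyström layer with training applied only to the layers on top) can be sketched as follows. The projection matrix, the dimensionalities, and the plain softmax classifier are all hypothetical stand-ins for the paper's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(1)

# Fixed Nystroem projection: computed once from the kernel, then frozen
P = rng.normal(size=(300, 50))   # kernel evaluations -> 50-dim embedding

# Trainable classifier weights (the only parameters updated)
W = np.zeros((50, 6))            # 6 output classes (illustrative)

def forward(K_x):
    """K_x: kernel evaluations against the landmarks, shape (batch, 300)."""
    h = K_x @ P                  # frozen Nystroem layer: no update, no penalty
    return h @ W                 # trainable logits

def sgd_step(K_x, y_onehot, lr=0.1):
    """One softmax-regression SGD step that touches only W, never P."""
    global W
    logits = forward(K_x)
    probs = np.exp(logits - logits.max(1, keepdims=True))
    probs /= probs.sum(1, keepdims=True)
    grad = (K_x @ P).T @ (probs - y_onehot) / len(K_x)
    W -= lr * grad               # P is never updated or regularized
```

Keeping the projection fixed means gradients flow only into the layers above it, so the kernel-derived geometry of the embedding space is preserved throughout training.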
References
Annesi, P., Croce, D., Basili, R.: Semantic compositionality in tree kernels. In: Proceedings of CIKM 2014. ACM (2014)
Chapelle, O., Schölkopf, B., Zien, A.: Semi-Supervised Learning, 1st edn. The MIT Press, Cambridge (2010)
Collins, M., Duffy, N.: Convolution kernels for natural language. In: Proceedings of Neural Information Processing Systems (NIPS 2001), pp. 625–632 (2001)
Croce, D., Filice, S., Castellucci, G., Basili, R.: Deep learning in semantic kernel spaces. In: Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 345–354. Association for Computational Linguistics (2017). https://doi.org/10.18653/v1/P17-1032, http://aclweb.org/anthology/P17-1032
Croce, D., Moschitti, A., Basili, R.: Structured lexical similarity via convolution kernels on dependency trees. In: Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing, pp. 1034–1046. Association for Computational Linguistics, Edinburgh, July 2011. https://www.aclweb.org/anthology/D11-1096
Dai, Z., Yang, Z., Yang, F., Cohen, W.W., Salakhutdinov, R.: Good semi-supervised learning that requires a bad GAN. CoRR abs/1705.09783 (2017). http://arxiv.org/abs/1705.09783
Devlin, J., Chang, M., Lee, K., Toutanova, K.: BERT: pre-training of deep bidirectional transformers for language understanding. CoRR abs/1810.04805 (2018). http://arxiv.org/abs/1810.04805
Goldberg, Y.: A primer on neural network models for natural language processing. J. Artif. Intell. Res. 57(1), 345–420 (2016). http://dl.acm.org/citation.cfm?id=3176748.3176757
Goodfellow, I., et al.: Generative adversarial nets. In: Ghahramani, Z., Welling, M., Cortes, C., Lawrence, N.D., Weinberger, K.Q. (eds.) Advances in Neural Information Processing Systems, vol. 27, pp. 2672–2680. Curran Associates, Inc. (2014). http://papers.nips.cc/paper/5423-generative-adversarial-nets.pdf
Goodfellow, I.J.: NIPS 2016 tutorial: generative adversarial networks. CoRR abs/1701.00160 (2017). http://arxiv.org/abs/1701.00160
Kim, Y.: Convolutional neural networks for sentence classification. In: Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing, EMNLP 2014, Doha, Qatar, 25–29 October 2014. A meeting of SIGDAT, a Special Interest Group of the ACL, pp. 1746–1751 (2014). http://aclweb.org/anthology/D/D14/D14-1181.pdf
Kim, Y., Jernite, Y., Sontag, D., Rush, A.M.: Character-aware neural language models. In: Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, Phoenix, Arizona, USA, 12–17 February 2016, pp. 2741–2749 (2016). http://www.aaai.org/ocs/index.php/AAAI/AAAI16/paper/view/12489
Kipf, T.N., Welling, M.: Semi-supervised classification with graph convolutional networks. CoRR abs/1609.02907 (2016). http://arxiv.org/abs/1609.02907
Klambauer, G., Unterthiner, T., Mayr, A., Hochreiter, S.: Self-normalizing neural networks. In: Guyon, I., et al. (eds.) Advances in Neural Information Processing Systems, vol. 30, pp. 971–980. Curran Associates, Inc. (2017). http://papers.nips.cc/paper/6698-self-normalizing-neural-networks.pdf
Li, X., Roth, D.: Learning question classifiers: the role of semantic information. Nat. Lang. Eng. 12(3), 229–249 (2006)
Mikolov, T., Chen, K., Corrado, G.S., Dean, J.: Efficient estimation of word representations in vector space. CoRR abs/1301.3781 (2013)
Rappaport Hovav, M., Levin, B.: The syntax-semantics interface, Chap. 19, pp. 593–624. Wiley (2015). https://doi.org/10.1002/9781118882139.ch19, https://onlinelibrary.wiley.com/doi/abs/10.1002/9781118882139.ch19
Salimans, T., Goodfellow, I., Zaremba, W., Cheung, V., Radford, A., Chen, X.: Improved techniques for training GANs. In: Lee, D.D., Sugiyama, M., Luxburg, U.V., Guyon, I., Garnett, R. (eds.) Advances in Neural Information Processing Systems, vol. 29, pp. 2234–2242. Curran Associates, Inc. (2016). http://papers.nips.cc/paper/6125-improved-techniques-for-training-gans.pdf
Shawe-Taylor, J., Cristianini, N.: Kernel Methods for Pattern Analysis. Cambridge University Press, Cambridge (2004)
Vapnik, V.N.: Statistical Learning Theory. Wiley-Interscience, Hoboken (1998)
Weston, J., Ratle, F., Collobert, R.: Deep learning via semi-supervised embedding. In: Proceedings of the 25th International Conference on Machine Learning, ICML 2008, pp. 1168–1175. ACM, New York (2008). https://doi.org/10.1145/1390156.1390303, http://doi.acm.org/10.1145/1390156.1390303
Williams, C.K.I., Seeger, M.: Using the Nyström method to speed up kernel machines. In: Leen, T.K., Dietterich, T.G., Tresp, V. (eds.) Advances in Neural Information Processing Systems, vol. 13, pp. 682–688. MIT Press (2001)
Xu, K., et al.: Show, attend and tell: neural image caption generation with visual attention. CoRR abs/1502.03044 (2015). http://dblp.uni-trier.de/db/journals/corr/corr1502.html#XuBKCCSZB15
Yang, Z., Cohen, W.W., Salakhutdinov, R.: Revisiting semi-supervised learning with graph embeddings. In: Proceedings of the 33rd International Conference on International Conference on Machine Learning, ICML 2016, vol. 48, pp. 40–48. JMLR.org (2016). http://dl.acm.org/citation.cfm?id=3045390.3045396
© 2019 Springer Nature Switzerland AG
Cite this paper
Croce, D., Castellucci, G., Basili, R. (2019). Kernel-Based Generative Adversarial Networks for Weakly Supervised Learning. In: Alviano, M., Greco, G., Scarcello, F. (eds) AI*IA 2019 – Advances in Artificial Intelligence. AI*IA 2019. Lecture Notes in Computer Science(), vol 11946. Springer, Cham. https://doi.org/10.1007/978-3-030-35166-3_24
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-35165-6
Online ISBN: 978-3-030-35166-3