Abstract
In machine learning, autoencoders are among the most effective feature extraction methods: they transform data from its original space into a latent space, and the transformed data is then used for downstream tasks instead of the original data. However, there is little research on choosing the best number of latent space dimensions (k) for an autoencoder, even though k can affect the results of these tasks. In this paper, we focus on the impact of k on the accuracy of a downstream task. Concretely, we survey recently developed autoencoders and their characteristics, and conduct experiments using different autoencoders and values of k to extract features from different datasets. We then report the accuracy of a classifier on the extracted features and the reconstruction error of the autoencoders as functions of k. Based on the empirical results, we recommend the best latent space dimension k for each dataset and each autoencoder.
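To make the role of k concrete, the sketch below measures reconstruction error as a function of the latent dimension. It is not the paper's experimental setup (which uses trained deep autoencoders on image datasets); it is a minimal NumPy illustration using a linear autoencoder, whose squared-error-optimal encoder/decoder pair is known to be given by the top-k principal components. The data here is synthetic, chosen only to show the trend.

```python
import numpy as np

def linear_autoencoder(X, k):
    """Encode X into a k-dimensional latent space with a linear
    autoencoder. For squared-error loss, the optimal linear
    encoder/decoder is spanned by the top-k principal components,
    so we obtain it directly from the SVD of the centered data."""
    mean = X.mean(axis=0)
    Xc = X - mean
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    W = Vt[:k].T                     # decoder weights (d x k); encoder is W.T
    Z = Xc @ W                       # latent codes (n x k)
    X_hat = Z @ W.T + mean           # reconstruction in the original space
    return Z, X_hat

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 20))       # 200 samples, 20 original dimensions

# Reconstruction error shrinks as k grows; with k equal to the
# original dimension the linear autoencoder reconstructs exactly.
errors = []
for k in (2, 5, 10, 20):
    _, X_hat = linear_autoencoder(X, k)
    errors.append(np.mean((X - X_hat) ** 2))
```

In practice the downstream classification accuracy does not improve monotonically with k the way reconstruction error does, which is precisely why the paper studies the two quantities together.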
Notes
1. The source code is available at: https://github.com/KienMN/Autoencoder-Experiments.
Acknowledgement
This research was supported by Korea Institute of Science and Technology Information (KISTI).
Copyright information
© 2020 Springer Nature Switzerland AG
Cite this paper
Mai Ngoc, K., Hwang, M. (2020). Finding the Best k for the Dimension of the Latent Space in Autoencoders. In: Nguyen, N.T., Hoang, B.H., Huynh, C.P., Hwang, D., Trawiński, B., Vossen, G. (eds) Computational Collective Intelligence. ICCCI 2020. Lecture Notes in Computer Science, vol 12496. Springer, Cham. https://doi.org/10.1007/978-3-030-63007-2_35
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-63006-5
Online ISBN: 978-3-030-63007-2
eBook Packages: Computer Science (R0)