Abstract
One of the most challenging tasks in data analysis is finding clusters in mixed data sets, which contain both numerical and categorical variables and lack a label variable to serve as a guide. Such clusters can summarize all the variables of a data set in a single variable, making information easier to find than when each variable is summarized separately. This research proposes a methodology for clustering mixed data sets that yields better results than state-of-the-art methods.
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
González León, J.G., Mata Rivera, M.F. (2019). Clustering Methodology in Mixed Data Sets. In: Mata-Rivera, M., Zagal-Flores, R., Barría-Huidobro, C. (eds) Telematics and Computing. WITCOM 2019. Communications in Computer and Information Science, vol 1053. Springer, Cham. https://doi.org/10.1007/978-3-030-33229-7_13
DOI: https://doi.org/10.1007/978-3-030-33229-7_13
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-33228-0
Online ISBN: 978-3-030-33229-7
eBook Packages: Computer Science, Computer Science (R0)