Abstract
Whereas k-means assigns data points to cluster center points, k-plane clustering (kPC) assigns them to center planes. However, kPC considers only within-cluster data points. In this paper, we propose a novel plane-based clustering method, called k-proximal plane clustering (kPPC). In kPPC, each center plane is required not only to be close to the data points of its own cluster but also to be far from the data points of the other clusters, and the planes are obtained by solving several eigenvalue problems. The objective function of kPPC thus incorporates information from both within-cluster and between-cluster data points. In addition, kPPC is extended to the nonlinear case via the kernel trick. A deterministic initialization strategy based on a Laplacian graph is also established for kPPC. Experiments on several artificial and benchmark datasets show that kPPC performs much better than both kPC and k-means.
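The plane-update idea summarized above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. With `c = 0` it performs the standard kPC update, where the plane normal is the eigenvector of the centered scatter matrix with the smallest eigenvalue. With `c > 0` it subtracts a between-cluster term; the exact objective `||X_in w - b||^2 - c * ||X_out w - b||^2` with `||w|| = 1`, the weight `c`, and the function names `fit_plane` and `plane_clustering` are assumptions made for illustration. Initial labels are supplied by the caller here rather than derived from a Laplacian graph.

```python
import numpy as np


def fit_plane(X_in, X_out=None, c=0.0):
    """Fit a plane w.x = b with ||w|| = 1.

    Minimizes ||X_in w - b||^2 - c * ||X_out w - b||^2 (assumed objective).
    With c = 0 this reduces to the kPC plane update.
    """
    m_in = X_in.shape[0]
    M = X_in.T @ X_in              # second-moment matrix of own cluster
    s = X_in.sum(axis=0)           # column sums, used to eliminate b
    d = float(m_in)
    if X_out is not None and c > 0 and len(X_out) > 0:
        M = M - c * (X_out.T @ X_out)
        s = s - c * X_out.sum(axis=0)
        d = d - c * len(X_out)
    if d <= 0:
        # The objective is unbounded in b unless m_in - c * m_out > 0.
        raise ValueError("c is too large: need m_in - c * m_out > 0")
    # Substituting the optimal b = s.w / d leaves a symmetric eigenproblem.
    M = M - np.outer(s, s) / d
    _, eigvecs = np.linalg.eigh(M)  # eigenvalues in ascending order
    w = eigvecs[:, 0]               # eigenvector of the smallest eigenvalue
    b = float(s @ w) / d
    return w, b


def plane_clustering(X, labels, k, c=0.0, n_iter=50):
    """Alternate plane fitting and nearest-plane assignment."""
    labels = np.asarray(labels).copy()
    for _ in range(n_iter):
        dists = np.empty((len(X), k))
        for j in range(k):
            inside = labels == j
            if not inside.any():
                raise ValueError(f"cluster {j} became empty")
            w, b = fit_plane(X[inside], X[~inside], c)
            dists[:, j] = np.abs(X @ w - b)   # distance to plane j
        new_labels = dists.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break                              # assignments are stable
        labels = new_labels
    return labels
```

Calling `plane_clustering(X, initial_labels, k=2)` on points drawn from two lines returns one label per row of `X`. Because the loop only converges to a local solution, the result depends on the initial labels, which is why a deterministic initialization matters in practice.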
Acknowledgments
This work is supported by the National Natural Science Foundation of China (Nos. 11201426, 11371365, and 11501310), the Zhejiang Provincial Natural Science Foundation of China (Nos. LY15F030013, LQ14G010004, and LY16A010020), the National Statistical Science Research Project of China (No. 2013LZ13), and the Natural Science Foundation of Inner Mongolia Autonomous Region of China (No. 2015BS0606).
Liu, LM., Guo, YR., Wang, Z. et al. k-Proximal plane clustering. Int. J. Mach. Learn. & Cyber. 8, 1537–1554 (2017). https://doi.org/10.1007/s13042-016-0526-y