
Dynamically Adaptive Genetic Algorithm to Select Training Data for SVMs

  • Conference paper
Advances in Artificial Intelligence -- IBERAMIA 2014 (IBERAMIA 2014)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8864)

Abstract

This paper addresses the important problem of training set selection for support vector machines (SVMs). Selection is a critical step for large and noisy data sets because of the high time and memory complexity of SVM training. Several methods have been proposed so far, most of them based on analysis of the data geometry in either the input or the kernel space. Here, we propose a new dynamically adaptive genetic algorithm (DAGA) to select valuable training sets. We demonstrate that DAGA not only selects the training data quickly but also dynamically determines the desired training set size without any prior information. We analyze the impact of the support vectors ratio, defined as the percentage of support vectors in the training set, on DAGA's performance. We also investigate and discuss the possibility of incorporating reduced SVMs into the proposed algorithm. An extensive experimental study shows that DAGA offers fast and effective training set optimization that is independent of the size of the entire training set.
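To make the two ideas in the abstract concrete, the following is a minimal sketch and not the authors' DAGA. It assumes a plain genetic algorithm over fixed-size index subsets, whereas DAGA adapts the subset size dynamically. It also assumes a caller-supplied `fitness` function; in practice this would train an SVM on the subset and score it on validation data, but here a toy stand-in is used. All names (`support_vector_ratio`, `select_training_subset`, `toy_fitness`, `valuable`) and all parameter values are hypothetical.

```python
import random


def support_vector_ratio(n_support_vectors, training_set_size):
    """Percentage of training vectors that became support vectors."""
    if training_set_size <= 0:
        raise ValueError("training set must be non-empty")
    return 100.0 * n_support_vectors / training_set_size


def select_training_subset(n_samples, subset_size, fitness,
                           pop_size=10, generations=20, rng=None):
    """Evolve a subset of sample indices that maximizes `fitness`.

    Simplification (assumption): subset_size is fixed, unlike DAGA,
    which determines the training set size during the search.
    """
    rng = rng or random.Random(0)
    population = [frozenset(rng.sample(range(n_samples), subset_size))
                  for _ in range(pop_size)]
    best = max(population, key=fitness)
    for _ in range(generations):
        # Selection: keep the fitter half as parents.
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[: pop_size // 2]
        children = []
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            # Crossover: draw the child from the union of both parents.
            pool = sorted(a | b)
            child = set(rng.sample(pool, subset_size))
            # Mutation: occasionally swap one index for an unused one.
            if rng.random() < 0.3:
                outside = [i for i in range(n_samples) if i not in child]
                if outside:
                    child.remove(rng.choice(sorted(child)))
                    child.add(rng.choice(outside))
            children.append(frozenset(child))
        population = children
        # Elitism: remember the best subset seen so far.
        candidate = max(population, key=fitness)
        if fitness(candidate) > fitness(best):
            best = candidate
    return best


# Toy stand-in for "train an SVM on the subset and measure accuracy":
# every fifth index is treated as a valuable training vector.
valuable = set(range(0, 100, 5))


def toy_fitness(subset):
    return len(subset & valuable)


best = select_training_subset(100, 10, toy_fitness)
print(sorted(best), toy_fitness(best))
print(support_vector_ratio(25, 100))
```

Passing the fitness as a callable keeps the search loop independent of the classifier, so an SVM-based score could be substituted without changing the rest of the sketch.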


Author information

Corresponding author

Correspondence to Jakub Nalepa.

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Kawulok, M., Nalepa, J. (2014). Dynamically Adaptive Genetic Algorithm to Select Training Data for SVMs. In: Bazzan, A., Pichara, K. (eds) Advances in Artificial Intelligence -- IBERAMIA 2014. IBERAMIA 2014. Lecture Notes in Computer Science, vol 8864. Springer, Cham. https://doi.org/10.1007/978-3-319-12027-0_20

  • DOI: https://doi.org/10.1007/978-3-319-12027-0_20

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-12026-3

  • Online ISBN: 978-3-319-12027-0

  • eBook Packages: Computer Science, Computer Science (R0)
