An Evaluation of Sampling Methods for Data Mining with Fuzzy C-Means

  • Chapter
Data Mining for Design and Manufacturing

Part of the book series: Massive Computing (MACO, volume 3)

Abstract

Using fuzzy c-means as the data-mining tool, this study evaluates how effectively sampling methods produce the knowledge of interest. Effectiveness is measured by the representativeness of the sampled data and by the accuracy and errors of the sampled data sets when subjected to the fuzzy clustering algorithm. Two population data sets from the weld inspection domain were used for the evaluation. Based on the results obtained, a number of observations are made.
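
As a rough illustration of the kind of comparison the abstract describes, the sketch below runs a textbook fuzzy c-means (the standard alternating membership and center updates) on a full population and on a simple random sample, then compares the resulting cluster centers. It is not the chapter's procedure: the synthetic one-dimensional data, the 10% simple random sample, the quantile-based initialization, and all function and variable names are assumptions made for this example only.

```python
import random

def fuzzy_c_means(points, c, m=2.0, iters=100, tol=1e-6):
    """Textbook fuzzy c-means on a list of equal-length tuples.

    Returns the c cluster centers, sorted. The quantile-based
    initialization is an assumption chosen to keep the demo deterministic.
    """
    ordered = sorted(points)
    n, dims = len(ordered), len(ordered[0])
    centers = [list(ordered[(2 * i + 1) * n // (2 * c)]) for i in range(c)]
    for _ in range(iters):
        # Membership update: u_ik is proportional to d_ik^(-2/(m-1)).
        memberships = []
        for p in points:
            d2 = [max(sum((a - b) ** 2 for a, b in zip(p, ctr)), 1e-12)
                  for ctr in centers]
            inv = [d ** (-1.0 / (m - 1)) for d in d2]
            total = sum(inv)
            memberships.append([v / total for v in inv])
        # Center update: mean of the points weighted by u_ik^m.
        new_centers = []
        for i in range(c):
            w = [u[i] ** m for u in memberships]
            wsum = sum(w)
            new_centers.append(
                [sum(wk * p[d] for wk, p in zip(w, points)) / wsum
                 for d in range(dims)])
        shift = max(abs(a - b)
                    for old, new in zip(centers, new_centers)
                    for a, b in zip(old, new))
        centers = new_centers
        if shift < tol:
            break
    return sorted(centers)

# Hypothetical population: two well-separated 1-D groups (not the weld data).
rng = random.Random(42)
population = [(rng.gauss(0.0, 1.0),) for _ in range(500)]
population += [(rng.gauss(10.0, 1.0),) for _ in range(500)]

# Simple random sample of 10% of the population.
sample = rng.sample(population, 100)

full_centers = fuzzy_c_means(population, c=2)
sample_centers = fuzzy_c_means(sample, c=2)

# One crude representativeness check: how far the sample-based centers
# drift from the centers found on the full population.
center_drift = [abs(f[0] - s[0]) for f, s in zip(full_centers, sample_centers)]
print(full_centers, sample_centers, center_drift)
```

Center drift is only one possible measure; the chapter itself reports representativeness, accuracy, and errors, whose exact definitions are not given in the abstract.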


Copyright information

© 2001 Springer Science+Business Media Dordrecht

About this chapter

Cite this chapter

Josien, K., Wang, G., Liao, T.W., Triantaphyllou, E., Liu, M.C. (2001). An Evaluation of Sampling Methods for Data Mining with Fuzzy C-Means. In: Braha, D. (ed.) Data Mining for Design and Manufacturing. Massive Computing, vol 3. Springer, Boston, MA. https://doi.org/10.1007/978-1-4757-4911-3_15

  • DOI: https://doi.org/10.1007/978-1-4757-4911-3_15

  • Publisher Name: Springer, Boston, MA

  • Print ISBN: 978-1-4419-5205-9

  • Online ISBN: 978-1-4757-4911-3

  • eBook Packages: Springer Book Archive
