An Evaluation of Sampling Methods for Data Mining with Fuzzy C-Means

Josien, K.; Wang, G.; Liao, T. W.; Triantaphyllou, E.; Liu, M. C.

doi:10.1007/978-1-4757-4911-3_15

Part of the book series: Massive Computing (MACO, volume 3)

Abstract

Using fuzzy c-means as the data-mining tool, this study evaluates how effectively sampling methods produce the knowledge of interest. Effectiveness is assessed in terms of the representativeness of the sampled data and the accuracy and errors of the sampled data sets when subjected to the fuzzy clustering algorithm. Two population data sets from the weld inspection domain were used for the evaluation. Based on the results obtained, a number of observations are made.
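
As a rough illustration of the kind of comparison the abstract describes, the following Python sketch runs a plain fuzzy c-means on a full data set and on a sample drawn from it, then compares the resulting cluster centers. Everything specific here is an assumption made for illustration and is not taken from the chapter: the synthetic two-cluster data (the weld inspection data are not reproduced), simple random sampling without replacement at a 10% rate, the fuzzifier m = 2, and the names `fuzzy_c_means` and `center_gap`.

```python
import numpy as np


def fuzzy_c_means(X, c, m=2.0, n_iter=200, tol=1e-5, seed=0):
    """Plain fuzzy c-means. Returns (centers, memberships), sorted by first coordinate."""
    rng = np.random.default_rng(seed)
    # Random initial membership matrix; each row sums to 1.
    U = rng.random((X.shape[0], c))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        # Centers are means of the data weighted by memberships raised to m.
        Um = U ** m
        centers = (Um.T @ X) / Um.sum(axis=0)[:, None]
        # Euclidean distance from every point to every center.
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        dist = np.fmax(dist, 1e-12)  # guard against division by zero
        # Membership update: u_ik proportional to d_ik^(-2/(m-1)).
        inv = dist ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    # Sort clusters so that results from different runs can be compared.
    order = np.argsort(centers[:, 0])
    return centers[order], U[:, order]


# Hypothetical population: two Gaussian clusters (illustrative data only).
rng = np.random.default_rng(42)
population = np.vstack([
    rng.normal(0.0, 1.0, size=(2000, 2)),
    rng.normal(5.0, 1.0, size=(2000, 2)),
])

# Assumed sampling method: simple random sampling without replacement (10%).
idx = rng.choice(len(population), size=400, replace=False)
sample = population[idx]

pop_centers, U_pop = fuzzy_c_means(population, c=2)
sample_centers, U_sample = fuzzy_c_means(sample, c=2)

# One crude indicator of how well the sample represents the population:
# the largest coordinate-wise gap between matched cluster centers.
center_gap = np.abs(pop_centers - sample_centers).max()
print("population centers:\n", pop_centers)
print("sample centers:\n", sample_centers)
print("largest center gap:", center_gap)
```

The center gap is only one possible measure; the representativeness, accuracy, and error criteria actually used in the chapter are not specified on this page, so this sketch should not be read as the authors' procedure.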

Author information

Authors and Affiliations

Industrial & Manufacturing Systems Engineering Department, Louisiana State University, Baton Rouge, LA, 70803, USA
K. Josien, G. Wang, T. W. Liao & E. Triantaphyllou
Manufacturing R&D, Boeing Company, Wichita, KS, USA
M. C. Liu

Editor information

Editors and Affiliations

Ben-Gurion University, Israel
Dan Braha

About this chapter

Cite this chapter

Josien, K., Wang, G., Liao, T.W., Triantaphyllou, E., Liu, M.C. (2001). An Evaluation of Sampling Methods for Data Mining with Fuzzy C-Means. In: Braha, D. (eds) Data Mining for Design and Manufacturing. Massive Computing, vol 3. Springer, Boston, MA. https://doi.org/10.1007/978-1-4757-4911-3_15

DOI: https://doi.org/10.1007/978-1-4757-4911-3_15
Publisher Name: Springer, Boston, MA
Print ISBN: 978-1-4419-5205-9
Online ISBN: 978-1-4757-4911-3
eBook Packages: Springer Book Archive
