A set-cover-based approach for the test-cost-sensitive attribute reduction problem

  • Methodologies and Application
Soft Computing

Abstract

In data mining applications, test-cost-sensitive attribute reduction is an important task that aims to decrease the test cost of data. In operational research, the set cover problem is a typical optimization problem and has been studied for much longer than the attribute reduction problem. In this paper, we employ methods for the set cover problem to deal with test-cost-sensitive attribute reduction. First, we transform the test-cost-sensitive reduction problem into an equivalent set cover problem by using a constructive approach. It is shown that computing a reduct of a decision system with minimal test cost is equivalent to computing an optimal solution of the set cover problem. Then, a set-cover-based heuristic algorithm is introduced to solve the test-cost-sensitive reduction problem. Finally, we conduct several numerical experiments on data sets from the UCI Machine Learning Repository. Experimental results indicate that the set-cover-based algorithm performs better in most cases, and that the algorithm is efficient on data sets with many attributes.
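The link between attribute reduction and set cover can be illustrated with a minimal sketch. It is not the authors' construction or algorithm, which the abstract does not spell out. It assumes the standard discernibility-based view: every pair of objects with different decision values is an element to be covered, and each attribute "covers" the pairs on which its values differ. The selection step is the classic greedy weighted set cover heuristic (lowest cost per newly covered element). The function names, the toy table, and the cost values are hypothetical, and the table is assumed to be consistent.

```python
from itertools import combinations


def discernibility_sets(rows, decisions, n_attrs):
    """Map each attribute index to the set of object pairs it tells apart.

    Only pairs whose decision values differ need to be discerned.
    """
    covers = {a: set() for a in range(n_attrs)}
    for i, j in combinations(range(len(rows)), 2):
        if decisions[i] == decisions[j]:
            continue
        for a in range(n_attrs):
            if rows[i][a] != rows[j][a]:
                covers[a].add((i, j))
    return covers


def greedy_min_cost_cover(covers, costs):
    """Greedy weighted set cover: repeatedly pick the attribute with the
    lowest test cost per newly covered pair.

    Returns a covering attribute list; it is a heuristic result, so it is
    not guaranteed to be cost-optimal or free of redundant attributes.
    """
    uncovered = set().union(*covers.values())
    chosen = []
    while uncovered:
        best = min(
            (a for a in covers if covers[a] & uncovered),
            key=lambda a: costs[a] / len(covers[a] & uncovered),
        )
        chosen.append(best)
        uncovered -= covers[best]
    return chosen


# Hypothetical toy decision table: 4 objects, 3 condition attributes.
rows = [(0, 0, 1), (0, 1, 1), (1, 0, 0), (1, 1, 0)]
decisions = [0, 0, 1, 1]
costs = [5, 3, 2]  # assumed test cost of each attribute

covers = discernibility_sets(rows, decisions, n_attrs=3)
selected = greedy_min_cost_cover(covers, costs)
total_cost = sum(costs[a] for a in selected)
print(selected, total_cost)
```

In this toy table, attributes 0 and 2 each discern all four cross-decision pairs, so the greedy step selects attribute 2 because it is the cheaper of the two. A post-processing pass that drops redundant attributes would be needed to guarantee that the result is a reduct in the strict sense.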

Acknowledgments

This work is supported by grants from the National Natural Science Foundation of China (Nos. 61573321, 61272021, 61202206 and 61173181), the Zhejiang Provincial Natural Science Foundation of China (Nos. LZ12F03002, LY14F030001), the Open Foundation from Marine Sciences in the Most Important Subjects of Zhejiang (No. 20130109), and the Scientific Research Start-up Fund of Zhejiang Ocean University (No. 21065014715).

Author information

Corresponding author

Correspondence to Anhui Tan.

Ethics declarations

Conflict of interest

Author Anhui Tan declares that he has no conflict of interest. Author Weizhi Wu declares that he has no conflict of interest. Author Yuzhi Tao declares that she has no conflict of interest.

Ethical standard

This article does not contain any studies with human participants or animals performed by any of the authors.

Additional information

Communicated by V. Loia.

About this article

Cite this article

Tan, A., Wu, W. & Tao, Y. A set-cover-based approach for the test-cost-sensitive attribute reduction problem. Soft Comput 21, 6159–6173 (2017). https://doi.org/10.1007/s00500-016-2173-3

  • DOI: https://doi.org/10.1007/s00500-016-2173-3
