Searching for the most significant rules: an evolutionary approach for subgroup discovery

  • Methodologies and Application
  • Soft Computing

Abstract

This paper describes GAR-SD\(^{+}\), a new genetic algorithm for subgroup discovery tasks. The main feature of the method is that it works with both discrete and continuous attributes without prior discretization: the ranges of numeric attributes are obtained during the rule induction process itself, so that the resulting intervals are those best suited to maximizing the quality measures. An experimental study was carried out to verify the performance of the method. GAR-SD\(^{+}\) was compared with other subgroup discovery methods on several measures (number of rules, number of attributes, significance, unusualness, support and confidence), and it obtained good results relative to these existing algorithms.
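
To make the quality measures concrete, the following minimal Python sketch evaluates a single rule that mixes a numeric interval with a discrete condition. It is not the authors' implementation and contains no genetic search. The `Rule` class, the `quality` function, the toy data and the attribute names are all hypothetical. The formulas are assumptions based on definitions commonly used in the subgroup discovery literature: support as n(Cond·Class)/N, confidence as n(Cond·Class)/n(Cond), and unusualness as weighted relative accuracy, p(Cond)·(p(Class|Cond) − p(Class)). The paper itself may define these measures differently, and significance is omitted.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple, Union

Value = Union[float, str]
Example = Dict[str, Value]


@dataclass
class Rule:
    """Hypothetical rule: interval and equality conditions implying a class."""
    target: str
    # Numeric conditions: attribute -> (lower, upper), both bounds inclusive.
    intervals: Dict[str, Tuple[float, float]] = field(default_factory=dict)
    # Discrete conditions: attribute -> required value.
    categories: Dict[str, str] = field(default_factory=dict)

    def covers(self, example: Example) -> bool:
        """True if the example satisfies every condition of the rule."""
        for attr, (low, high) in self.intervals.items():
            if not low <= example[attr] <= high:
                return False
        for attr, value in self.categories.items():
            if example[attr] != value:
                return False
        return True


def quality(rule: Rule, data: List[Example], class_attr: str = "class") -> Dict[str, float]:
    """Support, confidence and unusualness (WRAcc) under the assumed definitions."""
    n = len(data)
    covered = [e for e in data if rule.covers(e)]
    n_cond = len(covered)
    n_class = sum(1 for e in data if e[class_attr] == rule.target)
    n_both = sum(1 for e in covered if e[class_attr] == rule.target)

    support = n_both / n
    confidence = n_both / n_cond if n_cond else 0.0
    unusualness = (n_cond / n) * (confidence - n_class / n)
    return {"support": support, "confidence": confidence, "unusualness": unusualness}


# Toy data, invented purely for illustration.
data: List[Example] = [
    {"age": 25, "sex": "F", "class": "yes"},
    {"age": 30, "sex": "F", "class": "yes"},
    {"age": 35, "sex": "M", "class": "yes"},
    {"age": 40, "sex": "F", "class": "no"},
    {"age": 45, "sex": "M", "class": "no"},
    {"age": 50, "sex": "M", "class": "no"},
    {"age": 28, "sex": "F", "class": "yes"},
    {"age": 60, "sex": "F", "class": "no"},
]

# IF age in [20, 35] AND sex = F THEN class = yes
rule = Rule(target="yes", intervals={"age": (20, 35)}, categories={"sex": "F"})
q = quality(rule, data)
print(q)
```

In an evolutionary setting, the interval bounds (here 20 and 35) would be the quantities adjusted by the search, which is how interval selection and quality maximization can be coupled without a separate discretization step.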

Acknowledgments

This work was partially funded by the Regional Government of Andalusia (Junta de Andalucía, Grant Number TIC-7629) and the Spanish Ministry of Economy and Competitiveness (Grant Number TIN2013-47153-C3-2-R).

Author information

Corresponding author

Correspondence to Jacinto Mata.

Ethics declarations

Conflict of interest

The authors certify that they have no affiliations with or involvement in any organization or entity with any financial interest (such as honoraria; educational grants; participation in speakers' bureaus; membership, employment, consultancies, stock ownership, or other equity interest; and expert testimony or patent-licensing arrangements), or non-financial interest (such as personal or professional relationships, affiliations, knowledge or beliefs) in the subject matter or materials discussed in this manuscript.

Additional information

Communicated by V. Loia.

About this article

Cite this article

Pachón, V., Mata, J. & Domínguez, J.L. Searching for the most significant rules: an evolutionary approach for subgroup discovery. Soft Comput 21, 2609–2618 (2017). https://doi.org/10.1007/s00500-015-1961-5

  • DOI: https://doi.org/10.1007/s00500-015-1961-5
