Searching for the most significant rules: an evolutionary approach for subgroup discovery

Pachón, Victoria; Mata, Jacinto; Domínguez, Juan Luis

doi:10.1007/s00500-015-1961-5

Methodologies and Application
Published: 11 December 2015

Soft Computing, volume 21, pages 2609–2618 (2017)

Victoria Pachón¹,
Jacinto Mata¹ &
Juan Luis Domínguez¹

Abstract

In this paper, a new genetic algorithm (GAR-SD\(^{+}\)) for subgroup discovery tasks is described. The main feature of this method is that it can work with both discrete and continuous attributes without prior discretization. The ranges of numeric attributes are obtained during the rule induction process itself. In this way, we ensure that these intervals are the most suitable for maximizing the quality measures. An experimental study was carried out to verify the performance of the method. GAR-SD\(^{+}\) was compared with other subgroup discovery methods on several measures (number of rules, number of attributes, significance, unusualness, support and confidence). For subgroup discovery tasks, GAR-SD\(^{+}\) obtained good results compared with existing algorithms.
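As an illustration of the measures named in the abstract, the following minimal sketch scores a single rule whose condition is a set of numeric intervals. It is not the GAR-SD\(^{+}\) implementation: the function name `rule_quality`, the data layout and the toy data are assumptions made for this example, and the formulas are the definitions commonly used in the subgroup discovery literature (support as the fraction of all examples that satisfy both condition and class, confidence as the fraction of covered examples in the target class, unusualness as weighted relative accuracy), which may differ in detail from those used in the article. Significance is omitted for brevity.

```python
def rule_quality(rows, intervals, target_class):
    """Score the rule "IF all attributes lie in their intervals THEN target_class".

    rows: list of (features, label) pairs, where features maps attribute -> number.
    intervals: maps attribute -> (low, high), both bounds inclusive.
    All names here are hypothetical; this is an illustrative sketch only.
    """
    n = len(rows)
    # Labels of the examples covered by the rule's condition.
    covered = [
        label
        for features, label in rows
        if all(low <= features[attr] <= high
               for attr, (low, high) in intervals.items())
    ]
    n_cond = len(covered)
    n_class = sum(1 for _, label in rows if label == target_class)
    n_both = sum(1 for label in covered if label == target_class)

    support = n_both / n
    confidence = n_both / n_cond if n_cond else 0.0
    # Weighted relative accuracy: coverage * (confidence - class prior).
    unusualness = (n_cond / n) * (confidence - n_class / n)
    return {"support": support,
            "confidence": confidence,
            "unusualness": unusualness}


# Toy data (invented for the example): one numeric attribute "x".
rows = [
    ({"x": 1.0}, "neg"), ({"x": 2.0}, "neg"),
    ({"x": 3.0}, "pos"), ({"x": 4.0}, "pos"),
    ({"x": 5.0}, "neg"), ({"x": 6.0}, "neg"),
    ({"x": 7.0}, "pos"), ({"x": 8.0}, "neg"),
]
scores = rule_quality(rows, {"x": (3.0, 5.0)}, "pos")
print(scores)
```

An evolutionary search of the kind described in the abstract would vary the interval bounds (here `(3.0, 5.0)`) and keep the candidates that score best on such measures, which is why no prior discretization of `x` is needed.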

Acknowledgments

This work was partially funded by the Regional Government of Andalusia (Junta de Andalucía, Grant Number TIC-7629) and the Spanish Ministry of Economy and Competitiveness (Grant Number TIN2013-47153-C3-2-R).

Author information

Authors and Affiliations

Escuela Técnica Superior de Ingeniería, Universidad de Huelva, Carretera Palos de la Frontera SN, Huelva, Spain
Victoria Pachón, Jacinto Mata & Juan Luis Domínguez

Authors

Victoria Pachón
Jacinto Mata
Juan Luis Domínguez

Corresponding author

Correspondence to Jacinto Mata.

Ethics declarations

Conflict of interest

The authors certify that they have no affiliations with or involvement in any organization or entity with any financial interest (such as honoraria; educational grants; participation in speakers' bureaus; membership, employment, consultancies, stock ownership, or other equity interest; and expert testimony or patent-licensing arrangements), or non-financial interest (such as personal or professional relationships, affiliations, knowledge or beliefs) in the subject matter or materials discussed in this manuscript.

Additional information

Communicated by V. Loia.

About this article

Cite this article

Pachón, V., Mata, J. & Domínguez, J.L. Searching for the most significant rules: an evolutionary approach for subgroup discovery. Soft Comput 21, 2609–2618 (2017). https://doi.org/10.1007/s00500-015-1961-5

Published: 11 December 2015
Issue Date: May 2017
DOI: https://doi.org/10.1007/s00500-015-1961-5
