Pattern Mining and Machine Learning for Demographic Sequences

  • Conference paper
Knowledge Engineering and Semantic Web (KESW 2015)

Abstract

In this paper, we present the results of our first studies on applying pattern mining and machine learning techniques to the analysis of demographic sequences in Russia, based on data for 11 generations from 1930 to 1984. The main goal is not prediction or the data mining methods themselves, but rather the extraction of interesting patterns and the acquisition of knowledge from substantial demographic datasets. We use decision trees to predict demographic events and emerging patterns to search for significant and potentially useful sequences.
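
The preview does not show the authors' implementation, so the following is only a minimal sketch of the general idea behind emerging patterns on event sequences: a pattern is kept when its support in one group is much higher than in another. The event names, the toy sequences, and the helper names (`is_subsequence`, `support`, `growth_rate`, `emerging_patterns`) are all hypothetical and chosen for illustration; they are not taken from the paper or its data.

```python
from itertools import combinations


def is_subsequence(pattern, sequence):
    """Return True if the events of `pattern` occur in `sequence` in order (gaps allowed)."""
    remaining = iter(sequence)
    return all(event in remaining for event in pattern)


def support(pattern, sequences):
    """Fraction of sequences that contain `pattern` as a subsequence."""
    if not sequences:
        return 0.0
    return sum(is_subsequence(pattern, s) for s in sequences) / len(sequences)


def growth_rate(pattern, target, background):
    """Support in `target` divided by support in `background` (inf if absent there)."""
    s_target = support(pattern, target)
    s_background = support(pattern, background)
    if s_background == 0:
        return float("inf") if s_target > 0 else 0.0
    return s_target / s_background


def emerging_patterns(target, background, min_growth=2.0, max_len=2):
    """Enumerate short subsequences of `target` and keep those with a high growth rate."""
    candidates = set()
    for seq in target:
        for k in range(1, max_len + 1):
            # combinations() preserves positional order, so each tuple is a subsequence
            candidates.update(combinations(seq, k))
    rates = {p: growth_rate(p, target, background) for p in candidates}
    return {p: g for p, g in rates.items() if g >= min_growth}


# Hypothetical toy life-course sequences for two groups (not real data).
group_a = [
    ("work", "marriage", "child"),
    ("work", "separation", "marriage"),
    ("education", "work", "marriage"),
]
group_b = [
    ("marriage", "child", "work"),
    ("education", "marriage", "child"),
    ("marriage", "work", "child"),
]

if __name__ == "__main__":
    for pattern, rate in sorted(emerging_patterns(group_a, group_b).items()):
        print(pattern, rate)
```

This brute-force enumeration is only practical for short patterns and small datasets; dedicated sequential pattern miners would be the realistic choice at scale.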



Author information

Corresponding author

Correspondence to Dmitry I. Ignatov.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Ignatov, D.I., Mitrofanova, E., Muratova, A., Gizdatullin, D. (2015). Pattern Mining and Machine Learning for Demographic Sequences. In: Klinov, P., Mouromtsev, D. (eds) Knowledge Engineering and Semantic Web. KESW 2015. Communications in Computer and Information Science, vol 518. Springer, Cham. https://doi.org/10.1007/978-3-319-24543-0_17

  • DOI: https://doi.org/10.1007/978-3-319-24543-0_17

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-24542-3

  • Online ISBN: 978-3-319-24543-0

  • eBook Packages: Computer Science, Computer Science (R0)
