EER\(\rightarrow \)MLN: EER Approach for Modeling, Mapping, and Analyzing Complex Data Using Multilayer Networks (MLNs)

  • Conference paper
Conceptual Modeling (ER 2020)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 12400)

Abstract

Extended Entity Relationship (EER) modeling is an important step after the application requirements for data analysis are gathered, and it is critical for translating user requirements into a given executable data model (e.g., relational or, for this paper, Multilayer Networks, or MLNs). EER modeling provides a more precise understanding of the application and data requirements, and an unambiguous representation from which the data model (on which analysis is performed) can be generated algorithmically. EER has played a central role in mapping user-level requirements to relational, object-oriented, and other data models. UML, whose roots are in EER modeling, is used extensively in industry.

Although big data analysis has given rise to many new data models, little attention has been paid to deriving them from requirements. Going straight from application requirements to a data model and analysis, especially for complex data sets, is likely to be difficult, error-prone, and hard to extend. Hence, for data models used in big data analysis, such as Multilayer Networks, there is a need to transform the user and application requirements using a modeling approach such as EER.

In this paper, we start with application requirements of complex data sets including analysis objectives and show how the EER approach can be leveraged for modeling given data to generate MLNs and appropriate analysis expressions on them. This is timely as MLNs are gaining popularity (and also subsume graphs) as a meaningful data representation for big data analysis.

To demonstrate the algorithm and the applicability of the proposed approach, we apply it to three data sets to generate MLNs and to map analysis requirements into expressions on MLNs. We also demonstrate it for three types of MLNs. The data sets are from DBLP (Database Bibliography-Computer Science Publications), IMDb (a large international movie data set), and US commercial airlines. Our experimental analysis validates the modeling and mapping. We do not elaborate on the computations, as they are a separate topic in themselves. The correctness of the results is verified using independently available ground truth.
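As a rough illustration of the kind of structure the abstract refers to, the sketch below shows one way a multilayer network could be held in memory: each layer is a simple graph, and edges may also connect nodes in different layers. It is not the paper's mapping algorithm. The class names, the `build_toy_mln` helper, and the "Author"/"Paper" layers are hypothetical choices made only for this example, assuming a DBLP-like setting in which co-authorship links stay within one layer and authorship links cross layers.

```python
from dataclasses import dataclass, field


@dataclass
class Layer:
    """One layer: an undirected simple graph over a single set of entities."""
    name: str
    nodes: set = field(default_factory=set)
    edges: set = field(default_factory=set)  # each edge is a frozenset({u, v})

    def add_edge(self, u, v):
        # Register both endpoints, then store the edge without orientation.
        self.nodes.update((u, v))
        self.edges.add(frozenset((u, v)))


@dataclass
class MultilayerNetwork:
    """A collection of named layers plus edges that connect different layers."""
    layers: dict = field(default_factory=dict)
    inter_edges: set = field(default_factory=set)  # ((layer_a, u), (layer_b, v))

    def layer(self, name):
        # Return the named layer, creating it on first use.
        return self.layers.setdefault(name, Layer(name))

    def add_inter_edge(self, layer_a, u, layer_b, v):
        # Make sure both endpoints exist in their layers, then link them.
        self.layer(layer_a).nodes.add(u)
        self.layer(layer_b).nodes.add(v)
        self.inter_edges.add(((layer_a, u), (layer_b, v)))


def build_toy_mln(coauthorships, authorships):
    """Hypothetical helper: build a two-layer network from toy bibliographic data.

    coauthorships: iterable of (author, author) pairs -> edges inside "Author".
    authorships:   iterable of (author, paper) pairs  -> edges between layers.
    """
    mln = MultilayerNetwork()
    for a, b in coauthorships:
        mln.layer("Author").add_edge(a, b)
    for author, paper in authorships:
        mln.add_inter_edge("Author", author, "Paper", paper)
    return mln
```

For instance, `build_toy_mln([("ann", "bob")], [("ann", "p1"), ("bob", "p1")])` yields an "Author" layer with one co-authorship edge and a "Paper" layer reached only through inter-layer edges. Keeping intra-layer and inter-layer edges in separate containers is a design choice of this sketch, made so that each layer can be analyzed on its own as an ordinary graph.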

Notes

  1. Note that the relationship details can change based on analysis objectives.

  2. Choice of coefficient reflects relationship quality and its value can be based on how actors are weighted against genres. We have chosen 0.9 for relating actors in their top genres.

Acknowledgments

For this work, Dr. Chakravarthy was partly supported by NSF grant 1955798 and Dr. Bhowmick was partly supported by NSF grant 1916084.

Author information

Corresponding author

Correspondence to Sharma Chakravarthy.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Komar, K.S., Santra, A., Bhowmick, S., Chakravarthy, S. (2020). EER\(\rightarrow \)MLN: EER Approach for Modeling, Mapping, and Analyzing Complex Data Using Multilayer Networks (MLNs). In: Dobbie, G., Frank, U., Kappel, G., Liddle, S.W., Mayr, H.C. (eds) Conceptual Modeling. ER 2020. Lecture Notes in Computer Science, vol 12400. Springer, Cham. https://doi.org/10.1007/978-3-030-62522-1_41

  • DOI: https://doi.org/10.1007/978-3-030-62522-1_41

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-62521-4

  • Online ISBN: 978-3-030-62522-1

  • eBook Packages: Computer Science, Computer Science (R0)
