How to Understand Connections Based on Big Data: From Cliques to Flexible Granules

Information Granularity, Big Data, and Computational Intelligence

Part of the book series: Studies in Big Data (SBD, volume 8)

Abstract

One of the main objectives of science and engineering is to predict the future state of the world—and to come up with actions which will lead to the most favorable outcome. To be able to do that, we need to have a quantitative model describing how the values of the desired quantities change—and for that, we need to know which factors influence this change. Usually, these factors are selected by using traditional statistical techniques, but with the current drastic increase in the amount of available data—known as the advent of big data—the traditional techniques are no longer feasible. A successful semi-heuristic method has been proposed to detect true connections in the presence of big data. However, this method has its limitations. The first limitation is that this method is heuristic—its main justifications are common sense and the fact that in several practical problems, this method was reasonably successful. The second limitation is that this heuristic method is based on using “crisp” granules (clusters), while in reality, the corresponding granules are flexible (“fuzzy”). In this chapter, we explain how the known semi-heuristic method can be justified in statistical terms, and we also show how the ideas behind this justification enable us to improve the known method by taking granule flexibility into account.

Acknowledgements

This work was supported in part by the National Science Foundation (NSF) grants HRD-0734825 and HRD-1242122 (Cyber-ShARE Center of Excellence), NSF grant DUE-0926721, and by M. S. Hossain’s startup grant at UTEP. The authors are very grateful to the anonymous referees for their valuable suggestions and to the editors of this volume, Shyi-Ming Chen and Witold Pedrycz, for their support and encouragement.

Author information

Corresponding author

Correspondence to Vladik Kreinovich.

Copyright information

© 2015 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Jalal-Kamali, A., Shahriar Hossain, M., Kreinovich, V. (2015). How to Understand Connections Based on Big Data: From Cliques to Flexible Granules. In: Pedrycz, W., Chen, SM. (eds) Information Granularity, Big Data, and Computational Intelligence. Studies in Big Data, vol 8. Springer, Cham. https://doi.org/10.1007/978-3-319-08254-7_4

  • DOI: https://doi.org/10.1007/978-3-319-08254-7_4

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-08253-0

  • Online ISBN: 978-3-319-08254-7

  • eBook Packages: Engineering, Engineering (R0)
