How to Understand Connections Based on Big Data: From Cliques to Flexible Granules

Jalal-Kamali, Ali; Shahriar Hossain, M.; Kreinovich, Vladik

doi:10.1007/978-3-319-08254-7_4

Part of the book series: Studies in Big Data (SBD, volume 8)

Abstract

One of the main objectives of science and engineering is to predict the future state of the world—and to come up with actions which will lead to the most favorable outcome. To be able to do that, we need to have a quantitative model describing how the values of the desired quantities change—and for that, we need to know which factors influence this change. Usually, these factors are selected by using traditional statistical techniques, but with the current drastic increase in the amount of available data—known as the advent of big data—the traditional techniques are no longer feasible. A successful semi-heuristic method has been proposed to detect true connections in the presence of big data. However, this method has its limitations. The first limitation is that this method is heuristic—its main justifications are common sense and the fact that in several practical problems, this method was reasonably successful. The second limitation is that this heuristic method is based on using “crisp” granules (clusters), while in reality, the corresponding granules are flexible (“fuzzy”). In this chapter, we explain how the known semi-heuristic method can be justified in statistical terms, and we also show how the ideas behind this justification enable us to improve the known method by taking granule flexibility into account.

Acknowledgements

This work was supported in part by the National Science Foundation (NSF) grants HRD-0734825 and HRD-1242122 (Cyber-ShARE Center of Excellence), NSF grant DUE-0926721, and by M. S. Hossain’s startup grant at UTEP. The authors are greatly thankful to the anonymous referees for valuable suggestions and to the editors of this volume, Shyi-Ming Chen and Witold Pedrycz, for their support and encouragement.

Author information

Authors and Affiliations

Department of Computer Science, University of Texas at El Paso, El Paso, TX, 79968, USA
Ali Jalal-Kamali, M. Shahriar Hossain & Vladik Kreinovich

Corresponding author

Correspondence to Vladik Kreinovich.

Editor information

Editors and Affiliations

Department of Electrical and Computer Engineering, University of Alberta, Edmonton, Alberta, Canada
Witold Pedrycz
Department of Computer Science and Information Engineering, National Taiwan University of Science and Technology, Taipei, Taiwan
Shyi-Ming Chen

About this chapter

Cite this chapter

Jalal-Kamali, A., Shahriar Hossain, M., Kreinovich, V. (2015). How to Understand Connections Based on Big Data: From Cliques to Flexible Granules. In: Pedrycz, W., Chen, SM. (eds) Information Granularity, Big Data, and Computational Intelligence. Studies in Big Data, vol 8. Springer, Cham. https://doi.org/10.1007/978-3-319-08254-7_4

DOI: https://doi.org/10.1007/978-3-319-08254-7_4
Published: 15 July 2014
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-08253-0
Online ISBN: 978-3-319-08254-7
eBook Packages: Engineering, Engineering (R0)
