Abstract
In many distributed systems, from cloud platforms to sensor networks, system performance depends on the chosen configuration, which in turn depends strongly on the network topology. Hence, topological changes may entail costly reconfiguration and optimisation processes. This paper proposes a multi-agent solution for recovering networks from node failures. To preserve the network topology, the proposed approach relies on local information about the network's structure, which is collected and disseminated at runtime. The paper studies two strategies for distributing topological data: one based on mobile agents (our proposal) and the other based on Trickle (a reference gossiping protocol from the literature). Both strategies were adapted to our self-healing approach, so that they collect the topological information needed to recover the network, and were evaluated in terms of resource overheads. Experimental results show that both variants can recover the network topology up to a certain node failure rate, which depends on the network topology. Mobile agents collect less information because they focus on local dissemination, which suffices for network recovery, and they therefore incur lower bandwidth overhead than Trickle. However, mobile agents use more memory and exchange more messages during data collection than Trickle does. These results support the viability of the proposed self-healing solution and offer two variant implementations with different performance characteristics, which may suit different application domains.
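For readers unfamiliar with Trickle, the following is a minimal sketch of its timer rules as described in RFC 6206. It is not the adaptation evaluated in this paper: the class name, method names, and parameter values are illustrative assumptions, and message handling and topological data are omitted.

```python
import random


class TrickleTimer:
    """Hypothetical, simplified Trickle timer state (after RFC 6206)."""

    def __init__(self, i_min, i_max_doublings, k, rng=None):
        self.i_min = i_min                            # smallest interval length
        self.i_max = i_min * (2 ** i_max_doublings)   # largest interval length
        self.k = k                                    # redundancy constant
        self.rng = rng or random.Random()
        self.interval = i_min
        self._begin_interval()

    def _begin_interval(self):
        # Reset the counter and pick a firing time in the second half
        # of the current interval.
        self.counter = 0
        self.fire_at = self.interval / 2 + self.rng.random() * (self.interval / 2)

    def heard_consistent(self):
        # A neighbour sent data that agrees with ours: count it.
        self.counter += 1

    def heard_inconsistent(self):
        # A neighbour sent data that disagrees with ours: fall back to the
        # smallest interval, unless we are already there.
        if self.interval > self.i_min:
            self.interval = self.i_min
            self._begin_interval()

    def should_transmit(self):
        # Checked when the firing time is reached: stay silent if enough
        # neighbours have already sent consistent data in this interval.
        return self.counter < self.k

    def interval_expired(self):
        # Double the interval (capped at i_max) and start a new one.
        self.interval = min(self.interval * 2, self.i_max)
        self._begin_interval()
```

In this sketch, a node that keeps hearing consistent data transmits less and less often, while any inconsistency brings it back to the shortest interval so that updates spread quickly.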
Notes
Source code is available at: https://github.com/arleserp/NetworkRecoverySim/tree/master-JAAMAS.
Funding
Funding was provided by Fundacion Universitaria Konrad Lorenz (CO).
Cite this article
Rodríguez, A., Gómez, J. & Diaconescu, A. A decentralised self-healing approach for network topology maintenance. Auton Agent Multi-Agent Syst 35, 6 (2021). https://doi.org/10.1007/s10458-020-09486-3
DOI: https://doi.org/10.1007/s10458-020-09486-3