Abstract
Learning automata (LA), a powerful tool for reinforcement learning in machine learning, search for their optimal state by continuously interacting with an external environment. Traditional LA algorithms, especially estimator LA algorithms, can generally be abstracted as P- or Q-models operating in stationary environments. A more comprehensive setting is the S-model operating in a non-stationary environment. For this kind of LA, the most popular existing scheme is the stochastic estimator LA (SELA). However, simultaneously handling the four parameters involved in SELA is intractable, because their values may vary dramatically across environments, which makes a parameter tuning strategy essential. In this paper, we first propose a scheme to determine the parameter search scope and then present a series of parameter search methods, including a four-dimensional method and a two-dimensional method, making SELA applicable to any environment with switching non-stationary characteristics. Furthermore, to decrease the tuning cost, we introduce a reduced parameter SELA supported by the new two-dimensional parameter search method. To remove the traditional restriction that the environmental reward probability must be symmetrically distributed, the S-model is constructed from a new perspective, yielding a novel reduced parameter S-model of SELA (rpS-SELA). A detailed mathematical proof establishes the absolute expediency of rpS-SELA. In addition, experimental simulations demonstrate that rpS-SELA outperforms its competitors, with a reduced tuning cost, lower time consumption, a higher accuracy rate, and, above all, a stronger ability to track environmental switches.
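For readers unfamiliar with the terminology, the following is a minimal sketch of what a switching non-stationary S-model environment might look like. It is not the construction used in the paper: the class name `SwitchingSModelEnvironment`, the periodic switching rule, the uniform noise, and the helpers `to_p_model` and `to_q_model` are all assumptions made for illustration. The only facts relied on are the standard definitions: a P-model returns a binary response, a Q-model returns one of finitely many values in [0, 1], and an S-model returns a continuous value in [0, 1].

```python
import random


class SwitchingSModelEnvironment:
    """Hypothetical S-model environment whose mean responses switch periodically."""

    def __init__(self, mean_sets, switch_period, spread=0.1, seed=0):
        self.mean_sets = mean_sets          # one list of per-action means per phase
        self.switch_period = switch_period  # number of steps between switches (assumed)
        self.spread = spread                # half-width of the uniform noise (assumed)
        self.rng = random.Random(seed)
        self.t = 0                          # number of responses issued so far

    def current_means(self):
        # Cycle through the phases; each phase lasts switch_period steps.
        phase = (self.t // self.switch_period) % len(self.mean_sets)
        return self.mean_sets[phase]

    def respond(self, action):
        # S-model response: a continuous value clipped to [0, 1].
        mean = self.current_means()[action]
        noise = self.rng.uniform(-self.spread, self.spread)
        self.t += 1
        return min(1.0, max(0.0, mean + noise))


def to_p_model(response, rng):
    """Binary (P-model) response: 1 with probability equal to the S-model response."""
    return 1 if rng.random() < response else 0


def to_q_model(response, levels=5):
    """Q-model response: quantize to one of `levels` evenly spaced values in [0, 1]."""
    return round(response * (levels - 1)) / (levels - 1)


if __name__ == "__main__":
    env = SwitchingSModelEnvironment([[0.8, 0.2], [0.2, 0.8]], switch_period=100)
    before = [env.respond(0) for _ in range(100)]
    after = [env.respond(0) for _ in range(100)]
    print(sum(before) / 100, sum(after) / 100)
```

In this sketch the best action changes from action 0 to action 1 after 100 steps, so an automaton that has converged on action 0 must detect the switch and re-learn, which is the tracking ability the abstract refers to.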
Acknowledgements
This research was funded by the Science Foundation of North China University of Technology (110051360002), the Basic Scientific Research from Beijing Education Commission (110052972027), and the National Natural Science Foundation of China under Grant 61971283.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
About this article
Cite this article
Guo, Y., Di, C. & Li, S. A novel reduced parameter s-model of estimator learning automata in the switching non-stationary environment. Neural Comput & Applic 34, 6811–6824 (2022). https://doi.org/10.1007/s00521-021-06777-y
DOI: https://doi.org/10.1007/s00521-021-06777-y