D2D Resource Allocation with Power Control Based on Multi-player Multi-armed Bandit

Kuo, Fang-Chang; Schindelhauer, Christian; Wang, Hwang-Cheng; Lin, Wen-Jun; Tseng, Chih-Cheng

doi:10.1007/s11277-020-07313-2

Published: 16 April 2020

Volume 113, pages 1455–1470, (2020)

Wireless Personal Communications

Fang-Chang Kuo¹,
Christian Schindelhauer³,
Hwang-Cheng Wang¹,
Wen-Jun Lin¹ &
Chih-Cheng Tseng²

Abstract

Device-to-device (D2D) communication is the direct communication between two D2D user equipments (DUEs) without traversing the evolved NodeB of 5G networks. In the underlay mode of resource reuse, DUEs and cellular user equipments share resource blocks, which improves system throughput through spectrum reuse. To further enhance performance, Multi-Player Multi-Armed Bandit, an extension of a reinforcement learning algorithm, is employed to control the transmission power of the DUEs and thereby reduce the interference induced by resource sharing. Three learning strategies are applied: Epsilon-first, Epsilon-greedy, and Upper-Confidence-Bound. Simulation results show that the proposed method improves performance in terms of the average transmission power of D2D pairs, the ratio of unallocated D2D pairs, energy efficiency, and total throughput.
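
The three learning strategies named in the abstract differ only in how a player picks its next arm. The following is a minimal single-player sketch of their textbook forms, not the paper's multi-player algorithm, whose details are not given in the abstract. It assumes that each arm stands for one candidate transmission power level and that the reward is a hypothetical Bernoulli outcome (for example, whether a transmission at that level met a quality target). The function names, the reward model, and the default value of `epsilon` are all illustrative assumptions.

```python
import math
import random


def epsilon_first(counts, values, t, horizon, epsilon=0.1):
    """Explore uniformly for the first epsilon * horizon rounds, then exploit."""
    if t < int(epsilon * horizon):
        return random.randrange(len(counts))
    return max(range(len(values)), key=values.__getitem__)


def epsilon_greedy(counts, values, t, horizon, epsilon=0.1):
    """Explore with probability epsilon in every round, otherwise exploit."""
    if random.random() < epsilon:
        return random.randrange(len(counts))
    return max(range(len(values)), key=values.__getitem__)


def ucb1(counts, values, t, horizon):
    """Play each arm once, then pick the largest mean plus confidence bonus.

    t and horizon are unused; they keep the signature uniform across strategies.
    """
    for arm, n in enumerate(counts):
        if n == 0:
            return arm
    total = sum(counts)
    return max(
        range(len(values)),
        key=lambda a: values[a] + math.sqrt(2.0 * math.log(total) / counts[a]),
    )


def run(select, mean_rewards, horizon=2000, seed=0):
    """Simulate one player; return (pull counts, estimated mean rewards)."""
    random.seed(seed)
    k = len(mean_rewards)
    counts = [0] * k      # number of times each arm was played
    values = [0.0] * k    # running average reward of each arm
    for t in range(horizon):
        arm = select(counts, values, t, horizon)
        # Hypothetical Bernoulli reward; the paper's reward is not specified here.
        reward = 1.0 if random.random() < mean_rewards[arm] else 0.0
        counts[arm] += 1
        values[arm] += (reward - values[arm]) / counts[arm]
    return counts, values


if __name__ == "__main__":
    # Three assumed power levels with made-up success probabilities.
    for strategy in (epsilon_first, epsilon_greedy, ucb1):
        counts, values = run(strategy, [0.2, 0.5, 0.8])
        print(strategy.__name__, counts)
```

Epsilon-first front-loads all exploration, Epsilon-greedy spreads it over the whole run, and Upper-Confidence-Bound replaces random exploration with a bonus that shrinks as an arm is sampled more often. In a multi-player setting each D2D pair would run such a rule concurrently, and the rewards of the players would be coupled through mutual interference, which this sketch does not model.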

Acknowledgements

This research was supported by the Ministry of Science and Technology of Taiwan under Grant Nos. 108-2221-E-197-009 and 108-2221-E-197-011. The authors would also like to thank Prof. Cho-Chin Lin and MS student Yu-Yang Hsieh for their help with the pseudocode.

Author information

Authors and Affiliations

Department of Electronic Engineering, National Ilan University, Yilan County, Taiwan
Fang-Chang Kuo, Hwang-Cheng Wang & Wen-Jun Lin
Department of Electrical Engineering, National Ilan University, Yilan County, Taiwan
Chih-Cheng Tseng
Department of Computer Networks and Telematics, University of Freiburg, Freiburg im Breisgau, Germany
Christian Schindelhauer

Corresponding author

Correspondence to Chih-Cheng Tseng.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Kuo, FC., Schindelhauer, C., Wang, HC. et al. D2D Resource Allocation with Power Control Based on Multi-player Multi-armed Bandit. Wireless Pers Commun 113, 1455–1470 (2020). https://doi.org/10.1007/s11277-020-07313-2

Issue Date: August 2020
