Reinforcement Based User Scheduling for Cellular Communications

Gradus, Nimrod; Cohen, Asaf; Biton, Erez; Gurwitz, Omer

doi:10.1007/978-3-031-07689-3_15

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13301)

Included in the following conference series:

International Symposium on Cyber Security, Cryptology, and Machine Learning

Abstract

Scheduling is one of the most challenging and influential resource allocation tasks performed by the base station, and a major determinant of performance in wireless deployments such as 4G and 5G. It must handle two important performance metrics, throughput and fairness. Fundamentally, these two metrics conflict, and maximizing one may come at the expense of the other. On the one hand, maximizing throughput, which is the goal of many communication networks, requires allocating resources to users with better channel conditions. On the other hand, fairness requires allocating some resources to users with poor channel conditions. One prevalent scheduling scheme maximizes the proportional fairness criterion, which balances the two metrics with minimal compromise. Proportional-fairness-based schedulers commonly rely on a greedy approach in which each resource block is allocated to the user that maximizes the proportional fairness criterion. However, users can typically tolerate some delay, especially if it boosts their performance.

Motivated by this observation, we suggest a reinforcement-based proportional-fair scheduler for cellular networks. The suggested scheduler incorporates the users’ current channel estimates together with predicted future channel estimates in the resource allocation process, in order to maximize the proportional fairness criterion over predefined periodic time epochs. We developed a reinforcement learning tool that learns the users’ channel fluctuations and decides on the best user selection at each time slot, so as to achieve the best fairness-throughput trade-off over multiple time slots. We demonstrate through simulations how such a scheduler outperforms the standard proportional fairness scheduler. We further implemented the suggested scheme on a live 4G base station, also known as an eNodeB, and observed similar gains.

References

1. Asadi, A., Mancuso, V.: A survey on opportunistic scheduling in wireless communications. IEEE Commun. Surv. Tutor. 15(4), 1671–1688 (2013)
2. Bang, H.J., Ekman, T., Gesbert, D.: Channel predictive proportional fair scheduling. IEEE Trans. Wirel. Commun. 7(2), 482–487 (2008)
3. Capozzi, F., Piro, G., Grieco, L.A., Boggia, G., Camarda, P.: Downlink packet scheduling in LTE cellular networks: key design issues and a survey. IEEE Commun. Surv. Tutor. 15(2), 678–700 (2012)
4. Chung, S.T., Goldsmith, A.J.: Degrees of freedom in adaptive modulation: a unified view. IEEE Trans. Commun. 49(9), 1561–1571 (2001)
5. Donthi, S.N., Mehta, N.B.: An accurate model for EESM and its application to analysis of CQI feedback schemes and scheduling in LTE. IEEE Trans. Wirel. Commun. 10(10), 3436–3448 (2011)
6. Duran, A., Toril, M., Ruiz, F., Mendo, A.: Self-optimization algorithm for outer loop link adaptation in LTE. IEEE Commun. Lett. 19(11), 2005–2008 (2015)
7. Elliott, E.O.: Estimates of error rates for codes on burst-noise channels. Bell Syst. Tech. J. 42(5), 1977–1997 (1963)
8. Gilbert, E.N.: Capacity of a burst-noise channel. Bell Syst. Tech. J. 39(5), 1253–1265 (1960)
9. Huaizhou, S.H.I., Venkatesha Prasad, R., Onur, E., Niemegeers, I.G.M.M.: Fairness in wireless networks: issues, measures and challenges. IEEE Commun. Surv. Tutor. 16(1), 5–24 (2013)
10. Kelly, F.: Charging and rate control for elastic traffic. Eur. Trans. Telecommun. 8(1), 33–37 (1997)
11. Morales-Jiménez, D., Sánchez, J.J., Gómez, G., Aguayo-Torres, M.C., Entrambasaguas, J.T.: Imperfect adaptation in next generation OFDMA cellular systems (2009)
12. Ouyang, W., Eryilmaz, A., Shroff, N.B.: Downlink scheduling over Markovian fading channels. IEEE/ACM Trans. Netw. 24(3), 1801–1812 (2015)
13. Piazza, D., Milstein, L.B.: Multiuser diversity-mobility tradeoff: modeling and performance analysis of a proportional fair scheduling. In: Global Telecommunications Conference, 2002 (GLOBECOM’02), vol. 1, pp. 906–910. IEEE (2002)
14. Sesia, S., Toufik, I., Baker, M.: LTE-the UMTS Long Term Evolution: From Theory to Practice. Wiley (2011)
15. Shmuel, O., Cohen, A., Gurewitz, O.: Performance analysis of opportunistic distributed scheduling in multi-user systems. IEEE Trans. Commun. 66(10), 4637–4652 (2018)
16. Tokic, M., Palm, G.: Value-difference based exploration: adaptive control between epsilon-greedy and softmax. In: Bach, J., Edelkamp, S. (eds.) KI 2011. LNCS (LNAI), vol. 7006, pp. 335–346. Springer, Heidelberg (2011). https://doi.org/10.1007/978-3-642-24455-1_33
17. Tsai, T.-Y., Chung, Y.-L., Tsai, Z.: Introduction to packet scheduling algorithms for communication networks. Sciyo (2010)
18. Viswanath, P., Tse, D.N.C., Laroia, R.: Opportunistic beamforming using dumb antennas. In: Proceedings IEEE International Symposium on Information Theory, p. 449. IEEE (2002)

Author information

Authors and Affiliations

Department of Communication Systems Engineering, Ben-Gurion University of the Negev, Beersheba, Israel
Nimrod Gradus, Asaf Cohen & Omer Gurwitz
Parallel Wireless, Kefar Sava, Israel
Nimrod Gradus & Erez Biton

Corresponding authors

Correspondence to Nimrod Gradus, Asaf Cohen, Erez Biton or Omer Gurwitz.

Editor information

Editors and Affiliations

Ben-Gurion University of the Negev, Be’er Sheva, Israel
Shlomi Dolev
University of Maryland, College Park, MD, USA
Jonathan Katz
Ben-Gurion University of the Negev, Be’er Sheva, Israel
Amnon Meisels

8 Appendix

8.1 LTE Basic Terms

In this study we generally follow the conventional frequency division duplex (FDD) cellular resource units, in which time is slotted into frames and each frame is divided into fixed 1 ms intervals, termed subframes. Each subframe is divided into parts termed physical resource blocks, which we refer to simply as resource blocks. Each resource block spans a given bandwidth and time duration; e.g., in LTE each resource block comprises 12 sub-carriers in the frequency domain and 14 OFDM symbols in the time domain.

8.2 Downlink Link Adaptation (DLLA)

As mentioned earlier, opportunistic scheduling, e.g., proportional fairness, takes the users’ channel quality reports into consideration to make better scheduling decisions. In particular, note that in the algorithm presented above, for the scheduler to select the user according to $\underset{k}{\arg \max } \frac{R_k(t)}{T_k(t)}$, it needs to know the instantaneous rates of all users. In wireless networks, the users’ channel states are obtained via reports indicating each user’s supported transmission rates. Furthermore, each practical system supports only a finite set of rates. Link adaptation is the mechanism by which the users’ transmission code rates and modulation schemes are selected based on the channel conditions.
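The selection rule above can be sketched in a few lines of Python. This is a minimal illustration, not the scheduler evaluated in the paper: it assumes $R_k(t)$ is the instantaneous supported rate of user $k$ and $T_k(t)$ its average served throughput, and the exponential-moving-average update with a hypothetical `window` parameter is the commonly used choice rather than one stated in this section.

```python
from typing import List, Sequence


def pf_select(inst_rates: Sequence[float], avg_tputs: Sequence[float]) -> int:
    """Return the index k that maximizes R_k(t) / T_k(t)."""
    if len(inst_rates) != len(avg_tputs):
        raise ValueError("need one rate and one average throughput per user")
    # Guard against division by zero for users that were never served.
    metrics = [r / max(t, 1e-9) for r, t in zip(inst_rates, avg_tputs)]
    return max(range(len(metrics)), key=metrics.__getitem__)


def update_averages(avg_tputs: Sequence[float], inst_rates: Sequence[float],
                    served: int, window: float = 100.0) -> List[float]:
    """Assumed update rule: exponential moving average of the served rate.

    Only the served user contributes its rate; all others contribute zero.
    """
    alpha = 1.0 / window
    return [
        (1.0 - alpha) * t + alpha * (r if k == served else 0.0)
        for k, (t, r) in enumerate(zip(avg_tputs, inst_rates))
    ]
```

For instance, with rates `[10, 4]` and averages `[20, 2]`, the second user is selected: its rate is lower, but its ratio to its own average is higher.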

In this section, we briefly explain the concepts and processes of the DLLA used in the simulations and experimental results for scheduling using RL. Since both the simulations and the experiments follow a typical LTE DLLA, we provide a technical description of the DLLA we utilized. Our description follows the common terminology and accepted acronyms; hence it is somewhat cumbersome.

The DLLA process is a crucial part of current wireless communication systems. This technique increases the data rate that can be reliably transmitted [4] and has been adopted as a core feature in cellular standards such as LTE. The role of link adaptation (LA) in the MAC layer of the base station (BS) is to suggest to the scheduler an appropriate modulation and coding scheme (MCS) for the next transmissions to a given user equipment (UE), so as to keep the block error rate (BLER) below a target. The proposed MCS is signaled by the UE by means of a channel quality indicator (CQI), carried in the reports it sends to the BS [14]. The BS then uses a pre-calculated table to map the CQI to a transport block size index (ITBS), an integer ranging from 1 to 26, which is used to decide the size of the transport block (TB) to be transmitted to the UE. The TB size is also determined by the number of physical resource blocks (PRBs) that can be allocated to the UE.

In LTE the radio resources are allocated in the time/frequency domain. In particular, time is slotted into intervals of 1 ms, each corresponding to 14 OFDM symbols, and in the frequency domain the total bandwidth is divided into sub-channels of 180 kHz, each with twelve consecutive and equally spaced OFDM sub-carriers. A time/frequency radio resource spanning one 1 ms time slot (14 OFDM symbols) and twelve consecutive sub-carriers is called a physical resource block (PRB), or just RB, and corresponds to the smallest radio resource unit that can be assigned to a user for transmission. As the sub-channel size is fixed, the number of RBs varies with the system bandwidth configuration, and it is the scheduler’s decision how to divide the total number of RBs among the UEs scheduled in the time slot. The ITBS, together with the number of RBs allocated to the UE, is mapped to the size of the TB.
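The resource grid arithmetic described above can be checked with a short Python sketch. It uses the RB definition given in the text (12 sub-carriers by 14 OFDM symbols); the 15 kHz sub-carrier spacing and the RB counts per channel bandwidth are the standard LTE values and are not stated in this appendix, and the function names are illustrative only.

```python
SUBCARRIERS_PER_RB = 12       # from the text
SYMBOLS_PER_SUBFRAME = 14     # from the text (one 1 ms interval)
SUBCARRIER_SPACING_KHZ = 15   # standard LTE spacing; not stated in the text

# Standard LTE channel bandwidths (MHz) and their RB counts (not in the text).
RBS_PER_BANDWIDTH_MHZ = {1.4: 6, 3: 15, 5: 25, 10: 50, 15: 75, 20: 100}


def rb_bandwidth_khz() -> int:
    """Bandwidth of one RB: 12 sub-carriers x 15 kHz = 180 kHz."""
    return SUBCARRIERS_PER_RB * SUBCARRIER_SPACING_KHZ


def resource_elements_per_rb() -> int:
    """Sub-carrier/symbol pairs in one RB as defined in the text."""
    return SUBCARRIERS_PER_RB * SYMBOLS_PER_SUBFRAME


def occupied_bandwidth_mhz(channel_bandwidth_mhz: float) -> float:
    """Spectrum covered by the RBs of a given channel bandwidth."""
    n_rb = RBS_PER_BANDWIDTH_MHZ[channel_bandwidth_mhz]
    return n_rb * rb_bandwidth_khz() / 1000.0
```

Under these values a 20 MHz carrier offers 100 RBs per time slot, covering 18 MHz of the channel, and this is the pool the scheduler divides among the UEs it serves.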

The CQI, reported by the UE on a per transmission time interval (TTI) basis, conveys how good or bad the downlink communication channel is. How the UE measures the CQI is left to the chipset vendor; the CQI is derived from the UE’s measurement of the reference signals transmitted by the BS. The reference signal received power (RSRP) measured by the UE is then used to calculate the link quality metric (LQM), which quantifies the quality of the downlink and is used to determine the CQI. The LQM most commonly used in LTE is the exponential effective SNR mapping (EESM) [5]. The process of selecting the most suitable MCS based on the link quality measurements is called inner loop link adaptation (ILLA) [6].
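As an illustration of the EESM metric mentioned above, the following Python sketch computes the effective SNR from per-sub-carrier SNRs using the usual definition $\gamma_{eff} = -\beta \ln \big( \frac{1}{N} \sum_{i} e^{-\gamma_i/\beta} \big)$. The formula is the common textbook form rather than one given in this appendix, and $\beta$ is an MCS-dependent calibration parameter whose value here is purely a placeholder.

```python
import math
from typing import Sequence


def eesm_effective_snr_db(snrs_db: Sequence[float], beta: float) -> float:
    """Exponential effective SNR mapping over per-sub-carrier SNRs (in dB).

    beta is a calibration parameter (linear scale); its value must be
    tuned per MCS and is an assumption wherever it is set in examples.
    """
    if not snrs_db:
        raise ValueError("at least one SNR sample is required")
    snrs_lin = [10.0 ** (s / 10.0) for s in snrs_db]
    mean_exp = sum(math.exp(-s / beta) for s in snrs_lin) / len(snrs_lin)
    return 10.0 * math.log10(-beta * math.log(mean_exp))
```

On a flat channel the effective SNR equals the common per-sub-carrier SNR, whereas on a frequency-selective channel it falls below the arithmetic mean of the linear SNRs, so weak sub-carriers weigh more heavily in the resulting CQI.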

Due to various errors in the UE’s CQI measurements, the delay in the reporting process, and deviations from the assumed channel conditions (e.g., multi-path environment, UE speed [11]), a compensation process, called outer loop link adaptation (OLLA), is needed. The OLLA correction is based on the hybrid automatic repeat request (HARQ) feedback and works as follows: the ITBS mapped from the UE’s CQI report, denoted $ITBS(CQI)$, is corrected by a margin, $ITBS_{margin}$, which is updated for each positive/negative acknowledgment (ACK/NACK) received from the UE. When an ACK is received, $ITBS_{margin}$ is decreased by $\varDelta _{down}$, and when a NACK is received, the margin is increased by $\varDelta _{up}$. The ratio $\frac{\varDelta _{down}}{\varDelta _{up}}$ is controlled by the target BLER that OLLA is designed to converge to, $BLER_{T}$ (in percent), and is given by

$$\begin{aligned} \frac{\varDelta _{down}}{\varDelta _{up}} = \frac{BLER_{T}}{100-BLER_{T}} \end{aligned}$$
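A short sketch of why this ratio steers the error rate toward the target, under the simplifying assumption (ours, not stated in the text) that at steady state each transmission fails independently with probability $p$: the margin stops drifting when its expected decrease per transmission equals its expected increase, that is,

$$\begin{aligned} (1-p)\,\varDelta _{down} = p\,\varDelta _{up} \;\Longrightarrow \; \frac{p}{1-p} = \frac{\varDelta _{down}}{\varDelta _{up}} = \frac{BLER_{T}}{100-BLER_{T}} \;\Longrightarrow \; p = \frac{BLER_{T}}{100}. \end{aligned}$$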

Intuitively, if $BLER_{T}$ is set to $10\%$, the user should successfully receive at least $90\%$ of its downlink transmissions. The OLLA process is thus formulated as

$$\begin{aligned} ITBS = ITBS(CQI) - ITBS_{margin} \end{aligned}$$

$$\begin{aligned} ITBS_{margin} = {\left\{ \begin{array}{ll} ITBS_{margin} - \varDelta _{down} &{} \text {if ACK}\\ ITBS_{margin} + \varDelta _{up} &{} \text {if NACK}\\ \end{array}\right. } \end{aligned} \tag{5}$$
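The OLLA update can be sketched in Python as follows. This is an illustrative sketch, not the implementation used in the experiments: the class name, the default step `delta_up`, and the rounding and clamping of the corrected index to the 1 to 26 range are assumptions added for the example; only the margin update and the step ratio come from the equations above.

```python
class Olla:
    """Outer loop link adaptation margin tracker (illustrative sketch)."""

    def __init__(self, bler_target_pct: float = 10.0, delta_up: float = 1.0):
        if not 0.0 < bler_target_pct < 100.0:
            raise ValueError("target BLER must be strictly between 0 and 100")
        self.delta_up = delta_up  # hypothetical step size, in ITBS units
        # Step ratio from the text: delta_down / delta_up = BLER_T / (100 - BLER_T)
        self.delta_down = delta_up * bler_target_pct / (100.0 - bler_target_pct)
        self.margin = 0.0

    def on_feedback(self, ack: bool) -> None:
        """Update ITBS_margin from one HARQ ACK (True) or NACK (False)."""
        if ack:
            self.margin -= self.delta_down
        else:
            self.margin += self.delta_up

    def itbs(self, itbs_from_cqi: int, lowest: int = 1, highest: int = 26) -> int:
        """ITBS = ITBS(CQI) - ITBS_margin, rounded and clamped (assumed)."""
        corrected = round(itbs_from_cqi - self.margin)
        return max(lowest, min(highest, corrected))
```

With a 10% target, `delta_down` is one ninth of `delta_up`, so nine ACKs are needed to undo the margin increase caused by a single NACK.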

About this paper

Cite this paper

Gradus, N., Cohen, A., Biton, E., Gurwitz, O. (2022). Reinforcement Based User Scheduling for Cellular Communications. In: Dolev, S., Katz, J., Meisels, A. (eds) Cyber Security, Cryptology, and Machine Learning. CSCML 2022. Lecture Notes in Computer Science, vol 13301. Springer, Cham. https://doi.org/10.1007/978-3-031-07689-3_15

DOI: https://doi.org/10.1007/978-3-031-07689-3_15
Published: 23 June 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-07688-6
Online ISBN: 978-3-031-07689-3
eBook Packages: Computer Science, Computer Science (R0)
