Downlink power allocation scheme for LEO satellites based on deep reinforcement learning

doi:10.7523/j.ucas.2020.0045

Abstract

Most existing satellite resource allocation schemes are designed for geosynchronous orbit satellites. To address the highly dynamic nature of low Earth orbit (LEO) satellites and their limited frequency and power resources, a power allocation algorithm based on deep reinforcement learning is proposed. First, the LEO satellite power allocation scenario is modeled, and a time slot division scheme is introduced to simplify the model of the satellite's dynamic characteristics. Then, a power allocation policy based on deep reinforcement learning is proposed; it reduces co-channel interference by adjusting the subcarrier power in each beam of a single LEO satellite, thereby improving the satellite's spectral efficiency. Simulation results show that the proposed algorithm converges and reaches a stable state in a relatively short time. With the total power held constant, the scheme effectively improves the throughput of a single LEO satellite, and its spectral efficiency is significantly higher than that of the water-filling and Q-learning algorithms.
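
To make the quantity being optimized concrete, the following is a minimal sketch of how the sum spectral efficiency of co-channel beams can be computed from a power allocation, using the standard Shannon formula on the signal-to-interference-plus-noise ratio. It is not the paper's model: the function name `spectral_efficiency`, the gain matrix layout, and the two-beam numbers are assumptions chosen for illustration only.

```python
import math


def spectral_efficiency(powers, gains, noise):
    """Sum spectral efficiency (bit/s/Hz) of beams sharing one subcarrier.

    powers[j]   : transmit power of beam j on the subcarrier (W)
    gains[i][j] : power gain from beam j to the user served by beam i
    noise       : receiver noise power (W)
    All names and the data layout are hypothetical, not taken from the paper.
    """
    total = 0.0
    n = len(powers)
    for i in range(n):
        signal = gains[i][i] * powers[i]
        # Co-channel interference: power leaking in from every other beam.
        interference = sum(gains[i][j] * powers[j] for j in range(n) if j != i)
        sinr = signal / (interference + noise)
        total += math.log2(1.0 + sinr)
    return total


# Toy two-beam example with strong cross-beam coupling (assumed values).
gains = [[1.0, 0.5],
         [0.5, 1.0]]
noise = 1.0

equal_split = spectral_efficiency([1.0, 1.0], gains, noise)
single_beam = spectral_efficiency([2.0, 0.0], gains, noise)
print(equal_split, single_beam)
```

In this toy case both allocations use the same total power, yet the one that avoids co-channel interference yields the higher sum spectral efficiency. A learning-based policy of the kind described above would search over such allocations, with a quantity like this one serving as a plausible reward signal.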

Key words: LEO satellite, spectrum efficiency, power allocation, deep reinforcement learning

CLC Number: TN927

ZHANG Huaming, LI Qiang. Downlink power allocation scheme for LEO satellites based on deep reinforcement learning[J]. Journal of University of Chinese Academy of Sciences, 2022, 39(4): 543-550.
