
Journal of University of Chinese Academy of Sciences ›› 2022, Vol. 39 ›› Issue (3): 403-409. DOI: 10.7523/j.ucas.2020.0011

• Electronic Information and Computer Science •

A low cost multi-armed bandit algorithm for dense wireless network

ZHAO Yao1,2,3, LUO Xiliang1

  1. School of Information Science and Technology, ShanghaiTech University, Shanghai 201210, China;
  2. Shanghai Institute of Microsystem and Information Technology, Chinese Academy of Sciences, Shanghai 200050, China;
  3. University of Chinese Academy of Sciences, Beijing 100049, China
  • Received: 2020-01-14 Revised: 2020-04-20 Published: 2021-05-31
  • Corresponding author: LUO Xiliang
  • Funding: Supported by the National Natural Science Foundation of China (61971286)

Abstract: Demand for mobile wireless services has grown rapidly in recent years. To meet this demand, ultra-dense wireless networks are regarded as a key infrastructure component of next-generation wireless communication networks. Dense deployment of small base stations reduces the number of users served in each cell, which in turn gives users high-speed, low-latency wireless service. An unavoidable side effect is that users trigger frequent network handovers as they repeatedly switch to the network that currently offers the best service. The user association problem is often modeled as an online learning problem. This paper seeks an efficient online user association scheme that limits the additional performance loss caused by frequent handovers. Building on an analysis of the multi-armed bandit (MAB) model, it proposes an improved algorithm based on an arm elimination strategy and demonstrates its effectiveness through rigorous theoretical analysis and numerical simulations.

Key words: online learning, user association, dense network, multi-armed bandit
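
The abstract names the technique but gives no pseudocode, so the following is only a minimal sketch of a generic, textbook-style successive elimination bandit, not the algorithm proposed in the paper. It assumes Bernoulli rewards, a Hoeffding-style confidence radius, and a batch size that doubles each round so that each surviving arm is played in one consecutive block. The function name `successive_elimination`, its parameters, and the reward model are all hypothetical. Each change of arm is counted as one handover, which illustrates why eliminating arms early can keep the number of switches small.

```python
import math
import random


def successive_elimination(means, horizon, seed=0):
    """Generic successive elimination on Bernoulli arms (illustrative only).

    means   : assumed true success probability of each arm (e.g. base station)
    horizon : total number of time slots
    Returns a dict with the committed arm, total reward, pulls, and switches.
    """
    rng = random.Random(seed)
    k = len(means)
    counts = [0] * k   # number of pulls per arm
    sums = [0.0] * k   # accumulated reward per arm
    state = {"pulls": 0, "switches": 0, "last": None, "reward": 0.0}

    def pull(arm):
        reward = 1.0 if rng.random() < means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
        state["reward"] += reward
        state["pulls"] += 1
        if state["last"] is not None and state["last"] != arm:
            state["switches"] += 1  # changing arm models one handover
        state["last"] = arm

    def mean(arm):
        return sums[arm] / counts[arm]

    def radius(arm):
        # Hoeffding-style confidence radius (an assumed, standard choice).
        return math.sqrt(2.0 * math.log(horizon) / counts[arm])

    active = list(range(k))
    batch = 1  # consecutive pulls per surviving arm; doubled every round
    while len(active) > 1 and state["pulls"] + batch * len(active) <= horizon:
        for arm in active:
            for _ in range(batch):
                pull(arm)
        # Drop every arm whose upper bound falls below the best lower bound.
        best_lower = max(mean(a) - radius(a) for a in active)
        active = [a for a in active if mean(a) + radius(a) >= best_lower]
        batch *= 2

    # Commit the remaining slots to the empirically best surviving arm.
    best = max(active, key=lambda a: mean(a) if counts[a] else 0.0)
    while state["pulls"] < horizon:
        pull(best)

    return {
        "best": best,
        "reward": state["reward"],
        "pulls": state["pulls"],
        "switches": state["switches"],
    }


if __name__ == "__main__":
    print(successive_elimination([0.9, 0.5, 0.2], horizon=20000, seed=0))
```

Because surviving arms are played in blocks and eliminated arms are never revisited, the number of switches in this sketch grows with the number of elimination rounds rather than with the horizon. How the paper's algorithm accounts for handover cost is not stated in the abstract.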
