
Journal of University of Chinese Academy of Sciences ›› 2022, Vol. 39 ›› Issue (1): 134-143. DOI: 10.7523/j.ucas.2020.0001

• Electronic Information and Computer Science •

Q-learning-based QoS routing method for flying ad hoc networks

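HUANG Xinchen1,2, CHEN Guangzu1, ZHENG Min1, TAN Chong1, LIU Hong1

  1. Shanghai Institute of Microsystem and Information Technology, Chinese Academy of Sciences, Shanghai 200050;
  2. School of Microelectronics, University of Chinese Academy of Sciences, Beijing 100049
  • Received: 2020-01-02 Revised: 2020-04-29 Published: 2021-05-31
  • Corresponding author: HUANG Xinchen
  • Funding:
    Supported by the Youth Innovation Promotion Association of the Chinese Academy of Sciences (2018269)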

Q-learning based QoS routing for high dynamic flying Ad Hoc networks

HUANG Xinchen1,2, CHEN Guangzu1, ZHENG Min1, TAN Chong1, LIU Hong1   

  1. Shanghai Institute of Microsystem and Information Technology, Chinese Academy of Sciences, Shanghai 200050, China;
  2. School of Microelectronics, University of Chinese Academy of Sciences, Beijing 100049, China
  • Received: 2020-01-02 Revised: 2020-04-29 Published: 2021-05-31

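Abstract: In highly dynamic flying ad hoc networks such as UAV ad hoc networks, rapid changes in network topology cause frequent link breakage and route reconstruction. To address this problem, a Q-learning-based QoS (quality of service) routing method is studied. Built on the Q-learning reinforcement learning framework, the method uses the number of neighbor nodes, link duration, and link available bandwidth as routing metrics, and designs a Q-learning reward function that provides QoS guarantees. Network nodes exchange their local routing metrics by broadcasting Hello messages. On receiving a Hello packet or a data packet, a neighbor node computes and updates its Q value according to the reward function, and a node with a data packet to forward selects the next-hop forwarding node from the Q-value table it maintains. Simulation results in the EXata wireless network simulation environment show that the method provides stable communication links with high quality of service for data transmission in highly dynamic flying ad hoc networks.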

Keywords: flying ad hoc networks, QoS routing, Q-learning, link available bandwidth, link duration

Abstract: In highly dynamic flying ad hoc networks (FANETs), such as UAV (unmanned aerial vehicle) ad hoc networks, rapid changes in network topology cause frequent link breakage and route reconstruction. To address this problem, a QoS (quality of service) routing method based on Q-learning is studied. Built on the basic Q-learning framework, the method takes the number of neighbor nodes, link duration, and link available bandwidth as routing metrics, and designs a Q-learning reward function that provides QoS guarantees. All nodes exchange local routing metrics with their neighbors by broadcasting Hello messages and forwarding data packets. After receiving a Hello packet or a data packet, a neighbor node calculates and updates its Q value according to the reward function. A node with data packets to forward then selects the next-hop node according to the Q-value table that it maintains. Simulation results in the EXata simulator show that the method provides stable communication links with high QoS for highly dynamic flying ad hoc networks.

Key words: FANETs, QoS routing, Q-learning, link available bandwidth, link expiration time
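
The routing loop summarized in the abstract (compute a reward from the three metrics, update the Q value, pick the next hop from the Q-value table) can be illustrated with a minimal Python sketch. The abstract does not give the reward function, the learning parameters, or the packet formats, so everything specific below is an assumption made for illustration: the weighted-sum reward, the normalization caps, the weights, and the names `LinkMetrics`, `QRouter`, `alpha`, and `gamma` are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass, field


@dataclass
class LinkMetrics:
    """Local routing metrics a neighbor advertises, e.g. in a Hello message."""
    neighbor_count: int         # number of neighbor nodes
    link_duration: float        # expected link lifetime, seconds
    available_bandwidth: float  # link available bandwidth, bit/s


# Assumed normalization caps and weights; the paper's values are not known here.
MAX_NEIGHBORS = 10
MAX_DURATION = 30.0
MAX_BANDWIDTH = 2e6
WEIGHTS = (0.2, 0.4, 0.4)


def reward(m: LinkMetrics) -> float:
    """Assumed reward: weighted sum of the three metrics, each clipped to [0, 1]."""
    parts = (
        min(m.neighbor_count / MAX_NEIGHBORS, 1.0),
        min(m.link_duration / MAX_DURATION, 1.0),
        min(m.available_bandwidth / MAX_BANDWIDTH, 1.0),
    )
    return sum(w * p for w, p in zip(WEIGHTS, parts))


@dataclass
class QRouter:
    alpha: float = 0.5  # learning rate (assumed value)
    gamma: float = 0.8  # discount factor (assumed value)
    q: dict = field(default_factory=dict)  # (destination, neighbor) -> Q value

    def update(self, dest, neighbor, metrics, neighbor_best_q):
        """Standard Q-learning update, run when a Hello or data packet arrives."""
        old = self.q.get((dest, neighbor), 0.0)
        target = reward(metrics) + self.gamma * neighbor_best_q
        self.q[(dest, neighbor)] = (1 - self.alpha) * old + self.alpha * target
        return self.q[(dest, neighbor)]

    def best_q(self, dest):
        """Largest Q value this node holds for a destination (0.0 if none)."""
        return max((v for (d, _), v in self.q.items() if d == dest), default=0.0)

    def next_hop(self, dest):
        """Greedy choice: the neighbor with the highest Q value for dest."""
        candidates = {n: v for (d, n), v in self.q.items() if d == dest}
        return max(candidates, key=candidates.get) if candidates else None
```

In this sketch a neighbor with a longer-lived, higher-bandwidth link accumulates a larger Q value and is preferred as the next hop. A real implementation would also need exploration, aging of stale entries, and the Hello exchange itself, none of which the abstract specifies.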
