
Journal of University of Chinese Academy of Sciences ›› 2024, Vol. 41 ›› Issue (3): 398-410. DOI: 10.7523/j.ucas.2023.076

• Electronic Information and Computer Science •

Cooperative traffic signal control method for multi-intersection: an approach based on spatiotemporal dependence multi-agent reinforcement learning

WANG Zhaorui, YAN Yan, ZHANG Baoxian

  1. School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing 100049, China
  • Received: 2023-06-20 Revised: 2023-09-11 Published: 2024-05-17
  • Corresponding author: ZHANG Baoxian, E-mail: Bxzhang@ucas.ac.cn
  • Supported by:
    National Key Research and Development Program of China (2018AAA0100804) and National Natural Science Foundation of China (61872331)

Abstract: As traffic congestion grows more severe, intelligent traffic signal control has become an indispensable means of improving the performance of urban road networks. This paper proposes STLight (spatiotemporal traffic light control), a multi-intersection traffic signal control method based on a multi-agent reinforcement learning algorithm that exploits spatiotemporal dependence. Through an attention-based spatiotemporal dependent module (STDM), STLight extracts spatiotemporal features from the raw traffic observations, effectively capturing the spatiotemporal dependence among intersections. Building on a multi-agent reinforcement learning algorithm that follows the centralized training with decentralized execution framework, STLight then uses the extracted features to give each agent global spatiotemporal information, further improving cooperation among the agents. Experimental results show that STLight has significant advantages in improving the performance of urban road networks and helps alleviate traffic congestion in large-scale urban road networks.

Key words: multi-agent reinforcement learning, multi-intersection traffic signal control, attention mechanism, Markov game, spatiotemporal dependence
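
The abstract describes an attention-based module that turns raw traffic observations into spatiotemporal features. As a rough illustration of that general idea only, the sketch below applies plain scaled dot-product attention first over each intersection's own observation history and then across intersections. It is not the authors' STDM: the function names, the two-stage ordering, the absence of learned projection weights, and the toy observation vectors are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, keys, values):
    """Scaled dot-product attention for a single query vector.

    Returns (context, weights), where context is the attention-weighted
    sum of `values` and weights sum to 1.
    """
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    context = [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
    return context, weights


def spatiotemporal_features(history):
    """Hypothetical feature extractor; history[t][i] is the observation
    vector of intersection i at time step t.

    Temporal stage: each intersection's latest observation attends over
    its own past observations.
    Spatial stage: each temporal summary attends over the summaries of
    all intersections.
    """
    n_intersections = len(history[0])
    temporal = []
    for i in range(n_intersections):
        own_history = [step[i] for step in history]
        summary, _ = attend(own_history[-1], own_history, own_history)
        temporal.append(summary)
    features = []
    for i in range(n_intersections):
        context, _ = attend(temporal[i], temporal, temporal)
        features.append(context)
    return features


# Toy input (assumed): 3 time steps, 2 intersections, 2 values per observation.
history = [
    [[1.0, 0.0], [0.0, 2.0]],
    [[2.0, 1.0], [1.0, 1.0]],
    [[3.0, 1.0], [0.0, 3.0]],
]
features = spatiotemporal_features(history)
```

In a multi-agent setting, each row of `features` would be the input handed to the agent controlling the corresponding intersection. A trainable version would add learned query, key, and value projections, which this sketch leaves out to stay dependency-free.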
