Close reading of a paper: deep reinforcement learning and intelligent transportation (I)
2022-07-18 06:01:00 【Demeanor 78】
Author: Song Xujie

Editor's note:
As urbanization advances and intelligent technologies emerge, transportation systems incorporate more and more artificial intelligence (AI); such systems are called intelligent transportation systems (ITS). This article mainly discusses the application of reinforcement learning (RL) in intelligent transportation systems. The excerpts below cover the material related to traffic light control.
The main text begins here.
Paper title: Deep Reinforcement Learning for Intelligent Transportation Systems: A Survey
Paper authors: Ammar Haydari, Yasin Yılmaz
As urbanization advances and intelligent technologies emerge, transportation systems incorporate more and more artificial intelligence (AI); such systems are called intelligent transportation systems (ITS). The combination of ITS and AI provides effective solutions for transportation research in the 21st century. The main application areas include traffic signal control, autonomous driving, and traffic flow control. This article mainly discusses the application of reinforcement learning (RL) in intelligent transportation systems.
In reinforcement learning (RL), at time $t$ the agent observes the system state $s_t$, takes an action $a_t$ according to its current policy $\pi$, and receives a reward $r_t$ from the environment, after which the system moves to the next state $s_{t+1}$. In every round of interaction, the agent updates the knowledge it has learned from the environment. The following figure shows the RL learning process.

RL methods fall into the following two categories:
Model-based RL: the agent knows the probability of transitioning from state $s_t$ to state $s_{t+1}$.
Model-free RL: the agent does not know the transition probabilities and learns by exploring the environment. Model-free RL can be further divided into value-based and policy-based methods.
Solving traffic signal control (TSC) with RL requires careful design of the following components:
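To make the interaction loop and the model-free, value-based case concrete, here is a minimal sketch of tabular Q-learning. The `ToyIntersection` environment, its arrival probability, and all hyperparameters are hypothetical and chosen only for illustration; they do not come from the survey.

```python
import random


class ToyIntersection:
    """Hypothetical two-approach intersection, for illustration only.

    State: (queue_ns, queue_ew, green), with queues capped at MAX_QUEUE.
    Action: 0 = keep the current green, 1 = switch to the other approach.
    """
    MAX_QUEUE = 5

    def __init__(self, seed=0):
        self.rng = random.Random(seed)
        self.reset()

    def reset(self):
        self.queues = [0, 0]
        self.green = 0
        return self._state()

    def _state(self):
        return (self.queues[0], self.queues[1], self.green)

    def step(self, action):
        if action == 1:
            self.green = 1 - self.green
        # One vehicle leaves the approach that currently has green.
        if self.queues[self.green] > 0:
            self.queues[self.green] -= 1
        # Each approach receives a new vehicle with probability 0.3 (assumed).
        for i in range(2):
            if self.rng.random() < 0.3:
                self.queues[i] = min(self.MAX_QUEUE, self.queues[i] + 1)
        reward = -sum(self.queues)  # shorter queues give a higher reward
        return self._state(), reward


def q_learning(env, steps=5000, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Model-free, value-based RL: learn Q(s, a) from sampled transitions only."""
    rng = random.Random(seed)
    q = {}  # maps (state, action) to the estimated return
    s = env.reset()
    for _ in range(steps):
        # Epsilon-greedy policy derived from the current Q estimates.
        if rng.random() < epsilon:
            a = rng.randrange(2)
        else:
            a = max((0, 1), key=lambda x: q.get((s, x), 0.0))
        s_next, r = env.step(a)  # observe r_t and s_{t+1}
        best_next = max(q.get((s_next, x), 0.0) for x in (0, 1))
        old = q.get((s, a), 0.0)
        q[(s, a)] = old + alpha * (r + gamma * best_next - old)
        s = s_next
    return q
```

The agent never reads the arrival probability inside the environment; it only sees states and rewards, which is what distinguishes the model-free setting from the model-based one.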
01
State
(1) Discrete traffic state encoding (DTSE): an image-like state representation. Each lane is divided into $m$ cells, each long enough to hold one vehicle, and the values in the cells represent the current state of the intersection. This representation is fine-grained, but the data it requires is relatively hard to collect.

(2) Feature-based value vector: this approach puts the average or total of selected quantities into a vector, for example queue length, waiting time, average speed, and number of vehicles. This information can usually be collected easily by roadside sensors.
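The two state representations can be sketched as follows. This is only an illustration: the function names, the 7.5 m cell length, the stop-speed threshold, and the choice of a 0/1 occupancy value per cell are assumptions, not details taken from the survey.

```python
def dtse_encode(vehicle_positions_m, lane_length_m, cell_length_m=7.5):
    """DTSE sketch: 0/1 occupancy per cell for a single lane.

    vehicle_positions_m holds each vehicle's distance (in metres) from the
    start of the lane. The cell length of 7.5 m is an assumed value.
    """
    n_cells = int(lane_length_m // cell_length_m)
    cells = [0] * n_cells
    for pos in vehicle_positions_m:
        idx = int(pos // cell_length_m)
        if 0 <= idx < n_cells:
            cells[idx] = 1
    return cells


def feature_vector(vehicles, stop_speed_mps=0.1):
    """Feature-based sketch: [queue length, total waiting time, mean speed, count].

    vehicles is a list of (speed_mps, waiting_time_s) tuples for one approach.
    A vehicle slower than stop_speed_mps is counted as queued (assumed rule).
    """
    n = len(vehicles)
    queue = sum(1 for speed, _ in vehicles if speed < stop_speed_mps)
    total_wait = sum(wait for _, wait in vehicles)
    mean_speed = sum(speed for speed, _ in vehicles) / n if n else 0.0
    return [queue, total_wait, mean_speed, n]


print(dtse_encode([1.0, 16.0, 29.9], lane_length_m=30.0))  # [1, 0, 1, 1]
print(feature_vector([(0.0, 12.0), (5.0, 0.0)]))           # [1, 12.0, 2.5, 2]
```

Stacking one DTSE row per lane gives the image-like input, while the feature vector stays small and matches what roadside sensors typically report.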
02
Action
(1) Select a phase: an intersection usually has 4 green phases, and the first type of action selects one of them and executes it.
(2) Choose whether to change the current phase: the second type of action is a binary (0/1) variable. In this case the phase order is usually fixed, and the action decides whether to leave the current phase and move on to the next one.
(3) Change the duration of the current phase: the third type of action lies in a continuous action space and specifies the duration of the current phase.
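A rough sketch of how the three action definitions map onto a signal controller is given below. The function names and the green-time bounds of 5 s and 60 s are hypothetical; only the count of four phases comes from the text above.

```python
N_PHASES = 4  # the typical number of green phases mentioned in the text


def apply_select_phase(action):
    """(1) The action is the index of the phase to run next."""
    if not 0 <= action < N_PHASES:
        raise ValueError("phase index out of range")
    return action


def apply_keep_or_switch(current_phase, action):
    """(2) The action is 0 (keep) or 1 (advance to the next phase in a fixed cycle)."""
    if action not in (0, 1):
        raise ValueError("action must be 0 or 1")
    return (current_phase + action) % N_PHASES


def apply_duration(action_s, min_green_s=5.0, max_green_s=60.0):
    """(3) The action is a continuous green duration, clipped to assumed bounds."""
    return max(min_green_s, min(max_green_s, float(action_s)))


print(apply_select_phase(2))       # 2
print(apply_keep_or_switch(3, 1))  # 0, the cycle wraps around
print(apply_duration(100))         # 60.0
```

The first two definitions give discrete action spaces of size 4 and 2, which suit value-based methods; the third is continuous and is a more natural fit for policy-based methods.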
03
Reward
Common rewards include waiting time, cumulative delay, queue length, and combinations of these.
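One common way to combine these quantities is a negative weighted sum, so that less congestion means a higher reward. The sketch below assumes that form; the function name and the weights are hypothetical and would need tuning in practice.

```python
def combined_reward(waiting_time_s, cumulative_delay_s, queue_length,
                    w_wait=1.0, w_delay=0.0, w_queue=0.0):
    """Negative weighted sum of congestion measures (assumed reward shape)."""
    cost = (w_wait * waiting_time_s
            + w_delay * cumulative_delay_s
            + w_queue * queue_length)
    return -cost


# 0.5 * 10 + 0.25 * 4 + 1.0 * 3 = 9, so the reward is -9.0
print(combined_reward(10.0, 4.0, 3, w_wait=0.5, w_delay=0.25, w_queue=1.0))
```

Setting two of the three weights to zero recovers a single-measure reward such as pure waiting time.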
04
Neural network structure (neural network structure)
Common neural network structures include:
(1) MLP: the multilayer perceptron, a standard fully connected neural network.
(2) CNN: can be viewed as an MLP with convolution kernels; it is strong at mapping images to outputs.
(3) ResNet: a network structure used to address the overfitting problem of CNN-type networks.
(4) GCN: the graph convolutional network, which performs convolution on graph structures.
(5) RNN: the recurrent neural network, for example the LSTM network; it is often used to process sequence data.
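As an illustration of the simplest option, the sketch below runs the forward pass of a small MLP that maps a feature-based state vector to one value per phase. It uses plain Python so that it is self-contained; the layer sizes, the random initialization, and the ReLU activation are assumptions, and a real implementation would normally use a deep learning framework and train the weights.

```python
import random


def init_mlp(layer_sizes, seed=0):
    """Create randomly initialized (weights, biases) pairs for each layer."""
    rng = random.Random(seed)
    params = []
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_out)]
        biases = [0.0] * n_out
        params.append((weights, biases))
    return params


def mlp_forward(params, x):
    """Fully connected forward pass with ReLU on every layer except the last."""
    for i, (weights, biases) in enumerate(params):
        x = [sum(w * v for w, v in zip(row, x)) + b
             for row, b in zip(weights, biases)]
        if i < len(params) - 1:
            x = [max(0.0, v) for v in x]
    return x


# 4 state features in, 8 hidden units, 4 outputs (one value per green phase).
params = init_mlp([4, 8, 4])
q_values = mlp_forward(params, [1, 12.0, 2.5, 2])
greedy_phase = max(range(len(q_values)), key=q_values.__getitem__)
```

With an image-like DTSE input, the first layers would typically be replaced by convolutions, which is where the CNN option comes in.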
05
Simulation environment (traffic simulator)
(1) Green Light District (GLD): an early traffic simulator developed in Java; many RL studies have used it.
(2) Simulation of Urban MObility (SUMO): one of the most popular open-source simulators; users can interact with the environment from Python through the TraCI interface library.
(3) AIMSUN: a commercial simulator designed by Transport Simulation Systems (Spain).
(4) Paramics: a commercial simulator designed by Quadstone Paramics (UK).
(5) VISSIM: a simulator similar to AIMSUN, with a MATLAB interface.
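For SUMO, a control loop over TraCI might look roughly like the sketch below. The function `run_episode`, the traffic light ID `"tls0"`, and the config file name are hypothetical. The loop takes the TraCI module as an argument, so it can also be exercised with a stand-in object. It sets phases directly and ignores yellow and all-red transitions, which a real controller would have to handle.

```python
def run_episode(conn, tls_id, choose_phase, n_decisions=10, steps_per_decision=5):
    """Run one control episode and return the accumulated reward.

    conn is the imported `traci` module (or an object with the same calls).
    choose_phase is any function mapping the state list to a phase index.
    """
    # getControlledLanes may list a lane more than once, so deduplicate.
    lanes = sorted(set(conn.trafficlight.getControlledLanes(tls_id)))
    total_reward = 0.0
    for _ in range(n_decisions):
        # State: number of halted vehicles on each controlled lane.
        state = [conn.lane.getLastStepHaltingNumber(lane) for lane in lanes]
        conn.trafficlight.setPhase(tls_id, choose_phase(state))
        for _ in range(steps_per_decision):
            conn.simulationStep()
        # Reward: negative total queue length after the action has taken effect.
        total_reward -= sum(conn.lane.getLastStepHaltingNumber(lane)
                            for lane in lanes)
    return total_reward


# Usage with a real SUMO installation (file name and ID are placeholders):
# import traci
# traci.start(["sumo", "-c", "intersection.sumocfg"])
# print(run_episode(traci, "tls0", lambda state: 0))
# traci.close()
```

Replacing the constant `lambda state: 0` with a learned policy turns this loop into the environment side of the RL setup described earlier.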

Reference:
Haydari A, Yilmaz Y. Deep reinforcement learning for intelligent transportation systems: A survey[J]. IEEE Transactions on Intelligent Transportation Systems, 2020.
