Reinforcement Learning Decision-Making for Autonomous Vehicles Based on Semantic Segmentation

Abstract: In complex and stochastic traffic flow, safe driving requires improvements in both perception and decision-making. This paper proposes a decision-control method that combines the scene perception and understanding capabilities of semantic segmentation networks with the stable convergence of Deep Reinforcement Learning algorithms to achieve more accurate and effective autonomous driving decision-control. Perception features obtained from cameras and sensors equipped with a semantic segmentation model are used as input for the intelligent agent, and Deep Reinforcement Learning algorithms update decisions based on reward feedback. Experimental results on the CARLA simulation platform demonstrate that the semantic segmentation network effectively identifies obstacles, vehicles, and drivable areas, providing high-quality perception input for the agent's decision-making model. Compared to the original algorithms, the proposed Double Deep Q-Network-Semantic Segmentation and Proximal Policy Optimization-Semantic Segmentation increase the reward value by approximately twenty-five percent and enhance driving stability by fourteen point two percent and twenty-eight point five percent, respectively, enabling more stable and precise decision-control during driving. The proposed method thus improves the decision-control performance of Proximal Policy Optimization and Double Deep Q-Network in complex scenarios.

One. Introduction

In the development of autonomous driving technology, safety is the primary consideration. Decision-control, as one of the core modules of autonomous driving, serves as a bridge between perception and execution, directly influencing the vehicle's behavior decisions and safety in complex traffic environments. Proving that autonomous vehicles are safer than human drivers would require approximately five billion miles of road testing, which would consume substantial human, material, and time resources. Simulation experiments not only save these resources but also allow various complex and dangerous traffic scenarios to be reproduced in a virtual environment. Powerful perception and understanding capabilities provide accurate environmental information for simulation-based decision-control, thereby supporting complex decision-making processes. Within this process, semantic segmentation networks are an important means of achieving environmental understanding.

Semantic segmentation, one of the technologies used in intelligent driving perception and advanced driver-assistance systems, classifies the environmental elements in a scene image into category labels. Compared to single-object recognition and plain image segmentation, scene semantic segmentation provides fine-grained, high-level semantic information for subsequent scene analysis and visual understanding. This helps autonomous vehicles simplify the extraction of environmental features and understand complex scenes during driving decision-making. Applying semantic segmentation to autonomous driving requires models that balance efficiency and accuracy. One video prediction-based method expanded the training set by synthesizing new training samples to improve the accuracy of semantic segmentation networks, using the ability of video prediction models to predict future frames in order to generate future labels. Another study proposed a lightweight semantic segmentation design, the LMBA convolutional unit, which enabled efficient feature extraction and helped semantic segmentation networks achieve better perception and recognition in autonomous driving scenarios. However, although these studies improved segmentation accuracy in specific scenarios, they did not thoroughly explore its application to perception and recognition in complex road environments, nor did they comprehensively evaluate its practicality and stability.

Common perception modules typically rely on raw images or on object detection with bounding boxes, both of which are limited in efficiency and semantic richness. Raw images have high dimensionality, making feature extraction computationally expensive and training slow to converge. Bounding box methods identify the position and category of objects but fail to capture the detailed spatial and semantic information needed for complex decision-making tasks in autonomous driving. Semantic segmentation addresses these gaps by providing pixel-level classification of the entire scene. This allows precise identification of object contours and separates drivable and non-drivable areas, pedestrians, vehicles, and other key scene elements. Compared to bounding box methods, semantic segmentation provides richer global scene understanding, reduces input complexity, and makes decisions more robust, especially in dynamic and complex environments.
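
As an illustration of how pixel-level labels can reduce the agent's input, the following minimal sketch converts a small label map into per-class binary channels and per-class area fractions. The class IDs, the helper names `one_hot_channels` and `class_fractions`, and the toy four-by-four map are assumptions made for this example; they are not taken from the paper or from any simulator's label palette.

```python
from collections import Counter

# Hypothetical class IDs, chosen only for this example.
ROAD, VEHICLE, PEDESTRIAN, OTHER = 0, 1, 2, 3
NUM_CLASSES = 4


def one_hot_channels(label_map, num_classes=NUM_CLASSES):
    """Turn an H x W grid of class IDs into num_classes binary H x W grids."""
    return [
        [[1 if label == c else 0 for label in row] for row in label_map]
        for c in range(num_classes)
    ]


def class_fractions(label_map, num_classes=NUM_CLASSES):
    """Share of pixels belonging to each class, a very compact scene summary."""
    counts = Counter(label for row in label_map for label in row)
    total = sum(len(row) for row in label_map)
    return [counts[c] / total for c in range(num_classes)]


# Toy segmentation output: background on top, road below,
# one vehicle ahead and one pedestrian near the bottom.
label_map = [
    [OTHER, OTHER, OTHER, OTHER],
    [ROAD, VEHICLE, VEHICLE, ROAD],
    [ROAD, ROAD, ROAD, ROAD],
    [ROAD, ROAD, PEDESTRIAN, ROAD],
]

channels = one_hot_channels(label_map)
fractions = class_fractions(label_map)
print(fractions)  # [0.5625, 0.125, 0.0625, 0.25]
```

Each pixel carries one class ID instead of three color values, and the drivable area is simply the `ROAD` channel. An agent could consume the stacked channels directly or only the summary fractions, depending on how much spatial detail the task needs.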

Traditional autonomous driving systems were rule-based. Because scene structure is complex, constructing rules was challenging, and it was difficult to form complete and effective decision models. Vehicles therefore relied primarily on advanced driver-assistance systems, which actively improve safety by assisting with acceleration, deceleration, and steering control to reduce accidents and injuries. One example is a car-following model that behaves in a human-like manner, using speed, relative speed, and vehicle distance. However, advanced driver-assistance systems could only be applied to relatively simple scenarios, such as highways. With the development of artificial intelligence in autonomous driving, scene understanding and decision-making in complex scenarios came to be handled by neural networks, and traditional advanced driver-assistance systems gave way to decision-control methods based on deep reinforcement learning agents. Reinforcement learning decision algorithms have the advantages of a simple structure and excellent performance. Some studies combined deep reinforcement learning with autonomous driving to achieve advanced decision-making in simulation tests. One framework combined modular approaches with deep reinforcement learning to generate strategies for urban driving tasks such as vehicle following and intersection navigation. Another study proposed an end-to-end autonomous driving decision-making method based on an improved TD3 algorithm. This method used forward-facing cameras to capture data and introduced a new critic network to form a triple-critic structure, combining it with target-maximization operations to address the underestimation problem in the TD3 algorithm.

Another study used forward-facing camera images, current vehicle speed, and steering angles as inputs to train lane-keeping strategies, demonstrating the applicability of deep reinforcement learning in real-world autonomous driving scenarios. A further study introduced a method for safely navigating autonomous vehicles on highways by combining insights from deep Q-networks and control theory. The deep Q-network was trained in simulation and acted as the central decision-making unit by setting targets for the trajectory planner.

However, traditional rule-based autonomous driving systems have not been able to construct effective decision models for complex road conditions, making it difficult to meet decision-making demands in urban traffic scenarios. Because they rely on predefined logic, lack semantic understanding, and are inflexible, rule-based methods are inherently limited in complex and dynamic environments: they often fail to generalize to new scenarios and are sensitive to noise and edge cases. Other reinforcement learning studies have mostly focused on the comparative evaluation of deep reinforcement learning algorithm strategies, which has limited decision-making accuracy. Given this, this paper integrates semantic segmentation with a deep reinforcement learning decision model to propose a decision-control method. It uses semantic segmentation to enhance perception performance, thereby helping the deep reinforcement learning algorithm capitalize on its advantages in stable strategy updates and in balancing short-term and long-term rewards, ultimately strengthening autonomous driving decision-making in complex urban traffic environments. Semantic segmentation provides pixel-level classification of the environment, offering detailed semantic information about drivable areas, obstacles, and dynamic objects. This allows a more comprehensive understanding of the scene and enables the reinforcement learning agent to perceive and respond effectively to dynamic and unpredictable situations.

This study generates high-quality perception data through semantic segmentation, which simplifies the state space and significantly improves the efficiency and performance of the reinforcement learning algorithms. Enhancing environmental perception with the semantic segmentation network and tightly integrating it with the decision-making control of the reinforcement learning algorithm yields a self-driving decision-control framework. This framework not only improves perception accuracy but also reduces the interference of low-quality perception data in the decision-making process, significantly enhancing the system's robustness and adaptability in complex urban driving environments.
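
To make concrete how reward feedback can update decisions once a state has been built from perception features, the sketch below computes the standard Double Deep Q-Network learning target for a single transition. It shows the textbook form of the target only; the function name `double_dqn_target`, the discount factor, and the example Q-values are assumptions for illustration, and the configuration used in this paper may differ.

```python
def double_dqn_target(reward, next_q_online, next_q_target, done, gamma=0.99):
    """Textbook Double DQN target for one transition.

    next_q_online: Q-values of the next state from the online network.
    next_q_target: Q-values of the next state from the target network.
    The online network selects the action and the target network scores it,
    which is what separates Double DQN from plain DQN.
    """
    if done:
        # No future return after a terminal state.
        return reward
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    return reward + gamma * next_q_target[best_action]


# Illustrative numbers only: three discrete driving actions.
target = double_dqn_target(
    reward=1.0,
    next_q_online=[0.2, 0.9, 0.5],   # online network prefers action 1
    next_q_target=[0.3, 0.4, 0.8],   # target network scores action 1 at 0.4
    done=False,
    gamma=0.9,
)
print(round(target, 2))  # 1.36
```

With the same numbers, plain DQN would take the maximum of the target network's values (0.8) and obtain roughly 1.72, which illustrates the overestimation that decoupling action selection from action evaluation is meant to reduce.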
