To safely navigate intricate real-world scenarios, autonomous vehicles (AVs) must be able to adapt to diverse road conditions and anticipate future events. World model based reinforcement learning (RL) has emerged as a promising approach by learning and predicting the complex dynamics of various environments. Nevertheless, to the best of our knowledge, there does not exist an open-source platform for training and testing such algorithms in complicated driving environments.
To fill this void, we introduce CarDreamer, the first open-source learning platform designed specifically for developing and evaluating world model based autonomous driving algorithms. It comprises three key components: 1) World model (WM) backbone: CarDreamer integrates several state-of-the-art world models, which simplifies the reproduction of RL algorithms. 2) Built-in tasks: CarDreamer offers a comprehensive set of highly configurable driving tasks that are compatible with Gym interfaces and equipped with empirically optimized reward functions. 3) Task development suite: CarDreamer provides a flexible task development suite that streamlines the creation of driving tasks, enabling easy definition of traffic flows and vehicle routes along with automatic collection of multi-modal observation data. Furthermore, we conduct extensive experiments using the built-in tasks to evaluate the performance and potential of WMs in autonomous driving. Thanks to the richness and flexibility of CarDreamer, we also systematically study the impact of observation modality, observability, and sharing of vehicle intentions on AV safety and efficiency. All code and documents are accessible on our GitHub page.
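Because built-in tasks follow the standard Gym interface, training code interacts with them through the usual reset/step loop. The sketch below illustrates that contract with a self-contained toy environment; the class name, observation keys, and reward shaping are illustrative stand-ins, not CarDreamer's actual API.

```python
# Minimal sketch of the Gym-style reset/step contract that built-in tasks follow.
# ToyDrivingTask and its observation keys are hypothetical stand-ins.

class ToyDrivingTask:
    """A stand-in environment mimicking the Gym interface."""

    def __init__(self, max_steps=100):
        self.max_steps = max_steps
        self.t = 0

    def reset(self):
        """Start a new episode and return the initial observation."""
        self.t = 0
        return {"speed": 0.0, "lane_offset": 0.0}  # multi-modal obs as a dict

    def step(self, action):
        """Advance one step; return (obs, reward, done, info) as in Gym."""
        self.t += 1
        throttle, steer = action
        obs = {"speed": max(0.0, throttle * 10.0), "lane_offset": steer}
        # Toy reward: encourage speed, penalize steering away from the lane.
        reward = obs["speed"] * 0.1 - abs(steer)
        done = self.t >= self.max_steps
        return obs, reward, done, {}


env = ToyDrivingTask(max_steps=3)
obs = env.reset()
total, done = 0.0, False
while not done:
    obs, reward, done, info = env.step((1.0, 0.0))
    total += reward
```

Because the tasks expose this interface, any Gym-compatible RL agent can be dropped in without task-specific glue code.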
A world model "imagines" future states based on current and historical information. This endows the agent with predictive foresight and high sample efficiency.
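"Imagination" amounts to rolling out a learned latent dynamics model instead of the real environment. As a minimal sketch, the snippet below uses a fixed linear transition z' = Az + Ba as a stand-in for the learned recurrent state-space model; the matrices, dimensions, and function names are assumptions for illustration only.

```python
import numpy as np

# A fixed linear latent dynamics model stands in for the learned
# recurrent state-space model; real world models learn A and B from data.
rng = np.random.default_rng(0)
A = np.eye(4) * 0.9            # latent transition (hypothetical)
B = rng.normal(size=(4, 2)) * 0.1  # action effect on latent (hypothetical)

def imagine(z0, actions):
    """Roll out predicted latent states without querying the real environment."""
    z, trajectory = z0, []
    for a in actions:
        z = A @ z + B @ a  # predict the next latent from state and action
        trajectory.append(z)
    return trajectory

z0 = np.zeros(4)
actions = [np.array([1.0, 0.0])] * 5
trajectory = imagine(z0, actions)
```

Training on such imagined trajectories is what yields the sample efficiency noted above: many policy updates can be made per real environment interaction.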
CarDreamer provides a comprehensive set of driving tasks, such as lane changing and overtaking. These tasks allow extensive customization in terms of difficulty, observability, observation modalities, and communication of vehicle intentions.
This table reports performance metrics, including success rate, collision rate, and average speed, across different driving tasks.
This table shows the effect of different observability settings (FOV, SFOV, Full) on task success, collision rates, and speed.
This table compares how different levels of transmission error affect the agent’s success rate, collision rate, and average speed when sharing vehicle intentions.
We stream observations, rewards, terminal conditions, and custom metrics to a web browser for each training session in real time, making it easier to engineer rewards and debug.
@ARTICLE{10714437,
author={Gao, Dechen and Cai, Shuangyu and Zhou, Hanchu and Wang, Hang and Soltani, Iman and Zhang, Junshan},
journal={IEEE Internet of Things Journal},
title={CarDreamer: Open-Source Learning Platform for World Model Based Autonomous Driving},
year={2024},
volume={},
number={},
pages={1-1},
keywords={Autonomous Driving;Reinforcement Learning;World Model},
doi={10.1109/JIOT.2024.3479088}
}