This work presents a novel Learning Model Predictive Control (LMPC) strategy for autonomous racing at the limit of handling that iteratively explores and learns unknown dynamics in high-speed operational domains. We start from existing LMPC formulations and modify the system dynamics learning method. In particular, our approach combines a nominal, global, nonlinear, physics-based model with local, linear, data-driven learning of the error dynamics. We conducted experiments in simulation and on 1/10th-scale hardware, and deployed the proposed LMPC on a full-scale autonomous race car used in the Indy Autonomous Challenge (IAC), with closed-loop experiments at the Putnam Park Road Course in Indiana, USA.
The results show that the proposed control policy exhibits improved robustness to parameter tuning and data scarcity. Incremental, safety-aware exploration toward the limit of handling and iterative learning of the vehicle dynamics in high-speed domains are observed in both simulation and experiments.
Learning MPC combines some of the best aspects of traditional and data-driven optimal control. On one side, we have a strong prior in the form of a physics-based model. On the other side, we have the flexibility of data-driven methods to learn the modeling error. This is complemented by the safety and iterative performance-improvement guarantees of the LMPC framework.
Naturally, some learning tasks benefit more from LMPC than others. These are usually safety-critical or agile tasks in the real world. For example, autonomous racing, where the vehicle operates at the limit of handling, is an interesting candidate for LMPC.
Learning MPC is a data-driven MPC method that uses previous experience to iteratively improve the control policy. It can also be augmented with regression to learn the dynamics of the system.
Previous iterations of states and actions are stored in the sampled Safe Set (SS). Assuming a time-invariant environment, returning to any state in the SS guarantees that the task can still be completed; this is the control-invariance property of the SS. We therefore constrain the last state in the MPC horizon to lie in the SS, which yields an iterative and safe exploration behavior around the SS. With an appropriate cost function, the control policy converges toward the optimal policy.
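For concreteness, here is a generic LMPC formulation at iteration j, following the standard notation from the LMPC literature (not necessarily our exact implementation): h is the stage cost, Q^{j-1} is the terminal cost-to-go learned from previous iterations, and the terminal constraint on x_N is what makes exploration safe.

```latex
\begin{aligned}
\min_{u_0, \dots, u_{N-1}} \quad & \sum_{k=0}^{N-1} h(x_k, u_k) + Q^{j-1}(x_N) \\
\text{s.t.} \quad & x_{k+1} = f(x_k, u_k), \qquad x_0 = x(t), \\
& x_k \in \mathcal{X}, \quad u_k \in \mathcal{U}, \\
& x_N \in \mathcal{SS}^{j-1}
\end{aligned}
```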
We can query the SS to obtain states and actions neighboring the robot's current state, then use this data to learn the system dynamics by fitting a local linear model, an approach taken in previous work and sketched below.
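As a minimal sketch of this full-regression idea (the function and variable names below are illustrative, not the released code), assuming the safe set is stored as arrays of states, inputs, and successor states:

```python
import numpy as np

def local_linear_model(ss_states, ss_inputs, ss_next, x_query, k=20):
    """Full-regression variant: fit x+ ~ A x + B u + C to the k stored
    safe-set samples nearest to the query state."""
    # k-nearest neighbors of the current state in the safe set
    d2 = np.sum((ss_states - x_query) ** 2, axis=1)
    idx = np.argsort(d2)[:k]
    X, U, Xn = ss_states[idx], ss_inputs[idx], ss_next[idx]

    # Ordinary least squares on the affine features [x, u, 1]
    Z = np.hstack([X, U, np.ones((len(X), 1))])
    Theta, *_ = np.linalg.lstsq(Z, Xn, rcond=None)

    n_x, n_u = X.shape[1], U.shape[1]
    A = Theta[:n_x].T            # (n_x, n_x)
    B = Theta[n_x:n_x + n_u].T   # (n_x, n_u)
    C = Theta[-1]                # (n_x,)
    return A, B, C
```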
Instead of doing a full regression to learn the entire dynamics, this work introduces a residual learning approach. The motivation is to leverage a prior model, either physics-based or learned in simulation. For the latter, this method helps bridge the Sim2Real gap. We hypothesize that this approach is more data-efficient and more robust to hyperparameter tuning.
To formulate the A, B, and C matrices for the DDP-based MPC, we first linearize the prior model to obtain a baseline prediction of these matrices. We then query the SS and estimate the residual dynamics by solving a weighted least-squares problem; the result is added on top of the prior prediction.
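A sketch of this error-regression step under the same illustrative notation as above, where f is the nominal model and jac_f returns its Jacobians (both hypothetical helpers, not the paper's exact code):

```python
import numpy as np

def error_dynamics_model(f, jac_f, x_bar, u_bar, X, U, Xn, sigma=1.0):
    """Error-regression variant: linearize the nominal model f at the
    reference point (x_bar, u_bar), then fit only the residual dynamics
    to nearby safe-set samples (X, U, Xn) by weighted least squares."""
    # Prior affine model from a first-order Taylor expansion of f
    A0, B0 = jac_f(x_bar, u_bar)
    C0 = f(x_bar, u_bar) - A0 @ x_bar - B0 @ u_bar

    # Residuals: what the nominal model fails to predict
    E = Xn - np.array([f(x, u) for x, u in zip(X, U)])

    # Kernel weights: samples closer to the reference count more
    w = np.exp(-np.sum((X - x_bar) ** 2, axis=1) / (2.0 * sigma ** 2))
    sw = np.sqrt(w)[:, None]

    # Weighted least squares on the affine features [x, u, 1]
    Z = np.hstack([X, U, np.ones((len(X), 1))])
    Theta, *_ = np.linalg.lstsq(sw * Z, sw * E, rcond=None)

    n_x, n_u = X.shape[1], U.shape[1]
    dA, dB, dC = Theta[:n_x].T, Theta[n_x:n_x + n_u].T, Theta[-1]
    return A0 + dA, B0 + dB, C0 + dC
```

Because the prior model already explains most of the motion, the regression only has to capture a small correction, which is why this variant can get away with fewer data points and looser weighting hyperparameters.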
Small-scale experiments on F1TENTH race cars show that the proposed error-regression LMPC can learn the vehicle dynamics and converge to the optimal policy. Compared with full-regression LMPC, error-regression LMPC is more robust to hyperparameter tuning and data scarcity. The plot below compares the failure rates of the two methods: with suboptimal hyperparameters, full-regression LMPC fails 5 out of 10 trials, while error-regression LMPC fails only 1 out of 10.
The plot below visualizes the learning progress at the 1st, 5th, and 20th iterations. The vehicle starts from a few non-expert driving demonstrations and gradually learns to drive faster and more aggressively, eventually operating at the limit of handling and converging to a consistent lap time.
We also conducted full-scale experiments on professional autonomous race cars in the Indy Autonomous Challenge. These autonomous race cars are capable of driving at up to 340 km/h (around 210 mph).
The series of works on Learning MPC shows that iterative, data-driven control can safely push autonomous systems toward the limits of their performance.
Check out previous work on Learning MPC at UC Berkeley's MPC Lab.
For exciting autonomous racing clips, check out Haoru's YouTube channel.
Learn more about the collaborative autonomous racing research between UC Berkeley, CMU, and UC San Diego at AI Racing Tech.
@misc{xue2024learning,
  title={Learning Model Predictive Control with Error Dynamics Regression for Autonomous Racing},
  author={Haoru Xue and Edward L. Zhu and John M. Dolan and Francesco Borrelli},
  year={2024},
  eprint={2309.10716},
  archivePrefix={arXiv},
  primaryClass={cs.RO}
}
Some cool autonomous race car videos!