Joint Differentiable Optimization and Verification for Certified Reinforcement Learning
Published in ACM/IEEE International Conference on Cyber-Physical Systems (ICCPS), 2023
This paper presents a framework that conducts reinforcement learning and formal verification jointly by formulating and solving a bilevel optimization problem. The problem is differentiable end to end: gradients flow from the value function and from certificates formulated as linear programs and semidefinite programs, which enables learning policies whose safety is formally certified.
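For readers unfamiliar with the term, a generic bilevel problem and the gradient that makes it differentiable can be sketched as follows. This is a textbook-style illustration, not the paper's formulation: the symbols $\theta$ (policy parameters), $c$ (certificate variables), $F$ (outer objective), $G$ (inner objective), and $\mathcal{C}(\theta)$ (inner feasible set) are all assumed here for illustration.

```latex
% Illustrative notation only; none of these symbols are taken from the paper.
% Outer level: choose policy parameters \theta.
% Inner level: solve a certificate problem (for example an LP or SDP) that depends on \theta.
\begin{aligned}
\min_{\theta}\quad & F\bigl(\theta,\, c^{*}(\theta)\bigr) \\
\text{s.t.}\quad & c^{*}(\theta) \in \arg\min_{c \,\in\, \mathcal{C}(\theta)} G(\theta, c)
\end{aligned}

% Assuming the inner solution c^{*}(\theta) is unique and differentiable in \theta,
% the chain rule gives the total gradient used to update \theta:
\nabla_{\theta} F\bigl(\theta, c^{*}(\theta)\bigr)
  = \frac{\partial F}{\partial \theta}
  + \left(\frac{\partial c^{*}}{\partial \theta}\right)^{\!\top}
    \frac{\partial F}{\partial c}\bigg|_{c = c^{*}(\theta)}
```

The second term is what couples the two levels: it carries information from the inner solution back into the outer update, which is the general sense in which such a pipeline is differentiable end to end.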
Authors: Yixuan Wang, Simon Sinong Zhan, Zhilu Wang, Chao Huang, Zhaoran Wang, Zhuoran Yang, Qi Zhu (*equal contribution)
Citation
@inproceedings{wang2023joint,
  title={Joint differentiable optimization and verification for certified reinforcement learning},
  author={Wang, Yixuan and Zhan, Simon and Wang, Zhilu and Huang, Chao and Wang, Zhaoran and Yang, Zhuoran and Zhu, Qi},
  booktitle={Proceedings of the ACM/IEEE 14th International Conference on Cyber-Physical Systems (with CPS-IoT Week 2023)},
  pages={132--141},
  year={2023}
}