State-wise Safe Reinforcement Learning With Pixel Observations

Published in Learning for Dynamics and Control Conference (L4DC), 2024

This paper addresses the challenge of safe reinforcement learning when only pixel observations are available. We propose a novel algorithm that efficiently encodes state-wise safety constraints with unknown hazard regions through a latent barrier function learning mechanism. The approach enables safe policy learning directly from high-dimensional visual observations without requiring explicit state representations.

Authors: Simon Sinong Zhan, Yixuan Wang, Qingyuan Wu, Ruochen Jiao, Chao Huang, Qi Zhu

Citation

@inproceedings{zhan2024statewise,
  title={State-wise Safe Reinforcement Learning With Pixel Observations},
  author={Zhan, Sinong Simon and Wang, Yixuan and Wu, Qingyuan and Jiao, Ruochen and Huang, Chao and Zhu, Qi},
  booktitle={Learning for Dynamics and Control Conference (L4DC)},
  year={2024},
  url={https://arxiv.org/abs/2311.02227}
}