Sinong (Simon) Zhan

I am a first-year PhD student in the ECE department at Northwestern University, advised by Qi Zhu, and I also work closely with Zhaoran Wang and Chao Huang. Before Northwestern, I did my undergraduate studies in Applied Math and Computer Science at UC Berkeley, where I was advised by Sanjit A. Seshia. I also have experience in ubiquitous computing and novel sensing, where I was fortunate to be advised by Xing-Dong Yang, Teng Han, and Tian Feng.

I'm interested in combining techniques from machine learning, control theory, and formal methods to enforce the safety and robustness of large-scale cyber-physical systems. I'm also broadly interested in generative models, human factors, and emerging technologies.

Email  /  CV  /  Google Scholar  /  DBLP  /  GitHub

profile photo
News

[04/2024] Our paper "State-wise Safe RL With Pixel Observations" was accepted to L4DC 2024

[03/2024] One paper accepted to LLMAgent@ICLR 2024

[09/2023] I will start my PhD journey in the ECE department at Northwestern University

Publications
Boosting Long-Delayed Reinforcement Learning with Auxiliary Short-Delayed Task
Qingyuan Wu, Simon Sinong Zhan, Yixuan Wang, Chung-Wei Lin, Chen Lv, Qi Zhu, Chao Huang
ICML 2024
arXiv

Auxiliary-Delayed Reinforcement Learning (AD-RL) leverages an auxiliary short-delayed task to accelerate learning on a long-delayed task without compromising performance in stochastic environments.

State-wise Safe Reinforcement Learning With Pixel Observations
Simon Sinong Zhan, Yixuan Wang, Qingyuan Wu, Ruochen Jiao, Chao Huang, Qi Zhu
L4DC 2024
arXiv / code and demos

In this paper, we propose a novel safe RL algorithm for pixel observations that efficiently encodes state-wise safety constraints with unknown hazard regions by introducing a latent barrier-function learning mechanism.

Enforcing Hard Constraints with Soft Barriers: Safe Reinforcement Learning in Unknown Stochastic Environments
Yixuan Wang, Simon Sinong Zhan, Ruochen Jiao, Zhilu Wang, Wanxin Jin, Zhuoran Yang, Zhaoran Wang, Chao Huang, Qi Zhu
ICML 2023
arXiv / code

A safe RL approach that jointly learns the environment and optimizes the control policy, while effectively avoiding unsafe regions through safety probability optimization.

Joint Differentiable Optimization and Verification for Certified Reinforcement Learning
Yixuan Wang*, Simon Sinong Zhan*, Zhilu Wang, Chao Huang, Zhaoran Wang, Zhuoran Yang, Qi Zhu
ICCPS 2023
arXiv / code

A framework that jointly conducts reinforcement learning and formal verification by formulating and solving a novel bilevel optimization problem, which is end-to-end differentiable through gradients from the value function and from certificates formulated as linear programs and semi-definite programs.

Tools

MARS: a toolchain for Modeling, Analyzing and veRifying hybrid Systems

A toolchain that solves 3D bin packing via an SMT formulation

Service
  • 2024: External reviewer, ASP-DAC 2024; Artifact Evaluation PC, ICCPS 2024
  • 2023: Reviewer, NeurIPS 2023

Teaching
  • Fall 2022: TA for Math 128A (Numerical Analysis)

Source website.