I am Dengyun Peng, a senior BS student @HIT, an incoming MS student @HIT, and a member of SCIR LA. My current research interests are RL4LLM and LLM reasoning. I also have research experience in Safe RL and Offline RL.
WestlakeU (2023.12-2024.9)
Du Xiaoman Financial (2024.1-)
(ICML 2024, Second Author) Reinformer: Max-Return Sequence Modeling for Offline RL (https://proceedings.mlr.press/v235/zhuang24b.html)
(Preprint, Fourth Author) Towards Reasoning Era: A Survey of Long Chain-of-Thought for Reasoning Large Language Models (https://arxiv.org/abs/2503.09567)
(Preprint, Fourth Author) ECM: A Unified Electronic Circuit Model for Explaining the Emergence of In-Context Learning and Chain-of-Thought in Large Language Model (https://arxiv.org/abs/2502.03325)