AnswerOfTime sfasfaffa

Hi there 👋

I am Dengyun Peng, a senior BS student @HIT, an incoming MS student @HIT, and a member of the SCIR LA. My current research interests are RL4LLM and LLM reasoning. I also have research experience in Safe RL and Offline RL.

Internships:

WestlakeU (2023.12-2024.9)

Du Xiaoman Financial (2024.1-)

Publications:

(ICML 2024, Second Author) Reinformer: Max-Return Sequence Modeling for Offline RL (https://proceedings.mlr.press/v235/zhuang24b.html)

(Preprint, Fourth Author) Towards Reasoning Era: A Survey of Long Chain-of-Thought for Reasoning Large Language Models (https://arxiv.org/abs/2503.09567)

(Preprint, Fourth Author) ECM: A Unified Electronic Circuit Model for Explaining the Emergence of In-Context Learning and Chain-of-Thought in Large Language Model (https://arxiv.org/abs/2502.03325)

Email:

dypeng@ir.hit.edu.cn

pengdengyun@qq.com
