Picture for Yilin Niu

Yilin Niu

IF-RewardBench: Benchmarking Judge Models for Instruction-Following Evaluation

Add code
Mar 05, 2026
Viaarxiv icon

GLM-5: from Vibe Coding to Agentic Engineering

Add code
Feb 17, 2026
Viaarxiv icon

Data-Efficient RLVR via Off-Policy Influence Guidance

Add code
Oct 30, 2025
Viaarxiv icon

GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models

Add code
Aug 08, 2025
Viaarxiv icon

GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning

Add code
Jul 02, 2025
Figure 1 for GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning
Figure 2 for GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning
Figure 3 for GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning
Figure 4 for GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning
Viaarxiv icon

Does RLHF Scale? Exploring the Impacts From Data, Model, and Method

Add code
Dec 08, 2024
Figure 1 for Does RLHF Scale? Exploring the Impacts From Data, Model, and Method
Figure 2 for Does RLHF Scale? Exploring the Impacts From Data, Model, and Method
Figure 3 for Does RLHF Scale? Exploring the Impacts From Data, Model, and Method
Figure 4 for Does RLHF Scale? Exploring the Impacts From Data, Model, and Method
Viaarxiv icon

LongReward: Improving Long-context Large Language Models with AI Feedback

Add code
Oct 28, 2024
Viaarxiv icon

ChatGLM: A Family of Large Language Models from GLM-130B to GLM-4 All Tools

Add code
Jun 18, 2024
Figure 1 for ChatGLM: A Family of Large Language Models from GLM-130B to GLM-4 All Tools
Figure 2 for ChatGLM: A Family of Large Language Models from GLM-130B to GLM-4 All Tools
Figure 3 for ChatGLM: A Family of Large Language Models from GLM-130B to GLM-4 All Tools
Figure 4 for ChatGLM: A Family of Large Language Models from GLM-130B to GLM-4 All Tools
Viaarxiv icon

ChatGLM-RLHF: Practices of Aligning Large Language Models with Human Feedback

Add code
Apr 03, 2024
Figure 1 for ChatGLM-RLHF: Practices of Aligning Large Language Models with Human Feedback
Figure 2 for ChatGLM-RLHF: Practices of Aligning Large Language Models with Human Feedback
Figure 3 for ChatGLM-RLHF: Practices of Aligning Large Language Models with Human Feedback
Figure 4 for ChatGLM-RLHF: Practices of Aligning Large Language Models with Human Feedback
Viaarxiv icon

Towards Efficient and Exact Optimization of Language Model Alignment

Add code
Feb 02, 2024
Figure 1 for Towards Efficient and Exact Optimization of Language Model Alignment
Figure 2 for Towards Efficient and Exact Optimization of Language Model Alignment
Figure 3 for Towards Efficient and Exact Optimization of Language Model Alignment
Figure 4 for Towards Efficient and Exact Optimization of Language Model Alignment
Viaarxiv icon