Yilin Niu

Data-Efficient RLVR via Off-Policy Influence Guidance

Oct 30, 2025

GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models

Aug 08, 2025

GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning

Jul 02, 2025

Does RLHF Scale? Exploring the Impacts From Data, Model, and Method

Dec 08, 2024

LongReward: Improving Long-context Large Language Models with AI Feedback

Oct 28, 2024

ChatGLM: A Family of Large Language Models from GLM-130B to GLM-4 All Tools

Jun 18, 2024

ChatGLM-RLHF: Practices of Aligning Large Language Models with Human Feedback

Apr 03, 2024

Towards Efficient and Exact Optimization of Language Model Alignment

Feb 02, 2024

A Semantic-based Method for Unsupervised Commonsense Question Answering

May 31, 2021

REPT: Bridging Language Models and Machine Reading Comprehension via Retrieval-Based Pre-training

May 18, 2021