Zhenyu Hou

Scaling Reinforcement Learning for Content Moderation with Large Language Models

Dec 23, 2025
Via arXiv

Staggered Batch Scheduling: Co-optimizing Time-to-First-Token and Throughput for High-Efficiency LLM Inference

Dec 18, 2025
Via arXiv

GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models

Aug 08, 2025
Via arXiv

GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning

Jul 02, 2025
Via arXiv

TreeRL: LLM Reinforcement Learning with On-Policy Tree Search

Jun 13, 2025
Via arXiv

SWE-Dev: Building Software Engineering Agents with Training and Inference Scaling

Jun 09, 2025
Via arXiv

Controlling Large Language Model with Latent Actions

Mar 27, 2025
Via arXiv

Real-time Spatial-temporal Traversability Assessment via Feature-based Sparse Gaussian Process

Mar 06, 2025
Via arXiv

Advancing Language Model Reasoning through Reinforcement Learning and Inference Scaling

Jan 20, 2025
Via arXiv

Does RLHF Scale? Exploring the Impacts From Data, Model, and Method

Dec 08, 2024
Via arXiv