Rongxiang Weng

OneCAT: Decoder-Only Auto-Regressive Model for Unified Understanding and Generation

Sep 03, 2025
Via arXiv

XBOUND: Exploring the Capability Boundaries of Device-Control Agents through Trajectory Tree Exploration

May 27, 2025
Via arXiv

Two Minds Better Than One: Collaborative Reward Modeling for LLM Alignment

May 19, 2025
Via arXiv

Investigating and Scaling up Code-Switching for Multilingual Language Model Pre-Training

Apr 02, 2025
Via arXiv

FRAMES: Boosting LLMs with A Four-Quadrant Multi-Stage Pretraining Strategy

Feb 08, 2025
Via arXiv

Look Before You Leap: Enhancing Attention and Vigilance Regarding Harmful Content with GuidelineLLM

Dec 10, 2024
Via arXiv

Enhancing LLM Reasoning via Critique Models with Test-Time and Training-Time Supervision

Nov 25, 2024
Via arXiv

Multi-Programming Language Sandbox for LLMs

Oct 30, 2024
Via arXiv

Length Desensitization in Directed Preference Optimization

Sep 10, 2024
Via arXiv

What's Wrong with Your Code Generated by Large Language Models? An Extensive Study

Jul 08, 2024
Via arXiv