
Chengsong Huang

RelayLLM: Efficient Reasoning via Collaborative Decoding

Jan 08, 2026

Benchmark^2: Systematic Evaluation of LLM Benchmarks

Jan 07, 2026

UniRel-R1: RL-tuned LLM Reasoning for Knowledge Graph Relational Question Answering

Dec 18, 2025

VisPlay: Self-Evolving Vision-Language Models from Images

Nov 19, 2025

Parallel-R1: Towards Parallel Thinking via Reinforcement Learning

Sep 09, 2025

Self-Rewarding Vision-Language Model via Reasoning Decomposition

Aug 27, 2025

R-Zero: Self-Evolving Reasoning LLM from Zero Data

Aug 07, 2025

CrossWordBench: Evaluating the Reasoning Capabilities of LLMs and LVLMs with Controllable Puzzle Generation

Mar 30, 2025

Divide, Reweight, and Conquer: A Logit Arithmetic Approach for In-Context Learning

Oct 14, 2024

Taming Overconfidence in LLMs: Reward Calibration in RLHF

Oct 13, 2024