Biqing Qi

A Survey of Reinforcement Learning for Large Reasoning Models

Sep 10, 2025
Via arXiv

InternVL3.5: Advancing Open-Source Multimodal Models in Versatility, Reasoning, and Efficiency

Aug 25, 2025
Via arXiv

ReviewRL: Towards Automated Scientific Review with RL

Aug 14, 2025
Via arXiv

Graph Counselor: Adaptive Graph Exploration via Multi-Agent Synergy to Enhance LLM Reasoning

Jun 04, 2025
Via arXiv

ScienceBoard: Evaluating Multimodal Autonomous Agents in Realistic Scientific Workflows

May 26, 2025
Via arXiv

TTRL: Test-Time Reinforcement Learning

Apr 22, 2025
Via arXiv

GenPRM: Scaling Test-Time Compute of Process Reward Models via Generative Reasoning

Apr 01, 2025
Via arXiv

Seeing Delta Parameters as JPEG Images: Data-Free Delta Compression with Discrete Cosine Transform

Mar 09, 2025
Via arXiv

Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling

Feb 10, 2025
Via arXiv

Fourier Position Embedding: Enhancing Attention's Periodic Extension for Length Generalization

Dec 23, 2024
Via arXiv