Xingwei Qu

First Return, Entropy-Eliciting Explore

Jul 09, 2025

A Survey on Latent Reasoning

Jul 08, 2025

DREAM: Disentangling Risks to Enhance Safety Alignment in Multimodal Large Language Models

Apr 25, 2025

ReTool: Reinforcement Learning for Strategic Tool Use in LLMs

Apr 15, 2025

COIG-P: A High-Quality and Large-Scale Chinese Preference Dataset for Alignment with Human Values

Apr 07, 2025

YuE: Scaling Open Foundation Models for Long-Form Music Generation

Mar 11, 2025

LongEval: A Comprehensive Analysis of Long-Text Generation Through a Plan-based Paradigm

Feb 26, 2025

SuperGPQA: Scaling LLM Evaluation across 285 Graduate Disciplines

Feb 20, 2025

Aligning Instruction Tuning with Pre-training

Jan 16, 2025

Observing Micromotives and Macrobehavior of Large Language Models

Dec 10, 2024