Xipeng Qiu

Dynamic and Generalizable Process Reward Modeling

Jul 23, 2025

LongLLaDA: Unlocking Long Context Capabilities in Diffusion LLMs

Jun 17, 2025

Beyond Homogeneous Attention: Memory-Efficient LLMs via Fourier-Approximated KV Cache

Jun 13, 2025

Domain2Vec: Vectorizing Datasets to Find the Optimal Data Mixture without Training

Jun 12, 2025

REARANK: Reasoning Re-ranking Agent via Reinforcement Learning

May 26, 2025

R3-RAG: Learning Step-by-Step Reasoning and Retrieval for LLMs via Reinforcement Learning

May 26, 2025

ConvSearch-R1: Enhancing Query Reformulation for Conversational Search with Reasoning via Reinforcement Learning

May 21, 2025

Code2Logic: Game-Code-Driven Data Synthesis for Enhancing VLMs General Reasoning

May 20, 2025

Teach2Eval: An Indirect Evaluation Method for LLM by Judging How It Teaches

May 18, 2025

Reinforced Interactive Continual Learning via Real-time Noisy Human Feedback

May 15, 2025