Shusheng Xu

How Far Are We from Optimal Reasoning Efficiency?

Jun 08, 2025
Via arXiv

AReaL: A Large-Scale Asynchronous Reinforcement Learning System for Language Reasoning

May 30, 2025
Via arXiv

On Designing Effective RL Reward at Training Time for LLM Reasoning

Oct 19, 2024
Via arXiv

Is DPO Superior to PPO for LLM Alignment? A Comprehensive Study

Apr 16, 2024
Via arXiv

Language-Guided Generation of Physically Realistic Robot Motion and Control

Jun 18, 2023
Via arXiv

Native Chinese Reader: A Dataset Towards Native-Level Chinese Machine Reading Comprehension

Dec 14, 2021
Via arXiv

A Benchmark for Low-Switching-Cost Reinforcement Learning

Dec 13, 2021
Via arXiv

PhyloTransformer: A Discriminative Model for Mutation Prediction Based on a Multi-head Self-attention Mechanism

Nov 03, 2021
Via arXiv

Sequence Level Contrastive Learning for Text Summarization

Sep 24, 2021
Via arXiv

Unsupervised Extractive Summarization by Pre-training Hierarchical Transformers

Oct 16, 2020
Via arXiv