Simon Shaolei Du

Cold-Start Personalization via Training-Free Priors from Structured World Models

Feb 16, 2026

Global Convergence of Four-Layer Matrix Factorization under Random Initialization

Nov 19, 2025

RLVE: Scaling Up Reinforcement Learning for Language Models with Adaptive Verifiable Environments

Nov 10, 2025

Spurious Rewards: Rethinking Training Signals in RLVR

Jun 12, 2025

Policy-Based Trajectory Clustering in Offline Reinforcement Learning

Jun 12, 2025

Chasing Moving Targets with Online Self-Play Reinforcement Learning for Safer Language Models

Jun 09, 2025

Highlighting What Matters: Promptable Embeddings for Attribute-Focused Image Retrieval

May 21, 2025

Reinforcement Learning for Reasoning in Large Language Models with One Training Example

Apr 29, 2025

LoRe: Personalizing LLMs via Low-Rank Reward Modeling

Apr 20, 2025

SHARP: Accelerating Language Model Inference by SHaring Adjacent layers with Recovery Parameters

Feb 11, 2025