Shangshang Wang

FIPO: Eliciting Deep Reasoning with Future-KL Influenced Policy Optimization

Mar 20, 2026

Resa: Transparent Reasoning Models via SAEs

Jun 11, 2025

Tina: Tiny Reasoning Models via LoRA

Apr 22, 2025

AI-University: An LLM-based platform for instructional alignment to scientific classrooms

Apr 11, 2025

METAGENE-1: Metagenomic Foundation Model for Pandemic Monitoring

Jan 03, 2025

Learning to Schedule Online Tasks with Bandit Feedback

Feb 26, 2024