Shaohan Huang

On-Policy RL with Optimal Reward Baseline

May 29, 2025
Via arXiv

Think Only When You Need with Large Hybrid-Reasoning Models

May 21, 2025
Via arXiv

Reward Reasoning Model

May 20, 2025
Via arXiv

Efficient RL Training for Reasoning Models via Length-Aware Optimization

May 18, 2025
Via arXiv

BitNet b1.58 2B4T Technical Report

Apr 16, 2025
Via arXiv

Beyond Window-Based Detection: A Graph-Centric Framework for Discrete Log Anomaly Detection

Jan 21, 2025
Via arXiv

GeAR: Generation Augmented Retrieval

Jan 06, 2025
Via arXiv

Context-DPO: Aligning Language Models for Context-Faithfulness

Dec 18, 2024
Via arXiv

Quantum Machine Learning in Log-based Anomaly Detection: Challenges and Opportunities

Dec 18, 2024
Via arXiv

Multimodal Latent Language Modeling with Next-Token Diffusion

Dec 11, 2024
Via arXiv