Shuohuan Wang

Sparse Growing Transformer: Training-Time Sparse Depth Allocation via Progressive Attention Looping

Mar 25, 2026
Via arXiv

Mixture of Universal Experts: Scaling Virtual Width via Depth-Width Transformation

Mar 05, 2026
Via arXiv

ERNIE 5.0 Technical Report

Feb 04, 2026
Via arXiv

VideoAR: Autoregressive Video Generation via Next-Frame & Scale Prediction

Jan 09, 2026
Via arXiv

Blink: Dynamic Visual Token Resolution for Enhanced Multimodal Understanding

Dec 11, 2025
Via arXiv

Advantageous Parameter Expansion Training Makes Better Large Language Models

May 30, 2025
Via arXiv

Inner Thinking Transformer: Leveraging Dynamic Depth Scaling to Foster Adaptive Internal Thinking

Feb 19, 2025
Via arXiv

BeamLoRA: Beam-Constraint Low-Rank Adaptation

Feb 19, 2025
Via arXiv

Curiosity-Driven Reinforcement Learning from Human Feedback

Jan 20, 2025
Via arXiv

Mixture of Hidden-Dimensions Transformer

Dec 10, 2024
Via arXiv