Shizhe Diao

Celine

LaTtE-Flow: Layerwise Timestep-Expert Flow-based Transformer

Jun 08, 2025

ProRL: Prolonged Reinforcement Learning Expands Reasoning Boundaries in Large Language Models

May 30, 2025

Fast-dLLM: Training-free Acceleration of Diffusion LLM by Enabling KV Cache and Parallel Decoding

May 28, 2025

LongMamba: Enhancing Mamba's Long Context Capabilities via Training-Free Receptive Field Enlargement

Apr 22, 2025

CLIMB: CLustering-based Iterative Data Mixture Bootstrapping for Language Model Pre-training

Apr 17, 2025

MA-LoT: Multi-Agent Lean-based Long Chain-of-Thought Reasoning enhances Formal Theorem Proving

Mar 05, 2025

Adapt-Pruner: Adaptive Structural Pruning for Efficient Small Language Model Training

Feb 05, 2025

Entropy-Regularized Process Reward Model

Dec 15, 2024

Hymba: A Hybrid-head Architecture for Small Language Models

Nov 20, 2024

Grounded-VideoLLM: Sharpening Fine-grained Temporal Grounding in Video Large Language Models

Oct 04, 2024