Fengzhuo Zhang

Muon Learns More Robust and Transferable Features than Adam

Jun 08, 2026 (via arXiv)

INFUSER: Influence-Guided Self-Evolution Improves Reasoning

Jun 08, 2026 (via arXiv)

Why Muon Outperforms Adam: A Curvature Perspective

Jun 03, 2026 (via arXiv)

Neural Networks Provably Learn Spectral Representations for Group Composition

Jun 02, 2026 (via arXiv)

Annealed Relaxation of Speculative Decoding for Faster Autoregressive Image Generation

Jan 14, 2026 (via arXiv)

Demystifying the Slash Pattern in Attention: The Role of RoPE

Jan 13, 2026 (via arXiv)

Sparse-to-Dense: A Free Lunch for Lossless Acceleration of Video Understanding in LLMs

May 25, 2025 (via arXiv)

BanditSpec: Adaptive Speculative Decoding via Bandit Algorithms

May 21, 2025 (via arXiv)

LongSpec: Long-Context Speculative Decoding with Efficient Drafting and Verification

Feb 24, 2025 (via arXiv)

Enhancing Multi-Text Long Video Generation Consistency without Tuning: Time-Frequency Analysis, Prompt Alignment, and Theory

Dec 23, 2024 (via arXiv)