Ziwei He

CoDAR: Continuous Diffusion Language Models are More Powerful Than You Think

Mar 03, 2026

PonderLM-3: Adaptive Token-Wise Pondering with Differentiable Masking

Mar 02, 2026

AdaPonderLM: Gated Pondering Language Models with Token-Wise Adaptive Depth

Mar 02, 2026

MOVA: Towards Scalable and Synchronized Video-Audio Generation

Feb 09, 2026

FourierSampler: Unlocking Non-Autoregressive Potential in Diffusion Language Models via Frequency-Guided Generation

Jan 30, 2026

DiRL: An Efficient Post-Training Framework for Diffusion Language Models

Dec 23, 2025

Beyond Real: Imaginary Extension of Rotary Position Embeddings for Long-Context LLMs

Dec 08, 2025

LongLLaDA: Unlocking Long Context Capabilities in Diffusion LLMs

Jun 17, 2025

Beyond Homogeneous Attention: Memory-Efficient LLMs via Fourier-Approximated KV Cache

Jun 13, 2025

Pretraining Language Models to Ponder in Continuous Space

May 27, 2025