Zanlin Ni

InsightTok: Improving Text and Face Fidelity in Discrete Tokenization for Autoregressive Image Generation

May 14, 2026
Via arXiv

Steering Visual Generation in Unified Multimodal Models with Understanding Supervision

May 07, 2026
Via arXiv

The Flexibility Trap: Why Arbitrary Order Limits Reasoning Potential in Diffusion Language Models

Jan 21, 2026
Via arXiv

Co-GRPO: Co-Optimized Group Relative Policy Optimization for Masked Diffusion Model

Dec 25, 2025
Via arXiv

Emulating Human-like Adaptive Vision for Efficient and Flexible Machine Visual Perception

Sep 18, 2025
Via arXiv

ENAT: Rethinking Spatial-temporal Interactions in Token-based Image Synthesis

Nov 11, 2024
Via arXiv

AdaNAT: Exploring Adaptive Policy for Token-Based Image Generation

Aug 31, 2024
Via arXiv

Revisiting Non-Autoregressive Transformers for Efficient Image Synthesis

Jun 08, 2024
Via arXiv

Everything to the Synthetic: Diffusion-driven Test-time Adaptation via Synthetic-Domain Alignment

Jun 06, 2024
Via arXiv

LLaVA-UHD: an LMM Perceiving Any Aspect Ratio and High-Resolution Images

Mar 18, 2024
Via arXiv