Zhenan Sun

Quantization Meets dLLMs: A Systematic Study of Post-training Quantization for Diffusion LLMs

Aug 20, 2025
Via arXiv

ReMeREC: Relation-aware and Multi-entity Referring Expression Comprehension

Jul 22, 2025
Via arXiv

TokLIP: Marry Visual Tokens to CLIP for Multimodal Comprehension and Generation

May 08, 2025
Via arXiv

Learning Unknown Spoof Prompts for Generalized Face Anti-Spoofing Using Only Real Face Images

May 06, 2025
Via arXiv

Learning Knowledge-based Prompts for Robust 3D Mask Presentation Attack Detection

May 06, 2025
Via arXiv

VividListener: Expressive and Controllable Listener Dynamics Modeling for Multi-Modal Responsive Interaction

Apr 30, 2025
Via arXiv

Follow-Your-MultiPose: Tuning-Free Multi-Character Text-to-Video Generation via Pose Guidance

Dec 21, 2024
Via arXiv

Revealing Key Details to See Differences: A Novel Prototypical Perspective for Skeleton-based Action Recognition

Nov 28, 2024
Via arXiv

UncertaintyRAG: Span-Level Uncertainty Enhanced Long-Context Modeling for Retrieval-Augmented Generation

Oct 03, 2024
Via arXiv

Graph and Skipped Transformer: Exploiting Spatial and Temporal Modeling Capacities for Efficient 3D Human Pose Estimation

Jul 03, 2024
Via arXiv