Zhenan Sun

RA-SSU: Towards Fine-Grained Audio-Visual Learning with Region-Aware Sound Source Understanding

Mar 10, 2026
Via arXiv

Affinity Contrastive Learning for Skeleton-based Human Activity Understanding

Jan 23, 2026
Via arXiv

Dual-Phase LLM Reasoning: Self-Evolved Mathematical Frameworks

Jan 09, 2026
Via arXiv

3SGen: Unified Subject, Style, and Structure-Driven Image Generation with Adaptive Task-specific Memory

Dec 22, 2025
Via arXiv

TTP: Test-Time Padding for Adversarial Detection and Robust Adaptation on Vision-Language Models

Dec 18, 2025
Via arXiv

Quantization Meets dLLMs: A Systematic Study of Post-training Quantization for Diffusion LLMs

Aug 20, 2025
Via arXiv

ReMeREC: Relation-aware and Multi-entity Referring Expression Comprehension

Jul 22, 2025
Via arXiv

TokLIP: Marry Visual Tokens to CLIP for Multimodal Comprehension and Generation

May 08, 2025
Via arXiv

Learning Knowledge-based Prompts for Robust 3D Mask Presentation Attack Detection

May 06, 2025
Via arXiv

Learning Unknown Spoof Prompts for Generalized Face Anti-Spoofing Using Only Real Face Images

May 06, 2025
Via arXiv