Rongrong Ji

Xiamen University, Peng Cheng Laboratory

DAMamba: Vision State Space Model with Dynamic Adaptive Scan

Feb 18, 2025
Via arXiv

Training-free Anomaly Event Detection via LLM-guided Symbolic Pattern Discovery

Feb 09, 2025
Via arXiv

AdaFlow: Efficient Long Video Editing via Adaptive Attention Slimming And Keyframe Selection

Feb 08, 2025
Via arXiv

Long-VITA: Scaling Large Multi-modal Models to 1 Million Tokens with Leading Short-Context Accuracy

Feb 07, 2025
Via arXiv

Drag Your Gaussian: Effective Drag-Based Editing with Score Distillation for 3D Gaussian Splatting

Jan 30, 2025
Via arXiv

SVFR: A Unified Framework for Generalized Video Face Restoration

Jan 03, 2025
Via arXiv

VITA-1.5: Towards GPT-4o Level Real-Time Vision and Speech Interaction

Jan 03, 2025
Via arXiv

Semantics Prompting Data-Free Quantization for Low-Bit Vision Transformers

Dec 21, 2024
Via arXiv

DiffusionTrend: A Minimalist Approach to Virtual Fashion Try-On

Dec 19, 2024
Via arXiv

Knowing Where to Focus: Attention-Guided Alignment for Text-based Person Search

Dec 19, 2024
Via arXiv