Marcella Cornia

Few Channels Draw The Whole Picture: Revealing Massive Activations in Diffusion Transformers

May 13, 2026, via arXiv

GramSR: Visual Feature Conditioning for Diffusion-Based Super-Resolution

Apr 28, 2026, via arXiv

RaTA-Tool: Retrieval-based Tool Selection with Multimodal Large Language Models

Apr 16, 2026, via arXiv

Look Twice: Training-Free Evidence Highlighting in Multimodal Large Language Models

Apr 01, 2026, via arXiv

Tiny Inference-Time Scaling with Latent Verifiers

Mar 25, 2026, via arXiv

Dress-ED: Instruction-Guided Editing for Virtual Try-On and Try-Off

Mar 23, 2026, via arXiv

CounterVid: Counterfactual Video Generation for Mitigating Action and Temporal Hallucinations in Video-Language Models

Jan 08, 2026, via arXiv

Seeing Beyond Words: Self-Supervised Visual Learning for Multimodal Large Language Models

Dec 17, 2025, via arXiv

Recurrence Meets Transformers for Universal Multimodal Retrieval

Sep 10, 2025, via arXiv

RAID: A Dataset for Testing the Adversarial Robustness of AI-Generated Image Detectors

Jun 09, 2025, via arXiv