Text


Edit2Interp: Adapting Image Foundation Models from Spatial Editing to Video Frame Interpolation with Few-Shot Learning

Mar 16, 2026
Via arXiv

Exposing Cross-Modal Consistency for Fake News Detection in Short-Form Videos

Mar 16, 2026
Via arXiv

Anchoring Emotions in Text: Robust Multimodal Fusion for Mimicry Intensity Estimation

Mar 16, 2026
Via arXiv

Rethinking LLM Watermark Detection in Black-Box Settings: A Non-Intrusive Third-Party Framework

Mar 16, 2026
Via arXiv

GT-PCQA: Geometry-Texture Decoupled Point Cloud Quality Assessment with MLLM

Mar 16, 2026
Via arXiv

Relevance Feedback in Text-to-Image Diffusion: A Training-Free And Model-Agnostic Interactive Framework

Mar 16, 2026
Via arXiv

Decision-Level Ordinal Modeling for Multimodal Essay Scoring with Large Language Models

Mar 16, 2026
Via arXiv

The Impact of Ideological Discourses in RAG: A Case Study with COVID-19 Treatments

Mar 16, 2026
Via arXiv

DamageArbiter: A CLIP-Enhanced Multimodal Arbitration Framework for Hurricane Damage Assessment from Street-View Imagery

Mar 16, 2026
Via arXiv

AnyPhoto: Multi-Person Identity Preserving Image Generation with ID Adaptive Modulation on Location Canvas

Mar 16, 2026
Via arXiv