When to Align, When to Predict: A Phase Diagram for Multimodal Learning

Jun 09, 2026
Via arXiv

ARM: An AutoRegressive Large Multimodal Model with Unified Discrete Representations

Jun 09, 2026
Via arXiv

AnyMod-LLVE: Low-Light Video Enhancement with Modality-Agnostic Inference

Jun 09, 2026
Via arXiv

Itô maps for any-step SDEs

Jun 09, 2026
Via arXiv

Mean Flow Distillation: Robust and Stable Distillation for Flow Matching Models

Jun 09, 2026
Via arXiv

P3D-Bench: Benchmarking MLLMs for Parametric 3D Generation and Structural Reasoning

Jun 09, 2026
Via arXiv

MOFA-VTON: More Fashion Possibilities with Fine-Grained Adaptations in Virtual Try-On

Jun 09, 2026
Via arXiv

UniPET: A Universal Network for High-Quality PET Image Denoising Across Varied Dose Reduction Factors

Jun 09, 2026
Via arXiv

Multimodal Brain Tumour Classification Using Feature Fusion

Jun 09, 2026
Via arXiv

FADA: Accessible Fetal Ultrasound Interpretation and Annotation with a Selectively Distilled Unified Vision-Language Model

Jun 09, 2026
Via arXiv