Multimodal Models


Training Data Efficiency in Multimodal Process Reward Models

Feb 05, 2026
Via arXiv

RISE-Video: Can Video Generators Decode Implicit World Rules?

Feb 05, 2026
Via arXiv

A Unified Multimodal Framework for Dataset Construction and Model-Based Diagnosis of Ameloblastoma

Feb 05, 2026
Via arXiv

Layer-wise LoRA fine-tuning: a similarity metric approach

Feb 05, 2026
Via arXiv

Adaptive Global and Fine-Grained Perceptual Fusion for MLLM Embeddings Compatible with Hard Negative Amplification

Feb 05, 2026
Via arXiv

Magic-MM-Embedding: Towards Visual-Token-Efficient Universal Multimodal Embedding with MLLMs

Feb 05, 2026
Via arXiv

Constrained Group Relative Policy Optimization

Feb 05, 2026
Via arXiv

V-Retrver: Evidence-Driven Agentic Reasoning for Universal Multimodal Retrieval

Feb 05, 2026
Via arXiv

Multimodal Latent Reasoning via Hierarchical Visual Cues Injection

Feb 05, 2026
Via arXiv

LMMRec: LLM-driven Motivation-aware Multimodal Recommendation

Feb 05, 2026
Via arXiv