Image-to-Image Translation


Image-to-image translation is the process of converting an image from one domain to another using deep learning techniques.

MetaCLIP 2: A Worldwide Scaling Recipe

Jul 29, 2025
Via arXiv

Spatial-Temporal Graph Mamba for Music-Guided Dance Video Synthesis

Jul 09, 2025
Via arXiv

EC-Flow: Enabling Versatile Robotic Manipulation from Action-Unlabeled Videos via Embodiment-Centric Flow

Jul 08, 2025
Via arXiv

Neural Concept Verifier: Scaling Prover-Verifier Games via Concept Encodings

Jul 10, 2025
Via arXiv

DreamLight: Towards Harmonious and Consistent Image Relighting

Jun 17, 2025
Via arXiv

SafePTR: Token-Level Jailbreak Defense in Multimodal LLMs via Prune-then-Restore Mechanism

Jul 02, 2025
Via arXiv

Demystifying the Visual Quality Paradox in Multimodal Large Language Models

Jun 18, 2025
Via arXiv

General Methods Make Great Domain-specific Foundation Models: A Case-study on Fetal Ultrasound

Jun 24, 2025
Via arXiv

ShapeEmbed: a self-supervised learning framework for 2D contour quantification

Jul 01, 2025
Via arXiv

NeoBabel: A Multilingual Open Tower for Visual Generation

Jul 08, 2025
Via arXiv