Image-to-Image Translation


Image-to-image translation is the task of converting an image from a source domain to a target domain, typically with deep learning models.

MANGO: Multimodal Attention-based Normalizing Flow Approach to Fusion Learning

Aug 13, 2025

COCO-Urdu: A Large-Scale Urdu Image-Caption Dataset with Multimodal Quality Estimation

Sep 10, 2025

GP3: A 3D Geometry-Aware Policy with Multi-View Images for Robotic Manipulation

Sep 19, 2025

XOCT: Enhancing OCT to OCTA Translation via Cross-Dimensional Supervised Multi-Scale Feature Learning

Sep 09, 2025

MM2CT: MR-to-CT translation for multi-modal image fusion with mamba

Aug 07, 2025

Image-Guided Surgery: Technology, Quality, Innovation, and Opportunities for Medical Physics

Sep 03, 2025

On the Utility of Virtual Staining for Downstream Applications as it relates to Task Network Capacity

Jul 31, 2025

GeoAware-VLA: Implicit Geometry Aware Vision-Language-Action Model

Sep 17, 2025

Bangla-Bayanno: A 52K-Pair Bengali Visual Question Answering Dataset with LLM-Assisted Translation Refinement

Aug 27, 2025

Single-to-mix Modality Alignment with Multimodal Large Language Model for Document Image Machine Translation

Jul 10, 2025