Image To Image Translation


Image-to-image translation is the process of converting an image from one domain to another using deep learning techniques.

You Only Pose Once: A Minimalist's Detection Transformer for Monocular RGB Category-level 9D Multi-Object Pose Estimation

Aug 20, 2025
Via arXiv

Vibration-Based Energy Metric for Restoring Needle Alignment in Autonomous Robotic Ultrasound

Aug 09, 2025
Via arXiv

Camera Pose Refinement via 3D Gaussian Splatting

Aug 25, 2025
Via arXiv

Large-scale Multi-sequence Pretraining for Generalizable MRI Analysis in Versatile Clinical Applications

Aug 10, 2025
Via arXiv

Beyond flattening: a geometrically principled positional encoding for vision transformers with Weierstrass elliptic functions

Aug 26, 2025
Via arXiv

HERMES: Human-to-Robot Embodied Learning from Multi-Source Motion Data for Mobile Dexterous Manipulation

Aug 28, 2025
Via arXiv

SATURN: Autoregressive Image Generation Guided by Scene Graphs

Aug 20, 2025
Via arXiv

Designing Practical Models for Isolated Word Visual Speech Recognition

Aug 25, 2025
Via arXiv

ConlangCrafter: Constructing Languages with a Multi-Hop LLM Pipeline

Aug 08, 2025
Via arXiv

Why Stop at Words? Unveiling the Bigger Picture through Line-Level OCR

Aug 29, 2025
Via arXiv