Bin Fu

EMMA: Your Text-to-Image Diffusion Model Can Secretly Accept Multi-Modal Prompts

Jun 13, 2024
Via arXiv

MeshXL: Neural Coordinate Field for Generative 3D Foundation Models

May 31, 2024
Via arXiv

ELLA: Equip Diffusion Models with LLM for Enhanced Semantic Alignment

Mar 08, 2024
Via arXiv

AppAgent: Multimodal Agents as Smartphone Users

Dec 22, 2023
Via arXiv

Paint3D: Paint Anything 3D with Lighting-Less Texture Diffusion Models

Dec 22, 2023
Via arXiv

FaceStudio: Put Your Face Everywhere in Seconds

Dec 06, 2023
Via arXiv

ChartLlama: A Multimodal LLM for Chart Understanding and Generation

Nov 27, 2023
Via arXiv

SAM-Med3D

Oct 29, 2023
Via arXiv

Robust Geometry-Preserving Depth Estimation Using Differentiable Rendering

Sep 18, 2023
Via arXiv

Deformation Robust Text Spotting with Geometric Prior

Aug 31, 2023
Via arXiv