Rongrong Ji

Xiamen University, Peng Cheng Laboratory

Boosting Multimodal Large Language Models with Visual Tokens Withdrawal for Rapid Inference

May 09, 2024

ObjectAdd: Adding Objects into Image via a Training-Free Diffusion Modification Fashion

May 02, 2024

X-Oscar: A Progressive Framework for High-quality Text-guided 3D Animatable Avatar Generation

May 02, 2024

GraCo: Granularity-Controllable Interactive Segmentation

May 01, 2024

Cantor: Inspiring Multimodal Chain-of-Thought of MLLM

Apr 24, 2024

Multi-Modal Prompt Learning on Blind Image Quality Assessment

Apr 23, 2024

CutDiffusion: A Simple, Fast, Cheap, and Strong Diffusion Extrapolation Method

Apr 23, 2024

NeRF-DetS: Enhancing Multi-View 3D Object Detection with Sampling-adaptive Network of Continuous NeRF-based Representation

Apr 22, 2024

Rethinking 3D Dense Caption and Visual Grounding in A Unified Framework through Prompt-based Localization

Apr 17, 2024

DiffAgent: Fast and Accurate Text-to-Image API Selection with Large Language Model

Mar 31, 2024