Rongrong Ji

Xiamen University, Peng Cheng Laboratory

AnyTrans: Translate AnyText in the Image with Large Scale Models

Jun 17, 2024

VEGA: Learning Interleaved Image-Text Comprehension in Vision-Language Large Models

Jun 14, 2024

Beat: Bi-directional One-to-Many Embedding Alignment for Text-based Person Retrieval

Jun 09, 2024

SAM as the Guide: Mastering Pseudo-Label Refinement in Semi-Supervised Referring Expression Segmentation

Jun 03, 2024

Image Captioning via Dynamic Path Customization

Jun 01, 2024

Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis

May 31, 2024

FocSAM: Delving Deeply into Focused Objects in Segmenting Anything

May 29, 2024

UniPTS: A Unified Framework for Proficient Post-Training Sparsity

May 29, 2024

GOI: Find 3D Gaussians of Interest with an Optimizable Open-vocabulary Semantic-space Hyperplane

May 27, 2024

Dual3D: Efficient and Consistent Text-to-3D Generation with Dual-mode Multi-view Latent Diffusion

May 16, 2024