
Xin Tan

LLaVA-VSD: Large Language-and-Vision Assistant for Visual Spatial Description

Aug 09, 2024
Via arXiv

Harmonizing Visual Text Comprehension and Generation

Jul 23, 2024
Via arXiv

Mutual Information Guided Optimal Transport for Unsupervised Visible-Infrared Person Re-identification

Jul 17, 2024
Via arXiv

Exploring the Untouched Sweeps for Conflict-Aware 3D Segmentation Pretraining

Jul 10, 2024
Via arXiv

Teola: Towards End-to-End Optimization of LLM-based Applications

Jun 29, 2024
Via arXiv

PIG: Prompt Images Guidance for Night-Time Scene Parsing

Jun 15, 2024
Via arXiv

FastLGS: Speeding up Language Embedded Gaussians with Feature Grid Mapping

Jun 04, 2024
Via arXiv

Gradient Projection For Parameter-Efficient Continual Learning

May 22, 2024
Via arXiv

GEOcc: Geometrically Enhanced 3D Occupancy Network with Implicit-Explicit Depth Fusion and Contextual Self-Supervision

May 17, 2024
Via arXiv

Efficient Multimodal Large Language Models: A Survey

May 17, 2024
Via arXiv