
Bo Zhao

VISTA: Visualized Text Embedding For Universal Multi-Modal Retrieval

Jun 06, 2024
Via arXiv

MLVU: A Comprehensive Benchmark for Multi-Task Long Video Understanding

Jun 06, 2024
Via arXiv

The SkatingVerse Workshop & Challenge: Methods and Results

May 27, 2024
Via arXiv

VTG-LLM: Integrating Timestamp Knowledge into Video LLMs for Enhanced Video Temporal Grounding

May 22, 2024
Via arXiv

Efficient Multimodal Large Language Models: A Survey

May 17, 2024
Via arXiv

Understanding the Difficulty of Solving Cauchy Problems with PINNs

May 04, 2024
Via arXiv

Advances and Open Challenges in Federated Learning with Foundation Models

Apr 29, 2024
Via arXiv

Tele-FLM Technical Report

Apr 25, 2024
Via arXiv

M3D: Advancing 3D Medical Image Analysis with Multi-Modal Large Language Models

Mar 31, 2024
Via arXiv

SynArtifact: Classifying and Alleviating Artifacts in Synthetic Images via Vision-Language Model

Mar 05, 2024
Via arXiv