Song Wang

Jack

CLII: Visual-Text Inpainting via Cross-Modal Predictive Interaction

Jul 23, 2024
Via arXiv

Developing a Reliable, General-Purpose Hallucination Detection and Mitigation Service: Insights and Lessons Learned

Jul 22, 2024
Via arXiv

OCTrack: Benchmarking the Open-Corpus Multi-Object Tracking

Jul 19, 2024
Via arXiv

A Benchmark for Fairness-Aware Graph Learning

Jul 16, 2024
Via arXiv

Asynchronous Multimodal Video Sequence Fusion via Learning Modality-Exclusive and -Agnostic Representations

Jul 06, 2024
Via arXiv

CEB: Compositional Evaluation Benchmark for Fairness in Large Language Models

Jul 02, 2024
Via arXiv

TokenPacker: Efficient Visual Projector for Multimodal LLM

Jul 02, 2024
Via arXiv

"Glue pizza and eat rocks" -- Exploiting Vulnerabilities in Retrieval-Augmented Generative Models

Jun 26, 2024
Via arXiv

Knowledge Graph-Enhanced Large Language Models via Path Selection

Jun 19, 2024
Via arXiv

Few-shot Knowledge Graph Relational Reasoning via Subgraph Adaptation

Jun 19, 2024
Via arXiv