An Yan

Enhancing Sa2VA for Referent Video Object Segmentation: 2nd Solution for 7th LSVOS RVOS Track

Sep 19, 2025
Via arXiv

Pseudo-Label Enhanced Cascaded Framework: 2nd Technical Report for LSVOS 2025 VOS Track

Sep 18, 2025
Via arXiv

ProVision: Programmatically Scaling Vision-centric Instruction Data for Multimodal Language Models

Dec 09, 2024
Via arXiv

BLIP3-KALE: Knowledge Augmented Large-Scale Dense Captions

Nov 12, 2024
Via arXiv

Trust but Verify: Programmatic VLM Evaluation in the Wild

Oct 17, 2024
Via arXiv

Edge-guided inverse design of digital metamaterials for ultra-high-capacity on-chip multi-dimensional interconnect

Oct 10, 2024
Via arXiv

xGen-MM (BLIP-3): A Family of Open Large Multimodal Models

Aug 16, 2024
Via arXiv

CRAG -- Comprehensive RAG Benchmark

Jun 07, 2024
Via arXiv

List Items One by One: A New Data Source and Learning Paradigm for Multimodal LLMs

Apr 25, 2024
Via arXiv

Bridging Language and Items for Retrieval and Recommendation

Mar 06, 2024
Via arXiv