Yaxin Luo

Open CaptchaWorld: A Comprehensive Web-based Platform for Testing and Benchmarking Multimodal LLM Agents

May 30, 2025

Dynamic Pyramid Network for Efficient Multimodal Large Language Model

Mar 26, 2025

Dataset Distillation via Committee Voting

Jan 13, 2025

γ-MoD: Exploring Mixture-of-Depth Adaptation for Multimodal Large Language Models

Oct 17, 2024