Yujuan Ding

OVS-DINO: Open-Vocabulary Segmentation via Structure-Aligned SAM-DINO with Language Guidance

Apr 09, 2026

HV-Attack: Hierarchical Visual Attack for Multimodal Retrieval Augmented Generation

Nov 19, 2025

WebRec: Enhancing LLM-based Recommendations with Attention-guided RAG from Web

Nov 18, 2025

Explore Briefly, Then Decide: Mitigating LLM Overthinking via Cumulative Entropy Regulation

Oct 02, 2025

More Than One Teacher: Adaptive Multi-Guidance Policy Optimization for Diverse Exploration

Oct 02, 2025

A Survey of WebAgents: Towards Next-Generation AI Agents for Web Automation with Large Foundation Models

Mar 30, 2025

ChartAdapter: Large Vision-Language Model for Chart Summarization

Dec 30, 2024

GalleryGPT: Analyzing Paintings with Large Multimodal Models

Aug 01, 2024

Leveraging Weak Cross-Modal Guidance for Coherence Modelling via Iterative Learning

Aug 01, 2024

A Survey on RAG Meets LLMs: Towards Retrieval-Augmented Large Language Models

May 10, 2024