Wenwen Yu

DocThinker: Explainable Multimodal Large Language Models with Rule-based Reinforcement Learning for Document Understanding

Aug 12, 2025
Via arXiv

LLMs-guided adaptive compensator: Bringing Adaptivity to Automatic Control Systems with Large Language Models

Jul 28, 2025
Via arXiv

Convergent and divergent connectivity patterns of the arcuate fasciculus in macaques and humans

Jun 24, 2025
Via arXiv

OmniParser V2: Structured-Points-of-Thought for Unified Visual Text Parsing and Its Generality to Multimodal Large Language Models

Feb 22, 2025
Via arXiv

ClickTrack: Towards Real-time Interactive Single Object Tracking

Nov 24, 2024
Via arXiv

Click; Single Object Tracking; Video Object Segmentation; Real-time Interaction

Nov 20, 2024
Via arXiv

OmniParser: A Unified Framework for Text Spotting, Key Information Extraction and Table Recognition

Mar 28, 2024
Via arXiv

P2Seg: Pointly-supervised Segmentation via Mutual Distillation

Jan 18, 2024
Via arXiv

P2RBox: A Single Point is All You Need for Oriented Object Detection

Nov 22, 2023
Via arXiv

Turning a CLIP Model into a Scene Text Spotter

Aug 21, 2023
Via arXiv