Yifan Du

GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning

Jul 02, 2025
Via arXiv

Seed1.5-VL Technical Report

May 11, 2025
Via arXiv

Virgo: A Preliminary Exploration on Reproducing o1-like MLLM

Jan 03, 2025
Via arXiv

Exploring the Design Space of Visual Context Representation in Video MLLMs

Oct 17, 2024
Via arXiv

Towards Event-oriented Long Video Understanding

Jun 20, 2024
Via arXiv

Needle In A Video Haystack: A Scalable Synthetic Framework for Benchmarking Video MLLMs

Jun 13, 2024
Via arXiv

What Makes for Good Visual Instructions? Synthesizing Complex Visual Reasoning Instructions for Visual Instruction Tuning

Nov 02, 2023
Via arXiv

Learning to Imagine: Visually-Augmented Natural Language Generation

Jun 04, 2023
Via arXiv

Zero-shot Visual Question Answering with Language Model Feedback

May 26, 2023
Via arXiv

Evaluating Object Hallucination in Large Vision-Language Models

May 23, 2023
Via arXiv