Daoyuan Chen

Trinity-RFT: A General-Purpose and Unified Framework for Reinforcement Fine-Tuning of Large Language Models

May 23, 2025

DetailMaster: Can Your Text-to-Image Model Handle Long Prompts?

May 22, 2025

MindGYM: Enhancing Vision-Language Models via Synthetic Self-Challenging Questions

Mar 12, 2025

Diversity as a Reward: Fine-Tuning LLMs on a Mixture of Domain-Undetermined Data

Feb 05, 2025

HumanVBench: Exploring Human-Centric Video Understanding Capabilities of MLLMs with Synthetic Benchmark Data

Dec 23, 2024

ArtAug: Enhancing Text-to-Image Generation through Synthesis-Understanding Interaction

Dec 18, 2024

Img-Diff: Contrastive Data Synthesis for Multimodal Large Language Models

Aug 09, 2024

Data-Juicer Sandbox: A Comprehensive Suite for Multimodal Data-Model Co-development

Jul 16, 2024

The Synergy between Data and Multi-Modal Large Language Models: A Survey from Co-Development Perspective

Jul 11, 2024

Data Mixing Made Efficient: A Bivariate Scaling Law for Language Model Pretraining

May 23, 2024