Baotian Hu

AniMaker: Automated Multi-Agent Animated Storytelling with MCTS-Driven Clip Generation

Jun 12, 2025

ComfyUI-R1: Exploring Reasoning Models for Workflow Generation

Jun 11, 2025

Omni-DPO: A Dual-Perspective Paradigm for Dynamic Preference Learning of LLMs

Jun 11, 2025

ComfyUI-Copilot: An Intelligent Assistant for Automated Workflow Development

Jun 05, 2025

VerIPO: Cultivating Long Reasoning in Video-LLMs via Verifier-Guided Iterative Policy Optimization

May 25, 2025

Perception, Reason, Think, and Plan: A Survey on Large Multimodal Reasoning Models

May 08, 2025

VideoVista-CulturalLingo: 360° Horizons-Bridging Cultures, Languages, and Domains in Video Comprehension

Apr 23, 2025

A Unified Agentic Framework for Evaluating Conditional Image Generation

Apr 09, 2025

Take Off the Training Wheels! Progressive In-Context Learning for Effective Alignment

Mar 13, 2025

Picking the Cream of the Crop: Visual-Centric Data Selection with Collaborative Agents

Feb 27, 2025