Conghui He

InternVL3.5: Advancing Open-Source Multimodal Models in Versatility, Reasoning, and Efficiency

Aug 25, 2025
Via arXiv

Echo-4o: Harnessing the Power of GPT-4o Synthetic Images for Improved Image Generation

Aug 13, 2025
Via arXiv

Can One Domain Help Others? A Data-Centric Study on Multi-Domain Reasoning via Reinforcement Learning

Jul 23, 2025
Via arXiv

Native Visual Understanding: Resolving Resolution Dilemmas in Vision-Language Models

Jun 15, 2025
Via arXiv

VRBench: A Benchmark for Multi-Step Reasoning in Long Narrative Videos

Jun 12, 2025
Via arXiv

GTR-CoT: Graph Traversal as Visual Chain of Thought for Molecular Structure Recognition

Jun 09, 2025
Via arXiv

Multi-Step Visual Reasoning with Visual Tokens Scaling and Verification

Jun 08, 2025
Via arXiv

Hallucination at a Glance: Controlled Visual Edits and Fine-Grained Multimodal Learning

Jun 08, 2025
Via arXiv

Shifting AI Efficiency From Model-Centric to Data-Centric Compression

May 25, 2025
Via arXiv

A Survey of LLM $\times$ DATA

May 24, 2025
Via arXiv