Weihua Luo

AI Business, Alibaba Group

Deep But Reliable: Advancing Multi-turn Reasoning for Thinking with Images

Dec 19, 2025
Via arXiv

Omni-View: Unlocking How Generation Facilitates Understanding in Unified 3D Model based on Multiview images

Nov 10, 2025
Via arXiv

Marco-Bench-MIF: On Multilingual Instruction-Following Capability of Large Language Models

Jul 16, 2025
Via arXiv

Rethinking Multilingual Vision-Language Translation: Dataset, Evaluation, and Adaptation

Jun 13, 2025
Via arXiv

ComfyUI-R1: Exploring Reasoning Models for Workflow Generation

Jun 11, 2025
Via arXiv

ComfyUI-Copilot: An Intelligent Assistant for Automated Workflow Development

Jun 05, 2025
Via arXiv

Multimodal Tabular Reasoning with Privileged Structured Information

Jun 04, 2025
Via arXiv

TransBench: Benchmarking Machine Translation for Industrial-Scale Applications

May 20, 2025
Via arXiv

Perception, Reason, Think, and Plan: A Survey on Large Multimodal Reasoning Models

May 08, 2025
Via arXiv

Unified Multimodal Understanding and Generation Models: Advances, Challenges, and Opportunities

May 05, 2025
Via arXiv