Tao Yuan

and Other Contributors

TongSIM: A General Platform for Simulating Intelligent Machines

Dec 23, 2025
Via arXiv

Factorization-in-Loop: Proximal Fill-in Minimization for Sparse Matrix Reordering

Nov 12, 2025
Via arXiv

Query-Kontext: An Unified Multimodal Model for Image Generation and Editing

Sep 30, 2025
Via arXiv

Megrez2 Technical Report

Jul 23, 2025
Via arXiv

A Simple Linear Patch Revives Layer-Pruned Large Language Models

May 30, 2025
Via arXiv

Pangu Light: Weight Re-Initialization for Pruning and Accelerating LLMs

May 26, 2025
Via arXiv

Chain-of-Focus: Adaptive Visual Search and Zooming for Multimodal Reasoning via RL

May 21, 2025
Via arXiv

Iterative Tool Usage Exploration for Multimodal Agents via Step-wise Preference Tuning

May 06, 2025
Via arXiv

Iterative Trajectory Exploration for Multimodal Agents

Apr 30, 2025
Via arXiv

TongUI: Building Generalized GUI Agents by Learning from Multimodal Web Tutorials

Apr 17, 2025
Via arXiv