Hao Zhang

refer to the report for detailed contributions

Audio-Thinker: Guiding Audio Language Model When and How to Think via Reinforcement Learning

Aug 12, 2025
Via arXiv

VL-Cogito: Progressive Curriculum Reinforcement Learning for Advanced Multimodal Reasoning

Jul 30, 2025
Via arXiv

HunyuanWorld 1.0: Generating Immersive, Explorable, and Interactive 3D Worlds from Words or Pixels

Jul 29, 2025
Via arXiv

Interactive Adversarial Testing of Autonomous Vehicles with Adjustable Confrontation Intensity

Jul 29, 2025
Via arXiv

Kimi K2: Open Agentic Intelligence

Jul 28, 2025
Via arXiv

A Comprehensive Data-centric Overview of Federated Graph Learning

Jul 22, 2025
Via arXiv

A Multimodal Data Fusion Generative Adversarial Network for Real Time Underwater Sound Speed Field Construction

Jul 16, 2025
Via arXiv

General Modular Harness for LLM Agents in Multi-Turn Gaming Environments

Jul 15, 2025
Via arXiv

PhysRig: Differentiable Physics-Based Skinning and Rigging Framework for Realistic Articulated Object Modeling

Jun 26, 2025
Via arXiv

Scaling Speculative Decoding with Lookahead Reasoning

Jun 24, 2025
Via arXiv