Kaiwen Zhou

Mirage-1: Augmenting and Updating GUI Agent with Hierarchical Multimodal Skills

Jun 12, 2025

GUI-explorer: Autonomous Exploration and Mining of Transition-aware Knowledge for GUI Agent

May 22, 2025

SafeKey: Amplifying Aha-Moment Insights for Safety Reasoning

May 22, 2025

GUI-G1: Understanding R1-Zero-Like Training for Visual Grounding in GUI Agents

May 21, 2025

VideoAgent2: Enhancing the LLM-Based Agent System for Long-Form Video Understanding by Uncertainty-Aware CoT

Apr 06, 2025

Generative Models in Decision Making: A Survey

Feb 25, 2025

The Hidden Risks of Large Reasoning Models: A Safety Assessment of R1

Feb 18, 2025

FALCON: Resolving Visual Redundancy and Fragmentation in High-resolution Multimodal Large Language Models via Visual Registers

Jan 27, 2025

Beyond Pixels: Text Enhances Generalization in Real-World Image Restoration

Dec 01, 2024

SPA-Bench: A Comprehensive Benchmark for SmartPhone Agent Evaluation

Oct 19, 2024