Mike Zheng Shou

In-Context Defense in Computer Agents: An Empirical Study

Mar 12, 2025

TPDiff: Temporal Pyramid Video Diffusion Model

Mar 12, 2025

Balanced Image Stylization with Style Matching Score

Mar 10, 2025

Automated Movie Generation via Multi-Agent CoT Planning

Mar 10, 2025

DoraCycle: Domain-Oriented Adaptation of Unified Generative Model in Multimodal Cycles

Mar 05, 2025

Difix3D+: Improving 3D Reconstructions with Single-Step Diffusion Models

Mar 03, 2025

PhotoDoodle: Learning Artistic Image Editing from Few-Shot Pairwise Data

Feb 23, 2025

InterFeedback: Unveiling Interactive Intelligence of Large Multimodal Models via Human Feedback

Feb 20, 2025

PhysReason: A Comprehensive Benchmark towards Physics-Based Reasoning

Feb 17, 2025

WorldGUI: Dynamic Testing for Comprehensive Desktop GUI Automation

Feb 12, 2025