Wu Liu

GUI-Eyes: Tool-Augmented Perception for Visual Grounding in GUI Agents

Jan 14, 2026
Via arXiv

Region-Constraint In-Context Generation for Instructional Video Editing

Dec 19, 2025
Via arXiv

MotionPro: A Precise Motion Controller for Image-to-Video Generation

May 26, 2025
Via arXiv

HOIGen-1M: A Large-scale Dataset for Human-Object Interaction Video Generation

Mar 31, 2025
Via arXiv

OmniPrism: Learning Disentangled Visual Concept for Image Generation

Dec 16, 2024
Via arXiv

LMAgent: A Large-scale Multimodal Agents Society for Multi-user Simulation

Dec 13, 2024
Via arXiv

T-SVG: Text-Driven Stereoscopic Video Generation

Dec 12, 2024
Via arXiv

It Takes Two: Accurate Gait Recognition in the Wild via Cross-granularity Alignment

Nov 16, 2024
Via arXiv

Motion Capture from Inertial and Vision Sensors

Jul 23, 2024
Via arXiv

An Application of Large Language Models to Coding Negotiation Transcripts

Jul 18, 2024
Via arXiv